Feature #8416

Document MAT and metadata

Added by sajolida over 5 years ago. Updated about 2 months ago.

Type of work: End-user documentation

Our main documentation index lacks a page about metadata and MAT.

It should also mention pdf-redact-tools, tesseract-ocr, and ffmpeg. See #17178.

The homepage of MAT is pretty good at introducing what metadata are. So we should definitely point to it, and maybe reuse parts of it.


Feature #8417: Mention MAT on warning page Rejected

Feature #5542: Help MAT upstream to write documentation Rejected

Feature #7368: Point to documentation about metadata from doc/sensitive_documents/graphics Resolved

Feature #17449: Instructions on MAT Confirmed

Related issues

Related to Tails - Bug #15398: Document pdf-redact-tools Rejected 03/12/2018


#1 Updated by BitingBird about 5 years ago

When documented, it should also be linked from the "features" page.

#2 Updated by ArgySan almost 5 years ago

  • Assignee set to ArgySan

If it's OK, I'd like to pump out some documentation for this, utilizing the MAT page and other, maybe more 'general', wording for non-techies.

#3 Updated by BitingBird almost 5 years ago

Great! Ask for help or reviews along the way if you need any, I'll be glad to help :)

#4 Updated by ArgySan almost 5 years ago

Let me know, is this way too simplistic?


What is MAT?

MAT (Metadata Anonymisation Toolkit) is a toolkit that anonymizes and removes metadata from your files. It does this utilizing a library, a GUI application and, if you prefer, a CLI application.

How does it work?

Simply put, MAT removes all metadata from files leaving them empty. Unfortunately, watermarks and steganographic tags won’t be removed but unlike metadata being added by default by many utilities, watermarks are not usually inadvertently added and the original author will likely be aware of their existence. Basically, MAT will protect you from accidental metadata leakage, but not customized metadata specifically included to track down you, the author.

Why do we care?

Because just about every file being uploaded to the internet contains metadata. From Office documents to .flac audio files and beyond, they all have metadata embedded, and that metadata tells the world where, when, and most crucially, who uploaded it. This defeats the purpose of Tails and our ‘privacy for anyone, anywhere’ mantra.

So, to ensure you stay anonymous, Tails comes with MAT included.

Currently supported files include:

  • Portable Network Graphics (.png)
  • JPEG (.jpg, .jpeg, etc)
  • Open Documents (.odt, .odx, .ods, etc)
  • Office Open XML (.docx, .pptx, .xlsx, etc)
  • Portable Document Format (.pdf)
  • Tape ARchives (.tar, .tar.bz2, etc)
  • MPEG Audio (.mp3, .mp2, .mp1, etc)
  • Ogg Vorbis (.ogg, etc)
  • Free Lossless Audio Codec (.flac)
  • Torrent (.torrent)

MAT can be accessed via: Applications > System Tools in the Tails GUI.
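
To make "removing metadata" concrete, here is a minimal sketch for one of the formats listed above, PNG. It is only an illustration and not how MAT is implemented. The function names (`strip_png_metadata`, `build_sample_png`, and so on) and the choice of which chunk types to treat as metadata are assumptions made for this example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Assumption for this sketch: chunk types that commonly carry metadata
# (free text, last-modification time, Exif data).
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"tIME", b"eXIf"}


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (chunk_type, data) pairs from a PNG byte string."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        yield chunk_type, data
        pos += 12 + length  # 4 (length) + 4 (type) + data + 4 (CRC)


def strip_png_metadata(png: bytes) -> bytes:
    """Return a copy of the PNG without the metadata chunks listed above."""
    kept = [make_chunk(t, d) for t, d in iter_chunks(png) if t not in METADATA_CHUNKS]
    return PNG_SIGNATURE + b"".join(kept)


def build_sample_png() -> bytes:
    """Build a 1x1 grayscale PNG that carries an 'Author' text chunk."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one gray pixel
    return PNG_SIGNATURE + b"".join([
        make_chunk(b"IHDR", ihdr),
        make_chunk(b"tEXt", b"Author\x00Alice"),
        make_chunk(b"IDAT", idat),
        make_chunk(b"IEND", b""),
    ])


if __name__ == "__main__":
    sample = build_sample_png()
    cleaned = strip_png_metadata(sample)
    print([t.decode() for t, _ in iter_chunks(sample)])
    print([t.decode() for t, _ in iter_chunks(cleaned)])
```

The sketch leaves the image data itself untouched, which is why anything hidden in the pixels, such as a watermark, would survive this kind of cleaning.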


#5 Updated by sajolida almost 5 years ago

  • Description updated (diff)

#6 Updated by sajolida almost 5 years ago

  • QA Check set to Dev Needed

Hi, thanks a lot for working on this.

I don't think it's too simplistic and rather think that it should be further simplified :) You're providing the right information and it is well structured, but here are a bunch of recommendations that will make your text even better:

  • I would restructure the page to remove the first two titles and merge their content as an intro in a single block without title, like we do on /doc/advanced_topics/paperkey/.
  • Explain metadata in simple words, cf. /doc/sensitive_documents/metadata/ and /doc/about/warning/. If you manage to only insert and not modify parts of /doc/sensitive_documents/metadata/, then I can maybe create an inline to include the same text in both places.
  • Maybe some examples here would best explain what metadata are. Maybe you can reuse stuff that's already on the MAT homepage.
  • I'm not sure about the need of the title for the section "Why do we care?". Maybe reusing stuff from those warnings might be enough. But we can see later.
  • I like the list of supported formats. Maybe we could make it a bit more compact by having only one line for each of image formats, documents, audio formats.
  • Maybe we can further simplify this list by listing only the corresponding file name extensions when they really bring something. For example, I'm not sure it's needed to specify for JPEG, Ogg Vorbis, Torrent, and maybe others. Now I see that you copied this list from MAT (which was a good idea). But maybe our objectives here are a bit different. In Tails we want to make our users understand as quickly as possible if MAT will work on their files, while in MAT they have to provide a complete and technically unambiguous list for reference.
  • Can you find a title for the page looking at the style we usually use for other tools? That would go in the [[!meta title=""]] directive.
  • Check how we write the markup for menu paths in /contribute/how/documentation/guidelines/.
  • Add a link to upstream.
  • We try to keep our documentation as concise as possible. It's OK to have very few words on a page as long as they provide the information we need. People will get it faster and translators will have less work to do. In light of this, I suggest you edit your text once or twice more and look for words or parts of sentences that can be removed without losing meaning. You can probably lower your total word count by 30% or 50% while still improving the quality of the text. For example, I could spot " and our ‘privacy for anyone, anywhere’ mantra", "So, to ensure you stay anonymous", " in the Tails GUI".
  • Try to stay neutral regarding what people might or might not find "easy" or "simple". For example I would replace or get rid of "Simply put".
  • Avoid jargon, use only words that people are likely to know already or that you want them to learn by providing an explanation. I'm not sure that "watermarks", "steganographic", or "embedded" are likely to be understood. Remember that many (if not most) of our readers are not native English speakers.
  • Avoid the future tense ("will") which most of the time introduces unnecessary ambiguity about when the action will happen. For example "the original author will likely be aware" → "the original author is likely aware", "MAT will protect you" → "MAT protects you".
  • Limit your sentences to 20-25 words in general. Even if providing the same content, breaking it into several sentences makes it easier to read and understand.
  • "Toolkit" → "tool"
  • "Utilize" → "use", see "Never use a long word where a short one will do."
  • Avoid abbreviations like "GUI" (maybe "graphical" or nothing will do) and "CLI" (see what we do on /doc/advanced_topics/paperkey/).
  • If you want to go deeper into these considerations, the GNOME Documentation Style Guide is a very good start.

Good luck with all that :)

Once you're done, you can assign the ticket to me and mark its "QA Check" as "Ready for QA".

#7 Updated by sajolida almost 5 years ago

  • Blueprint set to

Oh, and I forgot. I created a blueprint for you that you can work on from the website. Click on the wrench icon and then Edit.

#8 Updated by BitingBird over 4 years ago

ArgySan, do you still plan to work on this? If not, please unassign yourself. If yes, yay :)

#9 Updated by BitingBird over 3 years ago

  • Assignee deleted (ArgySan)
  • QA Check deleted (Dev Needed)

No answer in a year -> unassigning

#10 Updated by Anonymous about 2 years ago

We actually do have a page about metadata which links to MAT in doc/sensitive_documents/metadata.mdwn.

Based on this and on the work previously done here, we could write some nice documentation.
If anybody is up for this, don't hesitate to work on this ticket.

The code of our wiki lives here:

#11 Updated by Anonymous about 2 years ago

  • Starter set to Yes

#12 Updated by sajolida about 2 years ago

  • Blocks Feature #14758: Core work 2017Q4 → 2018Q1: Technical writing added

#13 Updated by sajolida about 2 years ago

  • Related to Bug #15398: Document pdf-redact-tools added

#14 Updated by sajolida about 2 years ago

  • Blocks Feature #15411: Core work 2018Q2 → 2018Q3: Technical writing added

#15 Updated by sajolida about 2 years ago

  • Blocks deleted (Feature #14758: Core work 2017Q4 → 2018Q1: Technical writing)

#16 Updated by sajolida almost 2 years ago

  • Assignee set to cbrownstein

Assigning to Cody as proposed when we met last week.

#17 Updated by sajolida over 1 year ago

  • Assignee deleted (cbrownstein)

#18 Updated by sajolida over 1 year ago

  • Blocks deleted (Feature #15411: Core work 2018Q2 → 2018Q3: Technical writing)

#19 Updated by sajolida 3 months ago

  • Subject changed from Document MAT to Document metadata analysis tools
  • Description updated (diff)

#20 Updated by sajolida about 2 months ago

  • Subject changed from Document metadata analysis tools to Document MAT and metadata
