Hidden Content In Your Documents: What You Don't Know Can Be Dangerous

Tuesday, October 4, 2011 - 01:00

Hidden information in documents can pose serious risks in all types of legal proceedings, yet many people are totally unaware of this danger. A potent example is document metadata: hidden information contained in Microsoft® Office documents, including Microsoft Word, Microsoft Excel and Microsoft PowerPoint files. Whenever a document is created, edited or saved, metadata is automatically added to it at the system and application level. In addition, Microsoft Word's popular collaboration features, such as comments and track changes, result in a significant amount of metadata being included in documents through the actions of the author. This information can be unwittingly transmitted every time a document is emailed to someone, whether inside or outside the organization.

Metadata was originally conceived to make it easier to track and find document data, and no one disputes that it is useful when used properly and with an appropriate level of awareness. However, if counsel is careless or ignores the fact that metadata exists, unauthorized people can gain access to privileged information that can be used against you.

What Is The Risk?

The key to the problem is not that metadata is added to a document, but that it is difficult to fully identify and remove. For example, in Microsoft Word, adding comments and tracking changes are very helpful to users collaborating on a document. However, when a change is not accepted, it remains with the document, even though it appears to be invisible. These changes can easily be displayed by turning on the "show markup" view, which can result in embarrassing situations where external parties see information that was not intended for their eyes.
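The point about unaccepted changes can be illustrated with a short sketch. It assumes the document is saved in the newer .docx (Office Open XML) format, which is a ZIP archive whose main text lives in word/document.xml; older binary .doc files are not covered. The function name find_tracked_changes is hypothetical, and the sketch looks only at the main body, not at headers, footers or footnotes.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_tracked_changes(docx_file):
    """Return (kind, author, text) tuples for insertions and deletions
    that are still recorded as tracked changes in the main document body.

    docx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    changes = []
    # Inserted text is stored in <w:t>; deleted text is kept in <w:delText>.
    for kind, tag, text_tag in (
        ("insertion", "ins", "t"),
        ("deletion", "del", "delText"),
    ):
        for element in root.iter(f"{{{W_NS}}}{tag}"):
            author = element.get(f"{{{W_NS}}}author", "")
            text = "".join(
                node.text or "" for node in element.iter(f"{{{W_NS}}}{text_tag}")
            )
            changes.append((kind, author, text))
    return changes
```

Calling find_tracked_changes("draft.docx") on a file with pending changes (the file name is only an example) would list text that a recipient could still reveal, including wording the author believed had been deleted.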

Metadata risks may be compounded when documents are attached from within Microsoft Outlook and sent to outside parties. Used as the de facto way to electronically exchange documents in most organizations, Microsoft Outlook does not offer warnings about the existence of metadata in attached documents or zipped files. Thus, the potential to accidentally send documents containing harmful metadata (and thus expose sensitive information with embarrassing or negative consequences) is amplified tremendously as documents are sent back and forth during the collaborative process.

High-Profile Cases Of Metadata

Metadata poses risks in legal scenarios when the information disclosed is meant for internal use only. This can range from revealing confidential statistics to exposing proprietary comments from document reviewers. There have been a number of surprising, high-profile cases involving the metadata in documents.

Here are examples of situations where metadata caused problems:

• Google: Private financial forecasting was revealed when hidden data was left in a PowerPoint presentation before posting it for the Wall Street community.

• Microsoft: Hidden data in Microsoft Office documents was discovered by the Associated Press and showed that the customer highlighted in Microsoft's advertising campaign as having switched from Apple to Microsoft was actually a member of its PR firm.

• Whole Foods: Hidden information in court documents disclosed that Whole Foods planned to close stores and revealed how the company negotiates with suppliers, as well as other closely guarded marketing strategies.

• AT&T: Confidential information contained in a PDF file revealed that AT&T was spying on its customers.

• The British Prime Minister's Office: Hidden data in the UK government's "Dodgy Dossier," the document that helped propel the country into war, revealed a student paper was the source of the document.

• Barclays: An Excel spreadsheet contained 179 contracts within hidden columns that were then accidentally submitted in Barclays' buyout offer of Lehman Brothers assets.

• Alcatel: A security vulnerability in Alcatel's DSL modems was revealed in document metadata.

• The United Nations: Metadata revealed the UN office doctored a report on the murder of former Lebanese Prime Minister Rafik Hariri.

• Democratic National Committee: Judge Sam Alito inadvertently revealed his true beliefs on immigration laws and other issues when memos were released containing blacked-out data.

• Westpac: The oldest bank in Australia revealed a full year of profit results via metadata before the results were finalized and lodged with the Australian Stock Exchange.

• Merck: Metadata revealed that the company deleted vital information concerning the arthritis drug Vioxx, resulting in users having false information on heart attack risk associated with taking the drug.

• Sun Life Financial: Hidden data found in a document forced the company to release its fourth-quarter and year-end results ahead of schedule.

The bottom line: Metadata can put you at financial risk, a competitive disadvantage, or in an embarrassing situation with costly consequences.

Types Of Document Metadata And Their Associated Risks

Document metadata comes in many forms in Microsoft Office documents, each with its own risks. For example, risks may involve revealing information on file names and addresses that can make it easy for hackers to gather sensitive information, email addresses that are supposed to be private, or proprietary pricing information. One particularly troublesome and embarrassing area is the scenario where document statistics may show more hours than a document's total billing time.

Following are examples of areas in Microsoft Office documents where metadata problems may arise:

Document Properties

Applies to: Microsoft Word, Microsoft Excel, and Microsoft PowerPoint documents.

Document properties are details about a file that help identify it. They include a descriptive title, subject, author, manager, company, category, keywords, comments, and hyperlink base. Document properties display information about a file to help users organize their files so that they can be easily found at a later date.

Risks: The names of authors and the name of the organization can display sensitive information. If a document has been sent outside your own organization, the author name and company name contained in the built-in properties could be a name other than your own. Also, if documents are re-purposed or used as a template for a new document, information specific to a previous client (for example, pricing, terms or client's name) can be stored as hidden information within the new document.
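As a rough illustration of how exposed these properties are, the following sketch lists them without opening the file in Office. It assumes the newer Office Open XML formats (.docx, .xlsx, .pptx), which are ZIP archives that keep built-in properties in docProps/core.xml and docProps/app.xml; the function name read_document_properties is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Parts of an Office Open XML package that hold the built-in properties
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def read_document_properties(office_file):
    """Collect the non-empty built-in properties as a {name: value} dict.

    office_file may be a path or a binary file-like object.
    """
    properties = {}
    with zipfile.ZipFile(office_file) as archive:
        available = set(archive.namelist())
        for part in PROPERTY_PARTS:
            if part not in available:
                continue  # some producers omit one of the parts
            root = ET.fromstring(archive.read(part))
            for child in root:
                value = (child.text or "").strip()
                if value:
                    # Drop the "{namespace}" prefix, keeping e.g. "creator"
                    local_name = child.tag.rsplit("}", 1)[-1]
                    properties[local_name] = value
    return properties
```

Entries such as creator, lastModifiedBy and Company in the result are the kind of values that may still carry a previous author's or client's name when a document has been re-purposed.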

Document Statistics & File Dates

Applies to: Microsoft Word documents only.

Document statistics include information on when the document was created, modified, accessed and printed. In addition, document statistics display the name of the person it was last saved by, the revision number and the total editing time. Other statistics include number of pages, paragraphs, lines, words and characters.

Risks: Document statistics can create embarrassing situations. For example, the "last saved by" metadata shows the last person who edited the document and can create discrepancies over who worked on a document.
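The statistics described here can be read directly from a file. The sketch below is a minimal example under the assumption that the document is a .docx (Office Open XML) file, where "last saved by" and the revision number are stored in docProps/core.xml and the total editing time (in minutes) and word count in docProps/app.xml. The function name read_statistics and the keys of the returned dict are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
EP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"


def _child_text(root, namespace, name):
    """Return the text of a direct child element, or None if it is absent."""
    element = root.find(f"{{{namespace}}}{name}")
    return element.text if element is not None else None


def read_statistics(docx_file):
    """Return a few document statistics from a .docx file.

    Assumes both docProps parts are present, as they normally are in
    files saved by Word.
    """
    with zipfile.ZipFile(docx_file) as archive:
        core = ET.fromstring(archive.read("docProps/core.xml"))
        app = ET.fromstring(archive.read("docProps/app.xml"))

    minutes = _child_text(app, EP_NS, "TotalTime")
    return {
        "last_saved_by": _child_text(core, CP_NS, "lastModifiedBy"),
        "revision": _child_text(core, CP_NS, "revision"),
        "editing_minutes": int(minutes) if minutes else None,
        "words": _child_text(app, EP_NS, "Words"),
    }
```

Comparing editing_minutes with the time recorded elsewhere for the same document is one way a recipient could notice a discrepancy.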

Document Reviewers

Applies to: Microsoft Word documents only.

Document reviewers consist of a list of users that have added or accepted document changes.

Risks: Document reviewers' metadata exposes who has suggested what changes. Removing the names of reviewers can be as important as removing the changes they have suggested.
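To see which reviewer names travel with a document, one possible check is sketched below. It assumes a .docx (Office Open XML) file, in which tracked changes and comments carry an author attribute in the XML parts under word/; the function name list_reviewers is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
AUTHOR_ATTR = f"{{{W_NS}}}author"


def list_reviewers(docx_file):
    """Return the sorted, distinct author names attached to tracked
    changes and comments in the XML parts under word/."""
    authors = set()
    with zipfile.ZipFile(docx_file) as archive:
        for name in archive.namelist():
            if not (name.startswith("word/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(name))
            for element in root.iter():
                author = element.get(AUTHOR_ATTR)
                if author:
                    authors.add(author)
    return sorted(authors)
```

An empty list suggests that no reviewer names remain in these parts, although it says nothing about names stored in document properties.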

Custom Properties

Applies to: Microsoft Word, Microsoft Excel, and Microsoft PowerPoint documents.

Custom properties include any property fields added to a document, either manually or by various programs, to help manage and track files.

Risks: Custom Properties are normally specific to an organization. Common types of custom properties are document ID, department and status. Custom properties can reveal proprietary information or competitive business practices.
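Custom properties can be listed in much the same way as built-in ones. The sketch below assumes an Office Open XML file (.docx, .xlsx or .pptx), where custom properties, if any exist, are stored in docProps/custom.xml; the function name read_custom_properties is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

CUSTOM_NS = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"
CUSTOM_PART = "docProps/custom.xml"


def read_custom_properties(office_file):
    """Return custom properties as a {name: value} dict (empty if none)."""
    with zipfile.ZipFile(office_file) as archive:
        if CUSTOM_PART not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read(CUSTOM_PART))

    properties = {}
    for prop in root.iter(f"{{{CUSTOM_NS}}}property"):
        # Each property wraps a single typed value element
        # (string, number, date and so on); only its text is kept here.
        value = next(iter(prop), None)
        properties[prop.get("name")] = value.text if value is not None else None
    return properties
```

Values such as a document ID, department or status would appear in the returned dict exactly as an outside recipient could read them.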

Other metadata problems in Microsoft Word, Microsoft Excel and Microsoft PowerPoint documents are found in Headers and Footers and Comments. In Microsoft Word, metadata issues arise in Hidden Text, Footnotes, White Text, Small Text, Macros, Previous Versions and Fast Saves. In Microsoft Word and Excel, metadata can be problematic with Track Changes and Document Revisions, Routing Slips and Hyperlinks. And one troubling metadata area in Microsoft PowerPoint is with Hidden Slides.
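Two of the items above, Hidden Text in Word and Hidden Slides in PowerPoint, lend themselves to a simple automated check. The sketch below assumes the newer .docx and .pptx (Office Open XML) formats. It detects only text hidden by direct formatting in the main document body (not text hidden through a style) and slides marked as hidden in their slide part. The function names are hypothetical.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
FALSE_VALUES = {"0", "false", "off"}


def find_hidden_text(docx_file):
    """Return the text of runs directly formatted as hidden (w:vanish)."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    hidden = []
    for run in root.iter(f"{{{W_NS}}}r"):
        vanish = run.find(f"{{{W_NS}}}rPr/{{{W_NS}}}vanish")
        # A vanish element switches hiding on unless its value says otherwise.
        if vanish is None or vanish.get(f"{{{W_NS}}}val", "true") in FALSE_VALUES:
            continue
        hidden.append("".join(t.text or "" for t in run.iter(f"{{{W_NS}}}t")))
    return hidden


def find_hidden_slides(pptx_file):
    """Return the archive names of slides whose root element has show="0"."""
    hidden = []
    with zipfile.ZipFile(pptx_file) as archive:
        for name in sorted(archive.namelist()):
            if re.fullmatch(r"ppt/slides/slide\d+\.xml", name):
                root = ET.fromstring(archive.read(name))
                if root.get("show", "1") in FALSE_VALUES:
                    hidden.append(name)
    return hidden
```

Both functions report only what they find; removing the content still has to be done in the application or with a dedicated cleaning tool.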

Metadata Legal Opinions

The handling of metadata has created conflicting legal opinions. While some suggest that metadata should be available to the public, others stand by the idea that it should be the owner's choice as to whether they should make metadata public information or not. Here are some examples of the various opinions:

• American Bar Association Ethics Opinion: Lawyers receiving electronic documents are free to examine hidden metadata.

• Nova Measuring Instruments Ltd. v. Nanometrics, Inc.: Metadata should be produced.

• Wyeth v. Impax Laboratories, Inc.: Production of metadata is not required absent a strong showing of particularized need.

• The Alabama State Bar Commission believes that it is acceptable to hide metadata if this means that clients' secrets are kept confidential.

• The Arizona State Bar stands by the notion that it is the duty of the lawyer to protect his or her client by making sure that metadata does not fall in the hands of the wrong person.

• The West Virginia Bar Association does not believe that lawyers have the right to view metadata that is meant to be hidden.

• Big Pond Communications v. Kennedy, 70 O.R. (3d) 115: The court held that the path-and-filename metadata found at the bottom of the statement of claim, even though irrelevant, came within the ambit of the pleading, was privileged and was therefore not actionable.


Metadata can serve useful purposes for identifying, indexing and managing documents. It is critical to understand how metadata is created, where it is stored in a document, and how it changes, especially when collaborating. All types of metadata can reveal confidential information that may result in embarrassing incidents, competitive disadvantage or outright legal action against your organization.

Metadata can and generally should be cleaned before distributing a document. Even so, it is surprisingly easy to send a document with confidential information outside the organization, particularly with the proliferation of smartphones and tablet devices that enable one to view, edit and redistribute documents via email. These devices further increase the risk of revealing confidential information outside a system's firewall.

Although there are conflicting legal opinions about the handling of document metadata, it is clear that this is an issue that deserves attention by every organization producing electronic documentation.

Scott Smull is CEO of Workshare and has nearly 30 years of proven executive leadership, most notably guiding companies through periods of double-digit growth. As CEO he ensures Workshare's document collaboration software enables business professionals to accurately create, collaborate and control high-value content efficiently.

Please email the author at scott.smull@workshare.com with questions about this article.