From Zubulake To Information Governance: Enabling Big Data Projects By Applying Use Cases


Ms. Zubulake presented first. She was the plaintiff in the groundbreaking Zubulake case (dated from 2002 to 2005) that resulted in several legal opinions concerning electronic discovery (collectively “Zubulake”). Zubulake transformed the legal industry, established precedent in the field of e-discovery, and provided an early example of information governance (“IG”). Not an attorney, she brought to the case a real-world perspective that stemmed from 20 years of experience on Wall Street covering mutual, hedge, and pension funds in the areas of convertible securities, derivatives, macro-economics, asset allocation, risk management, and Asian equities. Thus, her litigation strategy was based on managing risk and deriving value from assets (in this case, data), and in the process, she essentially created her own IG program. Today, Ms. Zubulake is a sought-after consultant in the areas of e-discovery and IG, including as a speaker for organizations such as the U.S. Department of Justice and the American Health Information Management Association (AHIMA).

While Zubulake is renowned for e-discovery, it was really about IG. In Ms. Zubulake’s search for critical discovery, the electronic evidence consisted predominantly of emails that she was led to believe did not exist. She already understood how information was created and organized, so she knew what should have been available. Thus, she could accurately assess the value of this “electronic discovery” and, above all, be persistent in efforts to recover it.

At the time, big data was an unknown concept. Applicable technology had not yet been envisioned, much less developed, so Ms. Zubulake adopted a more analogue solution and took a businessperson’s approach to organizing the information she ultimately obtained. She created a spreadsheet and categorized emails by different parameters, which then could be sorted and accessed in contexts such as writing motions and deposing witnesses. Her activist strategy was pivotal to the historic outcome of Zubulake. Importantly, it also represents the advent of IG practices as we know them today.


Today’s business reality is shaped by the fact that big data is fast becoming “huge data.” Information – meaning data that has been processed, organized, interpreted and contextualized – is increasingly considered to be a corporate asset. Because disorganized information is unproductive and poses risks, companies are looking to develop practices and technology that transform data into information and then organize that information to generate “useful information,” which provides the basis for business insight. This process is known as the information value chain.

As data continues to demonstrate business value, corporations are looking to leverage it across various platforms, resulting in a transition from siloed structures to enterprise architectures that address key issues, from data availability and performance to regulatory compliance, security, privacy, and cross-functionality across the organization. As a result, firms are integrating legacy data with new sources and implementing governance programs.

Information Governance

By its nature, data makes IG more difficult. It is fraught with risk and ever more complex forms of liability, both as volumes increase exponentially and as the types and sources of data continue to expand. Data is a reality that many enterprises don’t have the knowledge, vision and resources to manage, much less leverage productively. Enter Information Governance.

Briefly, IG refers to the proactive management of information through the entire data lifecycle, from creation to destruction, so as to maximize return and minimize risk. Proactive data management requires firms to establish policies, assign accountability and educate staff, all through an organic process that can adapt to the latest technology and become fundamental to the corporate culture.

Many people like to visualize IG according to the Information Governance Reference Model, which identifies three primary stakeholders: business users, IT and legal. The model encourages dialogue across all three, brings in records management, and focuses on the dependencies across functions and stakeholders.

The ABCs Of Data

The challenges of data are multi-dimensional and dynamic, as are methods of expressing them. First there were the “Vs” of data: volume, variety, velocity, veracity, viability and value. Newer schools of thought identified the “Cs” to more aptly examine the role of big data in business: culture, competency, capability, connectivity, complexity and convergence.

The dark side of data is expressed with the “Ps”: practicality, privacy, power and privilege, which warrant some expansion (“The Dark Side of Big Data,” Sept 12, 2014, University of Pennsylvania). Practicality encompasses costs and expresses the need for talented engineers, analysts and scientists to handle data; privacy pertains to the need to leverage data through viable and compliant access; power implies the dangers of gamification as it relates to manipulating data, and raises the question of how data will be used; and privilege highlights social issues that permeate all forms of access to things of value.

Data v. IG

If you’re tempted to draw the conclusion that data is at odds with IG, you’re on the right track. Data wants to run wild, and IG wants to tame it. Therefore, the point behind implementing an IG strategy is to unlock the value of data and improve decision making, while institutionalizing policies about risk management, accountability and budget optimization. Your data will not wait; it’s time to be proactive. The question is how.

* * *

Enabling Big Data Projects By Applying IG Use Cases

Mr. Brudz picked up the discussion with a question: Why is IG important? Put simply, data is the new oil, and organizations are realizing that there’s money in it. Tapping this resource, however, requires that they face a sobering reality: IG is not cookie cutter. It is terribly specific to corporate needs and workflows. Enter Use Cases.

Use cases are essential to the development of applications that implement IG initiatives. They describe how information is used across the enterprise and what institutional challenges ensue, and they are defined by business users, tax users, marketing users, and even “random users.” Typical challenges they present relate to compliance, privacy and security, as well as the general goal of enhancing business processes and profitability.

Comprising a series of questions, use cases facilitate a productive understanding of corporate data. All contingencies must be covered regarding the substance and location of data, its use restrictions, its relevance to business goals, and which stakeholders should get involved in the IG planning process.

Getting Started

The first step is to kick off a steering committee and start generating use cases for your software developers. Specifically, map out which actions produce which outcomes. There is an entire body of theory about use cases, with its own mark-up language and special diagrams, which is used by designers to ensure that their software covers all situations. Relevant issues are framed simply: If the user does A, then the software goes here; if B, then it will go there. But what happens when the user does none of the above? Writing good code depends on this practical input, rather than a mere conceptual understanding of expectations. Further, your IT developers are more likely to buy into the IG process if they are engaged on terms that are organic to their work, so use cases also build credibility and strengthen the project.
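The “if A, then here; if B, then there; otherwise what?” framing can be sketched as a small routing table with an explicit fallback. This is only an illustration: the action names and outcomes are hypothetical and do not come from any particular IG system.

```python
from enum import Enum


class Outcome(Enum):
    SHOW_RECORD = "show_record"
    START_EXPORT = "start_export"
    REJECT_AND_LOG = "reject_and_log"  # the "none of the above" path


# Hypothetical use cases: each known user action maps to one outcome.
USE_CASES = {
    "view": Outcome.SHOW_RECORD,
    "export": Outcome.START_EXPORT,
}


def route(action: str) -> Outcome:
    """Return the outcome for a user action.

    Anything not covered by a written use case falls through to an
    explicit default instead of being left undefined.
    """
    return USE_CASES.get(action.strip().lower(), Outcome.REJECT_AND_LOG)
```

Writing the fallback down as its own outcome is the point of the exercise: the steering committee, not the developer, decides what happens when the user does something nobody anticipated.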

The legal analogy is writing a contract. For instance, drafting boilerplate terms and conditions in the retail space is a nightmare because the contract must accommodate every contingency. There are no constraints, and the consumer input is random. Further, you may spend hours in fist-pounding negotiations only to learn that your business people don’t care about a disputed provision. Knowing their needs in advance will save time and deliver better results.

Use Cases

On its own, e-discovery is a relatively simple example because it involves only four use cases: identifying potentially relevant information, preserving data, extracting data, and releasing the hold when the litigation is done. Adding an IG overlay complicates the process with the need to address issues like security, privacy and business-continuity planning. IG demands use cases that are context-specific and variable.
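As a rough sketch, the four e-discovery use cases can be modeled as stages that a legal hold moves through. The stage names and the strict one-way ordering are assumptions made for illustration; a real matter may loop back or skip steps.

```python
from enum import Enum, auto


class HoldStage(Enum):
    IDENTIFIED = auto()  # potentially relevant information located
    PRESERVED = auto()   # data placed under hold
    EXTRACTED = auto()   # data collected for review
    RELEASED = auto()    # hold lifted once litigation is done


# Assumed linear progression; RELEASED is terminal.
_NEXT = {
    HoldStage.IDENTIFIED: HoldStage.PRESERVED,
    HoldStage.PRESERVED: HoldStage.EXTRACTED,
    HoldStage.EXTRACTED: HoldStage.RELEASED,
}


def advance(stage: HoldStage) -> HoldStage:
    """Move a hold to its next stage, refusing to go past release."""
    if stage not in _NEXT:
        raise ValueError(f"{stage.name} is a terminal stage")
    return _NEXT[stage]
```

An IG overlay would add conditions to each transition (for example, a security or privacy check before extraction), which is where the context-specific use cases come in.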

To provide just a sampling, use cases can help developers implement data-access privileges throughout an organization. Privacy use cases are concerned with the storage, retrieval and strategic use of personally identifiable information, all complicated by the need to work within variable global jurisdictions. Business development use cases may cover how consumer information is used for prospecting and may involve contextual questions relating to past purchases, response rates on other communications, membership in loyalty programs, or analytics from online banner advertising. Marketing use cases may involve the Telephone Consumer Protection Act (“TCPA”) when a company sends promotional codes to customers. And what happens when the rules change? Before recent TCPA updates, customers checked a box to consent generically to being contacted. That consent may no longer be sufficient, so a new use case will cover how to map old consent to the new system.
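The consent-mapping use case might look something like the sketch below. It deliberately does not encode what the TCPA requires: which channels a legacy check-box consent still covers is passed in as a parameter, to be decided by counsel. The function name, channel names and status strings are all hypothetical.

```python
def migrate_consent(legacy_opt_in, channels, legacy_sufficient_for):
    """Map one generic legacy consent flag onto per-channel statuses.

    legacy_opt_in: True if the customer checked the old generic box.
    channels: every contact channel the new system tracks.
    legacy_sufficient_for: channels where the old consent is still
        considered adequate (a policy decision, not made here).
    """
    statuses = {}
    for channel in channels:
        if not legacy_opt_in:
            statuses[channel] = "no_consent"
        elif channel in legacy_sufficient_for:
            statuses[channel] = "granted"
        else:
            # Old consent exists but no longer counts for this channel.
            statuses[channel] = "reconsent_required"
    return statuses
```

Keeping "reconsent_required" distinct from "no_consent" preserves the fact that the customer once opted in, which may matter when deciding whom to ask again.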

A more esoteric use case might address how to implement the “right to be forgotten,” which implies the need to destroy data. A socially conscious company may take this right very seriously; however, for database managers, deleting information is simply anathema. Data destruction breaks referential integrity; it’s the database equivalent of dividing by zero. The issues go deeper when you realize that, in order to “forget” people, they have to consent to being “remembered” just long enough to be forgotten. So you have to manage a forward-looking need to stop future tracking, as well as a backward-looking need to forget what’s already been tracked, constituting two distinct use cases.


Data is the next frontier for companies looking to stay competitive by mining and making the most of their information assets. Practical solutions require a proactive approach to data management that includes use cases that pose relevant questions during the process-development and software-design phases. In doing so, effective IG initiatives will encompass governance strategies, day-to-day business needs, risk contingencies, and the technological challenges in developing applications that help companies succeed.
