The Cost Of eDiscovery: What Drives It, How To Reduce It

Tuesday, May 31, 2011 - 01:00

In 1980, IBM produced the world's first gigabyte hard drive. It weighed 550 pounds and was the size of a refrigerator.1 Commercially, that same year, the first 5 1/4-inch hard drive was released, with a capacity of five megabytes. It wasn't until 1991 that a commercial 100 megabyte hard drive was available. Even at the turn of the millennium, commercial hard drives held only tens of gigabytes. But then technology ran ahead: in 2005, the first 500 gigabyte drive shipped. In 2006, drive size was up to 750 gigabytes. By 2007, it was 1 terabyte. Between 2008 and 2010, drive size rose from 1.5 terabytes to two terabytes to three. Today, drive size is at four terabytes.

This exponential increase in storage has had a disproportionate effect on litigation. The former "gold standard" of document review, manual linear review, is no longer up to the task. From a purely mathematical perspective, human beings cannot look at the volume of documents that exists today given other limiting factors like time and, more importantly, cost. In fact, an AmLaw 100 firm recently estimated that document review costs account for almost one-half of a typical proceeding's budget.2

Beyond the arithmetic, human review injects cost into the process because it is remarkably inaccurate. In one study, human review was found to be, at best, 75 percent accurate, and at worst, 25 percent accurate.3 Another inefficiency that often leads to increased cost is the lack of ownership of eDiscovery. As Craig Ball famously said, "the reality of electronic discovery is it starts off as the responsibility of those who don't understand the technology and ends up as the responsibility of those who don't understand the law."4

Costs related to eDiscovery are significant. A recent case cited an instance where eDiscovery costs were assessed at over $4.6 million.5 There are also potential case-ending sanctions, instances that are reported more and more often, and lesser sanctions, such as loss of privilege.6

If these factors - increasing data volumes, human review limitations and inefficiencies, and inefficiencies created by lack of ownership - are the cause of the high cost of review, what are some strategies companies can use to reduce these costs?

Pre-Litigation Strategies

Sun Tzu wrote that the best victory is that which is achieved before the fight. Those ancient words are just as true today: the best way to avoid eDiscovery costs is to address your information governance issues now. Many companies don't know where their data is, or how it is organized, backed up, or stored. Nightmare scenarios of "we just found a warehouse full of backup tapes" are usually accompanied by a knowing smile from their counsel, for whom this is an increasingly frequent occurrence. The best time to reduce the costs of eDiscovery is now, before an issue arises. Pre-litigation efforts can include sound document retention policies as well as using existing technology to the fullest.

A Stitch In Time Saves Nine

Even before a company addresses how and when it destroys documents, it should recognize that there is rarely a time when the creation of a document is mandated. Usually, employees sign up for their own defeat by creating documents during a crisis. These documents are often damaging, and are just as often factually inaccurate. In one company, after an unfortunate incident, an email was circulated describing in detail why the company was at fault. The email contained numerous factual errors, and there was no business need that spurred its author to create it. The existence of this email changed the reaction of the company to the incident, despite its falsehoods. To avoid similar situations, a company's crisis management policy should specifically discourage the creation of "summary" emails, and encourage in-person or telephone communication.

The term "document retention policy" is a euphemism. Alongside its crisis management policy, a company should have a policy describing how and when that company may destroy a document. The best "retention" policies will describe how documents are used within the company and set reasonable time periods after which different classes of documents will be destroyed. A company must then follow the policy. Little looks worse than a situation where a company has disobeyed its own policy and prematurely destroyed documents sought by an opposing party. On the other side of the equation, companies that keep documents beyond their policy expiration dates risk turning their retention policy into a sham and raising spoliation questions in every case.

Using Technology Pre-Litigation

The most difficult part of a document's lifecycle is categorization. In a perfect world, a document would be automatically categorized at its creation. That tag would then allow the company to take variability among document creators out of the equation for records management. Additionally, auto-categorization would allow the company to better implement its document retention policy, as documents could have "expiration dates" assigned to them behind the scenes as soon as they are created. Every document destroyed according to policy is a document that will not have to be preserved, identified, reviewed or produced.
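
The "expiration date" idea can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the category names, the retention periods, and the function names are all hypothetical, and the hold flag reflects the rule that retention schedules are suspended once a duty to preserve attaches.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods, in days, keyed by document category.
# A real schedule would come from the company's written policy.
RETENTION_DAYS = {
    "invoice": 7 * 365,
    "contract": 10 * 365,
    "routine_email": 90,
}


@dataclass
class Document:
    doc_id: str
    category: str  # assigned automatically when the document is created
    created: date


def expiration_date(doc: Document) -> date:
    """Return the date on which the policy allows the document to be destroyed."""
    return doc.created + timedelta(days=RETENTION_DAYS[doc.category])


def eligible_for_destruction(doc: Document, today: date, on_hold: bool = False) -> bool:
    """A document under a litigation hold is never eligible, whatever its age."""
    if on_hold:
        return False
    return today >= expiration_date(doc)
```

Because the expiration date is derived from the category alone, the same rule applies to every author's documents, which is the point of taking variability among document creators out of the equation.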

This is especially true for emails. Reviewing email is a necessary but time-consuming and costly effort. The signal-to-noise ratio for email is notoriously bad. Technology that allows users to effectively search or auto-categorize emails adds significant value for companies.

Litigation Efforts

Once litigation is reasonably anticipated, the duty to preserve attaches, and the eDiscovery process begins. Document "retention" policies must be suspended, and depending on what jurisdiction applies, written litigation holds must be issued.7 After ensuring that no relevant documents are destroyed going forward, companies engage in a process to understand the universe of those relevant documents. This process is called "early case assessment."

Beginning Stages

It is during the early case assessment (ECA) phase of litigation that technology must take over. The natural result of early case assessment is the meet-and-confer meeting between counsel.8 In the past, before technology allowed parties to understand the totality of their documents at such an early stage, the meet-and-confer was a patchwork of guessing - guessing which custodians might have relevant documents, guessing which keywords would return relevant and admissible documents, guessing how much the whole process would cost. But guessing is no longer necessary. Early case assessment technology provides such an advantage to effective users that in-house and outside counsel cannot afford not to use it.

Elementary ECA includes custodian identification, date-range culling, and data analytics. Conducted prior to the meet-and-confer, ECA allows counsel to know what search terms generate significant results, which custodians are most likely to have relevant documents, and whether there are "hot documents" within the corpus of documents the company has preserved.
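
Two of these elementary steps, date-range culling and per-custodian hit counts for a proposed search term, can be illustrated with a short Python sketch. The record layout (custodian, sent, text) and the function names are assumptions made for illustration; real ECA platforms work from processed, indexed collections rather than in-memory lists.

```python
from collections import Counter
from datetime import date


def cull_by_date(docs, start, end):
    """Keep only documents sent within the relevant period (inclusive)."""
    return [d for d in docs if start <= d["sent"] <= end]


def hits_by_custodian(docs, term):
    """Count, per custodian, the documents whose text contains the term."""
    counts = Counter()
    needle = term.lower()
    for d in docs:
        if needle in d["text"].lower():
            counts[d["custodian"]] += 1
    return counts


# Hypothetical preserved documents.
preserved = [
    {"custodian": "alice", "sent": date(2009, 3, 2), "text": "Tire pricing proposal"},
    {"custodian": "alice", "sent": date(2010, 6, 9), "text": "Revised tire pricing"},
    {"custodian": "bob", "sent": date(2010, 7, 1), "text": "Holiday schedule"},
    {"custodian": "bob", "sent": date(2010, 8, 5), "text": "Pricing for new dealer"},
]

in_range = cull_by_date(preserved, date(2010, 1, 1), date(2010, 12, 31))
report = hits_by_custodian(in_range, "pricing")
```

A report like this, produced before the meet-and-confer, shows which custodians and terms are worth negotiating over and which are not.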

In order to achieve this feat, technology has moved beyond keywords. Modern sophisticated search technology allows for "give me what I want, not what I asked for" searches. Often called "concept searching," this technology puts advanced mathematics to work, understanding the relationship between words in a document to assess their meaning, and the relevance of other documents within the corpus. For example, the use of its patented Probabilistic Latent Semantic Analysis allows Recommind, Inc.'s software to determine the use of the term "java" in different contexts. eDiscovery search is much more accurate when the technology being used is able to distinguish between documents containing the word "java" that are related to the computer language as opposed to those that relate to the Indonesian island or coffee, and can also decipher when both concepts are involved. What's most important for this technology is that it is specific-word agnostic. Searching for "car" would bring up a document about "Toyota" even though no terms are exact matches in the document.

By entering the meet-and-confer armed with actual data and a preliminary identification of hot documents, counsel can make much more efficient use of their time and can actually use the information proactively, challenging opposing counsel to use similar advanced technology to support its search efforts.

During Review

Review is the most expensive stage of eDiscovery. Simply put, lawyers are expensive. Any efficiencies created during the review process have a significantly positive return on investment. It is also during the modern review process that a mixture of technology and human review yields the best results. Modern review incorporates technology. Failing to use sophisticated review software is like playing a modern tennis match using a wooden racket. At one point, wooden rackets were de rigueur at Wimbledon. Today, they would be seen as ludicrous, and non-competitive in the extreme. Technology has simply moved us past that era. Today's search technology has moved us past linear human review and past simple keyword searching.

The modern review workflow must incorporate a seed set of relevant documents identified by subject-matter experts. Technology must then use those seed set documents as a guide to pull similar documents out of the rest of the document set. Human reviewers then review the computer-identified, potentially relevant documents. Those human decisions are in turn used to pull additional documents. This process repeats until the computer finds no more potentially relevant documents. Using this computer-human-computer process, counsel can identify all relevant documents after reviewing as little as 8 percent of the total documents. This represents a 92 percent reduction in the number of documents reviewed, and the review can be completed in a fraction of the time.9
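
The computer-human-computer loop described above can be sketched as follows. This is a toy illustration under stated assumptions: simple word overlap stands in for the far more sophisticated similarity measures that commercial review software uses, the human reviewer is represented by a caller-supplied function, and all names, the threshold, and the batch size are hypothetical.

```python
def tokens(text):
    """Split a document into a set of lowercase words."""
    return set(text.lower().split())


def similarity(a, b):
    """Word-overlap (Jaccard) similarity; a stand-in for a real relevance model."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def iterative_review(docs, seed_ids, human_says_relevant, threshold=0.15, batch_size=2):
    """Repeat: computer suggests, humans decide, computer learns from the decisions.

    docs maps document id to text. seed_ids are the expert-chosen seed set.
    human_says_relevant is a callable standing in for the human reviewer.
    Returns (relevant ids, ids actually reviewed by a human).
    """
    relevant = set(seed_ids)
    reviewed = set(seed_ids)
    while True:
        # Computer step: score each unreviewed document against known-relevant ones.
        candidates = []
        for doc_id, text in docs.items():
            if doc_id in reviewed:
                continue
            score = max((similarity(text, docs[r]) for r in relevant), default=0.0)
            if score >= threshold:
                candidates.append((score, doc_id))
        if not candidates:
            break  # nothing left that looks potentially relevant
        # Human step: review the highest-scoring batch and record the decisions.
        candidates.sort(reverse=True)
        for _, doc_id in candidates[:batch_size]:
            reviewed.add(doc_id)
            if human_says_relevant(doc_id):
                relevant.add(doc_id)
    return relevant, reviewed


# Hypothetical five-document collection with one seed document.
collection = {
    "a": "tire supply contract pricing dispute",
    "b": "pricing dispute over tire supply terms",
    "c": "lunch menu for friday",
    "d": "tire supply terms and exclusive dealer agreement",
    "e": "friday lunch order",
}
truly_relevant = {"a", "b", "d"}
found, looked_at = iterative_review(collection, ["a"], lambda i: i in truly_relevant)
```

In this small example the loop finds every relevant document without a human ever opening the two lunch emails. Documents the loop never surfaces are exactly the ones that must later be checked by sampling.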


Finally, any documents that have never been reviewed by a human being must be sampled. This involves reviewing a statistically valid sample of the unreviewed documents to achieve a high (95-99 percent) level of confidence that no relevant documents were missed. The higher the required confidence level, the more documents must be reviewed. Courts have not only said that sampling is allowable, but even that sampling is necessary.10
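
To see why higher confidence means more review, consider the standard sample-size formula for estimating a proportion, with a finite population correction. The sketch below is one common textbook approach, not a method prescribed by any court or by the software discussed here; the 2 percent margin of error and the function name are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.02, proportion=0.5):
    """Documents to sample so the estimated rate of missed relevant documents
    is within `margin` of the true rate at the given confidence level.

    proportion=0.5 is the most conservative assumption (largest sample).
    """
    if population <= 0:
        raise ValueError("population must be positive")
    # z-score for a two-sided interval, e.g. about 1.96 at 95 percent confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = z * z * proportion * (1 - proportion) / (margin * margin)
    # Finite population correction: small collections need fewer samples.
    n = n0 / (1 + (n0 - 1) / population)
    return min(population, math.ceil(n))


unreviewed = 1_000_000  # hypothetical number of documents never seen by a human
n_95 = sample_size(unreviewed, confidence=0.95)
n_99 = sample_size(unreviewed, confidence=0.99)
```

Under these assumptions the sample runs to a few thousand documents out of a million, and the 99 percent sample is noticeably larger than the 95 percent one. Note that the required sample grows very slowly with the size of the collection, which is what makes sampling affordable at all.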

Using a combination of people, process, and modern technology is the best way to reduce eDiscovery costs. Companies should "work the problem" from both ends to achieve the greatest cost savings. Documents should be properly categorized so that policy can be automatically applied, and technology should be used in the early stages of a case, as well as during the discovery phase itself, in order to maximize efficiency.

1 This reference and the ones that follow are taken from PC World's "Timeline: 50 Years of Hard Drives," at, last accessed May 18, 2011.

2 Anonymous AmLaw 100 Recommind customer, January 2011.

3 Maura Grossman & Gordon Cormack, "Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient than Exhaustive Manual Review," XVII Rich. J.L. & Tech. 11 (2011).

4 Craig Ball, The Perfect Preservation Letter, 2005, available at (last accessed May 18, 2011).

5 Race Tires of America v. Hoosier Racing Tire Corp., No. 2:07-cv-1294 (WD Pa. May 6, 2011), citing Lockheed Martin Idaho Technologies Co. v. Lockheed Martin Advanced Environmental Systems, Inc., 2006 WL 2095876 at *2 (D. Idaho July 27, 2006).

6 See, for example, DL v. District of Columbia, No. 1:05-cv-01437-RCL (DDC May 9, 2011).

7 Pension Committee of the University of Montreal Pension Plan v. Banc of America Securities, LLC, 2010 WL 184312 (SDNY Jan. 15, 2010).

8 See FRCP Rule 26(f). State courts often have similar meetings mandated by statute or court rules.

9 This time-saving result, even more than the cost-savings, has an important impact on internal and regulatory investigations, which are often conducted on a time-sensitive basis.

10 See, e.g., Victor Stanley v. Creative Pipe, Inc., 250 FRD 251 (D. Md. 2008); Mt. Hawley Insurance Co. v. Felman Production, Inc.

Howard Sklar is Senior Corporate Counsel at Recommind, Inc. Mr. Sklar represents Recommind to corporations and law firms. Prior to joining Recommind, he was Global Trade and Anti-Corruption Strategist at Hewlett-Packard Co., running HP's global anti-corruption compliance program and providing counsel on compliance with U.S. sanctions laws. Before HP, Mr. Sklar was Vice President, Compliance and Global Anti-Corruption Leader at American Express Co.
