Litigation has transformed over the past two decades from discovering documents in file cabinets to discovering a smoking gun in an email or Twitter posting. Nearly all information in today's digital world is created and maintained exclusively in electronic form. This deluge of data has created a document review tsunami.
Currently, keyword search is the primary means of document retrieval, even though various studies have concluded that keyword search has limited reliability at returning the most responsive documents in a litigation matter. However, condemning keyword search technology is unfair in this context, as keyword search was until recently the only viable means of retrieval. Furthermore, Rule 1 of the Federal Rules of Civil Procedure does not require perfection; instead, a balance of efficiency, cost, and justice is taken into account.
Fortunately, a new toolset called predictive coding is coming online. It stands to offer a solution for the overwhelming amount of data and is likely to transform some components underlying the economics of discovery. With the right business model, law firms and corporate counsel can implement an e-discovery or litigation readiness program using predictive coding. This approach can act as a source of revenue while also cutting out the middleman vendors and reducing expenses for the client.
Predictive coding is software that is trained by a user to predict which documents in a document set will be responsive and which will be non-responsive. Predictive coding goes by many names, including computer-assisted review and technology-assisted review.
Predictive coding aims to reduce the number of documents reviewed by ranking the documents according to a calculated level of responsiveness. Instead of looking at every email written by a custodian over a three-year time period, predictive coding uses a number of factors including keywords, writing style, subject matter of the writing, and even punctuation style to determine the chain of documents that are most relevant to the matter. These underlying programmable algorithms vary between software brands.
Technically speaking, predictive coding software is trained by a senior attorney or partner to look for documents similar to documents that the training attorney deems responsive. This is done by the attorney reviewing a relatively small “seed set” of documents and coding the documents as either responsive or not responsive. The properties within these documents are identified by the computer, which then uses those properties to determine the relevance of other documents.
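The train-then-rank loop described above can be illustrated with a minimal sketch. It does not represent how any particular predictive coding product works (the underlying algorithms vary between software brands). It assumes a simple naive Bayes word-count model that looks only at words, not writing style or punctuation, and the seed documents, function names, and sample text are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(seed_set):
    """Count words per label from (text, is_responsive) pairs coded by the reviewer.

    Assumes the seed set contains at least one responsive and one
    non-responsive document.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_responsive in seed_set:
        word_counts[is_responsive].update(tokenize(text))
        doc_counts[is_responsive] += 1
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocabulary


def responsiveness_score(text, model):
    """Log-odds that a document is responsive (naive Bayes, add-one smoothing)."""
    word_counts, doc_counts, vocabulary = model
    score = math.log(doc_counts[True] / doc_counts[False])
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocabulary:
            continue  # words never seen in the seed set carry no signal
        p_resp = (word_counts[True][word] + 1) / (totals[True] + len(vocabulary))
        p_non = (word_counts[False][word] + 1) / (totals[False] + len(vocabulary))
        score += math.log(p_resp / p_non)
    return score


def rank(documents, model):
    """Order unreviewed documents from most to least likely responsive."""
    return sorted(documents, key=lambda d: responsiveness_score(d, model), reverse=True)


# Hypothetical seed set, coded by the training attorney.
seed_set = [
    ("Please delete the pricing memo before the audit", True),
    ("The pricing agreement with the distributor must stay confidential", True),
    ("Lunch order for the team meeting on Friday", False),
    ("Reminder: the office is closed on Monday for the holiday", False),
]

# Hypothetical unreviewed documents to be ranked.
unreviewed = [
    "Holiday party lunch menu attached",
    "Keep the distributor pricing memo confidential",
    "Team meeting moved to Monday",
]

model = train(seed_set)
ranked = rank(unreviewed, model)
for document in ranked:
    print(f"{responsiveness_score(document, model):+.2f}  {document}")
```

In this toy example the distributor pricing memo rises to the top of the list because it shares vocabulary with the documents the attorney coded as responsive. A real seed set would be far larger, and every coding decision in it shifts the scores of all remaining documents.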
No technology is perfect, and the success of predictive coding hinges on a strong litigation support team and counsel that know the case and how to use the technology.
A senior attorney needs to get engaged early in the e-discovery process when using predictive coding software.
The effectiveness of predictive coding software hinges on the initial training of the software. This training must be performed by either a senior associate or a partner who is intimately familiar with the underlying facts and litigation strategy. While the notion of learning how to train such a system may send partners fleeing in the other direction, their concerns are misplaced, as some predictive coding software is intuitive to learn.
Partners or senior associates should be responsible for training the software because they are best acquainted with the facts and legal strategy, components critical to ensuring the training is performed in a defensible and reasonable fashion. Counsel should be warned that permitting a junior associate to manage and run the predictive coding software risks losing the case before it starts. Why? Unlike manual discovery, every tag is amplified in training, so counsel must be savvy about the issues and legal strategy in order to properly train the predictive coding platform. Limited experience and legal acumen on the part of the trainer translate into software that is likely to return non-responsive documents as responsive, which can result in substantial costs, sanctions, or defeat.
New skills are required of neutrals and experts when using predictive coding software.
The role of both neutrals and experts in the field of e-discovery is deviating from the current norms of e-discovery. A skilled lawyer or a retired judge with IT expertise and a mathematics or statistics background brings to the predictive coding process a skill set that benefits the parties, their attorneys and the court.
While such an expert may be invaluable, such expertise is not needed on a day-to-day basis. My experience suggests that a law firm or in-house team that has a strong litigation support staff can deploy and use predictive coding software with skill and comfort. Success often hinges on whether the firm changes its own e-discovery process and culture so that the partner or senior associate is on the front lines working with litigation support.
Just as the role of the partner in discovery is changing with predictive coding, so too can the way the partner views the role of discovery in the larger business strategy of the firm. While the art of the litigator and trial lawyer certainly is not going anywhere, the economic model upon which many law firms have been built is shifting. Pay scales are changing; new billing models are being tested; and the pillar of document review is starting to shift.
Discovery has evolved in the past 30 years to include a variety of technology and personnel that were previously unnecessary. This has transformed paper-based discovery into computer-based e-discovery. The model relies heavily on vendors whose role is the provision of imaging, hosting, processing and outputting of data into reviewable format for other vendors who specialize in providing document review teams.
Prior to the existence of predictive coding, keyword search was the only reasonable means of culling large electronic document sets. The results from keyword search had to be reviewed by low-cost document review teams, as this was the only economically viable option for law firms. The document sets returned from keyword search were (and are) generally very large. Law firms have high man-hour costs that a document review vendor can easily undercut. The option to review in-house simply was not cost effective or time efficient. But the methods and means of predictive coding are giving law firms a competitive edge over e-discovery vendors and document review teams.
This edge is a two-fold result of predictive coding. First, the training of software by a senior associate or partner familiar with the case is a higher initial cost. But those few hours upfront result in a much smaller responsive document set to review. This document set can then be reviewed by an in-house team of associates in a shorter time frame. This allows the law firm to bypass the e-discovery vendor and the outside document review teams. The same services can be provided to the client for a much lower cost. (Fewer documents to review means less time, and less time means it becomes a task that the law firm can manage in-house.)
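The arithmetic behind this first point can be sketched as a simple comparison. Every figure below (document counts, review pace, hourly rates, training hours) is a hypothetical placeholder chosen only to show the shape of the calculation; none comes from this article or from any real matter, and the function names are illustrative.

```python
def review_cost(documents, docs_per_hour, hourly_rate):
    """Cost of reviewing a document set at a given pace and hourly rate."""
    return documents / docs_per_hour * hourly_rate


def in_house_cost(training_hours, trainer_rate, documents, docs_per_hour, associate_rate):
    """Upfront training by a senior lawyer plus associate review of the culled set."""
    training = training_hours * trainer_rate
    return training + review_cost(documents, docs_per_hour, associate_rate)


# HYPOTHETICAL inputs: a large keyword-search result set reviewed by an
# outside team at a low hourly rate.
vendor_total = review_cost(documents=100_000, docs_per_hour=50, hourly_rate=60)

# HYPOTHETICAL inputs: a few hours of partner training, then in-house review
# of a much smaller set at a higher associate rate.
firm_total = in_house_cost(
    training_hours=10,
    trainer_rate=500,
    documents=10_000,
    docs_per_hour=50,
    associate_rate=250,
)

print(f"Outside review: ${vendor_total:,.0f}")
print(f"In-house with predictive coding: ${firm_total:,.0f}")
```

The comparison depends entirely on the inputs. If the culled set is not much smaller than the keyword-search set, or if training takes far longer than assumed, the in-house figure can exceed the outside review figure.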
Second, by bypassing the outside vendors, the law firm can now reap the revenue stream that comes from providing this service to its clients. This revenue stream allows the partner who specializes in discovery and litigation strategy to turn his or her role from a necessary cost into a source of revenue. For example, in a larger commercial litigation involving hundreds of gigabytes of data, the partner overseeing the case can redirect the client expense of processing and review to a revenue stream of billable hours. The charge to the client will cover the hours the partner or senior associate spends training the software, and then the review of the final responsive documents.
Other benefits include more control over the e-discovery process. As many attorneys have seen, the back-and-forth that comes with communicating with outside vendors can range from effective to time consuming to costly. Not only do the outside vendors have to communicate to the lawyers the information they have discovered, but lawyers will have to ask a multitude of questions to fully understand what was and was not discovered. In addition, the attorneys at the firm still need to review the most responsive documents themselves. All of these tasks add time and cost for the client, and all can be streamlined with the use of predictive coding. It will also mean one less conversation with the client explaining why it is critical that they hire an expensive outside vendor and why they should pay for the added costs of review, when they are already paying the firm for the attorneys’ time.
Predictive coding can be implemented in a variety of ways, and it is not an all-or-nothing process. As attorneys and clients get used to the technology, many firms will find that they prefer to implement it on a step-by-step basis before installing it as a full-service, enterprise-wide method of managing discovery. This incremental adoption will give attorneys and litigation support staff time to understand the usefulness and revenue streams that can come from the implementation of predictive coding.
This space is sponsored by Equivio as part of its “Predictive Coding Educational Initiative.” Equivio invites industry observers and practitioners to submit articles that contribute to the understanding and use of predictive coding technology. Proposed topics for articles should be submitted to firstname.lastname@example.org.
Daniel B. Garrie, Esq. has a BA and MA in computer science. He is an e-Discovery Neutral and Special Master with Alternative Resolution Centers (www.arc4adr.com), and is a Partner at Law & Forensics LLC (www.lawandforensics.com), a boutique legal strategy and forensics firm consulting across industries to address privacy, e-discovery and forensic issues. He is a thought leader in the fields of information security, forensics, e-discovery, information governance and digital privacy. Mr. Garrie has published over 100 articles and is recognized by several Supreme Court justices for his legal scholarship. He is the editor-in-chief of the Journal of Law and Cyber Warfare. Mr. Garrie also co-authored the treatise Dispute Resolution and e-Discovery, published by Thomson Reuters in 2011. He is admitted to practice law in New York and New Jersey. The author would like to thank Yoav Griver and Candice Lang for their assistance with this article.