Feedback From The Predictive Coding Trenches At LegalTech® 2013: Moving From “Is It Defensible?” To “What Are Best Practices?” In 12 Months

Tuesday, February 26, 2013 - 08:40

At LegalTech® New York 2013, D4 hosted several Executive Roundtable discussions, including one on predictive coding. The predictive coding discussion drew a lot of attention and featured Warwick Sharp from Equivio, Jay Leib from kCura and Tom Groom from D4.

The Editor reconvened the panelists for a Q&A session, focusing on ways this intriguing technology is transforming the way lawyers conduct electronic discovery.

Editor: Please provide us with a brief definition of “predictive coding” for those who may not be familiar with it.

Groom: “Predictive coding” is a technology-assisted review workflow that efficiently ranks documents in terms of relevance. A subject-matter expert (SME) reviews a small number of documents until the system is “trained” – that is, until it learns how to distinguish documents that are relevant from those that are not. A “relevance score” is then assigned to each document, and informed decisions can be made regarding how to manage those documents.
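To make the train-then-score loop described above concrete, here is a minimal sketch in Python. It is an illustration only, not the method used by any product mentioned in this interview: it assumes a simple naive Bayes word-count model as a stand-in for a commercial engine, and the function names, sample documents and document IDs are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical, deliberately crude tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(labeled_docs):
    """Build word counts from (text, is_relevant) pairs coded by the SME."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in labeled_docs:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(text, model):
    """Return an estimated probability (0 to 1) that the document is relevant."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_prob = {}
    for label in (True, False):
        # Class prior with add-one smoothing.
        lp = math.log((doc_counts[label] + 1) / (total_docs + 2))
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                lp += math.log((word_counts[label][word] + 1) / denom)
        log_prob[label] = lp
    # Turn the two log-probabilities into P(relevant); clamp to avoid overflow.
    diff = max(min(log_prob[False] - log_prob[True], 700.0), -700.0)
    return 1.0 / (1.0 + math.exp(diff))


# Hypothetical seed set coded by the SME.
seed_set = [
    ("merger pricing agreement with competitor", True),
    ("confidential pricing discussion before merger", True),
    ("lunch menu for friday", False),
    ("office holiday party schedule", False),
]
model = train(seed_set)

# Hypothetical unreviewed documents, each of which receives a relevance score.
unreviewed = {
    "DOC-1": "pricing terms for the merger",
    "DOC-2": "friday party lunch",
}
scores = {doc_id: relevance_score(text, model) for doc_id, text in unreviewed.items()}
```

In this toy example DOC-1 scores above 0.5 and DOC-2 below it. A real project would involve far larger seed sets, iterative training rounds and statistical validation of the results.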

Editor: How predominant is the use of this technology?

Groom: We’ve been following predictive coding closely since it was first introduced to the e-discovery market in 2009. Initially, usage was slow. Then, in 2012, court approval and positive case studies of early adopters changed the usage landscape. This dramatic development promoted predictive coding to the e-discovery mainstream. Just anecdotally, of the 20 or so law firm and corporate lawyers who were present at our LegalTech roundtable, more than half had used predictive coding in at least one project, and a few had used it more than a half-dozen times. A year ago the picture would have looked very different. Given the rapid upsurge in usage rates, I expect the picture will be very different once again at LegalTech next year.

Editor: How are lawyers using predictive coding?

Leib: Lawyers are using predictive coding in a number of ways, but at this point, the most common uses are culling and prioritized review – that is, lawyers make decisions about which documents to subject to an eyes-on review based on the ranking of the documents. Similarly, they may decide not to review those documents with a very low ranking in order to save time and costs. This approach allows firms to concentrate their efforts on the documents with a high ranking that are more likely to be relevant and important to the case.
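The culling and prioritized-review decisions described above can be sketched in a few lines of Python. This is a hedged illustration, not a recommended protocol: the `plan_review` function, the document IDs, the scores and the 0.30 cutoff are all hypothetical, and in practice the cutoff is a legal and statistical judgment rather than a fixed number.

```python
def plan_review(scores, cutoff):
    """Split scored documents into a prioritized review queue and a set-aside pile.

    scores: dict mapping document ID to a relevance score between 0 and 1.
    cutoff: documents scoring below this value are not sent for eyes-on review.
    """
    # Highest-ranked documents come first so reviewers see likely-relevant material early.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    review_queue = [doc_id for doc_id, score in ranked if score >= cutoff]
    set_aside = [doc_id for doc_id, score in ranked if score < cutoff]
    return review_queue, set_aside


# Hypothetical relevance scores produced by a trained system.
example_scores = {"DOC-A": 0.92, "DOC-B": 0.08, "DOC-C": 0.55, "DOC-D": 0.31}
review_queue, set_aside = plan_review(example_scores, cutoff=0.30)
```

Here the review queue is ordered DOC-A, DOC-C, DOC-D, and DOC-B is set aside. Prioritized review alone corresponds to using the ordering without a cutoff; culling corresponds to acting on the set-aside pile.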

Editor: How do last year’s concerns about using predictive coding compare with what you heard this year?

Sharp: Last year, everyone at LegalTech was asking “Is predictive coding defensible?” It was all about defensibility. Courts had just started to address predictive coding in published opinions, and we weren’t sure which direction things were going to go. A year ago, many people were still hesitant to use the technology for fear of challenges from the court or opposing counsel, and the potential risk of satellite litigation around e-discovery procedures. Today a very different question is being asked. With the acceptance of predictive coding technology by the courts, the market has transitioned very quickly from the “whether to” question to the “how to” question. We had a session on best practices in predictive coding at LegalTech that was standing-room only. Rather than raising defensibility concerns, people are asking how to conduct a predictive coding project. They want to understand the basic concepts, and they want practical advice – project checklists, technology differentiators, best practices, skill requirements, critical success factors and so on.

Editor: The legal market has traditionally been slow in adopting new technologies. Is the attitude toward predictive coding different?

Groom: In my view, the mindset in the legal industry regarding predictive coding is different from what we've seen during previous waves of technology. The key reason for this is probably that predictive coding is very intuitive. Once people come to terms with the concept of software that has the ability to learn and imitate human decisions, it simply makes a lot of sense to encode the system with the understanding and intelligence of those most knowledgeable about the matter at hand. In addition, people seem to recognize, perhaps more than in the past, that technology is just one part of the puzzle.

At our LegalTech roundtable, one litigation support manager at an AmLaw 100 firm stated this very succinctly. He said that his charter is not to figure out ways to use technology, but to re-engineer the discovery process. This is a very insightful comment because it acknowledges a key truth about predictive coding – namely, that without an appropriate process in place to facilitate the correct and appropriate use of the technology, the technology itself is of no use to anyone. This is where e-discovery service providers have a key role to play. End users are looking to leverage the predictive coding experience of leading service providers and to receive guidance, templates and methodologies to ensure the success of their predictive coding initiatives. To the extent that service providers are able to deliver on this consulting promise, the way will be cleared for the more rapid uptake of the technology.

Sharp: I would add that the “proof of the pudding is in the eating.” E-discovery professionals and litigators are comfortable with the idea of a “safe sandbox” in which they can try the technology. The sandbox gives people an environment in which they can run the technology against historical or even live data, allowing potential users to become familiar with the flow and to test outcomes in a controlled manner, before moving forward with broader formalized adoption. The sandbox approach, which is supported by innovative and understanding providers, recognizes that end users – namely lawyers themselves – need to try and test predictive coding technology, and work through the methodology for its use and application, before adopting it as the standard in their environment.

Editor: Is there a path of least resistance in adopting predictive coding?

Sharp: Predictive coding is not a monolith – it’s not all or nothing. For instance, we can distinguish between use scenarios that are court-facing – such as culling, followed by decisions about what documents will not be reviewed – and other scenarios, such as early case assessment, review of an incoming production or prioritized review, that are not court-facing. First-time users will often initially deploy the software in these low-profile scenarios. This strategy minimizes risk and allows users to assimilate knowledge and test their methodology. Typically, firms progressively broaden their use of the technology as they gain familiarity and confidence.

Editor: Are there any words of wisdom for what not to do if you are trying to build a case within your firm or company for using predictive coding?

Leib: One needs to be cautious about using predictive coding on a case previously reviewed using a traditional, document-by-document linear review or search terms. This needs to be handled carefully because it can potentially reveal problems that existed in the original review protocol and thus confuse the results of the predictive coding technology. Many people find that a smaller case or an inbound production can be a good place to start. These situations can be ideal arenas for predictive coding novices.

Editor: Must the subject matter expert or “SME” who reviews the sample set be a senior partner?

Leib: No, not necessarily, although the SME assigned to review the initial sample must “have authority” and be a “domain expert” – that is, he or she must be empowered by the client and law firm to make the relevance decisions that will feed the predictive coding system. Given the ability of the software to imitate the SME’s decisions, the SME’s input can have a significant impact on the subsequent handling of the case.

The SME might be anyone from a contract-review lawyer to a senior partner, as long as the authority exists, irrespective of the title or role. Nonetheless, there is a strong argument for having senior lawyers on the case act as SMEs in reviewing the initial set. We have observed excitement for predictive coding build during an SME review, when senior lawyers actually prolonged the document-by-document relevance discussion so they could continue talking about the documents. This is an intriguing development and could mean that the SME review is an opportunity for case lawyers to refine their thinking about the dispute. It is also an important response to those who worry that the initial SME review takes senior lawyers out of circulation for too long, because it may turn out that this is a critical and even essential use of senior lawyer time.

Warwick Sharp is Vice President of Marketing and Business Development at Equivio and a Co-Founder of the company. Equivio is a software technology company that develops analytical solutions for e-discovery. Jay Leib is kCura’s resident computer-assisted review expert. He has been a speaker on computer-assisted review for a number of events and has authored several articles and white papers on the process. Tom Groom is Vice President and Senior Discovery Engineer at D4. He is a recognized e-discovery expert with more than three decades of experience in information technology, litigation support, document review and e-discovery methodologies.

Please email the interviewees with questions about this interview.