Predictive Coding And Patented Workflow: A Defensible E-Discovery System

Monday, March 26, 2012 - 14:11

The Editor interviews Howard Sklar, Senior Counsel, Recommind Inc.

Editor: Please tell us about Recommind’s Predictive Coding Education, Training and Certification Program. We understand it is the industry’s first and only formal education program in this area.

Sklar: We’re very excited about this program, which is the first such program to bring together a wide range of resources to educate your readers on Predictive Coding. The program includes bimonthly webinars, including a recent webinar via Virtual LegalTech. We have an ongoing, highly successful seminar series taking place in major U.S. cities, featuring prominent guest speakers like Duke Alden from Aon Corporation. Recommind will offer in-house counsel-focused CLE credit-approved events with Morgan Lewis & Bockius in May and we’re going to host executive roundtable discussions in Boston, New York and Atlanta. And finally, we have in-person training scheduled in Boston and San Francisco with more cities to come as well as a certification program starting this spring.

Our emphasis on education is a natural extension of having successfully established Predictive Coding as the “go-to” technology for document review, and Recommind’s preeminence and experience in this field are unmatched. The legal community needs qualified resources that enable practitioners to understand the substance and benefits of Predictive Coding – both as a technology and as part of our broader, patented workflow designed to pair with knowledgeable attorneys and create defensible, accurate and efficient e-discovery processes.

Editor: How important is it for judges, attorneys and litigation support professionals to have hands-on experience with Recommind’s system?

Sklar: It’s important for a couple of reasons. The first derives from Judge Peck’s recent decision in Monique Da Silva Moore, et al., v. Publicis Groupe and MSL Group, which highlights the importance of a defensible workflow and the need for judges and attorneys to understand the operation and inherent efficiencies of Predictive Coding. Hands-on experience is a major goal of our outreach and education program.

Our seminar series is intended for judges, litigators and their support staff, all of whom have varying degrees of familiarity with e-discovery processes. Some members of the judiciary, such as federal judges Andrew J. Peck, Paul W. Grimm and Shira A. Scheindlin, already know the technology well, but there appears to be a need for education on Predictive Coding at other levels of the judiciary, such as the state level, where a majority of litigation occurs. The full complement of our educational efforts includes CLE courses, seminars and webinars for in-house counsel, and we view education not only as a critical goal unto itself but also as an excellent business tool. When you see the technology and workflow at work you can’t help but be impressed.

Editor: Will Judge Peck’s decision in Moore create convergence and hence greater order in the explosive e-discovery landscape?

Sklar: While this decision is very important, it only solidifies what Recommind has been espousing and what our long-term clients already know, i.e., that Recommind has a defensible workflow and that Predictive Coding technology allows them to do a significantly better job of review than the current gold standard: keyword search in linear review. One colleague at a law firm calls it the “fool’s gold standard,” but the keyword method nevertheless remains the accepted workflow.

We’ve maintained for a long time that a validating case, such as Moore, would be a great asset, and it might drive new users toward Predictive Coding technology. It’s important because for better or for worse, some people, particularly late adopters, want to see some sort of judicial imprimatur on the workflow. We’ve heard of judges asking difficult litigators or adversaries who can’t agree: why aren’t you using Predictive Coding? The Peck opinion certainly has raised awareness about the technology, and it broke ground for those who needed a judicial opinion in order to feel comfortable with adoption.

Editor: Will companies and law firms have enough confidence to establish policies and procedures adopting Predictive Coding technologies?

Sklar: They will, and we already have a large client base that uses Predictive Coding. The legal world is a risk-averse place; thus, companies and firms can take comfort in the fact that they are in good company and that there’s now an official stamp on Predictive Coding. It is important to remember that Predictive Coding is not just a technology, but rather one that is combined with knowledgeable people and a defensible workflow. This combination plus increased judicial acceptance should drive higher confidence in Recommind’s complete system.

Editor: Recommind has the distinction of having patented its Predictive Coding process. Have other firms sought licenses or tried to engineer around it?

Sklar: We’ve seen companies trying to catch up by cobbling together a preexisting capability, such as de-duping technology, and simply re-labeling it as Predictive Coding. Such practice presents some obvious disconnects and often reflects a company’s lack of foresight. For instance, in late 2010, some eDiscovery providers were questioning whether Predictive Coding was ready for prime-time, and those same companies were promoting their Predictive Coding solutions about a year later at LegalTech 2012. Thus, there is increasing industry recognition that Predictive Coding is the future, and our competitors are working on ways to get in – even if they can’t yet offer the “real thing.”

Editor: How can client companies looking for e-discovery solutions recognize the real thing?

Sklar: The simplest answer is to say that if you want a true Predictive Coding solution, then you need to have Recommind on the label. We already have spent significant time in development and then years in production, and while other companies are scrambling to mimic our services, we remain far ahead of the pack in terms of experience in Predictive Coding and in the critical workflow that accompanies the technology.

In distinguishing and evaluating e-discovery solutions, we recommend that clients ask potential vendors pointed questions about specific capabilities. For example: how do you handle new documents that arrive in the middle of a review? Our review technology, called Axcelerate®, can handle incoming documents on the fly, whereas competing technology may be less sophisticated and thus may require a re-indexing of all the documents, from scratch, each time new documents are received.

Another strategy for assessing options is to force potential vendors to go off-script during product demos. Here, they might be at a loss to explain what will happen in a given situation, so ask them to talk about a detailed aspect, perhaps of their underlying technology, that’s not on their own agenda for the demo. Recommind’s underlying and patented technology – called Probabilistic Latent Semantic Analysis (PLSA) – was designed to improve on other existing search technologies. Other companies’ underlying technologies simply aren’t as robust, and this shortcoming is not something they will want to highlight – unless asked. Thus, the devil is in the details when making vendor assessments. When you force people off script, you see what’s real and what’s not.

Editor: Can you take us through the steps of a complex document review process using Recommind’s technology and patented workflow?

Sklar: The process starts with running standard tools, such as de-duping and filtering by custodian or date, to limit the corpus as much as possible before the documents are entered into a review platform. The initial goal is to create a seed set, which is a group of documents that are likely to be relevant within identified criteria. This seed set may be created in order to respond to a litigation subpoena or perhaps to gather documents in a subject area of interest within enforcement or regulatory proceedings, such as for anticorruption or antitrust.
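As an illustration of the culling step described above, here is a minimal sketch in Python. It is not Recommind's implementation: the document fields (`text`, `custodian`, `date`) and the function name `cull` are assumptions made for the example, and only exact duplicates are detected, by hashing the document text.

```python
import hashlib
from datetime import date

def cull(documents, custodians, start, end):
    """Keep in-scope documents (by custodian and date), then drop exact duplicates.

    `documents` is a list of dicts with hypothetical keys
    "text", "custodian" and "date" (a datetime.date).
    """
    seen_hashes = set()
    kept = []
    for doc in documents:
        # Scope filter first, so an out-of-scope copy cannot shadow an in-scope one.
        if doc["custodian"] not in custodians:
            continue
        if not (start <= doc["date"] <= end):
            continue
        # Exact-duplicate detection via a hash of the text.
        digest = hashlib.sha256(doc["text"].encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        kept.append(doc)
    return kept
```

Filtering before de-duplicating is a deliberate choice in this sketch: it avoids discarding an in-scope document merely because an identical copy belonged to a custodian outside the review.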

Creating this seed set of documents can be accomplished by using our robust search and analytics technology to do a keyword-agnostic search, i.e., a non-keyword-dependent search that can find the concept of bribery within documents, rather than identify only those containing the specific word “bribery.” I’ve never encountered an email containing the word “bribe,” so any effective system must be able to comprehend the imagination of people who are writing emails with the intention of hiding illicit activity. Therefore, the keyword method, in this case, may eliminate better, more subtle options that are critical to success, and creating effective seed sets very often requires involvement by knowledgeable counsel.

From there, the process involves asking the system to find and then prioritize relevant documents based on concepts established in the seed set. We employ an iterative process wherein the computer makes initial suggestions and prioritizes documents, and then the subject matter expert reviews the result and makes yes/no decisions that are fed back into the system. The computer now has a more refined seed set, based on knowledgeable human decisions, from which to suggest additional relevant documents. This iterative process has proven to be incredibly efficient.
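The iterative loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the scoring here is a simple shared-term count standing in for a real concept-based engine (it is not PLSA and not Recommind's method), and the names `iterative_review`, `reviewer` and `batch_size` are hypothetical.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def score(tokens, relevant_terms, irrelevant_terms):
    # Terms seen in relevant documents add to the score; terms seen in
    # rejected documents subtract from it. Counter returns 0 for unseen terms.
    return sum(relevant_terms[t] - irrelevant_terms[t] for t in set(tokens))

def iterative_review(corpus, seed_ids, reviewer, batch_size=2, max_rounds=10):
    """corpus: dict of id -> text; reviewer: callable id -> bool (the human decision)."""
    relevant, rejected = set(seed_ids), set()
    relevant_terms, irrelevant_terms = Counter(), Counter()
    for doc_id in seed_ids:
        relevant_terms.update(set(tokenize(corpus[doc_id])))

    for _ in range(max_rounds):
        unreviewed = [i for i in corpus if i not in relevant and i not in rejected]
        if not unreviewed:
            break
        # The system prioritizes: highest-scoring unreviewed documents first.
        ranked = sorted(
            unreviewed,
            key=lambda i: score(tokenize(corpus[i]), relevant_terms, irrelevant_terms),
            reverse=True,
        )
        hits = 0
        for doc_id in ranked[:batch_size]:
            terms = set(tokenize(corpus[doc_id]))
            if reviewer(doc_id):           # yes/no decision fed back into the model
                relevant.add(doc_id)
                relevant_terms.update(terms)
                hits += 1
            else:
                rejected.add(doc_id)
                irrelevant_terms.update(terms)
        if hits == 0:                      # a batch with no relevant documents: stop
            break
    return relevant
```

Stopping on the first empty batch is a simplification for the example; in practice the decision to stop is a legal judgment supported by sampling rather than a hard-coded rule.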

At some point, there will be a dramatic drop-off in new relevant documents produced by the system – from sets of 80 percent relevance to those with less than one percent relevance. The workflow involves validation components with statistical sampling that can give clients the confidence required to determine that they accomplished a reasonable, defensible search for documents.
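One common way to do this kind of statistical validation is to draw a random sample from the documents the system did not surface, have a reviewer code the sample, and estimate what fraction of the remainder is relevant. The sketch below assumes that approach and uses a Wilson score interval; the function name, the default 95 percent confidence level (`z=1.96`) and the `reviewer` callback are assumptions for illustration, not a description of Recommind's validation component.

```python
import math
import random

def estimate_remaining_relevance(unreviewed_ids, reviewer, sample_size, z=1.96, seed=0):
    """Estimate the share of relevant documents left in the unreviewed set.

    Returns (point_estimate, lower_bound, upper_bound) using a Wilson score interval.
    """
    population = list(unreviewed_ids)
    if not population or sample_size <= 0:
        raise ValueError("need a non-empty population and a positive sample size")
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample = rng.sample(population, min(sample_size, len(population)))
    n = len(sample)
    hits = sum(1 for doc_id in sample if reviewer(doc_id))
    p = hits / n

    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)
```

If the upper bound of the interval is low, counsel has a documented, quantitative basis for deciding that further review is unlikely to turn up many more relevant documents.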

Editor: Can each firm identify its own statistical endpoint?

Sklar: Yes, and it is typically within legal counsel’s purview to make this decision. We’re always careful about this part of the process, though our high level of client integration enables us to be very supportive. While we will advise from a technical perspective, identifying the endpoint of an e-discovery process is a legal decision.

Editor: At what point does it require human intervention in order to proceed?

Sklar: Knowledgeable and experienced attorneys are integral throughout the process, both with seed set creation and working with quality assurance and control processes that are inherent to the system. The technology makes human beings more accurate by reducing time spent on reviewing irrelevant documents, thereby circumventing the natural human tendency to lose focus when an activity becomes less productive. Put differently, a hit rate of 80-90 percent relevance will keep a reviewer alert and more actively engaged; thus, even we were a little surprised to witness how our system, which is known to produce batches with over 90 percent relevance, can make the reviewers themselves more accurate.

Editor: What other discovery-related matters can leverage Recommind’s services?

Sklar: We’ve talked about outgoing discovery for production to another party, but there’s also incoming discovery, which involves processing data received on a disk or via access to a database. Here the goal is to discover what’s in your opponent’s documents quickly and efficiently, to determine whether the production met your discovery requests, and to develop legal strategies and case analysis.

Our processes also serve from an internal compliance or investigation perspective, i.e., for reviewing your own documents and not for production externally. A number of enforcement and regulatory agencies use our technology, which may result in the agencies knowing more about corporate documents than the company itself knows. Thus, from a risk mitigation or defensive position, some of our clients use us because their regulators use us. Corporate clients recognize that there are inherent dangers when a regulator says “we’ve found this in your documents, why haven’t you?”

Editor: Will Predictive Coding become the industry standard?

Sklar: We’ve seen significant adoption already, as companies and firms that are using Recommind’s Predictive Coding understand that this is defensible technology. The Moore decision can only help the industry as a whole. It’s an important decision from a legal perspective and may also open the proverbial floodgates toward greater adoption of Predictive Coding.
