Predictive Coding Applications In Managed Review

Monday, August 1, 2011 - 01:00

The Editor interviews Sanjay Manocha, Landmark Discovery.

Editor: Sanjay, tell us about Landmark Discovery.

Manocha: Landmark Discovery is a unique managed review company that designs and delivers premier review solutions, bringing advanced technologies together with specialized attorney review teams to drastically reduce the costs associated with document review. We work primarily with Fortune 500 companies and Am Law 200 law firms.

Not only do we execute on managed document review engagements for our clients, but we also partner with them in the early stages, in an advanced, early case assessment-type capacity, to assist them in planning the most effective, comprehensive and defensible review strategy possible for their particular matter. Technology is critical to all this, but it has to be the right technology. Landmark Discovery has a very technology-savvy team, and we place a lot of emphasis on selecting the best technologies out there and on building them into the discovery process. Our deployment of Equivio's predictive coding in a recent DOJ antitrust matter is a case in point. As we saw in this matter, this approach, properly applied, can save tremendous amounts of time and money.

Editor: What do you see to be the primary pitfalls with the current approaches to managed document review?

Manocha: Primarily, at the very outset of current approaches, counsel bears the risk of imperfect knowledge of their matter, which invariably leads to unnecessary review costs and potentially severe sanctions. Sophisticated counsel might expend tremendous effort in trial and error towards reducing review costs based on their limited knowledge about key custodians and keywords; others may not even bother and send the entirety of the data to review. While the intent of the former is well placed, they run the risk of a poorly planned review that will constantly require costly adjustments as new information is discovered, as well as potentially severe sanctions for failure to produce responsive documents; the latter will face enormous review costs. Neither will have certainty about the ultimate costs of the review or the merits of the case until the review is nearly complete.

Recent examples illustrate another, more easily avoidable pitfall of the current approach: failure to supervise a document review. Often, in-house attorneys with little or no experience supervising or managing a document review with contract attorneys find themselves in the unenviable position of doing so. Working with a managed review provider allows law firms to delegate management to professionals, and so this is a step in the right direction. However, doing so does not delegate a law firm's ethical obligation to supervise the work performed. It is vital, therefore, that when law firm attorneys decide to enlist a managed review provider, they have a conversation with their provider about how best to address their supervisory obligations, and that they follow through on that plan.

Editor: In the face of these challenges - out of control costs and ineffective document review - is there a way forward?

Manocha: Current technologies, such as de-duplication, near-duplicate detection and email threading, have become and will continue to be an indispensable means of reducing data volumes and discovery costs. But the next generation of technology - predictive coding - offers the ability to inform overall discovery strategy. This is where the industry needs to go. Predictive coding technologies are a powerful step forward because they inform a more holistic strategy based not just on the byte-level information obtainable from a document group, which a machine is certainly better equipped to handle than a human, but also on a substantive level, that is, a likelihood of responsiveness - which document sets are likely to have probative value and which sets are likely to have none? And for the documents that are likely to have probative value, what can counsel learn from them? What can counsel extrapolate from the features exhibited by likely responsive document populations and how do those features comport with counsel's understanding of the matter? And most importantly, what can counsel learn by looking at selected responsive documents?

Beyond early case assessment, the way forward addresses the bottom line - select predictive coding technologies, folded into a carefully designed process, are capable of overhauling the current document review model and of bringing unprecedented flexibility, strategy and cost savings. What are the projected costs versus benefits associated with retrieving the responsive document population? How can we invoke reasonableness and proportionality to minimize costs before review even commences?

Our Structured Review approach is based on widely accepted, repeatable and transparent processes. Essentially, it is about breaking down data collections into subsets and allowing counsel to apply the rule of proportionality by assessing proposed expenditures against the potential benefits of reviewing a particular subset, while also providing counsel with the statistical foundation to support his or her decisions. Results are almost always in line with expectations and savings are always achieved, anywhere between 15 percent and 70 percent, depending on client objectives and the nature of the data.
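As a rough illustration of the subset-by-subset proportionality assessment described above, the following Python sketch compares the expected review spend per responsive document across subsets. It is not Landmark Discovery's actual method: the subset names, document counts, responsiveness rates, per-document cost and spending threshold are all hypothetical figures chosen for the example, and the helper names `cost_per_responsive` and `plan_review` are invented here.

```python
from dataclasses import dataclass

@dataclass
class Subset:
    name: str
    documents: int              # number of documents in the subset
    est_responsive_rate: float  # hypothetical rate estimated from a reviewed sample

def cost_per_responsive(subset: Subset, cost_per_doc: float) -> float:
    """Expected review spend for each responsive document found in the subset."""
    expected_responsive = subset.documents * subset.est_responsive_rate
    if expected_responsive == 0:
        return float("inf")  # nothing to find, so any spend is disproportionate
    return subset.documents * cost_per_doc / expected_responsive

def plan_review(subsets, cost_per_doc, max_cost_per_responsive):
    """Split subsets into those worth full review and those to sample only."""
    review, sample_only = [], []
    for s in subsets:
        if cost_per_responsive(s, cost_per_doc) <= max_cost_per_responsive:
            review.append(s.name)
        else:
            sample_only.append(s.name)
    return review, sample_only

# Hypothetical collection of 500,000 documents split into three subsets.
subsets = [
    Subset("high-scoring", 100_000, 0.60),
    Subset("mid-scoring", 250_000, 0.10),
    Subset("low-scoring", 150_000, 0.004),
]

review, sample_only = plan_review(
    subsets, cost_per_doc=1.0, max_cost_per_responsive=50.0
)
for s in subsets:
    print(f"{s.name}: {cost_per_responsive(s, 1.0):.2f} per responsive document")
print("full review:", review)
print("sample only:", sample_only)
```

The threshold is a judgment for counsel rather than something the arithmetic can supply; the sketch only makes the trade-off for each subset explicit.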

Editor: How has the Structured Review Program helped your clients? Can you provide some real-life examples?

Manocha: Our Structured Review Program overhauls the current document review process, and can be applied in almost any review context - in civil litigation, for outgoing productions and productions received, as well as in government and internal investigations.

For example, in a recent DOJ antitrust investigation, we designed a Structured Review solution, utilizing Equivio>Relevance, for one of our clients who was looking for a more reliable and efficient means than keywords to cull non-responsive data from the collection. In a previous related investigation, it had taken weeks of testing and sampling keywords in negotiations with regulators to narrow down the collection to an agreeable review population. Having collected an even larger population under a significantly broader production request, the client proposed and obtained the approval of the DOJ to enlist all documents returned by the DOJ's proposed keywords in a Structured Review approach, thus obviating the need for weeks of ad-hoc sampling and negotiations over the keywords list. Nearly 500,000 documents were identified for the SRP engagement, and an aggressive production deadline was set.

In less than a week, we had completed data structuring, identified the cull population, and settled on a plan, documented with statistical foundation in a Structured Review Program report, to execute the review strategy in three pieces: an internal review by law firm associates, a prioritized review managed by Landmark Discovery, and finally, review of representative document samples to cull the collection by more than 30 percent.

The results projected in the Structured Review Program report were confirmed in our post-mortem analysis. In the second week of the six-week review, our teams had discovered nearly 50 percent of the responsive documents in the population. More than 77 percent of the documents reviewed by the client's in-house attorneys were responsive. Less than 0.5 percent of the documents reviewed in the representative samples were responsive. On top of all this, we achieved significant cost savings for the client, having reduced the number of non-responsive documents to review, increased reviewer productivity through specialized workflows, and eliminated a number of redundancies, such as the need for first-pass review of the documents reviewed by the client's in-house attorneys.
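To show how a figure such as "less than 0.5 percent responsive" in a representative sample can support a decision to cull, here is a minimal Python sketch that puts a confidence interval around a sample rate. The interview does not say which statistical method was used; the Wilson score interval is simply one standard choice, and the sample size, hit count and culled-population size below are hypothetical.

```python
import math

def wilson_interval(hits: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = hits / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical figures: 8 responsive documents in a random sample of 2,000
# drawn from a culled set of 150,000 documents.
sample_size = 2_000
responsive_in_sample = 8
culled_population = 150_000

low, high = wilson_interval(responsive_in_sample, sample_size)
print(f"sample rate: {responsive_in_sample / sample_size:.2%}")
print(f"95% interval for the culled set: {low:.2%} to {high:.2%}")
print(
    "projected responsive documents left unreviewed: "
    f"{low * culled_population:.0f} to {high * culled_population:.0f}"
)
```

The interval only holds if the sample was drawn at random from the set being culled; a sample picked by reviewers or by keyword would not support the same inference.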

In another example, our law firm client was engaged to conduct an internal investigation that would have otherwise been like squaring the circle. The corporate client delivered 1.6 million documents to the law firm with instructions to start custodian interviews within 10 days. The client enlisted Landmark Discovery's Structured Review Program, and in particular, our Advanced Data Prioritization service, to quickly further their knowledge of the underlying transactions. In this particular matter, they estimated capacity to conduct an in-house review of approximately 15,000 documents. Working with the analysis and foundation set forth in our Advanced Data Prioritization report, the client found that review of just one percent of the population - three percent with families reconciled - provided them with adequate assurance they had conducted a thorough investigation, having seen more than 25 percent of the responsive documents expected in the collection and identified custodians not previously understood as key players. In a post-mortem analysis, the client confirmed savings of nearly 70 percent and confirmed that the review piece of the investigation had been completed three months sooner than it would have been under an approach leveraging keyword-based prioritization and a linear first-pass review.

Editor: What technologies are important in enabling this approach?

Manocha: Our Structured Review Program involves a number of different technologies, from review platforms to productivity enhancements such as near-duplicate detection and email threading. However, predictive coding technologies are the key. Most predictive coding technologies combine some form of iterative, statistical sampling with a mathematical classification engine. Some do so better than others; some are more defensible than others. Some classification engines are widely accepted and have been used for decades across other technical and forensic applications, such as biometric identification, and in the legal industry, with some success, in the form of clustering-type tools.
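The combination of iterative sampling with a classification engine can be sketched in a few lines of Python. This is a toy illustration only, not how Equivio's product or any other commercial tool works: it assumes a small naive Bayes word model as the classifier and uncertainty sampling (asking the attorney about the documents the model is least sure of) as the sampling rule. The documents and the `attorney_label` stand-in for expert review are hypothetical.

```python
import math
import random
from collections import Counter

def train(labeled):
    """Fit a tiny multinomial naive Bayes model from (text, is_responsive) pairs."""
    counts = {True: Counter(), False: Counter()}
    docs = {True: 0, False: 0}
    for text, label in labeled:
        counts[label].update(text.lower().split())
        docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab

def score(model, text):
    """Return the model's estimated probability that the text is responsive."""
    counts, docs, vocab = model
    total_docs = docs[True] + docs[False]
    log_prob = {}
    for label in (True, False):
        total_words = sum(counts[label].values())
        # Laplace smoothing on both the class prior and the word likelihoods.
        lp = math.log((docs[label] + 1) / (total_docs + 2))
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                lp += math.log((counts[label][word] + 1) / (total_words + len(vocab)))
        log_prob[label] = lp
    diff = max(-50.0, min(50.0, log_prob[False] - log_prob[True]))  # avoid overflow
    return 1 / (1 + math.exp(diff))

def iterative_review(documents, attorney_label, rounds=2, batch=4, seed=0):
    """Alternate attorney labeling and retraining; return model and ranked remainder."""
    rng = random.Random(seed)
    unlabeled = list(documents)
    rng.shuffle(unlabeled)  # the first batch is a plain random sample
    labeled = []
    for _ in range(rounds):
        if labeled:
            model = train(labeled)
            # Later batches: documents whose score is closest to 0.5 come first.
            unlabeled.sort(key=lambda d: abs(score(model, d) - 0.5))
        picked, unlabeled = unlabeled[:batch], unlabeled[batch:]
        labeled += [(d, attorney_label(d)) for d in picked]
    model = train(labeled)
    ranked = sorted(unlabeled, key=lambda d: score(model, d), reverse=True)
    return model, ranked

# Hypothetical collection; attorney_label stands in for an expert's coding decision.
documents = (
    [f"price agreement with competitor memo {i}" for i in range(10)]
    + [f"lunch menu and office party note {i}" for i in range(10)]
)
attorney_label = lambda d: "price" in d

model, ranked = iterative_review(documents, attorney_label)
for doc in ranked[:3]:
    print(f"{score(model, doc):.2f}  {doc}")
```

A real engagement would add a stopping rule, typically a statistical check that further rounds no longer change the ranking, and would keep a record of each round so the process can be documented and defended.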

The right predictive coding technologies are capable of far outperforming the common approach to searching and recalling responsive document populations, using, for example, a limited understanding of keywords and key custodians. As a general rule, we try to stay away from technologies that we view as "black-box," meaning not easily understood and lacking transparency. Many black-box applications use classification engines of tremendous complexity to achieve the same result achievable with less complex models. Also, with tremendous complexity come tremendous costs; many predictive coding technologies require clients to make huge capital investments and employ specialized personnel to train and operate the systems. Still others force clients into a particular review environment or platform.

In addition to selecting tools that employ an efficient, but easily understood classification engine, we also look for tools that are entirely expert-attorney-defined, rather than injecting system-defined semantics into the "relevance" criteria, as well as tools that allow us to work in whatever review or technology environment the client is comfortable with. Often, our clients host data in their own review environment, and we are more than happy to customize and execute on a Structured Review strategy that leverages the technology investments that they have already made.

Editor: Are you saying that experience in predictive coding has already become a key criterion for corporations or law firms in selecting managed review vendors?

Manocha: Absolutely, especially in the current environment, where technology-driven discovery practice is developing at a much faster rate than most of the legal industry can keep up with. It starts with selecting the right technology. And it continues with applying the technology in the right manner. Proper utilization of predictive coding technology requires a thorough understanding of the technological options to determine which options are better but still consistent with widely accepted standards, and, finally, how best to design, document and execute on an effective review strategy using the selected technology. It is vital that law firms and corporations partner with a managed review company that understands how best to fold the strategy into review and how phenomena associated with the technology can be used in project management to design specialized workflows to maximize cost savings.

Editor: Are there any limitations to the broader application of a Structured Review approach?

Manocha: The only limitation we see to widespread acceptance is the need for further education. There is no denying that the results produced in review by a proper application of predictive coding technologies are superior to those produced by current approaches to document review. Ultimately, one cannot argue with the results, and it will be corporate counsel and their forward-thinking attorneys and law firms who will push the standard forward for the industry, with the help of companies like ours.

Please email the interviewee at sanjay.manocha@landmarkdiscovery.com with any questions about this article.