Not Just TAR: Other Analytics Tools Can Help Reduce The Cost Of Document Review

Adam Wright Strayer, Esq. - As the Director of Advisory Services at CDS Legal, Mr. Strayer advises clients on eDiscovery best practices, workflow design, and the use of analytics and predictive coding.  Before joining CDS, Mr. Strayer was an antitrust associate with the New York law firm Axinn, Veltrop & Harkrider. He began his legal career in Washington, DC as an attorney in the Mergers II division of the Federal Trade Commission, where he spent five years reviewing proposed mergers and acquisitions as well as investigating anticompetitive business practices, primarily in the high-tech and chemical industries. Mr. Strayer holds a BA from The University of Michigan and graduated summa cum laude from American University’s Washington College of Law.



Complete Discovery Source (CDS Legal)


In the era of Big Data, simple legal disputes routinely result in six-figure legal bills, and tens of thousands of dollars are regularly spent on data processing and document review. As a result, both in-house and outside counsel must find more efficient, yet still effective and thorough, ways to manage discovery. For the last several years, Technology Assisted Review (TAR) has been hailed as the solution that can help curtail these ever-increasing costs, yet adoption has been slow. Even though these tools can quickly identify responsive documents while substantially reducing the volume of non-relevant documents subjected to manual review, saving thousands of dollars in reviewer and data-processing costs, lawyers remain hesitant to fully embrace TAR.

Essentially, the legal tech industry has reached far ahead of where most lawyers are comfortable. The highly specific workflows, the unfamiliar statistics, and the still uncertain judicial acceptance of these technologies leave many lawyers hesitant or outright cold. TAR, while statistically defensible, still suffers from “black box” syndrome, and many lawyers remain unconvinced that predictive coding-type tools are effective or wise. 

In our experience, when correctly deployed, these software tools are remarkably good at filtering out non-relevant materials. They quickly learn to recognize the most relevant documents, and they have saved our clients many thousands of dollars and considerable review time. Yet, despite the potential savings, many clients continue to select far more expensive linear review workflows over TAR.

Lawyers uncomfortable with TAR, however, should consider using other analytical tools that, while related to – and in some cases nearly identical to – TAR-type software, use workflows that do not eliminate or cull documents from a review. Rather, these tools can be used to organize document reviews in ways that are considerably more efficient than traditional linear reviews. While the cost savings may not be as great as those generated by TAR, these oft-overlooked tools can dramatically reduce costs while also improving the quality of the review and saving time.

Traditional Document Review

In a traditional document review, data typically is culled or filtered using keywords and date restrictions. The documents are then batched and reviewed linearly in chronological order, often custodian-by-custodian. Unlike in a typical TAR-style workflow, in which some number of “unlikely to be responsive” documents is set aside and not reviewed, a traditional linear review generally will look at all documents that make it through the keyword and date filters.

While custodians may be organized by priority, little can be done to prioritize specific issues. An individual reviewer therefore might review documents related to any number of issues or concepts at any point during the review. Teams generally are unspecialized, and documents related to particularly sensitive or pertinent issues may be reviewed at any time, by any person. Certainly, one can front-load documents belonging to important custodians, but within any individual custodian, little can be done to ensure that relevant materials are reviewed or identified before any personal emails or SPAM that might have survived the keyword searches.

As a result, in a traditional review the same level of attention (and expense) is paid to a SPAM email relating to Nigerian bank accounts as to a potentially relevant email discussing the custodian’s fraudulent bank transfers or insider trading.
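The traditional workflow described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the document fields (`id`, `custodian`, `date`, `text`), the function names, and the sample data are all hypothetical, and real review platforms apply far more elaborate search syntax. It shows how a keyword-and-date cull followed by custodian-by-custodian, chronological batching gives a surviving SPAM message the same place in the queue as a relevant one.

```python
from datetime import date


def cull(docs, keywords, start, end):
    """Keep documents inside the date range that contain at least one keyword."""
    kept = []
    for doc in docs:
        text = doc["text"].lower()
        if start <= doc["date"] <= end and any(k in text for k in keywords):
            kept.append(doc)
    return kept


def linear_batches(docs, batch_size):
    """Order documents custodian-by-custodian, then chronologically, and cut into batches."""
    ordered = sorted(docs, key=lambda d: (d["custodian"], d["date"]))
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


# Hypothetical sample documents, invented purely for illustration.
docs = [
    {"id": "D1", "custodian": "Smith", "date": date(2014, 1, 5), "text": "Bank transfer scheduled"},
    {"id": "D2", "custodian": "Jones", "date": date(2014, 2, 1), "text": "Transfer the files"},
    {"id": "D3", "custodian": "Smith", "date": date(2012, 6, 1), "text": "Old bank transfer"},
    {"id": "D4", "custodian": "Smith", "date": date(2014, 1, 2), "text": "URGENT: claim funds from a Nigerian bank account"},
    {"id": "D5", "custodian": "Jones", "date": date(2014, 3, 1), "text": "Lunch on Friday?"},
]

kept = cull(docs, ["bank", "transfer"], date(2013, 1, 1), date(2014, 12, 31))
batches = linear_batches(kept, batch_size=2)
batch_ids = [[d["id"] for d in batch] for batch in batches]
print(batch_ids)
```

In this sketch the SPAM message (D4) survives the keyword "bank" and is queued ahead of the relevant transfer email (D1), simply because it was sent earlier.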

Enter Analytics: Threading

In the suite of analytics products related to TAR, a number of software vendors offer other tools clients can use to organize their document reviews more efficiently. For example, rather than organize documents chronologically, in which each email may have no relation to any other email or document in a reviewer’s assigned folder, some tools will link all emails in a particular conversation chain. “Threading” in this way allows a single reviewer to see all parts of a particular conversation. 

In a traditional document review, individual reviewers might see only one or two parts of a lengthy email exchange between two relevant custodians. No single reviewer is likely to see the entire thread, and no one is likely to see the entire context of the relevant conversation. Moreover, each contract attorney would come to the conversation thread de novo and would have to spend time reading, evaluating, analyzing and then tagging each individual email in that exchange. If the conversation were threaded, by contrast, a single reviewer would review and tag the entire conversation. That person could move more quickly through the chain and, ideally, code the entire exchange consistently, saving both time and money while improving the overall review.
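A minimal sketch of thread-based batching follows. It assumes the processing tool has already stamped each email with a conversation identifier; the `thread_id` field, the `Email` class, the function names, and the sample messages are hypothetical and do not reflect any particular vendor's product. The point is only that whole conversations, rather than individual emails, become the unit of assignment.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Email:
    doc_id: str
    thread_id: str  # hypothetical conversation identifier supplied by the processing tool
    sent: datetime
    subject: str


def group_by_thread(emails):
    """Collect emails into conversations, each sorted oldest-first."""
    threads = defaultdict(list)
    for email in emails:
        threads[email.thread_id].append(email)
    for messages in threads.values():
        messages.sort(key=lambda e: e.sent)
    return dict(threads)


def assign_threads(threads, reviewers):
    """Give each whole thread to the reviewer who currently has the fewest documents."""
    workload = {r: 0 for r in reviewers}
    assignments = {r: [] for r in reviewers}
    # Hand out the largest threads first so workloads stay roughly balanced.
    for thread_id, messages in sorted(threads.items(), key=lambda kv: (-len(kv[1]), kv[0])):
        reviewer = min(reviewers, key=lambda r: workload[r])
        assignments[reviewer].append(thread_id)
        workload[reviewer] += len(messages)
    return assignments


# Hypothetical sample emails, invented purely for illustration.
emails = [
    Email("D1", "T-merger", datetime(2014, 3, 1, 9, 0), "Pricing call"),
    Email("D2", "T-lunch", datetime(2014, 3, 1, 9, 5), "Lunch?"),
    Email("D3", "T-merger", datetime(2014, 3, 1, 9, 30), "RE: Pricing call"),
    Email("D4", "T-merger", datetime(2014, 3, 2, 8, 0), "RE: RE: Pricing call"),
    Email("D5", "T-lunch", datetime(2014, 3, 1, 9, 10), "RE: Lunch?"),
]

threads = group_by_thread(emails)
assignments = assign_threads(threads, ["reviewer_a", "reviewer_b"])
print(assignments)
```

Because a thread is never split, the reviewer who tags the first message in the pricing conversation also tags every reply to it.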

Conceptual Searching And Clustering

A more advanced analytics-based tool that also can help improve reviewer efficiency and accuracy involves conceptual searching and conceptual clustering. Using algorithms similar to those used by TAR programs, many software programs analyze the actual content of individual documents, allowing them to be sorted into related “clusters” or groups. 

Like the much-discussed TAR platforms, these software tools evaluate the semantic content of each document and, by cross-referencing against a specialized index, identify recurring concepts. Documents dealing with discrete concepts can then be batched to individual reviewers. Unlike batches based on email threading, each reviewer would then see all of the documents related to a particular concept, and not only a single email chain. This approach gives the reviewer additional context and enables him or her to quickly move through each conceptual batch, ideally coding with more accuracy and, again, more consistency.  By the time he or she finishes a particular batch, a reviewer should be an “expert” on whatever concept was grouped into that batch.

In addition to providing single-concept batches, conceptual batching also allows case teams to structure a review in a way that better meets their needs. While the conceptual groups are generally software-created, once generated, a quick check of each cluster allows the case team to select those that are most relevant or most interesting for priority review. Similarly, conceptual clusters that are clearly irrelevant (e.g., SPAM or daily email distributions) can be de-prioritized or bulk-tagged as non-responsive.
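The idea of grouping documents by content, and then letting the case team decide what to do with each group, can be illustrated with a deliberately simplified sketch. Commercial tools rely on much richer concept indexes; the word-overlap similarity, the greedy grouping rule, the 0.3 threshold, the stopword list, and the sample documents below are all assumptions made for illustration and are not how any particular product works.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "to", "of", "and", "for", "on", "in", "is", "your", "you", "our"}


def vectorize(text):
    """Turn a document into word counts, ignoring common filler words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(docs, threshold=0.3):
    """Greedy grouping: a document joins the first cluster whose seed it resembles,
    otherwise it starts a new cluster."""
    clusters = []  # each entry is (seed_vector, [doc_ids])
    for doc_id, text in docs.items():
        vec = vectorize(text)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(doc_id)
                break
        else:
            clusters.append((vec, [doc_id]))
    return [members for _, members in clusters]


# Hypothetical sample documents, invented purely for illustration.
docs = {
    "D1": "Wire transfer to the offshore account approved",
    "D2": "You won a free prize, claim your prize now",
    "D3": "Please confirm the offshore wire transfer today",
    "D4": "Claim your free prize today",
}

groups = cluster(docs)
print(groups)

# Hypothetical case-team decisions, made after a quick look at each cluster.
decisions = {0: "priority review", 1: "bulk-tag non-responsive"}
for index, members in enumerate(groups):
    print(decisions[index], members)
```

Note that every document still ends up in a cluster and receives a coding decision; the grouping changes the order and manner of review, not which documents are reviewed.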

Simply by shifting the traditional linear workflow to this type of concept-based workflow, lawyers can save considerable time and money. Large numbers of documents can often be quickly de-prioritized or mass-coded, yet all documents still go through the review process. Nothing is culled or eliminated from the review universe, as it would be in a TAR workflow, but the review still benefits from the software’s sophisticated analytical abilities.


Lastly, lawyers who remain uncomfortable with the defensibility of the TAR process – where documents are excluded from the review workflow based exclusively on a computer’s determination – might consider a hybrid approach in which all documents are still reviewed, yet considerable savings can be found.

In a typical TAR workflow, a software tool analyzes example documents known to be responsive to specific issues and then extrapolates that analysis across the rest of a defined document universe. The software’s accuracy is verified using statistical sampling, and the documents determined by the computer to be non-responsive are set aside, un-reviewed by human eyes. 

In a hybrid model, examples are still provided, and the software’s analysis is still run against the entire document universe. The resulting categorization is still evaluated, but rather than setting aside any non-responsive documents, all materials – including those predicted to be non-responsive – are reviewed by contract attorneys.

In this variation of the TAR workflow, the documents most likely to be responsive are reviewed first, and the remainder are either set aside for later review or outsourced to a less expensive review team. In the first option, the team begins with materials that are highly likely to be responsive, and the later batches of materials less likely to be responsive can then be reviewed much more quickly. In the second option, the materials less likely to be responsive can be outsourced to a less expensive review team, perhaps even one overseas. In either instance, however, all documents are reviewed as they would be in a traditional linear review, and any TAR-specific defensibility issues are overcome. While the cost savings are not the same as in a full-TAR workflow, the ability to quickly review the documents most likely to be responsive, combined with the option of running a second, cheaper review track for the remaining materials, can save significant sums.
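The hybrid ordering described above can be sketched as follows. The sketch assumes the software has already produced a responsiveness score between 0 and 1 for each document; the scores, the 0.5 cutoff, and the function name are hypothetical and chosen only for illustration.

```python
def split_review_tracks(scores, cutoff=0.5):
    """Rank every document by predicted responsiveness and split the ranking
    into a first track and a second track. No document is dropped."""
    ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    first_track = [d for d in ranked if scores[d] >= cutoff]
    second_track = [d for d in ranked if scores[d] < cutoff]
    return first_track, second_track


# Hypothetical predicted-responsiveness scores, invented purely for illustration.
scores = {"D1": 0.92, "D2": 0.08, "D3": 0.67, "D4": 0.31, "D5": 0.50}

first_track, second_track = split_review_tracks(scores)
print(first_track)   # reviewed first by the primary team
print(second_track)  # reviewed later, or sent to a less expensive team

# Unlike a full TAR cull, every document lands in exactly one review track.
assert sorted(first_track + second_track) == sorted(scores)
```

The cutoff here decides only who reviews a document and when, not whether it is reviewed, which is the feature that distinguishes the hybrid model from a full TAR workflow.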

While the legal profession remains cautious about adopting fully automated TAR-type workflows, the fact remains that litigation and discovery costs continue to increase dramatically. Traditional document review projects are extremely expensive, and vendors and law firms must find solutions that maintain or improve quality while also bringing down the cost of litigation. Fortunately, there are a number of technology-based options that, though perhaps not as efficient as the most sophisticated TAR solutions available, can be used to structure the review process in ways that save clients both time and money today. 
