Beyond Cost Savings: How Advanced Analytics In E-Discovery May Be Pivotal To Successful Case Outcomes

Friday, April 19, 2013 - 17:00

The Editor interviews Dean Kuhlmann, Vice President of Business Development, Lateral Data, a Xerox company.

Editor: Please tell us about your background, as well as your role at Xerox.

Kuhlmann: I’ve been in the e-discovery business for about 25 years, and for the last four I’ve been with Lateral Data, where I am the vice president of business development. I work closely with clients to help solve their business problems leveraging the Viewpoint all-in-one e-discovery platform. In addition to collection, pre-processing, processing and review functionality, Viewpoint contains an array of advanced analytical tools. Part of my job consists of helping clients understand which are best to utilize based on the specifics of a matter.

I have an engineering background, so when working with clients, I tend to approach their particular needs and issues from a technical perspective. I also serve as a liaison for clients that don’t yet have a solid grasp of today’s e-discovery technologies.

Editor: With Xerox a major global e-discovery provider, you’re no doubt seeing your clients struggle with Big Data challenges on a daily basis. What does Big Data mean in an e-discovery context?

Kuhlmann: While the term “Big Data” has had a lot of buzz lately, the concept itself, along with many of its challenges – that is, how to extract value from data to meet various business or legal purposes – has been in existence for many years. We’ve long had too much data – it’s the context and the volume that have changed. When everything was on paper, we read every document, but as the volume and data types demanded by discovery have been brought to a new level, we have had to figure out ways to accomplish the task without physically setting eyes on every document. Over time, clients have become more and more willing to accept and trust technological approaches that aid in finding and assessing information. This is where our advanced analytical tools come into play. Clients have multiple tools at their fingertips, and we help them to understand which are appropriate to apply and when.

If you look back, the first method to become widely accepted was keyword searching. It’s funny to think that, initially, many of us thought keyword searches were a ploy from the other side to limit discovery. We’re well beyond that now. Keyword searches are fully acceptable, but we also now know that they aren’t the answer to everything. Today, we’re looking at a host of advanced methodologies – in fact, often used in conjunction with keyword searching – to tackle Big Data. We’re still trying to solve the central problem though, which is how to find key evidence among all of the noise without having to actually review all documents. There is now a shift toward the application of advanced analytics to cull down data volumes (hence decreasing costs), and also, importantly, to more quickly pinpoint the information needed to support a case. 

Editor: What Big Data challenges do you see surfacing for your clients?

Kuhlmann: Some of the problems are the same problems we’ve always seen, only on a different scale. For example, people still tend to over-collect data during the initial phase of discovery, casting an overly wide net only to have too much information to manage later, much of which isn’t relevant to the matter. Before the process even starts, it’s critical to develop a targeted strategy for your case’s discovery and to include your e-discovery team early in the planning process. Unfortunately, case teams sometimes wait until they’re far down the path before bringing in help, and then they are left scrambling for answers and taking shortcuts to comply with deadlines, when formalizing a better plan upfront could have led to better results downstream. Even with a good plan, over-collection can still happen (if not always on a large scale), so clients are challenged with finding a way to efficiently and effectively analyze their data and identify the information that is relevant to their case.

Editor: What are ways clients can effectively manage their data for discovery purposes?

Kuhlmann: First, there’s no one-size-fits-all solution. I’m an advocate for using technology wisely, not just throwing it at the problem. The parameters of the case itself will determine what data sets are needed. In order to decide on the benefits of certain technologies, including certain analytical tools, we have to understand what type of case it is – whether it’s an SEC investigation, an antitrust second request or an IP infringement matter.

One of the criteria for choosing a technology is and should be cost savings, but also keep in mind that, ultimately, finding the information that will help you make your case is the primary goal. If you take the wrong approach, you might save money, but you may end up taking a case in a different direction based on the facts that you uncovered – rather than the best facts, which perhaps you never did find because the inappropriate technology was applied. So it’s important to look at the entire matter, the goals of the legal team, cost concerns and where you are in the case when assessing which analytical tools are the best fit for a particular case.

Editor: Can you provide some examples of advanced analytical tools your clients are using?

Kuhlmann: There are two primary challenges we see with virtually every matter. First, attorneys and legal review teams need help in selecting keywords. At the beginning of a case, they don’t yet know what keywords might be relevant. If a massive culling effort is performed using keywords, then the results that emerge will change the scope of the content that will be used to support the case. If the wrong set of keywords is chosen – for example, if they are either off-topic, over-inclusive or under-inclusive – then the case team may end up pursuing the wrong issues. So, in advance of selecting keywords, one advanced analytical tool that clients have been leveraging for quite a while now is concept analysis. Concept analysis is an automated process that assesses and organizes every document by the key content found on each page, then groups documents that generally are about the same topic or touch on the same primary concepts. Because concept analysis is unbiased, the process yields a pristine list of potentially responsive documents contained in the data. Now, because the case team no longer has to guess at the contents of their data, they can more accurately identify valuable keywords to employ in their searches.
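To make the idea of grouping documents by shared concepts concrete, here is a minimal sketch in Python. It is not how Viewpoint implements concept analysis, which the interview does not describe; it assumes a simple bag-of-words comparison with cosine similarity, and the function names, stopword list and 0.3 threshold are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the illustration.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "is", "on", "at", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def group_by_concept(docs, threshold=0.3):
    """Greedily place each document in the first group whose first member
    is similar enough; otherwise start a new group. Returns lists of indices."""
    vectors = [Counter(tokenize(d)) for d in docs]
    groups = []
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


def top_terms(docs, group, n=3):
    """Most frequent words in a group: candidate keywords for later searches."""
    counts = Counter()
    for i in group:
        counts.update(tokenize(docs[i]))
    return [term for term, _ in counts.most_common(n)]
```

In this sketch, the terms returned by `top_terms` for each group play the role of the keyword candidates a case team might review before committing to a search list. A production tool would use far richer language models than raw word counts.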

The second challenge that legal teams face is the volume of near-duplicate documents in a collection. Systematically removing exact duplicates is a proven, accepted process, both technically and in the courts. Near duplicates, however, are harder to understand and harder to find because the documents are not exact matches. If near duplicates are left unmanaged, the case team could find themselves reviewing nearly identical documents (perhaps differing by only one word) repeatedly and spending countless hours doing so. For example, let’s assume there is a 10-page contract under review that is emailed as an attachment to 10 people. The document is created in Microsoft Word and emailed with track changes enabled. Assuming five rounds of edits, this one scenario alone will produce at least 50 near-duplicate documents (10 recipients times five rounds). And if keyword culling alone is used, more than likely all 50 copies would have a positive hit and therefore be brought into the collection to be reviewed. But, because these are not exact duplicates, a human needs to review them individually.

Near-duplicate analysis is an advanced tool that can be deployed to identify and group these multiple versions together, highlighting the differences in each document. Then one person can compare and review them side by side. The result is much more accurate and efficient than dispersing all of these documents individually across a team of reviewers, and can result in significant time savings.
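The grouping and difference-highlighting described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Viewpoint's method: it assumes word-level comparison with the standard library's `difflib.SequenceMatcher`, and the function names and the 0.8 similarity threshold are hypothetical.

```python
import difflib


def similarity(a, b):
    """Word-level similarity ratio between two texts, from 0.0 to 1.0."""
    return difflib.SequenceMatcher(None, a.split(), b.split()).ratio()


def group_near_duplicates(docs, threshold=0.8):
    """Greedily group documents whose similarity to a group's first member
    meets the threshold. Returns lists of document indices."""
    groups = []
    for i, text in enumerate(docs):
        for group in groups:
            if similarity(docs[group[0]], text) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


def word_changes(base, other):
    """List (old, new) word spans that differ between two versions,
    so a reviewer can look only at what changed."""
    a, b = base.split(), other.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

Given two contract drafts that differ by a single word, `group_near_duplicates` would place them in the same group and `word_changes` would return just the differing word pair, which is the information a single reviewer needs for a side-by-side comparison. Comparing every document against every group does not scale to large collections; real tools typically rely on hashing or indexing techniques instead.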

Editor: What are some of the advantages of the Viewpoint all-in-one solution?

Kuhlmann: I think it’s important to know that we originally architected and developed Viewpoint for our own use. Lateral Data began as a service provider, working with law firms and corporations involved in major lawsuits, SEC investigations and other litigation matters. Rather than buying disparate pieces of technology and trying to get them to work together, we decided to build our own. Because we knew we were going to use the product ourselves, we built it with efficiency, rather than profit, in mind. We engineered an end-to-end platform to most effectively leverage technology and workflow, using one single technology. The many tools needed, including our aforementioned advanced analytics as well as robust pre-processing, processing, review, technology-assisted review, production and case administration and reporting tools, are bundled into this single product, so the client doesn’t need to manage the data or import, export and copy it between various software technologies – probably using three, four or even five different products from beginning to end. Using one product versus several accomplishes the same objective for our clients – greater efficiencies throughout the process, defensibility and better cost predictability.

Editor: How do you see the acceptance of advanced analytics evolving? 

Kuhlmann: As technological advancement continues in our industry, there will always be early adopters who understand the value of innovation and apply it to address their Big Data problems. For example, clients have been using Viewpoint’s email redundancy, thread management and relationship analysis tools, as well as our visual index tool (used for search term refinement and document reduction) for a while. These advanced analytics are proven to help case teams find the information they need in the most efficient way possible, and present results in a graphical interface that even non-technical users can quickly understand. But, just as last year’s seminal court opinions on technology-assisted review showed, litigants often opt to wait until there is broader judicial acceptance for a particular method. Only late last year did we see an uptick in adoption of our own assisted review technology. As analytics go through the same acceptance process that keywords went through, we will continue to see higher adoption rates for these tools.

Editor: Are companies and law firms adequately informed about available technologies for e-discovery and data management functions?

Kuhlmann: There is an overwhelming amount of information available, so that isn’t the problem. Rather, well-informed companies are choosing to test e-discovery solutions ahead of any large client matters, vetting the process as they go, weeding out the technologies and workflow that don’t work for them. We recommend planning ahead and knowing what you’re going to do before a new case begins. Actually putting your data in a real product and testing it – in a low-cost, low-risk way, such as a hosted proof of concept – will provide you with more clarity on what solution fits your needs. In addition, with the overwhelming amount of information in the marketplace, be sure you select a vendor that offers both the best technology and the best services that meet your needs, including knowledgeable consultants with the expertise to address both simple and complex issues that arise in e-discovery. Consultants can look at your case in advance and tell you, for instance, whether you’ll save money by using technology-assisted review before you go down that path, or whether another set of tools will yield better results. They can also leverage specific expertise throughout the process, such as statistics and linguistics, to better inform your strategy. So rather than simply throwing technology at the problem, companies are increasingly choosing to work with a team that knows how and under what circumstances to use appropriate tools and methodologies. 
