Man And Machine: A Look Forward Into The Future Of eDiscovery

Monday, August 1, 2011 - 01:00
Erik Laykin

Is the question "man vs. machine" or "machine helps man" or "man helps machine"? I tend to think that the last of these holds the most promise. Due to the increasing complexity and volume of data facing litigants and corporations involved in disputes, the notion that man alone could manage the gargantuan task of reviewing and coding millions of documents is no longer a practical reality.

However, the notion that a machine could replace the human element in document analysis and contextual interpretation is still for the most part the stuff of science fiction. There are even those of us who anticipate that HAL, the ever-knowing computer onboard the space transport ship in Stanley Kubrick's classic, 2001: A Space Odyssey, is on our doorstep and both willing and able to successfully challenge the modern review attorney's full abilities. This super machine challenge, akin to a "Deep Blue vs. Garry Kasparov" showdown, is titillating and perhaps even functionally possible in the abstract or in tightly controlled circumstances; however, in practice, the fundamentals of human reason and understanding still trump our greatest artificial computational abilities, or at least those that are available to the private sector and eDiscovery practitioners in particular.

The real question is: will "machine help man" or "man help machine"? Who will lead the way? While the utility of the machine has enabled man in countless ways to leverage his time, available energy and resources, there have been many cases in which the machine's benefit continues to grow while the necessary human element remains static. Think: cart which becomes wagon which becomes carriage which becomes automobile which becomes the space shuttle. Throughout this paradigm, the machine provides increasing utility to the human. The age of transportation and the machines that transport humans will continue to evolve while the role of the human remains essentially the same: "Guide the machine to take me from point A to point B."

On the other hand, the computational powers of computers are fast closing the gap on the cognitive advantages that humans possess. There are those who believe that the commercially available computer of the not too distant future will not only surpass the billions of computations the human brain now uses to maintain its evolutionary edge but that the structural human advantage of allowing for fully distributed parallel processing will be duplicated as well. This promise has been in the works for years and is often referred to as Artificial Intelligence (AI) or Neural Networks.

The present state of affairs, however, is still far from the final destination of providing machines that truly learn, feel and, most anxiously anticipated of all, achieve consciousness. That being said, humanity has had to acquiesce to the superlative accomplishments and capabilities of machines that are designed for a singular or relatively narrow set of tasks in which their sheer processing power outstrips anything remotely possible by our own three pounds of grey matter. Examples abound, including Google and its ability to search and retrieve information across cross-indexed data sets numbering in the billions in a fraction of a second - a feat that by any stretch of the imagination would be pure fantasy as recently as the era of the Earl Warren Supreme Court.

The capabilities of these systems continue to evolve. Moore's law, which has proven relatively accurate since it was formulated in 1965, essentially states that the number of transistors on a chip, and with it computational processing power, will double roughly every two years. Following its tenets, it would not seem unreasonable, if one were to forecast out another human generation or two, to expect that we will one day spend time with "Artificial Lawyers." These "legal machines" would provide analysis, advice and counsel in a conversational manner based on the deepest possible analysis of the law, legal theory, actual case facts and a real-time interpretation of the forces governing the parties involved, including even the judge to whom the case has been assigned, factoring in a weighted analysis of all of his past decisions, personal proclivities and general temperament. Welcome to the world of HAL? Perhaps. Will the Supreme Court of 2111 AD, one hundred years from now, consist of nine machines of equally supreme intelligence, each slightly tuned or adjusted for varying degrees of liberal or conservative interpretation of the Constitution?

With all of these extraordinary technical capabilities and possibilities, and the promise of groundbreaking efficiencies that could free up human time and energy for the pursuit of even loftier creative and intellectual missions than the practice of law, we should, however, still temper our enthusiasm for turning over all of the tough stuff to the integrated circuits and spinning disks in the box on our desk. Today's reality and the reality for the coming decade will still allow the billable hour of human cognitive function to trump that of the machine, or at the very least remain on par with it.

While it is true that much of humanity has given up the pickaxe and shovel for the efficiencies of the backhoe and earthmover, is it equally wise to shelve our contextual and analytical gifts in favor of machines that offer an equivalent or better? By letting our skills in one area fade away, do we gain something more significant in other areas?

By example, when was the last time you took out a pencil and paper and tried some old-fashioned multiplication and division? If you are like me, you may have found that you are a little rusty these days.

When it comes to the promise that machines can do much of the heavy lifting in eDiscovery, I am a proponent of leveraging these platforms as much as one reasonably can but only to the extent that the process is fully defensible, that is, without sacrificing the credibility of the process or the utility and relevancy of the final output.

The eDiscovery technology landscape has benefited hugely from the advances in computing power and ever more sophisticated software tools. These tangible realizations of the computing dream now allow practitioners to carve up, slice and dice, purge and extract, filter and deduplicate ever larger data sets for lower and lower costs.
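
As one illustration of the "filter and deduplicate" step mentioned above, the following is a minimal sketch of exact-duplicate removal by content hash. It is an assumption-laden example, not a description of any particular eDiscovery product: the function name `deduplicate` and the sample documents are hypothetical, and commercial tools typically layer near-duplicate detection and email threading on top of this basic idea.

```python
import hashlib


def deduplicate(documents):
    """Keep the first document seen for each distinct content hash.

    `documents` is an iterable of (doc_id, content_bytes) pairs.
    Returns a list of (doc_id, sha256_hex) pairs for the unique documents.
    Hypothetical helper for illustration only.
    """
    seen = set()
    unique = []
    for doc_id, content in documents:
        # Identical byte content always yields the identical digest.
        digest = hashlib.sha256(content).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((doc_id, digest))
    return unique


# Made-up sample data: two of the three documents have identical content.
docs = [
    ("a.txt", b"Quarterly report"),
    ("b.txt", b"Quarterly report"),
    ("c.txt", b"Meeting notes"),
]
unique_docs = deduplicate(docs)
```

Hashing the content rather than comparing file names means that copies of the same document stored under different names collapse to a single entry, which is what shrinks the set a human reviewer must eventually read.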

Reliance on one system or vendor alone, however, may not provide all of the advantages that the leaps in technology can offer. By example, a significant trend in the United States today is for corporations to bring many of the traditional eDiscovery processes in-house. While there is an argument for keeping some of these tasks out of the enterprise, those that do successfully implement them internally allow their office of the general counsel and their information governance staff to gain meaningful tactical advantages in understanding the composition, structure and minor nuances of the enterprise's corpus of data. Systems including Early Case Assessment (ECA) tools combined with enterprise-wide data categorization and archiving systems as well as legal hold tools have proven to offer overall reductions in legal spend and have shown efficiencies in the handling of repetitive data requests from multiple unrelated legal or regulatory actions.

From practice, I continue to be left with the impression that the prudent contemporary course of action for most enterprises, when considering complex data management and identification systems in support of their eDiscovery obligations, is to integrate multiple technologies throughout departments and systems. This will allow the enterprise to leverage best-of-breed hardware and software for various applications and objectives while simultaneously maintaining the flexibility to jettison the ill-performing tools without compromising the overall platform.

If anything is certain in the world of technology, it is that it is ever evolving, and what solves today's challenges may not be sufficient to meet tomorrow's. For example, in my own practice, our teams leverage multiple computer forensic software applications and hardware solutions, which on first pass would appear to be duplicative of one another. However, in practice, when we are in the field and confronted with an endless array of possible technical challenges, each of these tools offers certain strengths and weaknesses. Unlike our three-pound human brains, they are for the most part stand-alone tools that have a very defined purpose and meet that need well, but when required to perform outside of their sweet spot, they fail. Thus, maintaining overlapping capacity, which takes into account the uncertainties of actual practice, can offer both security and assurance that the right tools can be used at the right time to meet the need at hand.

The significant strides made by electronic discovery software companies, coupled with ever-more powerful platforms for these complex systems to reside on, have enabled a new generation of tools that the litigator, investigator, eDiscovery professional or corporate executive can now leverage to drastically reduce the reliance on human input during the processing, analysis and review phases of eDiscovery. Much of the attention of technical firms in recent years has been focused on the world of review because of the massive cost burden that this phase of eDiscovery places on litigants. The pressures to reduce the reliance on human interaction with and analysis of individual electronic documents have been intensified by the exponentially expanding document sets contained within corporate environments and even further increased by economic pressures facing companies and litigants of all stripes.

Many of the processing and analysis tasks have already been well automated, and it is through the combination of the automation of each of these primary eDiscovery tasks, as well as their integration into the more fundamental enterprise-wide records and information management systems, that we will eventually witness the final consolidation of the eDiscovery technology marketplace. The eDiscovery technology firms that emerge in the future from this consolidation will almost certainly be fully integrated components of knowledge management firms that have the ability to provide end-to-end holistic solutions for the entire data lifecycle.

The last humans standing will be those that not only maintain ubiquity behind the firewall but also have managed to integrate uniformly throughout the cloud and onto the initial point of entry on the global networks, the end-users' device of choice.

At present, we are witnessing the evolution of one of the important steps toward these goals, which goes by the terms "Machine assisted review," "Predictive Coding" and "Intelligent Automated Review" and which in some cases buttresses the "man helps machine" concept. For this discussion I use the term "Machine" to describe a combined system of hardware and software that is specifically designed to interpret data sets contextually so as to limit, reduce or remove the necessity of interpretive human decisions being made on each or most of the documents in the data set. The actual term "Machine assisted review" implies that this is a case of "Machine helping Man." But as the machines continue to evolve and learn, we will come to a point where we have "Man helping Machine," in which the machine is not only capable of the brute force labor but is also found to be trustworthy on a limited basis when forming opinions or conclusions as to the context and meaning of data sets on a particular matter and within a specific set of limitations.

Challenges remain, and there will be tough lessons learned as we leverage these new technology platforms. eDiscovery companies will continue to be granted patents for their breakthroughs, and litigants will learn the hard way which systems work and which do not. Law firms and corporations will continue to experiment with finding the right balance between leveraging technology and humans while balancing cost, efficiency and risk. Practitioners will learn that "Predictive Coding" is just that - predictive, and that all predictions contain the inherent risk of being wrong. This is why they are predictions, not facts.

Weather predictions, stock predictions, fertility predictions and election predictions all share the commonality that some percentage of those predictions will be accurate, some will be inaccurate. This reality, however, should not prevent certain litigants from leveraging predictive coding as its overall benefits may very well outweigh the risks. Proceed with caution and pay heed to Ronald Reagan's adage, "trust but verify." In fact, Predictive Coding is an example of "Human helping Machine," as the sampling and testing process following a predictive coding exercise may very well represent the "trust but verify" process that only a bona fide human can undertake.
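
The "trust but verify" step described above can be sketched as a simple sampling exercise: draw a random sample of the machine's coding decisions, have a human re-review each one, and estimate the disagreement rate with a confidence interval. The sketch below is only an illustration under stated assumptions. The names `wilson_interval` and `verify_sample` are hypothetical, the Wilson score interval is one reasonable statistical choice among several, and nothing here reflects a specific vendor's validation protocol or a court-approved standard.

```python
import math
import random


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def verify_sample(machine_calls, human_review, sample_size, seed=0):
    """Spot-check machine coding against human judgment.

    machine_calls: dict mapping doc_id -> bool (machine says "relevant").
    human_review:  callable taking a doc_id and returning the human's bool call.
    Returns (disagreements, (low, high)), where (low, high) bounds the
    estimated disagreement rate across the whole coded set.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample = rng.sample(sorted(machine_calls), sample_size)
    disagreements = sum(
        1 for doc_id in sample if human_review(doc_id) != machine_calls[doc_id]
    )
    return disagreements, wilson_interval(disagreements, sample_size)
```

Even a sample with zero disagreements does not prove the machine is always right; the upper end of the interval indicates how high the true error rate could plausibly be, which is the figure a practitioner would weigh when deciding whether the process is defensible.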

The unique technical, geographic, logistic, legal and contractual challenges represented by the necessity of identifying, preserving, processing and reviewing data in the enterprise, on the desktop, on the tablet and the phone and especially in the cloud are only now starting to be fully appreciated across the full spectrum of the legal and corporate worlds. In the coming years, we will witness a third significant growth spurt in the eDiscovery technology space as companies move to adapt the evolving intelligent machines capable of accomplishing a number of formerly "human only" tasks to the global networked data eco-system, which is both quickly expanding and rapidly gaining ubiquity and which we politely call "the cloud."

From this point until we finally, at some time in the future, learn how to upload our own consciousness onto a machine, thereby providing us with intellectual immortality by separating our being from our fragile and expiration-prone bodies, we will continue to witness the evolution of man and machine. It may come to pass as an interesting footnote in the history of the future that the requirements of eDiscovery were one of the catalysts for this transformation of the human experience.
