IBM Watson might transform legal work, but will it fix the law itself?

It happened recently as I was watching demonstrations of two very interesting new technology companies that help automate due diligence processes.

eBrevia and Diligence Engine use machine-learning technology to help lawyers review large sets of contracts in M&A situations, looking for risks and pitfalls.

They are both classic examples of the type of technological innovation that’s happening at lightning speed in the legal industry today. They are addressing an enormous pain point in the industry: the mind-numbing burden of manually reviewing thousands of contracts as part of a due diligence mandate. It’s the kind of work that has provided a good living for generations of young law firm associates, but it is not efficient and, being human-based, not always very accurate.

These two new companies are using technology to make that review more efficient and accurate, by analyzing, summarizing, and extracting structured data from big masses of unstructured and wildly inconsistent documents. To the extent the technology works, it’s because it imposes some kind of order on the non-standard work of human lawyers.
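To make "extracting structured data from unstructured documents" a little more concrete, here is a minimal, purely illustrative sketch of the kind of input and output involved. It is not how eBrevia or Diligence Engine actually work (their methods are proprietary, and a real system would rely on machine-learning models rather than hand-written patterns); it simply shows a couple of common due-diligence fields being pulled out of raw contract text.

```python
import re

# Toy contract text; real due-diligence sets contain thousands of
# inconsistently drafted documents like this one.
contract = """
This Agreement shall be governed by the laws of the State of Delaware.
Either party may terminate this Agreement upon a change of control of
the other party by providing thirty (30) days' written notice.
"""

def extract_fields(text: str) -> dict:
    """Pull a few structured fields out of unstructured contract text."""
    fields = {}

    # Governing-law clause: capture the named jurisdiction.
    m = re.search(r"governed by the laws of (?:the State of )?([A-Z][A-Za-z ]+?)[\.,]", text)
    fields["governing_law"] = m.group(1).strip() if m else None

    # Change-of-control language: a common M&A risk flag.
    fields["change_of_control"] = bool(re.search(r"change of control", text, re.IGNORECASE))

    return fields

print(extract_fields(contract))
# {'governing_law': 'Delaware', 'change_of_control': True}
```

The point of the sketch is the shape of the problem: wildly inconsistent prose goes in, and a small, consistent set of fields comes out that can be compared across thousands of agreements.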

That’s what struck me during one of those demonstrations. The problem that eBrevia and Diligence Engine are trying to solve is exacerbated by the messiness of the data. The data is messy because lawyers tend not to draft agreements according to consistent standards. If anything, the application of analytics to this particular form of legal data will only serve to expose how messy those contracting practices are.

And that was the big “a-ha” moment that day. These products address the short-term pain point that tedious review of thousands of contracts presents. But there may be a long-term benefit as well — using these technologies to extract the essence of documents and impose order where little exists will only serve to highlight how difficult lawyers have made things for themselves.

Lawyers might also start to see how much easier they could make things by imposing that order and standardization a little more at the front end of legal processes, rather than expending so much effort downstream trying to make sense of documents after a mission-critical legal event occurs (litigation or an M&A deal).

In that longer term, perhaps the most lasting impact of legal analytics and machine-learning will lie not so much in remediating the immediate problem (the burden of contract review) as in forcing more order and standardization into the work that lawyers already do.

And I’m not the only one who is thinking this way.

Paul Lippe of Legal OnRamp has some good thoughts on IBM’s Watson, which is pushing its way into legal applications (see the piece Lippe and Michigan State University College of Law’s Daniel Martin Katz wrote, entitled “10 predictions about how IBM’s Watson will impact the legal profession”). Watson presents the prospect of a fundamental change from legal search that simply returns relevant documents to one that actually answers questions, leveraging semantics across multiple unstructured data sources to access deeper levels of meaning.

The question of whether Watson can be applied effectively in the law is on many legal technologists’ minds. A team of technologists at the University of Toronto recently announced its intent to build a new legal research application leveraging Watson, and finished second among 10 finalists in a competition among Watson applications.

Their application of the Watson technology, called Ross, is expected to take on legal research tasks: “Basically, what we built is the best legal researcher available,” explains Ross co-founder Andrew Arruda. “It’s able to do what it would take lawyers hours to do in seconds.” Lippe’s own firm Legal OnRamp is using Watson in a compliance context, to analyze and predict worst-case scenarios that would require documented recovery plans.

But Lippe also thinks not so much about what Watson will do to legal research or compliance as about what it will do to the law itself. The key passage: “Watson will force a much more rigorous conversation about the actual structure of legal knowledge. Statutes, regulations, how-to-guides, policies, contracts and of course case law don’t work together especially well, making it challenging for systems like Watson to interpret them. This Tower of Babel says as much about the complex way we create law as it does about the limitations of Watson.”

And a little later: “Watson (or something like it) will likely become a standard authoring/query model. Just as most companies today write their Web information to optimize for Google’s search, professional knowledge (which is published in a multi-tier structure) will want to be better synthesized through a system like Watson and will adopt new authoring and publishing norms.”

In other words, Watson and other new technologies aren’t just helping us make sense of the messiness of the legal domain; they are opening our eyes to just how messy it really is. And they might go a long way toward prompting us to fix the messiness in the longer term. The reason we’ll fix it is that we want better inputs for systems like Watson (and eBrevia and Diligence Engine) to work on. The law needs to be “optimized” for analytics and machine-learning.

There are other examples where the application of analytics to messy, unstructured content will likely lead to more standardization and better data hygiene upstream. The successful use of predictive analytics in e-discovery has already spawned a movement to apply the same technologies upstream in corporate data repositories, in order to discover risks and liabilities before they become the subject of litigation. See the work of the Information Governance Initiative, which had a strong presence at the recent LegalTech show, an event otherwise fairly dominated by e-discovery technology.
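For readers unfamiliar with how those e-discovery tools work under the hood, the core idea is supervised text classification, sometimes called predictive coding or technology-assisted review: reviewers label a small seed set of documents, a model learns from those labels, and the model then ranks the unreviewed mass so human attention goes where the risk is. The sketch below is a minimal, assumed illustration of that idea (the sample documents and labels are invented, and it is not the Information Governance Initiative’s or any vendor’s actual tooling).

```python
# A minimal sketch of the supervised-classification idea behind
# "predictive coding": label a small seed set of documents, train a
# text classifier, and use it to rank the rest of the corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

seed_docs = [
    "Please delete these emails before the audit.",            # responsive/risky
    "Side letter waiving the indemnification cap attached.",   # responsive/risky
    "Lunch menu for Friday's team offsite.",                   # not responsive
    "Reminder: building maintenance this weekend.",            # not responsive
]
seed_labels = [1, 1, 0, 0]  # 1 = responsive/risky, 0 = not

vectorizer = TfidfVectorizer()
model = LogisticRegression().fit(vectorizer.fit_transform(seed_docs), seed_labels)

# Score the unreviewed documents and surface the likeliest risks first.
corpus = [
    "Attached is the amended indemnification side letter.",
    "Parking garage will be closed on Saturday.",
]
scores = model.predict_proba(vectorizer.transform(corpus))[:, 1]
for doc, score in sorted(zip(corpus, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {doc}")
```

A production review tool wraps this basic loop in sampling, validation, and iterative human review, but the upstream implication is the same: the cleaner and more consistent the documents going in, the better the ranking coming out.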

The point is this: The law is not exempt from the garbage in/garbage out rule.

The application of today’s analytics and machine-learning to law is exposing just how much “garbage” is out there. Dealing with the unstructured and non-standard nature of legal materials, and extracting meaning from them, will keep many lawyers and many machines busy for the foreseeable future. But I would expect that ultimately all this analytics work will result in long-term changes in the way law is created, structured, organized and communicated in the first place.

Lawyers working in this environment — whether they are in private practice, corporations, courts, legislatures, or governments — can expect a future in which not only their work-processes but also their work-product will be subject to more rigorous standardization. The need for sound legal reasoning and advice will remain, but the containers in which that work is delivered will continue to standardize.


David Curle is the director of strategic competitive intelligence at Thomson Reuters Legal, providing research and thought leadership around the competitive environment and the changing legal services industry. This article originally appeared on the Legal Executive Institute blog.