Natural language processing

From The Jolly Contrarian
The JC pontificates about technology

An occasional series.


A great hope of reg tech is natural language processing, which presents itself in a handful of varieties of the same thing: a machine that reads contracts for you.


  • Data extraction: Crawling over your portfolio of 40,000 ISDA Master Agreements[1] to extract the 60 key trading and credit terms out of them that the firm neglected to collect over the last 25 years while it was signing them up;
  • Legal agreement review: algorithmically scanning standard-form contracts[2] to identify key terms and risk provisions and save human lawyers from that tedious chore;
  • Chat bots: An online chat buddy to whom Sales can put basic legal questions, thereby saving Sales the aggravation of having to talk to the legal eagles, and legal the utter tedium of having to answer the exact same question from the exact same salesperson three or four times daily.

Now, reading any text involves judgment, interpretation and the negotiation of ambiguity, and the reader must bring to the text her own understanding of the legal background. Legal language is crafted to avoid ambiguity (there are no metaphors in a trust deed), but there are still infinite ways of expressing the same idea, and if there is one part of the imagination a lawyer loves to stretch, it is the part that invents burlesque ways of saying simple things.

In any case, to understand a well-formed English sentence is not just a matter of applying basic rules of language. It is a dynamic process. So expect natural language processing to be easier said than done.

And so it proves.

Legal agreement review

AI can only follow instructions. The meatware can make a call that the instructions are stupid.

There is a well-known and widely-feted natural language processing application[3] which purports to save resources and reduce risk by performing a preliminary review of, say, confidentiality agreements against a preconfigured playbook.

The idea is triage. The application scans the agreement and, using its natural language processing, picks up the policy points, compares them with the playbook and highlights them so the poor benighted lawyer can quickly deal with the points and respond to the negotiation. The software vendor proudly points to a comparison of its software against human equivalents in picking up policy points in a sample of agreements. The software got 94% of the points. The meatware only got 67%. The software was quicker. And (chuckle) it needed less coffee. Headline: dumb machine beats skilled human.

But this may highlight a shortfall, not a feature, in the application. The day a palaver of risk controllers set their playbook parameters at their exact hard walkaway point is the day Mr. Gorsky gets to the moon. So not everything the playbook says is a problem really is a problem. Much of a playbook will be filled with nice-to-haves and other paranoid ramblings of a Chicken Licken somewhere in a controller group. The very value a lawyer brings is to see a point and say, “yeah, that’s fine, jog on, nothing to see here”. That is the one thing a natural language-processing AI can’t do: the AI can’t make that value judgment and will recommend that you negotiate all playbook points, regardless of how stupid they are.[4] Now if the person operating the AI is an experienced lawyer, she can override the AI’s fecklessness, and just ignore it.

But the point here is to down-skill and save costs, remember. The operator will not be an experienced lawyer. It will be an out-of-work actor in downtown Bratislava who is juggling some ISDA work with a bar job and an Uber gig. He will be possessed of little common sense, no legal training, and will neither know nor care for “the sensible thing to do”. He will follow the machine’s recommendations slavishly: he is, after all, its slave.

Hence: a wildly elongated, pointless negotiation that will waste time and aggravate the client.
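To make the point concrete, here is a minimal sketch in Python of how that plays out. Everything in it is hypothetical: the playbook entries, the `nice_to_have` marker and the function names are ours, not any vendor's. In real life the `nice_to_have` judgment is written nowhere; it lives in the experienced lawyer's head, which is rather the point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaybookPoint:
    name: str
    # Hypothetical marker: no real playbook admits which points are optional.
    nice_to_have: bool


def flag_deviations(playbook, agreement_terms):
    """Flag every playbook point that the agreement does not satisfy."""
    return [point for point in playbook if point.name not in agreement_terms]


def slavish_operator(flags):
    """The inexperienced operator negotiates everything the machine flags."""
    return list(flags)


def experienced_lawyer(flags):
    """The experienced lawyer waves the nice-to-haves through."""
    return [point for point in flags if not point.nice_to_have]


# An invented playbook: two real walkaway points, three nice-to-haves.
PLAYBOOK = [
    PlaybookPoint("mutual obligations", nice_to_have=False),
    PlaybookPoint("no indemnity", nice_to_have=False),
    PlaybookPoint("term no longer than two years", nice_to_have=True),
    PlaybookPoint("English governing law", nice_to_have=True),
    PlaybookPoint("return or destroy on request", nice_to_have=True),
]

# The counterparty's draft satisfies the points that actually matter.
agreement_terms = {"mutual obligations", "no indemnity"}

flags = flag_deviations(PLAYBOOK, agreement_terms)
print("Points raised by the slavish operator:", len(slavish_operator(flags)))
print("Points raised by the experienced lawyer:", len(experienced_lawyer(flags)))
```

On this invented draft the machine flags three points and the operator duly negotiates all three, where the experienced lawyer would have raised none.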

Division of labour

And besides, having the AI spot the issues and asking the meatware to fix the drafting gets the triage squarely backwards. Picking up the points — and recognising the large stupid tracts in the playbook[5] — is the “high value work”. That is what the meatware should be doing. Fixing the drafting is the dreary detail. That is where you want your chatbot. But contextually amending human language — you know, actual “natural language processing” — is hard. No AI that we have seen just yet can do it.

Did I miss something?

And how comfortable can we really be that the AI has spotted everything? If we assume (colour me cynical) that the “natural language processing” isn’t quite as sophisticated as its marketers would have you believe,[6] then it is a bit reckless to put your faith in the reg tech. Is there no human wordsmith who could fool the AI?[7] What if there is an odious clause not anticipated by the playbook?[8] If the meatware can’t wholly trust the AI to have identified all salient points, the lawyer must still read the whole agreement to check. Ergo, no time or cost saving.
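For the sceptics, a minimal sketch of the worry, assuming (as the cynic does) that the tool behaves like a key-word search. We have no idea how any actual product works inside; the word list, the function name and both clauses below are invented for illustration. The list uses the three terms a lazy scanner might look for, plus “indemnify”, which we added ourselves.

```python
import re

# Hypothetical red-flag vocabulary for spotting an indemnity.
RED_FLAG_WORDS = ("indemnity", "indemnify", "hold harmless", "reimburse")


def keyword_scan(clause):
    """Return the red-flag words found in a clause, ignoring case."""
    found = []
    for word in RED_FLAG_WORDS:
        # \b anchors the match at the start of a word; re.escape guards
        # against any regex metacharacters in the vocabulary.
        if re.search(r"\b" + re.escape(word), clause, flags=re.IGNORECASE):
            found.append(word)
    return found


# An indemnity that announces itself.
plain = (
    "The Recipient shall indemnify and hold harmless the Discloser "
    "against all losses arising from a breach of this agreement."
)

# The same commercial effect, with none of the magic words.
disguised = (
    "The Recipient shall make the Discloser whole for any loss, cost or "
    "expense it suffers as a result of a breach of this agreement."
)

print("Plain clause hits:", keyword_scan(plain))
print("Disguised clause hits:", keyword_scan(disguised))
```

The scanner catches the first clause and sails straight past the second, which is why the meatware would still have to read the whole agreement.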

But this software is designed to facilitate “right-sourcing” the negotiation to cheaper (ergo less experienced) negotiators. They will rely on the playbook as guidance, will not have the experience to make a commercial judgement unaided, and will therefore be obliged either to escalate, or to engage on a slew of nice-to-have but bottom-line unnecessary negotiation points with the counterparty. Neither is a good outcome. Again, an example of reg tech creating waste in a process where investment in experienced human personnel would avoid it.

The basic insight here is that if a process is sufficiently low in value that experienced personnel are not justified, it should be fully automated, rather than partially automated and populated by inexperienced personnel.

What is it with Confis?

Note that this natural language processing only ever seems to work with confidentiality agreements: surely the most pointless legal contracts (wait, wait: hear me out, folks) one will encounter in the daily grind. They are well known for containing only the same six points, and infinite ways of saying them. One never sues[9] under a confidentiality agreement because the loss you would suffer under one is by definition a consequential loss shot through with your own contributory negligence.


  1. You know, the ones printed on faded waxy fax paper and languishing in filing cabinets around the trading floor; the ones scanned into a 57 MB tiff file along with three amendments, forty pages of specimen signatures, a power of attorney, hand-annotated emails from Credit and the five key pages of the Schedule missing; the ones that are misfiled as Swiss rahmenvertrags; the ones that are just not there at all.
  2. To date, the only one anyone has successfully managed is the one that no-one really cares about: the confidentiality agreement.
  3. Which shall remain nameless, though you don’t have to be a total nerd to know who we have in mind.
  4. True: this isn’t the AI’s fault, but it is inevitable, and it is the AI’s limitation.
  5. Much of the playbook will be non-essential “perfect world” recommendations (“nice-to-haves”) which an experienced negotiator would quickly be able to wave through.
  6. That it is a glorified key-word search, in other words.
  7. I bet I could. It is hardly challenging to insert an indemnity which does not use the words “indemnity”, “hold harmless” or “reimburse”.
  8. Given how fantastically paranoid a gathering of risk controllers can be this seems a remote risk, I grant you, but risks are fractal, remember. And emergent in unexpectable ways. The collective noun for a group of risk controllers is a Palaver, by the way.
  9. Well: have you ever sued, or been sued under one?