NLG Demystified

The road to commercially deployable Natural Language Generation is straighter than might first appear. The four prerequisites are simple enough: structured data, subject matter expertise, NLG writing talent, and the technology to pull it all together. 

Current NLG evangelism tends to muddy this clear recipe by implying that nascent AI technologies are substitutable for the four basic ingredients: AI can structure your data, study it to become an expert in your field, and then reel off insights like a professor in a leather armchair. That may become true someday, but in the near term we don’t expect our cars to fly over traffic jams. Futurism makes for lively panel discussions and rich academic research, but it is of no use in creating commercially deployable NLG within your project timeline. 

When NLG is depicted as a black box that instantly transforms raw data into perfect text that no human would have thought to write, decision makers should begin checking their watches. The reality is exactly the contrary: deployable NLG produces only what humans have already written. Every word, phrase, and sentence is first scripted by humans and then conditionally invoked by the underlying data so that the final output rings true.
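To make "scripted by humans, conditionally invoked by the data" concrete, here is a minimal sketch in Python. It is only an illustration, not the design of any particular NLG platform: the `Submarket` fields, the half-point threshold, and the sentence wording are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Submarket:
    # Hypothetical record layout, invented for this sketch.
    name: str
    vacancy_now: float   # vacancy rate this quarter, in percent
    vacancy_prev: float  # vacancy rate last quarter, in percent


def vacancy_sentence(s: Submarket) -> str:
    """Select one pre-written sentence according to the data."""
    change = round(s.vacancy_now - s.vacancy_prev, 1)
    # The 0.5-point threshold is an assumed editorial rule, set by a human expert.
    if change >= 0.5:
        return (f"Vacancy in {s.name} rose {change:.1f} points to "
                f"{s.vacancy_now:.1f}%, a sign of softening demand.")
    if change <= -0.5:
        return (f"Vacancy in {s.name} fell {abs(change):.1f} points to "
                f"{s.vacancy_now:.1f}%, a sign of tightening supply.")
    return f"Vacancy in {s.name} held steady at {s.vacancy_now:.1f}%."


print(vacancy_sentence(Submarket("Midtown", 9.2, 8.1)))
```

Nothing here is written by the machine. A person wrote all three sentences and decided when each applies; the data only chooses among them and fills in the numbers.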

Here’s a fact worthy of bold print: The power of commercial NLG today is in the amplification—not the circumvention—of human domain expertise and analytical skill. 

Natural Language Generation defeats a production constraint that has been insurmountable since monks first touched quill to parchment: even the smartest humans write one sentence at a time.

NLG is useful wherever humans are (or could be) writing about structured data sets.

NLG use cases abound. For now, let’s look briefly at Commercial Real Estate (“CRE”) analytics, a field rich with structured data.

Suppose six CRE analysts are tasked with producing quarterly evaluations of 3,000 commercial real estate submarkets. The reports are structurally similar, with content and conclusions varying according to the input data. Each analyst completes two evaluations per day. At the maximum team rate of sixty per week (six analysts, two reports a day, five days a week), we have major problems:

  • Can’t get to all submarkets! Working at full capacity for the entire 13-week quarter, the six analysts will complete 780 reports, leaving nearly 75% of the submarket pool unexamined when the quarter rolls over.
  • Let’s quadruple the headcount. Over a quarter, the 24 analysts can now get to all 3,000 submarkets. Good, right? Not really. Headcount expenses are up by 300%, and the analysts are still working linearly, like those early monks. The last few hundred submarket reports contain conclusions based on increasingly stale data.
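The arithmetic behind both bullets can be checked in a few lines. This is a back-of-the-envelope sketch: the five-day week and the 13-week quarter are assumptions implied by the figures above rather than stated outright, and the function names are made up for the example.

```python
import math

SUBMARKETS = 3_000
REPORTS_PER_ANALYST_PER_DAY = 2
WORKDAYS_PER_WEEK = 5    # assumption: standard working week
WEEKS_PER_QUARTER = 13   # assumption: 52 weeks / 4 quarters


def quarterly_capacity(analysts: int) -> int:
    """Reports a team can finish in one quarter at full capacity."""
    return (analysts * REPORTS_PER_ANALYST_PER_DAY
            * WORKDAYS_PER_WEEK * WEEKS_PER_QUARTER)


def analysts_needed(reports: int) -> int:
    """Smallest team that can cover the given number of reports."""
    return math.ceil(reports / quarterly_capacity(1))


six = quarterly_capacity(6)            # 780 reports
uncovered = 1 - six / SUBMARKETS       # about 0.74, i.e. nearly 75%
quadrupled = quarterly_capacity(24)    # 3,120 reports
minimum_team = analysts_needed(3_000)  # 24 analysts

print(six, f"{uncovered:.0%}", quadrupled, minimum_team)
```

Quadrupling the team is not an exaggeration for effect. Under these assumptions, 24 analysts is the smallest headcount that covers all 3,000 submarkets in a quarter.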

NLG multiplies human output.

The solution is to encode the CRE analysts’ expertise and writing style into an NLG system that will produce an unlimited number of reports all at once. The output documents will contain the same structural consistency and stylistic variations that human analysts render at their one-at-a-time pace, but the speed of the new system is limited only by processing power. Which is to say, analytical capacity is effectively unlimited. In our CRE example, that means 3,000 submarket reports are ready for delivery before the office opens on the first day of the quarter. Each report reads as if it is the peer-reviewed work of human analysts, precisely because it is.
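As a rough sketch of what "all at once" can look like, the Python below renders a sentence for each of 3,000 submarkets in a single pass, rotating among human-written phrasings so the output keeps some stylistic variety. Everything specific in it is an assumption for illustration: the phrasings, the `rent_change_pct` field, the hash-based variant choice, and the synthetic records, which are made-up stand-ins and not market data.

```python
import zlib

# Interchangeable, human-written phrasings for each condition (illustrative).
PHRASINGS = {
    "up": [
        "Asking rents in {name} climbed {pct:.1f}% over the quarter.",
        "{name} posted a {pct:.1f}% gain in asking rents.",
    ],
    "down": [
        "Asking rents in {name} slipped {pct:.1f}% over the quarter.",
        "{name} saw asking rents decline {pct:.1f}%.",
    ],
    "flat": [
        "Asking rents in {name} were unchanged over the quarter.",
        "{name} saw no movement in asking rents.",
    ],
}


def render(record: dict) -> str:
    """Pick a condition from the data, then a phrasing variant for style."""
    change = record["rent_change_pct"]
    condition = "up" if change > 0 else "down" if change < 0 else "flat"
    variants = PHRASINGS[condition]
    # A stable hash of the name keeps each submarket's wording repeatable.
    pick = zlib.crc32(record["name"].encode("utf-8")) % len(variants)
    return variants[pick].format(name=record["name"], pct=abs(change))


def render_all(records: list[dict]) -> dict[str, str]:
    """Render every record; cost grows with compute, not with headcount."""
    return {r["name"]: render(r) for r in records}


# Synthetic stand-in data: 3,000 invented submarkets.
records = [
    {"name": f"Submarket {i:04d}", "rent_change_pct": (i % 7 - 3) * 0.5}
    for i in range(3000)
]
reports = render_all(records)
print(len(reports))
```

A real system would carry far more conditions and far more phrasings, but the shape is the same: the analysts' judgment lives in the rules and the sentences, and the loop simply applies it to every submarket.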

NLG is straightforward, not mysterious.

NLG is commonly presented as hard to understand but easy to create, when in fact the opposite is true. It’s easy to understand, but the up-front work can be daunting. There are no shortcuts, but the effort pays off: the result of a successful NLG project is a business asset that lasts for years. 

For a discussion of how best to get started (including how to bring reluctant traditional writers to the table), take a look at NLG Recipe.

Copyright 2021 Qwerticulation LLC

About the author: Greg Williams is the founder of Nila June Instant Property Descriptions, a natural language platform that produces property listing descriptions from real estate agents’ answers to a simple survey. As head of product for Reis, Greg produced the CRE industry’s most widely deployed NLG system for the evaluation of commercial real estate markets, submarkets, and properties throughout the nation.

Qwerticulation is an AX Semantics Gold Managed Services Provider.

“Qwerticulation” is a portmanteau of “QWERTY” and “articulation.”