
IBM, MIT, and Harvard’s AI uses grammar rules to catch linguistic nuances of U.S. English




What’s the difference between independent and dependent clauses? Is it “me” or is it “I”? And how does “affect” differ from “effect,” really? Ample evidence suggests a strong correlation between grammatical knowledge and writing ability, and new research implies the same might be true of AI. In a pair of preprint papers, scientists at IBM, Harvard, and MIT detail tests of a natural language processing system trained on grammar rules — rules they say helped it to learn faster and perform better.

The work is scheduled to be presented at the North American Chapter of the Association for Computational Linguistics conference in June.

“Grammar helps the model behave in more human-like ways,” said Miguel Ballesteros, a researcher at the MIT-IBM Watson AI Lab and coauthor of both papers, in a statement. “The sequential models don’t seem to care if you finish a sentence with a non-grammatical phrase. Why? Because they don’t see that hierarchy.”

The IBM team, along with scientists from MIT, Harvard, the University of California, Carnegie Mellon University, and Kyoto University, devised a tool set to suss out grammar-aware AI models’ linguistic prowess. As the coauthors explain, one model in question was built on an architecture known as recurrent neural network grammars, or RNNGs, which imbued it with basic grammar knowledge.


The RNNG model and similar models with little-to-no grammar training were fed sentences with good, bad, or ambiguous syntax. The AI systems assigned probabilities to each word, such that in grammatically “off” sentences, low-probability words appeared in place of high-probability ones. Those probabilities were used to measure surprisal, or how unexpected each word was given the words before it.
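
In information-theoretic terms, surprisal is the negative log probability of a word given its left context: surprisal(w) = −log₂ P(w | preceding words). As a minimal sketch of how per-word surprisal is computed (this is not the papers’ code, and an off-the-shelf GPT-2 model stands in for the models under study), the snippet below scores each token of a sentence using the Hugging Face transformers library:

```python
# A minimal sketch of per-word surprisal scoring. GPT-2 (a sequential LM)
# stands in for the models under study; the metric itself is the same:
# surprisal(w) = -log2 P(w | preceding words).
import math

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def surprisals(sentence: str):
    """Return (token, surprisal-in-bits) pairs for each token after the first."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids  # (1, seq_len)
    with torch.no_grad():
        logits = model(ids).logits                            # (1, seq_len, vocab)
    # Log-probability the model assigned to each token that actually came next.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    token_lp = log_probs[torch.arange(targets.size(0)), targets]
    bits = [-lp / math.log(2) for lp in token_lp.tolist()]    # nats -> bits
    return list(zip(tokenizer.convert_ids_to_tokens(targets.tolist()), bits))

for token, s in surprisals("I know that the lion devoured the gazelle at sunrise."):
    print(f"{token:>12}  {s:6.2f} bits")
```

A word that fits its context gets a low surprisal score; a grammatically “off” word in the same position gets a high one.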

The coauthors found that the RNNG system consistently performed better than systems trained with little to no grammar while learning from a fraction of the data, and that it could comprehend “fairly sophisticated” rules. In one instance, it identified that “that” in the sentence clause “I know that the lion devoured at sunrise” improperly appeared instead of “what” to introduce the embedded clause, a construction linguists call a dependency between a filler (a word like “who” or “what”) and a gap (the absence of a phrase where one is typically required).

Filler and gap dependencies are more complicated than you might think. In the sentence “The policeman who the criminal shot the politician with his gun shocked during the trial,” for example, the gap corresponding to the filler “who” is a bit anomalous. Technically, it should come after the verb “shot,” not “shocked.” Here’s the rewritten sentence: “The policeman who the criminal shot with his gun shocked the jury during the trial.”
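
As a rough illustration, the surprisals() sketch above can be reused to compare the anomalous sentence with its repaired form; a model that tracks filler-gap dependencies should find the repaired version less surprising overall. (GPT-2 again stands in for the models under study, so whether it actually separates the two is an empirical question; the studies’ claim is that grammar-enriched models do.)

```python
# Reuses surprisals() from the sketch above. The example sentences come from
# the article; lower total surprisal means the model found the sentence
# more expected overall.
anomalous = ("The policeman who the criminal shot the politician "
             "with his gun shocked during the trial.")
repaired = ("The policeman who the criminal shot with his gun "
            "shocked the jury during the trial.")

for label, sentence in [("anomalous", anomalous), ("repaired", repaired)]:
    total = sum(s for _, s in surprisals(sentence))
    print(f"{label:>9}: {total:7.2f} bits total")
```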

“Without being trained on tens of millions of words, state-of-the-art sequential models don’t care where the gaps are and aren’t in sentences like those,” said Roger Levy, a professor in MIT’s Department of Brain and Cognitive Sciences and a coauthor of the studies. “A human would find that really weird, and apparently, so do grammar-enriched models.”

The researchers claim their work is a promising step toward more accurate language models, but they concede that it requires validation on larger data sets, which they leave to future work.