natural language generation

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

We introduce the Beyond the Imitation Game benchmark (BIG-bench) to inform future research into (large-scale) language modeling, prepare for disruptive new model capabilities, and ameliorate socially harmful effects. A thorough evaluation of state-of-the-art language models illustrates the challenging nature of BIG-bench.

Moral Stories: Situated Reasoning about Norms, Intents, Actions, and their Consequences

We investigate the ability of neural and classification models to reason about (im)moral behavior grounded in concrete, structured social situations.