natural language generation
  
          We introduce the Beyond the Imitation Game benchmark (BIG-bench) to inform future research into (large-scale) language modeling, prepare for disruptive new model capabilities, and ameliorate socially harmful effects. A thorough evaluation of state-of-the-art language models illustrates the challenging nature of BIG-bench.
        
          We investigate the ability of neural and classification models to reason about (im)moral behavior grounded in concrete, structured, social situations.