word sense disambiguation

Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks

We introduce a method for the prediction of disambiguation errors based on statistical data properties, and develop a simple adversarial attack strategy that minimally perturbs sentences in order to elicit disambiguation errors to further probe the robustness of translation models.

Widening the Representation Bottleneck in Neural Machine Translation with Lexical Shortcuts

We argue that the need to represent and propagate lexical features in each layer limits the transformer’s capacity for learning and representing contextual information. To alleviate this bottleneck, we introduce gated shortcut connections between the embedding layer and each subsequent layer within the encoder and decoder, which enables the model to access relevant lexical content dynamically, without expending limited resources on storing it within intermediate states.