SGNMT is an open-source framework for neural machine translation (NMT) and other sequence prediction tasks. It provides a flexible platform for pairing NMT with other models such as language models, length models, or bag2seq models. It supports rescoring of both n-best lists and lattices, and offers a wide variety of search strategies for complex decoding problems.
For example, NMT decoding can be started with this command:
$ python decode.py --predictors nmt --src_test sentences.txt
where sentences.txt is a plain (indexed) text file with sentences. Rescoring OpenFst lattices with NMT is also straightforward:
$ python decode.py --predictors nmt,fst --fst_path lattices/%d.fst --src_test sentences.txt
See the Tutorial: Basics for more examples.
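The `%d` in the `--fst_path` argument above suggests one lattice file per source sentence. The following minimal sketch shows how such a pattern could expand into concrete file names. It assumes that `%d` is replaced by a sentence ID starting at 1, which is not stated here; the helper name `lattice_paths` is hypothetical and not part of SGNMT.

```python
def lattice_paths(pattern, num_sentences, first_id=1):
    """Expand a printf-style pattern into one lattice path per sentence.

    Assumption: sentence IDs are consecutive integers starting at first_id.
    """
    return [pattern % sent_id
            for sent_id in range(first_id, first_id + num_sentences)]


# Three source sentences would map to three lattice files.
print(lattice_paths("lattices/%d.fst", 3))
# → ['lattices/1.fst', 'lattices/2.fst', 'lattices/3.fst']
```

Under this assumption, the number of lattice files has to match the number of lines in sentences.txt.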
SGNMT's features include:
- Syntactically guided neural machine translation (NMT lattice rescoring)
- NMT support in Theano (Blocks) and TensorFlow (Tensor2Tensor)
- n-best list rescoring with NMT
- Integrating external n-gram posterior probabilities used in MBR
- Ensemble NMT decoding
- Forced NMT decoding
- Integrating language models (Kneser-Ney, NPLM, RNNLM)
- Different search algorithms (beam, A*, depth-first search, greedy, ...)
- Target sentence length modelling
- Bag2Sequence models and decoding algorithms
- Joint decoding with word- and subword/character-level models
- Hypothesis recombination
- Heuristic search
- Extensions to NMT training in Blocks (reshuffling, fixing and customizing word embeddings, ...)
- Neural word alignment (Blocks/Theano)
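Several of the features above (ensembling, language model integration, greedy search) come down to combining next-token scores from more than one model. The toy sketch below illustrates one common way to do this: a weighted sum of log-probabilities, followed by greedy search. It does not use SGNMT's actual classes or API. `ToyPredictor`, `combined_scores`, `greedy_decode`, and the probability tables are all hypothetical, and restricting the combination to tokens that every model allows is an assumption made for simplicity.

```python
import math


class ToyPredictor:
    """Hypothetical stand-in for a model that scores the next token."""

    def __init__(self, table):
        # table maps a history tuple to {token: probability}
        self.table = table
        self.history = ()

    def predict_next(self):
        # Unknown histories fall back to ending the sentence.
        dist = self.table.get(self.history, {"</s>": 1.0})
        return {tok: math.log(p) for tok, p in dist.items()}

    def consume(self, token):
        self.history += (token,)


def combined_scores(predictors, weights):
    """Weighted sum of log-probabilities over tokens all predictors allow."""
    posteriors = [p.predict_next() for p in predictors]
    shared = set(posteriors[0])
    for post in posteriors[1:]:
        shared &= set(post)
    return {tok: sum(w * post[tok] for w, post in zip(weights, posteriors))
            for tok in shared}


def greedy_decode(predictors, weights, max_len=10):
    """Pick the best combined token at each step until </s> or max_len."""
    output = []
    for _ in range(max_len):
        scores = combined_scores(predictors, weights)
        best = max(scores, key=scores.get)
        if best == "</s>":
            break
        output.append(best)
        for p in predictors:
            p.consume(best)
    return output


def make_predictors():
    # Made-up probabilities: the "NMT" model slightly prefers "a",
    # while the "language model" strongly prefers "the".
    nmt = ToyPredictor({(): {"a": 0.6, "the": 0.4},
                        ("the",): {"cat": 0.7, "</s>": 0.3}})
    lm = ToyPredictor({(): {"a": 0.1, "the": 0.9},
                       ("the",): {"cat": 0.8, "</s>": 0.2}})
    return nmt, lm


nmt, lm = make_predictors()
print(greedy_decode([nmt, lm], [1.0, 1.0]))  # → ['the', 'cat']
nmt, _ = make_predictors()
print(greedy_decode([nmt], [1.0]))           # → ['a']
```

With equal weights, the language model overrides the first-step preference of the toy NMT model. Changing the weights shifts that balance, and other search strategies such as beam search would replace only `greedy_decode`.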
Contributors:
- Felix Stahlberg, University of Cambridge
- Eva Hasler, SDL Research
The project is licensed under the Apache 2 license.