.. _examples-label:

Examples
===============================

SGNMT supports many different options, and small changes in the decoder
configuration sometimes lead to very different results. This page contains a
list of SGNMT configuration files which have been used in our group. Details
such as paths and vocabulary sizes are exemplary as we do not provide model
files. Some of the examples are not compatible with SGNMT 1.0 and above as
they rely on the Theano/Blocks or NPLM backends which are no longer
supported. However, we hope that this list is still useful as a blueprint for
your own experiments.

University of Cambridge submission to WMT18 (Tensor2Tensor)
************************************************************

These are the ini-files for the final decoding passes. Please write to
fs439@cam.ac.uk for access to the trained models.

English-German::

  predictors: t2t,t2t,ngramc,ngramc,ngramc,ngramc,wc
  predictor_weights: 5.7,4.3,0.1,0.1,1.0,0.75,1.5
  src_test: /data/mifs_scratch/fs439/data/wmt18/test/indexed_bpe/test18ts.ende.bpe.en
  beam: 8
  early_stopping: false
  pred_src_vocab_size: 35627
  pred_trg_vocab_size: 35627
  indexing_scheme: t2t
  t2t_problem: translate_ende_wmt18
  # Relative transformer model
  t2t_checkpoint_dir: t2t_train/translate_ende_wmt18/transformer-transformer_relative_big_large_batch4/average.1000k/
  t2t_model: transformer
  t2t_hparams_set: transformer_relative_big
  # Transformer model
  t2t_checkpoint_dir2: t2t_train/translate_ende_wmt18/transformer-transformer_big_large_batch4/average.1000k
  t2t_model2: transformer
  t2t_hparams_set2: transformer_big
  # slicenet ngram posteriors
  ngramc_path: supplementary/ende2/ss.ngramc_test18ts/%d.txt
  # rnn ngram posteriors
  ngramc_path2: supplementary/ende2/rr.ngramc_test18ts/%d.txt
  # r2l ngram posteriors
  ngramc_path3: supplementary/ende2/ender.ngramc_test18ts/%d.txt
  # SDL posteriors
  ngramc_path4: supplementary/ende/sdl.ngramc_test18ts/%d.txt
  outputs: nbest,text,ngram

German-English (like English-German, but with the following predictor weights)::

  predictor_weights: 4.2,3.8,0.1,0.125,0.75,1.5,-1.2

Chinese-English::

  verbosity: debug
  predictors: t2t,t2t,ngramc,ngramc,ngramc,ngramc,wc
  predictor_weights: 6.5,5.5,0.375,0.375,0.375,0.5,-0.5
  src_test: /data/mifs_scratch/fs439/data/wmt18/test/indexed_bpe/test18.enzh.bpe.zh
  beam: 8
  early_stopping: false
  pred_src_vocab_size: 43738
  pred_trg_vocab_size: 34306
  indexing_scheme: t2t
  t2t_problem: translate_zhen_wmt18
  # Relative transformer model
  t2t_checkpoint_dir: t2t_train/translate_zhen_wmt18/transformer-transformer_relative_big_large_batch4/average.1000k/
  t2t_model: transformer
  t2t_hparams_set: transformer_relative_big
  # Transformer model
  t2t_checkpoint_dir2: t2t_train/translate_zhen_wmt18/transformer-transformer_big_large_batch4/average.1000k
  t2t_model2: transformer
  t2t_hparams_set2: transformer_big
  # slicenet ngram posteriors
  ngramc_path: supplementary/zhen2/ss.ngramc_test18/%d.txt
  # rnn ngram posteriors
  ngramc_path2: supplementary/zhen2/rr.ngramc_test18/%d.txt
  # r2l ngram posteriors
  ngramc_path3: supplementary/zhen2/zhenr.ngramc_test18/%d.txt
  # SDL posteriors
  ngramc_path4: supplementary/zhen/sdl.ngramc_test18/%d.txt
  outputs: nbest,text,ngram
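The ``predictors`` and ``predictor_weights`` entries in the configurations
above define a weighted combination of several scoring modules (two
Tensor2Tensor models, four n-gram posterior predictors, and a word count
predictor). Conceptually, the combined hypothesis score is a weighted sum of
the individual predictor scores. The following Python snippet is only an
illustrative sketch of that combination, assuming SGNMT's default
summation-based combination scheme; it is not SGNMT code::

  # Illustrative sketch (not SGNMT code): combine per-predictor log scores
  # with the predictor_weights from the English-German config above.
  # Assumption: the default combination scheme sums weighted log scores.
  weights = [5.7, 4.3, 0.1, 0.1, 1.0, 0.75, 1.5]  # t2t,t2t,ngramc x4,wc

  def combined_score(predictor_scores):
      """Combine the log scores assigned by the predictors to one hypothesis."""
      return sum(w * s for w, s in zip(weights, predictor_scores))

A negative weight, as on the word count predictor in the German-English and
Chinese-English settings, turns the corresponding predictor into a penalty.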
Joint Word/BPE decoding with multisegbeam (Blocks/Theano)
************************************************************

Ensemble of a word-to-word system (vocabulary size: 30003) and a BPE-to-BPE
system using the ``multisegbeam`` decoding strategy, tested on WAT (ja-en)::

  predictors: nmt,altsrc_nmt
  src_test: data/dev.ids.ja
  altsrc_test: data/dev.bpe.ja
  decoder: multisegbeam
  multiseg_tokenizations: '30003:data/wmap.en,eow:data/wmap.bpe.en'

Lattice rescoring with three NMT systems (Blocks/Theano)
************************************************************

Rescoring SMT lattices with an ensemble of three NMT systems on ja-en WAT::

  predictors: fst,nmt,nmt,nmt
  predictor_weights: 0.5,0.5,0.5,0.5
  src_test: data/test.bpe.ja
  early_stopping: false
  hypo_recombination: true
  fst_path: lats.bpe_test/%d.fst
  use_fst_weights: true
  gnmt_beta: 0.01
  nmt_config: src_vocab_size=32081,trg_vocab_size=30123
  nmt_path2: ../jaen-wat-bpe2/train
  nmt_path3: ../jaen-wat-bpe3/train

Iterative NMT beam search (Blocks/Theano)
**************************************************

Using the ``bucket`` decoding strategy for pure NMT decoding with multiple
beam search passes::

  src_test: data/test15.ids.en
  decoder: bucket
  bucket_selector: iter
  max_node_expansions: 2000
  predictors: nmt
  length_normalization: true
  nmt_config: src_vocab_size=50003,trg_vocab_size=50003

MBR-based NMT with a 3-ensemble (Blocks/Theano)
**************************************************

Using MBR-style n-gram posteriors together with an ensemble of three NMT
systems on ja-en WAT::

  predictors: nmt,ngramc,wc,nmt,nmt
  predictor_weights: 0.53125,0.46875,0.46875,0.53125,0.53125
  src_test: data/test.bpe.ja
  allow_unk_in_output: false
  early_stopping: false
  hypo_recombination: true
  ngramc_path: lats.ngramc.smooth.bpe_test/%d.txt
  gnmt_beta: 0.01
  nmt_config: src_vocab_size=32081,trg_vocab_size=30123
  nmt_path2: ../jaen-wat-bpe2/train
  nmt_path3: ../jaen-wat-bpe3/train

MBR-based NMT with separately tuned Thetas (Blocks/Theano)
************************************************************

Tuning the n-gram posterior weights for each n-gram order separately on
WMT15 en-de::

  predictors: ngramc,ngramc,ngramc,ngramc,wc,nmt
  predictor_weights: 1.0,0.603674,0.0950858,0.514882,0.713726,0.510412
  src_test: data/test15.ids.en
  beam: 2
  hypo_recombination: true
  allow_unk_in_output: false
  ngramc_path: lats.mapped.ngramc.smooth_test15/%d.txt
  ngramc_order: 1
  ngramc_order2: 2
  ngramc_order3: 3
  ngramc_order4: 4
  gnmt_beta: 0.2
  nmt_config: src_vocab_size=50003,trg_vocab_size=50003
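The two MBR-style examples above rely on ``ngramc`` predictors that read
n-gram posteriors from per-sentence files, either with a single predictor or
with one predictor per n-gram order so that the order-specific weights
(Thetas) can be tuned separately. The sketch below only illustrates the
underlying scoring idea, namely summing the posteriors of all n-grams a
hypothesis contains with one weight per order. The file format and the exact
scoring in SGNMT's ``ngramc`` predictor may differ; this is not SGNMT code::

  # Illustrative sketch of MBR-style n-gram scoring (not SGNMT code).
  # posteriors: dict mapping an n-gram (tuple of token ids) to its posterior,
  # e.g. loaded from the per-sentence files referenced by ngramc_path.
  # thetas: one weight per n-gram order (order 1 first).
  def mbr_style_score(hypothesis, posteriors, thetas):
      score = 0.0
      for order, theta in enumerate(thetas, start=1):
          for start in range(len(hypothesis) - order + 1):
              ngram = tuple(hypothesis[start:start + order])
              score += theta * posteriors.get(ngram, 0.0)
      return score

  # Example: order-specific weights as tuned in the WMT15 en-de config above.
  thetas = [1.0, 0.603674, 0.0950858, 0.514882]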
Exhaustive n-best list rescoring with NMT (Blocks/Theano)
************************************************************

Rescoring a complete n-best list on WMT15 en-de. The n-best list and the NMT
system use different word IDs::

  src_test: data/test15.ids.en
  decoder: dfs
  early_stopping: false
  predictors: idxmap_forcedlst,nmt
  src_idxmap: data/idxmap.org.test15.en
  trg_idxmap: data/idxmap.org.test15.de
  use_nbest_weights: false
  trg_test: ../ende-wmt15/results/hifst_test15.1000best
  nmt_config: src_vocab_size=50003,trg_vocab_size=50003

NMT with length model and neural language model (Blocks/Theano)
*****************************************************************

Pairing NMT with a length model and a neural language model, and rescoring
HiFST translation lattices with the mix. The lattice and the LM use
alternative word maps::

  src_test: data/test15.ids.en
  predictors: idxmap_fst,length,idxmap_nplm,nmt
  predictor_weights: 8.144021,0.579325,1.192874,4.347711
  src_idxmap: data/idxmap.org.test15.en
  trg_idxmap: data/idxmap.org.test15.de
  fst_path: lats_test15/%d.fst
  nmt_config: src_vocab_size=50003,trg_vocab_size=50003
  use_fst_weights: true
  length_model_weights: 0.252503399924538,1.26556504208994,0.0476145832475248,0.507108282728234,0.0706249583462012,0.00156446527534046,-0.0114873442886072,0.00724551243039656,-0.108343582699869,-0.225865854796484,0.183585648431748,-0.367378141618226
  src_test_raw: data/test15.en
  nplm_path: nplm/news12-14.de.nnlm.news12-14.5gram-model.large_8.50000.0.10.24.3186.10
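Several options in the configurations on this page, for example
``ngramc_path`` and ``fst_path``, use a ``%d`` placeholder which SGNMT
substitutes with the sentence index, so each source sentence has its own
lattice or n-gram posterior file. The snippet below merely illustrates this
naming convention; whether the indices start at 0 or 1 depends on how the
files were produced and is an assumption here::

  # Illustrative sketch (not SGNMT code): expanding a %d path template into
  # the per-sentence files a config like the one above expects.
  # Assumption: sentence indices start at 1; adjust if your setup is 0-based.
  fst_path = "lats_test15/%d.fst"   # from the config above
  num_sentences = 3                 # hypothetical test set size

  for sent_id in range(1, num_sentences + 1):
      print(fst_path % sent_id)     # lats_test15/1.fst, 2.fst, 3.fst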