Cambridge SMT System
Data class containing relevant variables. To be used as template for task classes using it.
#include <data-main.createssgrammar.hpp>
Public Member Functions
HifstTaskData () | |
HifstTaskData () | |
fst::VectorFst< ArcT > * | getFst (std::string const &key) |
Public Attributes
uint | sidx |
Sentence index.
const GrammarData * | grammar |
Contains translation grammar.
unordered_map< std::size_t, std::string > | oovwmap |
Contains OOVs (out-of-vocabulary words).
std::string | originalsentence |
Source sentence.
std::string | tokenizedsentence |
std::string | sentence |
std::vector< std::string > | pinstances |
Pattern instances.
unordered_map< std::string, std::vector< pair< uint, uint > > > | hpinstances |
SentenceSpecificGrammarData * | ssgd |
Sentence-specific grammar information – hashes to rule indices.
unordered_set< std::string > | tvcb |
Target vocabulary.
CYKdata * | cykdata |
CYK data structures.
boost::shared_ptr< ucam::fsttools::StatsData > | stats |
To collect statistics across the whole pipeline.
std::string * | translation |
Translated sentence will be stored here.
unordered_set< std::string > * | recasingvcblm |
Mixed-case vocabulary of the recasing unigram language model.
unordered_map< std::string, ucam::util::WordMapper * > | wm |
Wordmap/Integer map objects.
unsigned | sidx |
Sentence index.
unordered_map< std::string, std::vector< pair< unsigned, unsigned > > > | hpinstances |
std::vector< fst::VectorFst< ArcT > * > | filters |
unordered_map< std::string, void * > | fsts |
Pointers to lattices (e.g. translation lattice, lmbr, etc.) and related objects, accessed by unique keys.
unordered_map< std::string, std::vector< const KenLMData * > > | klm |
Collections of language models accessed by key (e.g. translation needs several for hifst and one for the recaser).
unsigned | numlocallm |
Number of local language models used in hifst.
boost::shared_ptr< StatsData > | stats |
To collect statistics across the whole pipeline.
grammar_inversecategories_t | vcat |
This information is used for stats.
unordered_map< std::string, WordMapper * > | wm |
Wordmap/Integer map objects.
Data class containing relevant variables. To be used as template for task classes using it.
Definition at line 31 of file data-main.createssgrammar.hpp.
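ucam::hifst::HifstTaskData< ArcT >::HifstTaskData () | inline |
Definition at line 37 of file data-main.createssgrammar.hpp.
ucam::hifst::HifstTaskData< ArcT >::HifstTaskData () | inline |
Definition at line 38 of file data-main.hifst.hpp.
fst::VectorFst< ArcT > * ucam::hifst::HifstTaskData< ArcT >::getFst (std::string const &key) | inline |
Definition at line 86 of file data-main.hifst.hpp.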
CYKdata * ucam::hifst::HifstTaskData< ArcT >::cykdata |
CYK data structures.
Definition at line 73 of file data-main.createssgrammar.hpp.
std::vector< fst::VectorFst<ArcT> *> ucam::hifst::HifstTaskData< ArcT >::filters |
Definition at line 80 of file data-main.hifst.hpp.
unordered_map<std::string, void * > ucam::hifst::HifstTaskData< ArcT >::fsts |
Pointers to lattices (e.g. translation lattice, lmbr, etc.) and related objects, accessed by unique keys.
Definition at line 84 of file data-main.hifst.hpp.
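The following is a minimal sketch of how a type-erased map such as `fsts` can be paired with a typed accessor like `getFst`. It is an illustration under stated assumptions, not the library's implementation: `FakeVectorFst`, `FakeArc` and `TaskDataSketch` are hypothetical stand-ins (so the example compiles without OpenFst), and returning a null pointer for a missing key is an assumed behaviour that this page does not confirm.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for fst::VectorFst<ArcT>, so the sketch needs no OpenFst.
template <class ArcT>
struct FakeVectorFst {
  int num_states = 0;
};

// Hypothetical arc type used only to instantiate the templates.
struct FakeArc {};

template <class ArcT>
struct TaskDataSketch {
  // Type-erased pointers keyed by a unique name, mirroring `fsts`.
  std::unordered_map<std::string, void *> fsts;

  // Assumed behaviour: return the typed pointer stored under `key`,
  // or nullptr when the key is absent.
  FakeVectorFst<ArcT> *getFst(std::string const &key) const {
    auto it = fsts.find(key);
    if (it == fsts.end()) return nullptr;
    // The cast is only safe if the pointer was stored with this exact type.
    return static_cast<FakeVectorFst<ArcT> *>(it->second);
  }
};
```

Because the map stores `void *`, the caller is responsible for storing and retrieving each key with the same concrete type; the compiler cannot check this.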
const GrammarData * ucam::hifst::HifstTaskData< ArcT >::grammar |
Contains translation grammar.
Definition at line 49 of file data-main.createssgrammar.hpp.
unordered_map<std::string, std::vector< pair <uint, uint> > > ucam::hifst::HifstTaskData< ArcT >::hpinstances |
Maps each pattern instance (a string) found in the sentence to a list of pairs: the position at which the instance was encountered (first) and its minimum span (second).
Definition at line 64 of file data-main.createssgrammar.hpp.
unordered_map<std::string, std::vector< pair <unsigned, unsigned> > > ucam::hifst::HifstTaskData< ArcT >::hpinstances |
Maps each pattern instance (a string) found in the sentence to a list of pairs: the position at which the instance was encountered (first) and its minimum span (second).
Definition at line 67 of file data-main.hifst.hpp.
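As a rough illustration of the `hpinstances` layout, the sketch below fills and queries a map of the same shape. The helper names `recordInstance` and `countInstances`, the alias `PatternHits`, and the pattern string used in the usage note are hypothetical; the actual pattern encoding and the code that populates the map are not shown on this page.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Same shape as hpinstances: pattern string -> list of (position, minimum span).
using PatternHits =
    std::unordered_map<std::string, std::vector<std::pair<unsigned, unsigned>>>;

// Record that `pattern` was encountered at word position `pos`
// with minimum span `span`.
inline void recordInstance(PatternHits &hp, const std::string &pattern,
                           unsigned pos, unsigned span) {
  hp[pattern].emplace_back(pos, span);
}

// Number of positions at which `pattern` was encountered (0 if never seen).
inline std::size_t countInstances(const PatternHits &hp,
                                  const std::string &pattern) {
  auto it = hp.find(pattern);
  return it == hp.end() ? 0 : it->second.size();
}
```

For example, after `recordInstance(hp, "a_X_b", 0, 3)` and `recordInstance(hp, "a_X_b", 4, 3)`, the entry for `"a_X_b"` holds two pairs, one per position.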
unordered_map<std::string, std::vector <const KenLMData *> > ucam::hifst::HifstTaskData< ArcT >::klm |
Collections of language models accessed by key (e.g. translation needs several for hifst and one for the recaser).
Definition at line 99 of file data-main.hifst.hpp.
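The keyed collections in `klm` can be pictured with the sketch below, which groups language-model pointers under string keys and looks them up safely. `FakeLM` is a hypothetical stand-in for `KenLMData`, `modelsFor` is an illustrative helper that is not part of the documented class, and the key names in the usage note are assumptions.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for KenLMData.
struct FakeLM {
  std::string name;
};

// Same shape as klm: key -> collection of const language-model pointers.
using LMCollections =
    std::unordered_map<std::string, std::vector<const FakeLM *>>;

// Return the models registered under `key`, or an empty vector if none exist.
// Using find() avoids inserting an empty entry as operator[] would.
inline std::vector<const FakeLM *> modelsFor(const LMCollections &klm,
                                             const std::string &key) {
  auto it = klm.find(key);
  if (it == klm.end()) return {};
  return it->second;
}
```

With two models stored under an assumed key such as `"hifst"` and one under `"recaser"`, `modelsFor(klm, "hifst").size()` is 2 and `modelsFor(klm, "recaser").size()` is 1.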
unsigned ucam::hifst::HifstTaskData< ArcT >::numlocallm |
Number of local language models used in hifst.
Definition at line 101 of file data-main.hifst.hpp.
unordered_map< std::size_t, std::string > ucam::hifst::HifstTaskData< ArcT >::oovwmap |
Contains OOVs (out-of-vocabulary words).
Definition at line 52 of file data-main.createssgrammar.hpp.
std::string ucam::hifst::HifstTaskData< ArcT >::originalsentence |
Source sentence.
Definition at line 55 of file data-main.createssgrammar.hpp.
std::vector< std::string > ucam::hifst::HifstTaskData< ArcT >::pinstances |
Pattern instances.
Definition at line 60 of file data-main.createssgrammar.hpp.
unordered_set< std::string > * ucam::hifst::HifstTaskData< ArcT >::recasingvcblm |
Mixed-case vocabulary of the recasing unigram language model.
Definition at line 82 of file data-main.createssgrammar.hpp.
std::string ucam::hifst::HifstTaskData< ArcT >::sentence |
Definition at line 57 of file data-main.createssgrammar.hpp.
uint ucam::hifst::HifstTaskData< ArcT >::sidx |
Sentence index.
Definition at line 44 of file data-main.createssgrammar.hpp.
unsigned ucam::hifst::HifstTaskData< ArcT >::sidx |
Sentence index.
Definition at line 46 of file data-main.hifst.hpp.
SentenceSpecificGrammarData * ucam::hifst::HifstTaskData< ArcT >::ssgd |
Sentence-specific grammar information – hashes to rule indices.
Definition at line 67 of file data-main.createssgrammar.hpp.
boost::shared_ptr<ucam::fsttools::StatsData> ucam::hifst::HifstTaskData< ArcT >::stats |
To collect statistics across the whole pipeline.
Definition at line 76 of file data-main.createssgrammar.hpp.
boost::shared_ptr<StatsData> ucam::hifst::HifstTaskData< ArcT >::stats |
To collect statistics across the whole pipeline.
Definition at line 104 of file data-main.hifst.hpp.
std::string ucam::hifst::HifstTaskData< ArcT >::tokenizedsentence |
Definition at line 56 of file data-main.createssgrammar.hpp.
std::string * ucam::hifst::HifstTaskData< ArcT >::translation |
Translated sentence will be stored here.
Definition at line 79 of file data-main.createssgrammar.hpp.
unordered_set< std::string > ucam::hifst::HifstTaskData< ArcT >::tvcb |
Target vocabulary.
Definition at line 70 of file data-main.createssgrammar.hpp.
grammar_inversecategories_t ucam::hifst::HifstTaskData< ArcT >::vcat |
This information is used for stats.
Definition at line 113 of file data-main.hifst.hpp.
unordered_map<std::string, ucam::util::WordMapper *> ucam::hifst::HifstTaskData< ArcT >::wm |
Wordmap/Integer map objects.
Definition at line 85 of file data-main.createssgrammar.hpp.
unordered_map<std::string, WordMapper *> ucam::hifst::HifstTaskData< ArcT >::wm |
Wordmap/Integer map objects.
Definition at line 116 of file data-main.hifst.hpp.