Cambridge SMT System
ucam::hifst::HifstTaskData< ArcT > Class Template Reference

Data class containing the variables that tasks read and write. Intended as the template parameter of the task classes that use it.

#include <data-main.createssgrammar.hpp>

Public Member Functions

 HifstTaskData ()
 
 HifstTaskData ()
 
fst::VectorFst< ArcT > * getFst (std::string const &key)
 

Public Attributes

uint sidx
 Sentence index.
 
const GrammarData * grammar
 Contains translation grammar.
 
unordered_map< std::size_t, std::string > oovwmap
 Contains OOVs (out-of-vocabulary words).
 
std::string originalsentence
 Source sentence.
 
std::string tokenizedsentence
 
std::string sentence
 
std::vector< std::string > pinstances
 Pattern instances.
 
unordered_map< std::string, std::vector< pair< uint, uint > > > hpinstances
 
SentenceSpecificGrammarData * ssgd
 Sentence-specific grammar information – hashes to rule indices.
 
unordered_set< std::string > tvcb
 Target vocabulary.
 
CYKdata * cykdata
 CYK data structures.
 
boost::shared_ptr< ucam::fsttools::StatsData > stats
 Collects statistics across the whole pipeline.
 
std::string * translation
 Translated sentence will be stored here.
 
unordered_set< std::string > * recasingvcblm
 Mixed-case vocabulary of the recasing unigram language model.
 
unordered_map< std::string, ucam::util::WordMapper * > wm
 Wordmap/Integer map objects.
 
unsigned sidx
 Sentence index.
 
unordered_map< std::string, std::vector< pair< unsigned, unsigned > > > hpinstances
 
std::vector< fst::VectorFst< ArcT > * > filters
 
unordered_map< std::string, void * > fsts
 Pointers to lattices (e.g. translation lattice, LMBR, etc.) and related objects, accessed by unique keys.
 
unordered_map< std::string, std::vector< const KenLMData * > > klm
 Collections of language models accessed by keys (e.g. translation needs several for hifst and one for the recaser).
 
unsigned numlocallm
 Number of local language models used in hifst.
 
boost::shared_ptr< StatsData > stats
 Collects statistics across the whole pipeline.
 
grammar_inversecategories_t vcat
 This information is used for stats.
 
unordered_map< std::string, WordMapper * > wm
 Wordmap/Integer map objects.
 

Detailed Description

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
class ucam::hifst::HifstTaskData< ArcT >

Data class containing the variables that tasks read and write. Intended as the template parameter of the task classes that use it.

Definition at line 31 of file data-main.createssgrammar.hpp.

Constructor & Destructor Documentation

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
ucam::hifst::HifstTaskData< ArcT >::HifstTaskData ( )
inline

Definition at line 37 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
ucam::hifst::HifstTaskData< ArcT >::HifstTaskData ( )
inline

Definition at line 38 of file data-main.hifst.hpp.

Member Function Documentation

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
fst::VectorFst<ArcT>* ucam::hifst::HifstTaskData< ArcT >::getFst ( std::string const &  key)
inline

Definition at line 86 of file data-main.hifst.hpp.

Member Data Documentation

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
CYKdata * ucam::hifst::HifstTaskData< ArcT >::cykdata

CYK data structures.

Definition at line 73 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
std::vector< fst::VectorFst<ArcT> *> ucam::hifst::HifstTaskData< ArcT >::filters

Definition at line 80 of file data-main.hifst.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
unordered_map<std::string, void * > ucam::hifst::HifstTaskData< ArcT >::fsts

Pointers to lattices (e.g. translation lattice, LMBR, etc.) and related objects, accessed by unique keys.

Definition at line 84 of file data-main.hifst.hpp.
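
The combination of a type-erased fsts map and a typed getFst accessor can be illustrated with a minimal sketch. This is not the implementation in data-main.hifst.hpp, which is not reproduced on this page: ToyFst and ToyTaskData are hypothetical stand-ins for fst::VectorFst< ArcT > and HifstTaskData< ArcT >, and returning nullptr for a missing key is an assumption made for the example.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for fst::VectorFst<ArcT>; not the OpenFst class.
template <class ArcT>
struct ToyFst {
  int numstates = 0;
};

// Hypothetical reduced task data: only the type-erased lattice map.
template <class ArcT>
struct ToyTaskData {
  // Lattices are stored as void * and looked up by a unique string key.
  std::unordered_map<std::string, void *> fsts;

  // Returns the lattice stored under key, or nullptr if the key is absent
  // (assumed behaviour). The cast is only valid if the pointer was
  // originally stored as a ToyFst<ArcT> *.
  ToyFst<ArcT> *getFst(std::string const &key) {
    auto it = fsts.find(key);
    if (it == fsts.end()) return nullptr;
    return static_cast<ToyFst<ArcT> *>(it->second);
  }
};
```

Storing void * lets one map hold lattices of different types, at the cost that the caller must know which type sits behind each key; a typed accessor keeps that cast in one place.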

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
const GrammarData * ucam::hifst::HifstTaskData< ArcT >::grammar

Contains translation grammar.

Definition at line 49 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
unordered_map<std::string, std::vector< pair <uint, uint> > > ucam::hifst::HifstTaskData< ArcT >::hpinstances

Holds the patterns (strings) instantiated over the sentence, each mapped to a list of pairs: the position at which the pattern was encountered (first) and the minimum span (second).

Definition at line 64 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
unordered_map<std::string, std::vector< pair <unsigned, unsigned> > > ucam::hifst::HifstTaskData< ArcT >::hpinstances

Holds the patterns (strings) instantiated over the sentence, each mapped to a list of pairs: the position at which the pattern was encountered (first) and the minimum span (second).

Definition at line 67 of file data-main.hifst.hpp.
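
As a rough illustration of the hpinstances layout, the sketch below fills a map of the same type. The alias PatternHits, the helper addInstance and the pattern key used in the usage note are hypothetical names invented for this example; how the real pipeline populates the member is not shown on this page.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Same shape as hpinstances: pattern -> list of (position, minimum span).
using PatternHits =
    std::unordered_map<std::string,
                       std::vector<std::pair<unsigned, unsigned> > >;

// Hypothetical helper: records that pattern was seen at the given
// sentence position with the given minimum span.
inline void addInstance(PatternHits &hp, std::string const &pattern,
                        unsigned position, unsigned minspan) {
  hp[pattern].push_back(std::make_pair(position, minspan));
}
```

Calling addInstance(hp, "toy_pattern", 0, 2) and then addInstance(hp, "toy_pattern", 4, 2) leaves a single key whose vector holds two pairs, one per occurrence in the sentence.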

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
unordered_map<std::string, std::vector <const KenLMData *> > ucam::hifst::HifstTaskData< ArcT >::klm

Collections of language models accessed by keys (e.g. translation needs several for hifst and one for the recaser).

Definition at line 99 of file data-main.hifst.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
unsigned ucam::hifst::HifstTaskData< ArcT >::numlocallm

Number of local language models used in hifst.

Definition at line 101 of file data-main.hifst.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
unordered_map< std::size_t, std::string > ucam::hifst::HifstTaskData< ArcT >::oovwmap

Contains OOVs (out-of-vocabulary words).

Definition at line 52 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
std::string ucam::hifst::HifstTaskData< ArcT >::originalsentence

Source sentence.

Definition at line 55 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
std::vector< std::string > ucam::hifst::HifstTaskData< ArcT >::pinstances

Pattern instances.

Definition at line 60 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
unordered_set< std::string > * ucam::hifst::HifstTaskData< ArcT >::recasingvcblm

Mixed-case vocabulary of the recasing unigram language model.

Definition at line 82 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
std::string ucam::hifst::HifstTaskData< ArcT >::sentence

Definition at line 57 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
uint ucam::hifst::HifstTaskData< ArcT >::sidx

Sentence index.

Definition at line 44 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
unsigned ucam::hifst::HifstTaskData< ArcT >::sidx

Sentence index.

Definition at line 46 of file data-main.hifst.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
SentenceSpecificGrammarData * ucam::hifst::HifstTaskData< ArcT >::ssgd

Sentence-specific grammar information – hashes to rule indices.

Definition at line 67 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
boost::shared_ptr<ucam::fsttools::StatsData> ucam::hifst::HifstTaskData< ArcT >::stats

Collects statistics across the whole pipeline.

Definition at line 76 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
boost::shared_ptr<StatsData> ucam::hifst::HifstTaskData< ArcT >::stats

Collects statistics across the whole pipeline.

Definition at line 104 of file data-main.hifst.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
std::string ucam::hifst::HifstTaskData< ArcT >::tokenizedsentence

Definition at line 56 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
std::string * ucam::hifst::HifstTaskData< ArcT >::translation

Translated sentence will be stored here.

Definition at line 79 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
unordered_set< std::string > ucam::hifst::HifstTaskData< ArcT >::tvcb

Target vocabulary.

Definition at line 70 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
grammar_inversecategories_t ucam::hifst::HifstTaskData< ArcT >::vcat

This information is used for stats.

Definition at line 113 of file data-main.hifst.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
unordered_map<std::string, ucam::util::WordMapper *> ucam::hifst::HifstTaskData< ArcT >::wm

Wordmap/Integer map objects.

Definition at line 85 of file data-main.createssgrammar.hpp.

template<class ArcT = fst::LexicographicArc< fst::StdArc::Weight, fst::StdArc::Weight>>
unordered_map<std::string, WordMapper *> ucam::hifst::HifstTaskData< ArcT >::wm

Wordmap/Integer map objects.

Definition at line 116 of file data-main.hifst.hpp.


The documentation for this class was generated from the following files: