Cambridge SMT System
Struct containing grammar rules.
#include <data.grammar.hpp>
Public Member Functions | |
GrammarData () | |
GrammarData constructor. Initializes GrammarData with empty information. | |
~GrammarData () | |
Destructor. | |
void | reset () |
Reset object. | |
const std::string | getRule (std::size_t idx) const |
Gets a rule indexed by idx. Rule format: LHS RHSSource RHSTarget weight. | |
const std::string | getLHS (std::size_t idx) const |
Gets left-hand-side of the rule indexed by idx. | |
const std::string | getRHSSource (std::size_t idx) const |
Gets right-hand-side source for a rule using rule index idx. | |
const std::string | getRHSSource (std::size_t idx, uint rulepos) const |
Gets element at position rulepos from the right-hand-side source for a rule indexed by idx. | |
const std::vector< std::string > | getRHSSplitSource (std::size_t idx) const |
Gets a split version of the RHS (source). | |
const uint | getRHSSourceSize (std::size_t idx) const |
Gets number of elements in the RHS source. | |
const std::string | getRHSTranslation (std::size_t idx) const |
Returns RHS translation part of a rule accessed by index idx. | |
const std::vector< std::string > | getRHSSplitTranslation (std::size_t idx) const |
Returns the translation as a vector of elements. | |
const uint | getRHSTranslationSize (std::size_t idx) const |
Returns the number of elements in translation for a given rule. | |
const float | getWeight (std::size_t idx) const |
Returns weight of a rule accessed by index idx. | |
void | getLinks (std::size_t idx, std::vector< unsigned > &links) const |
const bool | isPhrase (std::size_t idx) const |
Checks whether the rule is a phrase rule (as opposed to a hierarchical rule). | |
const std::size_t | getIdx (std::size_t idx) const |
Gets the real position (line) in the (potentially unsorted) file. | |
void | getMappings (std::size_t idx, unordered_map< uint, uint > *mappings) const |
Returns the non-terminal mappings. For more details see getRuleMappings function. | |
const bool | isAcceptedByVocabulary (const std::size_t idx, const unordered_set< std::string > &vcb) const |
Determines whether a particular rule is allowed within a vocabulary, i.e. whether all target words of the rule exist within this vocabulary. NOTE: If the vocabulary is empty, it always returns true; that is, no vocabulary restriction is applied. | |
Public Attributes | |
std::string | filecontents |
The whole grammar. | |
posindex * | vpos |
Sorted Indices. | |
std::size_t | sizeofvpos |
Number of rules. | |
unordered_set< std::string > | patterns |
Patterns in these rules. | |
CompareTool * | ct |
Pointer to a Comparison object, assumed no ownership. | |
grammar_categories_t | categories |
Ordered list of non-terminals (listed in hierarchical order according to identity rules). | |
grammar_inversecategories_t | vcat |
Struct containing grammar rules.
Contains the grammar in a string, along with a set of sorted indices telling where each rule can be found in the string. This struct is typically generated by a GrammarTask and used by several other tasks.
Patterns, if precalculated, are also available in this struct. Indices have been sorted according to a comparison object. This object is required for further access and is made available through a pointer. For instance, in a synchronous grammar we need to sort the rules according to an abstraction (i.e. all non-terminals are represented by the same capital letter).
Definition at line 42 of file data.grammar.hpp.
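The layout described above (one string holding the whole grammar, plus an array of sorted positions into it) can be illustrated with a minimal sketch. This is not the GrammarData implementation: `TinyGrammar`, its members, and the plain lexicographic ordering are assumptions made for illustration, whereas the real struct stores `posindex` entries and orders them through the `CompareTool` object.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for GrammarData: the grammar text plus sorted offsets.
struct TinyGrammar {
  std::string contents;             // whole grammar, one rule per line
  std::vector<std::size_t> starts;  // start offset of each rule, sorted by rule text

  explicit TinyGrammar(std::string text) : contents(std::move(text)) {
    // Record the start offset of every non-empty line.
    std::size_t pos = 0;
    while (pos < contents.size()) {
      std::size_t end = contents.find('\n', pos);
      if (end == std::string::npos) end = contents.size();
      if (end > pos) starts.push_back(pos);
      pos = end + 1;
    }
    // Sort the offsets, not the text. Lexicographic order is an assumption;
    // the real struct delegates the comparison to a CompareTool.
    std::sort(starts.begin(), starts.end(),
              [this](std::size_t a, std::size_t b) { return line(a) < line(b); });
  }

  // Text of the line that begins at the given offset.
  std::string line(std::size_t offset) const {
    std::size_t end = contents.find('\n', offset);
    if (end == std::string::npos) end = contents.size();
    return contents.substr(offset, end - offset);
  }

  std::size_t size() const { return starts.size(); }

  // Rule at sorted position idx.
  std::string getRule(std::size_t idx) const {
    assert(idx < starts.size());
    return line(starts[idx]);
  }
};
```

With the three-line input `"X b_c x 0.5\nS a y 1.0\nX a z 0.2\n"`, `getRule(0)` in this sketch yields the `S` rule, because only the offsets are reordered while the string itself stays untouched.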
ucam::hifst::GrammarData::GrammarData () | inline |
GrammarData constructor. Initializes GrammarData with empty information.
Definition at line 45 of file data.grammar.hpp.
ucam::hifst::GrammarData::~GrammarData () | inline |
Destructor.
Definition at line 52 of file data.grammar.hpp.
const std::size_t ucam::hifst::GrammarData::getIdx (std::size_t idx) const | inline |
Gets the real position (line) in the (potentially unsorted) file.
Definition at line 198 of file data.grammar.hpp.
const std::string ucam::hifst::GrammarData::getLHS (std::size_t idx) const | inline |
Gets left-hand-side of the rule indexed by idx.
Definition at line 90 of file data.grammar.hpp.
void ucam::hifst::GrammarData::getLinks (std::size_t idx, std::vector< unsigned > &links) const | inline |
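void ucam::hifst::GrammarData::getMappings (std::size_t idx, unordered_map< uint, uint > *mappings) const | inline |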
Returns the non-terminal mappings. For more details see getRuleMappings function.
idx | rule (sorted) identifier. |
mappings | On completion, non-terminal mappings from source to target will be stored here. |
Definition at line 207 of file data.grammar.hpp.
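As a rough illustration of what a source-to-target non-terminal mapping could look like, the sketch below pairs the ordinal of each non-terminal in the source side with the ordinal of the same non-terminal in the target side. This is only one plausible reading; the actual semantics are defined by getRuleMappings, which is not shown on this page. The helper names and the rule that an element starting with an uppercase letter is a non-terminal are assumptions.

```cpp
#include <cctype>
#include <string>
#include <unordered_map>
#include <vector>

// Assumption: an element is a non-terminal if it starts with an uppercase letter.
bool isNonTerminal(const std::string& element) {
  return !element.empty() &&
         std::isupper(static_cast<unsigned char>(element[0])) != 0;
}

// Hypothetical helper: maps the n-th non-terminal of the source side to the
// ordinal of the identically labelled non-terminal on the target side.
std::unordered_map<unsigned, unsigned> nonTerminalMappings(
    const std::vector<std::string>& source,
    const std::vector<std::string>& target) {
  // Ordinal of every non-terminal label as it appears in the target.
  std::unordered_map<std::string, unsigned> targetOrdinal;
  unsigned next = 0;
  for (const std::string& element : target) {
    if (isNonTerminal(element)) targetOrdinal[element] = next++;
  }

  std::unordered_map<unsigned, unsigned> mappings;
  unsigned sourceOrdinal = 0;
  for (const std::string& element : source) {
    if (!isNonTerminal(element)) continue;
    auto it = targetOrdinal.find(element);
    if (it != targetOrdinal.end()) mappings[sourceOrdinal] = it->second;
    ++sourceOrdinal;
  }
  return mappings;
}
```

For a source side `X1 a X2` and a target side `X2 b X1`, this sketch maps source non-terminal 0 to target non-terminal 1 and source non-terminal 1 to target non-terminal 0.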
const std::string ucam::hifst::GrammarData::getRHSSource (std::size_t idx) const | inline |
Gets right-hand-side source for a rule using rule index idx.
Definition at line 96 of file data.grammar.hpp.
const std::string ucam::hifst::GrammarData::getRHSSource (std::size_t idx, uint rulepos) const | inline |
Gets element at position rulepos from the right-hand-side source for a rule indexed by idx.
Definition at line 102 of file data.grammar.hpp.
const uint ucam::hifst::GrammarData::getRHSSourceSize (std::size_t idx) const | inline |
Gets number of elements in the RHS source.
Definition at line 123 of file data.grammar.hpp.
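const std::vector< std::string > ucam::hifst::GrammarData::getRHSSplitSource (std::size_t idx) const | inline |
Gets a split version of the RHS (source).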
Definition at line 115 of file data.grammar.hpp.
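const std::vector< std::string > ucam::hifst::GrammarData::getRHSSplitTranslation (std::size_t idx) const | inline |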
Returns the translation as a vector of elements.
Definition at line 136 of file data.grammar.hpp.
const std::string ucam::hifst::GrammarData::getRHSTranslation (std::size_t idx) const | inline |
Returns RHS translation part of a rule accessed by index idx.
Definition at line 129 of file data.grammar.hpp.
const uint ucam::hifst::GrammarData::getRHSTranslationSize (std::size_t idx) const | inline |
Returns the number of elements in translation for a given rule.
Definition at line 145 of file data.grammar.hpp.
const std::string ucam::hifst::GrammarData::getRule (std::size_t idx) const | inline |
Gets a rule indexed by idx. Rule format: LHS RHSSource RHSTarget weight.
Definition at line 83 of file data.grammar.hpp.
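The rule format above (LHS RHSSource RHSTarget weight) can be taken apart with a short sketch. `ParsedRule`, `parseRule` and `splitElements` are hypothetical names, not part of GrammarData. The sketch assumes the four fields are separated by whitespace, that the weight is a single number, and that the elements inside each right-hand side are joined by an underscore; none of these details is confirmed by this page.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the four fields of one rule.
struct ParsedRule {
  std::string lhs;
  std::string rhsSource;
  std::string rhsTarget;
  float weight = 0.0f;
};

// Splits "LHS RHSSource RHSTarget weight" on whitespace (assumed separator).
ParsedRule parseRule(const std::string& rule) {
  std::istringstream in(rule);
  ParsedRule parsed;
  if (!(in >> parsed.lhs >> parsed.rhsSource >> parsed.rhsTarget >> parsed.weight)) {
    throw std::invalid_argument("expected: LHS RHSSource RHSTarget weight");
  }
  return parsed;
}

// Splits one right-hand side into its elements; '_' is an assumed separator.
std::vector<std::string> splitElements(const std::string& rhs, char separator = '_') {
  std::vector<std::string> elements;
  std::istringstream in(rhs);
  std::string element;
  while (std::getline(in, element, separator)) elements.push_back(element);
  return elements;
}
```

Under these assumptions, `parseRule("X X1_a_X2 X2_b_X1 0.5")` gives the counterparts of getLHS, getRHSSource, getRHSTranslation and getWeight, and `splitElements` plays the role of getRHSSplitSource and getRHSSplitTranslation.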
const float ucam::hifst::GrammarData::getWeight (std::size_t idx) const | inline |
Returns weight of a rule accessed by index idx.
Definition at line 152 of file data.grammar.hpp.
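const bool ucam::hifst::GrammarData::isAcceptedByVocabulary (const std::size_t idx, const unordered_set< std::string > &vcb) const | inline |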
Determines whether a particular rule is allowed within a vocabulary, i.e. whether all target words of the rule exist within this vocabulary. NOTE: If the vocabulary is empty, it always returns true; that is, no vocabulary restriction is applied.
idx | Rule index. |
vcb | Vocabulary to check against. |
Definition at line 225 of file data.grammar.hpp.
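The vocabulary check described above can be sketched as follows. This is an illustration under stated assumptions, not the code in data.grammar.hpp: `acceptedByVocabulary` and `isNonTerminal` are hypothetical helpers, the target side is passed in already split into elements, and an element starting with an uppercase letter is treated as a non-terminal and therefore not looked up in the vocabulary.

```cpp
#include <cctype>
#include <string>
#include <unordered_set>
#include <vector>

// Assumption: an element is a non-terminal if it starts with an uppercase letter.
bool isNonTerminal(const std::string& element) {
  return !element.empty() &&
         std::isupper(static_cast<unsigned char>(element[0])) != 0;
}

// Returns true if every target word is in vcb, or if vcb is empty
// (an empty vocabulary means no restriction is applied).
bool acceptedByVocabulary(const std::vector<std::string>& targetElements,
                          const std::unordered_set<std::string>& vcb) {
  if (vcb.empty()) return true;
  for (const std::string& element : targetElements) {
    if (isNonTerminal(element)) continue;      // only words are checked
    if (vcb.count(element) == 0) return false; // unknown target word
  }
  return true;
}
```

The early return for an empty set mirrors the NOTE in the description: callers that do not want any filtering can pass a default-constructed vocabulary.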
const bool ucam::hifst::GrammarData::isPhrase (std::size_t idx) const | inline |
Checks whether the rule is a phrase rule (as opposed to a hierarchical rule).
Definition at line 189 of file data.grammar.hpp.
void ucam::hifst::GrammarData::reset () | inline |
Reset object.
Definition at line 72 of file data.grammar.hpp.
grammar_categories_t ucam::hifst::GrammarData::categories |
Ordered list of non-terminals (listed in hierarchical order according to identity rules)
Definition at line 68 of file data.grammar.hpp.
CompareTool* ucam::hifst::GrammarData::ct |
Pointer to a Comparison object, assumed no ownership.
Definition at line 65 of file data.grammar.hpp.
std::string ucam::hifst::GrammarData::filecontents |
The whole grammar.
Definition at line 57 of file data.grammar.hpp.
unordered_set<std::string> ucam::hifst::GrammarData::patterns |
Patterns in these rules.
Definition at line 63 of file data.grammar.hpp.
std::size_t ucam::hifst::GrammarData::sizeofvpos |
Number of rules.
Definition at line 61 of file data.grammar.hpp.
grammar_inversecategories_t ucam::hifst::GrammarData::vcat |
Definition at line 69 of file data.grammar.hpp.
posindex* ucam::hifst::GrammarData::vpos |
Sorted Indices.
Definition at line 59 of file data.grammar.hpp.