Cambridge SMT System
Lowercasing, tokenization, and detokenization are not available in the open-source release.

Namespaces
| ucam |
| ucam::util |
Functions
| void | ucam::util::tokenize (const std::string &is, std::string *os, const std::string languagespecific="") |
| Not implemented; passes the input through unchanged. |
| void | ucam::util::detokenize (const std::string &is, std::string *os, std::string languagespecific="") |
| Not implemented; passes the input through unchanged. |
| void | ucam::util::addSentenceMarkers (std::string &sentence) |
| Adds the sentence markers <s> and </s> to a sentence. |
| void | ucam::util::deleteSentenceMarkers (std::string &sentence) |
| Deletes the sentence markers (1/2 or <s>/</s>) from a sentence. |
| void | ucam::util::capitalizeFirstWord (std::vector< std::string > &words) |
| Simple function that capitalizes the first word of a sentence. |
| void | ucam::util::capitalizeFirstWord (std::string &words) |
| Alternative implementation that takes a string as input/output. |
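The marker and capitalization helpers listed above can be illustrated with a minimal sketch. This is not the code in tokenizer.osr.hpp: the `sketch` namespace, the assumption that markers are whitespace-separated tokens, and the choice to skip a leading `<s>` when capitalizing are all hypothetical and only mirror the one-line descriptions.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {  // hypothetical namespace, not ucam::util

// Wrap a sentence in <s> ... </s> (assumes space-separated markers).
inline void addSentenceMarkers(std::string &sentence) {
  sentence = sentence.empty() ? "<s> </s>" : "<s> " + sentence + " </s>";
}

// Drop a leading "<s>" or "1" and a trailing "</s>" or "2", if present.
// Side effect of this sketch: runs of whitespace collapse to single spaces.
inline void deleteSentenceMarkers(std::string &sentence) {
  std::istringstream in(sentence);
  std::vector<std::string> words;
  for (std::string w; in >> w;) words.push_back(w);
  if (!words.empty() && (words.front() == "<s>" || words.front() == "1"))
    words.erase(words.begin());
  if (!words.empty() && (words.back() == "</s>" || words.back() == "2"))
    words.pop_back();
  sentence.clear();
  for (std::size_t k = 0; k < words.size(); ++k) {
    if (k > 0) sentence += ' ';
    sentence += words[k];
  }
}

// Uppercase the first character of the first word that is not "<s>".
inline void capitalizeFirstWord(std::vector<std::string> &words) {
  for (std::string &w : words) {
    if (w.empty() || w == "<s>") continue;  // assumption: skip the marker
    w[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(w[0])));
    return;
  }
}

}  // namespace sketch
```

Under these assumptions, calling `sketch::addSentenceMarkers` followed by `sketch::deleteSentenceMarkers` on a single-spaced sentence returns the original string, and both marker conventions from the description (`<s>`/`</s>` and `1`/`2`) are stripped the same way.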
Definition in file tokenizer.osr.hpp.
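Because `tokenize` and `detokenize` are documented as pass-through stubs in this release, a caller can expect the output string to equal the input. The sketch below shows what such a stub could look like; the signatures are copied from the listing, while the `sketch` namespace and the function bodies are assumptions, not the contents of tokenizer.osr.hpp.

```cpp
#include <string>

namespace sketch {  // hypothetical namespace, not ucam::util

// Pass-through stand-in: copies the input to *os and ignores the language tag.
inline void tokenize(const std::string &is, std::string *os,
                     const std::string languagespecific = "") {
  (void)languagespecific;  // unused in a pass-through stub
  *os = is;
}

// Pass-through stand-in for detokenization, same behavior as above.
inline void detokenize(const std::string &is, std::string *os,
                       std::string languagespecific = "") {
  (void)languagespecific;  // unused in a pass-through stub
  *os = is;
}

}  // namespace sketch
```

Code written against stubs like these keeps compiling if a real tokenizer with the same signatures is swapped in later, but any downstream component that expects tokenized or lowercased text has to receive it already preprocessed.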