Cambridge SMT System
tokenizer.osr.hpp File Reference

Lowercasing, tokenization, and detokenization are not available in the open-source release.

Namespaces

 ucam
 
 ucam::util
 

Functions

void ucam::util::tokenize (const std::string &is, std::string *os, const std::string languagespecific="")
 Not implemented, just pass through.
 
void ucam::util::detokenize (const std::string &is, std::string *os, std::string languagespecific="")
 Not implemented, just pass through.
 
void ucam::util::addSentenceMarkers (std::string &sentence)
 Adds the sentence markers <s> and </s> to a sentence.
 
void ucam::util::deleteSentenceMarkers (std::string &sentence)
 Deletes the sentence markers 1/2 or <s>/</s> from a sentence.
 
void ucam::util::capitalizeFirstWord (std::vector< std::string > &words)
 Simple function that capitalizes the first word of a sentence.
 
void ucam::util::capitalizeFirstWord (std::string &words)
 Alternative implementation using a string as input/output.
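
The marker and capitalization helpers listed above could behave roughly as in the following sketch. It is not the library's code: the namespace sketch, the helpers startsWith and endsWith, the single-space layout around the markers, and the choice to skip a leading <s> when capitalizing are all assumptions made for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {  // hypothetical namespace, not ucam::util

// Hypothetical helpers for prefix and suffix checks.
inline bool startsWith(const std::string &s, const std::string &p) {
  return s.compare(0, p.size(), p) == 0;
}
inline bool endsWith(const std::string &s, const std::string &p) {
  return s.size() >= p.size() &&
         s.compare(s.size() - p.size(), p.size(), p) == 0;
}

// Wraps the sentence as "<s> ... </s>" (assumed single-space layout).
inline void addSentenceMarkers(std::string &sentence) {
  sentence = "<s> " + sentence + " </s>";
}

// Strips a leading "<s> " or "1 " and a trailing " </s>" or " 2".
inline void deleteSentenceMarkers(std::string &sentence) {
  if (startsWith(sentence, "<s> ")) sentence.erase(0, 4);
  else if (startsWith(sentence, "1 ")) sentence.erase(0, 2);
  if (endsWith(sentence, " </s>")) sentence.erase(sentence.size() - 5);
  else if (endsWith(sentence, " 2")) sentence.erase(sentence.size() - 2);
}

// Upper-cases one character; the unsigned char cast avoids undefined
// behaviour in std::toupper for negative char values.
inline char upper(char c) {
  return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Vector variant: capitalizes the first word, skipping a leading "<s>"
// token if present (assumption).
inline void capitalizeFirstWord(std::vector<std::string> &words) {
  std::size_t i = (!words.empty() && words[0] == "<s>") ? 1 : 0;
  if (i < words.size() && !words[i].empty()) words[i][0] = upper(words[i][0]);
}

// String variant: same idea on a space-separated sentence.
inline void capitalizeFirstWord(std::string &words) {
  std::size_t i = startsWith(words, "<s> ") ? 4 : 0;
  if (i < words.size()) words[i] = upper(words[i]);
}

}  // namespace sketch
```

Under these assumptions, calling sketch::addSentenceMarkers on "hello world" gives "<s> hello world </s>", sketch::capitalizeFirstWord then gives "<s> Hello world </s>", and sketch::deleteSentenceMarkers restores "Hello world". The numeric form "1 3 4 5 2" is reduced to "3 4 5".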
 

Detailed Description

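Lowercasing, tokenization, and detokenization are not available in the open-source release. In this file, tokenize and detokenize are not implemented and simply pass their input through.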
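
A minimal sketch of what a pass-through stub with these signatures might look like, assuming "pass through" means copying the input string to the output unchanged and ignoring the language argument. The namespace sketch is hypothetical and the bodies are stand-ins, not the code in tokenizer.osr.hpp.

```cpp
#include <cassert>
#include <string>

namespace sketch {  // hypothetical namespace, not ucam::util

// Pass-through stand-in: copies the input to the output unchanged.
// The language hint is accepted for interface compatibility but unused.
inline void tokenize(const std::string &is, std::string *os,
                     const std::string languagespecific = "") {
  (void)languagespecific;  // unused in a pass-through
  assert(os != nullptr);   // output is a raw pointer, so guard against null
  *os = is;
}

// Same pass-through behaviour for the reverse direction.
inline void detokenize(const std::string &is, std::string *os,
                       std::string languagespecific = "") {
  (void)languagespecific;
  assert(os != nullptr);
  *os = is;
}

}  // namespace sketch
```

With stubs like these, callers can keep a tokenize, process, detokenize pipeline in place and later substitute a real tokenizer without changing call sites.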

Date
8-8-2012
Author
Gonzalo Iglesias

Definition in file tokenizer.osr.hpp.