Cambridge SMT System
ucam::util Namespace Reference

Classes

class  FastForwardRead
 Convenience class that reads "quickly" until a queried line.
 
class  HashEqVec
 
class  HashFVec
 
class  InfiniteRange
 Implements a Range iterator that will never finish.
 
class  iszfstream
 Wrapper stream class that reads pipes, text files or gzipped files.
 
struct  MainClass
 
struct  NoThreadPool
 Trivial struct that can seamlessly replace the threadpool for single-threaded executions.
 
class  NumberRange
 
class  NumberRangeInterface
 Interface for an arbitrary range of numbers.
 
class  OneRange
 Implements a Range iterator that only runs once. This is useful e.g. for fsttools that process a batch of files if a range is specified, and only one file if not.
 
class  oszfstream
 Wrapper stream class that writes to pipes, text files or gzipped files.
 
struct  ParamsInit
 Initializes a set of parameters from environment variables PARAMS_FILE or PARAMS.
 
class  PatternAddress
 Class that expands a wildcard into its actual value. This is useful e.g. for filenames that range over several sentences.
 
class  PQwmapcompare
 Comparison functor for queue sorting.
 
class  RegistryPO
 
class  Runner
 Convenience wrapper class that can kick off two types of execution: single-threaded or multithreaded, triggered by program options. Possibly multithreading with 1 thread would do, but I keep both implementations as any plain bug that might arise will be easier to trace down with a normal execution (threadpool uses two, actually). The class is templated on two classes, one for single threading and another for multithreading. Note that the multithreading details are up to the second template class, e.g. Runner<SingleThreadedFunctor,SingleThreadedFunctor> would not multithread at all ;-).
 
class  Runner2
 Convenience wrapper class that can kick off two types of execution: single-threaded or multithreaded, triggered by program options. Possibly multithreading with 1 thread would do, but I keep both implementations as any plain bug that might arise will be easier to trace down with a serialized execution (threadpool uses two, actually). The class is templated on two classes, one for single threading and another for multithreading. Note that the multithreading details are up to the second template class.
 
class  Runner3
 Convenience wrapper class that can kick off three types of execution: single-threaded, multithreaded, or server, triggered by program options. Possibly multithreading with 1 thread would do, but I keep both implementations as any plain bug that might arise will be easier to trace down with a single-thread execution. The class is templated on three functors, one for each type of execution. Note that the details are up to each of these functors.
 
class  silent
 Provides methods to set and get silent logging mode.
 
class  TaskFunctor
 Simple functor that accepts an interface and a pointer to the data object on which it will have to run. The actual task execution is delayed until operator() is called. This is useful e.g. for task dispatching in the threadpool pattern. The functor deletes data and task as soon as the task is guaranteed to have been completely executed.
 
class  TaskInterface
 Templated (hybrid) Interface for Task classes.
 
class  TrivialThreadPool
 Trivial implementation of a threadpool based on boost::asio methods. When initiated, it creates a threadpool of n threads (n <= number of CPUs). Jobs should be submitted with the templated operator(). When the object is deleted, it waits for all threads to finish.
 
class  WordMapper
 Efficiently loads a wordmap file and provides methods to map word-to-integer or integer-to-word. To avoid memory footprint issues, hashing the wordmap entries is avoided.
 

Typedefs

typedef PatternAddress< uint > IntegerPatternAddress
 
typedef HashEqVec< std::basic_string< unsigned > > hasheqvecuint
 
typedef HashFVec< std::basic_string< unsigned > > hashfvecuint
 
typedef HashEqVec< std::vector< long long > > hasheqvecint64
 
typedef HashFVec< std::vector< long long > > hashfvecint64
 
typedef boost::scoped_ptr< NumberRangeInterface< unsigned > > IntRangePtr
 

Functions

void init_param_options (int argc, const char *argv[], po::variables_map *vm)
 Function to initialize boost program_options module with command-line and config file options. Note that both the config file and the command line options are parsed. This means that whatever the source of the parameter it is equally safe to use, i.e. the expected type (int, string, ...) as defined in the options should be guaranteed a priori. This function is typically used with RegistryPO class, which will contain all relevant variables to share across all task classes.
 
void initCommonApplylmOptions (po::options_description &desc)
 
void checkApplyLmOptions (po::variables_map *vm)
 
void initAllCreateSSGrammarOptions (po::options_description &desc)
 
void checkCreateSSGrammarOptions (po::variables_map *vm)
 
void initRules2WeightsOptions (po::options_description &desc, bool addAllOptions=true)
 
void checkRules2Weightptions (po::variables_map *vm)
 
template<typename T >
std::string toString (const T &x, uint pr=2)
 Converts an arbitrary type to a string. Handles integers, floats and doubles. Quits execution if the conversion fails.
 
template<typename T >
void toString (const T &x, std::string &ss, uint pr=2)
 Converts an arbitrary type to a string. Handles integers, floats and doubles. Quits execution if the conversion fails.
 
template<typename T >
T toNumber (const std::string &x)
 Converts a string to an arbitrary number type. Quits execution if the conversion fails.
 
bool exists (const std::string &source, const std::string &needle)
 Convenience function to find out whether a needle exists in a text.
 
void find_and_replace (std::string &haystack, const std::string &needle, const std::string &replace)
 
bool ends_with (std::string const &haystack, std::string const &needle)
 
uint count_needles (const std::string &haystack, const char needle, std::size_t start, std::size_t end)
 Convenience function that counts the number of times a needle appears.
 
void trim_spaces (const std::string &input, std::string *output)
 Trims spaces at the edges (no spaces) and also between words (only one space)
 
void trim_trailing_zeros (std::string &snumber)
 
bool validate_source_sentence (const std::string &s)
 Checks whether the sentence is in format ^\d+( \d+)*$.
 
std::string getTimestamp (void)
 Generates time stamp.
 
bool DirName (std::string &dirname, const std::string &filename)
 
bool fileExists (const std::string &fileName)
 
float dotproduct (std::vector< float > &v1, std::vector< float > &v2)
 Implements dot product.
 
template<typename T , template< typename ElemT, typename AllocT=std::allocator< ElemT > > class Container>
std::ostream & operator<< (std::ostream &o, Container< T > const &container)
 
template<typename T , template< typename ElemT, typename AllocT=std::allocator< ElemT > > class Container>
std::string printout (Container< T > const &container)
 
std::ostream & operator<< (std::ostream &o, std::basic_string< unsigned > const &container)
 
void initLogger (int argc, const char *argv[])
 Inits logger, parses param options checking for --logger.verbose.
 
std::string filteredHeader (const std::string &a)
 This function is meant to filter __PRETTY_FUNCTION__ and attempts to simplify it. With the templating it can get a little nasty to show in the logs.
 
template<typename T >
std::vector< T > ParseParamString (const std::string &stringparams, size_t pos=0)
 Function to parse string of parameters, e.g. separated by commas.
 
template<typename T >
void ParseParamString (const std::string &stringparams, std::vector< T > &params, size_t pos=0, size_t span=0)
 Version 2, passing output by reference...
 
void WriteParamFile (const std::string &filename, std::vector< float > params_)
 Write parameter vector to a file, with comma separators.
 
template<typename NumberType >
void getRange (const std::string &range, std::vector< NumberType > &x)
 Generates ranges from a compact string parameter such as 1,3:5,10.
 
template<typename NumberType >
NumberRangeInterface< NumberType > * RangeInitFactory (const RegistryPO &rg, const std::string &option=HifstConstants::kRangeInfinite)
 
void init_param_options (int argc, const char *argv[], bpo::variables_map *vm)
 
void initGlobalOptions (bpo::options_description &generic, std::string &configFile)
 class wrapping around the boost program_options variable with parsed values.
 
void parseOptionsGeneric (bpo::options_description &desc, bpo::variables_map *vm, int argc, const char *argv[])
 
iszfstream & getline (iszfstream &izs, std::string &line)
 
template<typename T >
iszfstream & operator>> (iszfstream &iszf, T &stff)
 Templated operator >> for streaming out of iszfstream.
 
template<typename FM >
void readtextfile (const std::string &filename, FM &fm)
 Function that reads from a file. Templated on any external class with a parse method.
 
template<typename FM >
void writetextfile (const std::string &filename, FM &fm)
 Function that writes to file. Templated on any external class with a toLine method.
 
void tokenize (const std::string &is, std::string *os, const std::string languagespecific="")
 Not implemented, just pass through.
 
void detokenize (const std::string &is, std::string *os, std::string languagespecific="")
 Not implemented, just pass through.
 
void addSentenceMarkers (std::string &sentence)
 Adds sentence markers <s>, </s> to a sentence.
 
void deleteSentenceMarkers (std::string &sentence)
 Deletes sentence markers 1/2 or <s>/</s> from a sentence.
 
void capitalizeFirstWord (std::vector< std::string > &words)
 Simple function that capitalizes the first word of a sentence.
 
void capitalizeFirstWord (std::string &words)
 Alternative implementation using a string as input/output.
 

Variables

bool user_check_ok
 
const bool detailed = false
 

Typedef Documentation

typedef HashEqVec<std::vector<long long> > ucam::util::hasheqvecint64

Definition at line 222 of file global_funcs.hpp.

typedef HashEqVec<std::basic_string<unsigned> > ucam::util::hasheqvecuint

Definition at line 219 of file global_funcs.hpp.

typedef HashFVec<std::vector<long long> > ucam::util::hashfvecint64

Definition at line 223 of file global_funcs.hpp.

typedef HashFVec<std::basic_string<unsigned> > ucam::util::hashfvecuint

Definition at line 220 of file global_funcs.hpp.

typedef PatternAddress<uint> ucam::util::IntegerPatternAddress

Definition at line 97 of file addresshandler.hpp.

typedef boost::scoped_ptr< NumberRangeInterface<unsigned> > ucam::util::IntRangePtr

Definition at line 214 of file range.hpp.

Function Documentation

void ucam::util::addSentenceMarkers ( std::string &  sentence)
inline

Adds sentence markers <s>, </s> to a sentence.

Parameters
    sentence    sentence to modify (append, prepend) with the sentence markers.

Definition at line 44 of file tokenizer.osr.hpp.

void ucam::util::capitalizeFirstWord ( std::vector< std::string > &  words)
inline

Simple function that capitalizes the first word of a sentence.

Definition at line 74 of file tokenizer.osr.hpp.

void ucam::util::capitalizeFirstWord ( std::string &  words)
inline

Alternative implementation using a string as input/output.

Definition at line 88 of file tokenizer.osr.hpp.

void ucam::util::checkApplyLmOptions ( po::variables_map *  vm)
inline

Definition at line 45 of file main.applylm.init_param_options_common.hpp.

void ucam::util::checkCreateSSGrammarOptions ( po::variables_map *  vm)
inline

void ucam::util::checkRules2Weightptions ( po::variables_map *  vm)
inline

uint ucam::util::count_needles ( const std::string &  haystack,
const char  needle,
std::size_t  start,
std::size_t  end 
)
inline

Convenience function that counts the number of times a needle appears.

Definition at line 107 of file global_funcs.hpp.
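
The brief description does not say how start and end bound the search. As a minimal sketch, assuming they delimit a half-open character range [start, end) that is clamped to the string length, the counting could look like this. The name count_needles_sketch is hypothetical and is not part of ucam::util.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical stand-in for count_needles.
// Assumption: [start, end) is a half-open range of character positions.
inline unsigned count_needles_sketch(const std::string& haystack, char needle,
                                     std::size_t start, std::size_t end) {
  end = std::min(end, haystack.size());  // clamp to the string length
  if (start >= end) return 0;            // empty or inverted range
  const auto first = haystack.begin() + static_cast<std::ptrdiff_t>(start);
  const auto last = haystack.begin() + static_cast<std::ptrdiff_t>(end);
  return static_cast<unsigned>(std::count(first, last, needle));
}
```

With this reading, counting '_' in "a_b_c_d" over positions 0 to 7 gives 3, and over positions 2 to 4 gives 1.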

void ucam::util::deleteSentenceMarkers ( std::string &  sentence)
inline

Deletes sentence markers 1/2 or <s>/</s> from a sentence.

Parameters
    sentence    sentence from which to delete markers

Definition at line 62 of file tokenizer.osr.hpp.

void ucam::util::detokenize ( const std::string &  is,
std::string *  os,
std::string  languagespecific = "" 
)
inline

Not implemented, just pass through.

Definition at line 35 of file tokenizer.osr.hpp.

bool ucam::util::DirName ( std::string &  dirname,
const std::string &  filename 
)
inline

Retrieves directory name for a filename path.

Definition at line 162 of file global_funcs.hpp.

float ucam::util::dotproduct ( std::vector< float > &  v1,
std::vector< float > &  v2 
)
inline

Implements dot product.

Definition at line 183 of file global_funcs.hpp.
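
For reference, a dot product over two float vectors can be sketched with std::inner_product. This is an illustration rather than the library's implementation: the name dot_product_sketch is hypothetical, and throwing on a length mismatch is an assumption, since the documentation does not say how mismatched sizes are handled.

```cpp
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for dotproduct: sum of element-wise products.
// Assumption: vectors of different lengths are treated as an error.
inline float dot_product_sketch(const std::vector<float>& v1,
                                const std::vector<float>& v2) {
  if (v1.size() != v2.size()) {
    throw std::invalid_argument("vectors differ in length");
  }
  return std::inner_product(v1.begin(), v1.end(), v2.begin(), 0.0f);
}
```

For {1, 2, 3} and {4, 5, 6} this yields 1*4 + 2*5 + 3*6 = 32.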

bool ucam::util::ends_with ( std::string const &  haystack,
std::string const &  needle 
)
inline

Convenience function that determines whether a string ends with another string.

Definition at line 98 of file global_funcs.hpp.
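
A suffix test of this kind is commonly written with std::string::compare. The sketch below shows the idea under a hypothetical name (ends_with_sketch). It is not copied from global_funcs.hpp.

```cpp
#include <string>

// Hypothetical stand-in for ends_with: true when haystack ends with needle.
inline bool ends_with_sketch(const std::string& haystack,
                             const std::string& needle) {
  if (needle.size() > haystack.size()) return false;
  // Compare the last needle.size() characters of haystack against needle.
  return haystack.compare(haystack.size() - needle.size(), needle.size(),
                          needle) == 0;
}
```

A typical use is deciding by file extension, e.g. whether "model.gz" ends with ".gz".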

bool ucam::util::exists ( const std::string &  source,
const std::string &  needle 
)
inline

Convenience function to find out whether a needle exists in a text.

Definition at line 82 of file global_funcs.hpp.

bool ucam::util::fileExists ( const std::string &  fileName)
inline

Checks whether a file exists or not.

Definition at line 171 of file global_funcs.hpp.

std::string ucam::util::filteredHeader ( const std::string &  a)
inline

This function is meant to filter __PRETTY_FUNCTION__ and attempts to simplify it. With the templating it can get a little nasty to show in the logs.

Definition at line 50 of file logger.hpp.

void ucam::util::find_and_replace ( std::string &  haystack,
const std::string &  needle,
const std::string &  replace 
)
inline

Convenience function to find a needle and replace it.

Definition at line 88 of file global_funcs.hpp.
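
The description does not state whether one or all occurrences are replaced. The sketch below assumes all occurrences are replaced, and it resumes searching after each replacement so that a replacement containing the needle cannot cause an endless loop. The name replace_all_sketch is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for find_and_replace.
// Assumption: every occurrence of needle is replaced, scanning left to right.
inline void replace_all_sketch(std::string& haystack, const std::string& needle,
                               const std::string& replacement) {
  if (needle.empty()) return;  // nothing sensible to search for
  std::size_t pos = 0;
  while ((pos = haystack.find(needle, pos)) != std::string::npos) {
    haystack.replace(pos, needle.size(), replacement);
    pos += replacement.size();  // skip the text that was just inserted
  }
}
```

This is the kind of operation a wildcard expansion relies on, e.g. turning "lats/?.fst" into "lats/12.fst" for sentence 12.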

iszfstream& ucam::util::getline ( iszfstream &  izs,
std::string &  line 
)
inline

Definition at line 178 of file szfstream.hpp.

template<typename NumberType >
void ucam::util::getRange ( const std::string &  range,
std::vector< NumberType > &  x 
)
inline

Generates ranges from a compact string parameter such as 1,3:5,10.

Parameters
    range    A key string such as 1,3:5,10, describing a range of values.
    x        A vector containing explicitly the integers we care about

Definition at line 40 of file range.hpp.
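
To make the compact format concrete, here is a minimal sketch of the expansion. It assumes that commas separate fields and that a field of the form a:b stands for every integer from a to b inclusive, so 1,3:5,10 expands to 1, 3, 4, 5, 10. The name expand_range_sketch is hypothetical, and any further syntax the real getRange may accept is not covered.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for getRange.
// Assumption: "a:b" means all integers from a to b inclusive, step 1.
template <typename NumberType>
void expand_range_sketch(const std::string& range,
                         std::vector<NumberType>& x) {
  std::istringstream fields(range);
  std::string field;
  while (std::getline(fields, field, ',')) {
    if (field.empty()) continue;  // tolerate stray commas
    const std::size_t colon = field.find(':');
    if (colon == std::string::npos) {
      // Single value, e.g. "10".
      x.push_back(static_cast<NumberType>(std::stoll(field)));
    } else {
      // Interval, e.g. "3:5".
      const long long first = std::stoll(field.substr(0, colon));
      const long long last = std::stoll(field.substr(colon + 1));
      for (long long v = first; v <= last; ++v) {
        x.push_back(static_cast<NumberType>(v));
      }
    }
  }
}
```

The output vector is appended to rather than cleared, which keeps the sketch short. Whether the original clears x first is not stated here.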

std::string ucam::util::getTimestamp ( void  )
inline

Generates time stamp.

Definition at line 149 of file global_funcs.hpp.

void ucam::util::init_param_options ( int  argc,
const char *  argv[],
po::variables_map *  vm 
)
inline

Function to initialize boost program_options module with command-line and config file options. Note that both the config file and the command line options are parsed. This means that whatever the source of the parameter it is equally safe to use, i.e. the expected type (int, string, ...) as defined in the options should be guaranteed a priori. This function is typically used with RegistryPO class, which will contain all relevant variables to share across all task classes.

Parameters
    argc    number of command-line options, as generated for the main function
    argv    standard command-line options, as generated for the main function
    vm      boost variable containing all parsed options.
Returns
void

Definition at line 40 of file main.applylm.init_param_options.hpp.

void ucam::util::init_param_options ( int  argc,
const char *  argv[],
bpo::variables_map *  vm 
)
inline

void ucam::util::initAllCreateSSGrammarOptions ( po::options_description &  desc)
inline

void ucam::util::initCommonApplylmOptions ( po::options_description &  desc)
inline

Definition at line 26 of file main.applylm.init_param_options_common.hpp.

void ucam::util::initGlobalOptions ( bpo::options_description &  generic,
std::string &  configFile 
)
inline

class wrapping around the boost program_options variable with parsed values.

Definition at line 47 of file registrypo.hpp.

void ucam::util::initLogger ( int  argc,
const char *  argv[] 
)
inline

Inits logger, parses param options checking for --logger.verbose.

Parameters
    argc    Standard main argc parameter
    argv    Standard main argv parameter

Definition at line 55 of file logger.boost_log.hpp.

void ucam::util::initRules2WeightsOptions ( po::options_description &  desc,
bool  addAllOptions = true 
)
inline

template<typename T , template< typename ElemT, typename AllocT=std::allocator< ElemT > > class Container>
std::ostream& ucam::util::operator<< ( std::ostream &  o,
Container< T > const &  container 
)
inline

Definition at line 229 of file global_funcs.hpp.

std::ostream& ucam::util::operator<< ( std::ostream &  o,
std::basic_string< unsigned > const &  container 
)
inline

Definition at line 247 of file global_funcs.hpp.

template<typename T >
iszfstream& ucam::util::operator>> ( iszfstream &  iszf,
T &  stff 
)
inline

Templated operator >> for streaming out of iszfstream.

Definition at line 190 of file szfstream.hpp.

void ucam::util::parseOptionsGeneric ( bpo::options_description &  desc,
bpo::variables_map *  vm,
int  argc,
const char *  argv[] 
)
inline

Definition at line 58 of file registrypo.hpp.

template<typename T >
std::vector<T> ucam::util::ParseParamString ( const std::string &  stringparams,
size_t  pos = 0 
)
inline

Function to parse string of parameters, e.g. separated by commas.

Definition at line 51 of file params.hpp.
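
As an illustration of parsing a comma-separated parameter string, the sketch below splits on commas and converts each field with a string stream. Two points are assumptions: pos is read as the character offset at which parsing starts, and a field that cannot be converted raises an exception. The name parse_params_sketch is hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for ParseParamString.
// Assumption: pos is the character offset at which parsing starts.
template <typename T>
std::vector<T> parse_params_sketch(const std::string& stringparams,
                                   std::size_t pos = 0) {
  std::vector<T> params;
  if (pos >= stringparams.size()) return params;
  std::istringstream fields(stringparams.substr(pos));
  std::string field;
  while (std::getline(fields, field, ',')) {
    std::istringstream converter(field);
    T value;
    if (!(converter >> value)) {
      throw std::invalid_argument("cannot parse parameter '" + field + "'");
    }
    params.push_back(value);
  }
  return params;
}
```

For example, parse_params_sketch<float>("0.5,1,-2.25") returns the three values 0.5, 1 and -2.25.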

template<typename T >
void ucam::util::ParseParamString ( const std::string &  stringparams,
std::vector< T > &  params,
size_t  pos = 0,
size_t  span = 0 
)
inline

Version 2, passing output by reference...

Definition at line 77 of file params.hpp.

template<typename T , template< typename ElemT, typename AllocT=std::allocator< ElemT > > class Container>
std::string ucam::util::printout ( Container< T > const &  container)
inline

Definition at line 241 of file global_funcs.hpp.

template<typename NumberType >
NumberRangeInterface<NumberType>* ucam::util::RangeInitFactory ( const RegistryPO &  rg,
const std::string &  option = HifstConstants::kRangeInfinite 
)

Definition at line 201 of file range.hpp.

template<typename FM >
void ucam::util::readtextfile ( const std::string &  filename,
FM &  fm 
)
inline

Function that reads from a file. Templated on any external class with a parse method.

Definition at line 359 of file szfstream.hpp.

void ucam::util::tokenize ( const std::string &  is,
std::string *  os,
const std::string  languagespecific = "" 
)
inline

Not implemented, just pass through.

Definition at line 29 of file tokenizer.osr.hpp.

template<typename T >
T ucam::util::toNumber ( const std::string &  x)
inline

Converts a string to an arbitrary number type. Quits execution if the conversion fails.

Parameters
    x    Templated variable to be converted.

Definition at line 72 of file global_funcs.hpp.
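
A string-to-number conversion with failure detection can be sketched with std::istringstream. The original is documented to quit execution on failure; this sketch throws std::invalid_argument instead so that the failure can be observed by the caller, which is a deliberate substitution. The name to_number_sketch is hypothetical.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for toNumber.
// Assumption: trailing non-whitespace characters count as a failed conversion.
template <typename T>
T to_number_sketch(const std::string& x) {
  std::istringstream is(x);
  T value;
  char extra;
  // Fail if no number could be read, or if anything but whitespace follows it.
  if (!(is >> value) || (is >> extra)) {
    throw std::invalid_argument("cannot convert '" + x + "' to a number");
  }
  return value;
}
```

With this sketch, to_number_sketch<int>("42") returns 42, while "4x" is rejected.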

template<typename T >
std::string ucam::util::toString ( const T &  x,
uint  pr = 2 
)
inline

Converts an arbitrary type to a string. Handles integers, floats and doubles. Quits execution if the conversion fails.

Parameters
    x     Templated variable to be converted.
    pr    Precision (only useful for float/double)

Definition at line 38 of file global_funcs.hpp.
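
The role of the precision parameter can be illustrated with a stream-based conversion. The sketch assumes fixed-point notation, so pr is the number of digits after the decimal point; the actual formatting used by toString may differ. It also throws on failure rather than quitting execution, and the name to_string_sketch is hypothetical.

```cpp
#include <iomanip>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for toString.
// Assumption: pr is the number of decimals in fixed-point notation.
template <typename T>
std::string to_string_sketch(const T& x, unsigned pr = 2) {
  std::ostringstream os;
  os << std::fixed << std::setprecision(static_cast<int>(pr)) << x;
  if (os.fail()) throw std::runtime_error("conversion to string failed");
  return os.str();
}
```

Under these assumptions 3.14159 becomes "3.14" with the default precision, and integers are unaffected by pr.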

template<typename T >
void ucam::util::toString ( const T &  x,
std::string &  ss,
uint  pr = 2 
)
inline

Converts an arbitrary type to a string. Handles integers, floats and doubles. Quits execution if the conversion fails.

Parameters
    x     Templated variable to be converted.
    ss
    pr    Precision (only useful for float/double)

Definition at line 56 of file global_funcs.hpp.

void ucam::util::trim_spaces ( const std::string &  input,
std::string *  output 
)
inline

Trims spaces at the edges (no spaces) and also between words (only one space)

Definition at line 117 of file global_funcs.hpp.
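
One compact way to obtain this normalization is to re-read the input word by word and join the words with single spaces. The sketch below does that under the hypothetical name trim_spaces_sketch. Note that it treats tabs and newlines as separators too, which may go beyond what the original does.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for trim_spaces: no spaces at the edges,
// exactly one space between words.
// Assumption: any whitespace (not only ' ') separates words.
inline void trim_spaces_sketch(const std::string& input, std::string* output) {
  std::istringstream words(input);
  std::string word;
  output->clear();
  while (words >> word) {
    if (!output->empty()) output->push_back(' ');
    output->append(word);
  }
}
```

Applied to "  a   b  c ", the sketch produces "a b c".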

void ucam::util::trim_trailing_zeros ( std::string &  snumber)
inline

Trims decimal zeros at the end of a floating-point number represented as a string. Use sparingly; in general, prefer float variables. ;-)

Definition at line 128 of file global_funcs.hpp.
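
A possible reading of this behaviour is sketched below: strings without a decimal point are left alone, trailing zeros after the point are dropped, and a dangling point is removed as well. The last step is an assumption, as is the hypothetical name trim_trailing_zeros_sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for trim_trailing_zeros.
// Assumption: a decimal point left at the end is removed too ("2.000" -> "2").
inline void trim_trailing_zeros_sketch(std::string& snumber) {
  if (snumber.find('.') == std::string::npos) return;  // no decimal part
  // The '.' guarantees that a non-zero character exists.
  const std::size_t last = snumber.find_last_not_of('0');
  snumber.erase(last + 1);
  if (!snumber.empty() && snumber.back() == '.') snumber.pop_back();
}
```

Under these assumptions "3.1400" becomes "3.14", "2.000" becomes "2", and "100" is unchanged.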

bool ucam::util::validate_source_sentence ( const std::string &  s)
inline

Checks whether the sentence is in format ^\d+( \d+)*$.

Parameters
    s    string to validate
Returns
true if validated, false otherwise

Definition at line 143 of file global_funcs.hpp.
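
The documented format ^\d+( \d+)*$ describes one or more integer ids separated by single spaces. A minimal sketch that checks exactly this pattern with std::regex follows. The name looks_like_source_sentence is hypothetical, and the original may implement the check without a regex library.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for validate_source_sentence.
// Accepts strings such as "1 34 5 2"; rejects empty strings, double spaces,
// leading/trailing spaces and non-digit tokens.
inline bool looks_like_source_sentence(const std::string& s) {
  static const std::regex pattern(R"(^\d+( \d+)*$)");
  return std::regex_match(s, pattern);
}
```

For instance, "1 34 5 2" passes, while "1  2" (two spaces) and "1 a 2" do not.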

void ucam::util::WriteParamFile ( const std::string &  filename,
std::vector< float >  params_ 
)
inline

Write parameter vector to a file, with comma separators.

Definition at line 111 of file params.hpp.

template<typename FM >
void ucam::util::writetextfile ( const std::string &  filename,
FM &  fm 
)
inline

Function that writes to file. Templated on any external class with a toLine method.

Definition at line 370 of file szfstream.hpp.

Variable Documentation

const bool ucam::util::detailed = false

Definition at line 27 of file main.countstrings.hpp.

bool ucam::util::user_check_ok

Definition at line 26 of file main.countstrings.hpp.