Cambridge SMT System
File List
Here is a list of all files with brief descriptions:
 address.gtest.cpp  Unit testing: integerpatternaddress
 addresshandler.hpp  Handles simple wildcard expansion for strings
 AlignmentCountMapWritable.java
 alilats2splats.main.cpp  Alignment lattices to sparse vector weight lattices
 applylm.main.cpp  Main file for applylm tool
 backtrace.hpp  Simple example of backtrace(), backtrace_symbols(), and __cxa_demangle(), adapted from http://charette.no-ip.com:81/programming/2010-01-25_Backtrace/ (remember to pass the -rdynamic flag to GCC)
 bleu.hpp
 BleuStats.cpp
 BleuStats.h
 CLI.java
 common-helpers.hpp
 CommonFlags.h
 constants-fsttools.hpp
 constants-hifst.hpp
 constants-lmert.hpp
 CopyRecordsForTesting.java
 countstrings.main.cpp
 createssgrammar.main.cpp  Hifst main entry file
 custom_assert.hpp  Provides smarter assert methods
 data-main.alilats2splats.hpp  Data object for alilats to sparse weight lats binary
 data-main.applylm.hpp  Data object for applylm tool
 data-main.createssgrammar.hpp  Data object for hifst or related tools
 data-main.disambig.hpp  Data object for disambig tool
 data-main.hifst-client.hpp  Data object for hifst-client
 data-main.hifst.hpp  Data object for hifst or related tools
 data-main.lmbr.hpp
 data-main.rules2weights.hpp  Data object for alilats to sparse weight lats binary
 data.cykparser.cykbackpointers.hpp  Contains functor that provides access to cyk backpointers
 data.cykparser.cykgrid.hpp  Contains functor for the cyk grid
 data.cykparser.hpp  Contains structures and classes for GrammarData
 data.grammar.comparetool.hpp  Contains structures and classes for GrammarData
 data.grammar.hpp  Contains structures and classes for GrammarData
 data.grammar.utilities.hpp  Contains structures and classes for GrammarData
 data.lm.hpp  Implementation of a language model data structure using kenlm
 data.lmbr.hpp
 data.ssgrammar.hpp  Contains sentence-specific grammar data
 data.stats.hpp  Related to Stats across the pipeline
 DebugMert.cpp
 DebugMert.h
 defs.cykparser.hpp  Contains definitions for cykparser data and task
 defs.grammar.hpp  Contains definitions for cykparser data and task
 defs.ssgrammar.hpp  Contains definitions for sentence-specific grammar data and task
 disambig.main.cpp  Main file for disambig tool
 disambignffst.main.cpp
 EnumRuleType.java
 ErrorSurface.cpp
 ErrorSurface.h
 ExtractedData.java
 ExtractorDataLoader.java
 ExtractorJob.java
 fast-shortest-distance.h
 Feature.java
 FeatureFunctionRegistry.java
 FeatureFunctions.java
 FeatureMap.java
 FeatureRegistry.java
 fstio.gtest.cpp  Unit testing: Input/Output fst operations with compression
 fstio.hpp  Contains convenience functions to write and read fsts
 fstutils.applylmonthefly.hpp  Contains implementation of ApplyLanguageModelOnTheFly
 fstutils.extractngrams.hpp
 fstutils.ftcompose.hpp  Implementation of different types of composition (i.e. failure transitions)
 fstutils.gtest.cpp  Unit testing: String printing from lattice, multiepsilon composition, composition with failure transitions, generic weight mappers, multiple union of fsts, etc
 fstutils.hpp  Utilities to extract vocabulary, pseudo-determinize lattices and build substring transducers
 fstutils.mapper.hpp  Generalized weight mapper functor
 fstutils.multiepsiloncompose.hpp  Multiepsilon composition
 fstutils.multiunion.hpp  Implementations of multiple fst unions
 fstutils.topofeatures.hpp  Support for Topological Features. See Iglesias et al. 2015
 function-weight.cpp
 function-weight.h
 global_decls.hpp  General typedefs, defines, etc.
 global_funcs.hpp  General functions
 global_incls.hpp  All included standard headers, boost headers, etc.
 globalfunctions.gtest.cpp  Unit testing: General functions
 googletesting.h  Unit testing: google testing common header
 HFilePrint.java
 HFileRuleQuery.java
 HFileRuleReader.java
 hifst-client.main.cpp  Hifst client main entry file
 hifst.main.cpp  Hifst main entry file
 hifst.task.cykparser.gtest.cpp  Unit testing: cyk parser
 hifst.task.grammar.gtest.cpp  Unit testing: grammar task testing
 hifst.task.hifst.gtest.cpp  Unit testing: hifst lattice-building
 hifst.task.patternstoinstances.gtest.cpp  Unit testing: converting grammar-specific patterns into instance patterns
 hifst.task.postpro.gtest.cpp  Unit testing: Postprocessing 1-best translation
 hifst.task.prepro.gtest.cpp  Unit testing: Preprocess source test
 hifst.task.referencefilter.gtest.cpp  Unit testing: Reference filter task
 hifst.task.ssgrammar.gtest.cpp  Unit testing: sentence-specific grammar task
 hifst.task.stats.gtest.cpp  Unit testing: Stats task testing
 hifst_enumerate_vocab.hpp  Extend EnumerateVocab to access kenlm ids
 idbridge.hpp  Maps between grammar target ids and lm ids
 IntervalData.cpp
 IntervalData.h
 IntWritableCache.java
 kenlmdetect.hpp
 latmert.main.cpp
 LatMertMain.h
 LexicalProbability.java
 lexicographic-tropical-tropical-decls.h  Lexicographic stdarc registering
 lexicographic-tropical-tropical-funcs.h  Convenience functors/functions for lexicographic<tropical,tropical> semiring
 lexicographic-tropical-tropical-incls.h  Headers for standalone shared library
 lexmap.main.cpp  Main file for applylm tool
 lineoptimize.hpp
 lmbr.gtest.cpp  Unit testing: String printing from lattice, multiepsilon composition, composition with failure transitions, generic weight mappers, multiple union of fsts, etc
 lmbr.main.cpp  Alignment lattices to sparse vector weight lattices
 LMert.cpp
 LMert.h
 lmert.hpp
 lmert.main.cpp
 logger.boost_log.hpp  Logger implementation – init method and macros around actual Boost Logger
 logger.hpp
 logger.openfstglog.hpp  Logger implementation – init method and macros around actual OpenFST logger – requires including openfst libraries first
 main-run.alilats2splats.hpp  Implements single-threaded version of alilats2splats tool
 main-run.applylm.hpp  Core implementation of applylm binary. Kicks off either single-threaded or multithreaded language model application
 main-run.createssgrammar.hpp  Contains createssgrammar core implementation, single-threaded or multithreaded
 main-run.hifst-client.hpp  Contains hifst client core implementation, sends request to hifst in server mode
 main-run.hifst.hpp  Contains hifst core: runs single-threaded, multithreaded or as a server
 main-run.lmbr.hpp  Implements single-threaded version of alilats2splats tool
 main-run.rules2weights.hpp  Implements lats2splats tool
 main.alilats2splats.hpp  All included headers for the binary should be defined here. This file should be included only once
 main.alilats2splats.init_param_options.hpp  To initialize boost parameter options
 main.applylm.hpp  All included headers for the binary should be defined here. This file should be included only once
 main.applylm.init_param_options.hpp  To initialize boost parameter options
 main.applylm.init_param_options_common.hpp  To initialize boost parameter options
 main.countstrings.hpp  All included headers for the binary should be defined here. This file should be included only once
 main.countstrings.init_param_options.hpp  To initialize boost parameter options
 main.createssgrammar.hpp  All included headers for the binary (createssgrammar) should be defined here. This file should be included only once
 main.createssgrammar.init_param_options.hpp  To initialize boost parameter options for createssgrammar tool
 main.createssgrammar.init_param_options_common.hpp  To initialize boost parameter options for createssgrammar tool
 main.custom_assert.hpp  Static variable for custom_assert. Include only once from main file
 main.disambig.hpp  All included headers for the binary (disambig) should be defined here. This file should be included only once
 main.disambig.init_param_options.hpp  To initialize boost parameter options
 main.disambignffst.hpp  All included headers for the binary should be defined here. This file should be included only once
 main.disambignffst.init_param_options.hpp  To initialize boost parameter options
 main.gtest.hifst.cpp  Unit testing: main file
 main.hifst-client.hpp  All included headers for the binary (hifst-client) should be defined here. This file should be included only once
 main.hifst-client.init_param_options.hpp  To initialize boost parameter options for hifst-client
 main.hifst.hpp  All included headers for the binary should be defined here. This file should be included only once
 main.hifst.init_param_options.hpp  To initialize boost parameter options for hifst tool
 main.hpp
 main.lexmap.hpp
 main.lexmap.init_param_options.hpp  To initialize boost parameter options
 main.lmbr.hpp  All included headers for the binary should be defined here. This file should be included only once
 main.lmbr.init_param_options.hpp  To initialize boost parameter options
 main.lmert.hpp
 main.lmert.init_param_options.hpp
 main.logger.hpp  Static variables for logger. Include only once from main file
 main.printstrings.hpp
 main.printstrings.init_param_options.hpp  To initialize boost parameter options
 main.rules2weights.hpp  All included headers for the binary should be defined here. This file should be included only once
 main.rules2weights.init_param_options.hpp  To initialize boost parameter options
 main.rules2weights.init_param_options_common.hpp  To initialize boost parameter options
 main.samplehyps.hpp
 main.samplehyps.init_param_options.hpp
 main.tunewp.hpp  All included headers for the binary should be defined here. This file should be included only once
 main.tunewp.init_param_options.hpp  To initialize boost parameter options
 mapping-shortest-path.h
 MarginalReducer.java
 MergeComparator.java
 MergeJob.java
 MergePartitioner.java
 MertCommon.cpp
 MertCommon.h
 MertHashVec.cpp
 MertHashVec.h
 MertLib.h
 MertPrune.cpp
 MertPrune.h
 multithreading.gtest.cpp  Unit testing: multithreading with a threadpool
 multithreading.helpers.hpp
 multithreading.hpp  Implements trivial threadpool using boost::asio library
 openfst.h  Unit testing: google testing common header
 openfstversion.hpp
 Optimize.cpp
 Optimize.h
 Pair.java
 PairWritable.java
 params.h
 params.hpp  Convenience functions to parse parameters from a string
 ParamsConfig.cpp
 ParamsConfig.h
 PatternInstanceCreator.java
 PhraseJob.java
 printstrings.main.cpp  Main file for printstrings tool
 ProvenanceCountMap.java
 ProvenanceProbMap.java
 randomlinesearch.hpp
 range.gtest.cpp  Unit testing: getRange, InfiniteRange, IntegerRange, OneRange objects
 range.hpp  Handles different types of integer ranges
 RefsData.cpp
 RefsData.h
 registrypo.gtest.cpp  Unit testing: registrypo class and related functions
 registrypo.hpp  Contains wrapper class RegistryPO, which uses boost::program_options to parse parameters, and provides methods to access them
 RuleData.java
 RuleExtractorTest.java
 RuleFilter.java
 RulePattern.java
 RuleRetriever.java
 rules2weights.main.cpp  Alignment lattices to sparse vector weight lattices
 samplehyps.main.cpp
 Score.cpp
 Score.h
 score.main.cpp
 SequenceFilePrint.java
 SidePattern.java
 SimpleHFileOutputFormat.java
 Source2TargetJob.java
 szfstream.gtest.cpp  Unit testing: szfstream class for [file] operations
 szfstream.hpp  Stream wrapper for pipe/text/compressed files
 Target2SourceJob.java
 TargetFeatureList.java
 task.applylm.hpp  Implementation of a language model application task
 task.applylm.kenlmtype.hpp  Wrapper to ApplyLanguageModelOnTheFly to apply different kenlm models
 task.cykparser.hpp  Contains cyk parser implementation
 task.disambig.flowerfst.hpp  Utilities for DisambigTask and related tasks
 task.disambig.hpp  Implementation of class DisambigTask
 task.dumpnbestfeatures.hpp  Contains task that dumps nbest and feature file
 task.grammar.hpp  Describes class GrammarTask
 task.grammar.nonterminalhierarchy.hpp  This class automatically decides the hierarchy of non-terminals
 task.hifst-stats.hpp  Task that dumps statistics related specifically to the tool hifst
 task.hifst.expandednumstates.hpp  Contains utility class to predict number of states of an RTN after expanding to equivalent FSA
 task.hifst.hpp  Contains structures and classes for hifst task (target lattice building)
 task.hifst.localpruningconditions.hpp  Contains functor and struct to handle local pruning conditions
 task.hifst.makeweights.hpp
 task.hifst.optimize.hpp  Contains Function objects that optimize a machine
 task.hifst.replacefstbyarc.hpp  Contains Function objects that determine whether an FST is replaceable or not by an arc pointer
 task.hifst.rtn.hpp  Implements RTN class. Stores pointers to cell FSAs of the RTN using a hiero-index representing cell coordinates (cc,x,y)
 task.lmbr.applyposteriors.hpp  Based on Graeme Blackwood's PhD work and original code – implementation of posterior application to a hypothesis space
 task.lmbr.common.hpp  Common lmbr functions
 task.lmbr.computeposteriors.hpp  Based on Graeme Blackwood's PhD work and original code – implementation of posterior computation from evidence space
 task.lmbr.hpp  Lattice MBR task – integrates lattice mbr as a task that can be used standalone (implemented) or included e.g. as another task in hifst
 task.loadlm.hpp  Implementation of a language model task
 task.loadsparseweightflowerfst.hpp  Implements a class that loads the grammar sparseweight flower lattice and stores a pointer on the data object
 task.loadunimap.hpp  Implementation of a unigram transduction model loader task
 task.loadwordmap.hpp  Wrapper around WordMapper loader
 task.optimizefst.hpp  Implementation of an Fst writer taking the fst from data object
 task.patternstoinstances.hpp  Contains patterns to instance-patterns implementation
 task.postpro.hpp  Task that writes translation to a text file. This translation might be wordmapped and tokenized
 task.prepro.hpp  Describes class PreProTask, which preprocesses (tokenizes and maps to integers with WordMapper) source input
 task.readfst.hpp  Implementation of an Fst reader to data structure
 task.referencefilter.hpp  Describes class ReferenceFilterTask (builds unweighted substring fst for lattice alignment)
 task.sparseweightvectorlattices.hpp  Implements the task of creating sparse vector weight lattices – contains feature weight contributions separately in each arc and we can use it to dump features, MERT training, etc
 task.ssgrammar.hpp  Contains implementation for sentence-specific grammar task
 task.stats.hpp  Task that dumps statistics stored by any previous task in the pipeline
 task.tunewpwrite.hpp  Tune word penalty and write output
 task.writefst.hpp  Implementation of an Fst writer taking the fst from data object
 taskinterface.gtest.cpp  Unit testing: TaskInterface methods
 taskinterface.hpp  Interfaces with basic methods for iteration
 TestCommandLineInterface.java
 TestFeatureRegistry.java
 TextArrayWritable.java
 TGMert.cpp
 TGMert.h
 tokenizer.osr.hpp  Lower casing/Tokenization/Detokenization not available for open source release
 fsttools/include/tropical-sparse-tuple-weight-decls.h  Basic declarations used for tropical sparse vector weight semiring
 latmert/include/tropical-sparse-tuple-weight-decls.h
 tropical-sparse-tuple-weight-funcs.h  Convenience functions for tropical sparse vector weight
 tropical-sparse-tuple-weight-incls.h  Files to include for the tropical sparse tuple semiring
 fsttools/include/tropical-sparse-tuple-weight.h  Implementation of tropical sparse tuple weight semiring
 latmert/include/tropical-sparse-tuple-weight.h
 tropical-sparse-tuple-weight.makeweight.h  Convenience functors that allow transparent handling for weights within hifst
 tropical_LT_tropical-arc.so.cpp  Main target file that generates shared library
 tropicalsparsetuple-arc.so.cpp  Main target file for compilation into a shared library
 TTableClient.java
 TTableServer.java
 TuneSet.cpp
 TuneSet.h
 tuneset.hpp
 tunewp.main.cpp
 Util.java
 vecmap.main.cpp
 weights.gtest.cpp  Unit testing: Various weight makers on different semirings
 wordmapper.gtest.cpp  Unit testing: WordMapper, class that maps words to integers and vice versa
 wordmapper.hpp  Class WordMapper