Efficiently loads a wordmap file and provides methods to map words to integers or integers to words. To limit the memory footprint, the wordmap entries are not hashed.
#include <wordmapper.hpp>
The wordmap file is assumed to satisfy the following:
- Bijective relationship (word <-> integer id).
- Sorted by id with no id missing (if 61 exists, 59 and 60 must also exist in the file and appear on earlier lines).
- First index is 0.
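The assumptions above mean that an id can double as a position in a plain array, which is one way to do without a hash table. The following is a minimal sketch of that idea, not the actual WordMapper loader: the function name `load_dense_wordmap` is hypothetical, and the line layout `<id> <word>` (whitespace-separated) is an assumption, since the file format is not specified here.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of WordMapper). Reads lines of the assumed
// form "<id> <word>" and stores each word at the vector position equal to
// its id. Throws if ids do not start at 0 or are not consecutive.
std::vector<std::string> load_dense_wordmap(std::istream& in) {
  std::vector<std::string> words;
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty()) continue;  // tolerate blank lines
    std::istringstream fields(line);
    std::size_t id = 0;
    std::string word;
    if (!(fields >> id >> word)) {
      throw std::runtime_error("malformed wordmap line: " + line);
    }
    // The next id must equal the number of entries read so far.
    if (id != words.size()) {
      throw std::runtime_error("ids must start at 0 and be consecutive: " + line);
    }
    words.push_back(word);
  }
  return words;
}
```

With the words stored this way, an integer-to-word lookup is a bounds check followed by an array access.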
Definition at line 63 of file wordmapper.hpp.
ucam::util::WordMapper::WordMapper ( const std::string & wordmapfile, bool reverse = false )  [inline]
Constructor.
Parameters
- wordmapfile: Wordmap file to load.
- reverse: Perform string-to-integer (false) or integer-to-string (true).
Definition at line 91 of file wordmapper.hpp.
ucam::util::WordMapper::WordMapper ( iszfstream & wordmapstream, bool reverse = false )  [inline]
ucam::util::WordMapper::~WordMapper ( )  [inline]
std::size_t ucam::util::WordMapper::get_oov_id ( )  [inline]
unordered_map<std::size_t, std::string>& ucam::util::WordMapper::get_oovwmap ( )  [inline]
void ucam::util::WordMapper::operator() ( const std::string & is, std::string * os, bool reverse = false )  [inline]
Perform search. Both directions allowed (int to string or string to int).
Parameters
- is: Input string.
- os: Output string.
- reverse: If true, triggers reverse search (string to int).
Definition at line 118 of file wordmapper.hpp.
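To illustrate how a search in both directions can work without hashing the entries, here is a minimal self-contained sketch. It is not the WordMapper implementation: the class `DenseWordMap` and its methods `word_of` and `id_of` are hypothetical names, and the technique shown (array indexing for id to word, binary search over an index sorted by word for word to id) is an assumption about one reasonable approach, not a description of what wordmapper.hpp does.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration class, not part of ucam::util.
class DenseWordMap {
 public:
  // words[i] is the word whose id is i (ids are dense and start at 0).
  explicit DenseWordMap(std::vector<std::string> words)
      : words_(std::move(words)), by_word_(words_.size()) {
    // Build a list of ids ordered by their word, for binary search.
    std::iota(by_word_.begin(), by_word_.end(), std::size_t{0});
    std::sort(by_word_.begin(), by_word_.end(),
              [this](std::size_t a, std::size_t b) {
                return words_[a] < words_[b];
              });
  }

  // id -> word: direct array access. Returns false if id is out of range.
  bool word_of(std::size_t id, std::string* out) const {
    if (id >= words_.size()) return false;
    *out = words_[id];
    return true;
  }

  // word -> id: binary search over the sorted index. Returns false if the
  // word is not in the map.
  bool id_of(const std::string& word, std::size_t* out) const {
    auto it = std::lower_bound(
        by_word_.begin(), by_word_.end(), word,
        [this](std::size_t id, const std::string& w) {
          return words_[id] < w;
        });
    if (it == by_word_.end() || words_[*it] != word) return false;
    *out = *it;
    return true;
  }

 private:
  std::vector<std::string> words_;    // position == id
  std::vector<std::size_t> by_word_;  // ids sorted by their word
};
```

In this sketch the extra cost over the raw word list is one integer per entry, and a word-to-id lookup takes a logarithmic number of string comparisons rather than a hash computation.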
unsigned ucam::util::WordMapper::operator() ( const std::string & is )  [inline]
Quick hack to get what is needed for lm.
Definition at line 131 of file wordmapper.hpp.
void ucam::util::WordMapper::reset_oov_id ( )  [inline]
void ucam::util::WordMapper::set_oovwmap ( unordered_map< std::size_t, std::string > & oovmap )  [inline]
The documentation for this class was generated from the following file: