egrecho.utils.text.cleaners
This code is modified from NeMo and Coqui.
- egrecho.utils.text.cleaners.basic_cleaners(string)
Basic pipeline that lowercases and collapses whitespace without transliteration.
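The two steps named above can be sketched with the standard library alone. This is a minimal illustration, not the module's actual code: the name `basic_cleaners_sketch` is hypothetical, and stripping leading and trailing whitespace is an assumption.

```python
import re

# Matches any run of whitespace (spaces, tabs, newlines).
_WHITESPACE_RE = re.compile(r"\s+")


def basic_cleaners_sketch(text: str) -> str:
    """Lowercase the text and collapse whitespace runs into single spaces."""
    text = text.lower()
    # Assumption: leading and trailing whitespace is also removed.
    return _WHITESPACE_RE.sub(" ", text).strip()


print(basic_cleaners_sketch("  Hello \t  WORLD\n"))  # hello world
```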
- egrecho.utils.text.cleaners.english_cleaners(string)
Pipeline for English text, including number and abbreviation expansion.
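To illustrate what number and abbreviation expansion might look like, here is a rough sketch. Everything in it is an assumption: the abbreviation table, the 0 to 99 number range, the step order, and the name `english_cleaners_sketch` are all hypothetical, and the real pipeline may cover far more cases.

```python
import re

# Hypothetical abbreviation table; the real module's list may differ.
_ABBREVIATIONS = [
    (re.compile(r"\b%s\." % abbr), full)
    for abbr, full in [("mr", "mister"), ("dr", "doctor"), ("st", "saint")]
]

_ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
_TENS = ["", "", "twenty", "thirty", "forty", "fifty",
         "sixty", "seventy", "eighty", "ninety"]


def _spell_number(match: re.Match) -> str:
    """Spell out integers from 0 to 99; larger numbers are left unchanged."""
    n = int(match.group(0))
    if n < 20:
        return _ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return _TENS[tens] if ones == 0 else f"{_TENS[tens]} {_ONES[ones]}"
    return match.group(0)


def english_cleaners_sketch(text: str) -> str:
    text = text.lower()
    for pattern, full in _ABBREVIATIONS:
        text = pattern.sub(full, text)          # "dr." -> "doctor"
    text = re.sub(r"\d+", _spell_number, text)  # "42" -> "forty two"
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace


print(english_cleaners_sketch("Dr. Smith has 42  cats"))
# doctor smith has forty two cats
```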
- egrecho.utils.text.cleaners.chinese_mandarin_cleaners(string)
Basic pipeline for Mandarin Chinese text.
- Return type:
str
- egrecho.utils.text.cleaners.transliteration_cleaners(string)
Pipeline for non-English text that transliterates to ASCII.
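The idea of mapping text to ASCII can be approximated with the standard library. This sketch is a stand-in, not the module's method: it only strips diacritics from Latin letters through Unicode decomposition and silently drops characters with no ASCII base form, whereas a real transliteration library would map them to readable ASCII. The name `transliteration_cleaners_sketch` is hypothetical.

```python
import re
import unicodedata


def transliteration_cleaners_sketch(text: str) -> str:
    """Approximate ASCII transliteration by stripping diacritics."""
    # NFKD splits "é" into "e" plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding to ASCII with errors="ignore" drops the combining marks
    # (and, as a limitation of this sketch, any other non-ASCII character).
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    # Assumption: lowercase and collapse whitespace, as in the basic pipeline.
    return re.sub(r"\s+", " ", ascii_text.lower()).strip()


print(transliteration_cleaners_sketch("Crème  Brûlée"))  # creme brulee
```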
- egrecho.utils.text.cleaners.phoneme_cleaners(string)
Pipeline for phoneme mode, including number and abbreviation expansion.
- egrecho.utils.text.cleaners.replace_symbols(string, lang='en')
Replace symbols based on the language tag.
- Parameters:
string -- Input text.
lang -- Language identifier, e.g. “en”, “fr”, “pt”, “ca”.
- Returns:
The modified string.
Example
- Input args:
string: “si l’avi cau, diguem-ho”
lang: “ca”
- Output:
string: “si lavi cau, diguemho”
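The documented Catalan example can be reproduced with a small language-keyed rule set. This is a sketch under stated assumptions, not the library's implementation: the function name `replace_symbols_sketch`, the ampersand table, and the hyphen handling for other languages are hypothetical and were chosen only so that the example above holds.

```python
# Hypothetical per-language ampersand replacements; the real rules may differ.
_AMPERSAND = {"en": " and ", "fr": " et ", "pt": " e ", "ca": " i "}


def replace_symbols_sketch(text: str, lang: str = "en") -> str:
    """Replace or drop symbols depending on the language tag."""
    if lang == "ca":
        # Catalan: drop hyphens and apostrophes (straight or typographic)
        # so that clitics fuse with the neighbouring word.
        for symbol in ("-", "'", "\u2019"):
            text = text.replace(symbol, "")
    else:
        # Assumption: other languages turn hyphens into spaces.
        text = text.replace("-", " ")
    if lang in _AMPERSAND:
        text = text.replace("&", _AMPERSAND[lang])
    return text


print(replace_symbols_sketch("si l'avi cau, diguem-ho", lang="ca"))
# si lavi cau, diguemho
```

Keeping the rules in a dictionary keyed by language tag makes it straightforward to add another language without touching the control flow.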