egrecho.core.loads#
- class egrecho.core.loads.HLoads[source]#
Bases:
ObjectDict
Structs load parameters as:
    model:
      _cls_: egrecho.models.ecapa.model.EcapaModel
      # override_init_model_cfg
      config: {}
      # other kwargs placeholder
      ...
    feature_extractor:
      _cls_: egrecho.data.features.feature_extractor_audio.KaldiFeatureExtractor
      # kwargs passing to _cls_.fetch_from
      ...
- classmethod from_config(config=None, **kwargs)[source]#
Creates hloads from config.
The input config can be an instance of dict | str | Path | HLoads | MutableMapping; the **kwargs are merged recursively into config.
Normalize dict -> Merge -> (maybe) OmegaConf resolve -> Instantiate -> Output
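To make the "merged recursively" step concrete, here is a minimal sketch in plain Python. The helper name `merge_recursive` is hypothetical and is not part of egrecho; it only illustrates how a nested override differs from a shallow one.

```python
from collections.abc import Mapping
from copy import deepcopy
from typing import Any, Dict


def merge_recursive(base: Mapping, override: Mapping) -> Dict[str, Any]:
    """Return a new dict in which ``override`` is merged into ``base`` key by key.

    Hypothetical helper for illustration only; not the library's implementation.
    """
    merged = deepcopy(dict(base))
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            # Both sides are mappings: descend instead of replacing the whole subtree.
            merged[key] = merge_recursive(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


config = {"model": {"init_weight": "random", "strict": True}}
kwargs = {"model": {"init_weight": "pretrained"}}

# Recursive merge keeps sibling keys such as "strict".
merged = merge_recursive(config, kwargs)

# A shallow merge replaces the whole "model" subtree and drops "strict".
shallow = {**config, **kwargs}
```

With a recursive merge, an override only needs to name the keys it changes; everything else under the same section is preserved.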
- class egrecho.core.loads.HResults[source]#
Bases:
ObjectDict
Structures the loaded result of SaveLoadHelper.fetch_from.
- class egrecho.core.loads.SaveLoadHelper[source]#
Bases:
object
Saves and loads a model in a directory. Override this class for any special handling.
Example:
    from egrecho.core.loads import SaveLoadHelper
    from egrecho.models.ecapa.model import EcapaModel
    from egrecho.data.features.feature_extractor_audio import KaldiFeatureExtractor

    sl_helper = SaveLoadHelper()
    extractor = KaldiFeatureExtractor()
    model = EcapaModel()
    dirpath = 'testdir/ecapa'
    sl_helper.save_to(dirpath, model_or_state=model, components=extractor)

    $ tree testdir/ecapa
    testdir/ecapa/
    ├── config
    │   ├── feature_config.yaml
    │   ├── model_config.yaml
    │   └── types.yaml
    └── model_weight.ckpt

    hresults = sl_helper.fetch_from(dirpath)
    assert isinstance(hresults.model, EcapaModel)
    assert isinstance(hresults.feature_extractor, KaldiFeatureExtractor)

    # hloads control random init
    hloads = {'model': {'init_weight': 'random'}}
    hresults = sl_helper.fetch_from(dirpath, hloads=hloads)

    # kwargs overrides to pretrained again
    hresults = sl_helper.fetch_from(dirpath, hloads=hloads, model={'init_weight': 'pretrained'})

    # now remove types.yaml
    # rm -f testdir/ecapa/config/types.yaml
    hresults = sl_helper.fetch_from(dirpath, single_key='model')
    # raise ConfigurationException: Failed request model type

    # Let's complete the model type
    model_cls = 'egrecho.models.ecapa.model.EcapaModel'
    hresults = sl_helper.fetch_from(dirpath, single_key='model', model={'_cls_': model_cls})
    assert isinstance(hresults.model, EcapaModel)

    # Type is ok
    model_cls = EcapaModel
    hresults = sl_helper.fetch_from(dirpath, single_key='model', model={'_cls_': model_cls})
    assert isinstance(hresults.model, EcapaModel)

    # classname string is ok as EcapaModel is already imported
    model_cls = 'EcapaModel'
    hresults = sl_helper.fetch_from(dirpath, single_key='model', model={'_cls_': model_cls})
    assert isinstance(hresults.model, EcapaModel)

    model_cls = 'Valle'
    # Error as 'Valle' is not registered.
    hresults = sl_helper.fetch_from(
        dirpath,
        single_key="model",
        kwargs_recurse_override=False,
        model={"_cls_": model_cls, "init_weight": "random", "config": None},
    )  # only load model without weight and eliminate the influences of Ecapa model directory

    from egrecho.models.valle.model import Valle

    # Try again.
    hresults = sl_helper.fetch_from(
        dirpath,
        single_key="model",
        kwargs_recurse_override=False,
        model={"_cls_": model_cls, "init_weight": "random", "config": None},
    )
    assert isinstance(hresults.model, Valle)
- save_to(savedir, model_or_state=None, components=None, **kwargs)[source]#
Saves a model after pretraining.
Exports a pretrained model with its subcomponents (configs, tokenizer, etc.) to an output directory like:
    ./savedir
    ├── model_weight.ckpt
    └── ./config
        ├── model_config.yaml
        ├── feature_config.yaml
        └── types.yaml
- Parameters:
savedir -- local directory.
model_or_state (Union[TopVirtualModel, Dict[str, Any], None]) -- TopVirtualModel object or model state dict to be saved.
components (Optional[Iterable[Any]]) -- objects such as a tokenizer, feature extractor, etc.
- fetch_from(srcdir, hloads=None, base_model_cls=None, skip_keys=None, single_key=None, return_hloads=False, kwargs_recurse_override=True, **kwargs)[source]#
Loads the module classes specified in HLoads. Returns an HResults dict containing (MODEL, FEATURE_EXTRACTOR, ...), where MODEL can be:
An instance of the model.
None, when skip_keys applies to the model.
Note
The workflow is a sequence of the following operations:
- Resolve hloads.
The user can use a config file or pass kwargs to control behaviour.
- Load available class types via load_types_dict(). Note that passing a bare class name as the type is supported if that class is imported in the current namespace and is a subclass of a supported base module (currently TopVirtualModel, BaseFeature, BaseTokenizer). E.g., instead of passing the full class path 'egrecho.models.ecapa.model.EcapaModel', the user can first import that class in a Python module, which acts as registration; the class name 'EcapaModel' is then available. This mechanism can simplify parameter control.
- Instantiate classes via instantiate_classes() according to the types dict. Notably, the model is loaded lazily, i.e., as a tuple of (MODEL_CLS, INIT_MODEL_CFG, LEFT_MODEL_CFG) resolved by instantiate_model(lazy_model=True).
- Instantiate the model via _instantiate_model(). Users may override this method in subclasses.
- Load the model weights.
- Parameters:
srcdir (Union[str, Path]) -- Model directory like:
    ./srcdir
    ├── model_weight.ckpt
    └── ./config
        ├── model_config.yaml
        ├── feature_config.yaml
        └── types.yaml
hloads (Union[str, Path, Dict[str, Any], None]) -- Path|str|Dict, optional. Hparam dict/file with a hierarchical structure as in this example:
    model:
      _cls_: egrecho.models.ecapa.model.EcapaModel
      # override_init_model_cfg
      config: {}
      # other kwargs placeholder
      ...
    feature_extractor:
      _cls_: egrecho.data.features.feature_extractor_audio.KaldiFeatureExtractor
      # kwargs passing to _cls_.fetch_from
      ...
You most likely won't need this since the default behaviour works well. However, this argument gives a chance to complete/override kwargs.
base_model_cls (Union[str, Type, None]) -- Base model class.
single_key (Optional[str]) -- Load only the specified key.
skip_keys (Union[str, Literal['model', 'others', 'null'], Set[str], None]) -- Keys to skip, e.g., skip the model. Invalid when single_key is set.
kwargs_recurse_override (bool) -- Whether kwargs recursively override hloads.
kwargs (Dict[str, Any]) -- Overrides hloads.
Hint
Example of model-related params:
    self.fetch_from(..., model=dict(init_weight='last.ckpt', strict=False))
init_weight: Init weight from ('pretrained' or 'random'), a checkpoint file name (e.g. model_weight.ckpt), or a full path to a checkpoint (/path/to/model_weight.ckpt). Default: 'pretrained'
map_location: MAP_LOCATION_TYPE as in torch.load(). Defaults to 'cpu'. If you prefer to load a checkpoint saved from a GPU model onto the GPU, set it to None (does not move to another GPU) or to a specific device.
strict: bool, optional. Whether to strictly enforce that the keys in the checkpoint match the keys returned by this module's state dict. Defaults to True.
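The accepted forms of init_weight can be pictured as a small dispatch. The sketch below is an assumption about how such a value might map to a file, not the library's actual resolution logic; `resolve_init_weight` and its `default_name` parameter are hypothetical names.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_init_weight(
    srcdir: Union[str, Path],
    init_weight: str = "pretrained",
    default_name: str = "model_weight.ckpt",
) -> Optional[Path]:
    """Map an ``init_weight`` value to a checkpoint path, or None for random init.

    Hypothetical helper; the real lookup rules in egrecho may differ.
    """
    if init_weight == "random":
        return None  # nothing to load, keep freshly initialised parameters
    if init_weight == "pretrained":
        return Path(srcdir) / default_name
    candidate = Path(init_weight)
    if candidate.parent == Path("."):
        # A bare file name is assumed to live inside srcdir.
        return Path(srcdir) / candidate
    # Anything with a directory part is used as given.
    return candidate


default_ckpt = resolve_init_weight("testdir/ecapa")
named_ckpt = resolve_init_weight("testdir/ecapa", "last.ckpt")
full_ckpt = resolve_init_weight("testdir/ecapa", "/path/to/model_weight.ckpt")
no_ckpt = resolve_init_weight("testdir/ecapa", "random")
```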
- Return type:
HResults
- Returns:
A HResults dict.
- load_model_with_components(cfg_dir, hloads=None, base_model_cls=None, skip_keys=None, single_key=None, lazy_model=False)[source]#
Loads the module classes specified in HLoads. Returns a tuple containing (MODEL, COMPONENTS, HLOADS), where TYPES_DICT indicates which classes will be used to instantiate objects. MODEL can be:
A tuple of (MODEL_INSTANCE, LEFT_MODEL_CFG).
A tuple of (MODEL_CLS, INIT_MODEL_CFG, LEFT_MODEL_CFG), resolved as a lazy model in _instantiate_model().
None, when skip_keys applies to the model.
- Parameters:
cfg_dir (str) -- Directory containing the cfg files.
hloads (Optional[HLoads]) -- Path|str|Dict, optional. Hparam dict/file with a hierarchical structure as in this example:
    model:
      _cls_: egrecho.models.ecapa.model.EcapaModel
      # replace default model_config.yaml
      config_fname: some_config.yaml
      # override_init_model_cfg
      config: {}
      # other kwargs placeholder
      ...
    feature_extractor:
      _cls_: egrecho.data.features.feature_extractor_audio.KaldiFeatureExtractor
      # kwargs passing to _cls_.fetch_from
      ...
You most likely won't need this since the default behaviour works well. However, this argument gives a chance to complete/override kwargs.
base_model_cls (Union[str, Type, None]) -- Base model class.
single_key (Optional[str]) -- Load only the specified key.
skip_keys (Union[str, Literal['model', 'others', 'null'], Set[str], None]) -- Keys to skip, e.g., skip the model. Invalid when single_key is set.
lazy_model (bool) -- If False, instantiate the model; otherwise just leave the model class with its init cfg. Default: False
- Return type:
Tuple[Union[Tuple[Module, Dict[str, Any]], Tuple[Type, Dict[str, Any], Dict[str, Any]], None], Dict[str, Any], HLoads]
- Returns:
A tuple containing (MODEL, COMPONENTS, HLOADS).
- modify_state_dict(state_dict, model_cfg)[source]#
Allows modifying the state dict before loading parameters into a model.
- Parameters:
state_dict -- The state dict restored from the checkpoint.
model_cfg -- A model level dict object.
- Returns:
A potentially modified state dict.
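A typical reason to modify a state dict is a key prefix left behind by a training wrapper (for example, torch.nn.DataParallel prepends "module." to parameter names). The sketch below shows that kind of transformation on a plain dict; `strip_key_prefix` is a hypothetical helper, and plain floats stand in for tensors so the snippet runs without torch.

```python
from typing import Any, Dict


def strip_key_prefix(state_dict: Dict[str, Any], prefix: str) -> Dict[str, Any]:
    """Return a copy of ``state_dict`` with ``prefix`` removed from matching keys.

    Hypothetical helper for illustration; keys without the prefix are kept as-is.
    """
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Plain Python values stand in for tensors here.
restored = {
    "module.encoder.weight": 1.0,
    "module.encoder.bias": 0.0,
    "head.weight": 2.0,
}
cleaned = strip_key_prefix(restored, "module.")
```

A subclass of SaveLoadHelper could, under these assumptions, call such a helper from its own modify_state_dict(state_dict, model_cfg) and return the result.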
- load_instance_with_state_dict(instance, state_dict, strict)[source]#
Utility method that loads a model instance with the (potentially modified) state dict.
- Parameters:
instance -- ModelPT subclass instance.
state_dict -- The state dict (which may have been modified)
strict -- Bool, whether to perform strict checks when loading the state dict.
- egrecho.core.loads.save_ckpt_conf_dir(ckptdir, model_conf=None, extractor=None, model_type=None, feature_extractor_type=None, **kwargs)[source]#
Saves the extractor, model_type, etc. to a directory, making it convenient to load from pretrained.
Constructs a directory like:
    ./ckptdir
    └── ./config
        ├── model_config.yaml
        ├── feature_config.yaml
        └── types.yaml
- Parameters:
ckptdir (str) -- the parent of savedir; a config subdir will be created under it as a placeholder for files.
model_conf (Optional[Dict[str, Any]]) -- a dict of model config.
extractor (Union[Dict[str, Any], BaseFeature, None]) -- extractor can be either a dict or an instance of BaseFeature.
model_type (Union[str, Type, None]) -- model class type or class import path.
feature_extractor_type (Union[str, Type, None]) -- feature extractor class type or class import path.
- class egrecho.core.loads.ResolveModelResult(checkpoint=None, model_type=None, feature_config=None)[source]#
Bases:
object
Resolved opts.
- Parameters:
checkpoint (str) -- ckpt weight path
model_type (str) -- model type string.
feature_config (Optional[Dict[str, Any]]) -- loaded dict of feature extractor config.
- egrecho.core.loads.resolve_pretrained_model(checkpoint='last.ckpt', dirpath=None, best_k_mode='min', version='version', extractor_fname='feature_config.yaml', **resolve_ckpt_kwargs)[source]#
Resolves checkpoint, model_type, feats_config.
For checkpoint resolving, see resolve_ckpt() for details. Automatically resolves a local dir like:
    ./dirpath/version_1
    └── checkpoints
        ├── best_k_models.yaml
        ├── last.ckpt
        ├── abc.ckpt
        └── ./config
            ├── model_config.yaml
            ├── feature_config.yaml
            └── types.yaml
- Parameters:
checkpoint (str, optional) -- The file name of the checkpoint to resolve; a local file needs a suffix like ".ckpt" / ".pt". checkpoint="best" is a reserved key meaning it will find best_k_fname, a file containing Dict[BEST_K_MODEL_PATH, BEST_K_SCORE], and sort by score to match the best checkpoint. Defaults to "last.ckpt".
dirpath (Path or str, optional) -- The root path. Defaults to None, which means the current directory.
version (str, optional) -- The versioned subdir name. Commonly the subdir is named like "version_0/version_1". If you specify the version name with a version number, it will search that version dir; otherwise it chooses the highest version number ("version_1" in the layout above). Defaults to "version".
best_k_mode (Literal["max", "min"], optional) -- The mode for selecting the best_k checkpoint. Defaults to "min".
extractor_fname (str) -- feature extractor file name, defaults to "feature_config.yaml"; searched in the config/ subdir.
resolve_ckpt_kwargs (dict) -- additional kwargs to resolve_ckpt().
- Return type:
ResolveModelResult
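The two selection rules described in the parameters (best checkpoint by score, and newest version directory when no number is given) can be sketched in a few lines. Both function names below are hypothetical, the inputs are plain Python values rather than files on disk, and the actual resolve_ckpt() logic may differ.

```python
import re
from typing import Dict, List, Optional


def pick_best_checkpoint(best_k: Dict[str, float], mode: str = "min") -> str:
    """Pick the checkpoint path with the lowest ("min") or highest ("max") score."""
    if mode not in ("min", "max"):
        raise ValueError(f"mode must be 'min' or 'max', got {mode!r}")
    if not best_k:
        raise ValueError("best_k is empty")
    chooser = min if mode == "min" else max
    return chooser(best_k, key=best_k.get)


def pick_version_dir(names: List[str], version: str = "version") -> Optional[str]:
    """Return ``version`` if it names an existing dir, else the highest ``<version>_<n>``."""
    if version in names:
        return version  # an explicit name such as "version_0" wins
    pattern = re.compile(rf"^{re.escape(version)}_(\d+)$")
    numbered = []
    for name in names:
        match = pattern.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    # Compare by integer so that version_10 sorts after version_2.
    return max(numbered)[1] if numbered else None


scores = {"epoch3.ckpt": 0.31, "epoch7.ckpt": 0.27}
best_min = pick_best_checkpoint(scores, mode="min")
best_max = pick_best_checkpoint(scores, mode="max")

subdirs = ["version_0", "version_2", "version_10", "misc"]
latest = pick_version_dir(subdirs)
pinned = pick_version_dir(subdirs, version="version_0")
```

Use "min" when the tracked score is a loss or error rate and "max" when it is an accuracy-like metric.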
- egrecho.core.loads.load_module_class(module_path, base_module_type=None)[source]#
Given an import path that contains a class, returns the class type.
If the import path is in full format, it should be in dotted import format with the class name as the last part.
If only the model class name is provided (without a dot "."), it resolves the subclasses of base_module_type that have been registered via imports in the Python file and matches the model name against the last part. If one name matches more than one model class, it will fail, and you need to provide the full path to eliminate ambiguity.
- Parameters:
module_path (str) -- The import path containing the module class. If only the class name is provided, that class should be registered by import in your Python code.
base_module_type (Type, optional) -- The base class type to check against.
- Returns:
The class type loaded from the module path.
- Return type:
Type
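The two lookup modes (full dotted path versus bare class name among imported subclasses) can be approximated with the standard library alone. This is a sketch under the assumption that "registered via imports" means "reachable through `__subclasses__()`"; `load_class`, `iter_subclasses`, `BaseDemo` and `EcapaDemo` are hypothetical names, not egrecho APIs.

```python
import importlib
from typing import List, Optional, Type


def iter_subclasses(base: Type) -> List[Type]:
    """Collect all direct and indirect subclasses of ``base`` that are currently imported."""
    found: List[Type] = []
    for sub in base.__subclasses__():
        found.append(sub)
        found.extend(iter_subclasses(sub))
    return found


def load_class(module_path: str, base_type: Optional[Type] = None) -> Type:
    """Resolve a class from a dotted path, or from a bare name among subclasses."""
    if "." in module_path:
        # Full format: everything before the last dot is the module.
        module_name, _, class_name = module_path.rpartition(".")
        cls = getattr(importlib.import_module(module_name), class_name)
    else:
        if base_type is None:
            raise ValueError("a bare class name needs a base type to search")
        # A set removes duplicates that arise from multiple inheritance.
        matches = {sub for sub in iter_subclasses(base_type) if sub.__name__ == module_path}
        if len(matches) != 1:
            raise ValueError(
                f"expected exactly one class named {module_path!r}, found {len(matches)}"
            )
        cls = matches.pop()
    if base_type is not None and not issubclass(cls, base_type):
        raise TypeError(f"{cls!r} is not a subclass of {base_type!r}")
    return cls


class BaseDemo:
    """Stand-in for a base module type."""


class EcapaDemo(BaseDemo):
    """Defining (or importing) this class makes it discoverable by name."""


resolved_full = load_class("collections.OrderedDict", dict)
resolved_bare = load_class("EcapaDemo", BaseDemo)
```

The bare-name path fails when no class or more than one class matches, which mirrors the ambiguity rule described above.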
- egrecho.core.loads.load_model_type(module_path, base_module_type=<class 'egrecho.core.module.TopVirtualModel'>)[source]#
Given an import path that contains a class, returns the class type.
If the import path is in full format, it should be in dotted import format with the class name as the last part.
If only the model class name is provided (without a dot "."), it resolves the subclasses of base_module_type that have been registered via imports in the Python file and matches the model name against the last part. If one name matches more than one model class, it will fail, and you need to provide the full path to eliminate ambiguity.
- Parameters:
module_path (str) -- The import path containing the module class. If only the class name is provided, that class should be registered by import in your Python code.
base_module_type (Type, optional) -- The base class type to check against.
- Returns:
The class type loaded from the module path.
- Return type:
Type
- egrecho.core.loads.load_extractor_type(module_path, base_module_type=<class 'egrecho.core.feature_extractor.BaseFeature'>)[source]#
Given an import path that contains a class, returns the class type.
If the import path is in full format, it should be in dotted import format with the class name as the last part.
If only the model class name is provided (without a dot "."), it resolves the subclasses of base_module_type that have been registered via imports in the Python file and matches the model name against the last part. If one name matches more than one model class, it will fail, and you need to provide the full path to eliminate ambiguity.
- Parameters:
module_path (str) -- The import path containing the module class. If only the class name is provided, that class should be registered by import in your Python code.
base_module_type (Type, optional) -- The base class type to check against.
- Returns:
The class type loaded from the module path.
- Return type:
Type
- egrecho.core.loads.load_extend_default_type(module_path, default_type=None)[source]#
Allows a simple class name when default_type is provided.
If the import path is in dotted format, e.g. "calendar.Calendar", it should be the full import path with the class name as the last part. Note that default_type is ignored in this case.
If only the model class name is provided (without a dot "."), default_type must be provided as the base_module_type; it then resolves the subclasses of default_type that have been registered via imports in the Python file and matches the model name against the last part. If one name matches more than one model class, it will fail, and you need to provide the full path to eliminate ambiguity.
- Parameters:
module_path (str) -- The import path containing the module class. If only the class name is provided, that class should be a subclass of default_type registered by import in your Python code.
default_type (Type, optional) -- The default class type.
- Returns:
The class type loaded from the module path.
- Return type:
Type