egrecho.utils.patch.io_patch#
- egrecho.utils.patch.io_patch.gzip_open_patch(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)[source]#
Open a gzip-compressed file in binary or text mode, tolerating “trailing garbage” at the end of the gzip file.
- class egrecho.utils.patch.io_patch.AltGzipFile(filename=None, mode=None, compresslevel=9, fileobj=None, mtime=None)[source]#
Bases:
GzipFile
This is a workaround for Python’s stdlib gzip module not implementing gzip decompression correctly… Command-line gzip is able to discard “trailing garbage” in gzipped files, but Python’s gzip is not.
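The "trailing garbage" problem can be reproduced with the standard library alone. The sketch below is not the AltGzipFile implementation; it only shows, under the assumption of a single gzip member followed by arbitrary non-zero bytes, why the stdlib reader fails and how the valid payload can still be recovered with zlib. The names payload, garbage and blob are illustrative.

```python
import gzip
import zlib

payload = b"hello egrecho"
garbage = b"not-gzip-data"  # arbitrary bytes appended after the gzip member
blob = gzip.compress(payload) + garbage

# The stdlib reader treats the trailing bytes as a second, malformed member.
try:
    gzip.decompress(blob)
    stdlib_failed = False
except OSError:  # gzip.BadGzipFile is a subclass of OSError
    stdlib_failed = True

# wbits = MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
decoder = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
recovered = decoder.decompress(blob)
leftover = decoder.unused_data  # whatever followed the first gzip member
```

Here the decompressor stops at the end of the first member and exposes the remaining bytes through unused_data, which is roughly what command-line gzip does when it discards trailing garbage.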
- class egrecho.utils.patch.io_patch.FsspecLocalGlob[source]#
Bases:
object
A glob from fsspec.
Here are some behaviors specific to fsspec glob that are different from glob.glob, Path.glob, Path.match or fnmatch:
'*' matches only first-level items
'**' matches all items
'**/*' matches all items at the second level or deeper
For example, in glob.glob('**/*', recursive=True), the trailing '/*' has no effect because the first pattern '**' matches greedily.
- classmethod glob(path, **kwargs)[source]#
Find files by glob-matching.
Here are some behaviors specific to fsspec glob that are different from glob.glob, Path.glob, Path.match or fnmatch:
'*' matches only first-level items
'**' matches all items
'**/*' matches all items at the second level or deeper
For example, in glob.glob('**/*', recursive=True), the trailing '/*' has no effect because the first pattern '**' matches greedily.
If the path ends with ‘/’ and does not contain “*”, it is essentially the same as ls(path), returning only files.
We support "**", "?" and "[..]". We do not support ‘^’ for pattern negation.
Search path names that contain embedded characters special to this implementation of glob may not produce expected results; e.g., ‘foo/bar/*starredfilename*’.
kwargs are passed to ls.
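The standard-library side of the comparison above can be checked with a small temporary tree. This sketch uses only glob.glob (not fsspec or FsspecLocalGlob); the helper rel_glob and the file names are hypothetical. It shows that, with recursive=True, '**/*' in glob.glob still returns first-level items, which is the behavior the fsspec-style glob is described as differing from.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Layout: top.txt at the first level, sub/deep.txt at the second level.
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("top.txt", os.path.join("sub", "deep.txt")):
        open(os.path.join(root, rel), "w").close()

    def rel_glob(pattern):
        """Run glob.glob under root and return sorted, root-relative paths."""
        hits = glob.glob(os.path.join(root, pattern), recursive=True)
        return sorted(os.path.relpath(h, root).replace(os.sep, "/") for h in hits)

    star = rel_glob("*")                       # first-level items only
    double_star_slash_star = rel_glob("**/*")  # every level, first level included
```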
- classmethod walk(path, maxdepth=None, topdown=True, **kwargs)[source]#
Return all files below path.
List all files, recursing into subdirectories; output is iterator-style, like os.walk(). For a simple list of files, find() is available.
When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again. Modifying dirnames when topdown is False has no effect. (see os.walk)
Note that the “files” outputted will include anything that is not a directory, such as links.
- Parameters:
path (str) -- Root to recurse into
maxdepth (int) -- Maximum recursion depth. None means limitless, but not recommended on link-based file-systems.
topdown (bool (True)) -- Whether to walk the directory tree from the top downwards or from the bottom upwards.
**kwargs -- passed to ls.
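The in-place pruning described for topdown=True works the same way in os.walk, which this method is modeled on. A minimal sketch using os.walk directly follows; walk_pruned and the directory names are hypothetical, and the assumption is that FsspecLocalGlob.walk yields the same (dirpath, dirnames, filenames) triples.

```python
import os
import tempfile

def walk_pruned(root, skip):
    """Return visited directories (relative to root), skipping names in `skip`."""
    visited = []
    for dirpath, dirnames, _filenames in os.walk(root, topdown=True):
        # Slice assignment mutates the list os.walk holds, so pruning takes effect.
        dirnames[:] = sorted(d for d in dirnames if d not in skip)
        visited.append(os.path.relpath(dirpath, root).replace(os.sep, "/"))
    return visited

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "keep"))
    os.makedirs(os.path.join(root, "cache", "inner"))
    visited = walk_pruned(root, skip={"cache"})
```

Rebinding the name (dirnames = [...]) instead of assigning to the slice would leave the original list untouched, and the walk would still descend into the skipped directories.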
- classmethod find(path, maxdepth=None, withdirs=False, detail=False, **kwargs)[source]#
List all files below path.
Like the POSIX find command without conditions.
- Parameters:
path (str) --
maxdepth (int or None) -- If not None, the maximum number of levels to descend
withdirs (bool) -- Whether to include directory paths in the output. This is True when used by glob, but users usually only want files.
**kwargs -- passed to ls.
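To illustrate how maxdepth and withdirs interact, here is a rough stand-in built on os.walk. It is not the library's code: find_files and relative are hypothetical helpers, and the depth convention (maxdepth=1 lists only the entries directly under path) is an assumption.

```python
import os
import tempfile

def find_files(root, maxdepth=None, withdirs=False):
    """Flatten os.walk into a sorted list; depth 1 is the contents of root."""
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 1 if rel == "." else rel.count(os.sep) + 2
        if withdirs:
            results.extend(os.path.join(dirpath, d) for d in dirnames)
        results.extend(os.path.join(dirpath, f) for f in filenames)
        if maxdepth is not None and depth >= maxdepth:
            dirnames[:] = []  # stop descending below the requested depth
    return sorted(results)

def relative(paths, root):
    """Express paths relative to root with forward slashes."""
    return [os.path.relpath(p, root).replace(os.sep, "/") for p in paths]

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub", "deep"))
    for name in ("a.txt", "sub/b.txt", "sub/deep/c.txt"):
        open(os.path.join(root, *name.split("/")), "w").close()
    top_only = relative(find_files(root, maxdepth=1), root)
    everything = relative(find_files(root), root)
    with_dirs = relative(find_files(root, withdirs=True), root)
```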
- egrecho.utils.patch.io_patch.make_path_posix(path)[source]#
Make path generic and absolute for current OS
- egrecho.utils.patch.io_patch.stringify_path(filepath)[source]#
Attempt to convert a path-like object to a string.
- Parameters:
filepath (object to be converted) --
- Returns:
filepath_str
- Return type:
maybe a string version of the object
Notes
Objects supporting the fspath protocol are coerced according to their __fspath__ method. For backwards compatibility with older Python versions, pathlib.Path objects are specially coerced. Any other object is passed through unchanged, which includes bytes, strings, buffers, or anything else that’s not even path-like.
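The coercion rules in the notes can be approximated with os.fspath. The function below is a hypothetical sketch, not the library's implementation; it assumes that only non-string objects exposing __fspath__ are converted and that everything else is returned untouched.

```python
import os
from pathlib import PurePosixPath

def stringify_path_sketch(filepath):
    """Return a str for path-like objects; pass anything else through."""
    if isinstance(filepath, str):
        return filepath
    if hasattr(filepath, "__fspath__"):
        return os.fspath(filepath)  # delegates to filepath.__fspath__()
    return filepath  # bytes, buffers, numbers, ... are left unchanged

from_path = stringify_path_sketch(PurePosixPath("data") / "train.gz")
from_str = stringify_path_sketch("data/train.gz")
from_bytes = stringify_path_sketch(b"data/train.gz")
from_other = stringify_path_sketch(42)
```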
egrecho.utils.patch.torchdata_patch#
- egrecho.utils.patch.torchdata_patch.validate_input_col(fn, input_col)[source]#
Copied from torch.utils.data.datapipes.utils.common.
Checks that the function used in a callable datapipe works with the input column.
This simply ensures that the number of positional arguments matches the size of the input column. The function must not contain any non-default keyword-only arguments.
Examples
>>> # xdoctest: +SKIP("Failing on some CI machines")
>>> def f(a, b, *, c=1):
>>>     return a + b + c
>>> def f_def(a, b=1, *, c=1):
>>>     return a + b + c
>>> validate_input_col(f, [1, 2])
>>> validate_input_col(f_def, 1)
>>> validate_input_col(f_def, [1, 2])
Notes
If the function contains variable positional (inspect.VAR_POSITIONAL) arguments, for example, f(a, *args), the validator will accept any size of input column greater than or equal to the number of positional arguments (in this case, 1).
- Parameters:
fn (Callable) -- The function to check.
input_col (Union[int, tuple, list, None]) -- The input column to check.
- Raises:
ValueError -- If the function is not compatible with the input column.
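The rule described above (positional-argument count must fit the input column size, and no required keyword-only arguments) can be sketched with inspect.signature. This is a simplified, hypothetical check_input_col written for illustration, not the code copied from torch; in particular, treating a non-sequence input_col as size 1 is an assumption.

```python
import inspect

def check_input_col(fn, input_col):
    """Raise ValueError if fn cannot be called with len(input_col) positionals."""
    size = len(input_col) if isinstance(input_col, (list, tuple)) else 1
    params = list(inspect.signature(fn).parameters.values())
    positional = [p for p in params
                  if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)]
    required = [p for p in positional if p.default is p.empty]
    has_varargs = any(p.kind is p.VAR_POSITIONAL for p in params)
    required_kwonly = [p.name for p in params
                       if p.kind is p.KEYWORD_ONLY and p.default is p.empty]

    if required_kwonly:
        raise ValueError(f"required keyword-only arguments: {required_kwonly}")
    if size < len(required):
        raise ValueError(f"needs {len(required)} positional arguments, got {size}")
    if not has_varargs and size > len(positional):
        raise ValueError(f"accepts at most {len(positional)} positional arguments, got {size}")

def f(a, b, *, c=1):
    return a + b + c

def g(a, *args):
    return a

check_input_col(f, [1, 2])     # two required positionals, two columns
check_input_col(g, [1, 2, 3])  # *args absorbs the extra columns

try:
    check_input_col(f, 1)      # one column cannot fill both a and b
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```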
- class egrecho.utils.patch.torchdata_patch.StreamWrapper(file_obj, parent_stream=None, name=None)[source]#
Bases:
object
StreamWrapper wraps a file handler generated by a DataPipe operation such as FileOpener. It guarantees that the wrapped file handler is closed when it goes out of scope.
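The close-on-scope-exit idea can be illustrated with a small wrapper that closes its file object in __del__. ClosingWrapper below is a hypothetical, much-reduced stand-in and does not reproduce StreamWrapper's parent_stream or name handling; it also assumes an interpreter where dropping the last reference (or a garbage-collection pass) triggers __del__.

```python
import gc
import io

class ClosingWrapper:
    """Delegate attribute access to file_obj and close it when collected."""

    def __init__(self, file_obj):
        self.file_obj = file_obj

    def __getattr__(self, name):
        # Only called for attributes not found on the wrapper itself.
        if name == "file_obj":
            raise AttributeError(name)  # avoid recursion before __init__ finishes
        return getattr(self.file_obj, name)

    def __del__(self):
        file_obj = self.__dict__.get("file_obj")
        if file_obj is not None and not file_obj.closed:
            file_obj.close()

inner = io.BytesIO(b"payload")
wrapper = ClosingWrapper(inner)
first = wrapper.read(3)   # delegated to the wrapped BytesIO
del wrapper               # last reference to the wrapper goes away
gc.collect()              # make collection explicit for non-refcounting runtimes
closed_after = inner.closed
```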