egrecho.utils.patch.io_patch#

egrecho.utils.patch.io_patch.gzip_open_patch(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)[source]#

Open a gzip-compressed file in binary or text mode, handling “trailing garbage” in gzip files.

class egrecho.utils.patch.io_patch.AltGzipFile(filename=None, mode=None, compresslevel=9, fileobj=None, mtime=None)[source]#

Bases: GzipFile

This is a workaround for Python’s stdlib gzip module not implementing gzip decompression correctly: command-line gzip is able to discard “trailing garbage” in gzipped files, but Python’s gzip is not.
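
A minimal usage sketch, assuming gzip_open_patch mirrors the gzip.open signature shown above; “data.gz” is a hypothetical gzip file whose compressed stream is followed by trailing garbage:

>>> from egrecho.utils.patch.io_patch import gzip_open_patch
>>> with gzip_open_patch("data.gz", mode="rt", encoding="utf-8") as f:
...     text = f.read()  # stdlib gzip may choke on the trailing bytes; the patch discards them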

class egrecho.utils.patch.io_patch.FsspecLocalGlob[source]#

Bases: object

A glob from fsspec.

Here are some behaviors specific to fsspec glob that are different from glob.glob, Path.glob, Path.match or fnmatch:

  • '*' matches only first level items

  • '**' matches all items

  • '**/*' matches all at least second level items

    For example, in glob.glob('**/*', recursive=True) the trailing '/*' is effectively redundant because the leading '**' is already greedy, whereas here '**/*' restricts matches to at least the second level (see the sketch after this list).
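
A brief illustrative sketch of these rules, assuming a hypothetical layout root/a.txt and root/sub/b.txt; the exact path format of the returned list follows the local-filesystem listing:

>>> from egrecho.utils.patch.io_patch import FsspecLocalGlob
>>> FsspecLocalGlob.glob("root/*")     # first-level items only, e.g. 'root/a.txt' and 'root/sub'
>>> FsspecLocalGlob.glob("root/**")    # all items below root, including 'root/sub/b.txt'
>>> FsspecLocalGlob.glob("root/**/*")  # items at least one level below root, e.g. 'root/sub/b.txt'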

classmethod glob(path, **kwargs)[source]#

Find files by glob-matching.

Here are some behaviors specific to fsspec glob that are different from glob.glob, Path.glob, Path.match or fnmatch:

  • '*' matches only first level items

  • '**' matches all items

  • '**/*' matches all at least second level items

    e.g., in glob.glob('**/*', recursive=True) the trailing '/*' is effectively redundant because the leading '**' is already greedy.

If the path ends with ‘/’ and does not contain “*”, it is essentially the same as ls(path), returning only files.

We support "**", "?" and "[..]". We do not support ‘^’ for pattern negation.

Search path names that contain embedded characters special to this implementation of glob may not produce expected results; e.g., ‘foo/bar/*starredfilename*’.

kwargs are passed to ls.
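
A hedged sketch of the supported wildcards, assuming a hypothetical directory root/ containing file_0.txt and file_1.txt:

>>> from egrecho.utils.patch.io_patch import FsspecLocalGlob
>>> FsspecLocalGlob.glob("root/")               # no '*': essentially ls('root/'), files only
>>> FsspecLocalGlob.glob("root/file_?.txt")     # '?' matches exactly one character
>>> FsspecLocalGlob.glob("root/file_[01].txt")  # '[..]' matches a character set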

classmethod exists(path, **kwargs)[source]#

Return whether a file exists at the given path.

classmethod walk(path, maxdepth=None, topdown=True, **kwargs)[source]#

Return all files below path.

List all files, recursing into subdirectories; output is iterator-style, like os.walk(). For a simple list of files, find() is available.

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again. Modifying dirnames when topdown is False has no effect. (see os.walk)

Note that the “files” yielded will include anything that is not a directory, such as links.

Parameters:
  • path (str) -- Root to recurse into

  • maxdepth (int) -- Maximum recursion depth. None means limitless, but not recommended on link-based file-systems.

  • topdown (bool (True)) -- Whether to walk the directory tree from the top downwards or from the bottom upwards.

  • **kwargs -- passed to ls.
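
A usage sketch, assuming walk() yields (path, dirs, files) triples in the style of os.walk as described above:

>>> from egrecho.utils.patch.io_patch import FsspecLocalGlob
>>> for root, dirs, files in FsspecLocalGlob.walk("root", maxdepth=2):
...     dirs[:] = [d for d in dirs if not d.startswith(".")]  # prune hidden dirs (topdown=True only)
...     print(root, files)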

classmethod find(path, maxdepth=None, withdirs=False, detail=False, **kwargs)[source]#

List all files below path.

Like the POSIX find command without conditions.

Parameters:
  • path (str) --

  • maxdepth (int or None) -- If not None, the maximum number of levels to descend

  • withdirs (bool) -- Whether to include directory paths in the output. This is True when used by glob, but users usually only want files.

  • **kwargs -- passed to ls.
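
A short sketch, assuming find() returns a flat list of file paths (plus directory paths when withdirs=True):

>>> from egrecho.utils.patch.io_patch import FsspecLocalGlob
>>> FsspecLocalGlob.find("root")                 # all files below 'root'
>>> FsspecLocalGlob.find("root", maxdepth=1)     # do not descend past the first level
>>> FsspecLocalGlob.find("root", withdirs=True)  # include directory paths in the output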

egrecho.utils.patch.io_patch.make_path_posix(path)[source]#

Make path generic and absolute for current OS
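
An illustrative sketch, assuming behavior analogous to fsspec’s make_path_posix (backslashes normalized to '/', relative paths made absolute):

>>> from egrecho.utils.patch.io_patch import make_path_posix
>>> make_path_posix("C:\\data\\files")  # on Windows: 'C:/data/files'
>>> make_path_posix("data/files")       # relative input: expanded to an absolute posix-style path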

egrecho.utils.patch.io_patch.stringify_path(filepath)[source]#

Attempt to convert a path-like object to a string.

Parameters:

filepath (object to be converted) --

Returns:

filepath_str

Return type:

maybe a string version of the object

Notes

Objects supporting the fspath protocol are coerced according to their __fspath__ method. For backwards compatibility with older Python versions, pathlib.Path objects are specially coerced. Any other object is passed through unchanged, which includes bytes, strings, buffers, or anything else that’s not even path-like.
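
A hedged sketch of the coercion rules described in the notes:

>>> from pathlib import Path
>>> from egrecho.utils.patch.io_patch import stringify_path
>>> stringify_path(Path("data") / "file.txt")  # fspath/pathlib objects -> 'data/file.txt'
>>> stringify_path("already/a/string")         # strings pass through unchanged
>>> stringify_path(b"raw-bytes")               # non path-like objects are returned as-is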

egrecho.utils.patch.torchdata_patch#

egrecho.utils.patch.torchdata_patch.validate_input_col(fn, input_col)[source]#

Copied from torch.utils.data.datapipes.utils.common.

Checks that the function used in a callable datapipe works with the input column.

This simply ensures that the number of positional arguments matches the size of the input column. The function must not contain any non-default keyword-only arguments.

Examples

>>> # xdoctest: +SKIP("Failing on some CI machines")
>>> def f(a, b, *, c=1):
...     return a + b + c
>>> def f_def(a, b=1, *, c=1):
...     return a + b + c
>>> validate_input_col(f, [1, 2])
>>> validate_input_col(f_def, 1)
>>> validate_input_col(f_def, [1, 2])

Notes

If the function contains variable positional (inspect.VAR_POSITIONAL) arguments, for example, f(a, *args), the validator will accept any size of input column greater than or equal to the number of positional arguments. (in this case, 1).

Parameters:
  • fn (Callable) -- The function to check.

  • input_col (Union[int, tuple, list, None]) -- The input column to check.

Raises:

ValueError -- If the function is not compatible with the input column.
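
A hedged sketch of the failure mode, reusing the f from the example above: a mismatch between the input column size and the positional arguments raises ValueError:

>>> from egrecho.utils.patch.torchdata_patch import validate_input_col
>>> def f(a, b, *, c=1):
...     return a + b + c
>>> try:
...     validate_input_col(f, [1, 2, 3])  # three columns, but f takes only two positional args
... except ValueError as err:
...     print("incompatible:", err)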

class egrecho.utils.patch.torchdata_patch.StreamWrapper(file_obj, parent_stream=None, name=None)[source]#

Bases: object

StreamWrapper is introduced to wrap file handlers generated by DataPipe operations like FileOpener. StreamWrapper guarantees the wrapped file handler is closed when it goes out of scope.

classmethod close_streams(v, depth=0)[source]#

Traverse a structure and attempt to close all found StreamWrappers on a best-effort basis.

autoclose()[source]#

Close the stream if it has no children; otherwise mark it to be closed automatically as soon as all child streams are closed.
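
A minimal sketch of wrapping and cleaning up file handles; “data.txt” is a hypothetical file, and the nested-structure cleanup via close_streams follows the docstrings above:

>>> from egrecho.utils.patch.torchdata_patch import StreamWrapper
>>> stream = StreamWrapper(open("data.txt", "rb"))
>>> payload = {"train": stream}              # streams may sit inside nested containers
>>> StreamWrapper.close_streams(payload)     # best-effort close of every wrapped stream found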

egrecho.utils.patch.simple_parse_patch#