File-Related Utility Functions
- igbpyutils.file.BinaryStream
A type to represent binary file handles.
Alias of IO[bytes] | RawIOBase | BufferedIOBase | GzipFile
- igbpyutils.file.AnyPaths
A type to represent any path or iterable of paths.
Can be converted to Path objects with to_Paths().
Alias of str | PathLike | bytes | Iterable[str | PathLike | bytes]
- igbpyutils.file.to_Paths(paths: str | PathLike | bytes | Iterable[str | PathLike | bytes]) → Generator[Path, None, None]
- igbpyutils.file.to_Paths(paths: bytes) → Generator[Path, None, None]
- igbpyutils.file.to_Paths(paths: str) → Generator[Path, None, None]
- igbpyutils.file.to_Paths(paths: PathLike) → Generator[Path, None, None]
- igbpyutils.file.autoglob(files: Iterable[str], *, force: bool = False) → Generator[str, None, None]
In Windows cmd.exe, automatically apply glob() and expanduser(), otherwise don’t change the input.
For example, take the following script:
    import argparse
    from igbpyutils.file import autoglob
    parser = argparse.ArgumentParser(description='Example')
    parser.add_argument('files', metavar="FILE", help="Files", nargs="+")
    args = parser.parse_args()
    paths = autoglob(args.files)
On a normal *NIX shell, calling this script as python script.py ~/*.py would result in args.files being a list of "/home/username/filename.py" strings if such files exist, or otherwise a single element of "/home/username/*.py". However, in a Windows cmd.exe shell, the aforementioned command always results in args.files being ['~/*.py']. This function fixes that, such that the behavior on Windows is the same as on Linux.
Note
This function now uses a heuristic check of the environment variables COMSPEC and SHELL to detect the current shell. Uncommon values in these variables may cause mis-detection; please feel free to submit patches if the detection does not work on your system.
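To illustrate the expansion that a *NIX shell performs and that autoglob() emulates on Windows, here is a minimal standard-library sketch. The helper name expand_like_shell is hypothetical, and this is not the library's implementation; in particular, it skips the shell detection described in the note and always expands.

```python
import glob
import os
from typing import Iterable, Iterator


def expand_like_shell(files: Iterable[str]) -> Iterator[str]:
    """Hypothetical helper: expand ``~`` and wildcards roughly like a *NIX shell."""
    for pattern in files:
        # First turn a leading "~" into the user's home directory.
        expanded = os.path.expanduser(pattern)
        # Then expand wildcards; sort for a stable, shell-like order.
        matches = sorted(glob.glob(expanded))
        if matches:
            yield from matches
        else:
            # Nothing matched: pass the user-expanded pattern through unchanged.
            yield expanded
```

With this sketch, an argument such as '~/*.py' yields the matching files in the home directory, or the single string '/home/username/*.py' if there are none.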
- igbpyutils.file.cmdline_rglob(paths: str | PathLike | bytes | Iterable[str | PathLike | bytes]) → Generator[Path, None, None]
Given a list of filenames and directories, such as might be given on the command line, return each input item, and also return the result of Path.rglob('*') for each item that is a directory.
If the given list is empty, use Path() instead, i.e. the current directory, but only its contents are included in the output, not the directory itself; to get that you must explicitly pass the directory as an input.
pathlib.Path.absolute() is used to remove duplicates from the output to the best of its ability. This is used instead of pathlib.Path.resolve() because that resolves symlinks and therefore would cause unexpected results for programs that need to see symlinks.
- See also:
autoglob() can be used on the list of paths before passing it to this function.
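The behavior described above can be approximated with pathlib alone. The following is a sketch under assumptions, not the library's code: the name rglob_sketch is hypothetical, only str and PathLike inputs are handled, and the output order within a directory is whatever Path.rglob() produces.

```python
from pathlib import Path
from typing import Iterable, Iterator, Union
import os


def rglob_sketch(paths: Iterable[Union[str, os.PathLike]]) -> Iterator[Path]:
    """Hypothetical sketch: yield each input, plus the contents of directories."""
    seen = set()
    items = [Path(p) for p in paths]
    # With no inputs, walk the current directory but do not yield it itself.
    candidates = items if items else Path().rglob("*")
    for item in candidates:
        found = [item]
        if items and item.is_dir():
            found.extend(item.rglob("*"))
        for p in found:
            # absolute() does not resolve symlinks, so links stay visible.
            key = p.absolute()
            if key not in seen:
                seen.add(key)
                yield p
```

Passing a directory and one of the files inside it yields that file only once, because both spellings map to the same absolute path.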
- class igbpyutils.file.Pushd(newdir: str | PathLike)
A context manager that temporarily changes the current working directory.
On Python >=3.11, this is simply an alias for contextlib.chdir().
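For Python versions without contextlib.chdir(), a context manager of this kind could look roughly like the following. This is a minimal sketch with the hypothetical name pushd_sketch, not the class shipped by the library.

```python
import contextlib
import os


@contextlib.contextmanager
def pushd_sketch(newdir):
    """Hypothetical sketch: change into ``newdir`` and always change back."""
    prev = os.getcwd()
    os.chdir(newdir)
    try:
        yield
    finally:
        # Restore the previous directory even if the body raised.
        os.chdir(prev)
```

Like contextlib.chdir(), this changes process-wide state, so it should not be relied on in multi-threaded code.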
- igbpyutils.file.filetypestr(st: stat_result) → str
Return a string naming the file type reported by os.stat().
- igbpyutils.file.is_windows_filename_bad(fn: str) → bool
Check whether a Windows filename is invalid.
Tests whether a filename contains invalid characters or has an invalid name, but does not check whether there are name collisions between filenames of differing case.
Reference: https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file
- igbpyutils.file.open_out(filename: str | PathLike | None = None, mode='w', *, encoding='UTF-8', errors=None, newline=None)
This context manager either opens the file specified and provides its file object, or, if the filename is not specified or it is the string "-", sys.stdout is provided.
Important
When sys.stdout is returned, the mode, encoding, errors, and newline arguments are ignored. Otherwise, when a file is opened, the default encoding is UTF-8, unless you change it.
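The "file or standard output" pattern described here can be sketched with the standard library as follows. The name open_out_sketch is hypothetical and the code is only an approximation of the documented behavior, not the library's source.

```python
import contextlib
import sys


@contextlib.contextmanager
def open_out_sketch(filename=None, mode="w", *, encoding="UTF-8",
                    errors=None, newline=None):
    """Hypothetical sketch: yield an opened file, or sys.stdout for None / "-"."""
    if filename is None or filename == "-":
        # The other arguments are ignored, and stdout is deliberately not closed.
        yield sys.stdout
    else:
        with open(filename, mode, encoding=encoding,
                  errors=errors, newline=newline) as fh:
            yield fh
```

A command-line tool can then write its output through one code path regardless of whether the user passed an output filename.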
- igbpyutils.file.replacer(file: str | PathLike, *, binary: bool = False, encoding=None, errors=None, newline=None)
Replace a file by renaming a temporary file over the original.
With this context manager, a temporary file is created in the same directory as the original file. The context manager gives you two file handles: the input file, and the output file, the latter being the temporary file. You can then read from the input file and write to the output file. When the context manager is exited, it will replace the input file with the temporary file. If an error occurs in the context manager, the temporary file is unlinked and the original file left unchanged.
Depending on the OS and file system, the os.replace() used here may be an atomic operation. However, this function doesn’t provide protection against multiple writers and is therefore intended for files with a single writer and multiple readers. Multiple writers will need to be coordinated with external locking mechanisms.
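The write-to-temporary-then-rename technique described above can be sketched like this. It is a simplified illustration under assumptions, not the library's implementation: the name replacer_sketch is hypothetical, only text mode is handled, and the original file's permissions are not copied to the new file.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def replacer_sketch(file, *, encoding="UTF-8"):
    """Hypothetical sketch: yield (input, output) handles, then swap the files."""
    directory = os.path.dirname(os.path.abspath(file))
    with open(file, encoding=encoding) as infh:
        # The temporary file lives in the same directory, so that the final
        # os.replace() stays on one file system.
        fd, tmpname = tempfile.mkstemp(dir=directory, text=True)
        try:
            with os.fdopen(fd, "w", encoding=encoding) as outfh:
                yield infh, outfh
        except BaseException:
            # On any error, discard the temporary file and keep the original.
            os.unlink(tmpname)
            raise
    # Both handles are closed at this point; now replace the original.
    os.replace(tmpname, file)
```

A typical use is an in-place edit, reading lines from the first handle and writing the modified lines to the second.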
- igbpyutils.file.replace_symlink(src: str | PathLike, dst: str | PathLike, *, missing_ok: bool = False)
Attempt to atomically replace (or create) a symbolic link pointing to src named dst.
This function works by trying to choose a temporary filename for the link in the destination directory, and then replacing the target with that temporary link.
Depending on the OS and file system, the os.replace() used here may be an atomic operation. However, the surrounding operations (e.g. checking if dst exists etc.) present a small chance for race conditions, so this function is primarily suited for situations with a single writer and multiple readers. Multiple writers will need to be coordinated with external locking mechanisms.
- See also:
replace_link() can do the same, but using a temporary directory instead of a temporary file in the same directory as the target file.
- igbpyutils.file.replace_link(src: str | PathLike, dst: str | PathLike, *, symbolic: bool = False)
Attempt to atomically create or replace a hard or symbolic link pointing to src named dst.
This function works by creating the link in a new temporary directory first, thus offloading the responsibility for finding a fitting temporary name and cleanup to TemporaryDirectory.
Depending on the OS and file system, the os.replace() used here may be an atomic operation. However, this function doesn’t provide protection against multiple writers and is therefore intended for files with a single writer and multiple readers. Multiple writers will need to be coordinated with external locking mechanisms.
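The temporary-directory approach can be sketched as follows for the symbolic-link case. This is an illustration only: the name replace_symlink_sketch is hypothetical, and placing the temporary directory next to dst is an assumption made here so that os.replace() stays on one file system. Creating symbolic links may require extra privileges on Windows.

```python
import os
import tempfile


def replace_symlink_sketch(src, dst):
    """Hypothetical sketch: create or replace the symlink ``dst`` -> ``src``."""
    dst = os.path.abspath(dst)
    # Assumption: the temporary directory is created alongside the destination.
    with tempfile.TemporaryDirectory(dir=os.path.dirname(dst)) as tmpdir:
        tmplink = os.path.join(tmpdir, os.path.basename(dst))
        os.symlink(src, tmplink)
        # Renaming operates on the link itself, not on the file it points to.
        os.replace(tmplink, dst)
```

Readers that open dst see either the old or the new target, provided os.replace() is atomic on the platform in question.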
- igbpyutils.file.NamedTempFileDeleteLater(*args, **kwargs) → Generator
A NamedTemporaryFile() that is unlinked on context manager exit, not on close.
On Python >=3.12, this simply calls tempfile.NamedTemporaryFile() with delete=True and the new delete_on_close=False.
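The difference between "delete on close" and "delete on exit" can be shown with a version-portable sketch. The name temp_delete_later is hypothetical, and the fallback branch for older Python versions is an assumption about how such behavior can be emulated, not necessarily how the library does it.

```python
import contextlib
import os
import sys
import tempfile


@contextlib.contextmanager
def temp_delete_later(**kwargs):
    """Hypothetical sketch: a named temp file removed on exit, not on close."""
    if sys.version_info >= (3, 12):
        with tempfile.NamedTemporaryFile(
                delete=True, delete_on_close=False, **kwargs) as tf:
            yield tf
    else:
        # Emulation for older versions: disable automatic deletion entirely
        # and unlink the file ourselves when the block ends.
        tf = tempfile.NamedTemporaryFile(delete=False, **kwargs)
        try:
            yield tf
        finally:
            tf.close()
            with contextlib.suppress(FileNotFoundError):
                os.unlink(tf.name)
```

This makes it possible to close the file, hand its name to another function or process, and still have it cleaned up at the end of the with block.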
- igbpyutils.file.simple_perms(st_mode: int, *, group_write: bool = False) → tuple[int, int]
This function tests a file’s permission bits to see if they are in a small set of “simple” permissions and suggests new permission bits if they are not.
Deprecated since version 0.5.0: Use https://pypi.org/project/simple-perms/ instead.
The set of “simple” permissions is (0o444, 0o555, 0o644, 0o755) or, when group_write is True, (0o444, 0o555, 0o664, 0o775).
- Parameters:
st_mode – The file’s mode bits from os.stat_result.st_mode, such as returned by os.lstat() or pathlib.Path.lstat().
group_write – When True, suggest that files / directories writable by the user should be writable by the group too.
- Returns:
A tuple consisting of the file’s current permission and a suggested permission to use instead, based on the user’s permission bits and whether the file is a directory or not. The two values may be equal indicating that no change is suggested. No changes are suggested for symbolic links.
- igbpyutils.file.simple_cache(cache_file: str | PathLike, *, verbose: bool = False) → Callable[[Callable[[], _T]], Callable[[], _T]]
A very basic caching decorator for functions that take no arguments, intended for caching data that is expensive to generate.
On the first call of the function, its return value is saved to the specified file on disk via pickle, and on subsequent calls that file is loaded instead of calling the wrapped function. The original function can be called via the __wrapped__ attribute on the outer function. Currently, the only way to clear the cache is by deleting the file.
No file locking or other synchronization is performed, so this is likely not safe for threading or multiple processes.
No type checking is performed on the data loaded from the file.
Warning
Please see the security warnings in the pickle documentation!
For much more powerful caching and memoization, look at something like diskcache or similar modules.
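The caching scheme described above can be sketched in a few lines. This is an illustration with the hypothetical name simple_cache_sketch, not the library's code; it omits the verbose option and, like the original, does no locking and trusts the pickle file completely.

```python
import functools
import pickle
from pathlib import Path


def simple_cache_sketch(cache_file):
    """Hypothetical sketch: cache a zero-argument function's result on disk."""
    cache_path = Path(cache_file)

    def decorator(func):
        @functools.wraps(func)  # also sets wrapper.__wrapped__ = func
        def wrapper():
            if cache_path.exists():
                # Only load pickle files from sources you trust.
                with cache_path.open("rb") as fh:
                    return pickle.load(fh)
            result = func()
            with cache_path.open("wb") as fh:
                pickle.dump(result, fh)
            return result
        return wrapper
    return decorator
```

Calling the decorated function a second time reads the file instead of recomputing, and the undecorated function remains reachable through the __wrapped__ attribute.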