xmu.io
Defines objects used to read and write XML for Axiell EMu
- class xmu.io.EMuReader(path: str | ~pathlib.Path, rec_class: ~collections.abc.Callable = <class 'dict'>, json_path: str | ~pathlib.Path = None)[source]
Bases:
object
Read records from an EMu XML file into dicts
- Parameters:
path (str | Path) – path to a file or directory
rec_class (Callable) – class used to create each record. Defaults to dict.
json_path (str or Path) – path to a JSON file used to cache records for faster reading
- path
path to a file or directory
- Type:
str | Path
- json_path
path to a JSON file used to cache records for faster reading
- Type:
str | Path
- files
list of file-like objects, each of which is an EMu XML file
- Type:
list
- module
the name of an EMu module
- Type:
str
- config
module-wide configuration parameters. Set automatically when an EMuConfig object is created.
- Type:
- schema
info about a specific EMu configuration. Set automatically when an EMuSchema object is created.
- Type:
- from_file() Generator[dict][source]
Reads data from file, using JSON if possible
- Yields:
dict – EMu record
- from_xml(start: int = 0, limit: int = None) Generator[dict][source]
Reads data from XML
- Parameters:
start (int) – index of record to start processing
limit (int) – number of records to process from start. If omitted, all records are processed.
- Yields:
dict – EMu record
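The start and limit parameters describe a window over the stream of records. A minimal sketch of that windowing, assuming the records come from any iterator; iter_window is a hypothetical helper written for illustration, not part of xmu, and the real method may skip records differently:

```python
from itertools import islice
from typing import Iterable, Iterator, Optional


def iter_window(
    records: Iterable, start: int = 0, limit: Optional[int] = None
) -> Iterator:
    """Yield up to `limit` records beginning at index `start` (hypothetical helper)."""
    # With no limit, read through to the end of the stream
    stop = None if limit is None else start + limit
    yield from islice(records, start, stop)


# Stand-in records; a real reader would yield these from an XML file
records = ({"irn": i} for i in range(10))
window = list(iter_window(records, start=2, limit=3))
```

Because the window is applied lazily, records before start are still consumed from the source but never returned.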
- from_xml_parallel(callback: Callable, num_parts: int = 64, handle_repeated_keys: str = 'overwrite') Any[source]
Reads data from XML in parallel
Experimental. Works by creating temporary copies of the XML file, then reading from those files in parallel. Seems to work best with a small number of copies.
- Parameters:
callback (function) – function to run on the import file
num_parts (int) – number of parts to split the file into
handle_repeated_keys (str) – defines how to handle keys that repeat across dicts returned by different jobs. Must be one of ‘combine’ (which combines entries in a list), ‘keep’ (which keeps the first key found), ‘overwrite’ (which overwrites the existing key), ‘raise’ (which raises a KeyError), or ‘sum’ (which sums integer values). Ignored if callback does not return a dict.
- Yields:
Any – result of callback function combined across jobs. If dict, results are combined into a single dict. If list, results are combined into a single list. If another type, returns a list of results returned by the callback.
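The handle_repeated_keys options can be illustrated with a small merge function. This is a sketch under stated assumptions, not xmu's implementation: merge_dicts is a hypothetical helper, and it assumes that ‘combine’ wraps every value in a list, including keys that appear only once.

```python
from typing import Iterable


def merge_dicts(results: Iterable[dict], handle_repeated_keys: str = "overwrite") -> dict:
    """Combine dicts returned by several jobs (hypothetical helper, not xmu's code)."""
    options = {"combine", "keep", "overwrite", "raise", "sum"}
    if handle_repeated_keys not in options:
        raise ValueError(f"handle_repeated_keys must be one of {sorted(options)}")
    merged = {}
    for result in results:
        for key, val in result.items():
            if key not in merged:
                # First sighting: 'combine' starts a list, other modes store the value
                merged[key] = [val] if handle_repeated_keys == "combine" else val
            elif handle_repeated_keys == "combine":
                merged[key].append(val)
            elif handle_repeated_keys == "overwrite":
                merged[key] = val
            elif handle_repeated_keys == "sum":
                merged[key] += val
            elif handle_repeated_keys == "raise":
                raise KeyError(f"Repeated key: {key!r}")
            # 'keep' leaves the first value in place
    return merged


# Stand-in results from two jobs that share the key "a"
jobs = [{"a": 1, "b": 2}, {"a": 3}]
combined = merge_dicts(jobs, "combine")
kept = merge_dicts(jobs, "keep")
overwritten = merge_dicts(jobs, "overwrite")
summed = merge_dicts(jobs, "sum")
```

A callback that counts values per job pairs naturally with ‘sum’, while one that collects irns per key pairs with ‘combine’.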
- from_json(chunk_size: int = 2097152) Generator[dict][source]
Reads data from JSON
- Parameters:
chunk_size (int) – size of chunk to use when reading the file
- Yields:
dict – EMu record
- to_csv(path: str, **kwargs) None[source]
Writes records in reader object to CSV
- Parameters:
path (str) – path to write the CSV file
kwargs – any keyword argument accepted by open()
- to_json(path: str = None, **kwargs) None[source]
Writes JSON version of XML to file
- Parameters:
path (str) – path to write JSON
kwargs – keyword arguments for json.dump()
- counts() dict | int[source]
Counts the number of records in each file
- Returns:
If one file, the number of records. Otherwise a dict of path: counts for each file.
- Return type:
dict | int
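The dict | int return type means callers need to handle both shapes. The sketch below reproduces that behavior with the standard library only. It assumes an EMu export consists of a root table element with one tuple child per record; count_records and the sample XML are hypothetical and are not taken from xmu.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Union


def count_records(xml_by_path: Dict[str, str]) -> Union[int, Dict[str, int]]:
    """Count top-level records per file (hypothetical helper, not xmu's code)."""
    counts = {
        # findall("tuple") matches direct children only, so tuples in nested
        # tables are not counted as records
        path: len(ET.fromstring(xml).findall("tuple"))
        for path, xml in xml_by_path.items()
    }
    if len(counts) == 1:
        return next(iter(counts.values()))
    return counts


# Illustrative XML strings standing in for files on disk
ONE = '<table name="ecatalogue"><tuple><atom name="irn">1</atom></tuple></table>'
TWO = (
    '<table name="ecatalogue">'
    '<tuple><atom name="irn">1</atom>'
    '<table name="Notes_tab"><tuple><atom name="Note">x</atom></tuple></table>'
    "</tuple>"
    '<tuple><atom name="irn">2</atom></tuple>'
    "</table>"
)
single = count_records({"a.xml": TWO})
several = count_records({"a.xml": ONE, "b.xml": TWO})
```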
- class xmu.io.FileLike(filelike: str | ZipInfo, zip_file: ZipFile = None)[source]
Bases:
object
Open text and zip files using the same interface
- Parameters:
filelike (str | zipfile.ZipInfo) – either the path to an XML file or a ZipInfo object
zip_file (zipfile.ZipFile) – if filelike is a ZipInfo object, the zip file containing that object
- path
path to file
- Type:
str
- zip_info
member of a zip archive
- Type:
zipfile.ZipInfo
- zip_file
the zip file containing the ZipInfo object
- Type:
zipfile.ZipFile
- property filename: str
Name of the file-like object
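One way to give plain files and zip members the same interface is sketched below. TextOrZipMember is a hypothetical stand-in written for illustration; its open() method in particular is an assumption, since the methods of FileLike are not listed here.

```python
import io
import os
import tempfile
import zipfile
from typing import IO, Optional, Union


class TextOrZipMember:
    """Hypothetical stand-in for a file-like wrapper; not xmu's FileLike."""

    def __init__(
        self,
        filelike: Union[str, zipfile.ZipInfo],
        zip_file: Optional[zipfile.ZipFile] = None,
    ):
        if isinstance(filelike, zipfile.ZipInfo):
            if zip_file is None:
                raise ValueError("zip_file is required for a ZipInfo object")
            self.path, self.zip_info, self.zip_file = None, filelike, zip_file
        else:
            self.path, self.zip_info, self.zip_file = filelike, None, None

    @property
    def filename(self) -> str:
        if self.zip_info is not None:
            return self.zip_info.filename
        return os.path.basename(self.path)

    def open(self, encoding: str = "utf-8") -> IO[str]:
        if self.zip_info is not None:
            # ZipFile.open() returns a binary stream, so wrap it to read text
            return io.TextIOWrapper(self.zip_file.open(self.zip_info), encoding=encoding)
        return open(self.path, encoding=encoding)


# Read the same content from a plain file and from a zip archive
with tempfile.TemporaryDirectory() as tmp:
    xml_path = os.path.join(tmp, "records.xml")
    with open(xml_path, "w", encoding="utf-8") as f:
        f.write("<table/>")
    zip_path = os.path.join(tmp, "records.zip")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.write(xml_path, "records.xml")
    with zipfile.ZipFile(zip_path) as zf:
        sources = [TextOrZipMember(xml_path)]
        sources += [TextOrZipMember(info, zf) for info in zf.infolist()]
        contents = []
        for src in sources:
            with src.open() as f:
                contents.append((src.filename, f.read()))
```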
- xmu.io.clean_xml(path: str, encoding: str = 'utf-8') Path[source]
Removes restricted characters from XML file
- Parameters:
path (str) – path to the XML file to clean
encoding (str) – encoding for reading/writing XML
- Returns:
path to clean XML file
- Return type:
pathlib.Path
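A sketch of this kind of cleanup follows. It assumes "restricted characters" means characters outside the XML 1.0 Char production, and the "_clean" output name is a hypothetical naming scheme; xmu's clean_xml may target a different set of characters and write to a different location.

```python
import re
import tempfile
from pathlib import Path

# Characters outside the XML 1.0 Char production (assumed target of the cleanup)
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(path, encoding: str = "utf-8") -> Path:
    """Write a copy of an XML file without invalid characters (hypothetical helper)."""
    src = Path(path)
    dst = src.with_name(f"{src.stem}_clean{src.suffix}")  # hypothetical naming scheme
    # newline="" keeps the original line endings untouched
    with open(src, encoding=encoding, newline="") as fin, open(
        dst, "w", encoding=encoding, newline=""
    ) as fout:
        for line in fin:
            fout.write(INVALID_XML_CHARS.sub("", line))
    return dst


# Clean a small file containing a backspace character (\x08)
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "export.xml"
    raw.write_text('<atom name="Note">bad\x08char</atom>\n', encoding="utf-8")
    cleaned = strip_invalid_xml_chars(raw)
    cleaned_name = cleaned.name
    cleaned_text = cleaned.read_text(encoding="utf-8")
```

Processing the file line by line keeps memory use flat, which matters for large exports.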
- xmu.io.write_csv(records: list['EMuRecord'], path: str, **kwargs) None[source]
Writes records to CSV
- Parameters:
records (list-like) – list of EMuRecords to be written
path (str) – path to write the CSV file
kwargs – any keyword argument accepted by open()
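Forwarding kwargs to open() might look like the sketch below. It assumes the records are already flat dicts, and write_flat_csv is a hypothetical helper; how xmu flattens nested EMu records into columns is not described here.

```python
import csv
import os
import tempfile
from typing import Iterable


def write_flat_csv(records: Iterable[dict], path: str, **kwargs) -> None:
    """Write flat dicts to CSV, forwarding kwargs to open() (hypothetical helper)."""
    records = list(records)
    # Union of keys in first-seen order, so records may have different fields
    fieldnames = list(dict.fromkeys(key for rec in records for key in rec))
    kwargs.setdefault("encoding", "utf-8")
    kwargs.setdefault("newline", "")  # recommended by the csv module
    with open(path, "w", **kwargs) as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(records)


# Write two stand-in records with different fields, then read them back
with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "records.csv")
    write_flat_csv(
        [{"irn": 1, "NamFirst": "Ada"}, {"irn": 2, "NamLast": "Lovelace"}], out
    )
    with open(out, encoding="utf-8", newline="") as f:
        rows = list(csv.reader(f))
```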
- xmu.io.write_import(*args, **kwargs) None[source]
Writes records to an EMu import file
Alias for write_xml()
- xmu.io.write_xml(records, path, **kwargs) None[source]
Writes records to an EMu import file
- Parameters:
records (list-like) – list of EMuRecords to be imported
path (str) – path to write the import file
kwargs – any keyword argument accepted by the to_xml() method of the record class
- xmu.io.write_group(records: str, path: str, irn: int = None, name: str = None) None[source]
Writes an import for the egroups module
- Parameters:
records (list[EMuRecord]) – list of EMuRecords, each of which specifies an irn
path (str) – path to write the import file
irn (int) – the irn of an existing egroups record (updates only)
name (str) – the name of the group