xmu.io

Defines objects used to read and write XML for Axiell EMu

class xmu.io.EMuReader(path: str | pathlib.Path, rec_class: collections.abc.Callable = dict, json_path: str | pathlib.Path = None)

Bases: object

Read records from an EMu XML file into dicts

Parameters:
  • path (str | Path) – path to a file or directory

  • rec_class (Callable) – class used to create each record. Defaults to dict.

  • json_path (str | Path) – path to a JSON file used to cache records for faster reading

path

path to a file or directory

Type:

str | Path

json_path

path to a JSON file used to cache records for faster reading

Type:

str | Path

files

list of file-like objects, each of which is an EMu XML file

Type:

list

module

the name of an EMu module

Type:

str

config

module-wide configuration parameters. Set automatically when an EMuConfig object is created.

Type:

EMuConfig

schema

info about a specific EMu configuration. Set automatically when an EMuSchema object is created.

Type:

EMuSchema

from_file() → Generator[dict]

Reads data from file, using JSON if possible

Yields:

dict – EMu record

from_xml(start: int = 0, limit: int = None) → Generator[dict]

Reads data from XML

Parameters:
  • start (int) – index of record to start processing

  • limit (int) – number of records to process from start. If omitted, all records are processed.

Yields:

dict – EMu record
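The start and limit parameters describe a window over the stream of records. The sketch below shows that windowing with itertools.islice. The helper window() is hypothetical, is not part of xmu, and makes no claim about how from_xml() is implemented internally.

```python
from itertools import islice


def window(records, start=0, limit=None):
    """Yield at most `limit` records, skipping the first `start`.

    Hypothetical helper for illustration only; not part of xmu.
    """
    # islice takes a stop index, so convert the count into one
    stop = None if limit is None else start + limit
    yield from islice(records, start, stop)


# Stand-in for a stream of records
records = ({"irn": i} for i in range(10))
print(list(window(records, start=2, limit=3)))
```

With limit left as None, every record from start onward is yielded, which matches the documented behavior of omitting limit.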

from_xml_parallel(callback: Callable, num_parts: int = 64, handle_repeated_keys: str = 'overwrite') → Any

Reads data from XML in parallel

Experimental. Works by creating temporary copies of the XML file, then reading from those files in parallel. Seems to work best with a small number of copies.

Parameters:
  • callback (function) – function to run on the import file

  • num_parts (int) – number of parts to split the file into

  • handle_repeated_keys (str) – defines how to handle keys that repeat across dicts returned by different jobs. Must be one of ‘combine’ (which combines entries in a list), ‘keep’ (which keeps the first key found), ‘overwrite’ (which overwrites the existing key), ‘raise’ (which raises a KeyError), or ‘sum’ (which sums integer values). Ignored if callback does not return a dict.

Returns:

Any – result of callback function combined across jobs. If dict, results are combined into a single dict. If list, results are combined into a single list. If another type, returns a list of results returned by the callback.
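The handle_repeated_keys options are easiest to compare side by side. The sketch below merges the dicts returned by several jobs under each option. merge_results() is a hypothetical helper, not xmu code, and the exact shape produced by ‘combine’ (here, every value wrapped in a list) is an assumption.

```python
def merge_results(results, handle_repeated_keys="overwrite"):
    """Merge dicts from separate jobs into one dict.

    Hypothetical helper for illustration only; not part of xmu.
    """
    options = {"combine", "keep", "overwrite", "raise", "sum"}
    if handle_repeated_keys not in options:
        raise ValueError(f"handle_repeated_keys must be one of {sorted(options)}")

    merged = {}
    for result in results:
        for key, val in result.items():
            if key not in merged:
                # Assumption: 'combine' always stores values in a list
                merged[key] = [val] if handle_repeated_keys == "combine" else val
            elif handle_repeated_keys == "combine":
                merged[key].append(val)
            elif handle_repeated_keys == "keep":
                continue  # the first value seen wins
            elif handle_repeated_keys == "overwrite":
                merged[key] = val  # the last value seen wins
            elif handle_repeated_keys == "raise":
                raise KeyError(f"repeated key: {key!r}")
            else:  # "sum"
                merged[key] += val
    return merged


jobs = [{"a": 1, "b": 2}, {"a": 10}]
for option in ("combine", "keep", "overwrite", "sum"):
    print(option, merge_results(jobs, option))
```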

from_json(chunk_size: int = 2097152) → Generator[dict]

Reads data from JSON

Parameters:

chunk_size (int) – size of chunk to use when reading the file

Yields:

dict – EMu record

to_csv(path: str, **kwargs) → None

Writes records in reader object to CSV

Parameters:
  • path (str) – path to write the CSV file

  • kwargs – any keyword argument accepted by open()

to_json(path: str = None, **kwargs) → None

Writes JSON version of XML to file

Parameters:
  • path (str) – path to write JSON

  • kwargs – keyword arguments for json.dump()

counts() → dict | int

Counts the number of records in each file

Returns:

If the reader has one file, the number of records in that file. Otherwise, a dict mapping each path to its record count.

Return type:

dict | int
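Because the return type depends on how many files were read, calling code usually has to handle both shapes. A minimal sketch, assuming only the documented dict | int return value; total_records() is a hypothetical helper and not part of xmu.

```python
def total_records(counts):
    """Collapse the result of counts() into a single total.

    Hypothetical helper for illustration only; not part of xmu.
    """
    # A single file gives a bare int; several files give {path: count}
    if isinstance(counts, int):
        return counts
    return sum(counts.values())


print(total_records(120))
print(total_records({"part1.xml": 100, "part2.xml": 20}))
```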

map_to_select()

Maps report schema to select parameter for EMu API

verify_group(path: str | list | tuple, module: str = None) → None

Verifies that all fields in a group are present in the export

Parameters:
  • path (str | list | tuple) – the path to one field in a group

  • module (str) – the name of an EMu module

Raises:

ValueError – if one or more fields are missing
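The check amounts to a set difference between the fields a group should contain and the fields actually present in the export. A rough sketch of that idea follows. check_group() and the field names are hypothetical, and the real method looks up group membership itself rather than taking it as an argument.

```python
def check_group(exported_fields, group_fields):
    """Raise ValueError if any field in a group is absent from the export.

    Hypothetical helper for illustration only; not part of xmu.
    """
    missing = sorted(set(group_fields) - set(exported_fields))
    if missing:
        raise ValueError(f"Group fields missing from export: {missing}")


# Placeholder field names, not taken from a real EMu schema
group = ["FieldA_tab", "FieldB_tab", "FieldC_tab"]
check_group(["irn", "FieldA_tab", "FieldB_tab", "FieldC_tab"], group)  # passes

try:
    check_group(["irn", "FieldA_tab"], group)
except ValueError as exc:
    print(exc)
```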

report_progress(by: str = 'time', at: int = 5) → None

Prints progress notification messages when reading a file

Parameters:
  • by (str) – either “count” or “time”

  • at (int) – number of seconds (if by time) or number of records (if by count)
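The two modes differ only in what triggers a message: a record count or an elapsed interval. The sketch below wraps any iterable of records and reports either way. with_progress() and its notify parameter are hypothetical and are not the xmu implementation.

```python
import time


def with_progress(records, by="time", at=5, notify=print):
    """Yield records unchanged, reporting progress by count or by time.

    Hypothetical helper for illustration only; not part of xmu.
    """
    if by not in {"count", "time"}:
        raise ValueError('by must be either "count" or "time"')

    last = time.monotonic()
    for i, rec in enumerate(records, 1):
        yield rec
        if by == "count":
            # Report every `at` records
            if i % at == 0:
                notify(f"{i:,} records read")
        elif time.monotonic() - last >= at:
            # Report when at least `at` seconds have passed since the last message
            notify(f"{i:,} records read")
            last = time.monotonic()


for _ in with_progress(range(10), by="count", at=4):
    pass
```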

class xmu.io.FileLike(filelike: str | ZipInfo, zip_file: ZipFile = None)

Bases: object

Open text and zip files using the same interface

Parameters:
  • filelike (str | zipfile.ZipInfo) – either the path to an XML file or a ZipInfo object

  • zip_file (zipfile.ZipFile) – if filelike is a ZipInfo object, the zip file containing that object

path

path to file

Type:

str

zip_info

member of a zip archive

Type:

zipfile.ZipInfo

zip_file

the zip file containing the ZipInfo object

Type:

zipfile.ZipFile

property filename: str

Name of the file-like object

open(mode: str = 'r', encoding: str = None)

Opens a file or ZipInfo object

getmtime() → float

Returns last modification timestamp from a file or ZipInfo object
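To illustrate what a shared interface over plain files and zip members can look like, here is a minimal sketch built on the standard zipfile module. TextOrZipMember is a hypothetical stand-in, not the FileLike source, and details such as the default encoding and how the zip timestamp is converted are assumptions.

```python
import io
import os
import zipfile
from datetime import datetime


class TextOrZipMember:
    """Hypothetical stand-in for illustration only; not part of xmu."""

    def __init__(self, filelike, zip_file=None):
        if isinstance(filelike, zipfile.ZipInfo):
            if zip_file is None:
                raise ValueError("zip_file is required for a ZipInfo object")
            self.path, self.zip_info, self.zip_file = None, filelike, zip_file
        else:
            self.path, self.zip_info, self.zip_file = str(filelike), None, None

    @property
    def filename(self):
        if self.zip_info is not None:
            return self.zip_info.filename
        return os.path.basename(self.path)

    def open(self, encoding="utf-8"):
        if self.zip_info is not None:
            # ZipFile.open() returns a binary stream, so wrap it for text
            raw = self.zip_file.open(self.zip_info)
            return io.TextIOWrapper(raw, encoding=encoding)
        return open(self.path, encoding=encoding)

    def getmtime(self):
        if self.zip_info is not None:
            # ZipInfo.date_time is a (year, month, day, hour, min, sec) tuple
            return datetime(*self.zip_info.date_time).timestamp()
        return os.path.getmtime(self.path)
```

Either kind of object can then be read with the same `with obj.open() as f:` block, which is the point of wrapping both behind one class.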

xmu.io.clean_xml(path: str, encoding: str = 'utf-8') → Path

Removes restricted characters from XML file

Parameters:
  • path (str) – path to the XML file to clean

  • encoding (str) – encoding for reading/writing XML

Returns:

path to clean XML file

Return type:

pathlib.Path
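The reference does not list which characters are treated as restricted. As a rough illustration of the general technique, the sketch below drops every character that falls outside the Char production of XML 1.0. strip_invalid_xml_chars() is a hypothetical helper that works on a string, not on a file, and it may not match the exact set that clean_xml() removes.

```python
import re

# Characters permitted by the XML 1.0 Char production; anything else is dropped.
# Assumption: this approximates what "restricted characters" means here.
_INVALID_XML_CHARS = re.compile(
    "[^\x09\x0a\x0d\x20-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]"
)


def strip_invalid_xml_chars(text):
    """Remove characters that are not legal in an XML 1.0 document.

    Hypothetical helper for illustration only; not part of xmu.
    """
    return _INVALID_XML_CHARS.sub("", text)


dirty = "<name>Quartz\x00 var.\x0b amethyst</name>"
print(strip_invalid_xml_chars(dirty))
```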

xmu.io.write_csv(records: list['EMuRecord'], path: str, **kwargs) → None

Writes records to CSV

Parameters:
  • records (list-like) – list of EMuRecords to be written

  • path (str) – path to write the CSV file

  • kwargs – any keyword argument accepted by open()

xmu.io.write_import(*args, **kwargs) → None

Writes records to an EMu import file

Alias for write_xml()

xmu.io.write_xml(records, path, **kwargs) → None

Writes records to an EMu import file

Parameters:
  • records (list-like) – list of EMuRecords to be imported

  • path (str) – path to write the import file

  • kwargs – any keyword argument accepted by the to_xml() method of the record class

xmu.io.write_group(records: str, path: str, irn: int = None, name: str = None) → None

Writes an import for the egroups module

Parameters:
  • records (list[EMuRecord]) – list of EMuRecords, each of which specifies an irn

  • path (str) – path to write the import file

  • irn (int) – the irn of an existing egroups record (updates only)

  • name (str) – the name of the group