User guide

Install

Install xmu with pip:

pip install xmu

Or install from the GitHub repository using git and pip:

git clone https://github.com/NMNH-IDSC/xmu
cd xmu
pip install .
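
To confirm that the install succeeded, you can ask Python's packaging metadata for the installed version. This is a minimal sketch using only the standard library; the helper name installed_version is made up for illustration and is not part of xmu.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a version string if xmu is installed, otherwise None
    print(installed_version("xmu"))
```

A result of None suggests that pip installed the package into a different environment than the one running the script.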

Quickstart

from xmu import EMuReader, EMuRecord, EMuSchema, write_xml

# Loading an EMu schema file allows xmu to validate data, coerce data to
# the proper type, and manage grids
EMuSchema("path/to/schema.pl")

# Read records from an XML export file to dicts using EMuReader
records = []
reader = EMuReader("xmldata.xml")
for rec in reader:

    # Convert dicts to EMuRecords to access some extra functionality
    rec = EMuRecord(rec, module=reader.module)

    rec["EmuRef.irn"]  # use dot paths to retrieve keys
    rec["EmuBadKey"]   # keys not found in the schema throw a special error
    rec["EmuDate"]     # dates use EMuDate wrapper to preserve date format
    rec["EmuFloat"]    # floats use EMuFloat wrapper to preserve precision

    # Access grids defined in the schema using any member field
    grid = rec.grid("EmuTable_tab")
    grid[0]                          # get rows by index
    grid[{"EmuTable_tab": "value"}]  # get rows where EmuTable_tab == value

    # Use EMuRecords to create or update records in EMu
    update = EMuRecord({
      "irn": rec["irn"],                   # include an irn to update a record
      "EmuString": "String",
      "EmuInteger": 100,
      "EmuFloat": 1.2,
      "EmuDate": "1970-01-01",             # dates are strings or datetime.date
      "EmuRef": {"irn": 1234567},          # references are dicts
      "EmuTable_tab": ["Row 1", "Row 2"],  # tables are lists
      "EmuRef_tab": [{"irn": 1234567}],    # ref tables are lists of dicts
      "EmuNested_nesttab": [["Nested"]],   # nested tables are lists of lists
      "EmuBadKey": ["Bad format"],         # bad keys or formats throw an error
    }, module=reader.module)

    # Create a list of records to import
    records.append(update)

# Write the XML import file from the list of EMu records
write_xml(records, "update.xml")
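
The comment about floats preserving precision can be illustrated with the standard library. A plain Python float does not remember how many decimal places the original text had, so writing it back out can change the apparent precision of the value. The sketch below uses decimal.Decimal only as a stand-in to show the problem; it is not a description of how EMuFloat is implemented.

```python
from decimal import Decimal

raw = "1.20"  # a value as it might appear as text in an export file

as_float = float(raw)
as_decimal = Decimal(raw)

float_text = str(as_float)      # the trailing zero is lost
decimal_text = str(as_decimal)  # the original digits are kept

print(float_text, decimal_text)
```

The same concern applies to dates: a partial date read from text cannot be stored as a full datetime.date without inventing the missing parts, which is presumably why a wrapper that remembers the original format is useful.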

You can use the experimental xmu.io.EMuReader.from_xml_parallel method to read large XML files more quickly. For example, to create a dict mapping IRNs to records:

def callback(path):
    reader = EMuReader(path)
    results = {}
    for rec in reader:
        rec = EMuRecord(rec, module=reader.module)
        results[rec["irn"]] = rec
    return results

results = EMuReader("xmldata.xml").from_xml_parallel(callback)
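
This guide does not describe how from_xml_parallel splits the file or combines what each callback returns. The general pattern that such a callback plugs into, running a function over several parts and merging the dicts it returns, can be sketched with the standard library. Everything here is an assumption made for illustration: map_and_merge and parts are hypothetical names, plain dicts stand in for records, and threads are used for simplicity where a real implementation might use processes.

```python
from concurrent.futures import ThreadPoolExecutor


def callback(part):
    """Stand-in for the callback above: map IRNs to records for one part."""
    return {rec["irn"]: rec for rec in part}


def map_and_merge(parts, callback):
    """Run callback on each part concurrently and merge the returned dicts."""
    merged = {}
    with ThreadPoolExecutor() as executor:
        # executor.map yields results in the same order as the input parts
        for result in executor.map(callback, parts):
            merged.update(result)
    return merged


# Hypothetical data: each inner list plays the role of one part of a file
parts = [
    [{"irn": 1, "EmuString": "A"}, {"irn": 2, "EmuString": "B"}],
    [{"irn": 3, "EmuString": "C"}],
]
results = map_and_merge(parts, callback)
print(sorted(results))
```

Because the merge uses dict.update, a key that appears in more than one part keeps the value from the later part. A callback that returns a dict keyed by IRN therefore works best when each IRN occurs only once in the export.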

Using the EMu REST API

EMu 9 includes a REST API that allows users to interact programmatically with live data. This package supports the search and retrieve endpoints of that API.

Create an instance of EMuAPI:

from xmu import EMuAPI

api = EMuAPI(username="user", password="pass")
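
The example above hard-codes the username and password. One common alternative is to read them from environment variables so that they stay out of source control. This is a minimal sketch, not an xmu feature: load_credentials and the variable names EMU_USERNAME and EMU_PASSWORD are hypothetical, and the only xmu call assumed is the EMuAPI constructor with the username and password keywords shown above.

```python
import os


def load_credentials(environ=os.environ):
    """Read the EMu username and password from environment variables.

    EMU_USERNAME and EMU_PASSWORD are hypothetical names chosen for this
    sketch. A KeyError is raised if either variable is missing.
    """
    return {
        "username": environ["EMU_USERNAME"],
        "password": environ["EMU_PASSWORD"],
    }


# With xmu installed, the result can be passed to the constructor shown above:
# api = EMuAPI(**load_credentials())
```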

Retrieve a record from Catalog by IRN:

api.retrieve("ecatalogue", 1234567)

Search for a record in Catalog:

api.search("ecatalogue", {"CatNumber": 1234})

Please see this notebook for additional information about using the API.