Datatractor Schema: Schemas for Metadata Extractors

A repository at schemarepo, containing the LinkML-based schemas backing the yardrepo.

The schemas implemented here are machine-actionable. They are used by the Registry to validate entries; a reference implementation demonstrating their practical use is shown in the beamrepo.

Note

This work is a continuation of the MaRDA WG7 on Automated Metadata Extractors.

Contents

The repository contains a schema with two user-facing classes:

  • The FileType class, used to specify the types of files passed to the extractors by users.

  • The Extractor class, used to specify the download, installation, and usage instructions, allowing for machine execution of the defined extractor/parser code, as well as a list of FileTypes compatible with the Extractor.
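To make the relationship between the two classes concrete, here is a minimal Python sketch. It is not the LinkML schema itself: the field names (`id`, `description`, `supported_filetypes`, `installation`, `usage`), the `supports` helper, and the example values are all hypothetical and chosen only to mirror the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileType:
    """Hypothetical stand-in for the schema's FileType class."""
    id: str                 # assumed identifier field, not taken from the schema
    description: str = ""   # assumed free-text description


@dataclass
class Extractor:
    """Hypothetical stand-in for the schema's Extractor class."""
    id: str
    # IDs of the FileTypes this extractor is compatible with (assumed shape).
    supported_filetypes: List[str] = field(default_factory=list)
    # Commands for installing and running the extractor (assumed shape).
    installation: List[str] = field(default_factory=list)
    usage: List[str] = field(default_factory=list)


def supports(extractor: Extractor, filetype: FileType) -> bool:
    """Return True if the extractor lists the given file type as compatible."""
    return filetype.id in extractor.supported_filetypes


# Illustrative values only; none of these names come from the registry.
example_filetype = FileType(id="example-format", description="An example file format")
example_extractor = Extractor(
    id="example-extractor",
    supported_filetypes=["example-format"],
    installation=["pip install example-extractor"],
    usage=["example-extractor {input_path}"],
)
```

With these definitions, `supports(example_extractor, example_filetype)` checks whether a given file type can be handed to a given extractor, which is the kind of lookup the list of compatible FileTypes on an Extractor makes possible.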