Enum: UsageScope

This enumeration specifies the scope of extraction performed by the Extractor.

Comments

  • Currently allows the Extractor->usage to be described in one of two modes, meta-only and meta+data.

  • In practice, this means that extractors may have multiple defined usage modes for each UsageScope and each UsageType.

  • The default is to assume the ‘worst case’, i.e. that the Extractor returns as much of the data in the input file as it supports, plus potentially additional analysis.
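
As a minimal sketch of the default described above, the snippet below mirrors the two permissible values in a Python enum and falls back to `meta+data` when no scope is declared. The class layout and the `resolve_scope` helper are hypothetical names introduced for illustration only. They are not part of the schema or of any datatractor package, and reading the "worst case" as the `meta+data` scope is an assumption.

```python
from enum import Enum
from typing import Optional


class UsageScope(Enum):
    """Illustrative mirror of the two permissible values of UsageScope."""

    META_ONLY = "meta-only"
    META_AND_DATA = "meta+data"


def resolve_scope(declared: Optional[str]) -> UsageScope:
    """Return the declared scope, or assume the 'worst case' if none is given.

    Hypothetical helper: treats a missing scope as a full extraction.
    """
    if declared is None:
        return UsageScope.META_AND_DATA
    # Raises ValueError for anything other than the two permissible values.
    return UsageScope(declared)
```

A consumer could then call `resolve_scope(None)` for a usage entry that declares no scope and plan for a full extraction, while `resolve_scope("meta-only")` selects the lighter mode.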

URI: UsageScope

Permissible Values

Value

Description

Comments

meta-only

This usage scope only returns metadata associated with the file, with the
interpretation of “metadata” left up to the Extractor.

This may include, but is not limited to:
- timestamps,
- column headings,
- data types, and
- data shapes.

The size of data returned in this mode should be limited and will typically be
much smaller than the file size.

This metadata SHOULD be expressed as a JSON-serializable document that can, for
example, be loaded into Python with only standard library imports.

meta+data

This usage scope refers to a “full” extraction of the file, whereby the
extractor tries to faithfully interpret and return all supported information
contained within the file.

This may include, but is not limited to:
- binary data,
- large multi-dimensional arrays,
- plus all of the associated metadata from a meta-only extraction.

The size of the data returned in this mode is not limited and could be
significantly larger than the input file size (in the case of binary data, or
additional analysis being performed upon extraction).
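
To illustrate the JSON-serializable recommendation for `meta-only` output, here is a hedged sketch of what such a metadata document might look like and how it round-trips using only the Python standard library. The keys and values are invented for illustration. The schema leaves the interpretation of "metadata" to each Extractor, so no particular layout is implied.

```python
import json

# Hypothetical metadata for a small tabular file; every key and value here
# is an assumption made for illustration, not a layout required by the schema.
metadata = {
    "timestamps": {"created": "2024-01-01T12:00:00+00:00"},
    "column_headings": ["time", "voltage"],
    "data_types": {"time": "float64", "voltage": "float64"},
    "data_shapes": {"time": [1000], "voltage": [1000]},
}

# Serialize to a JSON string, then load it back with the standard library only.
serialized = json.dumps(metadata)
restored = json.loads(serialized)
```

Because the document contains only strings, lists, and dictionaries, the restored object compares equal to the original, and the serialized form stays small relative to the file it describes.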

Slots

Name

Description

scope

Specification of extraction scope

Identifier and Mapping Information

Schema Source

  • from schema: https://datatractor.github.io/schema/main/datatractor_schema/

LinkML Source

name: UsageScope
description: This enumeration specifies the scope of extraction performed by the `Extractor`.
comments:
- Currently allows the `Extractor->usage` to be described in one of two modes, `meta-only`
  and `meta+data`.
- In practice, this means that extractors may have multiple defined usage modes for
  each `UsageScope` and each `UsageType`.
- The default is to assume the 'worst case', i.e. that the `Extractor` returns as
  much of the data in the input file as it supports, plus potentially additional analysis.
from_schema: https://datatractor.github.io/schema/main/datatractor_schema/
rank: 1000
permissible_values:
  meta-only:
    text: meta-only
    description: This usage scope only returns metadata associated with the file,
      with the interpretation of "metadata" left up to the `Extractor`.
    comments:
    - "This may include, but is not limited to:\n  - timestamps,\n  - column headings,\n\
      \  - data types, and\n  - data shapes."
    - The size of data returned in this mode should be limited and will typically
      be much smaller than the file size.
    - This metadata SHOULD be expressed as a JSON-serializable document that can,
      for example, be loaded into Python with only standard library imports.
  meta+data:
    text: meta+data
    description: This usage scope refers to a "full" extraction of the file, whereby
      the extractor tries to faithfully interpret and return all supported information
      contained within the file.
    comments:
    - "This may include, but is not limited to:\n  - binary data,\n  - large multi-dimensional\
      \ arrays,\n  - plus all of the associated metadata from a `meta-only` extraction."
    - The size of the data returned in this mode is not limited and could be significantly
      larger than the input file size (in the case of binary data, or additional analysis
      being performed upon extraction).