Metacat
Introduction
Metacat is a metadata service that makes data and metadata easy to discover, process, and manage. It supports many data sources as backends.
Datasheet
Status: 14.06.2022
| Property | Value |
|---|---|
| Homepage | https://knb.ecoinformatics.org/knb/docs/ |
| Description | https://knb.ecoinformatics.org/knb/docs/intro.html |
| Code | https://github.com/NCEAS/metacat |
| Communities | DataONE |
| Version | 2.18.0 (released on 19.05.2022) |
Features
Status: 15.02.2022
| Feature | Value | Remarks |
|---|---|---|
| Supported Schema(s) | DTD | |
| Supported Format(s) | XML | |
| Interface(s) | REST/Thrift interface | several implementations are available |
| Open Source | yes | |
| License | GPL 2.0 | |
| Versioning | yes | history of documents |
| AAI | yes | internal password file or LDAP |
| External Storage | yes | supports many storage systems as backends (Amazon S3 via Hive, Druid, Elasticsearch, Redshift, Snowflake, and MySQL) |
| Referenceable | yes | DOI |
Description
- Register Schema:
  - Support for arbitrary schemas of a specific format (e.g. JSON Schema, XSD)
  - The schema should at least be referenceable by a unique identifier.
- Update Schema:
  - Possibility to
    - work on different versions of a schema
    - adapt schemas over time
- Validate Schema:
  - Check schema for correct syntax
- Ingest Metadata:
  - Store metadata (document) in repository
    - Ideally with prior validation
- Update Metadata:
  - Possibility to update already ingested metadata (documents).
- Validate Metadata:
  - Possibility to validate documents on the basis of registered schemas.
- Search by Administrative MD:
  - Search documents by their metadata (e.g. ingest date, ingester, ...)
- Search by Content:
  - Search documents by their content
- Persistent Identifier:
  - Support for Persistent Identifiers (e.g. DOI, Handle)
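The functions listed above can be read as a small repository interface. The following is only a toy in-memory sketch of that reading, not Metacat code: `ToyRepository`, its method names, and the sample identifiers are hypothetical, and "validation" is reduced to an XML well-formedness check because the Python standard library cannot validate against a DTD.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from datetime import datetime, timezone


def is_well_formed(xml_text):
    """Stand-in for validation: only checks that the XML parses."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


@dataclass
class ToyRepository:
    """Hypothetical in-memory model of the functions described above."""
    schemas: dict = field(default_factory=dict)    # schema id -> list of versions
    documents: dict = field(default_factory=dict)  # doc id -> list of (time, ingester, text)

    def register_schema(self, schema_id, schema_text):
        """Register or update a schema; each call appends a new version."""
        versions = self.schemas.setdefault(schema_id, [])
        versions.append(schema_text)
        return len(versions)

    def ingest(self, doc_id, xml_text, ingester):
        """Ingest or update a document after the check; history is kept."""
        if not is_well_formed(xml_text):
            raise ValueError(f"document {doc_id!r} is not well-formed XML")
        entry = (datetime.now(timezone.utc), ingester, xml_text)
        self.documents.setdefault(doc_id, []).append(entry)
        return len(self.documents[doc_id])

    def search_by_ingester(self, ingester):
        """Administrative metadata search: who ingested the latest version."""
        return [d for d, hist in self.documents.items() if hist[-1][1] == ingester]

    def search_by_content(self, text):
        """Content search: plain substring match on the latest version."""
        return [d for d, hist in self.documents.items() if text in hist[-1][2]]


repo = ToyRepository()
repo.register_schema("example-dtd", "<!ELEMENT note (#PCDATA)>")
repo.ingest("doc.1", "<note>first</note>", ingester="alice")
repo.ingest("doc.1", "<note>second</note>", ingester="bob")
```

A real system would validate against the registered schema and attach a persistent identifier at ingest; both are left out here to keep the sketch short.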
Additional Features
Status: 25.02.2022
- Support for OAI-PMH (oai_dc, EML)
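OAI-PMH requests are plain HTTP requests with a `verb` argument, so a harvester mostly needs to build URLs. The sketch below only constructs such URLs and sends nothing; the base URL is a placeholder (the real endpoint path depends on the installation) and the helper name `oai_request_url` is hypothetical. `ListMetadataFormats`, `ListRecords`, and `metadataPrefix` come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption); replace with the provider URL of the deployment.
BASE_URL = "https://metacat.example.org/oai/provider"


def oai_request_url(verb, base_url=BASE_URL, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    query.update(params)
    return f"{base_url}?{urlencode(query)}"


# Ask the provider which metadata formats it offers.
formats_url = oai_request_url("ListMetadataFormats")

# Harvest all records as Dublin Core.
records_url = oai_request_url("ListRecords", metadataPrefix="oai_dc")
```

The exact `metadataPrefix` value for EML is not stated here; querying `ListMetadataFormats` first is the safe way to find it.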
Functionality
Status: 15.02.2022
| Function | Supported | Remarks |
|---|---|---|
| Register Schema | o | Store DTD(s) as package |
| Update Schema | o | |
| Validate Schema | - | |
| Ingest Metadata | + | |
| Update Metadata | + | |
| Validate Metadata | + | provide DTD(s)/package |
| Search by ... | | |
| ... Administrative MD | + | filter |
| ... Content | + | pathquery (similar to XPath); since version 2.1, Solr is used for indexing (DataONE out of the box, own documents by configuration) |
| Persistent Identifier | + | DOI using the EZID service |
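To give an idea of what a content search via pathquery looks like, the sketch below assembles a pathquery document with Python's standard library. The element and attribute names (`pathquery`, `returnfield`, `querygroup`, `queryterm`, `value`, `pathexpr`) are assumptions based on the pathquery format as described in the Metacat documentation and should be checked against the installed version; the helper `build_pathquery` and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_pathquery(value, pathexpr, return_fields):
    """Assemble a pathquery XML string that searches `pathexpr` for `value`."""
    root = ET.Element("pathquery", version="1.0")
    ET.SubElement(root, "querytitle").text = "Example search"
    # Fields to include in each result entry.
    for name in return_fields:
        ET.SubElement(root, "returnfield").text = name
    # A single term inside one group; more terms could be added to the group.
    group = ET.SubElement(root, "querygroup", operator="UNION")
    term = ET.SubElement(
        group, "queryterm", casesensitive="false", searchmode="contains"
    )
    ET.SubElement(term, "value").text = value
    ET.SubElement(term, "pathexpr").text = pathexpr
    return ET.tostring(root, encoding="unicode")


query_xml = build_pathquery("megafauna", "dataset/title", ["dataset/title"])
```

The resulting string would be sent to the server as the query; how it is submitted (endpoint, parameter names) is deployment-specific and not shown.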
Remarks
At a higher level, Metacat features can be categorized as follows:
- Data abstraction and interoperability
- Business and user-defined metadata storage
- Data discovery
- Data change auditing and notifications
- Hive metastore optimizations