Introduction

The (Meta)data Management Cookbook is a simple, practical guide that walks users through the key phases of the (meta)data management process based on the FAIR principles. Its aim is to give a basic understanding of the (meta)data management process and to provide practical use cases that users can follow. This cookbook highlights services developed and offered by the Scientific Computing Center (SCC), the Information Technology Center of the Karlsruhe Institute of Technology (KIT). These tools are customized to meet the specific requirements of different projects.

[Figure: metadata management cookbook]

This (Meta)data Management Cookbook is structured around three phases:

  • Ingredient selection: represents the metadata creation phase, where users are guided in creating structured metadata based on defined schemas.
  • Cooking: corresponds to the (meta)data management phase, where users can explore best practices for storing and maintaining (meta)data.
  • Serving: aligns with the (meta)data access phase, where users learn how to make their (meta)data accessible and legally reusable, and how to access (meta)data created by others.

The proposed cookbook is intended as a helpful example. It is not required to follow it strictly or to use the suggested services. However, be aware that skipping or modifying steps may impact later stages of the process. This quick cookbook offers a basic understanding. For deeper insights into research data management, online courses are available: Metadata Management: Key Essentials, Fundamentals of Scientific Metadata, etc. If you have additional questions or suggestions to improve this cookbook, please feel free to contact us.
An introductory video to the (meta)data management cookbook is available here.

General recipe

The general recipe walks you through the essential stages of the (meta)data management process, once your data is produced and collected. It can be adapted to each user’s specific use case in order to ensure reproducibility. The recipe is organized as a series of simple, focused questions, each accompanied by a concise and clear answer.

General questions

What is metadata? Why is it relevant?

Metadata describes data by providing essential information that helps others understand it. This can include administrative metadata, such as the creator or the creation date, as well as more detailed information, e.g. how the data was produced (the technique used, the instrument's manufacturer, etc.). Metadata can be unstructured, meaning it doesn't adhere to a specific template, or structured, meaning it follows a defined template. To enhance the findability and reusability of the data, it is important that metadata follows a specification defined by a metadata schema.

What is a metadata schema? Why is it important?

A metadata schema is a template that specifies the expected elements and how they are structured, such as attribute names, value types, rules, etc. Metadata schemas are files, which can be represented in various formats such as JSON or XML. Using metadata schemas enables users to describe data with rich and structured metadata, enhancing the findability and reusability of the data.

What is a metadata document? Why is it important?

A metadata document is a file containing a digital description of data. It provides information such as the resource type, author, creation date, technique used, instrument's manufacturer, etc. Metadata documents can be unstructured, meaning they do not follow a specific template, or structured, meaning they adhere to a defined template. Creating structured metadata documents that follow a metadata schema enhances the findability of the data and supports the reproducibility of results.

What is the difference between a metadata schema and a metadata document?

A metadata schema is a file that defines the structure and rules for metadata, such as attribute definitions. A metadata document is a file that contains the values for those attributes, describing specific data.
N.B.: A metadata document should be valid against the defined metadata schema. This means it should follow the rules described in that schema.
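The relation between a schema and a document can be illustrated with a minimal sketch. The schema below is a hypothetical, simplified stand-in (attribute names, types and a required flag), not a real JSON Schema and not the format used by any of the services mentioned in this cookbook; the attribute names are assumptions chosen for illustration.

```python
import json

# Hypothetical, simplified schema: attribute name -> (expected type, required?)
SCHEMA = {
    "title": (str, True),
    "creator": (str, True),
    "creation_date": (str, True),
    "instrument_manufacturer": (str, False),
}


def validate(document, schema):
    """Return a list of rule violations; an empty list means the document is valid."""
    errors = []
    for name, (expected_type, required) in schema.items():
        if name not in document:
            if required:
                errors.append(f"missing required attribute: {name}")
        elif not isinstance(document[name], expected_type):
            errors.append(f"wrong type for attribute: {name}")
    # Attributes that the schema does not define are reported as well.
    for name in document:
        if name not in schema:
            errors.append(f"unknown attribute: {name}")
    return errors


# A metadata document holds the values for the attributes defined in the schema.
document = json.loads('{"title": "SEM image of sample A", "creator": "Jane Doe"}')
print(validate(document, SCHEMA))
```

In practice, a schema written in a standard format such as JSON Schema would be checked with an existing validator rather than hand-written code; the sketch only shows what "valid against the schema" means.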

What does FAIR mean in research data management?

The FAIR principles are a set of guidelines aimed at making data Findable, Accessible, Interoperable, and Reusable. To make data FAIR, it is essential to describe it with rich metadata, assign unique and persistent identifiers to (meta)data, register it in searchable repositories, and ensure it is retrievable through standardized communication protocols. FAIRness is not a binary state (yes/no); rather, it reflects how effectively (meta)data fulfill the FAIR principles. This can vary depending on the criteria applied and on specific needs. An example of detailed guidance on applying the FAIR principles can be found here.

Metadata Creation

How can I create a metadata document for my envisioned purpose?

Unstructured metadata documents can be created by writing a free-text file in a 'readme'-style format using a text editor. Alternatively, structured metadata documents can be created by filling in the values of attributes following a defined template. It is recommended to provide structured metadata documents following metadata schemas, as this improves the findability and reusability of data. Various online tools are available to assist in the validation of such structured metadata documents.
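As a minimal sketch of the structured approach, the example below fills in a template and writes the result as JSON. The template and its attribute names are hypothetical; a real template would be derived from the metadata schema you selected.

```python
import json

# Hypothetical template: attribute names taken from a schema, values still empty.
TEMPLATE = {"title": None, "creator": None, "creation_date": None, "technique": None}


def fill_template(template, values):
    """Copy the template and fill in values; reject attributes the template lacks."""
    unknown = set(values) - set(template)
    if unknown:
        raise KeyError(f"attributes not in template: {sorted(unknown)}")
    return {**template, **values}


metadata = fill_template(
    TEMPLATE,
    {"title": "SEM image of sample A", "creator": "Jane Doe", "creation_date": "2024-05-01"},
)

# Serialize the structured metadata document as JSON text.
text = json.dumps(metadata, indent=2)
print(text)
```

Attributes that were not filled in stay empty (`null` in the JSON output), which makes gaps visible before the document is validated.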

Are there any metadata schemas that I can use?

Published metadata schemas that are well established and widely accepted by the user community are called metadata standards. Adopting existing standards avoids a proliferation of competing descriptions. Different interdisciplinary metadata standards exist, e.g. Dublin Core and DataCite. There are also discipline-specific standards like NeXus, which describes neutron, X-ray, and muon experiment data and is also used in condensed matter physics and materials science. You can search for metadata standards in dedicated metadata registries, e.g. DCC Metadata Standards, FAIRsharing.org, RDA Metadata Standards Catalog, and others. Two instances of metadata registries including metadata schemas are provided: one hosted by the NFDI-MatWerk consortium, and a second one hosted by the NEP and JL-MDMC consortia.

I didn’t find a metadata schema for my envisioned purpose. What should I do?

You can start by creating a "readme"-style metadata document. Once you have a draft, feel free to contact us; we will support you in formalizing it into a proper schema. Through broad adoption, such a schema can come to be regarded as good practice. However, to be recognized as a standard, it must be endorsed by a standardization body.

Where do I find the information required to fill out a metadata document?

This information represents your metadata and can be obtained from various sources: it may be filled in from memory, retrieved from physical lab notes, or extracted from existing resources such as data files, Electronic Lab Notebooks (ELNs), etc.

I collected the metadata, but it doesn’t follow the selected schema. What should I do?

You can either create the metadata document in a text editor, by reviewing the attributes defined in the metadata schema and entering the corresponding values, or use a form generated from the schema to enter the metadata. Using a form makes the process easier and reduces the risk of errors. An online JSON editor is available to generate a form based on a JSON schema. eXact lab offers an open-source desktop application called Metadata Editor, which provides a user interface for creating metadata documents with the help of forms generated from already stored metadata schemas. These schemas are available in two different instances: one provided by the NFDI-MatWerk community and a second one offered by the NEP and JL-MDMC communities.

Is there a way to automatically generate a metadata document that complies with a metadata schema?

Yes, you can automatically map the metadata included in the data files to metadata schemas. If further metadata is required, it can be added manually. To achieve this, you can either create your own customized mapping extension or use an existing tool that supports your data formats and instruments. We offer a mapping service that extracts metadata from various data files and maps it to project-specific JSON metadata schemas, ensuring better interoperability. Two instances of the tool are available: one hosted by the NFDI-MatWerk community and a second one by the NEP and JL-MDMC communities. Detailed documentation of the service, including the supported instruments, input file formats and vendors, is available here.
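The idea behind such a mapping can be sketched in a few lines. The example below is not the mapping service described above; it is a hypothetical illustration that assumes a data file header made of `Key=Value` lines and a made-up set of target attribute paths.

```python
# Hypothetical mapping: key in the instrument file header -> attribute path in the schema.
MAPPING = {
    "Date": "entry.start_time",
    "Operator": "entry.user.name",
    "HV": "entry.instrument.accelerating_voltage",
}


def parse_header(text):
    """Parse 'Key=Value' lines of a (hypothetical) data file header into a dict."""
    header = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            header[key.strip()] = value.strip()
    return header


def map_to_schema(header, mapping):
    """Build a nested metadata document from the header keys listed in the mapping."""
    document = {}
    for source_key, target_path in mapping.items():
        if source_key not in header:
            continue  # missing values can be added manually later
        *parents, leaf = target_path.split(".")
        node = document
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = header[source_key]
    return document


header_text = "Date=2024-05-01\nOperator=Jane Doe\nHV=15 kV\nVendorNote=ignored"
mapped = map_to_schema(parse_header(header_text), MAPPING)
print(mapped)
```

Header keys that are not listed in the mapping are ignored, and attributes without a source value are simply left out so that they can be completed by hand.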

(Meta)data Management

Is it necessary to store data and metadata in the same location, or can they be stored separately?

Data and metadata can be stored together in the same location or separately in different locations. This depends on the storage capabilities and your specific requirements. To align with the FAIR principles, your metadata should remain available even if your data is no longer accessible, and it should be properly associated with the data it describes. Therefore, choosing the appropriate storage location is very important.

Where can I store my produced data?

Data can be stored in different locations, such as Electronic Lab Notebooks (ELNs), databases, filesystems or data repositories. Storing your data in a repository ensures that administrative metadata is created and stored. In addition, your data becomes easier to search and find. This contributes to more efficient data management and alignment with the FAIR principles. General-purpose data repositories (such as Zenodo) support a wide range of disciplines. There are also registries that list different research data repositories (such as re3data). Institutional repositories (such as KITopen) are managed by universities to store and publish their outputs. Project-specific data repositories are also available, such as the NFDI-MatWerk Data Repository hosted by the NFDI-MatWerk consortium.

Where can I store my metadata?

If you only need to store administrative metadata, this can be done when uploading your data to a repository, as all repositories support this functionality. However, if you want to include more detailed metadata to enrich your data documentation, there are dedicated metadata repositories available for this purpose. Project-specific metadata repositories are available, such as the NFDI-MatWerk Metadata Repository hosted by the NFDI-MatWerk consortium and the MetaRepo Metadata Repository hosted by the NEP and JL-MDMC consortia.

Can everybody see my (meta)data if I store it in a repository?

No, you can choose whether to make your (meta)data public, keep it private, share it only with your colleagues, or publish it with view-only rights. Most repositories provide access control and user authentication features to manage access rights. In order to fully benefit from the FAIR principles, it is recommended to make your metadata publicly available even if your data is not.

What if data is not stored in a repository, can I still store the corresponding metadata in a metadata repository?

Yes, you can still register your metadata in a metadata repository even if the data itself isn’t stored in a repository. It’s essential to link the metadata to the data it describes. This helps ensure the metadata remains accessible, even if the data becomes unavailable, which supports alignment with the FAIR principles.

How can I associate metadata with data?

Regardless of where your data and metadata are stored, establishing a clear link between them is essential. This association connects the descriptive information (metadata) with the actual object being described (data). It can be established by specifying the location of the data within the metadata document. The best way to do this is to use a globally unique and persistent identifier, such as a Digital Object Identifier (DOI) for the data. Persistent identifiers also exist for people, such as the Open Researcher and Contributor ID (ORCID), which can be used to identify the creator.
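A minimal sketch of such a link is shown below. The attribute names (`related_resource`, `identifier`, `identifier_type`) and the DOI value are hypothetical placeholders, and the regular expression only approximates the usual shape of a DOI; it does not check that the DOI is actually registered.

```python
import re

# Approximate shape of a DOI: "10.", a registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def link_to_data(metadata, doi):
    """Return a copy of the metadata document that points to the data via its DOI."""
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI-like identifier: {doi!r}")
    linked = dict(metadata)
    linked["related_resource"] = {
        "identifier": doi,
        "identifier_type": "DOI",
        "url": f"https://doi.org/{doi}",  # DOIs resolve through the doi.org resolver
    }
    return linked


# "10.1234/example.5678" is a made-up DOI used only for illustration.
record = link_to_data({"title": "SEM image of sample A"}, "10.1234/example.5678")
print(record["related_resource"]["url"])
```

Because the identifier is persistent, the link stays meaningful even if the data is later moved to a different storage location.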

(Meta)data Access

How can I search for data using metadata?

You can search for data through general-purpose repositories (such as Zenodo), registries of data repositories (such as re3data), or institutional repositories (e.g. KITopen), depending on where it has been stored. Most of these repositories provide search interfaces that allow you to filter results using administrative metadata attributes. Access to the data may require authentication, depending on the permissions set by the data owner. We provide two search interfaces that allow users to find metadata going beyond the administrative metadata. The first one is hosted by the NFDI-MatWerk consortium and the second one is offered by the NFFA and JL-MDMC consortia.
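What searching beyond administrative metadata means can be sketched locally. The example below does not use any of the search interfaces mentioned above; it filters a small, made-up list of metadata documents by a nested attribute, with all attribute names being assumptions.

```python
def get_path(document, path):
    """Return the value at a dotted attribute path, or None if the path is absent."""
    node = document
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def search(records, path, value):
    """Return all metadata documents whose attribute at `path` equals `value`."""
    return [record for record in records if get_path(record, path) == value]


# Hypothetical metadata documents with administrative and detailed attributes.
records = [
    {"title": "Sample A", "creator": "Jane Doe", "instrument": {"technique": "SEM"}},
    {"title": "Sample B", "creator": "John Roe", "instrument": {"technique": "TEM"}},
    {"title": "Sample C", "creator": "Jane Doe"},
]

print([r["title"] for r in search(records, "instrument.technique", "SEM")])
print([r["title"] for r in search(records, "creator", "Jane Doe")])
```

The first query relies on detailed metadata (the technique), which is only possible when that information was recorded in a structured way; the second uses an administrative attribute that most repositories already expose.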

How can others reuse my (meta)data legally?

To enable legal reuse of your data with proper citation, it is important to publish it under a clear license (e.g. a Creative Commons license or the Apache License). A license is a legal agreement between the data creator and the end-user, or the repository where the data is stored, specifying how the data may be used. Choosing an open license or one with minimal restrictions ensures that others can reuse and cite your data in a legally compliant way.

If (meta)data created by others doesn’t include a license, can I use it?

No, without a specified license, you don’t have the legal rights to reuse the (meta)data. In this case, you can contact the author or data owner and ask them to provide a license.
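A simple pre-check along these lines can be sketched as follows. The `license` attribute name is an assumption about how a metadata document records the license, and the set of identifiers is a small, non-exhaustive selection of SPDX license identifiers; the sketch is no substitute for reading the license terms.

```python
# A few SPDX identifiers of open licenses (illustrative, not exhaustive).
OPEN_LICENSES = {"CC0-1.0", "CC-BY-4.0", "Apache-2.0"}


def reuse_status(metadata):
    """Give a rough hint on whether (meta)data can be reused, based on its license."""
    license_id = metadata.get("license")
    if not license_id:
        return "no license: ask the author or data owner before reusing"
    if license_id in OPEN_LICENSES:
        return f"open license ({license_id}): reuse allowed under its terms"
    return f"license {license_id}: check its terms before reusing"


print(reuse_status({"title": "Sample A", "license": "CC-BY-4.0"}))
print(reuse_status({"title": "Sample B"}))
```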

Can I keep (meta)data private or share them only with some collaborators?

Yes, most repositories provide access control and user authentication features that let you keep your (meta)data private or share it only with some collaborators during active research. It is also possible to publish your (meta)data with view-only rights, enabling controlled access. In order to fully benefit from the FAIR principles, it is recommended to make your (meta)data publicly available.

Specific recipes

Coming soon