
Datasets

Dataset records capture information about one or more database records or collections. Dataset deposits do not contain the entire database record or collection, only descriptive metadata. The metadata can include:

  • Contributors: the author(s) of a database record or collection
  • Title: the title of a database record or collection
  • Date (within <database_date>): the creation date, publication date (if different from the creation date), and the date of last update of the record
  • Record number or other identifier (within <publisher_item>): the record number of the dataset item. In this context, <publisher_item> holds the record number of each individual item in the database
  • Description (within <description>): a brief summary description of the contents of the database
  • Format: the format type of the dataset item if it includes files rather than just text. Note that the format element here should not be used to describe the format of items deposited as part of the component_list
  • Citations (within <citation_list>): a list of items (such as journal articles) cited by the dataset item. For example, a dataset entry from a taxonomy might cite the article in which a species was first identified.

The dataset_type attribute should be set to either record or collection to indicate the type of deposit. The default value of this attribute is record.
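
As a rough illustration of how the elements listed above might fit together in a single record, consider the sketch below. It is not taken from a real deposit: every value (names, dates, record number, DOIs, URLs) is a hypothetical placeholder, and the child element names inside <database_date>, <publisher_item>, and <citation_list> (such as creation_date, item_number, and the key attribute on citation), as well as the mime_type attribute on <format>, are assumptions that should be checked against the schema version you deposit with.

```xml
<!-- Illustrative sketch only; all values are hypothetical placeholders. -->
<!-- dataset_type is stated explicitly here; if omitted, it defaults to "record". -->
<dataset dataset_type="record">
  <contributors>
    <person_name contributor_role="author" sequence="first">
      <given_name>A</given_name>
      <surname>Example</surname>
    </person_name>
  </contributors>
  <titles>
    <title>Example dataset record</title>
  </titles>
  <!-- Creation, publication, and last-update dates (child names assumed) -->
  <database_date>
    <creation_date>
      <month>01</month>
      <day>15</day>
      <year>2019</year>
    </creation_date>
    <publication_date>
      <month>02</month>
      <day>01</day>
      <year>2019</year>
    </publication_date>
    <update_date>
      <month>03</month>
      <day>10</day>
      <year>2020</year>
    </update_date>
  </database_date>
  <!-- Record number of this item within the database (child name assumed) -->
  <publisher_item>
    <item_number>REC-0001</item_number>
  </publisher_item>
  <description>Brief summary of what this record contains.</description>
  <!-- Format of the dataset item itself, not of any component_list items -->
  <format mime_type="text/csv"/>
  <doi_data>
    <doi>10.5555/example.record.0001</doi>
    <resource>https://www.example.org/records/0001</resource>
  </doi_data>
  <!-- Items cited by this dataset record (key attribute assumed) -->
  <citation_list>
    <citation key="ref1">
      <doi>10.5555/example.article.0001</doi>
    </citation>
  </citation_list>
</dataset>
```

A collection-level entry would look the same apart from dataset_type="collection"; most of the descriptive elements shown here can be left out when they do not apply.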

Constructing dataset deposits

<database> is the container for all information about a set of datasets. The top-level database may be a functional database or an abstraction acting as a collection (much like a journal is a collection of articles). Individual dataset entries are captured within the <dataset> element.
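
The nesting described above can be pictured with a bare skeleton. This is only a sketch of the container structure, using the element names that appear in the sample deposit on this page; the comments stand in for content that a real deposit would need to supply.

```xml
<!-- Skeleton only: comments mark where real metadata would go. -->
<database>
  <!-- Metadata about the top-level database or collection -->
  <database_metadata language="en">
    <titles>
      <title><!-- database title --></title>
    </titles>
    <!-- institution and doi_data for the database itself -->
  </database_metadata>
  <!-- One <dataset> element per individual entry -->
  <dataset dataset_type="record">
    <!-- contributors, titles, doi_data, and so on -->
  </dataset>
  <dataset dataset_type="collection">
    <!-- contributors, titles, doi_data, and so on -->
  </dataset>
</database>
```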

Datasets that aren’t datasets

The database content type is often used to capture metadata for items that do not fit into our currently defined content types. This may include online collections, videos, archives, and other items that aren’t cited or presented as articles, books, reports, or other defined types of content. Learn more about our supported content types.

Example of a dataset deposit containing several datasets

Review the sample below.

<doi_batch xmlns="http://www.crossref.org/schema/4.3.7" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="4.3.7" xsi:schemaLocation="http://www.crossref.org/schema/4.3.7 http://www.crossref.org/schemas/crossref4.3.7.xsd">
  <!-- Batch-level information about this deposit -->
  <head>
    <doi_batch_id>2006-03-24-21-57-31-10023</doi_batch_id>
    <timestamp>20060324215731</timestamp>
    <depositor>
      <depositor_name>Sample Master</depositor_name>
      <email_address>support@crossref.org</email_address>
    </depositor>
    <registrant>CrossRef</registrant>
  </head>
  <body>
    <database>
      <!-- Metadata for the top-level database -->
      <database_metadata language="en">
        <titles>
          <title>NURSA Datasets</title>
        </titles>
        <institution>
          <institution_name>Nuclear Receptor Signaling Atlas</institution_name>
          <institution_acronym>NURSA</institution_acronym>
        </institution>
        <doi_data>
          <doi>10.1621/NURSA_dataset_home</doi>
          <resource>http://www.nursa.org/template.cfm?threadId=10222</resource>
        </doi_data>
      </database_metadata>
      <!-- First dataset entry -->
      <dataset dataset_type="collection">
        <contributors>
          <person_name contributor_role="author" sequence="first">
            <given_name>D</given_name>
            <surname>Mangelsdorf</surname>
          </person_name>
        </contributors>
        <titles>
          <title>Tissue-specific expression patterns of nuclear receptors</title>
        </titles>
        <doi_data>
          <doi>10.1621/datasets.02001</doi>
          <!-- Ampersands in URLs must be escaped as &amp; for the XML to be well-formed -->
          <resource>http://www.nursa.org/template.cfm?threadId=10222&amp;dataType=Q-PCR&amp;dataset=Tissue-specific%20expression%20patterns%20of%20nuclear%20receptors</resource>
        </doi_data>
      </dataset>
      <!-- Second dataset entry -->
      <dataset dataset_type="collection">
        <contributors>
          <person_name contributor_role="author" sequence="first">
            <given_name>R</given_name>
            <surname>Evans</surname>
          </person_name>
        </contributors>
        <titles>
          <title>Circadian expression patterns of nuclear receptors</title>
        </titles>
        <doi_data>
          <doi>10.1621/datasets.02002</doi>
          <resource>http://www.nursa.org/template.cfm?threadId=10222&amp;dataType=Q-PCR&amp;dataset=Circadian%20expression%20patterns%20of%20nuclear%20receptors</resource>
        </doi_data>
      </dataset>
    </database>
  </body>
</doi_batch>

Last Updated: 2020 April 8 by Laura J. Wilkinson