The funding data service lets members register funding source information for content items deposited with Crossref.
Things to understand before you deposit
Funding metadata must include the name of the funding organization and the funder identifier (where the funding organization is listed in the Registry), and should include an award/grant number or grant identifier. Deposit a funder name without its identifier only if the funder is not found in the Registry. Members can deposit a funder name without an identifier, but those records are not considered valid until the funder is added to the Registry and the records are redeposited (updated) with an ID. Until then, they cannot be found using the funding filters that we support via our REST API, and they do not show up in our Open Funder Registry search.
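To illustrate what "found using the filters" means in practice, here is a minimal sketch that builds a REST API query URL for works associated with a given funder identifier. It assumes the `funder` filter on the `/works` route; the function name `funder_filter_url` and the `rows` default are illustrative choices, not part of the deposit documentation. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

API_BASE = "https://api.crossref.org/works"
DOI_PREFIX = "https://doi.org/"


def funder_filter_url(funder_doi, rows=5):
    """Build a /works query URL filtered by funder identifier (hypothetical helper)."""
    # Accept either the full DOI URL or the bare DOI.
    if funder_doi.startswith(DOI_PREFIX):
        funder_doi = funder_doi[len(DOI_PREFIX):]
    params = {"filter": f"funder:{funder_doi}", "rows": rows}
    # Keep ':' and '/' unescaped so the filter stays readable.
    return f"{API_BASE}?{urlencode(params, safe=':/')}"


print(funder_filter_url("https://doi.org/10.13039/100000001"))
```

A record deposited with only a funder name has no identifier to match, so it would not be returned by a query like this one.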
Correct nesting of funder names and identifiers is essential as it significantly impacts how funders, funder identifiers, and award numbers are related to each other.
Correct: In this example, funder “National Science Foundation” is associated with the funder identifier https://doi.org/10.13039/100000001
<fr:assertion name="funder_name">National Science Foundation
<fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
</fr:assertion>
Incorrect: Here, the funder name and funder identifier are not nested, so these assertions will be indexed as separate funders.
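The difference between nested and un-nested assertions can be checked before depositing. The following is a rough sketch, not the deposit system's own logic: it parses a funding fragment and reports any funder_identifier that is not a direct child of a funder_name assertion. The namespace URI in `FR_NS`, the two sample fragments, and the function name `unnested_identifiers` are assumptions made for illustration; use the namespace your deposit file declares.

```python
import xml.etree.ElementTree as ET

FR_NS = "http://www.crossref.org/fundref.xsd"  # assumed namespace URI

# Identifier nested inside the funder name (the correct pattern).
NESTED = f"""
<fr:program xmlns:fr="{FR_NS}" name="fundref">
  <fr:assertion name="funder_name">National Science Foundation
    <fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
  </fr:assertion>
</fr:program>
"""

# Name and identifier as siblings (the incorrect pattern).
FLAT = f"""
<fr:program xmlns:fr="{FR_NS}" name="fundref">
  <fr:assertion name="funder_name">National Science Foundation</fr:assertion>
  <fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
</fr:program>
"""


def unnested_identifiers(xml_text):
    """Return funder_identifier values that are not direct children of a funder_name."""
    root = ET.fromstring(xml_text)
    nested = set()
    for parent in root.iter():
        if parent.get("name") == "funder_name":
            for child in parent:
                if child.get("name") == "funder_identifier":
                    nested.add(child)
    return [
        (el.text or "").strip()
        for el in root.iter()
        if el.get("name") == "funder_identifier" and el not in nested
    ]


print(unnested_identifiers(NESTED))
print(unnested_identifiers(FLAT))
```

An empty list means every identifier is attached to a funder name; anything else points at assertions that would be treated as separate funders.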
The purpose of funder groups is to establish relationships between funders and award numbers. A funder group assertion should only be used to associate funder names and identifiers with award numbers when multiple funders are present.
Funding data deposit with one group of funders (no “fundgroup” needed):
Funding data deposit with two fundgroups:
Incorrect: Groups are used here to associate funder names with funder identifiers; these need to be nested as described above.
Deposits using a funder_identifier that is not taken from the Open Funder Registry will be rejected.
Deposits with only funder_name (no funder_identifier) will not appear in funder search results in Open Funder Registry search or the REST API.
Funding data and Crossmark
If you participate in Crossmark, you should nest funding data within the <crossmark> element, for example:
The <fr:program> element in the deposit schema section (see documentation) supports the import of the fundref.xsd schema (see documentation). The fundref namespace (xmlns:fr=https://www.crossref.org/fundref.xsd) must be included in the schema declaration, for example:
To accommodate integration with Crossmark, the fundref.xsd consists of a series of nested <fr:assertion> tags with enumerated name attributes. The name attributes are:
fundgroup: used to group a funder and its associated award number(s) for items with multiple funders.
funder_name: name of the funding agency as it appears in the funding Registry. Funder names that do not match those in the registry will be accepted to cover instances where the funding organization is not listed.
funder_identifier: funding agency identifier in the form of a DOI; it must be nested within the funder_name assertion. The funder_identifier must be taken from the funding Registry and cannot be created by the member. Deposits without funder_identifier do not qualify as funding records.
award_number: grant number or other fund identifier
funder_name and funder_identifier must be present in a deposit where the funding body is listed in the Open Funder Registry. Multiple funder_name, funder_identifier, and award_number assertions may be included.
Funder and award number hierarchy
A relationship between funder_identifier and funder_name is established by nesting funder_identifier within funder_name. For example, this deposit has the funder National Science Foundation with its corresponding funder identifier in the Open Funder Registry of https://doi.org/10.13039/100000001 :
<fr:assertion name="funder_name">National Science Foundation
<fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
</fr:assertion>
A relationship between a single funder_name and/or funder_identifier and an award_number is established by including the assertions within a <fr:program>. In this example, funder National Institute on Drug Abuse, with funder identifier https://doi.org/10.13039/100000026, is associated with award number JQY0937263:
<fr:program name="fundref">
<fr:assertion name="funder_name">National Institute on Drug Abuse
<fr:assertion name="funder_identifier">https://doi.org/10.13039/100000026</fr:assertion>
</fr:assertion>
<fr:assertion name="award_number">JQY0937263</fr:assertion>
</fr:program>
If multiple funder and award combinations exist, each combination should be deposited within a fundgroup to ensure that the award number is associated with the appropriate funder(s). In this example, two funding groups exist:
Funder National Science Foundation with funder identifier https://doi.org/10.13039/100000001 is associated with award numbers CBET-106 and CBET-7259, and
Funder Basic Energy Sciences, Office of Science, U.S. Department of Energy with funder identifier https://doi.org/10.13039/100006151 is associated with award number 1245-ABDS.
<fr:program name="fundref">
<fr:assertion name="fundgroup">
<fr:assertion name="funder_name">National Science Foundation
<fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
</fr:assertion>
<fr:assertion name="award_number">CBET-106</fr:assertion>
<fr:assertion name="award_number">CBET-7259</fr:assertion>
</fr:assertion>
<fr:assertion name="fundgroup">
<fr:assertion name="funder_name">Basic Energy Sciences, Office of Science, U.S. Department of Energy
<fr:assertion name="funder_identifier">https://doi.org/10.13039/100006151</fr:assertion>
</fr:assertion>
<fr:assertion name="award_number">1245-ABDS</fr:assertion>
</fr:assertion>
</fr:program>
Items with multiple funder names but no award numbers may be deposited without a fundgroup.
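The grouping rules above can be expressed as a small generator. This is a sketch under stated assumptions rather than a reference implementation: the input format (a list of dicts with `name`, `identifier`, and `awards` keys), the helper names, and the namespace URI in `FR_NS` are all illustrative. It wraps each funder in a fundgroup only when there are multiple funders and at least one award number, and it always nests funder_identifier inside funder_name.

```python
import xml.etree.ElementTree as ET

FR_NS = "http://www.crossref.org/fundref.xsd"  # assumed; must match the deposit's xmlns:fr
ET.register_namespace("fr", FR_NS)


def _assertion(parent, name, text=None):
    """Append an <fr:assertion name="..."> child to parent and return it."""
    el = ET.SubElement(parent, f"{{{FR_NS}}}assertion", {"name": name})
    el.text = text
    return el


def build_fundref_program(funders):
    """Build an <fr:program name="fundref"> element from a list of funder dicts."""
    program = ET.Element(f"{{{FR_NS}}}program", {"name": "fundref"})
    # Groups are only needed to tie awards to the right funder among several.
    use_groups = len(funders) > 1 and any(f.get("awards") for f in funders)
    for funder in funders:
        container = _assertion(program, "fundgroup") if use_groups else program
        name_el = _assertion(container, "funder_name", funder["name"])
        if funder.get("identifier"):
            # The identifier is nested inside the funder name assertion.
            _assertion(name_el, "funder_identifier", funder["identifier"])
        for award in funder.get("awards", []):
            _assertion(container, "award_number", award)
    return program


FUNDERS = [
    {
        "name": "National Science Foundation",
        "identifier": "https://doi.org/10.13039/100000001",
        "awards": ["CBET-106", "CBET-7259"],
    },
    {
        "name": "Basic Energy Sciences, Office of Science, U.S. Department of Energy",
        "identifier": "https://doi.org/10.13039/100006151",
        "awards": ["1245-ABDS"],
    },
]

program = build_fundref_program(FUNDERS)
print(ET.tostring(program, encoding="unicode"))
```

Passing a single funder, or several funders with no award numbers, produces assertions directly under the program element with no fundgroup.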
At a minimum, a funding data deposit must contain a funder_name and funder_identifier assertion. Deposits with just an award_number assertion are not allowed. A funder_name, funder_identifier, and award_number should be included in deposits whenever possible.
If the funder name cannot be matched in the Registry, you may submit funder_name only, and the funding body will be reviewed and considered for addition to the official Registry. Until it is added to the Registry, the deposit will not be considered a valid funding record and will not appear in funding search or the REST API.
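As a rough illustration of these minimum requirements, the sketch below sorts a funding fragment into one of three outcomes based on which assertions are present. The status strings, the `wrap` helper, the function name `funding_status`, and the namespace URI are assumptions for illustration only; the check looks at the fragment as a whole and does not replace the deposit system's validation.

```python
import xml.etree.ElementTree as ET

FR_NS = "http://www.crossref.org/fundref.xsd"  # assumed namespace URI


def wrap(inner):
    """Wrap assertion markup in a namespaced <fr:program> so it parses standalone."""
    return f'<fr:program xmlns:fr="{FR_NS}" name="fundref">{inner}</fr:program>'


def funding_status(xml_text):
    """Classify a funding fragment by the assertion names it contains."""
    root = ET.fromstring(xml_text)
    names = {el.get("name") for el in root.iter() if el is not root}
    if "funder_name" not in names:
        # For example, a deposit with only an award_number assertion.
        return "not allowed"
    if "funder_identifier" not in names:
        # Funder name only: accepted, pending addition to the Registry.
        return "accepted, not yet a valid funding record"
    return "valid funding record"


award_only = wrap('<fr:assertion name="award_number">JQY0937263</fr:assertion>')
print(funding_status(award_only))
```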
As demonstrated in Example 3 below, items with several award numbers associated with a single funding organization should be grouped together by enclosing the funder_name, funder_identifier, and award_number(s) within a fundgroup assertion.
Some rules will be enforced by the deposit logic, including:
Nesting of the <fr:assertion> elements: the schema allows infinite nesting of the assertion element to accommodate nesting of an element within itself. Deposit code will only allow 3 levels of nesting (with attribute values of fundgroup, funder_name, and funder_identifier).
Values of different <fr:assertion> elements: funder_name, funder_identifier, and award_number may have deposit rules imposed.
Only valid funder identifiers will be accepted: the funder_identifier value will be compared against the Open Funder Registry file. If the funder_identifier is not found, the deposit will be rejected.
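A pre-deposit check along the lines of these rules might look like the sketch below. It is only an approximation of the behaviour described above: `KNOWN_FUNDER_IDS` is a tiny stand-in for the Open Funder Registry file, the function names and message strings are hypothetical, the namespace URI is assumed, and the identifier ending in 999999999 is deliberately invalid.

```python
import xml.etree.ElementTree as ET

MAX_DEPTH = 3  # fundgroup > funder_name > funder_identifier
NS = 'xmlns:fr="http://www.crossref.org/fundref.xsd"'  # assumed namespace URI

# Stand-in for the Registry file; a real check would load the full list.
KNOWN_FUNDER_IDS = {
    "https://doi.org/10.13039/100000001",
    "https://doi.org/10.13039/100006151",
}

OK_DEPOSIT = f"""
<fr:program {NS} name="fundref">
  <fr:assertion name="fundgroup">
    <fr:assertion name="funder_name">National Science Foundation
      <fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
    </fr:assertion>
    <fr:assertion name="award_number">CBET-106</fr:assertion>
  </fr:assertion>
</fr:program>
"""

# Same fragment with an identifier that is not in the stand-in Registry.
BAD_DEPOSIT = OK_DEPOSIT.replace("100000001", "999999999")


def assertion_depth(element):
    """Number of nested element levels below element (0 if it has no children)."""
    return max((1 + assertion_depth(child) for child in element), default=0)


def check_program(xml_text, known_ids=KNOWN_FUNDER_IDS):
    """Return a list of problems found in a funding fragment (empty if none)."""
    root = ET.fromstring(xml_text)
    problems = []
    depth = assertion_depth(root)
    if depth > MAX_DEPTH:
        problems.append(f"assertions nested {depth} deep (limit {MAX_DEPTH})")
    for el in root.iter():
        if el.get("name") == "funder_identifier":
            value = (el.text or "").strip()
            if value not in known_ids:
                problems.append(f"unknown funder_identifier: {value}")
    return problems


print(check_program(OK_DEPOSIT))
print(check_program(BAD_DEPOSIT))
```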
Deleting or updating funding metadata
If funding metadata is incorrect or out-of-date, it may be updated by redepositing the metadata. Be sure to redeposit all available metadata for an item, not just the elements being updated. A DOI may be updated without resubmitting funding metadata, as previously deposited funding metadata will remain associated with the DOI.
Funding metadata may be deleted by redepositing an item with an empty <fr:program name="fundref"> element:
Submitting an empty Crossmark tag (<crossmark />) will delete all Crossmark data, including funding data. To delete only funding data, submit an empty <fr:program name="fundref"/> element:
Example 2: Funder information outside of Crossmark
The <fr:program> element captures funding data. It should be placed before the <doi_data> element. This deposit contains minimal funding data: one funder_name or one funder_identifier must be present; both are recommended.
<fr:program name="fundref">
<fr:assertion name="funder_name">National Science Foundation
<fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
</fr:assertion>
</fr:program>
Example 3: One funder, two grant numbers
This example contains one funder_name and one funder_identifier. Note that the funder_identifier is nested within the funder_name assertion, establishing https://doi.org/10.13039/100000001 as the funder identifier for funder name National Science Foundation. Two award numbers are present.