
How funding agencies can meet OSTP (and Open Science) guidance using existing open infrastructure

In August 2022, the United States Office of Science and Technology Policy (OSTP) issued a memo (PDF) on ensuring free, immediate, and equitable access to federally funded research (a.k.a. the “Nelson memo”). Crossref is particularly interested in, and relevant to, the areas of this guidance that cover metadata and persistent identifiers, as well as the infrastructure and services that make them useful.

Funding bodies worldwide are increasingly involved in research infrastructure for dissemination and discovery. While this post does respond to the OSTP guidelines point-by-point, the information here applies to all funding bodies in all countries. It will be equally useful for publishers and other systems that operate in the scholarly research ecosystem.

In response to calls from our community for more specifics, this post:

  1. Provides an overview of the specific ways that Crossref (along with organisations and initiatives like DataCite, ORCID, and ROR) helps U.S. federal agencies—and indeed any other funder—meet critical aspects of the recommendations.
  2. Restates our intent to collaborate with all stakeholders in the scholarly research ecosystem, including the OSTP, US federal agencies, and our existing funder, publisher, and university members, to support the recommendations as plans develop.
  3. References the work and adoption of Crossref Grant DOIs, including analyses of existing metadata matching funding to outputs.
  4. Highlights that the memo aligns with our longstanding mission to capture and maintain the scholarly record and with our vision of the Research Nexus, as described in our current blog series on our role in preserving the integrity of the scholarly record (ISR).

Infrastructure already exists to support funder goals; it just needs more adoption

Ensuring free, immediate, and equitable access to metadata that captures the scholarly record is essential not only to meeting the aims of the memo but also to supporting Open Science globally.

In September, Crossref, ORCID, DataCite, and ROR participated in the 2022 Forum on Global Grants Management run by Altum. The summary provides a good example of the importance of open infrastructure and open metadata to the goals of Open Science:

Open Science begins with open infrastructure: Attendees agreed that Open Science relies on many other ‘opens’ – most notably, open metadata, open infrastructure, and open governance. Metadata and DOIs (digital object identifiers) for publications, grants, and research outputs, are essential to illuminate the connections that exist between funding and outcomes. That metadata runs on infrastructure powered by organizations such as Crossref, ORCID, ROR, and DataCite.

As a foundational scholarly infrastructure committed to meeting the Principles of Open Scholarly Infrastructure (POSI) on governance, insurance, and sustainability, Crossref plays an essential role in implementing and supporting key aspects of the guidance. For many years, we have been focused on the integrity of the scholarly record (ISR) and on the shared vision of what we call the Research Nexus, which we describe as:

A rich and reusable open network of relationships connecting research organisations, people, things, and actions; a scholarly record that the global community can build on forever, for the benefit of society.

Metadata—including persistent identifiers and relationships between different research objects—is the foundation of the Research Nexus and is critical to openly and sustainably fulfilling the OSTP memo’s recommendations.

This topic of open metadata and identifiers isn’t just an issue for research resulting from US federal funding. We are working to implement open scholarly infrastructure globally, bringing significant benefits to the whole scholarly research ecosystem.

The current situation brings to mind the William Gibson quote, “The future is already here - it’s just not evenly distributed yet”. Much of the open infrastructure to support the identifier, metadata and reporting requirements of the OSTP memo already exists, but it is unevenly implemented. Increased collaboration and effort will be needed to bring this all to fruition.

We set out below some steps that all stakeholders can take to meet not just the OSTP guidelines, but Open Science goals more broadly, and globally.

What does ‘adoption’ look like? How exactly do funders and other stakeholders work with this infrastructure?

The OSTP memo calls for specific actions concerning metadata and identifiers where, fortunately, open and global solutions already exist.

For example, item 4 a) says, “Collect and make publicly available appropriate metadata associated with scholarly publications and data resulting from federally funded research.” Crossref and DataCite make metadata, including persistent identifiers (DOIs to be specific), openly available for a broad range of research objects from publications to data. Item 4 c) reads, “Assign unique digital persistent identifiers to all scientific research and development awards and intramural research protocols”. Again, federal agencies and other funders are already joining Crossref to register awards and grants and distribute these records openly. However, this is an example of uneven adoption: only a few funders register awards and grants with DOIs so far, and that number needs to increase.

Here is an ideal workflow that funders and publishers can already follow

  1. Funders join Crossref to register grants and awards (or indeed any other object such as reports). They apply on our website, accept our terms, and provide key information such as contact details. The annual membership fee ranges from $200 to $1,200 USD.
  2. Funders and publishers collect ROR IDs and authenticated ORCID iDs for all authors/awardees and their affiliations.
  3. Funders register a Crossref DOI for the award/grant, including awardees’ ORCID iDs and ROR IDs. They send us XML information about the grant (note that we will imminently release an online form to make it easier for the less technical funders). Many funder members register the metadata through a third party, such as Altum (if they use ProposalCentral) or Europe PMC.
  4. At the same time, funders update the awardees’ ORCID record directly with the Crossref Grant DOI and metadata.
  5. Grantees produce research objects and outputs such as data, protocols, code, preprints, articles, conference papers, book chapters, etc.
  6. Publisher or repository members register these objects with Crossref or DataCite and create DOIs for them. The records include ORCID iDs, Crossref Grant DOIs (gathered from the author), ROR IDs for the affiliations of all contributors, and other key metadata such as licensing information and, in the case of publications, references and abstracts. Note that the publisher works its magic (actually, publishers do a lot of editorial and production work, such as including data citations in the references using DataCite DOIs for the data in data repositories).
  7. On the Crossref side, we do a bunch of processing and matching and are planning to refine this and do more. Sometimes relationships are added as we are notified of them, such as data citations, preprints related to articles, or funding acknowledgements converted from free text to Open Funder Registry IDs and names.
  8. Grant records with Crossref DOIs are now part of the scholarly record. All stakeholders may retrieve the open metadata and relationships through our public APIs. Crossref and DataCite will always provide open metadata, as safeguarded by our respective commitments to POSI.
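
As a rough illustration of the final step, the sketch below shows how a stakeholder might retrieve one registered record from the Crossref REST API and pull out the identifiers that connect it to people and funders. The `/works/{doi}` route and the `message`, `author`, `ORCID`, and `funder` fields follow the public API; the helper names, the contact address passed as `mailto`, and the sample record (a made-up record on the 10.5555 test prefix with a made-up funder and award number) are assumptions for this example only.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

CROSSREF_API = "https://api.crossref.org"


def work_url(doi: str) -> str:
    """Build the REST API URL for a single registered record (article, grant, ...)."""
    return f"{CROSSREF_API}/works/{quote(doi, safe='/')}"


def fetch_json(url: str, mailto: str) -> dict:
    """Fetch a URL and decode the JSON body. Not called in this sketch.

    Sending a contact address in the User-Agent is the documented way to
    identify yourself to the API; the agent name here is hypothetical.
    """
    agent = f"funder-metadata-sketch/0.1 (mailto:{mailto})"
    with urlopen(Request(url, headers={"User-Agent": agent}), timeout=30) as resp:
        return json.load(resp)


def summarize_work(response: dict) -> dict:
    """Extract the identifiers that link a record to people and funders."""
    msg = response.get("message", {})
    return {
        "doi": msg.get("DOI"),
        "type": msg.get("type"),
        "orcids": [a["ORCID"] for a in msg.get("author", []) if "ORCID" in a],
        "funders": [
            (f.get("name"), f.get("DOI"), f.get("award", []))
            for f in msg.get("funder", [])
        ],
    }


# Illustrative response only: every value below is made up for the example.
SAMPLE_RESPONSE = {
    "status": "ok",
    "message-type": "work",
    "message": {
        "DOI": "10.5555/12345678",
        "type": "journal-article",
        "author": [
            {"given": "Josiah", "family": "Carberry",
             "ORCID": "https://orcid.org/0000-0002-1825-0097"},
            {"given": "A.", "family": "Coauthor"},
        ],
        "funder": [
            {"name": "Example Funder", "DOI": "10.13039/000000000",
             "award": ["ABC-1234567"]},
        ],
    },
}

summary = summarize_work(SAMPLE_RESPONSE)
```

Against the live API, the same summary could be produced with `summarize_work(fetch_json(work_url(doi), mailto="you@example.org"))`.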

Anyone can use the open metadata registered with Crossref, DataCite and ORCID because connections have been established between (ideally all) research objects and entities through open metadata and identifiers. This means that:

  • Funding agencies can monitor compliance with their policies
  • Publishers can identify the funder and meet their requirements
  • Funding agencies can assess and report on the reach and return of their funding programs
  • The provenance and integrity of the scholarly record are preserved and discoverable, benefitting all stakeholders.
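
To make the monitoring point concrete, here is a minimal sketch of how a funder might list works that acknowledge it and measure how complete the linking metadata is. It assumes the Crossref REST API's `funder` and `from-pub-date` filters and the `author`, `license`, and `funder` fields of each returned item; the function names and the three coverage measures are choices made for this example, not an official compliance metric.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def funded_works_url(funder_id, from_pub_date, rows=100, mailto=None):
    """Build a query for works that name a given funder, from a given date."""
    params = {
        "filter": f"funder:{funder_id},from-pub-date:{from_pub_date}",
        "rows": rows,
    }
    if mailto:
        params["mailto"] = mailto  # contact address, recommended by the API
    # Keep ':', ',', '/' and '@' unescaped so the URL stays readable.
    return f"{CROSSREF_WORKS}?{urlencode(params, safe=':,/@')}"


def metadata_coverage(items):
    """Share of works carrying an ORCID iD, a license, and an award number.

    `items` is the list found under response["message"]["items"].
    """
    total = len(items)

    def share(predicate):
        if total == 0:
            return 0.0
        return round(sum(1 for item in items if predicate(item)) / total, 2)

    return {
        "works": total,
        "with_orcid": share(
            lambda w: any("ORCID" in a for a in w.get("author", []))),
        "with_license": share(lambda w: bool(w.get("license"))),
        "with_award": share(
            lambda w: any(f.get("award") for f in w.get("funder", []))),
    }


# 10.13039/100000001 is the Open Funder Registry ID of the US National
# Science Foundation; the start date is an arbitrary example.
example_url = funded_works_url("10.13039/100000001", "2023-01-01")
```

A low `with_award` or `with_orcid` share would point to where publisher workflows are not yet passing identifiers through, which is the uneven adoption this post describes.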

Suggestions for meeting OSTP and Open Science guidance, point by point

OSTP Recommendation   |   Publishers should…   |   Funding agencies should…
4 a) Collect and make publicly available appropriate metadata associated with scholarly publications and data resulting from federally funded research
  • For scholarly publications: register comprehensive metadata & DOIs with Crossref.
  • For scholarly data: register comprehensive metadata and DOIs with DataCite.
  • Use Crossref’s API to retrieve publication and other metadata.
  • Use DataCite’s API to retrieve data/repository metadata.
i) all author and co-author names, affiliations, and sources of funding, referencing digital persistent identifiers, as appropriate;
  • Collect and validate the following from authors at manuscript submission: ROR IDs, ORCID iDs, and Crossref Grant DOIs.
  • Include data citations in reference lists, preferably with DataCite DOIs.
  • Register awards and grants with Crossref and create DOI records for them.
  • Use ORCID’s API to retrieve validated contributor metadata.
  • Update contributors’ ORCID records with Crossref Grant DOIs and metadata.
  • Use ROR API to retrieve and verify affiliation metadata.
  • Recommend data citations be included in published outputs.
ii) the date of publication; and,
  • Include acceptance and publication dates in Crossref metadata.
  • Use Crossref’s API to retrieve publication dates.
iii) a unique digital persistent identifier for the research output;
  • For scholarly publications and research outputs: register full metadata & DOIs with Crossref.
  • For scholarly data: register full metadata and DOIs with DataCite.
  • Use Crossref and DataCite APIs to retrieve DOIs for research outputs.
4 b) Instruct federally funded researchers to obtain a digital persistent identifier that meets the common/core standards of a digital persistent identifier service defined in the NSPM-33 Implementation Guidance, include it in published research outputs when available, and provide federal agencies with the metadata associated with all published research outputs they produce, consistent with the law, privacy, and security considerations.
  • Collect ORCID iDs on manuscript submission for all authors.
  • Register Crossref and DataCite DOIs and metadata for research outputs, including data.
  • Recommend that researchers applying for funding obtain an ORCID iD and collect them upon grant application for all applicants.
  • Prepopulate grant applications with CV and publication information from applicants’ ORCID records.
  • ORCID iDs should be included in the grants registered by the agencies with Crossref.
  • Agencies can use our open APIs to retrieve the metadata on publications and data rather than ask researchers to do it, saving time and effort.
4 c) Assign unique digital persistent identifiers to all scientific research and development awards and intramural research protocols that have appropriate metadata linking the funding agency and their awardees through their digital persistent identifiers.
  • Join Crossref to register Crossref Grant DOIs, including ROR IDs and ORCID iDs
  • Ensure grant proposal and assessment systems integrate with Crossref, ROR for affiliations and with ORCID for applicants/awardees.
5 a) coordinate between federal science agencies to enhance efficiency and reduce redundancy in public access plans and policies, including as it relates to digital repository access;
  • Work with agencies to ensure a smooth, automated workflow.
  • Using and supporting existing open scholarly infrastructure and using open identifiers will avoid duplication of effort and make the overall ecosystem more efficient.
5 b) improve awareness of federally funded research results by all potential users and communities;
  • Collect Crossref Grant DOIs from authors and use them to link from publications to grant information.
  • Communicate your Crossref Grant DOIs and open grant metadata widely via human and machine interfaces. Inclusion in the Crossref API will enhance dissemination and discoverability.
  • Update contributors’ ORCID records with Crossref Grant DOIs and metadata
5 c) consider measures to reduce inequities in the publishing of, and access to, federally funded research and data, especially among individuals from underserved backgrounds and those who are early in their careers;
  • Registering grants and sharing metadata through Crossref means the metadata becomes part of the world’s largest open, community-governed metadata exchange and is available to the entire world without restriction.
5 d) develop procedures and practices to reduce the burden on federally funded researchers in complying with public access requirements;
  • Ensure your systems and those you work with make it as easy as possible for authors to provide the necessary metadata and persistent identifiers - work towards as much automation as possible and pulling from other systems rather than asking for data to be re-keyed.
  • Ensure the platforms you work with, such as grant proposal or assessment systems, retrieve and prepopulate ROR IDs, ORCID iDs, and Crossref and DataCite DOIs and associated metadata whenever possible so that the researchers don’t have to manually rekey or reformat data.
5 e) recommend standard consistent benchmarks and metrics to monitor and assess implementation and iterative improvement of public access policies over time;
  • Ensure that platforms and systems integrate with ROR, ORCID, Crossref, and DataCite so that this open metadata can lead to the creation of benchmarks and metrics.
5 f) improve monitoring and encourage compliance with public access policies and plans;
  • Use open infrastructure to help authors easily comply with public access and funder/institution policies. Automate systems as much as possible.
  • Using the open infrastructure, metadata, and identifiers outlined in this post will make monitoring more straightforward and compliance easier for all stakeholders. The community can build services on open infrastructure and metadata.
5 g) coordinate engagement with stakeholders, including but not limited to publishers, libraries, museums, professional societies, researchers, and other interested non-governmental parties on federal agency public access efforts;
  • Work with the global open infrastructure organisations (Crossref, DataCite and ORCID) whose members include funding agencies, societies, publishers, universities, libraries, repositories, museums, NGOs, and many other stakeholders - all looking to improve the efficiency of the research ecosystem.
  • Work with the global open infrastructure organisations (Crossref, DataCite and ORCID) whose members include funding agencies, societies, publishers, universities, libraries, repositories, museums, NGOs, and many other stakeholders - all looking to improve the efficiency of the research ecosystem.
5 h) develop guidance on desirable characteristics of—and best practices for sharing in—online digital publication repositories;
  • Support automated systems that use metadata and identifiers to populate repositories automatically.
  • Collaborate with publishers, Crossref and others to develop automated systems to populate repositories.
5 j) develop strategies to make federally funded publications, data, and other such research outputs and their metadata are findable, accessible, interoperable, and re-useable, to the American public and the scientific community in an equitable and secure manner.
  • Provide and support a range of discovery services based on open infrastructure.
  • Encourage, and develop, discovery services that use the open infrastructure, metadata, and persistent identifiers.
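
Several rows above ask publishers and funders to collect and validate identifiers at submission or application time. The sketch below shows one way a submission system might run a first local check before calling the ORCID, ROR, or Crossref APIs. The ORCID check uses the ISO 7064 MOD 11-2 checksum that ORCID documents for its iDs. The ROR and DOI patterns are shape checks only, written from the published identifier formats as an assumption, and they do not confirm that a record exists. All function names are hypothetical.

```python
import re

# Shape-only patterns (assumptions for this sketch, not registry lookups).
ROR_PATTERN = re.compile(r"^(?:https://ror\.org/)?0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$")
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def orcid_checksum_ok(orcid: str) -> bool:
    """Check the final character of an ORCID iD (ISO 7064 MOD 11-2).

    Accepts the bare form (0000-0002-1825-0097) or the full URL form.
    """
    digits = orcid.rsplit("/", 1)[-1].replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", digits):
        return False
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[15] == expected


def looks_like_ror(ror_id: str) -> bool:
    """True if the string has the shape of a ROR ID (no existence check)."""
    return bool(ROR_PATTERN.match(ror_id.strip().lower()))


def looks_like_doi(doi: str) -> bool:
    """True if the string has the shape of a DOI (no existence check)."""
    return bool(DOI_PATTERN.match(doi.strip()))


def check_submission(orcid: str, ror_id: str, grant_doi: str) -> dict:
    """Run all three local checks and report which identifiers passed."""
    return {
        "orcid": orcid_checksum_ok(orcid),
        "ror": looks_like_ror(ror_id),
        "grant_doi": looks_like_doi(grant_doi),
    }


# 0000-0002-1825-0097 is ORCID's fictitious example researcher; the grant
# DOI is a made-up value on the 10.5555 test prefix.
report = check_submission(
    "0000-0002-1825-0097", "https://ror.org/02twcfp32", "10.5555/grant-0001")
```

Identifiers that pass these checks would still need to be confirmed against the registries themselves, ideally through authenticated ORCID collection rather than free-text entry.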

Everybody needs to play their part

A lot of the work on making the above happen is already underway, and there is widespread adoption of open identifiers and metadata, but as noted above, funders are still early in the adoption journey, and implementation among all stakeholders is patchy.

Critical parts of the infrastructure rely on third-party platforms that supply tools and systems to authors, funders, and publishers - so coordinating the support for the appropriate metadata and identifiers in these systems and tools is very important.

We are emphasising how our existing open scholarly infrastructure systems are helping. But we also know that it’s not all perfect yet. Infrastructure is always evolving, metadata is never complete, refactoring workflows and systems can be costly, and integration can always be smoother. But our existing open infrastructure has already delivered significant benefits, and broader adoption will bring additional benefits to the whole scholarly research and communications ecosystem and help achieve the promise of Open Science in advancing human knowledge.

While working on this coordination and integration, we all try to remember that it should minimise work for researchers, and processes should be as automated as possible.

Collaboration is key to making this all work.

We already work with many funders through our Advisory Group and our 30 funder members, 25 of whom have so far collectively registered around 40,000 Crossref Grant DOIs, retrievable from our open API. Some grants are already matched to resulting outputs, and some funders have recently dug into Crossref metadata to analyse outcomes from their investments. The Dutch Research Council (NWO), for example, presents findings and makes a case for greater emphasis on Crossref funding metadata.
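
For readers who want to check the current number of registered grants themselves, a small sketch follows. It assumes the Crossref REST API's `type:grant` filter, uses `rows=0` to ask for the count without any records, and reads the `total-results` field of the response; the function names are hypothetical, and the sample response holds a placeholder number rather than a real count.

```python
from urllib.parse import urlencode


def grant_count_url(mailto=None):
    """Build a query that returns only the number of registered grants."""
    params = {"filter": "type:grant", "rows": 0}
    if mailto:
        params["mailto"] = mailto  # contact address, recommended by the API
    return "https://api.crossref.org/works?" + urlencode(params, safe=":@")


def total_results(response: dict) -> int:
    """Read the match count from a Crossref list response."""
    return response["message"]["total-results"]


# Placeholder response shape for illustration; 123 is not a real count.
SAMPLE_LIST_RESPONSE = {
    "status": "ok",
    "message-type": "work-list",
    "message": {"total-results": 123, "items": []},
}

count = total_results(SAMPLE_LIST_RESPONSE)
```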

We also work closely with partners Europe PMC and Altum, and we engage in community research and discussion, for example, through the Open Research Funders Group.

We and our fellow infrastructures and open identifier registries, ORCID, DataCite, and ROR, integrate with and support each other operationally and out in the community.

We will continue focusing our resources and efforts on engaging with funders, including US federal agencies responding to the OSTP guidelines, and all stakeholders to support the entire global scholarly research ecosystem.

Everyone has a part to play, and we must all pull together to prioritize this work.

Who’s in?

Please get in touch with Ed, Ginny, or Jennifer (or indeed DataCite or ORCID or ROR) if you’d like to have a discussion about the workflows described here, or just to make sure you’re up to date on the latest developments and opportunities we describe. We look forward to working with all funding agencies to support them as they develop their plans.

Page owner: Ed Pentz   |   Last updated 2022-November-17