CrossRef is an independent membership association for building shared technologies. Its mission is to improve access to published scholarship through services that require collective agreement among scholarly and professional publishers. In the five-plus years since CrossRef launched its cross-publisher reference linking service, it has become financially self-sustaining and has achieved critical mass in terms of buy-in from various segments of the information industry. The CrossRef database now includes records for over 37 million content items from 2,800 publishers and societies.
CrossRef launched in early 2000 as a cooperative effort among publishers to enable citation linking in journals using the Digital Object Identifier, or DOI. A CrossRef DOI is an alphanumeric name (for example, “http://dx.doi.org/10.1101/gr.10.12.1841”) for digital content, such as a book, journal article, chapter, image, and so on.
The CrossRef DOI is paired with the object’s electronic address, or URL, in a central DOI directory that is easily updated. The CrossRef DOI is published in place of the URL to prevent link attrition while allowing the content to move as needed. For instance, the publisher may need to migrate content from one production system to another (pre-print to post-print), or content may move from one publisher to another if a journal or the publisher itself changes ownership. In these cases the DOI never changes, which means that all the hyperlinks to that content that have already been published and disseminated still function. Hence, one key insight of the CrossRef DOI system is persistence. The other is “actionability”; like the URL itself, one click on a CrossRef DOI gets users to the location of the content they want.
CrossRef is an official DOI registration agency, appointed by the International DOI Foundation (http://www.doi.org). To date, CrossRef is the most robust implementation of the DOI model. Although CrossRef began with a specific focus on linking journal articles in STM fields, it now covers DOI-based linking of scholarly and professional literature more broadly, including a variety of content types and disciplines.
HOW CROSSREF WORKS
Participating publishers use automated processes to deposit metadata records into the CrossRef metadata database. Each deposited record must include minimal bibliographic information, a CrossRef DOI, and a current URL. For a journal article, the descriptive metadata includes journal title, ISSN, first author, year, volume, issue, and page number. After a metadata record is deposited, CrossRef registers each CrossRef DOI-URL pair in the central directory. When a user clicks on a CrossRef DOI, it is resolved through the central DOI directory.
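The deposit-and-resolve cycle described above can be pictured with a small sketch. It is illustrative only: the `DoiDirectory` class, its method names, the example.org URLs, and the placeholder metadata values are assumptions made for this example, not CrossRef's actual deposit interface or schema.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Minimal journal-article metadata, following the fields listed above."""
    doi: str
    url: str
    journal_title: str
    issn: str
    first_author: str
    year: int
    volume: str
    issue: str
    first_page: str


class DoiDirectory:
    """Toy stand-in for the metadata database plus the central DOI directory."""

    def __init__(self):
        self.metadata = {}   # DOI -> Record
        self.directory = {}  # DOI -> current URL

    def deposit(self, record):
        # Depositing a record registers (or updates) its DOI-URL pair.
        self.metadata[record.doi] = record
        self.directory[record.doi] = record.url

    def update_url(self, doi, new_url):
        # Content moved: only the stored URL changes, never the DOI.
        if doi not in self.directory:
            raise KeyError(f"DOI not registered: {doi}")
        self.directory[doi] = new_url
        self.metadata[doi].url = new_url

    def resolve(self, doi):
        # A click on a DOI is answered with the current URL.
        return self.directory[doi]


crossref = DoiDirectory()
crossref.deposit(Record(
    doi="10.1101/gr.10.12.1841",                       # DOI from the example above
    url="http://old-host.example.org/article/1841",    # hypothetical URL
    journal_title="Example Journal", issn="0000-0000", # placeholder metadata
    first_author="Author", year=2000,
    volume="10", issue="12", first_page="1841",
))
crossref.update_url("10.1101/gr.10.12.1841",
                    "http://new-host.example.org/article/1841")
print(crossref.resolve("10.1101/gr.10.12.1841"))
```

The point of the sketch is that links published with the DOI keep working after `update_url`, because readers only ever hold the DOI, not the URL.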
When publishers deposit records, they immediately activate links to their content because other publishers, librarians, and intermediaries have automated processes in place to retrieve CrossRef DOIs from CrossRef. In these processes, the retrieving party submits the journal references or bibliographic records to a query process that looks for matches in the CrossRef database and returns DOIs where matches are found. The returned CrossRef DOIs allow a publisher to add persistent outbound hyperlinks to citations or records for items already registered in the CrossRef system.
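As a rough illustration of the query process, the sketch below matches submitted references against registered records and returns a DOI where a match is found. The matching key (journal title, year, volume, first page), the `10.9999` DOIs, and all function names are assumptions chosen for the example; the actual CrossRef matching logic is not described here and is likely more tolerant of incomplete citations.

```python
def normalize(text):
    """Lower-case and drop non-alphanumerics so small formatting
    differences do not block a match."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def make_key(journal_title, year, volume, first_page):
    return (normalize(journal_title), int(year), str(volume), str(first_page))


# Hypothetical registered records: matching key -> DOI.
registered = {
    make_key("Journal of Examples", 2004, "12", "101"): "10.9999/example.0001",
    make_key("Annals of Samples", 2006, "3", "45"): "10.9999/example.0002",
}


def lookup_dois(references):
    """Return one DOI (or None) per reference, in the order submitted."""
    results = []
    for ref in references:
        key = make_key(ref["journal"], ref["year"], ref["volume"], ref["page"])
        results.append(registered.get(key))
    return results


references = [
    {"journal": "journal of examples", "year": "2004", "volume": 12, "page": 101},
    {"journal": "Unregistered Review", "year": "1999", "volume": 1, "page": 1},
]
print(lookup_dois(references))  # ['10.9999/example.0001', None]
```

A reference that returns a DOI can be published as a persistent outbound link; one that returns `None` simply has no registered match yet.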
When metadata and CrossRef DOIs are deposited with CrossRef, publishers must have live response pages in place so that they can accept incoming links at the article level. A minimal response page consists of a full bibliographic citation and a means for gaining access to the full text.
Business models for full-text access remain under publisher control; CrossRef itself is access-model neutral. Most publishers take users to an abstract page as a default and permit authenticated users to go directly to the full text. Many publishers also present unsubscribed users with pay-per-view options. If the full text is available at no charge, as it is for a growing amount of content registered with CrossRef, all users can view it immediately.
As of August 2008, CrossRef includes over 700 dues-paying members, representing over 2,800 publishers and societies. The CrossRef database covers 20,000 journals and 37 million CrossRef DOIs, including several million DOIs for non-journal content such as books, conference proceedings, and components such as images and supplemental information.
The CrossRef system is adding close to 13,000 new CrossRef DOI records every day. Many of these CrossRef DOIs point to back-file content as publishers digitize archival material, in some cases as far back as volume 1, issue 1. Another source of growth is the addition of new content genres. Publishers who have been registering journal content for some time are now registering CrossRef DOIs for other content types, and new members who do not publish journals are joining to register books, proceedings, technical reports, working papers, datasets, and dissertations.
CrossRef’s success today is best measured by its impact on the research experience. DOIs are currently being resolved at a rate of about nine million clicks per month. Roughly three million DOIs per month are retrieved from the system, which gives some indication of the number of CrossRef DOI-based links being implemented. In addition to the 700 publishers who participate as members, there are over 1,600 other participating organizations under contract, including libraries, database publishers, full-text aggregators, software vendors, and journal hosting/linking platforms. These intermediaries query the CrossRef database of DOIs and metadata on a regular basis to facilitate linking through their own products.
Cross-publisher linking offers several advantages to publishers. First and foremost, it brings readers to their publications and their Websites. By allowing readers to connect to their content from outside resources and locations, publishers not only serve their subscribed user base better but also create opportunities for article- or chapter-based sales, whether through document delivery services, hosting intermediaries, or their own pay-per-view mechanisms.
CrossRef provides publishers with both a technology and a business infrastructure for persistent linking. On the business side, the publisher signs one agreement with CrossRef and earns the right to link to all other participating publishers. With a current membership of over 700, more than 200,000 bilateral agreements would otherwise have been needed to enable the same network of connections. Hence, a linking network on this scale could never have taken shape without an organizational infrastructure like CrossRef. In particular, the smaller publishers that CrossRef now includes would most likely not have had the means or opportunity to participate in widespread interlinking under the bilateral approach.
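The figure of more than 200,000 bilateral agreements follows from counting pairs. A short check, assuming exactly 700 members and one agreement for every pair of members (both simplifications made for this example):

```python
def bilateral_agreements(members):
    """Number of distinct pairs among `members` parties: n * (n - 1) / 2."""
    return members * (members - 1) // 2


members = 700
print(bilateral_agreements(members))  # 244650 pairwise agreements
print(members)                        # 700 agreements when each member signs once with a hub
```

Because the pairwise count grows with the square of the membership while the hub count grows linearly, the gap widens as more publishers join.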
On the technology side, CrossRef obviates the problem of broken URL links by use of the CrossRef DOI, as described above. For a publisher with content registered in the CrossRef database, this means that almost 1,600 participating organizations (other publishers, A&I databases, aggregators, and libraries) link automatically to its response pages. For electronic books such as reference works, CrossRef facilitates internal linking of components and references as well. Assigning CrossRef DOIs to book chapters gives publishers a head start both in re-purposing content for derivative works that require a subset or re-ordering of the original components, such as course-packs, and in e-commerce at the chapter level. Publishers also benefit from being part of a collaborative platform for ongoing development of shared technologies, while maintaining control over their own business practices and how their content is accessed. Through inclusion of smaller publishers who might otherwise have been left behind in the move to interlink across information providers, CrossRef extends the network of accessible scholarship online.
CROSSREF AND THE OPENURL
Most researchers access content through the institutions with which they are affiliated. Because DOI assignment is a publisher-regulated process, CrossRef DOIs default to resources designated by the publisher. For the user working in an institutional context, it is not always appropriate to be directed to the publisher’s online version of a research article. For instance, the institution may not subscribe directly to the e-journal but may still be able to offer the user access to the target article through an aggregated database or through print holdings. In addition, the library may wish to provide a range of navigational options beyond what is available at the publisher’s website.
In order for information providers to equip their products for optimal integration with library linking systems, they are being asked to implement the OpenURL. This has caused some confusion among primary and secondary publishers who use the CrossRef/DOI system for cross-publisher links to full text, because of the mistaken perception that the OpenURL and the CrossRef DOI are competing technologies. They are not.
The OpenURL is a mechanism for transporting metadata and identifiers describing a publication for the purpose of context-sensitive linking. A link resolver is a system for linking within an institutional context that can interpret incoming OpenURLs, take the local holdings and access privileges of that institution into account, and display links to appropriate resources. A link resolver allows the library to provide a range of library-configured links and services, including links to the full text, a local catalogue to check print holdings, document delivery or ILL services, databases, search engines, etc.
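To make the idea of "transporting metadata and identifiers" concrete, here is a minimal sketch that assembles an OpenURL for a journal article. It assumes the key/value (KEV) serialization of the ANSI/NISO Z39.88-2004 OpenURL standard; the resolver address, the `build_openurl` helper, and the citation values are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_openurl(resolver_base, doi=None, **citation):
    """Serialize a journal citation as a KEV-style OpenURL query string."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
    ]
    if doi:
        # The DOI travels as an identifier alongside the descriptive metadata.
        params.append(("rft_id", f"info:doi/{doi}"))
    for name, value in citation.items():
        params.append((f"rft.{name}", str(value)))
    return f"{resolver_base}?{urlencode(params)}"


link = build_openurl(
    "http://resolver.example.edu/openurl",   # hypothetical institutional resolver
    doi="10.1101/gr.10.12.1841",
    jtitle="Example Journal",                # placeholder citation values
    volume="10", issue="12", spage="1841",
)
print(link)
print(parse_qs(urlsplit(link).query)["rft_id"])  # ['info:doi/10.1101/gr.10.12.1841']
```

The link resolver at the receiving end parses these fields and decides, from local holdings and access rights, which targets to show the user.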
The CrossRef DOI and the OpenURL work together in several ways. First, the CrossRef DOI directory itself, where link resolution occurs in the CrossRef platform, is OpenURL-enabled. This means that it can recognize a user with access to a local resolver. When such a user clicks on a CrossRef DOI, the CrossRef system redirects that CrossRef DOI back to the user’s local resolver. It also allows the CrossRef DOI to be used as a key to pull metadata out of the CrossRef database, metadata that is needed to create the article-level OpenURL link. As a result, the institutional user clicking on a CrossRef DOI is directed to appropriate resources.
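The redirection described above can be sketched as a single decision at resolution time. How the system recognizes that a user has a local resolver is not covered here, so the sketch simply takes the resolver address as an argument; the directory contents, metadata values, and the `resolve_click` function are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Hypothetical directory and metadata entries, keyed by DOI.
DIRECTORY = {
    "10.1101/gr.10.12.1841": "http://publisher.example.org/article/1841",
}
METADATA = {
    "10.1101/gr.10.12.1841": {"volume": "10", "issue": "12", "spage": "1841"},
}


def resolve_click(doi, local_resolver=None):
    """Return the URL that a click on `doi` should be sent to."""
    publisher_url = DIRECTORY[doi]  # raises KeyError if the DOI is not registered
    if local_resolver is None:
        # No institutional context known: default to the publisher's resource.
        return publisher_url
    # Institutional user: use the DOI as a key to pull metadata, then hand
    # both to the local resolver as an article-level link.
    params = {"rft_id": f"info:doi/{doi}"}
    params.update({f"rft.{k}": v for k, v in METADATA.get(doi, {}).items()})
    return f"{local_resolver}?{urlencode(params)}"


print(resolve_click("10.1101/gr.10.12.1841"))
print(resolve_click("10.1101/gr.10.12.1841",
                    local_resolver="http://resolver.example.edu/openurl"))
```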
By using the CrossRef DOI system to identify their content, publishers in effect make their products OpenURL aware. Since CrossRef DOIs facilitate linking and data management processes for publishers, many publishers are beginning to require that the CrossRef DOI be used as the primary linking mechanism to their full text. Link resolvers can use the CrossRef system to retrieve the CrossRef DOI if that CrossRef DOI is not already available from the source (i.e., citing) document.
ENHANCEMENTS TO LINKING
The CrossRef network has expanded not only in terms of content coverage to different genres and levels of granularity but also in terms of functionality. Two key recent developments are described here.
Cited-by Linking refers to tracking which other publications cite a given publication. In addition to using CrossRef to create outbound links from their references, CrossRef member publishers can now retrieve “cited-by” links, that is, links to other articles that cite their content. This new service is being offered as an optional tool to allow CrossRef members to display cited-by links in the primary content that they publish.
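Conceptually, cited-by links are the inverse of the outbound reference links. A minimal sketch, assuming the system knows which DOIs each registered article cites (the `10.9999` DOIs and the `build_cited_by` function are hypothetical):

```python
from collections import defaultdict

# Hypothetical data: citing DOI -> DOIs matched in its reference list.
references = {
    "10.9999/a": ["10.9999/x", "10.9999/y"],
    "10.9999/b": ["10.9999/x"],
    "10.9999/c": [],
}


def build_cited_by(references):
    """Invert the citing -> cited mapping into cited -> citing."""
    cited_by = defaultdict(list)
    for citing, cited_list in references.items():
        for cited in cited_list:
            cited_by[cited].append(citing)
    return dict(cited_by)


cited_by = build_cited_by(references)
print(cited_by.get("10.9999/x", []))  # ['10.9999/a', '10.9999/b']
```

A publisher displaying the article `10.9999/x` could then list the two citing articles alongside it.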
Cited-by Linking is a natural extension of the CrossRef linking network and will provide a better online reading environment for researchers and scholars. As part of the same functionality, CrossRef also offers a match-alert feature that eliminates the need for users to query CrossRef repeatedly for citations that do not initially return a match. When a query is marked to enable alerts, the CrossRef system automatically sends an email containing the matched results once the relevant content gets registered in CrossRef.
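The match-alert behavior can be modeled as a list of pending queries that is checked whenever new content is registered. The sketch below is an assumption about how such a feature might be structured, not a description of CrossRef's implementation; the class, the citation keys, and the `outbox` list standing in for sent email are all hypothetical.

```python
class AlertingMatcher:
    def __init__(self):
        self.registered = {}  # citation key -> DOI
        self.pending = {}     # citation key -> email addresses awaiting a match
        self.outbox = []      # (email, key, doi) tuples standing in for sent mail

    def query(self, key, alert_email=None):
        """Return the DOI for `key`, or None; remember the query if alerts are on."""
        doi = self.registered.get(key)
        if doi is None and alert_email:
            self.pending.setdefault(key, []).append(alert_email)
        return doi

    def register(self, key, doi):
        """Register new content and notify anyone waiting on this citation."""
        self.registered[key] = doi
        for email in self.pending.pop(key, []):
            self.outbox.append((email, key, doi))


matcher = AlertingMatcher()
key = ("journalofexamples", 2008, "1", "1")
print(matcher.query(key, alert_email="editor@example.org"))  # None: no match yet
matcher.register(key, "10.9999/example.0003")
print(matcher.outbox)
```

The querying party submits the citation once; the later registration, not a repeated query, triggers the notification.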
The CrossRef system also supports the assignment of more than one URL to a single DOI, a concept known as multiple resolution (MR). CrossRef's MR service works by providing an interim page that presents a list of link choices to the end user. Each choice represents a location at which the published journal article, or another type of published content, may be obtained.
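A minimal sketch of multiple resolution, assuming each DOI maps to a list of labeled locations and that the interim page is a simple HTML list. The data, labels, URLs, and the `interim_page` function are hypothetical.

```python
from html import escape

# Hypothetical MR directory: DOI -> list of (label, URL) choices.
multi_directory = {
    "10.9999/example.0001": [
        ("Publisher site", "http://publisher.example.org/article/1"),
        ("Aggregator copy", "http://aggregator.example.com/item/1"),
    ],
}


def interim_page(doi):
    """Render the link choices for `doi` as a small HTML fragment."""
    choices = multi_directory[doi]  # raises KeyError if the DOI is not registered
    items = "\n".join(
        f'  <li><a href="{escape(url, quote=True)}">{escape(label)}</a></li>'
        for label, url in choices
    )
    return f"<h1>{escape(doi)}</h1>\n<ul>\n{items}\n</ul>"


print(interim_page("10.9999/example.0001"))
```

The DOI itself stays a single persistent identifier; only the directory entry behind it carries more than one location.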