In 2020 we released our first public data file, something we’ve turned into an annual affair supporting our commitment to the Principles of Open Scholarly Infrastructure (POSI). We’ve just posted the 2022 file, which can now be downloaded via torrent like in years past.
We aim to publish these in the first quarter of each year, though as you may notice, we’re a little behind our intended schedule. The reason for this delay was that we wanted to make critical new metadata fields available, including resource URLs and titles with markup.
Unfortunately, Bryan Vickery has moved on to pastures new. I would like to thank him for his many contributions at Crossref, and we all wish him well.
I’m now pleased to announce that Rachael Lammey will be Crossref’s new Director of Product starting on Monday, May 16th.
Rachael’s skills and experience are perfectly suited for this role. She has been at Crossref since 2012 and has deep knowledge and experience of all things Crossref: our mission; our members; our culture; and our services.
Since we announced last September the launch of a new version of iThenticate, a number of you have upgraded and become familiar with iThenticate v2 and its new and improved features which include:
A faster, more user-friendly and responsive interface
A preprint exclusion filter, giving users the ability to identify content on preprint servers more easily
A new “red flag” feature that signals the detection of hidden text such as text/quotation marks in white font, or suspicious character replacement
A private repository available for browser users, allowing them to compare against their previous submissions to identify duplicate submissions within your organisation
A content portal, helping users check how much of their own published content has been successfully indexed, self-diagnose and fix the content that has failed to be indexed in iThenticate.
A re-cap
We kicked off our Ambassador Program in 2018 after consultation with our members, who told us they wanted greater support and representation in their local regions, time zones, and languages.
We also recognized that our membership has grown and changed dramatically over recent years and that it is likely to continue to do so. We now have over 16,000 members across 140 countries. As we work to understand what’s to come and ensure that we are meeting the needs of such an expansive community, having trusted local contacts we can work closely with is key to ensuring we are more proactive in engaging with new audiences and supporting existing members.
Crossref strives for balance. Different people have always wanted different things from us and, since our founding, we have brought together diverse organizations to have discussions—sometimes contentious—to agree on how to help make scholarly communications better. Being inclusive can mean slow progress, but we’ve been able to advance by being flexible, fair, and forward-thinking.
We have been helped by the fact that Crossref’s founding organizations defined a clear purpose in our original certificate of incorporation, which reads:
“To promote the development and cooperative use of new and innovative technologies to speed and facilitate scientific and other scholarly research.”
As Crossref prepares to turn 20 in January 2020, we have an opportunity to reflect on achievements and highlights from 2018-19 and also to ponder the preceding decades. Change is a constant at Crossref, but the organization has never strayed from its initial defined purpose. Our services and value now extend well beyond persistent identifiers and reference linking, and our connected open infrastructure benefits our 11,000+ members as well as all those involved in scholarly research. This expansion is exactly what was envisioned to meet the goal of “speeding and facilitating” research.
This year’s annual report is different from previous years’; it has been expanded into a ‘fact file’ so that we can invite comments on the path ahead, based on transparent access to data about our membership, activities, and finances. As we were pulling together the charts and tables for this annual report we noticed stark differences in where Crossref is today compared to years past.
The rate of membership growth has accelerated and we now have over 180 new members joining every month, leading to one of the most striking changes we found. The lowest three membership tiers now account for 46% of revenue (up from 25% in 2011) while the highest three tiers account for 36% (down from 56% in 2011).
Today, the typical Crossref member has just a few hundred registered content items.
One way we have been able to accommodate this growth efficiently is by collaborating with sponsors in different countries. Very small members can join via a local sponsor that is able to provide technical, financial, language, and administrative support. More members now join via sponsors, and most of them would otherwise not be able to join at all. While you’d need to be a millionaire by US standards to join directly from Indonesia in our lowest fee tier (calculated using Purchasing Power Parity), the sponsor program, often supported by government investment in science and education, has enabled Indonesian organizations to join Crossref in large numbers, supporting their aim to become one of the fastest-growing nations in open research, and to help that research be discovered.
Crossref has repeatedly stayed ahead of developments in the community
In 2007, when the Similarity Check working group discussions and pilot started, there was disagreement on the board about whether Crossref should provide such a service and whether it was a strategic priority for members. By the end of the pilot, when the decision came to launch a production service, it was seen as essential and a top priority. This conclusion has been borne out in recent research into the value of Crossref; Similarity Check is one of the services of most importance to members.
Adding preprints as a content type was controversial at the time. The board discussed the topic of “duplicative works” for about two years with strong opinions on all sides. The working group delivered a good set of policies and technical specifications and in the July 2015 board meeting there was a majority—but not 100%—agreement on the motion to approve. We implemented preprints as a content type just in time to accommodate the snowballing of preprint servers emerging from existing and new members.
Another example of a former—and current—area of contention is the approach to metadata. When Crossref first launched, there were lengthy discussions about what metadata we should collect. The initial focus was on the minimal set of metadata to enable reference matching in support of reference linking. In the beginning, neither article titles, lists of authors, references, nor abstracts were included in the minimal metadata set. We supported them as optional but most members opted out. However, the huge set of metadata that Crossref collects and disseminates now is seen as essential, providing a lot of value for members in terms of discoverability.
Today, Crossref enables metadata retrieval on a large scale—an average of more than 600 million queries per month—through a variety of interfaces, most notably the REST API (Public, Polite, and Plus versions). The metadata is used by thousands of organizations and services—both commercial and not-for-profit—increasing the discoverability of member content. In fact, members of all stripes have long initiated projects to expand the metadata Crossref is able to collect and disseminate: from facilitating text mining (through license and full-text URLs); to enabling better connections with and evidence of contributions (through Funder IDs, ORCID iDs, and soon CRediT roles and ROR IDs).
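For readers who have not used the REST API, here is a minimal sketch of how a query might be put together. It assumes the public `https://api.crossref.org/works` endpoint with its `query`, `rows`, and `mailto` parameters; supplying a contact address via `mailto` is how requests are routed to the Polite pool. The helper name `build_works_query` and the example address are hypothetical and exist only for illustration. The sketch builds the request URL without sending anything.

```python
from urllib.parse import urlencode

# Assumed base URL of the public Crossref REST API "works" route.
API_BASE = "https://api.crossref.org/works"


def build_works_query(query, rows=5, mailto=None):
    """Build a URL for a free-text metadata search (hypothetical helper).

    Passing `mailto` identifies the caller, which is what places a
    request in the Polite pool rather than the anonymous Public pool.
    """
    params = {"query": query, "rows": rows}
    if mailto:
        params["mailto"] = mailto
    return f"{API_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder contact address; replace with a real one before use.
    print(build_works_query("open scholarly infrastructure",
                            rows=3,
                            mailto="editor@example.org"))
```

The resulting URL can be fetched with any HTTP client, and the response is JSON. The Plus service uses separate credentials and is not covered by this sketch.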
These are all examples of where Crossref has successfully “promoted the cooperative use of new and innovative technologies” and where we are meeting our mission to make scholarly communications a little bit better. As ever, we need to thank our brilliant staff for their unfailing resilience, balance, and diligence, in these times of dynamic change.
Considering the value and future of Crossref
Research is global, and supporting a diverse global community is a challenge. This year, we conducted our first wide-ranging investigation into what people value from Crossref. This involved telephone interviews with over 40 community members as well as an online survey of 600+ respondents.
The results of the value research are referenced throughout the annual report/fact file and are available online publicly. We will be discussing the insights in various forums and posing some questions, such as:
How should Crossref balance the different dynamics in the community?
Are the right members involved in key decisions?
Are the sustainability model we have and the fees we charge fair?
Which initiatives should be top or bottom priorities?
“The Crossref of 2040 could be an even more robust, inclusive, and innovative consortium to create and sustain core infrastructures for sharing, preserving, and evaluating research information.”
But only if Crossref is not:
“held back, and its remit circumscribed, by legacy priorities and forces within the industry that may perceive open data and infrastructure as a threat to their own evolving business interests.”
We welcome this public commentary and encourage others in the community to respond and report what value Crossref offers as community-owned infrastructure, and how they’d like to see the organization evolve.
More than ever, we need to have this discussion with a broad and representative group. So please, read the value research report and the annual report/fact file, and get ready to voice your opinions!