In the scholarly communications environment, the evolution of a journal article can be traced by the relationships it has with its preprints. Those preprint–journal article relationships are an important component of the research nexus. Some of those relationships are provided by Crossref members (including publishers, universities, research groups, funders, etc.) when they deposit metadata with Crossref, but we know that a significant number of them are missing. To fill this gap, we developed a new automated strategy for discovering relationships between preprints and journal articles and applied it to all the preprints in the Crossref database. We made the resulting dataset, containing both publisher-asserted and automatically discovered relationships, publicly available for anyone to analyse.
The second half of 2023 brought with it a couple of big life changes for me: not only did I move to the Netherlands from India, I also started a new and exciting job at Crossref as the newest Community Engagement Manager. In this role, I am a part of the Community Engagement and Communications team, and my key responsibility is to engage with the global community of scholarly editors, publishers, and editorial organisations to develop sustained programs that help editors to leverage rich metadata.
STM, DataCite, and Crossref are pleased to announce an updated joint statement on research data.
In 2012, DataCite and STM drafted an initial joint statement on the linkability and citability of research data. With nearly 10 million data citations tracked, thousands of repositories adopting data citation best practices, thousands of journals adopting data policies, data availability statements and establishing persistent links between articles and datasets, and the introduction of data policies by an increasing number of funders, there has been significant progress since.
Have you attended any of our annual meeting sessions this year? Ah, yes – there were many in this conference-style event. I, like many of my colleagues, attended them all because it is so great to connect with our global community and hear your thoughts on the developments at Crossref and the stories you share.
Let me offer some highlights from the event and a reflection on some emergent themes of the day.
We believe in Persistent Identifiers. We believe in defence in depth. Today we’re excited to announce an upgrade to our data resilience strategy.
Defence in depth means layers of security and resilience, and that means layers of backups. For some years now, our last line of defence has been a reliable, tried-and-tested technology. One that’s been around for a while. Yes, I’m talking about the humble 5¼ inch floppy disk.
This may come as a surprise to some. When things go well, you’re probably never aware of them. In day-to-day use, the only time a typical Crossref user sees a floppy disk is when they click ‘save’ (yes, some journals still require submissions in Microsoft Word).
History
But why?
Let me take you back to the early days of Crossref. The technology scene was different. This data was too important to trust to new and unproven technologies like Zip disks, CD-Rs or USB Thumb Drives. So we started with punched cards.
IBM 5081-style punched card.
Punched cards are reliable and durable as long as you don’t fold, spindle or mutilate them. But even in 2001 we knew that punched cards’ days were numbered. The capacity of 80 characters kept DOIs short. Translating DOIs into EBCDIC made ASCII a challenge, let alone SICIs. We kept a close eye on the nascent Unicode.
Breathing Room
In 2017 the change of DOI display guidelines from http://dx.doi.org to https://doi.org shortened each DOI by 2 characters, buying us some time. But eventually we knew we had to upgrade to something more modern.
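As a quick sanity check of the two-character saving, here is a minimal sketch that compares the lengths of the two resolver prefixes exactly as written above. The variable names are illustrative and not part of any Crossref tooling.

```python
# Resolver prefixes as quoted in the text (variable names are hypothetical).
old_prefix = "http://dx.doi.org"
new_prefix = "https://doi.org"

# Dropping "dx." removes three characters; adding the "s" in "https" adds one.
saved = len(old_prefix) - len(new_prefix)
print(f"Characters saved per DOI link: {saved}")
```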
So we migrated to 5¼ inch floppy disks.
5¼ inch floppy disk in drive.
At 640 KB per disk these were a huge improvement. We could fit around 20,000 DOIs on one floppy. Today we only need around 10,000 floppy disks to store all of our DOIs (not the metadata, just the DOIs). Surprisingly, this takes only about 20 metres of shelf space.
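The figures above can be cross-checked with a little arithmetic. This is a rough sketch that takes the quoted numbers at face value and assumes 1 KB means 1,024 bytes. The constant names are made up for illustration.

```python
# Figures quoted in the text; 1 KB = 1,024 bytes is an assumption.
DISK_CAPACITY_BYTES = 640 * 1024
DOIS_PER_DISK = 20_000
DISK_COUNT = 10_000
SHELF_METRES = 20

# Average space available for each DOI on a single disk.
bytes_per_doi = DISK_CAPACITY_BYTES / DOIS_PER_DISK

# Total DOIs the full collection of disks could hold.
total_dois = DOIS_PER_DISK * DISK_COUNT

# Shelf thickness implied for each disk, in millimetres.
mm_per_disk = SHELF_METRES * 1000 / DISK_COUNT

print(f"{bytes_per_doi:.1f} bytes per DOI")
print(f"{total_dois:,} DOIs across all disks")
print(f"{mm_per_disk:.1f} mm of shelf per disk")
```

Under these assumptions each DOI gets roughly 33 bytes, the collection tops out at about 200 million DOIs, and each disk occupies about 2 mm of shelf.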
Typical work-from-home setup. Getting ready to back up some DOIs!
The move to working from home brought an unexpected benefit. Staff mail floppy disks to each other and keep them in constant rotation, which produces a distributed, fault-tolerant system.
Persistence Means Change
But it can’t last forever. DOI registration shows no sign of slowing down. It’s clear we need a new, compact storage medium. So, after months of research, we’ve invested in new equipment.
Today we announce our migration to 3½ inch floppies.
If it goes to plan, you won’t even notice the change.