Metadata Plus snapshots provide access, in a single file, to our approximately 97,802,661 metadata records as of a particular point in time. They are available to Metadata Plus service users.
The files are made available via a /snapshots route in the REST API, which offers a compressed .tar file (tar.gz) containing the full extract of the metadata corpus in either JSON or XML format.
New snapshots are created each month and are available by the 5th day, providing all records up to and including the previous month.
If you’re looking for the most up-to-date snapshot (all records up to and including the previous month), you can use the following URLs, which always alias to the current month:
If you want to check whether a particular snapshot is available, you can make an HTTP HEAD request using the following URL patterns:
As snapshots are only available to Metadata Plus users, you need to identify yourself in your request by using an “Authorization” HTTP header with your access token. The example below shows how this should be formatted, with XXX replaced by your token:
Authorization: Bearer XXX
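The availability check and the authorization header can be combined in one request. The following is a minimal sketch using only the Python standard library, not an official client. The function names are hypothetical, and the snapshot URL is passed in by the caller because it is not assumed here.

```python
import urllib.error
import urllib.request


def build_snapshot_check(url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a HEAD request carrying the bearer token."""
    return urllib.request.Request(
        url,
        method="HEAD",
        headers={"Authorization": f"Bearer {token}"},
    )


def snapshot_exists(url: str, token: str) -> bool:
    """Send the HEAD request; True on 200, False on 404, raise otherwise."""
    request = build_snapshot_check(url, token)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        # Any other status (for example an authorization failure) is
        # re-raised so the caller can inspect it.
        raise
```

Because a HEAD request returns headers only, this check does not download any part of the archive.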
The files will be very large (>42GB) so may take a while to download depending on the speed of your internet connection.
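With files of this size it is usually safer to stream the response to disk than to read it into memory. The sketch below shows one way to do that in Python. The helper name `save_stream` and the 1 MiB chunk size are illustrative assumptions, not recommendations from the service.

```python
import shutil
from typing import BinaryIO

DEFAULT_CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice


def save_stream(source: BinaryIO, destination_path: str,
                chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Copy a readable binary stream to a file in fixed-size chunks.

    Returns the number of bytes written. Only one chunk is held in
    memory at a time, regardless of the total size of the stream.
    """
    with open(destination_path, "wb") as out:
        shutil.copyfileobj(source, out, chunk_size)
        return out.tell()
```

The `source` argument can be any binary file-like object, such as the response object returned by `urllib.request.urlopen` for an authorized GET request.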
Snapshots will become available around the 5th day of each calendar month. We are working to get the process down to a few hours.
Snapshots are kept available for the current and previous quarters. Each quarter we remove the files from two quarters previous (e.g. on 1 April the files from the previous Oct/Nov/Dec are removed).
If you get a 404 error when looking for the current month's snapshot, this may be because the archive hasn’t been created for that month yet. Snapshots are usually available by the 5th of each month.
If you’re looking for a month that’s more than 6 months old, it may be that the snapshot has been deleted. On a quarterly basis, we will remove the files from two quarters previous (e.g. on April 1, 2018 the files from Oct/Nov/Dec 2017 will be removed). If you aren’t looking for a particularly new or particularly old archive and you’re still seeing a 404 error, please contact us using the dedicated Plus support email.
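The two common causes of a 404 can be expressed as a simple date check. This is a sketch of the rules as described here, assuming a snapshot is expected by the 5th of its month and that only the current and previous calendar quarters are retained. The function names and return strings are hypothetical.

```python
from datetime import date


def quarter_index(year: int, month: int) -> int:
    """Number calendar quarters consecutively so they can be subtracted."""
    return year * 4 + (month - 1) // 3


def explain_missing_snapshot(year: int, month: int, today: date) -> str:
    """Guess why the snapshot for (year, month) returned a 404 on `today`."""
    # Snapshots are usually available by the 5th of their month.
    if today < date(year, month, 5):
        return "too new"
    # Files from two or more quarters back have been removed.
    age = quarter_index(today.year, today.month) - quarter_index(year, month)
    if age >= 2:
        return "too old"
    # Neither rule applies, so this is a case for Plus support.
    return "unexpected"
```

For example, `explain_missing_snapshot(2017, 11, date(2018, 4, 1))` reports the snapshot as too old, matching the April 1, 2018 example, while the same snapshot requested on March 31, 2018 falls in the previous quarter and should still exist.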
Snapshots are only available to Metadata Plus users. This message means that the system doesn’t recognise you as a Metadata Plus user. If you’re already a Metadata Plus user, make sure you’re using your correct token in the header of your query. If you’re still having problems, contact us using the dedicated Plus user support email.
Snapshot archives are provided at the start of each month. The archive contains all the registered content received by Crossref up until that time. (Really? Yes, all of it.) If you need a snapshot mid-month, you should download and ingest the latest archive and then harvest and ingest the registered content that has changed since then.
To get the registered content that has changed since an archive was created, use OAI PMH Plus or the REST API. For example, if the archive was created on January 31, 2018 then the OAI PMH Plus harvest’s initial URL is
This will harvest journal data. If you are interested in book data then use the “B” set.
If you are interested in series data then use the “S” set.
It is important to use the created date and not the completed date. It takes time to build the archive and so changes will occur during the build. Using the created date ensures those changes are harvested too.
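As an illustration of starting the harvest from the created date, the sketch below builds the query string for an initial harvest request. It assumes the standard OAI-PMH request arguments (`verb`, `metadataPrefix`, `set`, `from`) apply to OAI PMH Plus. The function name is hypothetical, and the metadata prefix is left for the caller to supply because it is not stated here.

```python
from datetime import date
from urllib.parse import urlencode

# Sets named in this document: journals, books, series.
KNOWN_SETS = {"J", "B", "S"}


def harvest_query(created: date, record_set: str, metadata_prefix: str) -> str:
    """Build the query string for harvesting changes since `created`.

    `created` must be the date the archive build started, not the date
    it completed, so that changes made during the build are included.
    """
    if record_set not in KNOWN_SETS:
        raise ValueError(f"unknown set: {record_set!r}")
    params = {
        "verb": "ListRecords",              # assumed standard OAI-PMH verb
        "metadataPrefix": metadata_prefix,  # supplied by the caller
        "set": record_set,
        "from": created.isoformat(),        # YYYY-MM-DD
    }
    return urlencode(params)
```

For an archive created on January 31, 2018, `harvest_query(date(2018, 1, 31), "J", prefix)` yields a query whose `from` value is `2018-01-31`. The result would be appended to the harvest endpoint's base URL after a `?`.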
Please contact our Plus support team with any questions.