How to participate in Similarity Check

When you apply to join the Similarity Check service, at least 90% of your registered articles (across all your journal prefixes) must include Similarity Check full-text URLs in their metadata. Turnitin uses these URLs to index your content into the iThenticate database, which makes you eligible for reduced-rate access to iThenticate through the Similarity Check service.
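As a rough illustration of the 90% threshold, the sketch below computes the share of article records that carry a Similarity Check full-text URL. The record shape and the key name `similarity_check_url` are hypothetical, chosen only for this example; they are not Crossref metadata field names, and this is not how Crossref itself measures coverage.

```python
from typing import Iterable, Mapping, Optional

# Threshold stated above: at least 90% of registered articles.
REQUIRED_COVERAGE = 0.90


def similarity_check_coverage(articles: Iterable[Mapping[str, Optional[str]]]) -> float:
    """Return the fraction of articles that have a non-empty full-text URL.

    Each article is a mapping with the hypothetical key "similarity_check_url".
    An empty input yields 0.0.
    """
    total = 0
    with_url = 0
    for article in articles:
        total += 1
        url = article.get("similarity_check_url")
        if url and url.strip():
            with_url += 1
    return with_url / total if total else 0.0


def meets_threshold(articles: Iterable[Mapping[str, Optional[str]]]) -> bool:
    """True if the coverage reaches the 90% requirement."""
    return similarity_check_coverage(articles) >= REQUIRED_COVERAGE
```

For example, a list of ten article records in which nine have a URL passes the check, while eight out of ten does not.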

The URLs must point directly to your full-text PDF, HTML, or plain text content, and you must continue to include these links in all future deposits. If you aren’t registering any journal articles and instead are registering other content types (such as conference papers), please contact us.

The metadata you deposit with Crossref is available to be searched and retrieved by everyone, and this includes Similarity Check full-text URLs. If your content is paywalled, please make sure that your Similarity Check URLs prompt an authentication step before allowing a user to access full-text content. You’ll also need to ask your hosting provider to whitelist the Turnitin IP range so that the content remains available for Turnitin to index.

Where should Similarity Check URLs point?

These URLs will be used to index your content, so they need to resolve directly to the content itself: the full-text PDF, HTML, or plain text. PDFs in a frame can’t be indexed, and neither can content that’s wrapped in JavaScript. The URL must point directly to the location of the full-text content, and not to the article landing page (even if the content is available via a link on that page). Most members supply the PDF download link.
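One way to spot-check a PDF link is to look at what the server actually returns for it. The sketch below is a heuristic under stated assumptions, not Turnitin’s indexing logic: it takes a `Content-Type` header value and the first bytes of the response body (obtained with any HTTP client) and reports whether the response looks like a PDF served directly. The function names are invented for this example.

```python
def media_type(content_type_header: str) -> str:
    """Return the bare media type, e.g. 'application/pdf', without parameters."""
    return content_type_header.split(";", 1)[0].strip().lower()


def looks_like_direct_pdf(content_type_header: str, first_bytes: bytes) -> bool:
    """Heuristic: the response is a PDF file itself, not a page that wraps one.

    PDF files start with the marker '%PDF-'. A response labelled text/html
    for a supposed PDF link usually means a landing page, a viewer frame,
    or a login page was returned instead of the file.
    """
    return (
        media_type(content_type_header) == "application/pdf"
        and first_bytes.startswith(b"%PDF-")
    )
```

A link that returns `text/html` here is not necessarily wrong, since full-text HTML is also accepted, but an HTML response to what is meant to be a PDF download link is worth inspecting by hand.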

Learn more about how to include these full-text URLs in your new deposits or add them to content that you’ve previously registered.

Whitelisting the Turnitin IP address

Once you’ve added your Similarity Check URLs to your metadata, the Turnitin indexing crawler will index your content. If your content is openly available, the crawler can access and index it without further work on your side.

If your content is protected by authentication, please ask your hosting provider to whitelist the following IP address range and UserAgent so that the crawler can reach it:

IP address range: 199.47.87.132 to 199.47.87.135
UserAgent: TurnitinBot/ContentIngest (http://www.turnitin.com/robot/crawlerinfo.html)
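How the whitelist is configured depends on your hosting platform. As a minimal sketch of the matching rule, assuming the server can see the client IP address and the UserAgent header of each request, the check could look like the following. The function names are invented for this example; only the address range and the UserAgent token come from the values listed above.

```python
import ipaddress

# Values taken from the range and UserAgent listed above.
FIRST_IP = ipaddress.IPv4Address("199.47.87.132")
LAST_IP = ipaddress.IPv4Address("199.47.87.135")
USER_AGENT_TOKEN = "TurnitinBot/ContentIngest"


def allowed_networks():
    """Express the inclusive range as CIDR blocks, for firewalls that need them."""
    return list(ipaddress.summarize_address_range(FIRST_IP, LAST_IP))


def is_turnitin_crawler(remote_ip: str, user_agent: str) -> bool:
    """True if the request comes from the listed range with the listed UserAgent."""
    try:
        ip = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # not a valid IP address string
    if not isinstance(ip, ipaddress.IPv4Address):
        return False  # the listed range is IPv4 only
    return FIRST_IP <= ip <= LAST_IP and USER_AGENT_TOKEN in user_agent
```

Checking both the address and the UserAgent is a design choice in this sketch: the UserAgent header alone is easy to forge, so the IP range is the stronger of the two signals.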

Last Updated: 2020 April 8 by Laura J. Wilkinson