An example metadata query:

01291831,01291831|International Journal of Modern Physics C |Huang|11||287|2000|full_text||

When a query result is returned, the data is presented either in the same pipe-delimited format (the default) or as XML. The CrossRef Resolver is specifically built to perform a fuzzy match on the query input. This happens in several steps:
As a result, not every value in a query is necessarily used to find a match, and the results reflect what was found in the repository for each field. This most often happens with journal titles and author names, because of misspellings.
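To make the field positions concrete, the following minimal Python sketch splits a pipe-delimited journal line into named fields. The field names are assumptions inferred from the example queries and results in this document, not official CrossRef identifiers, and `parse_journal_line` is a hypothetical helper.

```python
# Field names are assumptions inferred from the examples in this document.
JOURNAL_FIELDS = (
    "issn", "title", "author", "volume", "issue",
    "page", "year", "type", "key", "doi",
)


def parse_journal_line(line):
    """Split one pipe-delimited journal query or result into a dict."""
    values = line.rstrip("\n").split("|")
    # Pad short lines so that every field name is present.
    values += [""] * (len(JOURNAL_FIELDS) - len(values))
    record = {name: value.strip() for name, value in zip(JOURNAL_FIELDS, values)}
    # The first field may hold several ISSNs separated by commas.
    record["issn"] = [s for s in record["issn"].split(",") if s]
    return record


query = parse_journal_line(
    "01291831,01291831|International Journal of Modern Physics C "
    "|Huang|11||287|2000|full_text||"
)
print(query["title"], query["volume"], query["page"])
```

Comparing the parsed query with the parsed result field by field is one way to see which values the fuzzy match replaced.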
CrossRef Query Format (Books/Conference Proceedings - 12 fields)

Two more fields are used to query books and conference proceedings, in addition to the 10 used for journals.
Conference proceeding deposits can include information about the conference event, the proceedings publication and the individual conference papers within the proceedings. When searching for conference proceeding DOIs, consider the following:
Book deposit information includes title, volume and edition metadata about the book, as well as title information about the content item (the chapter or section). The book-level metadata may also contain series title information.

Example:

078037293X||10th IEEE International Conference on Fuzzy Systems (Cat No01CH37297) FUZZY-01|Ha|1||332|2001||||
There are two methods for submitting queries: interactively via http://doi.crossref.org, and programmatically using the CrossRef HTTP interface. Each of these methods operates in two modes. The first is a synchronous mode, where you (or an automated system) submit a query and wait for the results. The second is an asynchronous mode, where queries are submitted in a file (formatted as XML or pipe-delimited) and the results are returned later in an email message.

Using the Interactive Synchronous Query Interface

To use the browser interface, log in at http://doi.crossref.org using your CrossRef username and password. To obtain your username and password, contact Chuck Koscher (ckoscher@crossref.org). Once logged in, select the Queries tab and then the Interactive Query Upload tab. Enter your queries one per line (no line breaks within a query) using the XML or pipe-delimited format described above, as shown in figure 1 below. Alternatively, try the Sample Query option as a demonstration. The query input form allows you to select the format of the returned results as either pipe-delimited (default) or XML. The Area option selects between the production holdings repository (Live) and your account's test area. Each member account has a test area that can be used for trial uploads and queries. Each test area is isolated from other test areas.
Figure 1 - Query Input Form

The results from an interactive query in pipe-delimited format are shown in figure 2 below. When using the XML format, the results are sent directly to your browser, which may attempt to display the XML.
Figure 2 - Query Results Page In Pipe'd Format

Note: CrossRef recommends that you limit the number of queries entered in a single operation to no more than 20. This reduces the chance of a timeout occurring and interrupting your request.

Note: CrossRef also provides a simple forms-based interface for submitting single queries using a guest account, available at http://www.crossref.org/guestquery

Using the Interactive Asynchronous Query Upload

The asynchronous query upload feature has the advantage of not being prone to the HTTP connection timeouts that may occur in the synchronous (wait for your results) mode. In addition, it uses CrossRef resources more efficiently and is highly recommended for large query jobs. In this mode, requests containing 100 to 5000 queries can easily be handled.

Once logged in to http://doi.crossref.org, follow the Submissions tab to the Upload tab. The Type selection identifies the kind of operation being performed. Caution: this form is also used to upload metadata deposits to the CrossRef system (Type selection = Metadata). For metadata queries in pipe-delimited or XML format, select a Type of "Query". For DOI queries (DOIs submitted and metadata returned), select a Type of "DOI Query". The Browse button brings up a familiar file selection window, as shown in figure 3 below.
Figure 3 - Uploading an Asynchronous Query

For pipe-formatted queries, the file should contain the queries one per line and needs a header line identifying the return email address, as shown below.

H:email=ckoscher@crossref.org

For XML-formatted queries, the file should follow the query schema.

Using the HTTP Interface

The most common method of submitting queries to CrossRef is to have an automated service interact with the Resolver's HTTP interface. This interface supports both the synchronous and asynchronous modes of operation. When it is used in synchronous mode, we strongly encourage grouping queries into requests containing 10 or more but fewer than 500 individual queries. These limits help balance the load on CrossRef resources.

The HTTP interface supports both GET and POST methods for queries. Synchronous queries are performed using a URL with encoded parameters as follows:

http://doi.crossref.org/servlet/query?usr=<USR>&pwd=<PWD>&qdata=

To place more than one query in the request, include each one in the qdata parameter, separated by '%0A'. Example:

qdata=|%20Natl%20Acad.%20Sci.%20USA|Zhou|94|24|13215|1997

In this example <USR> and <PWD> would be replaced with your account username and password. It is also necessary to URL-encode the data provided in the 'qdata' parameter, because certain characters cannot be passed in a URL without causing problems. The table below lists the characters that must be encoded. For more information visit http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
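The URL construction described above can be sketched in Python using the standard library. This is a minimal illustration, not CrossRef's own client code: `build_query_url` is a hypothetical helper, and leaving the pipe characters unencoded is an assumption based on the qdata example above, which shows them unencoded.

```python
from urllib.parse import quote

BASE_URL = "http://doi.crossref.org/servlet/query"


def build_query_url(user, password, queries):
    """Return a synchronous query URL carrying one or more piped queries."""
    # Joining with a newline yields the '%0A' separator once encoded.
    # safe="|" keeps pipes literal (assumption, following the example above).
    qdata = quote("\n".join(queries), safe="|")
    return f"{BASE_URL}?usr={quote(user)}&pwd={quote(password)}&qdata={qdata}"


url = build_query_url(
    "USR",  # placeholder account name
    "PWD",  # placeholder password
    [
        "|nature|Groll|386||463|1997||KEY2|",
        "|cell|Glickman|94||615|1998||KEY3|",
    ],
)
print(url)
```

The resulting string can be fetched with any HTTP client, for example `urllib.request.urlopen(url)`. Keeping each request between 10 and 500 queries follows the load guidance given above.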
To use the asynchronous interface, you will need to construct an HTTP POST with the encType set to multipart/form-data. The body of the multipart message should be formatted as described above in the section on uploading asynchronous queries, while the remaining parameters expected in the URL are shown in the following table.
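The pipe-format upload body described earlier (a header line naming the return email address, followed by one query per line) can be assembled in a few lines of Python. This is a minimal sketch: `build_batch_file` is a hypothetical helper name, the address shown is a placeholder, and the HTTP parameters of the POST itself are not covered here.

```python
def build_batch_file(email, queries):
    """Return the text of a pipe-format query upload file."""
    for query in queries:
        # Each query must occupy exactly one line.
        if "\n" in query or "\r" in query:
            raise ValueError("a query must not contain a line break")
    lines = [f"H:email={email}"] + list(queries)
    return "\n".join(lines) + "\n"


body = build_batch_file(
    "user@example.org",  # placeholder return address
    [
        "|nature|Groll|386||463|1997||KEY2|",
        "|cell|Glickman|94||615|1998||KEY3|",
    ],
)
print(body)
```

Rejecting embedded line breaks up front avoids a query being silently split into two malformed lines.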
For complete technical documentation please see the help pages at http://doi.crossref.org/doc/userdoc.html. Sample Java code is available at http://doi.crossref.org/doc/samples.zip. For help with Perl and Visual Basic programs, please contact Chuck Koscher (ckoscher@crossref.org).
The Open Channel Interface (OCI)

The Open Channel Interface offers significantly better performance than the normal HTTP interface. It is intended for use when the resolution of a query affects an individual who may be waiting for the results. Users who perform queries as part of back-end processing to populate local link databases should not use the OCI. Back-end systems that perform large volumes of queries whose results can be processed offline should consider the query upload feature, whereby a file of queries (XML or pipe-delimited) is uploaded to the system and processed in a queue, and the results are emailed to the user.

The OCI operates much like a 'telnet' session: the system performing the queries connects to a special port on the CrossRef system, then simply writes queries to the session and reads the results back. Here is a sample session in which 5 pipe-delimited queries were submitted to the OCI (the line numbers have been added here to help describe the activity):

1) [root@cr2 root]# telnet 172.20.1.17 8081
2) Trying 172.20.1.17...
3) Connected to cr1.crossref.org (172.20.1.17).
4) Escape character is '^]'.
5) H:USR=creftest;PWD=******
6) AUTHORIZED
7) |curr opin struct biol|Zwickl|10||242|2000||KEY1|
8) 0959440X|Current Opinion in Structural Biology|Zwickl|10|2|242|2000|full_text|KEY1|10.1016/S0959-440X(00)00075-0
9) |nature|Groll|386||463|1997||KEY2|
10) |cell|Glickman|94||615|1998||KEY3|
11) |trends cell biol|Schwechheimer|11||420|2001||KEY4|
12) |mol cell|Kohler|7||1143|2001||KEY5|
13) 00928674|Cell|GLICKMAN|94|5|615|1998|full_text|KEY3|10.1016/S0092-8674(00)81603-7
14) 00280836,14764679|Nature|Groll|386|6624|463|1997|full_text|KEY2|10.1038/386463a0
15) 09628924|Trends in Cell Biology|Schwechheimer|11|10|420|2001|full_text|KEY4|10.1016/S0962-8924(01)02091-8
16) 10972765|Molecular Cell|KOHLER|7|6|1143|2001|full_text|KEY5|10.1016/S1097-2765(01)00274-X

Lines 7 and 9 through 12 are the queries written by the user to the OCI. Lines 8, 13, 14, 15 and 16 are the query results written back by the OCI. Note that line 8 was returned very quickly, before any more queries could be input. The next 4 queries took a little time to process and were returned in the order in which they completed. Note that the result for the third query (line 10, KEY3) is returned before the result for the second query (line 9, KEY2).

In Java the code would look like this:
The first three lines open a connection to the OCI. Line 4 sends a login username and password, while line 5 reads the status of the login (which would be "AUTHORIZED" if the login succeeds). Lines 1 through 5 only need to be executed once. Lines 6 and 7 are then repeated for each query to be resolved. The string 'qData' would contain the pipe-delimited query, and the results would be read back into the variable 'line'.

Operating the OCI places certain demands on the CrossRef system, so it is only available once a user account has been authorized. If you feel this interface would provide a dramatic improvement to your operation, please contact me to discuss the possibility of using the OCI.
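Because the OCI writes results back in completion order rather than submission order, as the sample session shows, a client needs the key field to pair each result with the query that produced it. The Python sketch below illustrates one way to do this. It assumes the key is the ninth pipe-delimited field, which is inferred from the sample session; `key_of` and `pair_results` are hypothetical helpers.

```python
KEY_INDEX = 8  # assumption: the key is the 9th pipe-delimited field


def key_of(line):
    """Return the key field of a piped query or result, or '' if absent."""
    fields = line.strip().split("|")
    return fields[KEY_INDEX] if len(fields) > KEY_INDEX else ""


def pair_results(queries, results):
    """Pair each query with its result line, or None if no result matched."""
    by_key = {key_of(result): result for result in results}
    return [(query, by_key.get(key_of(query))) for query in queries]


queries = [
    "|nature|Groll|386||463|1997||KEY2|",
    "|cell|Glickman|94||615|1998||KEY3|",
]
# Results listed in the order the sample session returned them.
results = [
    "00928674|Cell|GLICKMAN|94|5|615|1998|full_text|KEY3|10.1016/S0092-8674(00)81603-7",
    "00280836,14764679|Nature|Groll|386|6624|463|1997|full_text|KEY2|10.1038/386463a0",
]
for query, result in pair_results(queries, results):
    print(query, "->", result)
```

This only works if every query in a session carries a unique key, so generating keys on the client side is a sensible precaution.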
Metadata Queries

CrossRef maintains a resolver that accepts metadata, searches for the DOI and optionally redirects the caller to the target of the DOI (via dx.doi.org).

Example:

http://doi.crossref.org/resolve?pid=<USR>:<PWD>&aulast=Maas%20LRM&title=JOURNAL%20OF%20PHYSICAL%20OCEANOGRAPHY&volume=32&issue=3&spage=870&date=2002

<USR> should be replaced with your username and <PWD> with your password.

Additional parameters accepted are:

issn - not recommended
redirect - set to false to return the DOI instead of redirecting to the target URL (default is true)

Current limitations: No parsing is performed on the date field to extract the year from a more complex representation. I was unable to locate a defined format for an extended value to be accepted in this field.

DOI Queries

CrossRef currently supports DOI queries formatted as OpenURL version 0.1 requests. These queries are used to retrieve the basic metadata for a known DOI. This metadata includes journal identifiers (title and/or ISSN), first author, journal enumeration (volume, issue, page) and year.

http://doi.crossref.org/servlet/query?id=10.1006/jmbi.2000.4282&pid=<USR>:<PWD>

Here <USR> and <PWD> would be replaced with your account username and password. Results are always returned in XML format, as shown below.
<?xml version="1.0" encoding="UTF-8" ?>
<doi_batch version="0.3">
<head>
<doi_batch_id>Crossref::Resolver_26-Feb-2004@09:25:12</doi_batch_id>
<timestamp>26-Feb-2004@09:25:12</timestamp>
<depositor>
<name />
<email_address />
</depositor>
<registrant />
</head>
<body>
<doi_record type="full_text" key="">
<doi_data>
<doi>10.1006/jmbi.2000.4282</doi>
<url />
</doi_data>
<journal_article_metadata>
<article>
<author sequence="first">
<given_name />
<surname>Jiang</surname>
</author>
<date type="print">
<year>2001</year>
</date>
<enumeration>
<volume>305</volume>
<issue>3</issue>
<first_page>377</first_page>
</enumeration>
</article>
<journal>
<full_title>Journal of Molecular Biology</full_title>
<issn type="print">00222836</issn>
<issn type="electronic">10898638</issn>
</journal>
</journal_article_metadata>
</doi_record>
</body>
</doi_batch>
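A response of this shape can be reduced to a flat record with Python's standard XML parser. The sketch below is illustrative only: `summarize_doi_response` is a hypothetical helper, the element paths are taken from the sample response above, and the embedded sample is a trimmed copy of that response rather than live output.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample response shown above.
SAMPLE = """<doi_batch version="0.3">
<body>
<doi_record type="full_text" key="">
<doi_data><doi>10.1006/jmbi.2000.4282</doi><url /></doi_data>
<journal_article_metadata>
<article>
<author sequence="first"><given_name /><surname>Jiang</surname></author>
<date type="print"><year>2001</year></date>
<enumeration>
<volume>305</volume><issue>3</issue><first_page>377</first_page>
</enumeration>
</article>
<journal>
<full_title>Journal of Molecular Biology</full_title>
<issn type="print">00222836</issn>
<issn type="electronic">10898638</issn>
</journal>
</journal_article_metadata>
</doi_record>
</body>
</doi_batch>"""


def summarize_doi_response(xml_text):
    """Return one dict of basic metadata per doi_record in the response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter("doi_record"):
        records.append({
            "doi": rec.findtext("doi_data/doi"),
            "surname": rec.findtext(".//author/surname"),
            "year": rec.findtext(".//date/year"),
            "volume": rec.findtext(".//enumeration/volume"),
            "issue": rec.findtext(".//enumeration/issue"),
            "first_page": rec.findtext(".//enumeration/first_page"),
            "journal": rec.findtext(".//journal/full_title"),
            # Map each ISSN to its type attribute (print or electronic).
            "issn": {e.get("type"): e.text for e in rec.iter("issn")},
        })
    return records


records = summarize_doi_response(SAMPLE)
print(records[0]["doi"], records[0]["journal"], records[0]["year"])
```

Elements that are absent come back as None, so callers should not assume that every field is populated.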
Considerations When Querying Journal Titles & ISSN

One of the most powerful features of the CrossRef system is its fuzzy matching process. Consequently, we strongly encourage the use of journal title as an identifier instead of ISSN. Queries that supply only an ISSN tend not to resolve as well as those that supply just the journal title or both title and ISSN. An ISSN is essentially treated as a number, so it can only be matched exactly against its text string representation, and very little normalization can be done. A title, however, is a complex value that can be normalized in several ways, giving the matching function more options as it seeks to locate the proper DOI.

Real Time Queries

The CrossRef system was not intended to support real-time queries, where a metadata search to obtain a DOI is performed the instant a person clicks on a link. However, the synchronous HTTP interface does lend itself to being used in this manner. The primary concern is the response time that the system can provide. Since most users submit batch queries via an automated process, they do not expect all their transactions to complete in one to two seconds. Typically, a request with only one query completes in under 5 seconds (in practice, around 1 second). Requests with 20 to 50 queries often complete in around 30 seconds, while large requests (100-500 queries) can take several minutes. While our system is performing very well, and we are in the process of adding new hardware resources, we cannot assure the service level typically considered acceptable for real-time queries. If you plan to use our system in this manner despite this caution, please let us know.

Batch Query Upload

Writing a program to perform the upload is a fairly simple process (in Java or Perl anyway). A fully functional Java program can be downloaded from http://www.crossref.org/08downloads/doQPost.java.
This program accepts an XML file or text that must conform to the batch query formats. It also accepts a file (anything without a .XML extension) that is a list of XML files to deposit. It is run by issuing:

java doQPost <USR> <PWD> filename

To use it you will need a recent Java runtime and the HTTP Client library. For more complete technical documentation please visit http://doi.crossref.org/doc/userdoc.html.