Sunday, October 23, 2011

OpenURL COinS: A Convention to Embed Bibliographic Metadata in HTML

http://ocoins.info/

Abstract
COinS (ContextObjects in Spans) is a simple, ad hoc community specification for publishing OpenURL references in HTML.

Introduction

Recently, there has been compelling work in the library information community on the possibilities of embedding citation metadata in HTML web pages using OpenURL (for example, see the GCS-PCS list).
Citations to other works are familiar to any scholar: they ground a work of scholarship in a field of study, put new research into context, and often give credit where credit is due. The essence of citation is to identify the previous work with a set of metadata: author, title, and particulars of publication. The idea behind OpenURL is to provide a web-based mechanism to package and transport this type of citation metadata so that users in libraries can more easily access the cited works. Most typically, OpenURL is used by subscription-based abstracting and indexing databases to provide linking from abstracts to full text in libraries. An institutional subscription profile is used together with a dynamic customization system to target links at a user's OpenURL linking service.
Although the institutional profile method of providing links works very well in many circumstances, there are many situations in which citation metadata embedded in static documents would be very useful. For example, Open Access, public domain, and pay-per-use publishers typically do not have "subscribers" and have difficulty discovering a user's institutional affiliation, which is needed to make an OpenURL. Embedded metadata can be used by client-side software to add links to non-subscription-based content. This method of providing OpenURL links to users by combining embedded metadata with client-side link activation has been called "latent OpenURL".
Embedded citation metadata in web content may be useful in many other ways. It's not hard to imagine specialized indexing and search systems which make use of the embedded information to deliver new types of information retrieval services. "Semantic Web" systems could use embedded metadata to extract knowledge from large collections of documents.
It has been possible to embed OpenURL citation metadata in conventional, static HTML documents for a while, but implementation has been almost nonexistent. For a number of reasons, this situation may be rapidly changing.
  1. A large number of institutions have implemented OpenURL resolvers to manage linking to electronic resources.
  2. An increasing number of free or open-access internet resources need a simple and cost-effective way to provide OpenURL services to readers with access to full-text resources in libraries.
  3. New forms of publishing, such as blogs, syndicated news feeds and collaborative bookmarking environments, need ways to provide localized linking services to libraries.
  4. Barriers to client-side implementations have fallen, as JavaScript-based browser plugins and bookmarking techniques are becoming popular. Institutional agents such as rewriting proxy servers, which are widely deployed to facilitate web access, could also act to implement localized linking.
  5. NISO (the National Information Standards Organization) has approved and published OpenURL 1.0 (formally known as Z39.88-2004) as an international standard. As part of the standard, a citation metadata package called the "ContextObject" was defined.
What has been missing so far is agreement (or even awareness) among the diverse actors on the best way to embed OpenURL citation metadata in conventional HTML. Example implementations have been reported by Van de Sompel (DLIB) and by Chudnov et al. (Ariadne). The intent of the current document is to distill the essence of previous proposals into the simplest convention necessary for the majority of applications to make use of an OpenURL embedded in HTML.

Specification : OpenURL ContextObject in SPAN (COinS)- Embedding Citation Metadata in HTML

The goal is to embed citation metadata into HTML in such a way that processing agents can discover, process and make use of the metadata. Since an important use of this metadata will be to allow processing agents to make OpenURL hyperlinks for users in libraries (latent OpenURL), the method must allow the metadata to be placed anywhere in HTML that a link might appear. In the absence of some metadata-aware agent, the embedded metadata must be invisible to the user and innocuous with respect to HTML markup. To meet these requirements, the span element was selected. The NISO OpenURL ContextObject was selected as the specific metadata package. The resulting specification is named "ContextObject in SPAN" or COinS for short.
To add a COinS to an HTML document, put a NISO 1.0 "ContextObject" into the "title" attribute of an HTML span element with class attribute set to "Z3988". A brief guide to the OpenURL 1.0 ContextObject is available.

Example

OpenURL COinS:
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.issn=1045-4438"></span>
This COinS is placed directly below this line:

A COinS processing agent might use the embedded metadata to place a link here, otherwise, the line above should be empty.
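
The example span above can also be produced programmatically. The following is a minimal sketch in Python, not part of the specification: the function name `build_coins` is hypothetical, and encoding spaces as `%20` (via `quote`) rather than `+` is a conservative assumption. It percent-encodes each value, joins the pairs with `&`, and then escapes the result for use inside an HTML attribute.

```python
from html import escape
from urllib.parse import quote, urlencode


def build_coins(pairs):
    """Return a COinS span for a list of (key, value) ContextObject pairs.

    build_coins is a hypothetical helper name, not defined by the specification.
    """
    # Percent-encode each value and join the pairs with "&" (Key/Encoded-Value form).
    context_object = urlencode(pairs, quote_via=quote, safe="")
    # Escape "&" and quotes so the string is safe inside an HTML attribute.
    title = escape(context_object, quote=True)
    # Emit explicit open and close tags rather than a minimized <span/>.
    return f'<span class="Z3988" title="{title}"></span>'


print(build_coins([
    ("ctx_ver", "Z39.88-2004"),
    ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
    ("rft.issn", "1045-4438"),
]))
```

With these three pairs, the printed span is identical to the one shown in the example.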

Discussion : How to use COinS in HTML

There may be many ways that embedded metadata may be used, but in general, the recommended procedure is as follows:
  1. select all span elements with class 'Z3988'.
  2. for each selected span extract the value of the title attribute.
  3. operate on that value, which is the OpenURL ContextObject, as you wish, but *do not* overwrite the original class and title values of the span element. This allows for different actions to be taken on the same element in a variety of potential scenarios.
A COinS Generator site is available. A discussion of how COinS in HTML should be processed in the latent OpenURL application is here.
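
The three-step procedure above can be sketched with the Python standard library. This is an illustration under stated assumptions rather than a reference processor: the names `CoinsExtractor` and `extract_coins` are hypothetical, and the sketch only reads the spans, so the original class and title values are left untouched.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl


class CoinsExtractor(HTMLParser):
    """Collect the ContextObject of every span whose class list includes Z3988."""

    def __init__(self):
        super().__init__()
        self.context_objects = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        attributes = dict(attrs)
        # class may hold several space-separated names; the match is case-sensitive.
        class_names = (attributes.get("class") or "").split()
        if "Z3988" in class_names and attributes.get("title"):
            # The parser has already turned "&amp;" back into "&".
            self.context_objects.append(attributes["title"])


def extract_coins(html_text):
    """Return one list of decoded (key, value) pairs per COinS span found."""
    parser = CoinsExtractor()
    parser.feed(html_text)
    parser.close()
    return [parse_qsl(co) for co in parser.context_objects]


page = '<p>See <span class="Z3988 cite" title="ctx_ver=Z39.88-2004&amp;rft.issn=1045-4438"></span></p>'
print(extract_coins(page))
```

Each inner list holds the decoded key-value pairs of one span, ready to be passed to whatever action the processing agent takes.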

Details

Empty SPANs.

The example above shows an empty span tag. In the absence of further processing, nothing will be visible to the user. The page is designed to gracefully accommodate a bit of added text or a button image to anchor an added link. Alternatively, the web page might have default text inside the span for users without access to activating agents. Some HTML checkers (such as HTML Tidy) strip out empty span elements, causing the loss of COinS data. A comment or hard space in the span will prevent this from happening.

Why "Z3988"?

The official designator for the NISO OpenURL standard is Z39.88-2004; the year and punctuation are removed in the present specification. This is because web browser software does not recognize CSS classes with punctuation in the class names. If processing agents require version information, they can look inside the ContextObject. The "Z" MUST be capitalized, because browser software appears to treat the lowercase version as a different class. "OpenURL" as an alternative to "Z3988" was considered, but "Z3988" was considered to be extremely unlikely to be chosen for any other application, and compatibility was judged to trump other considerations.

What is a ContextObject?

During the standardization of OpenURL, a separation was made between the data package, called the ContextObject, and the "transport". In its simplest form, which is used here, the ContextObject is just a series of key-value pairs. When it is joined to an HTTP baseURL and version information is added, a usable OpenURL is created.
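
As a rough illustration of that last step, the sketch below joins a ContextObject string to a resolver baseURL. The function name `make_openurl` and the resolver address are hypothetical, and the use of `url_ver=Z39.88-2004` as the added version information is an assumption about the transport that should be checked against the standard.

```python
def make_openurl(base_url, context_object):
    """Join a resolver baseURL and a ContextObject string into an OpenURL.

    make_openurl is a hypothetical helper name, not defined by the specification.
    """
    # Assumption: url_ver carries the version information added at the transport level.
    version = "url_ver=Z39.88-2004"
    if base_url.endswith(("?", "&")):
        separator = ""
    elif "?" in base_url:
        separator = "&"
    else:
        separator = "?"
    return f"{base_url}{separator}{version}&{context_object}"


# resolver.example.edu is a placeholder, not a real linking service.
print(make_openurl(
    "https://resolver.example.edu/openurl",
    "ctx_ver=Z39.88-2004&rft.issn=1045-4438",
))
```

The ContextObject is passed through unchanged, so the same embedded metadata can be pointed at any user's resolver.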

Choosing the type of ContextObject for Compatibility.

The OpenURL Standard defines the ContextObject with a great deal of flexibility in the ways that entities can be represented. For example, metadata objects can be transported "by-reference" using a network pointer or "by-value" in an encoded blob. This flexibility can introduce complexities for processing agents.
To spare processing agents the complexity of handling multiple OpenURL data formats, this convention STRONGLY RECOMMENDS the following guidelines for ContextObjects in Span:
  1. the Key/Encoded-Value Format only should be used for the ContextObject
  2. the Referent and Referrer should be described using only identifiers and By-Value Metadata
By following these guidelines, the OpenURL metadata packages can be easily adapted for use with ANY resolver system, including those which understand only the older version.
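
A publisher could check a ContextObject against the second guideline before embedding it. The sketch below is one possible approach, not part of the convention: the function name `discouraged_keys` is hypothetical, and the set of key names is an assumption about how the Key/Encoded-Value format spells by-reference metadata and private data for the Referent and Referrer.

```python
from urllib.parse import parse_qsl

# Assumption: these are the KEV keys for by-reference metadata and private data
# on the Referent (rft) and Referrer (rfr); verify them against the standard.
DISCOURAGED_KEYS = {
    "rft_ref", "rft_ref_fmt", "rft_dat",
    "rfr_ref", "rfr_ref_fmt", "rfr_dat",
}


def discouraged_keys(context_object):
    """Return the keys in a KEV ContextObject that fall outside the guidelines."""
    keys = [key for key, _ in parse_qsl(context_object, keep_blank_values=True)]
    return sorted(set(keys) & DISCOURAGED_KEYS)


print(discouraged_keys("ctx_ver=Z39.88-2004&rft.issn=1045-4438"))
```

An empty result means that no by-reference or private-data keys from the assumed set were found.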

XHTML

This specification can also be applied to XHTML. For compatibility with HTML browsers, empty span elements should NOT be minimized. (see the XHTML Compatibility Note C3)

Why the span element?

A draft of this proposal used the HTML Anchor element instead of span. The Anchor element proved to be problematic in certain situations, and the use of span made it easier for processors to leave the ContextObject in place for subsequent processors to use.
Another approach to this problem would have been to use namespaced XML embedded in XHTML, for example <dc:creator>Shakespeare</dc:creator>. The biggest drawback to namespaced XML was uncertainty about being able to access data from JavaScript. Data in namespaced XML IS NOT available to JavaScript in at least one version of Internet Explorer. There was also the concern about strict conformance with HTML (as opposed to XHTML). So using SPAN buys us the prospect that COinS processing can be available in a wider variety of HTML processors, which seemed worthwhile.

Why class and title attributes?

Only a limited number of attributes can be attached to span in valid HTML Documents. ID cannot be used for OpenURL data because it is required to be unique in a document.
The class attribute can contain a space-separated list of class names, so a COinS-laden span element will contain class="Z3988" or perhaps class="Z3988 anotherclass athirdclass etc".

Implementations

In this section we list COinS implementations.

Embedding Sites

  1. Research Blogging hosts a number of blogs about various fields of scholarly research, each of which supports COinS.
  2. the Physical Review Online Archive (PROLA)
  3. Wikipedia
    1. Wikipedia Book Sources Page
    2. Cite this article / article bibliographic details pages
    3. References that use citation templates
  4. MRS Internet Journal of Nitride Semiconductor Research 4000+ page Reference Database Pages
  5. ResearchGATE - ResearchGATE is the leading professional network for scientists.
  6. Citebase - Citebase Search is a search and citation analysis tool for the free, online research literature.
  7. CiteULike - CiteULike is a free service for managing and discovering scholarly references.
  8. Hubmed - An alternative interface to the PubMed medical literature database.
  9. The law journal index Current Law Journal Content has added COinS into its Table of Contents records
  10. Zetoc provides access to the British Library's Electronic Table of Contents of around 20,000 current journals and around 16,000 conference proceedings published per year.
  11. The Copac V3 Experimental Interface now has COinS (ContextObjects in Spans) in the Full Record display.
  12. OCLC's Open WorldCat
  13. The Lunar and Planetary Institute is now using COinS. For example, see the New additions page
  14. The West Midland Bird Club is using COinS.
  15. Open Context is a free, open access resource for the electronic publication of primary field research from archaeology and related disciplines.
  16. Blogs

COinS Processors

  1. OCLC's OpenURL Referrer Firefox Extension adds OpenURL (either version 1.0 or 0.1) links to COinS-enabled pages.
  2. Hundreds of COinS Browser Extensions for Your Library are available, thanks to some cooperation with the OCLC OpenURL Resolver Registry.
  3. Alf Eaton's Greasemonkey script for processing COinS
  4. Citavi is a reference manager which supports COinS through its Firefox and Internet Explorer extensions.
  5. Mendeley is a research management tool for desktop & web.
  6. Virginia Tech's LibX is a Firefox extension that provides direct access to your library's resources. It includes a toolbar and a right-click context menu. It supports searches against the library catalog (OPAC) as well as against an OpenURL linking server (which provides copies of works to which your library has access).
  7. The Center for History and New Media is developing Zotero, which is described as a next-generation research tool.

Other Software support for COinS

  1. VuFind Open Source OPAC system. Here's an example link
  2. simple pyblosxom plugin now renders COinS
  3. Peter Binkley has written a WordPress Plugin to help bloggers use COinS
  4. John Miedema has also written a WordPress Plugin. The plugin lets users show book covers and other data from Open Library in WordPress pages. It now also inserts the COinS HTML so that the bibliographic data for the book can be picked up by apps like Zotero.
  5. VTLS has a COinS script for iPortal.
  6. Eric Lease Morgan's Really Rudimentary Catalog has implemented COinS for items with ISBN numbers. Try an example page.
  7. refbase is a web-based, platform-independent, multi-user interface for managing scientific literature & citations.

Links

  1. NISO OpenURL Standard
  2. Van de Sompel's DLIB article on embedding ContextObjects in HTML.
  3. In a paper in Ariadne Dan Chudnov and co-workers do a good job explaining why COinS (in a previous incarnation) is a Good Thing.
  4. Dan Chudnov has a web page introducing COinS.
  5. The DCMI Bibliographic Citations recommendation is a complementary mechanism for embedding metadata in XHTML documents. The difference is that within an (X)HTML document, the DCMI recommendation would put all the ContextObjects within the <head> element, and is more general than just for Web pages. In contrast, COinS allows you to embed a ContextObject next to a reference within a Web document.
  6. Dan Chudnov is working on a related specification linking COinS and OAI-PMH.
  7. Figoblog has an explanation of COinS en Français.

Notes

Note that, for clarity, our displayed examples have not converted ampersand to "&amp;" as they should.
  • March 22, 2005. first version- Eric Hellman.
  • April 23, 2005. new version based on comments on GCS-PCS list, switched from "CLASS" to "REL". Expanded on 1.0 to 0.1 conversion. Previous version is here.
  • June 4, 2005. New 3rd draft using SPAN instead of A.
  • July 6, 2005. added a bunch of introductory discussion; changed recommendation so that the span element remains after activation.
  • July 21, 2005. new name: COinS (Ross Singer's idea). moved latent-specific discussion to a separate page.
  • July 26, 2005. added style sheet, clarified discussion on "choosing ContextObject format" using suggestion from Herbert vdS.
  • July 29, 2005. Clarified motivation for span, and syntax for class, made capitalized Z a MUST. Fixed other nits noted by Alf Eaton. added links, including citebase. lifted abstract from Dan Chudnov's introducing coins page. Added index
  • August 8, 2005. Added DCMI link.
  • December 12, 2006. Corrected percent encoding in examples.
  • June 16, 2009. This page is intentionally left as a stable version; for any broken links, you may try the Wayback Machine.
