Latent OpenURLs in HTML for Resource Autodiscovery, Localization and Personalization

!!!!!!!!!DRAFT, SUBJECT TO CHANGE!!!!!!!!!

Recently, there has been compelling work by thought leaders in the library information community focusing on the possibilities of embedding metadata in HTML web pages using OpenURL (for example, see the obscurely named GCS-PCS list).

Although it has been possible to embed OpenURLs in conventional HTML documents for a while, implementation has been almost nonexistent. For a number of reasons, this situation may be rapidly changing.

  1. A large number of institutions have implemented OpenURL resolvers to manage linking to electronic resources.
  2. An increasing number of free or open-access internet resources need a simple and cost-effective way to provide OpenURL services to readers with access to full-text resources in libraries.
  3. New forms of publishing, such as blogs, syndicated news feeds and collaborative bookmarking environments, need ways to provide localized linking services to libraries.
  4. Barriers to client-side implementations have fallen, as JavaScript-based browser plugins and bookmarking techniques are becoming popular. Institutional agents, such as the rewriting proxy servers that are widely deployed to facilitate web access, could also act to implement localized linking.

What has been missing so far is agreement (or even awareness) among the diverse actors on the best way to implement OpenURL in conventional HTML. Example implementations have been reported by van de Sompel (DLIB) and by Chudnov and Frumkin (working paper). The intent of the current document is to distill the essence of previous proposals into the simplest convention necessary for the majority of applications to make use of an OpenURL embedded in HTML.

Proposal: the Latent OpenURL Convention for HTML

To add a Latent OpenURL to an HTML document, put a NISO 1.0 OpenURL into the "href" attribute of an HTML anchor ("a") tag with class attribute set to "Z3988". Example: a latent or unactivated OpenURL link:

<a class="Z3988" href="?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.issn=1045-4438"></a>

This latent OpenURL is placed directly below this line:

If you are being served by a compliant activating agent, you will see a link; if not, the line above should be empty.

The same link, after (hypothetical) activation:

<a class="Z3988" href="http://library.example.edu/?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.issn=1045-4438">Find at Example Library</a>

This hypothetically activated OpenURL is placed directly below this line:
Find at Example Library
If you are being served by a compliant activating agent, you will see a link different from "Find at Example Library".

To activate Latent OpenURLs in an HTML document, select all anchor tags with class="Z3988". Replace the part of the href attribute before the "?" with the baseurl of the local link server. Replace the content of the anchor tag with the anchor text or button image for the local link server. If the target resolver supports only 0.1 OpenURL, adjust the rest of the URL accordingly.
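
The activation steps above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function name `activate`, its parameters, and the regular-expression approach are assumptions made for the example, and a production agent would more likely use a real HTML parser. Conversion to 0.1 OpenURLs is left out.

```python
import re
from html import escape

# Matches an anchor element, capturing its attribute text and its content.
ANCHOR_RE = re.compile(r'<a\b([^>]*)>(.*?)</a>', re.IGNORECASE | re.DOTALL)
# Match a quoted class or href attribute; group 2 is the attribute value.
CLASS_RE = re.compile(r'''(?:^|\s)class\s*=\s*(["'])(.*?)\1''', re.IGNORECASE)
HREF_RE = re.compile(r'''(?:^|\s)href\s*=\s*(["'])(.*?)\1''', re.IGNORECASE)


def activate(html_text, base_url, link_text):
    """Point every latent OpenURL anchor at a local link server.

    base_url and link_text are hypothetical parameters standing in for the
    local resolver's baseurl and its anchor text.
    """

    def rewrite(match):
        attrs = match.group(1)
        class_match = CLASS_RE.search(attrs)
        href_match = HREF_RE.search(attrs)
        if not class_match or not href_match:
            return match.group(0)  # not a candidate, leave untouched
        # The class attribute may hold several space-separated class names.
        if "Z3988" not in class_match.group(2).split():
            return match.group(0)
        href = href_match.group(2)
        if "?" not in href:
            return match.group(0)  # no query string, nothing to resolve
        # Replace everything before the "?" with the local baseurl.
        query = href.split("?", 1)[1]
        new_attrs = (attrs[:href_match.start(2)] + base_url + "?" + query
                     + attrs[href_match.end(2):])
        # Replace the anchor content with the local link text.
        return "<a" + new_attrs + ">" + escape(link_text) + "</a>"

    return ANCHOR_RE.sub(rewrite, html_text)
```

Called as `activate(page, "http://library.example.edu/", "Find at Example Library")`, this sketch turns the latent example into the hypothetically activated one and leaves anchors without the Z3988 class alone.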

Details

What kind of OpenURL?

Because it is harder for an activating agent to convert a 0.1 OpenURL to 1.0 than it is to convert 1.0 OpenURLs to 0.1, this proposal requires use of the NISO 1.0 version of the OpenURL. Through the use of the NISO standard, embedded OpenURLs can be used with any resolver system. A simple guide to OpenURL 1.0 implementation is HERE.

Activating agents need to adjust the OpenURL for use with 0.1-only resolver systems by omitting "rft." from most referent metadata keys, and should skip over those OpenURLs for which 0.1 OpenURLs are nonexistent (patents, dissertations, Dublin Core links). To simplify this procedure, the embedded 1.0 OpenURLs are restricted to using the "in-line ContextObject format" and "in-line metadata" for the referent. (In addition to transporting metadata in the key-value pairs of a query string, the full 1.0 OpenURL standard allows metadata objects to be transported "by-reference" using a network pointer and "by-value" in an encoded blob; these forms are not to be used in latent OpenURLs.)
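
As a rough sketch of the downgrade described above, the following Python function strips the "rft." prefix from referent keys and skips referent formats that have no 0.1 counterpart. Several things here are assumptions rather than part of the proposal: the function name `to_openurl_01`, the format identifiers for patents, dissertations and Dublin Core (modeled on the journal identifier used in the example), and the decision to drop every key that lacks the "rft." prefix. Keys that need renaming rather than simple prefix removal are not handled.

```python
from urllib.parse import parse_qsl, urlencode

# Referent formats for which no 0.1 OpenURL exists, per the text above.
# These identifier strings are assumptions modeled on the journal example.
UNSUPPORTED_FORMATS = {
    "info:ofi/fmt:kev:mtx:patent",
    "info:ofi/fmt:kev:mtx:dissertation",
    "info:ofi/fmt:kev:mtx:dc",
}


def to_openurl_01(query):
    """Return a 0.1-style query string, or None if the link should be skipped."""
    pairs = parse_qsl(query, keep_blank_values=True)
    if dict(pairs).get("rft_val_fmt") in UNSUPPORTED_FORMATS:
        return None
    converted = []
    for key, value in pairs:
        if key.startswith("rft."):
            # Referent metadata: drop the "rft." prefix.
            converted.append((key[len("rft."):], value))
        # Assumption: keys without the prefix (url_ver, rft_val_fmt, ...)
        # are 1.0 bookkeeping and are simply dropped.
    return urlencode(converted)
```

An activating agent would apply something like this to the part of the href after the "?" before prepending the baseurl of a 0.1-only resolver.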

Empty anchors.

The example above shows an empty anchor tag, and the OpenURL is present with an empty baseurl. In the absence of an activating agent, the link will be completely invisible to the user. The page assumes that user or institutional agents will fill in anchor text to make the link visible; its layout and design should gracefully accommodate additions. Alternatively, the web page might use a default baseurl and anchor text for users without access to activating agents. This has been done for the link to the van de Sompel article cited above, by taking advantage of the fact that webservers ignore query data when serving static pages.

Why CLASS attributes?

The use of CLASS attributes to mark the elements with latent OpenURL links is chosen to permit the use of CSS stylesheets for specialized formatting or processing. For example, this HTML document sets the font of such links and their background color to fuchsia.

In HTML, multiple classes can be associated with an element using a single attribute: class='class1 class2'. To maximize consistency with CSS class formatting, activating agents should allow this syntax.
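
A minimal sketch of the multiple-class check, assuming the agent already has the raw value of the class attribute in hand (the helper name `has_latent_class` is made up for this example):

```python
def has_latent_class(class_value, marker="Z3988"):
    """Return True if the marker is one of the space-separated class names."""
    # split() with no arguments handles runs of spaces, tabs and newlines.
    return marker in class_value.split()
```

Comparing whole tokens rather than searching for a substring keeps a class such as "Z39881" from matching by accident, and the comparison is case-sensitive, so "z3988" is not treated as a marker.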

Alf Eaton has suggested using REL instead of CLASS. As far as I can tell, the lack of coupling with CSS is both the main practical advantage and the main practical disadvantage of using REL, so this remains for discussion.

Why Z3988?

The official designator for the NISO OpenURL standard is Z39.88-2004; we removed the year and punctuation because web browser software does not recognize CSS classes with punctuation in the class names. If activating agents require version information, they can look inside the OpenURL. The "Z" should be capitalized, since browser software seems to distinguish it from the lowercase version.

Forms

This specification does not provide means to embed or activate latent OpenURL information in HTML forms.

Definitions

Activating Agent
Software that processes an HTML page to make OpenURLs active for a user.

Note that, for clarity, our displayed examples have not converted ampersands to "&amp;" as they should.
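
For authors generating pages programmatically, the ampersand conversion can be done with a standard escaping routine. The sketch below uses Python's html.escape; the helper name `latent_anchor` is an assumption made for this example.

```python
from html import escape


def latent_anchor(query):
    """Build an empty latent OpenURL anchor with the query escaped for HTML."""
    # escape() turns "&" into "&amp;" (and also escapes <, > and quotes).
    href = escape("?" + query, quote=True)
    return '<a class="Z3988" href="' + href + '"></a>'
```

For the query "url_ver=Z39.88-2004&rft.issn=1045-4438", this produces an anchor whose href contains "&amp;" between the two key-value pairs.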

Open Issues
  1. Which attribute of anchor element to use for marking?
    1. CLASS (my suggestion)
    2. REL (suggested by Alf Eaton, good arguments both ways)
    3. TITLE (suggested by Ross Singer, used? in experiment.) It seems everyone else is indifferent.
  2. Should embedded 0.1 links be supported? (Decent activating agents will, of course, have to support making both 0.1 and 1.0 links out of the embedded ones.) My view is NO, but Ross reasonably raises the question. Perhaps his points can be addressed by requiring a very simple form of OpenURL 1.0?
  3. Is there need for TYPE or version? I don't understand what the motivation for TYPE is. Versioning seems to be addressed by OpenURL 1.0.