Object Service Architecture

Web Annotation Service

Project Summary

Venu Vasudevan and Mark Palmer
Object Services and Consulting, Inc.
 
September 15, 1998
 

Executive Summary

Military intelligence analysis and command and control organizations, as well as industrial and educational institutions, increasingly use the World Wide Web to share information about their "organizational memory" or the situations they monitor.  Today's Web is a global document repository where authors actively publish information and viewers passively read it.  The objective of the OSA/Annotations project is to add to the Web the basic capability for third-party viewers to augment or annotate content supplied by authors, with or without the authors' knowledge.  This can allow closed groups of collaborators or large "communities of interest" to actively add expertise to information posted on the Web, so that the community information base can improve rapidly and incrementally based on the expertise of all its members, not just primary authors.  Our approach is to develop an open Annotation service that can be used in conjunction with standard Web browsers.  The service is invoked by a general-purpose URL interceptor, which installs the side effect (transducer or extension) of searching for document annotations as the document itself is retrieved.  The basic annotation service can be tailored in a number of ways, for instance, to display only annotations from certain groups or of certain types.  Higher-level facilities like pedigree drill-down and filtering, argumentation, or workflow can be built using the OSA/Annotations service as a primitive.

Problem Statement

In the JTF/ATD Web Server project, DARPA is using an idiosyncratic object-oriented "web" to represent and monitor situation description and assessment information about evolving command and control events.  Although the World Wide Web is much less structured at present, information authors use it for much the same purpose: to post information content about topics of interest within private Intranets or to the public Internet.  To date, Webs are populated by authors who are authorized to add content.  Surfers who access the information provided on Web pages are generally passive viewers, with no direct way to add their expertise to the subject matter.  Yet these same viewers have an interest in the subject matter they are accessing and may have additional knowledge (more recent, deeper, or better-analyzed information, corrections, viewpoints, clarifications, ratings, filters) that could be shared across a community to augment published content.  For example, a user of a global map facility could use a general annotation facility to report new roads in a new subdivision or inaccurate addresses; a soldier in the field could report that a situation has changed, for instance, that a bridge is temporarily disabled.  If system applications like DBMS, workflow, or collaborative authoring are open to extensional services like annotations, then soldiers might use publicly available information, annotating it to adapt it to military use, instead of having the military invest in its own tool sets to get the extensional capabilities it needs.

Objectives

The technology objective of the OSA/Annotations project was to develop an open, scalable annotation service that can be tailored to accommodate a range of custom uses.  The Annotation Service allows existing web content to be augmented or filtered using third-party information, which can be much richer in structure and content. Such information adds value by elaborating on, restricting, providing a background or rationale for, branding, and personalizing the web content based on user metadata.
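
One simple form of tailoring is to show only annotations from certain groups or of certain types. The following is a minimal sketch of such a filter, not the project's implementation; the `Annotation` record, its field names, and the `visible` function are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record; field names are illustrative."""
    url: str      # document the annotation is attached to
    group: str    # community or closed group that authored it
    kind: str     # e.g. "correction", "rating", "comment"
    text: str


def visible(annotations: Iterable[Annotation], url: str,
            groups: Optional[Set[str]] = None,
            kinds: Optional[Set[str]] = None) -> List[Annotation]:
    """Return annotations for `url`, optionally restricted by group and type.

    A filter left as None places no restriction on that attribute.
    """
    return [
        a for a in annotations
        if a.url == url
        and (groups is None or a.group in groups)
        and (kinds is None or a.kind in kinds)
    ]


notes = [
    Annotation("http://example.com/map", "field-units", "correction",
               "Bridge is temporarily disabled."),
    Annotation("http://example.com/map", "public", "comment",
               "New roads in the north subdivision."),
    Annotation("http://example.com/other", "field-units", "rating", "Useful."),
]

for note in visible(notes, "http://example.com/map", groups={"field-units"}):
    print(note.kind, "-", note.text)
```

Keeping the filter separate from storage and display means the same annotation base can serve different communities with different views.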

The infrastructure objective was to understand how to add new services to standard web engines in the same way that we can for ORBs.

The technology transfer objective was to complete a prototype toolkit available to the DARPA AITS application community, the Web vendor and user communities, and the Web standards community W3C.

Architecture

The architectural elements of a basic Annotation Service include:

Design Space
  • An open Annotation service involves a rich design space.  A design space describes, for each part of the architecture, the range of alternative design choices that can be supported.  The design space can be viewed as a formal AND/OR graph structure in which AND layers describe required components and OR layers describe tailoring alternatives.  The design space enumerates the applicable alternatives and tradeoffs, can identify the basic architectural structure of the service, can support tailoring of the service, and can be used to generate specifications. We found little prior work describing the design space for Annotations, and wanted to explore alternatives that seemed promising but had not yet been tried. For our purposes, we describe the design space informally as a hierarchy of objectives and requirements.
  • The following is a list of the issues and alternatives we encountered in reviewing prior work and in designing our prototypes:
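
To make the AND/OR view of the design space concrete, here is a small sketch that enumerates the configurations implied by such a graph. It is an illustration under assumptions: the `Node` class, the `configurations` function, and the component names in the example are hypothetical and do not reproduce the project's actual design space.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List


@dataclass
class Node:
    """A node in an AND/OR design-space graph (illustrative only)."""
    name: str
    kind: str = "leaf"                      # "and", "or", or "leaf"
    children: List["Node"] = field(default_factory=list)


def configurations(node: Node) -> List[List[str]]:
    """Enumerate every concrete configuration below a node.

    AND nodes combine one configuration from every child (required
    components); OR nodes pick a configuration from exactly one child
    (tailoring alternatives).
    """
    if node.kind == "leaf":
        return [[node.name]]
    child_configs = [configurations(child) for child in node.children]
    if node.kind == "or":
        return [cfg for configs in child_configs for cfg in configs]
    # AND: take the cross product of the children's configurations.
    return [sum(combo, []) for combo in product(*child_configs)]


# Hypothetical slice of an annotation design space.
design_space = Node("annotation-service", "and", [
    Node("interception", "or", [Node("client-applet"), Node("server-side")]),
    Node("storage", "or", [Node("file-store"), Node("dbms")]),
])

for cfg in configurations(design_space):
    print(cfg)
```

With two required components and two alternatives for each, the sketch yields four configurations; adding an OR alternative widens the tailoring choices without changing the required structure.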

    Limitations of Current Work

    At present, browsers have no general mechanism for third parties to provide value-added information to the Web and make it available to others. The state-of-practice mechanisms available today are limited to sending email to Web page authors to suggest updates or, more publicly, posting in comment boxes if the page author has included them on the page, which is a very limited form of annotation.

    Recent work on Web annotations includes the CERN annotator, Stanford ComMentor, CritLink, CoNote, Strand/GrAnT and the Net Notions product. While the architectures of some of these systems are open in one way or another (for instance, including the idea of public and group annotations), the implementations tend to require modified web infrastructures, some systems are slow, and none has been widely deployed.

    CoNote requires document authors to anticipate places where readers might wish to add commentary or questions by inserting "annotation points" into the original document. CoNote can only annotate documents under its server's control, and does not support annotating arbitrary web content. In HyperNews and CritLink, the request for annotated content is syntactically different from a request for the unannotated document. The HyperNews system is primarily a discussion group tool, but has supported annotation in conjunction with modified browsers (Mosaic and HotJava) in the past. CritLink uses a specialized server called the Mediator. The Mediator retrieves a requested page, prefixes all references in it with its Internet domain, and returns the annotated, modified page to the user. All further navigation from links within the returned pages is forced through the Mediator. This approach requires users to navigate from within the CritLink-processed page instead of using the browser's bookmarks or direct URL entry. Both NCSA's Mosaic project and ComMentor modify web browsers to augment them with annotation capabilities. The NCSA Mosaic browser supported private annotations and group shared annotations in several releases.

    Both Strand/GrAnT and the Net Notions product introduce web annotation functions without modifying web content, browsers or servers. They are similar in philosophy to the systems that we have built. Strand uses a client intermediary for page interception. The Strand authors noted some problems with changing page layout when inserting annotation indicators and functions, and envisioned that much of the GrAnT functionality might be moved into a Java applet, the approach that JotBot explores. Net Notions is a commercial product from Sideware Systems, Inc. that allows users to affix text annotations "over" arbitrary web pages. Net Notions uses a client process that apparently records current URL and position information when annotations are created.
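
The mediator-style rewriting described above, in which every reference on a page is prefixed so that later navigation passes through the intermediary, can be sketched as follows. This is not CritLink's code: `MEDIATOR_PREFIX` is a made-up address, `rewrite_links` is a hypothetical helper, and the regular expression handles only double-quoted `href` attributes.

```python
import re
from urllib.parse import urljoin

# Hypothetical mediator address; not a real CritLink endpoint.
MEDIATOR_PREFIX = "http://mediator.example.org/"

# Matches double-quoted href attributes only (a deliberate simplification).
HREF = re.compile(r'(href\s*=\s*")([^"]*)(")', re.IGNORECASE)


def rewrite_links(html: str, page_url: str) -> str:
    """Prefix every matched href so navigation stays behind the mediator."""
    def replace(match: "re.Match") -> str:
        # Resolve relative links against the page's own URL first.
        absolute = urljoin(page_url, match.group(2))
        return match.group(1) + MEDIATOR_PREFIX + absolute + match.group(3)
    return HREF.sub(replace, html)


page = '<a href="b.html">next</a> <a href="http://other.example/x">x</a>'
print(rewrite_links(page, "http://example.com/a/index.html"))
```

The sketch also shows why this design ties users to the processed page: a bookmark or a typed URL carries no prefix, so it bypasses the mediator and its annotations.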

    Other topics related to annotation include:

    Results

    We have constructed two prototype OSA/Annotations implementations to illustrate these different aspects of an open annotation service. Early releases of both prototypes were demonstrated at the DARPA PI meeting in San Diego in October 1997. A reference model for annotation architectures is presented in a paper to be published at the HICSS '99 conference. The Server-side prototype modifies the Jigsaw web server and maps annotated URLs to a new URL space. The JotBot prototype takes a client-side approach, using a Java applet to retrieve and compose annotations within the browser, which allows the annotation server to be thin.

    The Server-side Prototype supports:

    The JotBot prototype supports:

    We can author annotations without requiring any change to web infrastructure, since the editor and server are independent components. We can also retrieve and present annotations non-invasively, since the Viewer applet is an independent component. The Viewer applet needs only to know when the browser fetches a new URL, and it has an open interface for a browser-dependent interceptor to use. The Survey, Rating, and Messaging functions are innovative in that they adopt features found in other collaboration research for use in an annotations context.
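
The separation between a viewer that only needs URL-fetch notifications and a browser-dependent interceptor that supplies them can be sketched as below. JotBot itself is a Java applet; this Python sketch only illustrates the shape of such an interface, and the names `AnnotationViewer`, `url_fetched`, and `PollingInterceptor` are assumptions rather than JotBot's API.

```python
from typing import Callable, Dict, List, Optional


class AnnotationViewer:
    """Stand-in for a viewer component: it only needs URL-fetch events."""

    def __init__(self, lookup: Callable[[str], List[str]]) -> None:
        self._lookup = lookup              # e.g. a query to an annotation server
        self.displayed: List[str] = []

    def url_fetched(self, url: str) -> None:
        """Open entry point that any browser-specific interceptor may call."""
        self.displayed = self._lookup(url)


class PollingInterceptor:
    """Hypothetical interceptor: reports a URL only when it changes."""

    def __init__(self, viewer: AnnotationViewer) -> None:
        self._viewer = viewer
        self._last_url: Optional[str] = None

    def observe(self, current_url: str) -> None:
        if current_url != self._last_url:
            self._last_url = current_url
            self._viewer.url_fetched(current_url)


# In-memory stand-in for an annotation store.
store: Dict[str, List[str]] = {
    "http://example.com/map": ["New road on the north side"],
}
viewer = AnnotationViewer(lambda url: store.get(url, []))
interceptor = PollingInterceptor(viewer)
interceptor.observe("http://example.com/map")
print(viewer.displayed)
```

Because the viewer depends only on `url_fetched`, a different interceptor (for another browser, or a client intermediary) could be swapped in without touching the viewer.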

    Lessons Learned

    It is a restriction of current browser implementations that they don't publish URL fetches, and getting around that restriction, even to a limited extent, has proved difficult. By introducing a client intermediary we can obtain most of the information we need and modify content sent to the browser arbitrarily without being limited by browser security. We should not need to modify infrastructure to get URL information if the browser allows access to it in a way that supports its security controls. It would be desirable for browsers to provide more open interfaces in the future.
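
As an illustration of how an intermediary that sees both the URL and the page content could modify what reaches the browser, the sketch below appends any annotations found for a URL to the page body. It is a simplified assumption of how such a step might look, not the prototypes' code; `add_annotations` and the `annotations` CSS class are hypothetical names.

```python
import html as html_lib
import re
from typing import Callable, List

BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def add_annotations(url: str, page: str,
                    find_annotations: Callable[[str], List[str]]) -> str:
    """Return the page with any annotations for `url` appended to its body."""
    notes = find_annotations(url)
    if not notes:
        return page                        # nothing to add; pass through untouched
    # Escape annotation text so it cannot inject markup into the page.
    items = "".join(f"<li>{html_lib.escape(note)}</li>" for note in notes)
    block = f'<ul class="annotations">{items}</ul>'
    if BODY_CLOSE.search(page):
        # Insert just before the first closing body tag.
        return BODY_CLOSE.sub(lambda m: block + m.group(0), page, count=1)
    return page + block                    # no closing body tag: append at the end


page = "<html><body><p>Bridge open.</p></body></html>"
print(add_annotations("http://example.com/status", page,
                      lambda url: ["Bridge is temporarily disabled."]))
```

Appending at the end of the body avoids disturbing the original layout, at the cost of separating each annotation from the passage it refers to.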

    During late 1997 and 1998, XML became a hot topic and gained significant momentum as a future web technology that will ultimately complement, but not replace, HTML. XML's support for semantic tagging makes it possible to express annotation objects and links as XML elements. The use of XML could potentially allow most web documents to accept annotation, and facilitate development of more powerful forms of annotation. An XML-based annotation repository could offer a lighter-weight alternative to other data store choices. However, a number of concepts in XML, such as semantic tagging and the ability to create external hyperlinks between documents, could change the architecture of browsers, and therefore the architecture of annotation systems. The XML developments occurred too late in this contract to allow us to redirect our work toward XML. At present, standard browsers do not yet support XML, and the nature of this support remains to be seen. HTML is likely to constitute the vast majority of web documents for some time to come. Given our goal of supporting commonly available technology unobtrusively, we decided to continue to work with existing browser technology.
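
To illustrate what expressing an annotation object and its link as XML elements might look like, the sketch below builds one with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`annotation`, `target`, `author`, `body`, `type`, `href`) are invented for this example; the project did not define an XML vocabulary.

```python
import xml.etree.ElementTree as ET


def annotation_element(target_url: str, author: str,
                       kind: str, text: str) -> ET.Element:
    """Build a hypothetical XML representation of a single annotation."""
    note = ET.Element("annotation", {"type": kind})
    # The link to the annotated document is kept outside that document.
    ET.SubElement(note, "target", {"href": target_url})
    ET.SubElement(note, "author").text = author
    ET.SubElement(note, "body").text = text
    return note


note = annotation_element("http://example.com/map", "field-unit",
                          "correction", "Bridge is temporarily disabled.")
print(ET.tostring(note, encoding="unicode"))
```

Because the target is an external reference, the annotated page itself stays unmodified, which is consistent with the unobtrusive approach taken by the prototypes.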

    During the spring of 1998 we worked closely with a local entrepreneur who wanted to license the annotation technology for use on a future commercial web site, as a means to conduct online focus groups and reviews of business plans. This effort gave us some positive feedback about the usefulness and usability of the tool, as well as insight into the work that would be required to deploy the software for commercial use, such as testing on all browser and server platforms.

    Also notable was the release of the Net Notions product by Sideware Systems, Inc. in 1998. This annotation product was the first to appear commercially, and it embodied a design similar to the novel approach we had taken in developing JotBot, using a client process instead of a Java applet to monitor browser activity and retrieve and present annotations. Although Net Notions was built by a larger software development team, it underscores how short the time from research idea to commercial product has become on today's Internet.

    A general mechanism for annotating web documents can be the basis of a number of new and useful document management applications. We have developed an architecture to unobtrusively annotate web content and two concrete implementations of this architecture. Useful annotation capabilities could be introduced into today's Intranets and Internets without resorting to specialized infrastructure, by placing a simple intermediary function either in the browser (perhaps as an applet, as we have demonstrated) or at the server and controlling access to the intermediary. However, some relatively minor changes in current browser and server technology would make it much easier to extend their functionality. We have identified several desirable changes on each end, and would be gratified to see commercial products that adopt a more open stance toward extensions.

    These conclusions are also summarized in our HICSS '99 conference paper "On Web Annotation: Promises and Pitfalls of Current Web Infrastructure".

    Future Work

    The technology we have implemented is now released.  Directions for further work include

    This research is sponsored by the Defense Advanced Research Projects Agency and managed by the U.S. Army Research Laboratory under contract DAAL01-95-C-0112. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency, U.S. Army Research Laboratory, or the United States Government.

    © Copyright 1997, 1998 Object Services and Consulting, Inc. Permission is granted to copy this document provided this copyright statement is retained in all copies. Disclaimer: OBJS does not warrant the accuracy or completeness of the information in this survey.
     
    Last revised: September 15, 1998.  Send comments to Venu Vasudevan or Mark Palmer.