Towards a Web Object Model

Frank Manola
Object Services and Consulting, Inc. (OBJS)
10 February 1998


Today, the World Wide Web is a global information repository of resources primarily consisting of syntactically-structured HTML documents and MIME-typed files. These relatively unstructured data models do not provide the foundation for command and control situation modeling or enterprise computing, or for a new generation of tools to operate on a more semantically-structured, knowledge-based web. Richer base data model(s) are needed that converge the benefits of emerging Web structuring mechanisms and distributed object service architectures.

A number of ongoing activities are attempting to merge aspects of object models with those of the World Wide Web. This paper describes a number of these activities, with particular emphasis on those which focus on providing enhanced facilities for representing metadata for describing Web (and other) resources. The intent of this paper is to:


1. Introduction
1.1 Background
1.2 Capabilities Provided by an Object Service Architecture
1.3 Increasing the Structuring Power of the Web
2. Relevant Work
2.1 Structured Data Representations and "Lightweight Object Models"
2.1.1 Summary Object Interchange Format (SOIF)
2.1.2 Object Exchange Model (OEM)
2.1.3 Knowledge Interchange Format (KIF)
2.1.4 Extensible Markup Language (XML)
2.2 Higher-Level Models and Metadata
2.2.1 Dublin Core
2.2.2 Warwick Framework
2.2.3 PICS and PICS-NG
2.2.4 XML-Data
2.2.5 Meta Content Framework (MCF)
2.2.6 Resource Description Framework (RDF)
2.3 Adding Behavior to Web Pages
2.3.1 Document Object Model (DOM)
2.3.2 Embedded Objects
2.3.3 Web Interface Definition Language
2.4 Related OMG Technologies
2.4.1 OMG Property Service
2.4.2 Tagged Data Facility
3. Building a Web Object Model
3.1 Integration Approach
3.2 Discussion
3.3 Formal Principles
3.3.1 Logic Basis
3.3.2 Representation of Higher Level Semantics
3.3.3 Object Logics
4. Conclusions

1. Introduction

1.1 Background

Many business and governmental organizations are planning or developing enterprise-wide, open distributed computing architectures to support their operational information processing requirements. Such architectures generally employ distributed object middleware technology, such as the Object Management Group's (OMG's) Common Object Request Broker Architecture (CORBA) [OMG95], as a basic infrastructure.

The use of objects in such architectures reflects the fact that advanced software development increasingly involves the use of object technology. This includes the use of object-oriented programming languages, class libraries and application development frameworks, application integration technology such as Microsoft's OLE, as well as distributed object middleware such as CORBA. It also involves the use of object analysis and design methodologies and associated tools.

This use of object technology is driven by a number of factors, including:

The first two factors reflect requirements for business systems to be rapidly and cheaply developed or adapted to reflect changes in the enterprise environment, such as new services, altered internal processes, or altered customer, supplier, or other partner relationships. Object technology provides mechanisms, such as encapsulation and inheritance, that have the potential to support more rapid and flexible software development, higher levels of reuse, and the definition of software artifacts that more directly model enterprise concepts.

The third factor reflects a situation faced by many large organizations, in which a key issue is not just the development of new software, but the coordination of existing software that supports key internal processes and human activities. Mechanisms provided by object technology can help encapsulate existing systems, and unify them into higher-level processes.

The fourth factor is particularly important. It reflects the fact that, as commercial software vendors incorporate object concepts in key products, it will become more and more difficult to avoid using object technology. This is illustrated by the rapid pace at which object technology is being included in software such as DBMSs (including relational DBMSs) and other middleware, and client/server development environments. Due to this factor, organizations may be influenced to begin adopting object technology before they would ordinarily consider doing so.

At the same time, the Internet is becoming an increasingly important factor in planning for enterprise distributed computing environments. For example, companies are providing information via World Wide Web pages, as well as customer access via the Internet to such enterprise computing services as on-line ordering or order/service tracking facilities. Companies are also using Internet technology to create private Intranets, providing access to enterprise data (and, potentially, services) from throughout the enterprise in a way that is convenient and avoids proprietary network technology. Following this trend, software vendors are developing software to allow Web browsers to act as user interfaces to enterprise computing systems, e.g., to act as clients in workflow or general client/server systems. Products have also been developed that link mainframes to Web pages (e.g., translating conventional terminal sessions into HTML pages).

Organizations perceive a number of advantages in using the Web in enterprise computing. For example, Web browser software is widely available for most client platforms, and is cheaper than most alternative client applications. Web pages generally work reasonably well with a variety of browsers, and maintenance is simpler since the browser and associated software can reduce the amount of distributed software to be managed. In addition, the Web provides a representation for information which

However, as organizations have attempted to employ the Web in increasingly-sophisticated applications, these applications have begun to overlap in complexity the sorts of distributed applications for which architectures such as OMG's CORBA, and its surrounding Object Management Architecture (OMA) [OMG97] were originally intended. Since the Web was not originally designed to support such applications, Web application development efforts increasingly run into limitations of the basic Web infrastructure. As a result, numerous efforts are being made to enhance Web capabilities, to enable them to support these more complex applications. In order to understand the missing elements, it is useful to look at the components of OMG's OMA.

1.2 Capabilities Provided by an Object Service Architecture

There is increasing agreement that modeling a distributed system as a distributed collection of interacting objects provides the appropriate framework for use in integrating heterogeneous, autonomous, and distributed (HAD) computing resources. Objects form a natural model for a distributed system because, like objects, distributed components can only communicate with each other using messages addressed to well-defined interfaces, and components are assumed to have their own locally-defined procedures enabling them to respond to messages sent them. Objects accommodate the heterogeneous aspects of such systems because messages sent to distributed components depend only on the component interfaces, not on the internals of the components. Objects accommodate the autonomous aspects of such systems because components may change independently and transparently, provided their interfaces are maintained. These characteristics allow objects to be used both in the development of new components, and for encapsulating access to legacy components. In addition, because object-oriented implementations bundle data with related operations in modular units, the use of objects provides the possibility of fine-grained tuning in the computing architecture by moving or copying objects to appropriate nodes of the network (this is becoming increasingly feasible with the development of technology such as Sun's Java).

OMG's Object Management Architecture (OMA) is an example of a distributed object architecture intended to support distributed enterprise computing applications. The OMA includes the following components:

These components provide multiple levels of capabilities in support of developing complex distributed applications.

The ORB in the OMA is defined by the CORBA specifications. An ORB does not require that the objects it supports be implemented in an object-oriented programming language. The CORBA architecture defines interfaces for connecting code and data to form object implementations, with interfaces defined by IDL, that are managed by the ORB and its supporting object services. It is this flexibility that enables ORBs to be used in connecting legacy systems and data together as components in enterprise computing architectures.

A distributed enterprise object system must provide functionality beyond that of simply delivering messages between objects. OMG's Object Services have been defined to address some of these requirements. Object Services provide the next level of structure above the basic object messaging support provided by CORBA. The services define specific types of objects (or interfaces) and relationships between them in order to support higher-level capabilities. Object Services currently defined by OMG include, among others:

Taken together, OMG Object Services provide services for ORB-accessible objects similar to those that an Object DBMS (ODBMS) provides for objects in an object database (queries, transactions, etc.). The Object Services, together with the basic connectivity provided by the ORB, turn the collection of network-accessible objects into a unified shared object space, accessible by any ORB client application. Managing the collection of ORB-accessible objects thus becomes a generalized form of "object database management", with the ORB being part of the internal implementation of what is effectively an ODBMS. Viewed in this way, the OMA provides a powerful object-oriented infrastructure for the development of general-purpose applications, just as an enterprise database and its associated DBMS provide such an infrastructure for the development of general-purpose enterprise applications. Additional levels of organization are also needed. These additional levels are where OMG's Common Facilities, Application, and Domain Objects, as well as still higher level concepts, come into play [MGHH+97].

If the Web is to be used as the basis of complex enterprise applications, it must provide generic capabilities similar to those provided by the OMA, although these may need to be adapted to the more open, flexible nature of the Web. Providing these capabilities involves addressing not only the provision of higher level services and facilities for the Web, but also the suitability of the basic data structuring capabilities provided by the Web (its "object model"). For example, in the case of services, search engines (a form of query service) are becoming indispensable tools, and agent technology can add additional intelligence to the searching process. Similarly, extended facilities to support transactions over the Web are being investigated. However, the ability to define and apply powerful generic services in the Web, and the ability to generally use the Web to support complex applications, depends crucially on the ability of the Web's underlying data structure to support these complex applications and services.

1.3 Increasing the Structuring Power of the Web

The basic data structure of the Web consists of hyperlinked HTML documents. It is generally recognized that HTML is too simple a data structure to support complex enterprise applications. For example, Jon Bosak's XML, Java, and the Future of the Web [Bos97] identifies a number of key limitations of HTML:

These limitations severely affect the ability to develop advanced applications using HTML, including:

Proprietary HTML extensions have been developed to address some of these problems, but none deals with all of them, and together they create barriers to interoperability. The same is true of the proprietary data formats used by particular applications. Their use requires specialized helper applications, plug-ins, or Java applets, creating interoperability problems, and difficulty in reusing that data in different applications for new purposes. While use of some specialized formats is necessary in particular applications (e.g., multimedia), in many cases these formats are just used to address the deficiencies of HTML for generalized document and data processing.

A more fundamental direction of efforts to address HTML limitations has been attempts to integrate aspects of object technology with the basic infrastructure of the Web. There are a number of reasons for the interest in integrating Web and object technologies:

Such efforts all contribute toward giving the Web a richer structural base, capable of directly supporting a wider variety of activities, in more flexible and extensible ways. However, up until recently these efforts have still been based on HTML, with its basic structuring limitations, and have generally been pursued as separate, non-integrated activities. There is much other ongoing work within both the Web and database communities on data structure developments to address Web-related enhancements. Work on similar issues is ongoing within the Object Management Group as well (see Section 2.4). This work has contributed valuable ideas, and the various proposals illustrate similar basic concepts, generally, movement toward some form of simple object model. However, these similarities are often obscured by detailed representational differences, and the work is fragmented and lacks a unifying framework. As a result, individual proposals often lack key capabilities that are in some cases contained in other proposals. Moreover, in many cases these proposals are not well-integrated with key areas of emerging industry consensus on Web data structuring technologies.

If the Internet is to develop to support advanced application requirements, there is a need for both richer individual data structuring mechanisms, and a unifying overall framework which supports heterogeneous representations and extensibility, and provides metalevel concepts for describing and integrating them.

The intent of this paper is to describe how a number of (in some respects) separate "threads" of Web-related development can be combined to form the basis of a Web object model to address these requirements. This combination is based on the observation that the fundamental components of any object model are:

As a result, what is needed to progress toward a Web object model is:

At the same time, the openness of the Web compared to conventional object models needs to be preserved, due to the distinct requirements of the Web environment for openness and scalability.

In the following sections, this paper will:

2. Relevant Work

As noted in the Introduction, there has been much ongoing work on enhancements to address Web limitations in supporting richer data structures, and integrating object technology. For example, the Internet and Web communities have developed both additional representations, and a number of "object models" or data structuring principles, to represent richer data structures. The database community has also developed proposals for "lightweight object models," partly driven by attempts to represent the structure of Web resources. All this work has contributed valuable ideas and, taken as a whole, exhibits important common underlying principles. What is required is that this work be integrated, and the best ideas merged.

The Introduction specifically noted that what is needed to progress toward a Web object model is:

This section describes a number of the key technologies that attempt to address parts of these problems. Several of these technologies will be used as the basis of an approach, described in Section 3, which integrates them to support a Web object model.


The following subsections describing the various technologies are in some cases rather long, and include a great deal of text and specific examples taken from the cited references. The purpose in doing this is to provide enough detail in one place to illustrate key concepts and the roles they might play in supporting a Web object model, and to give the reader a feel for how generalizations of the concepts might be developed. Hence, this report makes no claims of originality for most of this material (and readers should refer to the cited sources for further details). The subsections also include some additional commentary highlighting key points, and establishing "forward references" to later material.

Several of the sections describe ongoing activities of the World Wide Web Consortium (W3C), particularly:

The reader should be aware that in many cases these specifications are works in progress. As a result, some of the details described in this report, as well as the source references, may no longer be completely accurate (or accessible due to changed URLs) by the time the report is read. The latest information on these activities can be obtained through the main W3C Web page or W3C's technical report page.

2.1 Structured Data Representations and "Lightweight Object Models"

The Introduction briefly described HTML's limitations in supporting the data structure requirements of more complex Web applications. HTML was adequate as long as what applications were generally doing was simply displaying pages to users. However, more complex applications require programs to be able to recognize and process parts of Web pages that have specific semantic meaning within the application. In some cases, applications require data that has a well-defined, fixed format (such as an invoice or other form). Even if applications don't require such fully regular structures, they often need the ability to identify specific pieces of a page's contents. For example, a document may not have a fixed number of authors, but it is still important to be able to identify the strings of text that correspond to authors' names. In some cases, these "pieces" would correspond to specific fields in records, such as "author". In other cases, they would correspond to specific relationships (e.g., a "citation" link to a related paper).

These are the same structuring requirements that apply to object state in object models; i.e., an object's state must be structured in such a way that the object methods can find the parts of the state that they need in order to execute properly. As compared with HTML, whose tags are primarily concerned with how the tagged information is to be presented, satisfying this structuring requirement involves some form of semantic markup, i.e., the ability to tag items with names that can be used to identify items based (at least to some extent) on their semantics.

This section describes a number of developments directed at dealing with the problems of providing richer data structuring capabilities for Web data.

2.1.1 Summary Object Interchange Format (SOIF)

Harvest's Summary Object Interchange Format (SOIF) is a syntax for representing and transmitting descriptions of (metadata about) Internet resources such as files, sites, Web pages, etc., as well as other kinds of structured objects (see Internet draft: CIP Index Object Format for SOIF Objects). SOIF is based on a combination of the Internet Anonymous FTP Archives (IAFA) IETF Working Group templates and BibTeX. Each resource description is represented in SOIF as a list of attribute-value pairs (e.g., Company = 'Netscape'). SOIF handles both textual and binary data as values, and, with some minor extensions, multivalued attributes. SOIF also allows bulk transfer of many resource descriptions in a single, efficient stream. A SOIF stream contains one or more SOIF objects, each of which contains the structured content of a resource description. An example SOIF object might be:

@DOCUMENT {
Title{20}: Welcome to Netscape!
Last-Modified{29}: Thu, 16 May 1996 11:45:39 GMT
}
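The role of the braces and counts in the example can be illustrated with a small Python sketch. It is not a parser for the full SOIF specification: it handles only the shape shown above, and it assumes that the number in braces gives the length of the value that follows (counted here in characters for simplicity). The function name parse_soif is hypothetical.

```python
import re

def parse_soif(text):
    """Parse one SOIF-like object of the shape shown in the example above.

    Assumption: the number in braces is the length of the value; it is
    counted in characters here, which only matches a byte count for ASCII.
    """
    header = re.match(r"\s*@(\w+)\s*\{", text)
    if header is None:
        raise ValueError("not a SOIF object")
    template = header.group(1)
    pos = header.end()
    attrs = {}
    # An attribute looks like: Name{length}: value
    attr_re = re.compile(r"\s*([\w-]+)\{(\d+)\}:[ \t]")
    while True:
        m = attr_re.match(text, pos)
        if m is None:
            break
        name, size = m.group(1), int(m.group(2))
        start = m.end()
        # The explicit length lets values contain spaces, newlines or braces.
        attrs[name] = text[start:start + size]
        pos = start + size
    if not re.match(r"\s*\}", text[pos:]):
        raise ValueError("missing closing brace")
    return template, attrs

sample = (
    "@DOCUMENT {\n"
    "Title{20}: Welcome to Netscape!\n"
    "Last-Modified{29}: Thu, 16 May 1996 11:45:39 GMT\n"
    "}"
)
template, attrs = parse_soif(sample)
```

Because each value carries its own length, the reader never has to scan for a delimiter, which is what allows SOIF values to hold arbitrary text or binary data.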

Resource Description Messages (RDM), 24 July 1996, by Darren Hardy (Netscape), is a technical specification of Resource Description Messages (RDM). RDM is used in Netscape's Catalog Server. RDM is a mechanism to discover and retrieve metadata about network-accessible resources, known as Resource Descriptions (RDs). A Resource Description consists of a list of attribute-value pairs (e.g., Author = Darren Hardy, Title = RDM) and is associated with a resource via a URL. Agents can generate RDs automatically (e.g., a WWW robot), or people can write RDs manually (e.g., a librarian or author). Once a repository of Resource Descriptions is assembled, the server can export it via RDM as a programmatic way for WWW agents to discover and retrieve the RDs.

RDM uses Harvest's SOIF format to encode the RDs. The data model that SOIF provides is a flat name space for the attributes, and treats all values as blobs. The RDM schema definition language extends the SOIF data model by providing:

SOIF illustrates a theme that will be repeated in other Web-related structured data representations discussed here: the representation of data as semantically tagged data items (attribute/value pairs), where the tags or attribute names convey something of the meaning of the associated data value. A key advantage of an approach based on individual attribute/value pairs is that, unlike a database-like "typed record" approach, it is arbitrarily extensible in a federated environment like the Web (without a centralized collection of types or schema). Anyone can record any attributes they feel are necessary, without going through the "overhead" of defining a new type (and, in particular, possibly having to define it as a subtype of an existing type), and distributing that type definition throughout a distributed network.

However, while SOIF supports attribute/value pairs, its structuring capabilities are not sufficiently rich to support the full structuring requirements of the Web. For example, it lacks support for nested structures, and cannot support the functionality of HTML, let alone extensions to it. It is also not well integrated with more advanced developments in Web data representation, such as XML, RDF, and DOM, described later.

2.1.2 Object Exchange Model (OEM)

Stanford's Object Exchange Model (OEM) [PGW95, AQMW+96] is a "lightweight object model" developed to act as a general model capable of representing both database and Web data structures. A similar model, developed at the University of Pennsylvania, is described in [BDHS96, BDFS97]. OEM was introduced in TSIMMIS (The Stanford-IBM Manager of Multiple Information Sources) as a self-describing way of representing metadata. OEM was later modified for use in the Lore (Lightweight Object Repository) system. OEM exists in two main variants. In the original (TSIMMIS) version, OEM defines a set of labeled nodes. Each node has an object identifier (oid), a label, a type, and a value (the type defines the type of the value). The types include primitive types such as integer, and set. If the type is set, the value consists of a set of oids of other nodes. This allows aggregate structures to be defined. These structures are shown in the figure below.

original (TSIMMIS) OEM:

+-----+-------+------+-------+
| oid | label | type | value |   type includes "set"
+-----+-------+------+-------+

+-----+-------+------+-----------------+
| oid | label | set  | {oid, oid, ...} |
+-----+-------+------+-----------------+

In the newer (Lore) version of OEM, the structures have been modified so that edges are labeled rather than nodes. In this scheme, a complex object consists of a set of (label,oid) pairs. These effectively represent relationships between the containing object and the target object. That is, a given (label,targetoid) pair contained in object sourceobject represents the relationship

label(sourceobject, targetobject)

This revised structure thus more closely resembles a first order logic (FOL) structuring of data. These structures are shown in the figure below.

new (Lorel) OEM:

atomic object
+-----+------+-------+
| oid | type | value |
+-----+------+-------+

complex object
+-----+---------+-------------------------------------------+
| oid | complex | value = {(label, oid), (label, oid), ...} |
+-----+---------+-------------------------------------------+

Since individual objects do not have labels in this scheme, additional labels are introduced so that top-level objects can also have names.

As an example, a simple structure for information on books in a library might have the following structure in the TSIMMIS OEM:

+----+---------+------+---------------+
| &1 | library | set  | {&2, &5, ...} |
+----+---------+------+---------------+

+----+------+------+----------+
| &2 | book | set  | {&3, &4} |
+----+------+------+----------+

+----+--------+--------+-----+
| &3 | author | string | Aho |
+----+--------+--------+-----+

+----+-------+---------+-----------+
| &4 | title | string  | Compilers |
+----+-------+---------+-----------+

Linearly, this might be represented as:

<&1, library, set, {&2,&5,...} >
<&2, book, set, {&3,&4} >
<&3, author, string, Aho >
<&4, title, string, Compilers >
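As an illustration only, the TSIMMIS-style quadruples above can be mirrored in a few lines of Python. The class name Node, the dictionary store, and the helper to_nested are hypothetical names chosen for this sketch and are not part of OEM. Object &5 is omitted because the example does not define it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    """One <oid, label, type, value> quadruple of the original OEM."""
    oid: str
    label: str
    type: str
    value: object  # a primitive value, or a frozenset of oids when type == "set"

# The library example; &5 is left out because the text does not define it.
store = {n.oid: n for n in [
    Node("&1", "library", "set", frozenset({"&2"})),
    Node("&2", "book", "set", frozenset({"&3", "&4"})),
    Node("&3", "author", "string", "Aho"),
    Node("&4", "title", "string", "Compilers"),
]}

def to_nested(oid, store):
    """Follow set-valued nodes recursively and return a nested dict.

    Children are grouped by label, since several children of one node
    (for example, several books) may carry the same label.
    """
    node = store[oid]
    if node.type != "set":
        return node.value
    result = {}
    for child_oid in sorted(node.value):
        child = store[child_oid]
        result.setdefault(child.label, []).append(to_nested(child_oid, store))
    return result

library = to_nested("&1", store)
```

The sketch shows why the oids matter: aggregate values hold only identifiers, so the same node could be referenced from several sets, giving a graph rather than a strict tree.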

In the Lorel OEM, the same structure would be:

         +----+------+-----------------------------+
library: | &1 | set  | {(book,&2), (book,&5), ...} |
         +----+------+-----------------------------+

         +----+------+---------------------------+
         | &2 | set  | {(author,&3), (title,&4)} |
         +----+------+---------------------------+

         +----+--------+-----+
         | &3 | string | Aho |
         +----+--------+-----+

         +----+---------+-----------+
         | &4 | string  | Compilers |
         +----+---------+-----------+
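The edge-labeled variant can be sketched in the same way. In this hypothetical Python rendering (the names atomic, complex_objects, names, relations, and follow are assumptions made for illustration), each (label, oid) pair of a complex object yields one label(sourceobject, targetobject) relationship, and a path of labels can be followed from a named top-level object. Object &5 is again omitted because it is not defined in the example.

```python
# Atomic objects: oid -> (type, value)
atomic = {
    "&3": ("string", "Aho"),
    "&4": ("string", "Compilers"),
}

# Complex objects: oid -> list of (label, target oid) pairs
complex_objects = {
    "&1": [("book", "&2")],
    "&2": [("author", "&3"), ("title", "&4")],
}

# Names for top-level objects, since objects themselves carry no label
names = {"library": "&1"}

def relations():
    """List every edge as a (label, source oid, target oid) triple."""
    return [
        (label, source, target)
        for source, pairs in complex_objects.items()
        for label, target in pairs
    ]

def follow(start_oid, path):
    """Follow a sequence of edge labels and return the atomic values reached."""
    frontier = [start_oid]
    for wanted in path:
        frontier = [
            target
            for oid in frontier
            for label, target in complex_objects.get(oid, [])
            if label == wanted
        ]
    return [atomic[oid][1] for oid in frontier if oid in atomic]

titles = follow(names["library"], ["book", "title"])
```

The triples returned by relations() are the form that most closely resembles the first order logic structuring mentioned above.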

OEM can represent complex graph structures, similar to those that exist in the Web. It is a "lightweight" object model in the sense that:

OEM and related models effectively define global models for a federated database system, where the federated components include unstructured or semistructured data sources such as the Web (unlike the more conventional structured database sources usually considered in federated database systems). These models provide a valuable core of ideas for applying database concepts to Web data. As the examples illustrate, OEM is based on the use of attribute/value pairs. This is important in allowing the individual components of Web resources to be recognized and accessed in a meaningful way by applications. In addition, OEM extends the basic attribute/value pair model by providing each pair with its own identifier. This is important in allowing complex nested and graph structures to be defined. It is also potentially important in allowing additional descriptive information (metadata) to be directly associated with the pairs (e.g., to describe an attribute's meaning more fully). However, this latter idea has not been directly followed up in the OEM-related papers reviewed.

While these models are intended to represent data in (or extracted from) Web and other resources, and hence constitute a form of metadata, the capabilities of these models for representing metadata that might already exist about a resource, and for representing their own metadata, are somewhat undeveloped. They do not explicitly consider capturing type and schema information where it exists, or linking that type information to the structures it describes. For example, when OEM is used to capture a database structure, a schema actually exists for this data, unlike Web resources. It should be possible to capture both the data and the schema in OEM, and link them together. This is not really followed up in existing OEM work (although it could be). Related work has been done on a concept called DataGuides [GW97, NUWC97]. A DataGuide resembles a schema, but is derived dynamically as a summary of the structures that have been encountered, and only approximately describes the structures that may actually be encountered. This is appropriate for unstructured and semistructured data, but does not fully represent the semantics of an actual schema.

These models as currently implemented are also not well integrated with emerging Web technologies, such as the XML, DOM, and RDF work described below, that are likely to change the basic nature of the Web's representation. The approach taken in work such as OEM has so far assumed that the Web will continue to be largely unstructured or semistructured, based on HTML, and that data from the Web will need to be extracted into separate OEM structures (or interpreted as if it had been) in order to perform database-like manipulations on it. On the other hand, the new Web technologies provide a higher level, more semantic representational structure, which can start with the assumption that information authors themselves have support to provide more semantic structural information. Our work on a Web object model is based on the idea that, with this additional representation support, it makes sense to investigate building more database-like capabilities within the Web infrastructure itself, rather than assuming that almost all of these database capabilities need to be added externally. Since Web structures are unlikely to become as regular as conventional databases, some of the principles developed by work such as OEM will continue to be important (and, in fact, as a model, OEM has many similarities with work such as RDF described later in this report). However, it seems likely that these principles will need to be applied in the context of representations such as XML and DOM, used directly as the basis of an enhanced Web infrastructure.

2.1.3 Knowledge Interchange Format (KIF)

The Knowledge Interchange Format provides a common means of exchanging knowledge between programs with differing internal knowledge representation techniques. It is human-readable, with declarative semantics. It can express first-order logic (FOL) sentences, with some second-order capabilities. Translators exist to convert various knowledge representation languages to and from KIF. A simple example of KIF in representing information about an ontology (from [BBBC+97]) is:

ontology(o_857)
ontology_name(o_857,'healthcare')
ontology_frame(o_857,f_123)
frame(f_123)
frame_name(f_123,'encounter_drg')
slot(s_345)
frame_slot(f_123,s_345)
slot_name(s_345,'patient_age')
constraint(c_674)
slot_constraint(s_345,c_674)
constraint_expression(c_674,[[gt,'patient_age',43] [lt,'patient_age',75]])

The example illustrates that the KIF representation of data is based on the use of attribute/value pairs; in fact, this is a direct representation of the way this information might be expressed in first-order logic. This also illustrates the fact that a FOL representation necessarily introduces a number of "intermediate" object identifiers (like o_857 and f_123), in order to assert the identity of distinct concepts, and to represent relationships among the various parts of the description. This is similar to the way that OEM introduces identifiers for the individual parts of a resource description. The KIF example particularly illustrates the use of such identifiers in defining namespaces like frames or ontologies, which qualify contained information.
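To make the role of the intermediate identifiers concrete, the assertions of the example can be treated as a set of facts and joined on those identifiers. The following Python sketch is only an illustration of that idea and is not KIF tooling; the tuple encoding and the helper names objects_of, subjects_of, and slot_names_in_ontology are assumptions. The constraint assertions are left out for brevity.

```python
# Each assertion predicate(arg1, arg2) becomes a tuple (predicate, arg1, arg2).
facts = {
    ("ontology", "o_857"),
    ("ontology_name", "o_857", "healthcare"),
    ("ontology_frame", "o_857", "f_123"),
    ("frame", "f_123"),
    ("frame_name", "f_123", "encounter_drg"),
    ("slot", "s_345"),
    ("frame_slot", "f_123", "s_345"),
    ("slot_name", "s_345", "patient_age"),
}

def objects_of(predicate, subject):
    """All y such that predicate(subject, y) is asserted."""
    return sorted(f[2] for f in facts
                  if len(f) == 3 and f[0] == predicate and f[1] == subject)

def subjects_of(predicate, obj):
    """All x such that predicate(x, obj) is asserted."""
    return sorted(f[1] for f in facts
                  if len(f) == 3 and f[0] == predicate and f[2] == obj)

def slot_names_in_ontology(name):
    """Join ontology -> frame -> slot through the intermediate identifiers."""
    found = []
    for ontology_id in subjects_of("ontology_name", name):
        for frame_id in objects_of("ontology_frame", ontology_id):
            for slot_id in objects_of("frame_slot", frame_id):
                found.extend(objects_of("slot_name", slot_id))
    return found

slots = slot_names_in_ontology("healthcare")
```

The identifiers o_857, f_123, and s_345 never appear in the answer; they exist only so that separate assertions can be linked, which is the same function the oids serve in OEM.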

Like OEM, KIF is capable of representing arbitrary graph structures. Moreover, KIF illustrates the importance of identifying parts of a data structure representation with logical assertions in conveying semantics between applications. Section 3 will describe how this principle serves as the basis of a formal Web object model definition. However, while KIF is widely used for knowledge interchange, it, like OEM, is not well integrated with emerging Web infrastructure technologies.

2.1.4 Extensible Markup Language (XML)

The Extensible Markup Language (XML) is an ongoing effort within the World Wide Web Consortium (W3C). XML is a data format for structured document interchange on the Web. More specifically, XML defines a simple subset of SGML (the Standard Generalized Markup Language [ISO86]; see also, e.g., [DeR97]), and is intended to make it easy to use SGML on the Web. XML is extensible because unlike HTML, which defines a fixed set of tags, XML allows the definition of customized markup languages with application-specific tags, e.g., <AUTHOR> or <QTY-ON-HAND>, for exchanging information in particular application domains such as chemistry, electronics, or general business. Hence, XML is really a metalanguage (a language for describing languages).

Because authors and providers can design their own document types using XML, browsers can benefit from improved facilities, and applications can use tailored markup to process data. As a result, XML provides direct support for using application-specific tagged data items (attribute/value pairs) in Web resources, as opposed to the current need to use ad hoc encodings of data items in terms of HTML tags. [KR97] provides a useful overview of the potential benefits of using XML in Web-related applications.

Although XML could eventually completely replace HTML, XML and HTML are expected to coexist for some time. In some cases, applications may wish to define entirely separate XML documents for their own processing, and convert the XML to HTML for display purposes. Alternatively, applications may wish to continue using HTML pages as their primary document format, embedding XML within the HTML for application-specific purposes. For example, [Hop97] describes the use of blocks of XML markup enclosed by <XML> and </XML> tags within an HTML document for this purpose.

XML has considerable industry support, e.g., from Netscape, Microsoft, and Sun. For example, Microsoft has built an XML parser into Internet Explorer 4.0 (which uses XML for several applications), has made available XML parsers in Java and C++, together with links to other XML tools, and has indicated that it will use XML in future versions of Microsoft Office products. Microsoft has also contributed to a number of proposals to W3C on the use of XML as a base for various purposes (some of which will be discussed in later sections). Netscape has said it will support XML via the Meta Content Framework (described in Section 2.2) in a future version of its Communicator product. Work is also underway on tying XML to Java in a number of ways. Other commercial vendors are also developing XML-related software tools. In addition, a number of XML tools are available for free non-commercial use. A list of some of these tools is available at the W3C XML Web page.

A number of industry groups have defined SGML Document Type Definitions (DTDs) for their documents (e.g., the U.S. Defense Department, which requires much of its documentation to be submitted according to defined SGML DTDs); in many cases these could either be used with XML directly, or converted in a straightforward fashion. Work is already underway to define XML-based data exchange formats in both the chemical and healthcare communities. Work has also been done on other applications of XML, e.g., an Ontology Markup Language (OML) for representing ontologies in XML.

The W3C XML specification has several parts:

A DTD is usually a file (or several files together) which contains a formal definition of a particular type of document. This acts like a database schema, and defines what names can be used for elements, where they may occur (e.g., <ITEM> might only be meaningful inside <LIST>), and how they all fit together. The DTD lets processors parse a document and identify where each element belongs, so that stylesheets, browsers, search engines, and other applications can be used. The linking of resources with the DTDs that describe them is similar to the association of a database record with its schema type, and to the association of an object with its type or class definition.

An XML document may be either valid or well-formed. A valid XML document is well-formed, and has a DTD. The document begins with a declaration of its DTD. This may include a pointer to an external document (a local file or the URL of a DTD that can be retrieved over the network) that contains a subset of the required markup declarations (called the external subset), and may also include an internal subset of markup declarations contained directly within the document. The external and internal subsets, taken together, constitute the complete DTD of the document. The DTD effectively defines a grammar which defines a class of documents. Normally, the bulk of the markup declarations appear in the external subset, which is referred to by all documents of the same class. If both external and internal subsets are used, the XML processor must read the internal subset first, then the external subset. This allows the entity and attribute declarations in the internal subset to take precedence over those in the external subset (thus allowing local variants in documents of the same class). XML DTDs can also be composed, so that new document types can be created from existing ones.

A well-formed XML document can be used without a DTD, but must follow a number of simple rules to ensure that it can be parsed correctly. These rules require, among other things, that:

The general characteristics of XML can be illustrated using an example of a document that maintains a list of people's electronic business cards (this example is modified from one in [KR97], and is not necessarily consistent with the details of the latest XML specification). Each business card contains the person's first name, last name, company, email address, and Web page address. There is more than one way to represent attribute-value style data in XML. One approach is to specify the attributes as the "attributes" of XML tags. In this case, the document contains only tags annotated with attribute-value pairs, and there is no content in the document other than the tags themselves (which can be parsed and processed by applications). Using this approach, an example document would be:

<!DOCTYPE bCard "">
<bCard>
<?xml default bCard
      firstname = ""
      lastname  = ""
      company   = ""
      email     = ""
      webpage   = ""
?>
<bCard
      firstname = "Frank"
      lastname  = "Manola"
      company   = "Object Services and Consulting"
      email     = ""
      webpage   = ""
/>
<bCard
      firstname = "Craig"
      lastname  = "Thompson"
      company   = "Object Services and Consulting"
      email     = ""
      webpage   = ""
/>
</bCard>

The default specification ensures that every tag has the same number of attribute-value pairs.

An alternative representation uses different tags, rather than XML attributes, to identify the meaning of the content. Using this approach, the same content would be represented as:
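The relationship between the two styles can be sketched in a few lines of Python using the standard library's ElementTree module. This is only an illustration, not the listing from [KR97]: it assumes that each attribute name is simply reused as a child tag name, and the helper name to_element_style is hypothetical.

```python
import xml.etree.ElementTree as ET

# One attribute-style card, following the first example above
# (email and webpage values are left empty, as in the text).
attr_card = ET.fromstring(
    '<bCard firstname="Frank" lastname="Manola" '
    'company="Object Services and Consulting" email="" webpage=""/>'
)

def to_element_style(card):
    """Return a new element in which each XML attribute of `card`
    becomes a child element holding the attribute value as text."""
    out = ET.Element(card.tag)
    for name, value in card.attrib.items():
        # Assumption: the child tag reuses the attribute name unchanged.
        child = ET.SubElement(out, name)
        child.text = value
    return out

elem_card = to_element_style(attr_card)
print(ET.tostring(elem_card, encoding="unicode"))
```

Both forms carry the same attribute/value information; the element style simply moves the values from the tag into the document content.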


The paper "XML representation of a relational database" uses a relational database as a simple example of how to represent more complex structured information in XML. A relational database consists of a set of tables, where each table is a set of records. A record in turn is a set of fields and each field is a pair field-name/field-value. All records in a particular table have the same number of fields with the same field-names. This description suggests that a database could be represented as a hierarchy of depth four: the database consists of a set of tables, which in turn consist of rows, which in turn consist of fields. The following example, taken from the cited paper, describes a possible XML representation of a single database with two tables:

<!doctype mydata "">
<mydata>
  <authors>
    <author>
      <name>Robert Roberts</name>
      <address>10 Tenth St, Decapolis</address>
      <editor>Ella Ellis</editor>
      <ms type="blob">ftp://docs/rr-10</ms>
      <born>1960/05/26</born>
    </author>
    <author>
      <name>Tom Thomas</name>
      <address>2 Second Av, Duo-Duo</address>
      <editor>Ella Ellis</editor>
      <ms type="blob">ftp://docs/tt-2</ms>
    </author>
    <author>
      <name>Mark Marks</name>
      <address>1 Premier, Maintown</address>
      <editor>Ella Ellis</editor>
      <ms type="blob">ftp://docs/mm-1</ms>
    </author>
  </authors>
  <editors>
    <editor>
      <name>Ella Ellis</name>
      <telephone>7356</telephone>
    </editor>
  </editors>
</mydata>

The representation is human-readable, but fairly verbose (since XML in general is verbose). However, it compresses well with standard compression tools. It is also easy to print the database (or a part of it) with standard XML browsers and a simple style sheet.

The database is modeled with an XML document node and its associated element node:

<!doctype name "url">
<name>
  table 1
  table 2
  ...
  table n
</name>

The name is arbitrary. The url is optional, but can be used to point to information about the database. The order of the tables is also arbitrary, since a relational database defines no ordering on them. Each table of the database is represented by an XML element node with the records as its children:


The name is the name of the table. The order of the records is arbitrary, since the relational data model defines no ordering on them. A row is also represented by an element node, with its fields as children:


The name is the name of the row type (this was not required in the original relational model, but the current specification allows definition of row types); the name is required in XML anyway. The order of the fields is arbitrary. A field is represented as an element node with a data node as its only child:

<name type="t">d</name>

If d is omitted, the value of the field is the empty string. The value of t indicates the type of the value (such as string, number, boolean, date). If the type attribute is omitted, the type can be assumed to be `string.'
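The database/table/row/field mapping just described can be sketched as a short Python function. The input shape (a dictionary of tables, each holding a row-type name and a list of rows) and the function name database_to_xml are hypothetical choices made for this sketch; only the four-level nesting and the optional type attribute come from the description above.

```python
import xml.etree.ElementTree as ET

def database_to_xml(db_name, tables):
    """Build the database / table / row / field hierarchy.

    `tables` maps a table name to (row_name, rows); each row maps a
    field name to a (type, value) pair. This input shape is an
    assumption of the sketch, not part of the cited paper.
    """
    root = ET.Element(db_name)
    for table_name, (row_name, rows) in tables.items():
        table = ET.SubElement(root, table_name)
        for row in rows:
            record = ET.SubElement(table, row_name)
            for field_name, (field_type, value) in row.items():
                field = ET.SubElement(record, field_name)
                # An omitted type attribute is read as "string".
                if field_type != "string":
                    field.set("type", field_type)
                field.text = value
    return root

tables = {
    "editors": ("editor", [
        {"name": ("string", "Ella Ellis"),
         "telephone": ("string", "7356")},
    ]),
    "authors": ("author", [
        {"name": ("string", "Tom Thomas"),
         "ms": ("blob", "ftp://docs/tt-2")},
    ]),
}
doc = database_to_xml("mydata", tables)
print(ET.tostring(doc, encoding="unicode"))
```

Because the relational model defines no ordering on tables, records, or fields, a consumer of such a document should not depend on the order in which the elements happen to be written.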

This example illustrates that XML tags can (and will) represent concepts at multiple levels of abstraction. The example defines a specific four-level hierarchy, but does not explicitly define the relational model or indicate the hierarchical relationships among the various relational constructs. In order to do this in a generic way for all relational databases, there would need to be explicit tags such as <SCHEMA>, <TABLE>, <ROW>, etc., and a specification of how they should be nested. This is metalevel information as far as the XML representation is concerned, and could be specified in the DTD. The definition of models, such as the relational model, for organizing data for specific purposes, is independent of XML, and needs to be done separately. The definition of such models (in some cases using XML as their representation) is discussed in the next section.

An XML document consists of text, and is basically a linearization of a tree structure. At every node in the tree there are several character strings. The tree structure and character strings together form the information content of an XML document. Some of the character strings serve to define the tree structure; others are there to define content. In addition to the basic tree structure, there are mechanisms to define connections between arbitrary nodes in the tree. For example, in the following document there is a root node with three children, with one of the children containing a link to one of the other children:

<p>
  <q id="x7">The first child of type q</q>
  <q id="x8">The second child of type q</q>
  <q href="#x7">The third child of type q</q>
</p>

In this case, the third child contains an href attribute which points to the first child, using its id attribute as an identifier.
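How an application might follow such an intra-document link can be sketched in Python with the standard ElementTree module. The sketch assumes that an href value beginning with "#" names the id of an element in the same document; the helper name resolve is hypothetical.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<p>'
    '<q id="x7">The first child of type q</q>'
    '<q id="x8">The second child of type q</q>'
    '<q href="#x7">The third child of type q</q>'
    '</p>'
)

# Index every element that carries an id attribute.
by_id = {el.get("id"): el for el in doc.iter() if el.get("id") is not None}

def resolve(el):
    """Return the element that el's href fragment points to, or None
    if el has no href or the href is not a same-document fragment."""
    href = el.get("href", "")
    if not href.startswith("#"):
        return None
    return by_id.get(href[1:])

target = resolve(doc[2])
print(target.text)
```

The link turns the tree into a graph: the third child now has an edge to the first child in addition to the parent/child edges of the tree itself.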

The XML linking model is described in the XLL draft. The full hypertext linking capabilities of XML are much more powerful than those of HTML, and are based on more powerful hypertext technology such as that described in HyTime [ISO92] and the Text Encoding Initiative (TEI). The current specification supports both conventional URLs and TEI extended pointers. The latter provide support for bidirectional and multi-way links, as well as links to a span of text (i.e., a subset of the document) within the same or other documents.

XSL is a submission defining stylesheet capabilities for XML documents. XML stylesheets enable formatting information to be associated with elements in a source document to produce formatted output. XML stylesheet capabilities are based on a subset of those defined in the ISO standard Document Style Semantics and Specification Language (DSSSL) [ISO96] used in formatting SGML documents. The formatted output is created by formatting a tree of flow objects. A flow object has a class, which represents a kind of formatting task, together with a set of named characteristics, which further specify the formatting. The association of elements in the source document tree to flow objects is defined using construction rules. A construction rule contains a pattern to identify specific elements in the source tree, and an action to specify a resulting subtree of flow objects. The stylesheet processor recursively processes source elements to produce a complete flow object tree which defines how the document is to be presented.
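The interplay of patterns, actions, and flow objects can be illustrated with a deliberately simplified Python sketch. It is not XSL syntax and does not reproduce the submission's processing model in detail: patterns are reduced to plain tag names, and the class names Node and FlowObject, the rule table, and the chosen flow object classes and characteristics are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """An element in the source document tree."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)

@dataclass
class FlowObject:
    """A node in the formatted-output tree."""
    cls: str                                   # kind of formatting task
    characteristics: dict = field(default_factory=dict)
    text: str = ""
    children: list = field(default_factory=list)

# Construction rules: a pattern (here just a tag name) mapped to an
# action that builds the flow object for a matching source element.
RULES = {
    "LIST": lambda n: FlowObject("display-group"),
    "ITEM": lambda n: FlowObject("paragraph", {"font-weight": "bold"}, n.text),
}

def default_action(n):
    # Fallback for elements that no rule matches (a choice of this sketch).
    return FlowObject("sequence", text=n.text)

def process(node):
    """Apply the matching rule to node, then recurse into its children."""
    flow = RULES.get(node.tag, default_action)(node)
    flow.children = [process(child) for child in node.children]
    return flow

source = Node("LIST", children=[Node("ITEM", "first"), Node("ITEM", "second")])
flow_tree = process(source)
print(flow_tree)
```

The result is a second tree, parallel to the source tree, whose nodes say how to present the content rather than what the content means.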

The XML working group is also currently developing a Namespace facility that will allow Generic Identifiers (tag names) to have a prefix which will make them unique and will prevent name clashes when developing documents that mix elements from different schemas. This facility allows a document's prolog to contain a set of Processing Instructions (an SGML concept) of the form:

<?xml:namespace name="some-uri" as="some-abbreviation"?>

for example

<?xml:namespace name="" as="RDF"?>
<?xml:namespace name="" as="DC"?>

Elements in the document may then use generic identifiers of the form <RDF:assertions> or <DC:Title>. Those element names would expand to full URIs. This work is still under development, and the details of the final specification may differ from those described here.
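The expansion of prefixed generic identifiers can be sketched in Python as a simple table lookup. The URIs below are placeholders rather than the names of real schemas, and joining the namespace name and the local name with "#" is an assumption of the sketch, since the facility was still a draft at the time of writing.

```python
# Hypothetical prefix declarations, standing in for the
# <?xml:namespace ...?> processing instructions in a document prolog.
namespaces = {
    "RDF": "http://example.org/schemas/rdf",
    "DC": "http://example.org/schemas/dc",
}

def expand(generic_identifier, namespaces):
    """Expand 'prefix:local' into a URI-qualified name.

    Names without a prefix are returned unchanged; an undeclared
    prefix is treated as an error.
    """
    prefix, sep, local = generic_identifier.partition(":")
    if not sep:
        return generic_identifier
    if prefix not in namespaces:
        raise KeyError(f"undeclared namespace prefix: {prefix}")
    # Assumption: '#' joins the namespace name and the local name.
    return f"{namespaces[prefix]}#{local}"

print(expand("DC:Title", namespaces))
print(expand("RDF:assertions", namespaces))
```

Two documents that both use a tag called Title then no longer clash, provided that their prefixes are bound to different namespace names.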

XML provides basic tagged value support, as well as support for nesting, and enhanced link capabilities. Because the Web community is increasingly targeting XML as its "next generation Web representation", the Web object model described in Section 3 uses XML as its basic representation of object state. However, additional concepts must also be defined to apply XML to extended data and metadata structuring requirements, and particularly the requirements for a Web object model that go beyond a richer state representation. Some of these requirements are illustrated both by the relational database example above, and by the RDF and related efforts described in the next section. These efforts generally involve defining data model concepts for representing specific kinds of data (as the relational model does for database data), and then using the tagged value structures supported by XML as their representation. These models support various ways of using identifier concepts (URLs plus other identifier concepts) to provide support for graph structured data. An additional general requirement, not generally addressed by Web-related activities, is the definition of structured database capabilities (e.g., an algebra or calculus to serve as the basis for database-like query and view facilities for XML data).

2.2 Higher-Level Models and Metadata

Richer representation techniques for Web information, such as XML, are an important component in making the Web an improved basis for enhanced applications of all kinds. However, additional structure must also be defined. For example, XML provides support for the representation of data in terms of attribute/value pairs, with user-defined tags. However, this alone will not provide for easy interchange of information and interoperability among components, since, using XML, different users could define their own ways of using attribute-value pairs to represent the same (or the same type of) information. Thus, there is also a need to define additional characteristics of what to represent using representations such as XML.

A data model defines one level of "what to represent". For example, the relational data model defines structuring concepts such as rows and tables, and provides one basic organizational framework for representing data. The example from the previous section of how to represent relational data in XML illustrated how using the relational model imposed additional structure on the XML representation. Defining a data model for data represented in XML both suggests specific structuring concepts for using XML to organize data, and may also involve the specification of certain standard tags or attributes (like <TABLE>) to reflect those concepts. Use of particular data models (represented using techniques such as XML) regularizes the structures that may be encountered, and potentially simplifies the task of applications that process those structures.

An additional level of "what to represent" is provided by standardizing the use of domain-specific attribute/value pairs and document structures (e.g., standards for specific kinds of reports or forms). SGML and XML DTDs constitute one way to specify such standards, and there are already numerous SGML DTDs in use for this purpose (these could, in most cases, be easily adapted for use with XML).

An important source of efforts to develop such higher-level model specifications for use on the Web has been work on developing representation techniques for Web metadata, i.e., data available via the Web that describes or helps interpret either other Web resources or non-Web resources. This metadata is used both to facilitate Web searches for relevant data, and to either provide direct access to it (if it is Web-accessible) or at least indicate its existence and possibly describe how to obtain it. The reason why the development of metadata representations has driven the development of higher-level models is that the metadata is intended to support indexing, searching, and other automated processes that require more structure than may be present in the data itself. Metadata requirements have also driven the development of structured representations themselves. For example, the SOIF format described in Section 2.1.1 was developed to represent Web metadata.

Efforts to develop enhanced metadata capabilities have involved several types of activity (a given effort may bundle more than one of them):

Web data/metadata models defined "on top of" representations such as XML are relevant to the development of a Web object model in helping to further define an adequate basis for representing object state. In addition, these models are also relevant in helping to identify ways to establish relationships between the object state and the specified pieces of code that serve as object methods. This is based on the idea that an "object" is basically a piece of state with some attached (or associated) programs (methods). For example, a Smalltalk object consists of a set of state variables (data), together with a pointer (link) to a class object which contains its methods. The link between an object and its class is essentially a metadata link, since the class methods are used to help interpret the data. In the Web environment, the idea is that objects can be constructed by enhancing Web resources with additional metadata that allows the resources to be considered as objects in some object model. This concept will be developed further in Section 3, but is mentioned here to further explain the role that metadata structuring principles will play in the development of a Web object model.

2.2.1 Dublin Core

The Dublin Core is a set of specific metadata attributes originally developed at the March 1995 Metadata Workshop in Dublin, Ohio. The set has subsequently been modified on the basis of later Dublin Core Metadata Workshops. The goal of the Dublin Core is to define a minimal set of descriptive elements that facilitate the description and the automated indexing of document-like networked objects. The Core metadata set is intended to be suitable for use by resource discovery tools on the Internet, such as the "WebCrawlers" employed by popular World Wide Web search engines (e.g., Lycos and Alta Vista). In addition, the core is meant to be sufficiently simple to be understood and used by the wide range of authors and casual publishers who contribute information to the Internet. The Dublin Core reflects input from a wide range of communities interested in metadata, including both the Internet and Digital Library communities. The elements of the Dublin Core (as of November 1997) are given below. The Dublin Core Reference Description contains the current definition.

TITLE: The name given to the resource by the CREATOR or PUBLISHER.
CREATOR: The person(s) or organization(s) primarily responsible for the intellectual content of the resource.
SUBJECT: Keywords or phrases that describe the subject or content of the resource. The intent is to use controlled vocabularies and keywords, so the element might include scheme-qualified classification data (for example, Library of Congress Classification Numbers) or scheme-qualified controlled vocabularies (such as Medical Subject Headings).
DESCRIPTION: A textual description of the content of the resource, such as document abstracts or content descriptions of visual resources. This could be extended to include computational content description (e.g., spectral analysis of a visual resource). In this case this field might contain a link to the description rather than the description itself.
PUBLISHER: The entity responsible for making the resource available in its present form.
CONTRIBUTORS: Person(s) or organization(s) in addition to those specified in the CREATOR element who have made significant intellectual contributions to the resource.
DATE: The date the resource was made available in its present form.
TYPE: The category of the resource, such as home page, novel, poem, working paper, etc. It is expected that RESOURCE TYPE will be chosen from an enumerated list of types that is under development.
FORMAT: The data representation of the resource, such as text/html, ASCII, Postscript file, executable application, or JPEG image (as well as non-electronic media). FORMAT will be assigned from an enumerated list that is under development.
IDENTIFIER: String or number used to uniquely identify the resource. Examples for networked resources include URLs and URNs (when implemented). Other globally-unique identifiers, such as International Standard Book Numbers (ISBN) or other formal names, would also be candidates for this element.
SOURCE: The work, either print or electronic, from which this resource is derived, if applicable.
LANGUAGE: Language(s) of the intellectual content of the resource. Where practical, the content of this field should coincide with RFC 1766.
RELATION: Relationship to other resources, for example, images in a document, chapters in a book, or items in a collection. A formal specification of RELATION is currently under development.
COVERAGE: The spatial locations and temporal durations characteristic of the resource. Formal specification of COVERAGE is currently under development.
RIGHTS: A link (e.g., a URL or other suitable URI as appropriate) to terms and conditions, copyright statements, or similar information. A formal specification is currently under development.
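As a rough illustration of how an application might hold a Dublin Core description, the following Python sketch stores a record as a dictionary keyed by the fifteen element names listed above and checks a record for keys outside that set. The record values, the use of a list for a multi-valued element, and the helper name unknown_elements are assumptions of the sketch; the Dublin Core itself does not prescribe any particular encoding.

```python
# The fifteen element names listed above (as of November 1997).
DUBLIN_CORE_ELEMENTS = (
    "TITLE", "CREATOR", "SUBJECT", "DESCRIPTION", "PUBLISHER",
    "CONTRIBUTORS", "DATE", "TYPE", "FORMAT", "IDENTIFIER",
    "SOURCE", "LANGUAGE", "RELATION", "COVERAGE", "RIGHTS",
)

def unknown_elements(record):
    """Return the keys of `record` that are not Dublin Core elements."""
    return sorted(k for k in record if k.upper() not in DUBLIN_CORE_ELEMENTS)

# An illustrative record; every element is treated as optional here,
# which is an assumption of this sketch.
record = {
    "TITLE": "Towards a Web Object Model",
    "CREATOR": ["Frank Manola"],
    "DATE": "1998-02-10",
    "FORMAT": "text/html",
    "LANGUAGE": "en",
}

print(unknown_elements(record))
print(unknown_elements({"TITLE": "x", "AUTHOR": "y"}))
```

A flat attribute/value structure of this kind is exactly what representations such as SOIF or XML attributes can carry directly.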

In addition to enumerating these data elements, the Dublin Workshop report specified a number of underlying principles that apply to the entire core metadata set.

These principles illustrate a number of requirements in a general metadata model, including:

These same principles are illustrated in a number of the specific metadata models described later in this section, such as MCF and RDF.

2.2.2 Warwick Framework

The Warwick Framework defines a container architecture that builds on the Dublin Core results. It is a mechanism for aggregating distinct packages of metadata, allowing multiple, separately-managed metadata sets to be defined, managed, and associated with the resources they describe. The report also describes proposals for representing Warwick Framework structures using HTML, MIME, SGML, and a distributed object architecture.

The Warwick Framework has two fundamental components: packages, which are typed metadata sets, and containers, which are the units for aggregating packages.

A container may be either transient or persistent. In its transient form, it exists as a transport object between and among repositories, clients, and agents. In its persistent form, it exists as a first-class object in the information infrastructure. That is, it is stored on one or more servers and is accessible from these servers using a globally accessible identifier (URI). A container may also be wrapped within another object (i.e., one that is a wrapper for both data and metadata). In this case the "wrapper" object will have a URI rather than, or in addition to, the metadata container itself.

Independent of the implementation, the only operation defined for a container is one that returns a sequence of packages in the container. There is no provision in this operation for ordering the members of this sequence and thus no way for a client to assume that one package is more significant or "better" than another.

Each package is a typed object; its type may be determined after access by a client or agent. Packages are of three types:

+--------------------+
|      container     |
|                    |
|  +---------------+ |
|  |   package     | |
|  | (Dublin Core) | |
|  +---------------+ |
|  +---------------+ |
|  |   package     | |
|  | (MARC Record) | |
|  +---------------+ |       +------------------------+
|  +---------------+ | URI   |       package          |
|  |   package     |-+------>| (terms and conditions) |
|  |  (indirect)   | |       +------------------------+
|  +---------------+ |
+--------------------+

Figure 1 - Metadata container with three packages (one indirect)

Figure 1 illustrates a simple example of a Warwick Framework container. The container in this example contains three logical packages of metadata. The first two, a Dublin Core record and a MARC record, are contained within the container as a pair of packages. The third metadata set, which defines the terms and conditions for access to the content object, is referenced indirectly via a URI in the container (the syntax for terms and conditions metadata and administrative metadata is not yet defined).
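The container in Figure 1 can be sketched as a pair of small Python classes. The class and attribute names, the example record contents, and the example.org URI are hypothetical; the sketch only reflects the points made above, namely that packages are typed, that a package may be held directly or referenced by URI, and that the one container operation returns the packages without implying any ranking among them.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Package:
    package_type: str                  # e.g. "Dublin Core", "MARC"
    content: Optional[dict] = None     # metadata held inside the container
    uri: Optional[str] = None          # set instead of content when indirect

    @property
    def is_indirect(self):
        return self.uri is not None

@dataclass
class Container:
    _packages: List[Package] = field(default_factory=list)

    def add(self, package):
        self._packages.append(package)

    def packages(self):
        """The single container operation: return the packages held.
        Callers should attach no meaning to the order of the result."""
        return tuple(self._packages)

container = Container()
container.add(Package("Dublin Core", content={"TITLE": "Example document"}))
container.add(Package("MARC", content={"245": "Example document"}))
container.add(Package("terms and conditions", uri="http://example.org/terms"))

for package in container.packages():
    print(package.package_type, "(indirect)" if package.is_indirect else "")
```

A client that needs the terms and conditions would dereference the URI of the indirect package in a separate step.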

The mechanisms for associating a Warwick Framework container with a content object (i.e., a document) depend on the implementation of the Framework. The proposed implementations discussed in the cited reference illustrate some of the options. For example, a simple Warwick Framework container may be embedded in a document, as illustrated in the HTML implementation proposal; or an HTML document can include a link to a container stored as a separate file. On the other hand, as illustrated in the distributed object proposal, a container may be a logical component of a so-called digital object, which is a data structure for representing networked objects.

The reverse linkage, which ties a container to a piece of intellectual content, is also relevant, since anyone can create descriptive data for a networked resource, without permission or knowledge of the owner or manager of that resource. This metadata is fundamentally different from the metadata that the owner of a resource chooses to link to or embed with the resource. As a result, an informal distinction is made between two categories of metadata containers, which both have the same implementation:

One of the motivations for the development of the Warwick Framework was a recognition that, even if attention is restricted to metadata for descriptive cataloging (the subject of the Dublin Core), many different formats for such metadata have been defined (including specialized forms for particular kinds of data, such as geospatial data), and techniques must be defined for organizing the metadata about an object that may appear in these multiple forms.

Another motivation was the recognition that there are many other kinds of metadata besides that used for descriptive cataloging that may need to be recorded and organized. These kinds of metadata include, among others:

The Warwick Framework illustrates a number of very basic structural requirements and options that must be supported in representing metadata, and linking it with the data it describes. Like the principles reflected in the Dublin Core, the Warwick Framework principles are illustrated in a number of the specific metadata models described later in this section, such as the RDF. For example, RDF assertions (see below) correspond closely to Warwick Framework packages, and the various means provided for associating RDF assertions with the resources they describe support options identified in the Warwick Framework.

2.2.3 PICS and PICS-NG

PICS (Platform for Internet Content Selection) is an infrastructure for associating labels (metadata) with Internet content. It was originally designed to help parents and teachers control what children access on the Internet, but it also facilitates other uses for labels, including code signing, privacy, and intellectual property rights management. PICS currently defines the following recommendations:

  1. Rating Services and Rating Systems (and Their Machine Readable Descriptions) (earlier version in World Wide Web Journal, 1(4), Fall 1996, 23-43) defines a language for describing rating services and systems. Software programs will read service descriptions written in this language, in order to interpret content labels and assist end-users in configuring selection software.
  2. PICS Label Distribution -- Label Syntax and Communication Protocols (earlier version in World Wide Web Journal, 1(4), Fall 1996, 45-69) specifies the syntax and semantics of content labels and HTTP-related protocol(s) for distributing labels as part of PICS.
  3. PICSRules 1.1 defines a language for writing profiles, which are filtering rules that allow or block access to URLs based on PICS labels that describe those URLs.

In PICS, a rating service is an individual or organization that provides content labels for resources on the Internet. The labels it provides are based on a rating system. Each rating service must describe itself using a PICS-defined MIME type application/pics-service. Selection software that relies on ratings from a PICS rating service can first load the application/pics-service description. This description allows the software to tailor its user interface to reflect the details of a particular rating service.

Each rating service picks a URL as its unique identifier, and includes this unique identifier in all content labels the service produces. It is intended that the URL, in addition to simply being a unique identifier, also refer to an HTML document which describes both the rating service and the rating system used by the service (possibly via a link to a separate document).

A rating system specifies the dimensions used for labeling, the scale of allowable values for each dimension, and a description of the criteria used in assigning values. For example, the MPAA rates movies in the U.S. based on a single dimension with allowable values G, PG, PG-13, R, and NC-17. The current PICS specification allows only floating point values.

Each rating system is identified by a URL. This allows multiple services to use the same rating system, and refer to it by its identifier. The URL identifying a rating system can be accessed to obtain a human-readable description of the rating system.

A content label, or rating, contains information about a document. The format of a content label is defined in the Label Format document referenced above, and has three parts:

A new MIME type application/pics-labels is also defined for transmitting one or more content labels.

When an end-user attempts to access a particular URL, a software filter built into the Web client (browser) fetches the document. The client also accesses the document's content label(s) based on rating systems that the client has been told to pay attention to. The client then compares the content label to the rating-system-specified values that the client has been told to base access decisions on, and either allows or denies access to the document.
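The comparison step performed by the client can be sketched as a small Python function. This is a simplification rather than the PICSRules language: a label is reduced to a dictionary of numeric ratings, the user's configuration to a dictionary of maximum acceptable values, and the category names are hypothetical. Denying access when a configured category has no rating is a policy choice of the sketch.

```python
def allow_access(ratings, limits):
    """Return True if `ratings` satisfies every configured limit.

    ratings: {category: value} taken from a document's content label.
    limits:  {category: maximum value the user accepts}.
    """
    for category, maximum in limits.items():
        value = ratings.get(category)
        # Assumption: a missing rating is treated as a reason to block.
        if value is None or value > maximum:
            return False
    return True

limits = {"violence": 2.0, "language": 1.0}

print(allow_access({"violence": 1.0, "language": 0.0}, limits))
print(allow_access({"violence": 3.0, "language": 0.0}, limits))
print(allow_access({"violence": 1.0}, limits))
```

Because the check sits between the request and the returned document, the same hook could apply other policies without any change to the documents themselves.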

Content labels may be:

The following application/pics-service document (taken from the PICS specification) describes a simple rating service.

((PICS-version 1.1)
 (rating-system "")
 (rating-service "")
 (icon "icons/gcf.gif")
 (name "The Good Clean Fun Rating System")
 (description "Everything you ever wanted to know about soap, cleaners, and related products")
 (category
  (transmit-as "suds")
  (name "Soapsuds Index")
  (min 0.0)
  (max 1.0))
 (category
  (transmit-as "density")
  (name "suds density")
  (label (name "none") (value 0) (icon "icons/none.gif"))
  (label (name "lots") (value 1) (icon "icons/lots.gif")))
 (category
  (transmit-as "subject")
  (name "document subject")
  (multivalue true)
  (unordered true)
  (label (name "soap") (value 0))
  (label (name "water") (value 1))
  (label (name "soapdish") (value 2))
  (label-only))
 (category
  (transmit-as "color")
  (name "picture color")
  (integer)
  (category
   (transmit-as "hue")
   (label (name "blue")  (value 0))
   (label (name "red")   (value 1))
   (label (name "green") (value 2)))
  (category
   (transmit-as "intensity")
   (min 0)
   (max 255))))

There are four top-level categories in this rating system. Each category has a short transmission name to be used in labels (e.g., "suds"); some also have longer names that are more easily understood (e.g., "Soapsuds Index"). The "Soapsuds Index" category rates soapsuds on a scale between 0.0 and 1.0 inclusive. The "suds density" category can have ratings from negative to positive infinity, but there are two values that have names and icons associated with them. The name "none" is the same as 0, and "lots" is the same as 1. The "document subject" category only allows the values 0, 1, and 2, but a single document can have any combination of these values. The "picture color" category has two sub-categories.
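The constraints just described can be checked mechanically. The following Python sketch transcribes three of the categories by hand into a dictionary and validates candidate ratings against them. The dictionary layout and the function name is_valid are assumptions made for illustration; PICS itself does not define such an interface.

```python
# Category constraints transcribed by hand from the rating-service example.
# The dictionary layout is hypothetical.
CATEGORIES = {
    "suds": {"min": 0.0, "max": 1.0},
    "density": {},  # unbounded; 0 and 1 merely have names and icons
    "subject": {"allowed": {0, 1, 2}, "multivalue": True},
}

def is_valid(category, value):
    """Check a rating (a single value or a collection of values)
    against the constraints of its category."""
    spec = CATEGORIES.get(category)
    if spec is None:
        return False
    values = list(value) if isinstance(value, (list, set, tuple)) else [value]
    if len(values) > 1 and not spec.get("multivalue", False):
        return False
    for v in values:
        if "min" in spec and v < spec["min"]:
            return False
        if "max" in spec and v > spec["max"]:
            return False
        if "allowed" in spec and v not in spec["allowed"]:
            return False
    return True
```

For instance, is_valid("subject", [0, 2]) accepts a combination of subject values, while is_valid("suds", 1.5) rejects a value outside the declared range.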

A label list is used to transmit a set of PICS labels. The following is a label list for two documents rated using the above rating system.

    (PICS-1.1 ""
      by "John Doe"
      labels on "1994.11.05T08:15-0500"
             until "1995.12.31T23:59-0000"
             for ""
             ratings (suds 0.5 density 0 color/hue 1)
             for ""
             by "Jane Doe"
             ratings (subject 2 density 1 color/hue 1))

PICS-NG (Next Generation) was a W3C effort based on the observation that the PICS infrastructure could be generalized to support arbitrary Web metadata, with PICS categories serving as metadata attributes, having meanings defined by the rating system. The W3C paper Catalogs: Resource Description and Discovery also observes that the structure of a PICS label is similar to:

The PICS-NG effort defined a Metadata Object Model, and its encodings in XML and as S-expressions, in the note PICS-NG Metadata Model and Label Syntax. This model includes a number of extensions to the original PICS representation scheme, in order to support more general forms of metadata. These extensions include such things as:

Other papers related to this effort include:

The PICS-NG effort has been merged with other work to become W3C's Resource Description Framework activity (see Section 2.2.6).

PICS illustrates a number of important ideas in data modeling and metadata representation. One such idea is the definition of specific required data items (e.g., category, label) having predefined meanings in the model. Such specifications are important in supporting interoperability among applications that use PICS ratings. PICS also illustrates the use of metalevel pointers. The URLs that identify rating services and rating systems in PICS point to information that describes PICS metadata (i.e., to metametadata). These illustrate the idea that a given piece of data on the Web, no matter what its intended purpose (e.g., whether it is intended to represent data or metadata), can itself point to (or be related in some other way to) data that can be used to help interpret it. Finally, PICS illustrates the use of a metalevel (or reflective) architecture. PICS requires that ordinary requests for data on the Web be interrupted or intercepted, so that rating information about the requested resource can be retrieved, and a decision made about whether to return the requested data or not. This same basic idea can be used to enhance individual requests with other types of additional processing, often transparently to users. For example, such processing could be used to bracket a collection of individual requests to form a database-like transaction, by adding interactions with a transaction processor to these requests. Examples of such processing are described in [CM93, Man93, SW96]. These same ideas are the basis for current OBJS work on an Intermediary Architecture for the Web.

As illustrated by the existence of a PICS-NG effort, PICS itself requires extensions to deal with more general metadata requirements. Some of these are described further in the discussion of the Resource Description Framework (Section 2.2.6). In addition, in order to provide a complete Web object model, PICS and similar ideas must be augmented with an API providing applications with easy access to the state, and with mechanisms to link code to the state represented using models such as PICS. These aspects will be discussed in subsequent sections.

2.2.4 XML-Data

XML-Data is a submission to W3C by Microsoft, ArborText, DataChannel, and INSO. XML-Data defines an XML vocabulary for schemas, that is, for defining and documenting object classes. It can be used either for classes which are strictly syntactic (for example, XML), or which indicate concepts and relations among concepts (as used in relational databases, knowledge representation graphs, and RDF). The former are called "syntactic schemas"; the latter, "conceptual schemas."

For example, an XML document might contain a "book" element which lexically contains an "author" element and a "title" element. An XML-Data schema can describe such syntax. However, in another context, it may simply be necessary to represent more abstractly that books have titles and authors, irrespective of any syntax. XML-Data schemas can also describe such conceptual relationships. Further, the information about books, titles and authors might be stored in a relational database, in which case XML-Data schemas can describe the database row types and key relationships.

One immediate implication of the ideas in XML-Data is that, using XML-Data, XML document types can be described using XML itself, rather than DTD syntax. Another is that XML-Data schemas provide a common vocabulary for ideas which overlap between syntactic, database and conceptual schemas. All features can be used together as appropriate.

Schemas in XML-Data are composed principally of declarations for:

The following simple example taken from the XML-Data submission shows some data about books and authors, and the XML-Data schema which describes it (note the use of the XML Namespace facility, described in Section 2.1.4, for qualifying names).

Some data:

    <?xml:namespace name="" as="bk"?>
    <?xml:namespace name="" as="ecom"?>
    <bk:booksAndAuthors>
        <Person>
            <name>Henry Ford</name>
            <birthday>1863</birthday>
        </Person>
        <Person>
            <name>Harvey S. Firestone</name>
        </Person>
        <Person>
            <name>Samuel Crowther</name>
        </Person>
        <Book>
            <author>Henry Ford</author>
            <author>Samuel Crowther</author>
            <title>My Life and Work</title>
        </Book>
        <Book>
            <author>Harvey S. Firestone</author>
            <author>Samuel Crowther</author>
            <title>Men and Rubber</title>
            <ecom:price>23.95</ecom:price>
        </Book>
    </bk:booksAndAuthors>

The schema for

    <?xml:namespace name="urn:uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882/" as="s"?>
    <?xml:namespace href="" as="ecom"?>
    <s:schema>
        <elementType id="name">
            <string/>
        </elementType>
        <elementType id="birthday">
            <string/>
            <dataType dt="date.ISO8601"/>
        </elementType>
        <elementType id="Person">
            <element type="#name" id="p1"/>
            <element type="#birthday" occurs="OPTIONAL">
                <min>1700-01-01</min><max>2100-01-01</max>
            </element>
            <key id="k1"><keyPart href="#p1" /></key>
        </elementType>
        <elementType id="author">
            <string/>
            <domain type="#Book"/>
            <foreignKey range="#Person" key="#k1"/>
        </elementType>
        <elementType id="writtenWork">
            <element type="#author" occurs="ONEORMORE"/>
        </elementType>
        <elementType id="Book" >
            <genus type="#writtenWork"/>
            <superType href=""/>
            <superType href=""/>
            <group groupOrder="SEQ" occurs="OPTIONAL">
                <element type="#preface"/>
                <element type="#introduction"/>
            </group>
            <element href=""/>
            <element href="ecom:quantityOnHand"/>
        </elementType>
        <elementTypeEquivalent id="livre" type="#Book"/>
        <elementTypeEquivalent id="auteur" type="#author"/>
    </s:schema>

While this example does not illustrate all of the capabilities of XML-Data, it does illustrate the capabilities of declaring such things as:

The submission should be consulted for further details and additional examples.

XML-Data is another example of a higher-level model built using XML as its representation. It is not yet clear how the overlap in metadata capabilities between such representations as DTDs, RDF, and XML-Data will work out. The XML-Data approach may prove to be better than DTDs in supporting some types of processing, such as database-like operations, since it makes no distinctions between data and metadata representations. Like the other data models described in this section, XML-Data is not sufficient to form a complete Web object model. In particular, it requires integration with an API facility and a mechanism to access associated code.
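To make the key relationship in the example concrete, the following Python sketch checks the constraint expressed by the foreignKey declaration: every author value should match the name of some Person. It is only an illustration, not an XML-Data processor. The document used here is a simplified copy of the sample data with the namespace declarations and prefixes removed (an assumption made so that a standard parser accepts it), and the function name dangling_authors is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the sample data, without namespace prefixes.
DATA = """
<booksAndAuthors>
  <Person><name>Henry Ford</name><birthday>1863</birthday></Person>
  <Person><name>Samuel Crowther</name></Person>
  <Book>
    <author>Henry Ford</author>
    <author>Samuel Crowther</author>
    <title>My Life and Work</title>
  </Book>
</booksAndAuthors>
"""

def dangling_authors(xml_text):
    """Return author values that do not match the name of any Person."""
    root = ET.fromstring(xml_text)
    person_names = {p.findtext("name") for p in root.findall("Person")}
    return [a.text for a in root.iter("author") if a.text not in person_names]

missing = dangling_authors(DATA)
```

An empty result means that every author reference resolves to a Person, which is the property a schema-aware tool could verify from the schema alone.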

2.2.5 Meta Content Framework (MCF)

Netscape's Meta Content Framework (MCF) [GB97] is a proposal for a metadata model based on the increasing need for machine-readable descriptions of distributed information. MCF is based on the following principles:

The latter point is particularly important. If all applications save their data in XML format, this would be somewhat more open than the use of proprietary formats, since any application could access the resulting documents. However, in order for applications to meaningfully process those documents, it would be necessary for the applications to recognize the various labels and structures used in those documents, and their associated semantics. Agreements on data models and vocabularies allow this sort of mutual recognition of labels and structure among applications, thus supporting interoperability.

MCF is essentially a structure description language. The basic information structure used is the Directed Labeled Graph (DLG). An MCF database is a set of DLGs, consisting of:

Nodes represent things like web pages, images, subject categories, and sites. The labels are nodes that correspond to properties such as size or lastRevisionDate used to describe web pages, subject categories, etc., and also to define relations such as hyperlinks, authorship, or parenthood, between these things.

Each label/property type, such as pageSize, is a node (but not all nodes are property types). Since labels are nodes, they can participate in relationships that, e.g., define their semantics. For example, a pageSize node could have properties that specify its domain (e.g., Document), its range (sizeInBytes), that a Document has only one pageSize, and that provide human-readable documentation of the intended semantics.
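The idea that arc labels are themselves nodes can be shown with a toy directed labeled graph. The Python sketch below is an assumption-laden illustration, not MCF syntax or an MCF API: arcs are stored as (source, label, target) tuples, and the helper names add_arc and properties_of, along with the node name page1 and the size value, are hypothetical.

```python
# A toy directed labeled graph in which arc labels are ordinary nodes.
arcs = set()

def add_arc(source, label, target):
    """Record one labeled arc from source to target."""
    arcs.add((source, label, target))

def properties_of(node):
    """Return the (label, target) pairs of all arcs leaving a node."""
    return {(label, target) for (source, label, target) in arcs if source == node}

# pageSize is used as a label on an arc leaving a page...
add_arc("page1", "pageSize", 2048)
# ...and is also the source node of arcs that describe its own semantics.
add_arc("pageSize", "domain", "Document")
add_arc("pageSize", "range", "sizeInBytes")
```

Because pageSize appears both in the label position and in the source position, the same query function serves to describe pages and to describe the property type itself.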

An MCF node can be either a primitive data type or a "Unit". The primitive data types are the same as the Java primitive types. In addition, a DATE type should be supported by the low-level MCF machinery. The concept of a "Unit" corresponds loosely to the Java concept of "Object".

MCF defines a small set of units with predefined semantics in order to "bootstrap" the type system. These include, among others:

MCF recognizes that, for purposes of interoperability, it would be good to standardize the vocabulary for commonly-used terms. [GB97] proposes some items for this vocabulary (largely derived from existing standards such as the Dublin Core) for describing Web content. [GB97] also defines an XML-based syntax for representing MCF. This essentially defines a type system for XML.

Like PICS, MCF illustrates a number of important ideas in data modeling and metadata representation. For example, MCF illustrates both the use of specific required data items having predefined meanings in the model, and metalevel pointers. Unlike PICS, MCF represents a data model that can be used for more general purposes than content labeling. For example, it includes a type hierarchy, a richer set of base types, and other aspects of a full data model. In addition to required data items representing aspects of the model structure, the MCF reference identifies a list of suggestions for standard application-specific item names borrowed from the Dublin Core and elsewhere. MCF "units" are similar to the individual elements of the OEM model. Many MCF concepts have been incorporated into W3C's RDF (described in the next section). However, as noted in connection with other models in this section, these concepts must be combined with an API and a mechanism for integrating behavior to provide full object model support.

2.2.6 Resource Description Framework (RDF)

The World Wide Web Consortium's Resource Description Framework (RDF) effort is currently developing a mechanism designed for exchanging machine-understandable metadata describing Web resources. This type of metadata can be used, e.g.:

The work combines extensions of the PICS technology to support more general metadata requirements with work on metadata models such as Netscape's Meta Content Framework (MCF) and Microsoft's WebCollections [Hop97]. The current RDF draft specification defines both a data model for representing RDF metadata, and an XML-based syntax for expressing and transporting the metadata.

The basis of RDF is a model for representing named properties and their values. These properties serve both to represent attributes of resources (and in this sense correspond to attribute/value pairs) and to represent relationships between resources. The RDF data model is a syntax-independent way of representing RDF statements.

The core RDF data model is defined in terms of:

  1. a set of Nodes (N)
  2. a set of PropertyTypes (P), a subset of N
  3. a set of 3-tuples T, whose elements are informally known as properties. The first item of each tuple is an element of P, the second item is an element of N, and the third item is either an element of N or an atomic value (e.g. a Unicode string).

(thus resembling MCF).
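The three sets above map directly onto simple data structures. The following Python sketch is one possible in-memory rendering, offered as an illustration only. The names add_property and values_of, the value_is_node flag, and the identifier "urn:example:page" are assumptions introduced here and do not come from the RDF draft.

```python
# Minimal in-memory sketch of the sets N, P and T.
nodes = set()            # N
property_types = set()   # P, maintained as a subset of N
triples = set()          # T: (property type, node, value)

def add_property(prop_type, resource, value, value_is_node=False):
    """Record one property as a triple, keeping P a subset of N."""
    property_types.add(prop_type)
    nodes.add(prop_type)
    nodes.add(resource)
    if value_is_node:
        nodes.add(value)  # otherwise the value is treated as atomic
    triples.add((prop_type, resource, value))

def values_of(resource, prop_type):
    """Return every value recorded for a resource and property type."""
    return {v for (p, r, v) in triples if p == prop_type and r == resource}

# "urn:example:page" is a placeholder identifier.
add_property("author", "urn:example:page", "John Smith")
```

Note that the tuple order follows the definition above: property type first, then the node being described, then the value.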

In this data model both the resources being described and the values describing them are nodes in a directed labeled graph (values may themselves be resources). The arcs connecting pairs of nodes correspond to the names of the property types. This is represented pictorially as:

     [resource R] ---propertyType P---> [value V]

and can be read "V is the value of the property P for resource R", or left-to-right: "R has property P with value V". For example, the statement "John Smith is the Author of the Web page "" would be represented as:

   [] ---author---> "John Smith"

where the notation [URI] denotes the instance of the resource identified by URI and "..." denotes a simple Unicode string.

According to the above definition, the property "author", i.e. the arc labeled "author" plus its source and target nodes, is the triple (3-tuple):

   {author, [], "John Smith"}

where "author" denotes a node used for labeling this arc. The triple composed of a resource, a property type, and a value is an RDF statement.

A collection of these triples with the same second item is called an assertions. Assertions are particularly useful when describing a number of properties of the same resource. Assertions are diagramed as follows:

    [resource R]-+---property P1----> [value Vp1]
                 |
                 +---property P2----> [value Vp2]

An RDF assertions can be a resource itself and can therefore be described by properties; that is, an assertions can itself be used as the source node of an arc. The name assertions is suggestive of the fact that the properties specified in it are effectively (logical) assertions about the resource being described. This establishes a relationship between RDF and a logic-based interpretation of the data structure which will be further developed in Section 3.

Assertions may be associated with the resource they describe in one of four ways:
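Grouping triples by their second item, as the definition of an assertions requires, is a small exercise. The Python sketch below assumes triples are (property type, resource, value) tuples. The function name group_assertions and the sample values are hypothetical.

```python
from collections import defaultdict

def group_assertions(triples):
    """Group (property type, resource, value) triples by their second item."""
    grouped = defaultdict(list)
    for prop_type, resource, value in triples:
        grouped[resource].append((prop_type, value))
    return dict(grouped)

example = [
    ("P1", "R", "Vp1"),
    ("P2", "R", "Vp2"),
    ("P1", "S", "other"),
]
by_resource = group_assertions(example)
```

Each entry of by_resource then corresponds to one fan-shaped diagram of the kind shown above, with the resource at the root and one arc per property.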

  1. the assertions may be contained within the resource (embedded)
  2. the assertions may be external to the resource but supplied by the transfer mechanism in the same retrieval transaction as that which returns the resource (along-with)
  3. the assertions may be retrieved independently from the resource, including from a different source (service bureau)
  4. the assertions may contain the resource (wrapped)

Not all resources will support all association methods (e.g., many resource types will not support embedding).

The set of properties in a given assertions, as well as any characteristics or restrictions of the property values themselves, are defined by one or more schemas. Schemas are identified by a URL. An assertions may contain properties from more than one schema. RDF uses the XML namespace mechanism to associate the schema with the properties in the assertions. The schema URL may be treated merely as an identifier, or it may refer to a machine-readable description of the schema. By definition, an application that understands a particular schema used by an assertions understands the semantics of each of the contained properties. An application that has no knowledge of a particular schema will minimally be able to parse the assertions into the property and property value components, and will be able to transport the assertions intact (e.g., to a cache or to another application).

A human- or machine-readable description of an RDF schema may be accessed through content negotiation by dereferencing the schema URL. If the schema is machine-readable, it may be possible for an application to dynamically learn some of the semantics of the properties named in the schema.

An RDF statement can itself be the target node of an arc (i.e. the value of some other property) or the source node of an arc (i.e. it can have properties). In these cases, the original property (i.e., the statement) must be reified; that is, converted into nodes and arcs. RDF defines a reification mechanism for doing this. Reified properties are drawn as a single node with several arcs emanating from it representing the resource, property name, and value:

    [property P1]-+---PropName---> ["name"]
                  |
                  +---PropObj----> [resource R]
                  |
                  +---PropValue--> [value Vp1]

This allows RDF to be used to make statements about other statements; for example, the statement "Joe believes that the document 'The Origin of Species' was authored by Charles Darwin" would be diagramed as:

    [Joe]--believes-->[stmnt1]+--InstanceOf-> RDF:Property
                              |
                              +--PropName->"author"
                              |
                              +--PropObj->[]
                              |
                              +--PropValue->"Charles Darwin"

To help in reifying properties, RDF defines the InstanceOf relation (property) to provide primitive typing, as shown in the example.

To reify a property, all that is done is to add to the data model an additional node (with a generated label) and the three triples with first items (or arcs with labels) using the predefined names RDF:PropName, RDF:PropObj, and RDF:PropValue respectively, second item the generated node label, and third item the corresponding property type, resource node, and value node respectively. In the above example, the three added triples would be:

    {PropName, stmnt1, "author"}
    {PropObj, stmnt1, []}
    {PropValue, stmnt1, "Charles Darwin"}

(The use of the "RDF:" prefix in names illustrates the use of the XML namespace mechanism to qualify names to indicate the schema in which they are defined.)
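The reification step described above is mechanical enough to sketch in Python. This is an illustration under assumptions: triples are (property type, node, value) tuples, the function name reify is hypothetical, the generated label is passed in by the caller, and "urn:example:origin" is a placeholder for the document identifier, which is not given in the text.

```python
def reify(triple, label):
    """Return the three triples that describe `triple` as a node named `label`."""
    prop_type, resource, value = triple
    return {
        ("RDF:PropName", label, prop_type),
        ("RDF:PropObj", label, resource),
        ("RDF:PropValue", label, value),
    }

# Placeholder identifier for the document being described.
statement = ("author", "urn:example:origin", "Charles Darwin")
model = reify(statement, "stmnt1")

# The reified statement can now serve as the value of another property.
model.add(("believes", "Joe", "stmnt1"))
```

The sketch adds only the three triples named in the text; a fuller version might also add an RDF:InstanceOf triple typing the new node as an RDF:Property, as the diagram shows.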

Frequently it is necessary to create a collection of nodes; e.g. to state that a property has multiple values. RDF defines three kinds of collections: ordered lists of nodes, called sequences; unordered lists of nodes, called bags; and lists that represent alternatives for the (single) value of a property, called alternatives. To create collections of nodes, a new node is created that is an RDF:InstanceOf one of the three node types RDF:Seq, RDF:Bag, or RDF:Alternatives. The remaining arcs from that new node point to each of the members of the collection and are uniquely labeled using the elements from Ord (ordinal labels such as RDF:1). For the RDF:Alternatives, there must be at least one member whose arc label is RDF:1, and that is the default value for the Alternatives node.
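A collection node can be built from the rules just given. The Python sketch below again assumes (property type, node, value) tuples. The function name make_collection, the label coll1, and the member values are hypothetical, and the ordinal labels are assumed to continue as RDF:2, RDF:3, and so on after RDF:1.

```python
KINDS = {"RDF:Seq", "RDF:Bag", "RDF:Alternatives"}

def make_collection(label, kind, members):
    """Return the triples for a collection node named `label`."""
    if kind not in KINDS:
        raise ValueError(f"unknown collection type: {kind}")
    if kind == "RDF:Alternatives" and not members:
        # An Alternatives node needs a default member labeled RDF:1.
        raise ValueError("RDF:Alternatives needs at least one member")
    triples = [("RDF:InstanceOf", label, kind)]
    for position, member in enumerate(members, start=1):
        triples.append((f"RDF:{position}", label, member))
    return triples

authors = make_collection("coll1", "RDF:Seq", ["Henry Ford", "Samuel Crowther"])
```

The resulting node coll1 can then be used as the value of a single property, which is how a property comes to have several values.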

The RDF data model provides an abstract, conceptual framework for defining and using metadata. A concrete syntax is also needed for the purpose of authoring and exchanging this metadata. The syntax does not add to the model, and APIs could be provided to manipulate RDF metadata without reference to a concrete syntax. RDF uses XML encoding as its syntax. However, RDF does not require an XML DTD for the contents of assertion blocks (and RDF schemas are not required to be XML DTDs). In this respect, RDF requires at most that its XML representations be well-formed.

RDF defines several XML elements for its XML encoding. The RDF:serialization element is a simple wrapper that marks the boundaries in an XML document, where the content is explicitly intended to be mappable into an RDF data model instance. RDF:assertions and RDF:resource contain the remaining elements that instantiate properties in the model instance. Each XML element E contained by an RDF:assertions or an RDF:resource results in the creation of a property (a triple that is an element of the formal set T defined earlier).

With these basic principles defined, directed graph models of arbitrary complexity can be constructed and exchanged. A simple example would be "John Smith is the Author of the document whose URL is" (all these examples are taken from the RDF paper cited above, but updated to use more recent XML namespace syntax). This assertion can be modeled with the directed graph:

   [] ---bib:author---> "John Smith"

(This report uses a notation where Nodes are represented by items in square brackets, arcs are represented as arrows, and strings are represented by quoted items.) This small graph can be exchanged in the serialization syntax as:

    <?xml:namespace name="" as="bib"?>
    <?xml:namespace name="" as="RDF"?>
    <RDF:serialization>
      <RDF:assertions href="">
        <bib:author>John Smith</bib:author>
      </RDF:assertions>
    </RDF:serialization>

This example illustrates how the resource, property name, and value are translated into XML.
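The translation from a triple to this serialization can be sketched as a small Python function. This is an illustration, not an RDF serializer: it handles only a single string-valued property, it omits the namespace declarations (their URIs are not given above), and the function name serialize and the identifier "urn:example:doc" are assumptions.

```python
from xml.sax.saxutils import escape, quoteattr

def serialize(resource_uri, prop_name, value):
    """Render one (property, resource, string value) statement as XML text."""
    return (
        "<RDF:serialization>\n"
        f"  <RDF:assertions href={quoteattr(resource_uri)}>\n"
        f"    <{prop_name}>{escape(value)}</{prop_name}>\n"
        "  </RDF:assertions>\n"
        "</RDF:serialization>"
    )

# "urn:example:doc" is a placeholder resource identifier.
xml_text = serialize("urn:example:doc", "bib:author", "John Smith")
```

The resource becomes the href attribute of RDF:assertions, the property name becomes an element name, and the value becomes that element's content, escaped so that characters such as & remain well-formed.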

A more elaborate model could be created in order to say additional things about John Smith, such as his contact information, as in the model:

    []
          |
       bib:author
          |
          V
    [John Smith]-+---bib:name----> "John Smith"
                 |
                 +---bib:email----> ""
                 |
                 +---bib:phone----> "+1 (555) 123-4567"

which could be exchanged using the XML serialization representation:

    <?xml:namespace name="" as="bib"?>
    <?xml:namespace name="" as="RDF"?>
    <RDF:serialization>
      <RDF:assertions href="">
        <bib:author>
          <RDF:resource>
            <bib:name>John Smith</bib:name>
            <bib:email></bib:email>
            <bib:phone>+1 (555) 123-4567</bib:phone>
          </RDF:resource>
        </bib:author>
      </RDF:assertions>
    </RDF:serialization>

The serialization above is equivalent to this second serialization:

    <?xml:namespace name="" as="bib"?>
    <?xml:namespace name="" as="RDF"?>
    <RDF:serialization>
      <RDF:assertions href="">
        <bib:author href="#John_Smith"/>
      </RDF:assertions>
    </RDF:serialization>
    <RDF:resource id="John_Smith">
      <bib:name>John Smith</bib:name>
      <bib:email></bib:email>
      <bib:phone>+1 (555) 123-4567</bib:phone>
    </RDF:resource>

In these representations, the RDF:resource element creates an in-line resource. Typically such a resource will be a surrogate, or proxy, for some other real resource that does not have a recognizable URI. The id= attribute in the second representation provides a name for the resource element so that the resource may be referred to elsewhere.

As an example of making a statement about a statement, consider the case of computing a digital signature on an RDF assertion. (It is assumed that the signature is computed over a concrete XML representation of the assertion rather than over an internal representation. The figure below shows a box containing a small graph. This is a convention to indicate that the XML content whose ID is foo is a concrete representation of the graph it contains.) What is to be specified in the model is expressed by the pair of graphs below - that there is an XML encoding of some assertion, and that there is some other XML content that is a digital signature over that encoding.

    +---------------------------------------------------------------+
    | ID=foo                                                        |
    |                                                               |
    |  [] ---DC:creator---> "John Smith"                            |
    |                                                               |
    +---------------------------------------------------------------+

    [foo]------DSIG:Signature------>"AKGJOERGHJWEJ348GH4GHEIGH4ROI4"


The details could be expressed in the model below:

           "AKGJOERGHJWEJ348GH4GHEIGH4ROI4"<--RDF:PropValue----+
                                                               |
                         [DSIG:Signature]<----RDF:PropName-----+
                                                               |
          +--RDF:InstanceOf-->[RDF:Property]<--RDF:InstanceOf--+
          |                                                    |
          |                                                    |
        [foo]<----------------RDF:PropObj-----------------[prop-001]
          |
          |
          +---------------------------------------------+
          |                                             |
          +-----------------------------+               |
          |                             |               |
       RDF:PropObj                  RDF:PropName   RDF:PropValue
          |                             |               |
          V                             V               V
    [] ---DC:creator---> "John Smith"

These models could also be expressed as:

    <?xml:namespace name="" as="DC"?>
    <?xml:namespace name="" as="RDF"?>
    <?xml:namespace name="" as="DSIG"?>
    <RDF:serialization>
      <RDF:assertions href="" id="foo">
        <DC:creator>John Smith</DC:creator>
      </RDF:assertions>
      <RDF:assertions href="#foo">
        <DSIG:Signature>AKGJOERGHJWEJ348GH4GHEIGH4ROI4</DSIG:Signature>
      </RDF:assertions>
    </RDF:serialization>

(Note that node labels such as "RDF:Property" are shorthand for a full URI such as "").

The RDF data model intrinsically only supports binary relations. However, higher arity relations can also be represented, using just binary relations. As an example, consider the subject of one of John Smith's recent articles - library science. The Dewey Decimal Code for library science could be used to categorize that article. While the numeric code is the true Dewey value, few people can understand those codes. Therefore, the description of the Dewey categories has been translated into several different languages. In fact, Dewey Decimal codes are far from the only subject categorization scheme. So, it might be desirable to define a "Subject" node that not only specified the subject of a paper, but also indicated the language and categorization scheme it came from. That might look like:

    []
          |
       DC:subject
          |
          V
    [subject_001]-+---DC:scheme----> "Dewey Decimal Code"
                  |
                  +---DC:lang----> "English"
                  |
                  +---RDF:PropValue----> "020 - Library Science"

which could be exchanged as:

    <?xml:namespace name="" as="DC"?>
    <?xml:namespace name="" as="RDF"?>
    <RDF:serialization>
      <RDF:assertions href="">
        <DC:subject>
          <RDF:resource id="subject_001">
            <DC:scheme>Dewey Decimal Code</DC:scheme>
            <DC:lang>English</DC:lang>
            <RDF:PropValue>020 - Library Science</RDF:PropValue>
          </RDF:resource>
        </DC:subject>
      </RDF:assertions>
    </RDF:serialization>

A common use of this higher-arity capability is when dealing with units of measure. A person's weight is not just a number like 94, it also requires specification of the units on that number. In this case either pounds or kilograms might be used. A relationship with an additional arc might be used to record the fact that John Smith is a rather strapping gentleman:

                                             +--NIST:units--> "pounds"
                                             |
    [John Smith]--NIST:weight-->[weight_001]-+
                                             |
                                             +--RDF:PropValue--> "200"

which can be exchanged as:

    <?xml:namespace name="" as="NIST"?>
    <?xml:namespace name="" as="RDF"?>
    <RDF:serialization>
      <RDF:assertions href="John_Smith">
        <NIST:weight>
          <RDF:resource id="weight_001">
            <NIST:units href="#pounds"/>
            <RDF:PropValue>200</RDF:PropValue>
          </RDF:resource>
        </NIST:weight>
      </RDF:assertions>
    </RDF:serialization>

assuming the node "pounds" was defined elsewhere.
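The pattern used in both examples, an intermediate node that carries the principal value plus any qualifiers, can be captured in a short Python sketch. As before, triples are assumed to be (property type, node, value) tuples, and the function name qualified_value is hypothetical.

```python
def qualified_value(prop_type, resource, label, value, qualifiers):
    """Express a higher-arity relation with binary triples by routing
    the value through an intermediate node named `label`."""
    triples = {
        (prop_type, resource, label),       # resource points at the new node
        ("RDF:PropValue", label, value),    # the principal value
    }
    for qualifier, qualifier_value in qualifiers.items():
        triples.add((qualifier, label, qualifier_value))
    return triples

weight = qualified_value(
    "NIST:weight", "John_Smith", "weight_001", "200",
    {"NIST:units": "pounds"},
)
```

Each extra qualifier adds one more binary arc on the intermediate node, so relations of any arity reduce to the same three-item tuples.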

The RDF effort is attempting to define a very general abstract metadata architecture and associated support facilities. RDF, like MCF, illustrates how a higher level model can be used together with XML to support specific types of application requirements, and illustrates a number of the same metadata modeling ideas as MCF. The RDF examples above specifically illustrate a requirement for metalevel pointers to explicitly link tags to attribute definitions (by an explicit pointer, not by looking up the name in a dictionary). The more powerful facilities of XML for defining hyperlinks will improve the ability to define very general relationships between data and metadata that describes (and can help interpret) it. For example, the advanced XML linking facilities defined in XLL would allow assertions to refer to parts of referenced documents. It seems likely that RDF will also investigate mechanisms to automatically provide access to RDF metadata at runtime (implementing the various association modes such as along-with), similar to the mechanisms provided by PICS for content labels. In implementing a Web object model, these techniques will be required to gain access to the object methods (which may be either embedded in the Web page, or located as separate resources).

Because of its generality in representing metadata, and the likelihood that it will be the basis of future Web developments in representing metadata, the Web object model described in Section 3 uses RDF (and its XML representation) as part of its structural base (although RDF is currently incomplete, and will be developed further). Additional aspects of MCF may be used as well, depending on more detailed analysis to be performed later. Section 3 will describe further decisions about the nature of the object model, based on RDF as a starting point.

However, RDF and MCF themselves are not sufficient to support all requirements of a Web object model. For example, the object model requires an API to its state representation, and thus RDF and MCF must be integrated with parallel work on a Document Object Model (see below), which is not currently the case. Also, mechanisms for linking code to RDF and MCF structures must be further developed. Finally, structured database capabilities do not exist for these structures, and must be worked out.

2.3 Adding Behavior to Web Pages

Previous sections have noted that what is needed to progress towarda Web object model is:

Section 2.1 described work toward providing the Web with a richer baserepresentation (e.g., XML). The metadata and model work described in Section2.2 described approaches for adding additional structure to this representation.In addition, as noted in the introduction to Section 2.2, these techniquesfor representing metadata and linking it to Web resources provide a conceptualframework for linking behavior to Web resources, by treating thecode implementing that behavior as a form of metadata. Code resources arealready being stored on the Web, e.g., in program libraries supportingreuse, and it is already possible to create links between Web documentsand such resources. However, in using code resources to create objects,it is necessary to reflect the special semantics associated with theselinks. These semantics somewhat resemble those of metadata such as contentlabels, in the sense that rather than the user explicitly following thelinks to retrieve the associated "metadata", some of the "metadata"is automatically retrieved during access to the original resource, in orderto support some special processing. In the case of content labels, thespecial processing involves checking the content labels against user-specifiedrequirements in order to determine whether to allow access to the originalresource. In the case of object methods, the special processing involvesinvoking the retrieved code in order to perform some operation . This particularapproach to representing and invoking object methods will be discussedfurther in Section 3.

This section describes several mechanisms developed within the Web community for defining relationships between state and code, and for providing an API to state (the second and third bullets above). Specifically, techniques developed for embedding objects and scripts in Web documents represent one way of associating behavior with the state represented by a Web document. The W3C's Document Object Model (DOM) effort represents another way of addressing this issue, as well as the issue of providing an API to this state. These two issues are closely related.

A program must gain access to data in order to process it, and so an object method must have access to the object's state. It is always possible to pass data as a value to a program. However, the program must understand the structure of this data in order to access it efficiently. Conventional object models provide what is in effect a special API for object methods to use when accessing state for this purpose. This is also necessary in a Web object model. However, the need for such an API becomes especially important when the state has a rich, complex structure, such as an XML document. Without an API to this state (and its implementation), each program would have to implement a considerable amount of code simply to parse the structure, in order to locate the parts of the document required for specific purposes. An API providing access to the various parts of a document, together with an implementation of this API as part of the general representation of this state's "data type", provides this code as a pre-existing component, allowing the program to concentrate on application-related processing. The DOM provides such an API. At the same time, it provides part of a general mechanism (albeit a very unconstrained one) for linking code and state, since it provides a straightforward mechanism for code (currently, programs such as plug-ins or external applications) to access the state it needs.

Finally, the Web Interface Definition Language (described in Section 2.3.3) is commercial technology that represents another mechanism for providing an API to state (as well as to Web-based services).

2.3.1 Document Object Model (DOM)

W3C's Document Object Model (DOM) effort provides a mechanism for scripts or programs to access and manipulate parsed HTML and XML content (including all markup and any Document Type Definitions) as a collection of objects. Specifically, DOM defines an object-oriented API of an HTML or XML document that a Web client can present to programs (applications or scripts) that need to process the document. The client (at least conceptually) operates off this collection of objects in displaying the document. Thus, by operating on the collection of objects representing a Web page, scripts or programs can change styles and attributes of page elements, or even replace existing elements with new ones, resulting in an immediate change to the data displayed to the user. As a result, DOM makes it easy to implement dynamic content on the client, rather than forcing all such content to be implemented on the server, and provides a basic way to integrate a document's data with code. For example, a client might implement a JavaScript DOM interface, so that scripts in this language could be used within the page itself to manipulate the page. The client could also provide a DOM interface to external applications such as plug-ins, allowing them to access the document via the client. Similarly, an editor might implement a Java DOM interface to allow programs written in Java to interact with the editor to manipulate the page.

DOM is a generalization of Dynamic HTML facilities defined by Microsoft and Netscape. Functionality equivalent to the Dynamic HTML support provided by Netscape Navigator 3.0 and Microsoft Internet Explorer 3.0 is referred to as "DOM level 0". DOM level 1 extends these capabilities to, for example, allow creation "from scratch" of entire Web documents in memory by creating the appropriate objects. The DOM Working Draft specification includes level 1 Core specifications which apply to both HTML and XML documents, and level 1 specializations for HTML and XML documents. The DOM object class definitions in these specifications have their interfaces defined using OMG IDL. Java interface specifications are also defined (see the specifications for details).

DOM represents a document as a hierarchy of objects, called nodes, which are derived (by parsing) from a source representation of the document (HTML or XML). The DOM object classes represent generic components of a document, and hence define a document object metamodel. The DOM Level 1 working draft defines a set of object classes (and their inheritance relationships) for representing documents. The major classes are:

Node
  |
  +--Document
  |    |
  |    +--HTMLDocument
  |
  +--Element
  |    |
  |    +--HTMLElement
  |         |
  |         +--specific HTML elements
  |
  +--Attribute
  |
  +--Text
  |
  +--PI [Processing Instruction, an XML concept from SGML]
  |
  +--Comment

The Node object is the base type for all objects in the DOM. It may have an arbitrary number (including zero) of sequentially-ordered child nodes. It usually has a parent Node, the exception being that the root Node in a document tree has no parent.

Element objects represent the elements in HTML and XML documents. Elements contain, as child nodes, all of the content between the start tag and the corresponding end tag of an element. Aside from Text nodes, the vast majority of node types that applications will encounter when traversing a document structure will be Element nodes. Element objects also have a list of Attribute objects which represent the set of attributes explicitly defined as part of the element, and those defined in the DTD that have default values.

Text objects are used to represent any non-markup values, whether the values are intended to represent an integer, date, or some other type of value. For XML documents, all whitespace between markup results in Text objects being created.

The Document object is the root node of a document object tree, and represents the entire HTML or XML document. The HTMLDocument subtype represents a specialization of the generic Document type for the specific requirements of HTML documents.

Additional object classes are defined in the working draft for representing XML Document Type Definitions, and auxiliary data structures (e.g., lists of nodes).

Normally, a DOM-compliant implementation will make the main Document instance available to the application through some implementation-defined mechanism. For example, a typical implementation would give the application a reference to a DocumentContext object. This object describes the source of the document, as well as related information such as the date and time the document was last changed. From the DocumentContext, the application may access the Document object, which is the root of the document object hierarchy. From the Document object, the application can use the methods provided for accessing individual nodes, selecting specific node types (such as all images), and so on. For XML documents, the DTD is available through the documentType method (which returns null for HTML documents and XML documents without DTDs). Document also defines a getElementsByTagName method. This produces an enumerator that iterates over all Element nodes within the document whose tagName matches the input name provided. (The DOM working draft indicates that a future version of the DOM will provide a more generalized querying mechanism for nodes.)
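As a rough illustration of this access pattern, the following sketch uses Python's standard xml.dom.minidom module. This choice is an assumption of the example, not part of the working draft: minidom follows the later W3C DOM Recommendation, so it has no DocumentContext object, it exposes the DTD through a doctype attribute rather than a documentType method, and its getElementsByTagName returns a list rather than an enumerator. The document content and the tag names are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical XML source; the tag names are illustrative only.
SOURCE = (
    "<mydata><authors><author>"
    "<name>Robert Roberts</name>"
    "</author></authors></mydata>"
)

# parseString returns the Document object, the root of the node hierarchy.
document = parseString(SOURCE)

# No DTD was supplied, so the document type is None.
print(document.doctype)

# Select every Element whose tag name matches the given name.
names = document.getElementsByTagName("name")
for element in names:
    # The field value is held in a Text child node of the Element.
    print(element.tagName, "->", element.firstChild.data)
```

The application never parses markup itself; it only asks the Document for the nodes it needs.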

As an example generally illustrating how an XML document might be presented to an application in the DOM, consider the example described in Section 2.1.4 of a simple relational database represented in XML. The DOM for XML would present the XML document to an application as a collection (actually, a tree) of objects. Most of these objects would be of type Node, and specifically of its subtypes Element (representing the individual elements) and Text (representing the content). More precisely:

<!doctype mydata ""><mydata>...</mydata>

(the outer markup) would be presented as an object of type Document (a subtype of Node). The children of this node would be objects representing the Table elements (and, indirectly, their contained rows and fields). Type Node provides a method getChildren() to access the children. The table delimited by


would be presented as an object of type Element (another subtype of Node) representing the Authors table. Type Element provides a method getTagName() to provide access to the actual tag name (authors in this case). The children of this node would be objects representing Row elements of type Author (and, indirectly, the contained fields). Similarly,


would be presented as another object of type Element representing the Editors table.

Each element delimited by


would be presented as an object of type Element representing a particular Author row. The children of this node would be objects representing the fields contained in the row. Elements delimited by


would similarly be presented as objects of type Element representing Editor rows.

Fields would similarly be presented as Element objects. For example, each element delimited by


would be presented as an object of type Element representing that particular field. Each of these elements would have a child node of type Text (Text is not a subtype of Element) representing the text value of the field (e.g., "Robert Roberts"). The data() method of the Text object type returns the actual string representation. In this case, this would end the nesting.
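The nesting described above can be made concrete with a short traversal. The sketch below is illustrative only: it uses Python's xml.dom.minidom (an implementation of the later DOM Recommendation, where children are reached through childNodes rather than getChildren(), and the text value through a data attribute rather than a data() method). Only the tag name authors and the value "Robert Roberts" come from the text; every other tag name and value is a hypothetical stand-in for the Section 2.1.4 document.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical stand-in for the database document; names other than
# "authors" and the value "Robert Roberts" are assumed for illustration.
SOURCE = (
    "<mydata>"
    "<authors><author><name>Robert Roberts</name></author></authors>"
    "<editors><editor><name>Ellen Editor</name></editor></editors>"
    "</mydata>"
)

def describe(node, depth=0):
    """Return one line per node, indented by its nesting depth."""
    indent = "  " * depth
    if node.nodeType == Node.ELEMENT_NODE:
        lines = [indent + "Element " + node.tagName]
    elif node.nodeType == Node.TEXT_NODE:
        lines = [indent + "Text " + repr(node.data)]
    else:
        # For the root this is the Document node, named "#document".
        lines = [indent + node.nodeName]
    for child in node.childNodes:
        lines.extend(describe(child, depth + 1))
    return lines

outline = describe(parseString(SOURCE))
print("\n".join(outline))
```

The outline shows the Document at the root, Element nodes for tables, rows, and fields, and a Text node ending each branch.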

The representation of a Web page in terms of objects makes it easy to associate code with the various subcomponents of the page. The DOM requirements also identify the need for an event model, to provide a way to schedule the execution of the code associated with particular parts of a Web page at appropriate times. This event model (not yet specified) would extend the current event capabilities provided by most Web clients. The requirements specify that:

As noted at the beginning of Section 2.3, the development of the DOM recognizes the fact that, in enhancing the data structuring capabilities of the Web, more is needed than just more complex representations. There also must be built-in (and widely-available) capabilities for processing these representations. The DOM interface (and its implementation by clients and other tools) provides a general means for applications to access and traverse these representations without having themselves to perform complex parsing. The more complex the representation can become, the more important this capability becomes (and, hence, it is particularly important if XML is the representation). DOM's support for dynamic documents (documents mutable on the client) also causes these documents to more closely resemble the state of general objects. The integration of DOM and XML will provide a powerful basis for enriched Web applications.

The DOM remains under development, and further work is required to integrate it both with other Web technology developments, and with capabilities required to provide full Web object model support. For example, SGML's DSSSL (described briefly in the XML section) defines a very general object model for SGML documents, called groves, which resembles the DOM to some extent. Groves are intended to provide a runtime object model for use while processing SGML documents. However, it is not clear to what extent DOM and grove capabilities will be integrated. Groves are extremely general (e.g., using groves it is possible to define each character in a document as a separate element), and it is not clear that the same level of generality is required for DOM. Moreover, groves define an object model for static documents. DOM, on the other hand, is designed to deal with dynamic documents, which can be modified by processing applications (via the DOM interface) at runtime. However, the XML stylesheet proposals are based to some extent on DSSSL (and hence presumably on the use of some aspects of groves). Another interesting aspect of this integration is that DSSSL defines a query language called SDQL for accessing parts of SGML documents for use in stylesheet processing. The provision of a query language (or aspects of one) for XML would provide an important base for the development of full-fledged database-like processing capabilities for Web documents represented in XML. This issue is being explored further in a companion OBJS technical report in progress.

The DOM defines its API at a generic level, i.e., at the level of components of a document metamodel. Additional work would be required to define "application level" object interfaces. For example, in the relational database example defined above, DOM provides objects of types node, element, and so on, rather than objects of type author or editor (or even objects of type table or row). Using DOM, an application could effectively create such types from the information given, but it would have to "know what to look for", and would have to traverse the various element objects to find that information. It would be desirable to have a capability for creating DOM-like, but application-oriented, APIs. This could involve using additional metadata (e.g., the DTD, or an XML-Data-like schema) to generate a default API automatically (which the document's author could then customize). It might then be possible to attach specific methods to this API to define application-specific object behavior. An integration of DOM and the embedded OBJECT elements described below would be one way to support this. This would effectively permit the creation of objects in the classic object-oriented programming sense.
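One way to picture such an application-oriented layer is sketched below. It is not a mechanism defined by the DOM work: in place of a DTD or an XML-Data-like schema, it simply infers the field names from the first row it finds, and the helper names (element_children, text_of, make_row_class, load_rows) as well as the tag names and the city value are hypothetical. Python's xml.dom.minidom stands in for a DOM implementation.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical document; tag names and the "city" field are assumed.
SOURCE = (
    "<mydata><authors>"
    "<author><name>Robert Roberts</name><city>Boston</city></author>"
    "</authors></mydata>"
)

def element_children(element):
    """Return only the Element children, skipping Text and other nodes."""
    return [c for c in element.childNodes if c.nodeType == Node.ELEMENT_NODE]

def text_of(element):
    """Concatenate the Text children of a field element."""
    return "".join(
        c.data for c in element.childNodes if c.nodeType == Node.TEXT_NODE
    )

def make_row_class(row_tag, field_tags):
    """Build a class named after the row tag, with one attribute per field."""
    def __init__(self, element):
        for field in element_children(element):
            setattr(self, field.tagName, text_of(field))
    members = {"__init__": __init__, "fields": tuple(field_tags)}
    return type(row_tag.capitalize(), (object,), members)

def load_rows(document, row_tag):
    """Wrap every row element with the given tag in a generated class."""
    rows = document.getElementsByTagName(row_tag)
    if not rows:
        return []
    # Stand-in for schema metadata: field names come from the first row.
    field_tags = [f.tagName for f in element_children(rows[0])]
    row_class = make_row_class(row_tag, field_tags)
    return [row_class(row) for row in rows]

authors = load_rows(parseString(SOURCE), "author")
print(type(authors[0]).__name__, authors[0].name, authors[0].city)
```

The application then works with Author objects and named fields, while the traversal of generic Element and Text nodes is confined to the generated layer. Methods could be added to the generated class in the same way as the fields attribute.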

The DOM work also needs to be integrated with the work on higher-level models described in Section 2.2. One effect of this would be to provide a way to add object behavior to documents without the need for references to the associated programs to be embedded in the page, as with OBJECT elements. These models might also provide additional support for generating application-specific object APIs.

2.3.2 Embedded Objects

Web clients generally contain mechanisms for rendering common data types such as text, GIF images, colors, fonts, and some graphic elements. To render data types that do not have built-in support, clients generally run external applications (plug-ins or helpers). In addition, Web clients currently support mechanisms for including specialized types of "objects" in the rendering process that are not physically located in the document, e.g.:

The recently-adopted HTML 4.0 Specification defines an OBJECT element (and an associated <OBJECT> tag) which subsumes these specialized tags (the <OBJECT> tag is already supported in some Web clients). In general, its purpose is to define an inserted rendering mechanism, in order to allow authors to control whether included objects are handled by Web clients internally or externally.

In the most general case, an inserted rendering mechanism specifies three types of information (although in specific cases not all this information may need to be explicitly specified):

(Not surprisingly, this is a variant of the information needed for an object invocation in an object-oriented programming language).

In HTML 4.0, the OBJECT element specifies the location of a rendering mechanism and the location of data required by the rendering mechanism. This information is specified by the attributes of the OBJECT element. The PARAM element specifies a set of run-time values.

A client interprets an OBJECT element by first trying to render the mechanism specified by the element's attribute. If this cannot be done for some reason (e.g., the client is configured not to, or the client platform cannot support that mechanism), the client must try to render the element's contents. This provides a way to specify alternate object renderings, since the contents of an OBJECT element can be another OBJECT element specifying an alternative mechanism. The contents of the most deeply embedded element should be text. Data to be rendered can be supplied either inline, or from an external resource. An HTML document can be included in another document by using an OBJECT element with the data attribute specifying the file to be included.
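The fallback rule can be sketched in a few lines. This is a simplified model for illustration, not the HTML 4.0 processing algorithm: the ObjectElement class, the render function, and the content types used are all hypothetical, and each element carries just one mechanism identifier and either a nested alternative or plain text.

```python
from dataclasses import dataclass
from typing import Set, Union

@dataclass
class ObjectElement:
    """Simplified stand-in for an OBJECT element (not the full HTML model)."""
    mechanism: str                         # hypothetical mechanism identifier
    content: Union["ObjectElement", str]   # nested alternative, or plain text

def render(element, supported: Set[str]) -> str:
    """Describe what a client supporting the given mechanisms would render."""
    while isinstance(element, ObjectElement):
        if element.mechanism in supported:
            # The client can handle this mechanism, so it stops here.
            return "mechanism:" + element.mechanism
        # Otherwise the client falls back to the element's contents.
        element = element.content
    # The most deeply embedded contents are text.
    return "text:" + element

page = ObjectElement(
    "application/x-applet",
    ObjectElement(
        "video/mpeg",
        ObjectElement("image/gif", "A welcoming picture."),
    ),
)

print(render(page, {"image/gif"}))
print(render(page, set()))
```

A client that supports only the innermost mechanism renders that alternative, and a client that supports none of them ends up displaying the text.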

The following simple Java applet:

<APPLET code="AudioItem" width="15" height="15">
<PARAM name="snd" value="|">
Java applet that plays a welcoming sound.
</APPLET>

may be rewritten as follows using OBJECT:

<OBJECT codetype="application/octet-stream"
        code="AudioItem"
        width="15" height="15">
<PARAM name="snd" value="|">
Java applet that plays a welcoming sound.
</OBJECT>

The OBJECT element includes, among others, the following attributes:

The HTML OBJECT element illustrates an example of a capability for Web clients to automatically invoke behavior associated with a document when the behavior is encountered. The approach to a Web object model described in Section 3 must both generalize this capability, and integrate it with the XML, RDF, and DOM technologies described earlier. In particular, the OBJECT element only deals with references to external code that have been embedded in the document (i.e., the relationship between the code and the document is represented physically in the document). A generalization of this capability (and an integration of it with PICS/RDF metadata access concepts) would allow relationships between code and documents to be specified separately from the code and documents that are interrelated (just as PICS content ratings may be specified separately from the content they rate), and accessed automatically during the processing of the document. This would permit a more flexible integration of data and code to form Web objects.

The OBJECT element is also the basis of current capabilities that link Web pages into CORBA distributed object architectures. This is done by using Java applets (referenced from OBJECT elements on Web pages) which define CORBA objects, and can interact with other CORBA objects (not necessarily written in Java) via CORBA's Internet Inter-ORB Protocol (IIOP), using an ORB contained in the Web client (Netscape Communicator supports such an ORB). This is an important capability in merging Web and object technologies, particularly the object service capabilities provided by CORBA architectures. Combining this capability with the facilities of our Web object model would provide a deeper integration of Web and object technology, and an improved ability to apply object services to Web resources. This is discussed further in Section 3.

2.3.3 Web Interface Definition Language

The Web Interface Definition Language (WIDL) is commercial technology from webMethods, Inc. (information on WIDL is made available at W3C's Web site as a service by W3C, but WIDL is not W3C technology; WIDL is also described in [KR97]). WIDL is an application of XML which allows interactions with Web servers to be defined as functional interfaces. These interfaces can be accessed by remote systems using standard Web protocols, and provide the structure necessary for generating client code in languages such as Java, C/C++, COBOL, and Visual Basic.

A central feature of WIDL is that programmatic interfaces can be defined and managed for Web resources such as:

These resources need not be under the direct control of programs that require such access. WIDL definitions can be co-located with client programs, centrally managed in a client/server architecture, or referenced directly from HTML/XML documents.

WIDL definitions provide a mapping between such Web resources and applications written in conventional programming languages such as C/C++, COBOL, Visual Basic, Java, JavaScript, etc., enabling automatic and structured Web access by compatible client programs, including mainstream business applications, desktop applications, applets, Web agents, and server-side Web programs (CGI, etc.). Using WIDL, programs can request Web data and services by making local calls to functions which encapsulate standard Web access protocols and utilize WIDL definitions to provide naming services, change management, error handling, condition processing and intelligent data binding. A browser is not required to drive Web applications. WIDL requires only that target systems be Web-enabled (there are numerous commercial products which allow existing systems to be Web-enabled).

A service defined by WIDL is equivalent to a function call in standard programming languages. At the highest level, WIDL files describe the locations (URLs) of services, input parameters to be submitted (via Get or Post methods) to each service, conditions for successful processing, and output parameters to be returned by each service. In much the same way that DCE or CORBA IDL is used to generate code fragments, or 'stubs', to be included in application development projects, WIDL provides the structure necessary for generating client code in languages such as C/C++, Java, COBOL, and Visual Basic.

Many of the features of WIDL require a capability to reliably identify and extract specific data elements from Web documents. Various mechanisms for accessing elements of HTML and/or XML documents have been defined, such as the JavaScript Page Object Model, the Document Object Model, and XML-Link. The following capabilities are desirable for accessing elements of Web documents:

Object referencing mechanisms would ideally support both parsing and pattern matching. Pattern matching extracts data based on regular expressions, and is well suited to raw text files and poorly constructed HTML documents. Parsing, on the other hand, recovers document structure and exposes relationships between document objects, enabling elements of a document to be accessed with an object model. WIDL does not define or determine a mechanism for accessing document data, but rather allows an object model referencing mechanism to be specified on a per-interface basis.

The following example (from the cited reference) illustrates the use of WIDL to define a package tracking service for generic Shipping. By allowing a WIDL definition to reference a 'Template' WIDL definition, a general class of shipping services can be defined. 'FoobarShipping' is one implementation of the 'Shipping' interface.

<WIDL NAME="genericShipping" TEMPLATE="Shipping"
      BASEURL="" VERSION="2.0">
<SERVICE NAME="TrackPackage" METHOD="Get"
         URL="/cgi-bin/track_package"
         INPUT="TrackInput" OUTPUT="TrackOutput" />
<BINDING NAME="TrackInput" TYPE="INPUT">
   <VARIABLE NAME="TrackingNum" TYPE="String" FORMNAME="trk_num" />
   <VARIABLE NAME="DestCountry" TYPE="String" FORMNAME="dest_cntry" />
   <VARIABLE NAME="ShipDate" TYPE="String" FORMNAME="ship_date" />
</BINDING>
<BINDING NAME="TrackOutput" TYPE="OUTPUT">
   <CONDITION TYPE="Failure" REFERENCE="doc.title[0].text"
              MATCH="Warning Form" REASONREF="doc.p[0].text" />
   <CONDITION TYPE="Success" REFERENCE="doc.title[0].text"
              MATCH="Foobar Airbill:*" REASONREF="doc.p[1].value" />
   <VARIABLE NAME="disposition" TYPE="String" REFERENCE="doc.h[3].value" />
   <VARIABLE NAME="deliveredOn" TYPE="String" REFERENCE="doc.h[5].value" />
   <VARIABLE NAME="deliveredTo" TYPE="String" REFERENCE="doc.h[7].value" />
</BINDING>
</WIDL>

In this example, the values defined in the 'TrackInput' binding get passed via HTTP Get as name-value pairs to a service residing at ''. Object References are used in the 'TrackOutput' binding to a) check for successful completion of the service, and b) extract data elements from the document returned by the HTTP request.

'Input' and 'Output' bindings specify the input and output variables of a particular service. Input bindings define the name-value pairs to be passed via Get or Post methods to a Web-based application. Output bindings use object references to identify and extract data elements from documents returned by HTTP requests.

Conditions define 'success' and 'failure' states for output bindings, and determine whether a binding attempt should be retried in the case of a 'server busy' error. Conditions can apply to a binding as a whole, or to a specific object reference. Conditions can define error messages to be returned as the value of the service; error messages can be a literal, or can be extracted from the returned document.
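To make the roles of bindings and conditions more concrete, the sketch below mimics, in a very reduced form, what generated client code for the TrackPackage service might do. It is not WIDL tooling and makes several assumptions that are not in the cited definition: the base URL http://shipping.example and the sample input values are hypothetical, the MATCH patterns are treated as shell-style wildcards, and the document title is passed in directly instead of being extracted through an object reference such as doc.title[0].text. No network request is made.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlencode

# Input binding from the example: variable name -> form field name.
TRACK_INPUT = {
    "TrackingNum": "trk_num",
    "DestCountry": "dest_cntry",
    "ShipDate": "ship_date",
}

# Conditions from the example, reduced to (type, pattern for the title).
CONDITIONS = [
    ("Failure", "Warning Form"),
    ("Success", "Foobar Airbill:*"),
]

def build_request(base_url, service_url, binding, values):
    """Map variable names to form names and build a Get query string."""
    pairs = [(binding[name], value) for name, value in values.items()]
    return base_url + service_url + "?" + urlencode(pairs)

def evaluate(title):
    """Return the type of the first condition whose pattern matches."""
    for kind, pattern in CONDITIONS:
        # Assumption: "*" in MATCH behaves like a shell-style wildcard.
        if fnmatchcase(title, pattern):
            return kind
    return "Unknown"

url = build_request(
    "http://shipping.example",       # hypothetical; BASEURL is empty above
    "/cgi-bin/track_package",
    TRACK_INPUT,
    {"TrackingNum": "12345", "DestCountry": "US", "ShipDate": "1998-02-10"},
)
print(url)
print(evaluate("Foobar Airbill: 12345"))
print(evaluate("Warning Form"))
```

The caller sees an ordinary function call with named parameters; the form field names, the URL, and the success test all come from the interface definition rather than from application code.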

WIDL is another example of technology that provides an API (an object interface) to state. In addition, it supports the definition of similar interfaces to Web-based services. Facilities for defining such interfaces are helpful tools in integrating Web-based state and behavior.

2.4 Related OMG Technologies

Section 1 briefly described OMG's activities in developing an infrastructure for distributed object computing. Section 1 also noted the resemblance of the Web to a simple distributed object system. Given that commonality, practically any of OMG's work could be considered "relevant" to the creation of a Web Object Model. Information on the wide range of OMG's activities is available at the OMG Web site. This activity includes both platform-related work on infrastructure components, and work related to specific vertical industry application domains. While much of this OMG activity is proceeding independently of Internet-related activities, one OMG activity which is directly addressing the integration of Internet and distributed object technology is OMG's Internet Special Interest Group.

While a complete description of OMG activities is outside the scope of this report, several OMG technologies address structured data representation capabilities similar to others described in Section 2, and hence are of direct interest here. Specifically, the OMG has been considering a Tagged Data Facility, and a Mediated Exchange Facility based on it, as part of its Common Facilities Architecture. The Tagged Data Facility involves the use of tagged data items to support semantics-based information exchange between applications, and also supports nesting and the ability to locate objects via tags through layers of nesting. The Mediated Exchange Facility is built on the Tagged Data Facility by adding mediator components and related services. Several submissions to OMG's Business Object Facility RFP describe such capabilities. In addition, the already-approved OMG Property Service provides similar capabilities. These OMG technologies are of interest in showing that there is a recognized need for tagged "data" representations to pass semantically-rich data structures between clients and servers within OMG's distributed object architecture, just as the representations described in Section 2.1 illustrated the need to do the same thing in the Web. However, there is not yet any coordination between these two communities in developing these facilities.

2.4.1 OMG Property Service

The OMG Property Service defines PropertySet objects that act as containers for sets of properties (name/value pairs). Each property has a different name. All property values are defined (and represented) as type any. PropertySet objects provide operations for finding the value of a property given its name, adding and deleting properties, modifying the value of an existing property, and determining whether the object has a property with a given name. PropertySet objects are intended to be a dynamic equivalent of CORBA attributes. When an application finds it necessary to add an attribute to an object, and cannot do so by using the IDL interface of the object (either using an existing attribute, or modifying the interface to add a new one), it can create a PropertySet object with the necessary attribute(s) and associate it with the object. A given object may have zero or more PropertySet objects associated with it. The Property Service does not define how this association is established. It could be done, for example:

PropertySet objects do not have "schemas" as such; that is, there is no declaration that restricts a PropertySet to only contain properties with specific names. Nor is there a declaration that specifies that a property with a given name must only have values of a specific type. As a result, in the general case a property with any name/value combination can be contained in a given PropertySet (and there is no guarantee that a given name won't be used inconsistently by multiple applications in different PropertySets the application might define). However, such constraints can be (at least partially) defined operationally through the PropertySetFactory object used to create PropertySet objects (by implementing the appropriate PropertySetFactories to enforce the required constraints).
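The following sketch shows, in plain Python rather than CORBA IDL, how a property container and a constraint-enforcing factory of this kind might behave. It is an illustration under stated assumptions, not the Property Service interface: the method names are only loosely modeled on the operations listed above, the allowed_types map is a hypothetical way of expressing constraints, and no ORB is involved.

```python
class PropertySet:
    """Container of uniquely named properties with values of any type."""

    def __init__(self, allowed_types=None):
        self._properties = {}
        # Optional name -> type map; None means "no constraints".
        self._allowed_types = allowed_types

    def define_property(self, name, value):
        """Add a property, or modify the value of an existing one."""
        if self._allowed_types is not None:
            expected = self._allowed_types.get(name)
            if expected is None:
                raise KeyError("property name not allowed: " + name)
            if not isinstance(value, expected):
                raise TypeError("wrong value type for property: " + name)
        self._properties[name] = value

    def get_property_value(self, name):
        """Find the value of a property given its name."""
        return self._properties[name]

    def delete_property(self, name):
        del self._properties[name]

    def is_property_defined(self, name):
        return name in self._properties


class PropertySetFactory:
    """Creates PropertySet objects that all enforce the same constraints."""

    def __init__(self, allowed_types=None):
        self._allowed_types = allowed_types

    def create_propertyset(self):
        return PropertySet(self._allowed_types)


# An unconstrained set accepts any name/value combination.
unconstrained = PropertySetFactory().create_propertyset()
unconstrained.define_property("reviewed", True)

# A factory-imposed constraint: "priority" must be an integer.
constrained = PropertySetFactory({"priority": int}).create_propertyset()
constrained.define_property("priority", 3)
print(constrained.get_property_value("priority"))
```

The constraint lives in the factory and the objects it creates, not in any declared schema, which is the sense in which such constraints are defined operationally.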

The OMG Property Service essentially provides a simple, dynamic, object-oriented interface to relatively unstructured property/value pairs. Object models (including OMG's) are generally static, in that they require an object class to have a fixed number of attributes and methods. The OMG Property Service addresses this restriction, and thus adds value to the object model. It does not specify an actual representation (this would presumably be specified using object externalization capabilities currently being developed by OMG), it is not as rich as XML, nor does it provide the higher-level modeling capabilities such as those described in Section 2.2. However, in some respects it resembles a very simple DOM, in that it does provide an object interface to an (unspecified) representation.

2.4.2 Tagged Data Facility

The OMG has been considering release of an RFP (Request for Proposal) for a Tagged Data Facility (TDF). The TDF is intended to provide a facility for defining semantically-tagged objects that can be passed as parameters between ordinary CORBA objects. In particular, the TDF is intended to:

A tagged data object is intended to be an object; unlike a PropertySet object, its interface is not intended to be part of another object. Moreover, TDF objects are not intended to be "network-visible" objects. They are intended to be passed by value when used as information exchange between CORBA objects.

The TDF requirements seem to fit the basic structural capabilities of OEM and MCF to some extent (the draft TDF RFP explicitly references OEM), in the sense that they seem to call for the ability to construct complex graph structures of relatively simple labeled nodes. However, MCF in particular goes much further than TDF in defining the basis of a rather complete object model (which is unnecessary in TDF since TDF objects are already CORBA objects). TDF also specifies some metadata-related requirements, such as dealing with namespace issues and synonyms. However, like the Property Service, TDF is not well-integrated with related Web developments. Of course, as an RFP, the TDF leaves a great deal of detail, both of technology and usage scenarios, to be supplied by specific technology proposals submitted in response. As a result, it may be possible that some technology integrating OMG and Web technology, e.g., combining XML and DOM, could be adopted in response to the TDF RFP, once it is issued.

3. Building a Web Object Model

Section 2 has described a number of the key technologies that address issues in creating a Web object model. In this section, we describe a general approach to integrating these technologies to support a Web object model. Specifically, the key component technologies we propose to integrate are:

In supporting an object model, XML pages (like HTML pages) can also be used as containers for embedded objects and object methods (e.g., Java applets).

In addition to using these emerging Web technologies, we also take advantage of other existing aspects of the Web, e.g.:

3.1 Integration Approach

The idea behind integrating these technologies to form a Web object model is that an "object" in a conventional object model is basically a piece of state with some attached (or associated) programs (methods). In many object model implementations, this idea is exactly reflected in the physical structure of the objects. For example, a Smalltalk object consists of a set of state variables (data), together with a pointer (link) to a class object which contains the object's methods. The structure is roughly:

    Object (state)                Class object
  +---------------+              +-------------+
  | class pointer |------------->| Class data  |
  +---------------+              +-------------+
  | variable 1    |              | method 1    |
  | variable 2    |              | method 2    |
  |   ...         |              |   ...       |
  | variable n    |              | method m    |
  +---------------+              +-------------+

C++ implementations use similar structures. The state is a collection of programming language variables, which (usually) are not visible to anything but the methods (this is referred to as encapsulation). A typical object model has a tight coupling between the methods and state. All the structures (class objects, internal representation of methods and state, etc.) are determined by the programming language implementation, and are created together as necessary. The class (in particular, the methods it defines) defines the way the state should (and will) be interpreted within the system, and hence is a form of metadata for the state. As a result, the link between an object and its class is essentially a metadata link.
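The state-plus-class-pointer structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ClassObject`, `Instance`, `send`, and the `deposit` method are hypothetical and do not come from Smalltalk or from any system discussed in this paper.

```python
class ClassObject:
    """Holds the methods shared by all instances (the 'class object')."""

    def __init__(self, name, methods):
        self.name = name
        self.methods = dict(methods)  # method name -> function(state, *args)


class Instance:
    """Holds only state variables plus a pointer to its class object."""

    def __init__(self, class_pointer, **variables):
        self.class_pointer = class_pointer
        self.variables = dict(variables)

    def send(self, selector, *args):
        # Follow the class pointer to find the method, then run it on the state.
        method = self.class_pointer.methods[selector]
        return method(self.variables, *args)


def deposit(state, amount):
    # A hypothetical method: it interprets the state as an account balance.
    state["balance"] += amount
    return state["balance"]


account_class = ClassObject("Account", {"deposit": deposit})
account = Instance(account_class, balance=10)
result = account.send("deposit", 5)
```

The class object here plays the role of metadata for the instance: without it, `balance` is just a labeled value with no defined interpretation.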

Extended to the Web environment, the idea is that Web pages can be considered as state, and objects can be constructed by enhancing those pages with additional metadata that allows the pages to be considered as objects in some object model. In particular, we want to enhance Web pages with metadata consisting of programs that act as object methods with respect to the "state" represented by the Web page. The resulting structure would, at a minimum, conceptually be something like:

                       +----------+
           +---------->| method 1 |
+-------+  |           +----------+
|  Web  |--+              ...
|  page |--+
+-------+  |           +----------+
           +---------->| method n |
                       +----------+

The NCITS Object Model Features Matrix [Man97] identifies many different object models, with widely differing characteristics. Different object models could also be defined for the Web. The details of the structures to be supported in a Web object model depend on the details of the object model we choose to define. For example, many object models are class-based, such as the Smalltalk and C++ models mentioned above. Choosing a class-based model for the Web would require defining separate class objects to define the various classes. Other object models are prototype-based, and do not require a class object (each object essentially defines itself). Either of these forms (plus others) could be supported by the basic mechanism we propose.

In a Web object model, some of the tight coupling that exists in programming language object models would probably be relaxed, and the connection between the state and code would be somewhat "looser". This would allow more flexibility in defining associations between programs and Web pages in the model. For example, unless special constraints prohibited such access, a user would probably be able to directly access the state (and manipulate it as well) using standard Web document viewing and creation tools, without necessarily using any associated methods (just as users today can often usefully access pages containing Java applets even when Java is inactive or unsupported on their browsers). In these cases, encapsulation would be relaxed and access to any methods related to the state would be optional.
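A minimal Python sketch of this looser coupling follows. It assumes a hypothetical `WebObject` record whose state is the page text and whose methods are attached separately; the URL, the page content, and the `word_count` method are invented for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class WebObject:
    """A Web page treated as state, with optionally attached methods."""

    url: str
    page: str  # the state: readable directly, with or without methods
    methods: Dict[str, Callable[[str], object]] = field(default_factory=dict)

    def attach(self, name, function):
        # Methods are associated with the page after the fact, not built in.
        self.methods[name] = function

    def invoke(self, name):
        # Each method receives the page (the state) as its argument.
        return self.methods[name](self.page)


def word_count(page):
    # Hypothetical method: counts whitespace-separated tokens in the page.
    return len(page.split())


obj = WebObject("http://example.org/report.xml",
                "<report>quarterly sales rose</report>")
obj.attach("word_count", word_count)
```

Because `obj.page` remains an ordinary string, a client that ignores the attached methods can still read or edit the state directly, which is the relaxed encapsulation described above.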

Constructing these object model structures requires a number of "pieces" of technology, as we have already observed several times. These pieces are:

Code resources are already being stored on the Web, e.g., in program libraries supporting reuse, and it is already possible to create relationships (links) between Web documents and such resources. However, in using code resources to create objects, it is necessary to not only define the links between the code and its associated state, but also to reflect the special semantics associated with these links. These semantics somewhat resemble those of metadata such as PICS content labels, in the sense that instead of the user explicitly following the links to retrieve the associated "metadata", some of the "metadata" is automatically retrieved during access to the original resource, in order to support some special processing. This processing involves a form of what is variously called a metalevel, reflective, or intermediary architecture, in the sense that the processing requires that ordinary requests for data on the Web be interrupted or intercepted, so that the necessary special processing can be performed. In the case of content labels, the special processing involves checking the content labels against user-specified requirements in order to determine whether to allow access to the original resource. In the case of object methods, the special processing involves accessing the code, and invoking that code, in order to perform some operation.

In the approach we propose, relationships between the state and the methods will be defined in either of two ways:

In order to define relationships between Web pages and methods without these relationships being explicitly contained in the Web pages, it is necessary to have a way to determine the existence of these relationships at runtime, so that the client can download those methods, and invoke them to provide object behavior. PICS provides a mechanism for doing this. PICS defines metadata (content labels) that need not be embedded in the page described by that metadata. In PICS, the client specifies the sources and types of content labels it wants to use to evaluate the Web pages it accesses. Whenever an attempt is made to access a page, content labels from those sources are implicitly accessed (either from the site supplying the page, or from a separate rating service), and evaluated to determine whether access to the page should be allowed. It seems likely that RDF will define a similar (but possibly more general) mechanism for transparently accessing metadata about a given page when the page is accessed, and providing that metadata to the Web client. This mechanism would provide the basis of our metadata access mechanism as well. (If such a mechanism is not defined in RDF, we would define one as an extension. This would probably be relatively straightforward, given the existence of the PICS mechanism already mentioned.) In our case, however, the metadata will contain methods that can operate on the data in the page, and perform various functions based on that data.
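The access pattern described above (the client names its metadata sources in advance, and those sources are consulted implicitly whenever a page is accessed) can be sketched as follows. This is not the PICS or RDF protocol: `MethodSource`, `Client`, the in-memory page table standing in for HTTP, and the `as_float` method are all hypothetical names chosen for the illustration.

```python
class MethodSource:
    """Hypothetical stand-in for a separate metadata service.

    It maps page URLs to tables of methods, much as a PICS label
    bureau maps page URLs to content labels.
    """

    def __init__(self, table):
        self._table = table

    def lookup(self, url):
        return self._table.get(url, {})


class Client:
    """A client that intercepts page access to collect associated methods."""

    def __init__(self, pages, sources):
        self._pages = pages      # url -> page text; stands in for HTTP access
        self._sources = sources  # sources chosen by the client in advance

    def access(self, url):
        page = self._pages[url]
        methods = {}
        # The sources are consulted implicitly; the user never follows a link.
        for source in self._sources:
            methods.update(source.lookup(url))
        return page, methods


URL = "http://example.org/price.xml"
pages = {URL: "<price>26.95</price>"}
# Strip the 7-character opening tag and 8-character closing tag.
bureau = MethodSource({URL: {"as_float": lambda page: float(page[7:-8])}})

client = Client(pages, [bureau])
page, methods = client.access(URL)
value = methods["as_float"](page)
```

Note that the page itself contains no reference to `as_float`; the association exists only in the separately held metadata.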

The two mechanisms identified above (embedded OBJECT elements and RDF resources associated with the page) potentially provide a way to access the methods when the state is accessed. In addition, a mechanism is required to invoke the code as it is needed. The OBJECT element already provides such a mechanism which can be used in some cases (for example, this is used to invoke Java applets embedded in pages). A more general mechanism would be necessary for methods defined in RDF resources. There may be a way to do this provided within a general RDF-supported metadata access mechanism (this is currently not clear, since RDF is still under development). Alternatively, it may be necessary to define this as an extension. Again, this would probably be relatively straightforward.

Many details of this technology integration must still be worked out (partially because some of the key technologies we have identified are still under development). Nevertheless, we feel that the capabilities inherent in these technologies provide the necessary support for the object model integration we propose.

3.2 Discussion

A number of projects have investigated developing object capabilities for the Web, e.g., the Harvest Object System [CHHM+94], W3Objects [ILCS95], and ANSAWeb [REMB+95]. A thorough review of such projects has been undertaken, and the descriptions of W3Objects and ANSAWeb below are taken from a forthcoming technical report "Web + Object Integration", by Gil Hansen (OBJS), resulting from that review.

The Harvest Object System (HOS) [CHHM+94] modified the Mosaic browser to include a Harvest Object Broker, allowing users to interact with remote objects via a special Harvest Object Protocol (HOP). HOS defines objects from existing files and programs by recording metadata roughly of the form:

     user-defined type name
          URL --> file data
          URL --> method (program)
          URL --> method
          URL --> method
          ...
          URL --> method

using SOIF to hold that metadata. The HOP is used for retrieving IDL information, moving object code and data, and invoking objects. A command such as GETOBJS hop://URL/some.obj (where URL/some.obj designates a file) returns the object data for some.obj along with its metadata, including a set of methods.

ANSAWeb provides a strategy for interoperability between the Web and CORBA using HTTP-IIOP gateways -- the I2H gateway converts IIOP requests to HTTP, and H2I converts HTTP requests to IIOP. The H2I gateway allows WWW clients to access CORBA services; the I2H gateway allows CORBA clients to access Web resources. The pair of gateways together behave like an HTTP proxy to the client and server. A CORBA IDL mapping of HTTP represents HTTP operations as methods and headers as parameters. An IDL compiler generates client stubs and server skeletons for the gateways. H2I is both a gateway to IIOP and a full HTTP proxy so a client can access resources from a server that does not have an I2H gateway. A locator service decides when to use IIOP or HTTP. If the locator can find an interface reference to an I2H server-side gateway, IIOP is used; otherwise, the H2I gateway passes the request via HTTP.
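The locator's decision rule can be written down directly. The sketch below is not ANSAWeb code: the function name `choose_transport`, the dictionary of known gateway references, and the host names are assumptions made purely to illustrate the rule stated above.

```python
def choose_transport(server_host, i2h_references):
    """Pick IIOP when an I2H gateway reference is known, otherwise HTTP.

    i2h_references is a hypothetical table mapping a server host to the
    interface reference of its server-side I2H gateway.
    """
    reference = i2h_references.get(server_host)
    if reference is not None:
        # A server-side gateway exists, so the request can travel over IIOP.
        return ("IIOP", reference)
    # No gateway known: the H2I gateway acts as a plain HTTP proxy.
    return ("HTTP", server_host)


known_gateways = {"objects.example.org": "IOR:hypothetical-reference"}
with_gateway = choose_transport("objects.example.org", known_gateways)
without_gateway = choose_transport("plain.example.org", known_gateways)
```

The fallback branch is what lets a client reach servers that have no I2H gateway at all.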

The W3Objects project at the University of Newcastle upon Tyne provides facilities for transforming standard Web resources (HTML documents, GIF images, PostScript files, audio files, and the like) from file-based resources into objects called W3Objects, i.e., encapsulated resources possessing internal state and well-defined behaviors. The motivating notion is that the current Web can be viewed as an object-based system with a single class of object -- all objects are accessed via an HTTP daemon. W3Objects are responsible for managing their own security, persistence, and concurrency control. These common capabilities are made available to derived application classes from system base classes. A W3Objects server supports multiple protocols by which client objects can access server objects. When using HTTP, the URL binds to the server object and the permitted object operations are defined by the HTTP protocol. Alternatively, the RPC protocol can be used to pass operation invocations to a client-stub generated from a description of the server object interface. W3Objects uses C++ as the interface definition language, although CORBA IDL and ILU ISL can be used. W3Objects can also be accessed through a gateway, implemented as a plug-in module for an extensible Web server, such as Apache. URLs beginning with /w3o/ are passed by the server to the gateway; the remainder of the URL identifies the requested service and its parameters. Using a Name Server, the appropriate HTTP method is invoked on the requested service.

These projects have identified a number of important ideas in supporting objects on the Web (in particular, objects constructed in the HOS resemble in many respects those that would be constructed using the approach described in Section 3.1). However, they based their attempts to develop object capabilities for the Web on the existing Web infrastructure. As a result, they had to use a number of non-standard Web extensions (e.g., special protocols referenced in URLs to trigger the loading of object methods), which limit their widespread usability. Dependence on the existing Web infrastructure also limits the ability of the resulting objects to support more complex Web applications. Our work, on the other hand, is based on what will likely be the next-generation Web infrastructure. This infrastructure is still evolving, and hence some extensions to it may yet be necessary. However, based on our analysis, these new Web technologies seem likely to provide a much better basis for providing powerful Web object facilities that are at the same time based on standard (hence, widely accessible) Web protocols and components.

An approach similar to that provided by ANSAWeb is becoming increasingly popular, and is potentially very powerful. This involves placing Java applets on Web pages (using the APPLET or OBJECT elements in HTML). Once on the Web client, these objects then communicate with other objects on remote servers using various protocols. A particularly important variant of this approach is to use it to combine Java and CORBA. In this variant, Java applets downloaded to the client communicate with other CORBA objects over the Internet via CORBA's IIOP (Internet Inter-ORB Protocol), which is supported by all CORBA Object Request Brokers. This approach is, for example, supported by Netscape Communicator, which includes Visigenic's Java ORB. Using this approach, the advantages of CORBA's object services are potentially available to Internet objects. This also allows non-Java objects to be integrated into the Internet, since CORBA objects can be written in many languages. Java has also been the basis of proposals to improve Web capabilities by representing more and more Web content directly as Java objects, using the existing Web largely as a transport mechanism for these objects.

Such approaches provide important new mechanisms for supporting more powerful Web capabilities, and integrating enterprise distributed object systems (which are likely to be CORBA-based) with the Internet. However, these approaches suffer from a number of disadvantages when used by themselves, e.g.:

What we are proposing is a general way to merge objects and the Web. Our approach subsumes these Java-based approaches, since all these mechanisms for integrating Java (and CORBA) objects with Web pages are still available. However, our approach goes beyond these approaches in providing richer Web content that is more amenable to application processing (XML pages accessible via DOM), together with a more general way to link non-embedded methods with that Web content.

There are a number of potential ways to use the "objects" constructed using the mechanism we are proposing. One approach would be to use the methods associated with a document in the same way that Java applets are used now. The difference would be that the code would not need to be embedded in the document. (In fact, depending on the exact details of the DOM, if the methods were separately-located OBJECT elements, they could presumably be embedded dynamically in the document at the client using the DOM interface, and act just the way embedded OBJECTs would act.) A more conventional "object-like" use would be to allow the associated methods to be invoked via an enhanced DOM interface by programs acting through the client. That is, the DOM effectively implements a generic interface of a type something like XML-document (for XML documents). Application-specific subtypes of this generic type could be created which included the application-specific methods associated with the document as parts of the interfaces defined for those subtypes. Programs acting through the client could then invoke these methods through the new interfaces just as they invoke the methods of other objects.

The mechanism defined here provides a form of "component-oriented" development, in that it allows the arbitrary composition of objects from data and code resources found on the Internet. Using this approach, a client could have multiple "object views" of the same base data (e.g., access the same data resources using different classes), by simply changing the collection of methods it uses when accessing the data (this would be like using different annotation sets or PICS-like labels in accessing a document).

The approach may appear somewhat "heavyweight", in the sense that it involves additional mechanism, and may involve delays in accessing the code that implements object methods. However:

In this connection, it is useful to compare the architecture that results from using this approach to that of an Object DBMS (ODBMS). In most current ODBMS client/server architectures, methods typically reside in class libraries located on the client, rather than being stored as complete objects on the server. Only object state resides on the server. When objects are needed by the client, the state is accessed from the server, moved to the client, and complete objects are created locally using the client-based class libraries. In our approach, both the methods and the state (at least conceptually) reside remotely; the client only contains references to the objects. The Web delivers the state to the client just the way an ODBMS server does, and delivers the methods as well.

So far, our work has focused on identifying new Web technologies to serve as a base, analyzing their capabilities, and developing the basic principles for integrating them. Further work needs to be done to work out the additional details required to build a prototype implementation. For example, we have already noted that there are many object models that could be supported using the principles we have identified. It will be necessary to choose a particular object model (or possibly more than one) to use for our Web object model. This, in turn, will affect the structure of the metadata that must be supported. For example, if a class-based model is chosen, additional metadata will need to be defined to support the class objects (these could be recorded as Web objects too, using RDF, possibly together with techniques from MCF or XML-Data). Further work will be necessary to determine an appropriate type of object model for use on the Web.

Additional work is also required to define the mechanism that invokes the object methods once they are returned to the client. This will depend on the details of how the RDF standard evolves. As noted at the end of Section 3.1, the general RDF-supported metadata access mechanism may provide a way to insert this method invocation mechanism. Alternatively, it may be necessary to define this as an extension to the RDF mechanism.

Finally, as noted already, the DOM currently defines its API at a generic level, i.e., at the level of components of a document metamodel. Additional work is required to define "application level" object interfaces which include interfaces to the methods associated with the objects. For example, in the relational database example described in Section 2.3.1, DOM provides objects of types node, element, and so on, rather than objects of type author or editor (or even objects of type table or row). Using DOM, an application could effectively create such interfaces from the information given, but it would have to "know what to look for", and would have to traverse the various element objects to find that information. It would be desirable to have a capability for creating DOM-like, but application-oriented, APIs. This could involve using additional metadata (e.g., the DTD, or an XML-Data-like schema) to generate a default API automatically (it might then be possible for the document's author to customize this API or, alternatively, define the API explicitly). It might then be possible to attach specific methods to this API to define application-specific object behavior. An integration of DOM and embedded OBJECT elements would be one way to support this. This would effectively permit the creation of objects in the classic object-oriented programming sense.
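One way to picture such a generated, application-oriented API is the sketch below, which uses Python's standard `xml.dom.minidom` module. The XML sample, the `make_row_class` helper, and the rule of taking the class name from a `name` attribute are all assumptions made for this illustration; they are not the Section 2.3.1 example, and no DTD or XML-Data schema is consulted here (the column names are simply read from the first row).

```python
from xml.dom.minidom import parseString

# Hypothetical XML encoding of a relational table.
DOC = """<table name="author">
  <row><name>Fred</name><city>Boston</city></row>
  <row><name>Ann</name><city>Dallas</city></row>
</table>"""


def make_row_class(table_element):
    """Build a class named after the table, with one property per column."""
    first_row = table_element.getElementsByTagName("row")[0]
    columns = [node.tagName for node in first_row.childNodes
               if node.nodeType == node.ELEMENT_NODE]

    def make_property(column):
        def getter(self):
            # Hide the generic DOM traversal behind an application-level name.
            element = self._element.getElementsByTagName(column)[0]
            return element.firstChild.data
        return property(getter)

    def init(self, element):
        self._element = element

    attributes = {"__init__": init}
    for column in columns:
        attributes[column] = make_property(column)
    class_name = table_element.getAttribute("name").capitalize()
    return type(class_name, (object,), attributes)


table = parseString(DOC).documentElement
Author = make_row_class(table)
authors = [Author(row) for row in table.getElementsByTagName("row")]
```

An application can then write `authors[0].name` instead of walking element nodes, and application-specific methods could be added to the generated class in the same way the properties are.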

3.3 Formal Principles

The approach to creating a Web object model described in the previous sections provides the basis for creating genuine objects, having both state and behavior, on the Web. This would greatly increase the structuring power of the Web, enabling it to support increasingly complex applications. However, as noted in Section 1, it is also important to have higher level object services available for these objects, such as those provided for CORBA objects in OMG's Object Management Architecture. In providing this additional support, it is important to have a formal foundation for the object model and its operations. For example, such a formal foundation is essential as a basis for defining query processing and view facilities (just as the formal foundation of the relational database model is essential for defining query processing and view facilities for relational databases). A formal foundation is also helpful as a basis for defining extensions to the model, and generally understanding its capabilities.

In this section, we describe some basic ideas behind work on a formal definition for our Web object model. The ideas are derived from work on the foundations of Web metadata concepts, work on object-oriented logics, and our own prior work on object model formalization. Many of these same ideas are currently being reflected in W3C's ongoing RDF activity.

3.3.1 Logic Basis

Section 2 described a number of different representation techniques and models for Web-related data. While these models have individual variations, in most cases these models are basically the same model: graphs, with labeled edges (although some models are based on a tree structure, they generally provide graph capabilities through the use of pointers of one form or another, usually URLs). This is essentially a model of the Web itself: Web resources, identified by URLs, which point to each other by including the URLs of related resources as hyperlinks. Papers describing these models often acknowledge their similarity to each other.

Common features of these representational models are:

There are a number of reasons for adopting this form of model to deal with Web data:

The relationship identified in the last bullet between these representational models and logic-based formalisms is very important, and is explicitly called out in a number of papers introducing or analyzing these models. The relationship is, as noted above, important in establishing a formal framework in which to understand these models, as well as in suggesting possible extensions. The relationship is also important in establishing a way to add more "intelligence", through the use of knowledge-based components such as "mediators" or "intelligent agents". Such components, for example, will need to have a formal way of interpreting the data they will be dealing with. The ability to understand Web representations in terms of logic provides a basis for applying KIF-based technologies, for example. In addition, the fact that these models can be understood in a common way (have a common semantics expressible in terms of logic) is important in providing a basis for defining translations/conversions (in terms of logic-based rules) between apparently different representations. This is similar to the use of logic-based formalisms to define translations in federated database systems (see, e.g., [FR97]).

As an example of work within the W3C addressing the relationship of logic and metadata, Describing and Linking Web Resources is an early W3C note which discusses general ideas and issues for describing and linking Web resources. It references work such as PICS, SOIF, and MCF, and notes that, though these different formats exhibit a range of syntactic variations, semantically they attempt to convey similar information. The architectural model that is common to them is the basic structure of the web: a directed graph with labeled arcs. The nodes (or points, or vertices) of the graph are URLs--anchor or resource addresses. The arcs are links. The labels are link relationships. Associated with each node is a set of attributes, or slots, or fields. Each attribute has a name and a value. Values are defined in a media-type specific manner.

The note also identifies the relationship of these attribute/value-based schemes to basic concepts in propositional logic. This allows the identification of the basic principles of the model independently of particular representations. R(S, T) can be used to denote a link from S to T with relationship R. The same notation can be used for attributes, writing N(S, V) for an attribute named N on an anchor at S with value V. For example, both the SOIF description

@FILE {""
         Author{4}: Fred
         Supersedes{30}:
}

and the HTML

<about href="">
       <meta name=author content="Fred">
       <link rel=Supersedes href="">
</about>

can be interpreted as:

Author(, "Fred")
Supersedes(,
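The point that two syntactically different descriptions reduce to the same assertions N(S, V) and R(S, T) can be sketched in Python. Assertions are modeled here as (name, subject, value) tuples. The two URLs, the function names, and the rule of capitalizing HTML meta names so that "author" matches "Author" are assumptions introduced for this illustration; the inputs are already-parsed stand-ins, not real SOIF or HTML parsers.

```python
def facts_from_soif_like(url, fields):
    """fields: attribute name -> value, as in a SOIF-style record."""
    return {(name, url, value) for name, value in fields.items()}


def facts_from_html_like(url, meta, links):
    """meta: (name, content) pairs; links: (rel, href) pairs."""
    # Assumption: attribute names are compared after capitalization.
    facts = {(name.capitalize(), url, content) for name, content in meta}
    facts |= {(rel, url, href) for rel, href in links}
    return facts


# Hypothetical URLs standing in for the described and superseded resources.
DOC = "http://example.org/new"
OLD = "http://example.org/old"

soif_facts = facts_from_soif_like(DOC, {"Author": "Fred", "Supersedes": OLD})
html_facts = facts_from_html_like(DOC,
                                  meta=[("author", "Fred")],
                                  links=[("Supersedes", OLD)])
same_meaning = soif_facts == html_facts
```

Once both descriptions are reduced to sets of assertions, comparing or merging them no longer depends on which syntax they arrived in.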

Link semantics can be modeled by observing that anything can be considered a point in the web--including people, organizations, dates, and subject categories--by giving it a URL. A link or attribute in the web can be interpreted as an assertion, given an understanding of the semantics of the link relationship or attribute name. For example, given the definitions:

the HTML or SOIF data above can be interpreted as the assertions:

A straightforward application of this approach permits the description of a set of assertions about an individual concept, identified by an identifier. Tim Berners-Lee's paper Metadata Architecture [Ber97] carries these ideas further, and this approach is being reflected in the W3C's RDF specifications.

In addition to the description of simple, flat sets of attribute/value pairs describing individual entities, it is necessary for these structural models to be able to handle more complex structures, such as trees (e.g., repeating groups) and networks (directed graphs). In defining these more complex structures, the ability to assign identifiers to both resources and individual (or groups of) attribute/value pairs is important. This allows a given (sub)structure to be assigned an identity, and then referenced from multiple places within a data structure. In actual representations, such substructures are indicated not by assigning them separate identifiers, but by some distinct representation technique (e.g., by nesting them within a larger tag). Such substructures need to be understood as being "flattened", with separate identifiers defined, in interpreting them within a logic-based framework (just as, in the relational data model, data must at least be represented in unnested "first normal form"). Techniques for factoring nested parts of a hierarchical structure into a "flat" logical form, and the need for both AND and OR logical operators, are illustrated and discussed in On Information Factoring in Dublin Metadata Records.
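The flattening step can be illustrated with a short Python sketch: each nested group is given a generated identifier, and its attribute/value pairs become separate assertions about that identifier. The `flatten` function, the `_:n1`-style identifiers, and the sample record are hypothetical; this is not the factoring technique of the cited note, only a minimal instance of the general idea.

```python
import itertools


def flatten(subject, record, facts=None, counter=None):
    """Turn a nested record into flat (name, subject, value) assertions.

    Each nested dictionary is assigned a fresh identifier so that it can
    be referred to on its own, as first normal form would require.
    """
    if facts is None:
        facts = []
    if counter is None:
        counter = itertools.count(1)
    for name, value in record.items():
        if isinstance(value, dict):
            node_id = f"_:n{next(counter)}"  # generated identifier
            facts.append((name, subject, node_id))
            flatten(node_id, value, facts, counter)
        else:
            facts.append((name, subject, value))
    return facts


nested = {
    "Title": "Report",
    "Author": {"Name": "Fred", "Email": "fred@example.org"},
}
flat = flatten("doc", nested)
```

After flattening, the author group has its own identity (`_:n1`) and could be referenced from several places, which nesting alone does not allow.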

Various specific representation techniques for metadata, such as RDF, MCF, SOIF, OEM, etc., can be understood in the context of these observations as simply involving different encodings of the basic logic-based structures. Each encoding selects specific attributes, identifiers, etc. to cluster together in specific data representations, and selects others to represent as separate entities. Also, they select some relationships to represent explicitly by using identifiers as pointers, and some to represent implicitly by grouping related constructs in the same data structure. This interpretation of attribute/value pairs (and associated structures) as logical assertions is a key element in the development of a formal basis for our Web object model, and is explicitly reflected in RDF as well.

3.3.2 Representation of Higher Level Semantics

What is metadata to one application is often data to another, and vice-versa. Hence, it is often important to be able to define metadata which describes other metadata descriptions, or parts of them. For example, it is important to be able to define the semantics of the individual attributes used in metadata descriptions, and to define the characteristics of the values that may be assigned to them (e.g., their types, their units, what they signify). Discussions of structural or "lightweight" models often refer to tagged values as "self-describing", as allowing arbitrary attribute names to be introduced, and as not requiring the use of centralized attribute or type registration. However, this is only true to a certain extent. These representations are really "self-describing" in a truly useful way only if there is a common understanding of the meaning of the attribute names (and their associated values) by accessing applications. To support general interoperability, the definitions of attribute names and types must either be actually distributed, or distributed access must be provided to them.

A number of abstract models for Web metadata describe the ability to link metadata individually to tagged items (attributes). For example, the Dublin Core describes the ability to access the definition of an individual attribute. This, for example, allows the attributes used in a particular description to be linked to an ontology that defines the attributes, and the set of concepts used in the context that the attributes are intended to describe. (A resource pointing to its ontology is similar to an object pointing to its methods, in a sense: it provides an interpretation (the methods are a "procedural specification" of the meaning/behavior appropriate to the data, while an ontology is human-readable). Work by groups such as the Stanford knowledge group is intended to merge these ideas and make the ontology readable/usable by knowledge-based software, the idea being that one could have a logic-based or other semantic specification which is declarative, and machine-interpretable.) The relationship between attribute/value pairs and formal logic described above also provides a basis for representing these additional kinds of links.

Describing and Linking Web Resources discusses how higher level information (such as beliefs), and information about the attributes or relationships themselves, can also be encoded using predicate logic. The basic approach is to assign each relationship (or attribute) its own URL (object identity), thus reifying the relationship (or attribute). Once a relationship has a URL (or other unique identifier), it can have its own metadata, by recording additional assertions about that identifier. If the relationship is identified with a URL, dereferencing the URL should access a definition of the link relationship, in either human-readable or machine-readable form. In addition, information about the association between a given attribute or assertion and a given resource can also be recorded. For example, in addition to recording an assertion like cost(o1, $26.95), information as to who made that assertion, and when, can also be recorded, e.g.:

who( (o1,cost), "fred")
when( (o1,cost), "04/07/97")

In this case, (o1,cost) acts as a new unique identifier which is the identity of the use within (or for) o1 of the attribute "cost" (this is a form of identifier construction mechanism supported by object logics, such as F-logic, described below).
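The use of (o1,cost) as a constructed identifier can be mimicked in Python by letting a tuple serve as the subject of further assertions. The `assert_fact` and `about` helpers and the set-based fact store are hypothetical; only the three assertions themselves come from the example above.

```python
facts = set()


def assert_fact(predicate, subject, value):
    # Record one assertion of the form predicate(subject, value).
    facts.add((predicate, subject, value))


def about(subject):
    """Return every (predicate, value) pair asserted about a subject."""
    return {(p, v) for (p, s, v) in facts if s == subject}


# The base assertion: cost(o1, $26.95).
assert_fact("cost", "o1", "$26.95")

# The pair (o1, cost) identifies the use of attribute "cost" for o1,
# so it can itself be the subject of further assertions.
cost_of_o1 = ("o1", "cost")
assert_fact("who", cost_of_o1, "fred")
assert_fact("when", cost_of_o1, "04/07/97")

provenance = about(cost_of_o1)
```

Because the identifier is built from its components, any party that knows o1 and the attribute name can reconstruct it without consulting a registry.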

Metadata Architecture [Ber97] observes that the URL space is an appropriate space for the definition of attribute names in the Web because it effectively provides for a federated name space, within which users can freely define attribute names without necessarily "registering" them with a central authority. However, the URLs that identify relationships or attributes need not necessarily be used locally (within a given resource). Instead, local names from a namespace defined by the resource can be used as abbreviations. However, it should always be possible to translate from a local name to the global URL that represents the actual definition of the relationship or attribute. Relationships such as the following could be defined to represent these concepts:

These ideas are being reflected in the RDF, XML, and other W3C specifications. Such reification of attributes and relationships (and also of types and methods) is also a key element in the development of a formal basis for our Web object model.
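The translation from a local attribute name to the global URL that defines it can be sketched as a simple prefix expansion. The `prefix:name` syntax, the `expand` function, and the example URLs are assumptions made for illustration; they are not taken from the RDF or XML namespace specifications.

```python
def expand(local_name, namespaces):
    """Translate a local name into the global URL of its definition.

    namespaces maps a prefix (or "" for the resource's default namespace)
    to the URL under which that namespace's names are defined.
    """
    prefix, separator, name = local_name.partition(":")
    if not separator:
        # No prefix was given: fall back to the default namespace.
        prefix, name = "", local_name
    return namespaces[prefix] + name


# Hypothetical namespace declarations made by a resource.
namespaces = {
    "": "http://example.org/local-schema/",
    "lib": "http://example.org/library-schema/",
}

author_url = expand("lib:author", namespaces)
cost_url = expand("cost", namespaces)
```

Two resources that abbreviate the same global URL differently still expand to the same definition, which is what keeps locally chosen names interoperable.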

3.3.3 Object Logics

Along with the development of object technology, a number of attempts have been made to extend logical formalisms to represent the specific characteristics of objects. A particular goal in the development of object logics has been to provide the same type of solid theoretical foundation for object-oriented database systems that the relational model provides for relational database systems. The foundation of the relational model (specifically relational calculus) is a restricted subset of conventional predicate logic. The reasoning was thus that, in order to have the same sort of theoretical foundations for object-oriented database systems, it would be necessary to have a logic analogous to predicate calculus, but one that would incorporate object concepts such as objects, classes, methods, inheritance, etc. A number of object logics have been introduced, one of the more thoroughly-developed of which is F-logic (Frame Logic) [KL89, KLW95].

A full exposition of F-logic is outside the scope of this paper (and in any case can be obtained from the cited references). However, F-logic includes a number of capabilities that are relevant to this discussion. For example, F-logic supports operations on both flat data structures (along the lines of the conventional relational model) and nested data structures (path traversal). F-logic also supports id-terms representing object identities. These are logical terms which use object constructor functions that can be interpreted as constructing object identities that are functionally dependent on their arguments. These terms are used to represent derived objects (e.g., objects to be constructed on the left-hand sides of rules), with the arguments of the function indicating the base objects from which the new objects were derived (effectively, the derived identity can be considered as the labeled tuple of the base identities). The ability to construct derived objects is crucial in describing the semantics of queries which produce new objects from existing ones (as a relational join operation does) and of views.
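The idea of a derived identity as a labeled tuple of base identities can be illustrated with a small sketch. This is not F-logic syntax; it is a Python approximation under stated assumptions, and the names `IdTerm`, `join`, `works_in`, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdTerm:
    """An object identity built by applying a constructor function to base identities."""
    functor: str  # label of the constructor function
    args: tuple   # identities of the base objects


def join(employees, departments):
    """Derive one new object per matching (employee, department) pair."""
    derived = {}
    for e_id, e in employees.items():
        for d_id, d in departments.items():
            if e["dept"] == d["name"]:
                # The derived identity depends only on the base identities.
                oid = IdTerm("works_in", (e_id, d_id))
                derived[oid] = {"employee": e["name"], "location": d["location"]}
    return derived


# Made-up sample data.
employees = {"e1": {"name": "Ann", "dept": "R&D"}}
departments = {"d1": {"name": "R&D", "location": "Boston"}}
result = join(employees, departments)
```

Running the same join twice over the same base objects yields the same derived identities, which is what allows a view or query result to be treated as a stable set of objects rather than anonymous tuples.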

Finally, F-logic introduces higher-order capabilities, in order to effectively describe inheritance, and operations on metadata (e.g., database schemas), while retaining first-order semantics. This is done, as suggested in the previous section, by reifying concepts such as predicates, functions, and atomic formulas, allowing them to be manipulated as first-class objects. This reification allows the use of higher-order syntax, while retaining first-order semantics. Under first-order semantics, predicates and functions have associated objects, called intensions, which can be manipulated directly. Depending on the context in which they appear, these intensions may assume different roles, acting as relations, functions, or propositions. For example, in F-logic, id-terms are handled as individuals when they occur as object identities, viewed as functions when they appear as object labels (attributes), and as sets when representing classes of objects. When functions or predicates are treated as objects, they are manipulated as terms through their intensions; when being applied to arguments, they are evaluated as functions or relations through their extensions.
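The distinction between manipulating an intension as a term and evaluating it through its extension can be sketched roughly as follows. This is an informal Python analogy, not F-logic itself; the class `Intension` and the names `cost`, `unit`, and `attribute` are assumptions made for illustration.

```python
class Intension:
    """A first-class object standing for a predicate, function, or class."""

    def __init__(self, name):
        self.name = name
        self.extension = {}   # argument -> value, used when applied as a function
        self.members = set()  # used when the intension is read as a class

    def __call__(self, arg):
        # Applied to an argument: evaluate through the extension.
        return self.extension[arg]


# "cost" used as an attribute (function) of object o1; 100 is a made-up value.
cost = Intension("cost")
cost.extension["o1"] = 100

# "cost" used as a term: it is itself the argument of another function.
unit = Intension("unit")
unit.extension[cost] = "USD"

# "attribute" read as a class whose members are other intensions.
attribute = Intension("attribute")
attribute.members.add(cost)

print(cost("o1"), unit(cost), cost in attribute.members)
```

Here `cost("o1")` goes through the extension, while `unit(cost)` and `cost in attribute.members` treat `cost` as an ordinary individual, so everything remains a lookup over plain objects.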

The use of F-logic concepts in helping define query language concepts for object-oriented databases is described in [KKS92], including query language support for:

In addition, the higher-order capabilities of F-logic are those needed to formally define the use of mixtures of data and metadata within the Web. For example, in dealing with an RDF description of a Web resource, in some cases we may want to treat one of the RDF properties as simply a property of the described resource. In other cases, we may want to treat the property as an object in its own right (by following its URL), with properties of its own (e.g., its definition, or the ontology it is a part of). RDF explicitly allows this, using the sort of reification we have already described. Using F-logic (or possibly a variant), we hope to provide a formal basis for describing such operations, and for the development of both our Web object model, and query languages and other services based on it.
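The two ways of treating an RDF property can be sketched with a set of (subject, property, value) statements. This is a simplified illustration rather than RDF syntax or any RDF API; the example.org URLs, the `properties_of` helper, and the sample values are hypothetical.

```python
COST = "http://example.org/schema#cost"  # hypothetical property URL

triples = {
    # The property used simply as a property of the described resource.
    ("http://example.org/doc1", COST, "100"),
    # The same URL used as a resource in its own right, with its own properties.
    (COST, "http://example.org/schema#definition", "Purchase price"),
    (COST, "http://example.org/schema#partOf", "http://example.org/schema"),
}


def properties_of(subject, statements):
    """Return {property: value} for every statement about `subject`."""
    return {p: v for s, p, v in statements if s == subject}


# Data view: what the description says about the resource.
doc = properties_of("http://example.org/doc1", triples)

# Metadata view: follow each property's URL to the statements about it.
meta = {p: properties_of(p, triples) for p in doc}
```

The same lookup function serves both views because the property is identified by a URL like any other resource; a query language over such a model would not need a separate mechanism for schema information.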

4. Conclusions

In this paper, we have:

At the moment, we have only identified an approach toward integrating these technologies. Many details of this technology integration must still be worked out (partially because some of the key technologies we have identified are still under development). Nevertheless, we feel that the capabilities inherent in these technologies provide the necessary support for the object model integration we propose.

We feel that a particularly important aspect of this work is the attempt to rely to the greatest possible extent on standards (commonly-accepted or likely-to-be-accepted Web technology) in developing our integration approach, and on working within standards-developing organizations such as W3C and OMG in further refining it and developing additional capabilities. This both takes maximum advantage of existing work, and improves the chances that the technology that is developed will become widely available (albeit possibly in some modified form) in commercial software products.

Further work on this project will include:


[AQMW+96] S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. L. Wiener, "The Lorel Query Language for Semistructured Data". See also the other papers available at the Stanford DB group Publications page.

[BBBC+97] R. Bayardo, Jr., W. Bohrer, R. Brice, A. Cichocki, J. Fowler, A. Helal, V. Kashyap, T. Ksiezyk, G. Martin, M. Nodine, M. Rashid, M. Rusinkiewicz, R. Shea, C. Unnikrishnan, A. Unruh, and D. Woelk, "InfoSleuth: Agent-Based Semantic Integration of Information in Open and Dynamic Environments", Proc. 1997 ACM SIGMOD Conf., SIGMOD Record, 26(2), June 1997.

[BDHS96] P. Buneman, S. Davidson, G. Hillebrand, and D. Suciu, "A Query Language and Optimization Technique for Unstructured Data", Proc. SIGMOD'96, 505-516.

[BDFS97] P. Buneman, S. Davidson, M. Fernandez, and D. Suciu, "Adding Structure to Unstructured Data", Proc. ICDT, 1997.

[Ber97] T. Berners-Lee, Metadata Architecture.

[Bor95] A. Borgida, "Description Logics in Data Management", IEEE Trans. on Knowledge and Data Engineering, 7(5), October 1995, 671-682.

[Bos97] J. Bosak, XML, Java, and the Future of the Web, 1997.

[CHHM+94] B. Chhabra, D. Hardy, A. Hundhausen, D. Merkel, J. Noble, M. Schwartz, "Integrating Complex Data Access Methods into the Mosaic/WWW Environment", Proc. Second Intl. World Wide Web Conf., Oct. 1994, 909-919.

[CM93] S. Chiba and T. Masuda, "Designing an Extensible Distributed Language with a Meta-Level Architecture", Proc. ECOOP '93, LNCS 707, Springer-Verlag, July 1993, 482-501.

[DeR97] S. DeRose, The SGML FAQ Book, Kluwer, 1997.

[FR97] G. Fahl and T. Risch, "Query Processing over Object Views of Relational Data", VLDB Journal, 6(4), 1997, 261-281.

[GB97] R. Guha and T. Bray, Meta Content Framework Using XML, June 6, 1997.

[GW97] R. Goldman and J. Widom, "DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases", Technical Report, Stanford University, 1997.

[Hop97] A. Hopmann, et al., Web Collections using XML, 1997.

[IK96] T. Isakowitz and R. J. Kauffman, "Supporting Search for Reusable Software Objects", IEEE Trans. Software Engrg., 22(6), June 1996, 407-423.

[ILCS95] D. Ingham, M. Little, S. Caughey, S. Shrivastava, "W3Objects: Bringing Object-Oriented Technology to the Web", Proc. Fourth Intl. World Wide Web Conf., World Wide Web Journal, December 1995, 89-105.

[ISO86] International Standard ISO 8879:1986(E), Information Processing - Text and Office Systems - Standard Generalized Markup Language (SGML), International Organization for Standardization, 1986.

[ISO92] International Standard ISO/IEC 10744:1992, Information Technology - Hypermedia/Time-based Structuring Language (HyTime), International Organization for Standardization, 1992.

[ISO96] International Standard ISO/IEC 10179:1996(E), Information Technology - Processing languages - Document Style Semantics and Specification Language (DSSSL), International Organization for Standardization, 1996.

[KKS92] M. Kifer, W. Kim, and Y. Sagiv, "Querying Object-Oriented Databases", Proc. ACM SIGMOD Conf., 1992, 393-402.

[KL89] M. Kifer and G. Lausen, "F-Logic: A Higher-Order Language for Reasoning about Objects, Inheritance, and Scheme", Proc. 1989 ACM-SIGMOD Intl. Conf. on Management of Data, 1989. See also other papers on F-logic and related formalisms.

[KLW95] M. Kifer, G. Lausen, and J. Wu, "Logical Foundations of Object-Oriented and Frame-Based Languages", Journal of the ACM, July 1995, 741-843.

[KR97] R. Khare and A. Rifkin, "XML: A Door to Automated Web Applications", IEEE Internet Computing, 1(4), July-August 1997, 78-87.

[Man93] F. Manola, "MetaObject Protocol Concepts for a 'RISC' Object Model", TR-0244-12-93-165, GTE Laboratories Incorporated, 1993 <,directory pub/dom>.

[Man97] F. Manola (ed.), "NCITS Technical Committee H7 Object Model Features Matrix", X3H7-93-007v12b, May 25, 1997.

[MGHH+97] F. Manola, D. Georgakopoulos, S. Heiler, B. Hurwitz, G. Mitchell, F. Nayeri, "Supporting Cooperation in Enterprise-Scale Distributed Object Systems", in M. Papazoglou and G. Schlageter, eds., Cooperative Information Systems, Academic Press, 1997.

[NUWC97] S. Nestorov, J. Ullman, J. Wiener, and S. Chawathe, "Representative Objects: Concise Representations of Semistructured Hierarchical Data", in Proc. Thirteenth Intl. Conf. on Data Engineering, Birmingham, U.K., April 1997.

[OMG95] Object Management Group, The Common Object Request Broker: Architecture and Specification, Revision 2, July 1995.

[OMG97] Object Management Group, A Discussion of the Object Management Architecture, June 1997.

[PGW95] Y. Papakonstantinou, H. Garcia-Molina, and J. Widom, "Object Exchange Across Heterogeneous Information Sources", IEEE Intl. Conf. on Data Engineering, 251-260, Taipei, March 1995. See also the other papers available at the TSIMMIS Publications page.

[REMB+95] O. Rees, N. Edwards, M. Madsen, M. Beasley, A. McClenaghan, "A Web of Distributed Objects", Proc. Fourth Intl. World Wide Web Conf., World Wide Web Journal, December 1995, 75-87.

[SG95] N. Singh and M. Gisi, "Coordinating Distributed Objects with Declarative Interfaces".

[SW96] R. Stroud and Z. Wu, "Using Metaobject Protocols to Satisfy Non-Functional Requirements", in C. Zimmermann (ed.), Advances in Object-Oriented Metalevel Architectures and Reflection, CRC Press, Boca Raton, 1996, 31-52.

This research is sponsored by the Defense Advanced Research Projects Agency and managed by the U.S. Army Research Laboratory under contract DAAL01-95-C-0112. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency, U.S. Army Research Laboratory, or the United States Government.

© Copyright 1997, 1998 Object Services and Consulting, Inc. Permission is granted to copy this document provided this copyright statement is retained in all copies. Disclaimer: OBJS does not warrant the accuracy or completeness of the information in this survey.

This page was written by Frank Manola. Send questions and comments about it to

Last updated: 2/10/98 fam