Fundamentally, a grid is an integrating mechanism or concept, and can be applied at different computer system levels, e.g.:
The definitions of "grid" found in dictionaries generally imply some concept of a "network" or "mesh". This is certainly the generic idea of a grid. Many things can be referred to as "grids" in this sense, including, in the context of computer systems, the Internet, the Web, or the objects in a CORBA-based distributed object system (which form an interconnected network by virtue of the references they have to each other). However, the grid concept used in computational grids, computing fabrics, and the CoABS grid implies additional requirements, a stronger cohesiveness. Typically, such "true grids" are formed by starting with networks of computing resources, and adding capabilities or services that help further integrate the interconnected resources. The integration found in grids involves:
In addition to the need for these individual types of grids (computer, data, object, agent), there is also a need to combine these various levels of grid capabilities. An agent-level grid supporting this requirement should provide both grid capabilities at the computation and data/object levels in support of agents, as well as grid capabilities at these other levels enabled by agents. Both these types of support are important in making the maximum use of agent-level capabilities. For example, agent-level grids (and also object-level grids) can take advantage of the capabilities of underlying computational grids in supporting their load balancing and quality-of-service requirements (particularly where the higher-level grids can interact directly with the lower levels to exert control). Operational agent grids will also need to interact with data and object systems (which hopefully will become grids at these levels), since much information and software functionality that will need to be accessible to agent grids will continue to exist in these systems. At the same time, the technical demands of grid concepts at all levels require increasing amounts of "intelligence", collaborative ability, adaptability, component mobility, etc.; in other words, characteristics frequently associated with agents. The integration of these different levels requires the use of object/component technology, together with reflective (self-referencing) capabilities combined with extensive metadata. This is because objects provide a generic modeling or abstraction mechanism for looking at the wide range of resources that need to be included at all levels in such a combined system.
In such an integrated architecture, in addition to the technical levels already discussed, there is also a need to define additional forms of organization on the available resources. These include the use of multiple tiers, the use of Common Schema concepts or enterprise-level ontologies (enterprise-wide agreements on common semantics), and specialized schemas/ontologies for use by specialized user communities (together with mappings to the common definitions where possible). Semantics-based mappings between the different technical levels in such an architecture are also required. Such additional levels of organization provide the basis for more interoperability among the various technical levels of objects contained in the system, and hence help enhance the ability of these objects to operate as a "true grid".
The grid concept is being applied to computer systems at several different "levels" (e.g., to both systems of computers and systems of agents). As a result, this study attempts to identify some general characteristics which seem to apply to all sorts of grids, in order to provide a "big picture" in terms of which grid concepts can be better understood, rather than presenting the details of technical issues associated with specific grid concepts (although this is also important). In Section 2, we give examples of several "grid-like" concepts, in order to provide a background for understanding the grid concept. In Section 3, we identify some general grid characteristics, based on common characteristics of these examples. We also look at some important types of computer systems, such as database and distributed object systems, examine the extent to which they resemble grids, and identify some types of facilities which, when added to those systems, would cause them to be considered more "grid-like", based on the characteristics we have identified. We also discuss the need to combine these various types of grids together into unified architectures, and describe the start of a general approach to doing this.
The paper identifies five major application classes for computational grids:
The paper also notes that "computational infrastructure, like other infrastructures, is fractal, or self-similar at different scales. We have networks between countries, organizations, clusters, and computers; between components of a computer, and even within a single computer." The paper describes systems at the scales of end system, cluster, intranet, and internet, the basic idea being that these constitute different scales at which similar computational services should be provided (mimicking those provided at the smallest scale, in the individual computer). Of course, it is then necessary to look at how those similar services must be provided as the scale changes, since different technologies must typically be employed.
[GFLH98] and [FK99b] describe several projects developing technology for computational grids. A simple example is PVM (Parallel Virtual Machine) <http://www.epm.ornl.gov/pvm/pvm_home.html>. PVM is a software package that permits a heterogeneous collection of Unix computers hooked together by a network to be used as a single large parallel computer. PVM allows users to exploit existing computer hardware to solve large computational problems at minimal additional cost. PVM is very portable, and the source code has been compiled on a wide variety of machines. PVM is very widely used, and is a de facto standard for distributed computing world-wide. A wide range of PVM-related links is available at the PVM home page cited above. Related facilities are provided by MPI (Message Passing Interface) [GLS94], a community-generated standard for message passing used to interconnect multiple machines. UCLA's Project Appleseed <http://exodus.physics.ucla.edu/appleseed/appleseed.html> is an example of how MPI can be used to link together a cluster of computers (Macintoshes in this case) to provide "a plug and play parallel computer" in support of numerically-intensive processing. The Appleseed Web site also contains pointers to further information on MPI.
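The idea of treating several machines as one parallel computer can be illustrated with a minimal scatter/gather sketch. It does not use the PVM or MPI APIs; as a loudly stated assumption, local threads from Python's standard library stand in for networked nodes, and the function names (partial_sum, split, virtual_machine_sum) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done by one 'node': sum the squares of its share of the data."""
    return sum(x * x for x in chunk)


def split(data, parts):
    """Divide data into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def virtual_machine_sum(data, workers=4):
    """Scatter chunks to workers, then gather and combine the partial results.

    Assumption: threads stand in for the networked machines that a
    message-passing system would actually coordinate.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, split(data, workers)))
```

The caller sees a single function call and a single result; how the work is divided among workers is hidden, which is the "single large parallel computer" illusion in miniature.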
Legion <http://www.cs.virginia.edu/~legion/> [GG99] provides an environment in which a collection of workstations, vector supercomputers, and parallel supercomputers connected by LANs and larger-scale networks appears to the user as a single very powerful computer. Legion uses object-oriented design techniques to "simplify the definition, deployment, application, and long-term evolution of grid components". The Legion architecture defines a complete object model that includes object abstractions for compute resources (called host objects), storage systems (called data vault objects), as well as other object classes. Users can use inheritance to specialize the behavior of these objects to support specific requirements, as well as to develop new objects. Legion supports PVM's libraries via emulation libraries. Legion aims to provide a single, coherent, virtual machine addressing scalability, programming ease, heterogeneity, fault tolerance, security for users and resource providers, site autonomy, multilanguage support, and interoperability. The use of reflection (the representation of parts of the underlying system as objects that can be directly operated on to access and change system behavior) is particularly important in Legion. For example, host objects represent Legion processors. One or more host objects run on each computing resource included in Legion. These objects create and manage processes for application-level Legion objects. Object classes invoke the operations of host objects to activate their instances on the computing resources that the host objects represent. Representing computing resources as Legion objects abstracts the heterogeneity of different host computing platforms, and allows resource owners to manage and control their resources within the context of the system.
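The role of host objects, and the way inheritance lets resource owners specialize them, can be sketched as follows. This is an illustration only and not Legion's actual object model or API: the class names, the accepts policy hook, and the process identifier format are all assumptions introduced here.

```python
class HostObject:
    """Hypothetical stand-in for a host object: represents one computing resource."""

    def __init__(self, name, max_processes):
        self.name = name
        self.max_processes = max_processes
        self.processes = []

    def accepts(self, class_name):
        """Policy hook (assumed): by default, admit work while capacity remains."""
        return len(self.processes) < self.max_processes

    def activate(self, class_name):
        """Create a process for an instance of the named class, if policy allows."""
        if not self.accepts(class_name):
            raise RuntimeError(f"{self.name} refused to activate {class_name}")
        process_id = f"{self.name}:{class_name}:{len(self.processes)}"
        self.processes.append(process_id)
        return process_id


class RestrictedHost(HostObject):
    """Specialization by inheritance: the owner admits only approved classes."""

    def __init__(self, name, max_processes, allowed):
        super().__init__(name, max_processes)
        self.allowed = set(allowed)

    def accepts(self, class_name):
        # Owner policy first, then the inherited capacity check.
        return class_name in self.allowed and super().accepts(class_name)
```

Because callers only use activate, a workstation and a supercomputer look the same from outside, while each owner keeps control by overriding the policy method.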
Globus <http://www-fp.globus.org/> [FK99c] is developing basic software infrastructure for computations that integrate geographically distributed computational and information resources. Globus is based on the assumptions that:
[FF97a,b,c] discuss the concept of "High-Performance Commodity Computing", the idea that computational grids should be based on emerging commodity network computing technologies such as CORBA, DCOM, and JavaBeans, together with the Web and conventional networking approaches. The papers discuss a three-tier architecture which integrates these technologies. This approach is in contrast with the more specialized grid architectures proposed in Legion and Globus (although these could be integrated to support lower-tier services). The authors particularly emphasize the importance of the emerging "Object Web", integrating the Web, distributed objects, and databases, in the development of computational grid technology.
The focus of much of this work appears to be on large-scale computing problems, although the technology is clearly not limited to those applications. Other grid concepts extrapolate ideas in distributed supercomputing to more complex applications. For example, in distributed supercomputing, the model is that of a single computing "job". The program is run, and a result is produced. In other applications, the application is of a more continuous nature. This means it must be possible for participants to enter and leave the grid, load distribution is even more dynamic (because the load and its requirements change more dynamically), etc. The next section describes a new twist on more familiar applications supported by computational grid concepts.
The articles give ubiquitous network computing as an example of an application made possible by the Fabric. The first aspect of the application is network computing: each user can access their individual "desktop" (configuration, including all applications, data, etc.) from anywhere on the network. To this is added ubiquitous computing, in which processors, displays, and input devices are everywhere. Users are tracked by sensors, and their location information is used to direct their applications and data to the appropriate devices that are located where the user is located. This changes as the person moves. There is no need for users to explicitly log in to access their computing spaces; they are just "there". The Fabric helps avoid the need for the universal presence of sufficient computing power, displays, and input devices necessary to run whatever applications the user wishes to run locally. In this scenario, processors are located all over, e.g., throughout buildings ("as populous as wall sockets, perhaps more so"), and are "richly interconnected by low latency high bandwidth connections". When the user is stationary, the user's tasks run on a local cell, consisting of processors in the general vicinity, which work together as a single system. If the tasks require it (and they can be paid for), additional processors can be added (thousands of them, if necessary); the computing resources are configured as required to run the software the user wants to run. As the user moves, their cell moves with them. Processing nodes leave the user's cell when their distance pushes their communication latencies above a threshold level, and are replaced by nodes that enter the cell as the user gets near them. A new generation of wearable processor, display, and input devices rounds out the picture.
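The moving-cell behavior described above can be sketched in a few lines. The latency model is an assumption made purely for illustration (latency proportional to straight-line distance, with an arbitrary constant); the articles do not specify one, and the function names are hypothetical.

```python
import math


def latency_ms(user_pos, node_pos, ms_per_meter=0.5):
    """Assumed model: latency grows linearly with straight-line distance."""
    return math.dist(user_pos, node_pos) * ms_per_meter


def cell_members(user_pos, nodes, threshold_ms):
    """Return the names of nodes whose latency to the user is within the threshold."""
    return {name for name, pos in nodes.items()
            if latency_ms(user_pos, pos) <= threshold_ms}


def move_user(old_pos, new_pos, nodes, threshold_ms):
    """Report which nodes leave and which enter the cell as the user moves."""
    before = cell_members(old_pos, nodes, threshold_ms)
    after = cell_members(new_pos, nodes, threshold_ms)
    return before - after, after - before
```

For example, with nodes at (0, 0), (10, 0), and (20, 0) and a 6 ms threshold, a user walking from the first position to the last drops the first node from the cell and picks up the third, while the middle node stays in the cell throughout.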
Technically the concept of Computing Fabrics involves ideas that are somewhat similar to those of the computational grid, but the application focus is somewhat different. Technologies relevant to the creation of the Computing Fabric concept include:
These bullets suggest that the CoABS grid knows not only about agents, but about their computational requirements (e.g., how they can be broken up into processes, so they can be distributed across multiple computers), and about available computational (and other) resources. Hence, the CoABS grid concept appears to incorporate both the concepts of "grid" as used in Section 2.1, and Computing Fabric as used in Section 2.2, in the sense of providing a unified, heterogeneous distributed computing environment in which computing resources are seamlessly linked. In addition, the CoABS grid extends the idea upward to the agents that are the "applications" of this distributed computing environment. Agents become both applications whose computations can be distributed within this computing environment, and also resources that can be used by this environment. At the same time, there appears to be an interface between these two layers, so that at least some agents, e.g., those that do load balancing, can operate on the computing level grid.
Building the grid suggested by the above bullets would appear to involve all the computational grid issues of system management (and associated metadata), distributed computation and load balancing, mobile code, security, etc., as well as the "agent-level" versions of those issues. This requires a way of describing resources and capabilities, and resource requirements and tasks, and a way to map between them, at both agent and computational levels. This grid also apparently involves the need for a way of defining higher-level goals, i.e., a way to define the goals of the grid itself, that are optimized by the load balancing and related activity that is going on. These goals are presumably at a higher level than those of individual agents (although these might also be characterized as the goals of higher-level agents, or agents of higher authority, rather than goals of the grid per se).
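A minimal sketch of such a mapping is given here, assuming that resources and tasks are both described by plain dictionaries using one shared vocabulary of capability names. The field names, the greedy policy, and the example capability strings are hypothetical and are not taken from the CoABS design.

```python
def can_run(resource, task):
    """A resource can run a task if it offers every required capability
    and has enough free capacity for the task's load."""
    return (task["requires"] <= resource["capabilities"]
            and task["load"] <= resource["free_capacity"])


def assign(tasks, resources):
    """Greedy mapping: place the heaviest tasks first, each on the capable
    resource with the most free capacity.

    Note: this updates each resource's "free_capacity" in place.
    Tasks that no resource can run are mapped to None.
    """
    placement = {}
    for task in sorted(tasks, key=lambda t: t["load"], reverse=True):
        candidates = [r for r in resources if can_run(r, task)]
        if not candidates:
            placement[task["name"]] = None
            continue
        best = max(candidates, key=lambda r: r["free_capacity"])
        best["free_capacity"] -= task["load"]
        placement[task["name"]] = best["name"]
    return placement
```

Because a computer ("cpu") and an agent ("translate-arabic") are described in the same form, one matching routine serves both the computational and the agent level; a real grid would replace the greedy rule with whatever higher-level goals it is meant to optimize.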
This suggests that one view of the CoABS grid could be that of a combination of a computational grid and an Agent System architecture (or at least a form of one, aimed at federating conventional agent system architectures). This would mean that it would need to incorporate typical agent system architecture services. A typical list of these services is given below (see, e.g., [Pis98a; KT98; Paz98a,b; Tho98a (slide 13)]).
The problem with ascription is that it allows practically anything to be described as an agent, making communication about agent concepts difficult among people who don't share the same point of view. A useful "filter" for using "agent" to describe a piece of software is that it should be useful to do so; that is, calling something an agent should in some useful sense distinguish it from concepts we already understand. For example, [Bra97b] quotes [Sho93] as observing:
"It is perfectly coherent to treat a light switch as a (very cooperative) agent with the capability of transmitting current at will, who invariably transmits current when it believes that we want it transmitted and not otherwise; flicking the switch is simply our way of communicating our desires. However, while this is a coherent view, it does not buy us anything, since we essentially understand the mechanism sufficiently to have a simpler, mechanistic description of its behavior."
A descriptive definition of an agent, on the other hand, typically involves a set of attributes, which a given agent might have to a greater or lesser extent, one such set being:
A similar situation exists in attempting to precisely define "grid". We can get a general idea of what "gridness" is from the "family resemblance" of the grid examples presented in earlier sections. Further examples of grid-like ideas are presented in Characterizing the CoABS Grid. In addition, that report contains a set of general grid properties which could be used in a descriptive definition of a grid. [FK99a] contains other sets of grid attributes. In addition, considering an agent grid as a generalization of an agent system architecture, the list of services in Section 2.3 could be used as descriptive attributes of grids, together with sets of attributes given in [HS98].
The definitions of "grid" found in dictionaries generally imply some concept of a "network" or "mesh". This is certainly the generic idea of a grid. Many things can be referred to as "grids" in this sense, including, in the context of computer systems, the Internet, the Web, or the objects in a CORBA-based distributed object system (which form an interconnected network by virtue of the references they have to each other). However, the grid concepts described in Sections 2.1-2.3 imply additional requirements, a stronger cohesiveness. If we are going to use the term "grid" in a computer context, the example of "mis-ascription" cited above becomes relevant: in the same sense that it buys us nothing to refer to a light switch as an "agent", it buys us nothing to refer to the Web as a "grid". In other words, if we are going to use a new term such as "grid" to describe particular computer-based systems, it would be helpful to explicitly identify the properties we want to associate with those systems that distinguish them from computer-based systems we are already familiar with (such as the Internet, the Web, distributed object systems, etc.), and for which we already have other names.
In addition, a problem with current descriptions is that the grid concept is relatively new. As a result, the focus of descriptions is on individual grid concepts and applications, and little attempt has been made to provide a "big picture" that might help unify the various concepts and related technologies. For example, what is the relationship between a computational grid or computing fabric, the Web (as a form of "information grid"), distributed object systems, and agent grids? For example, the CoABS grid does not (at least not yet) consider integration of data, distributed object systems, or the Web to any great extent, although it is clear that it will have to function in connection with these technologies.
In the sections that follow, we describe some basic ideas for use in characterizing computer-related grids. In Section 3.2, we discuss some general attributes that seem to apply to computer-related grids. In Section 3.3, we look at some important types of computer systems, examine their "gridness", and identify some types of facilities which, when added to those systems, would cause them to be considered more "grid-like". In Section 3.4, we discuss the need to combine the various levels of "grids" together into unified architectures, and the start of a general approach to doing this. We present some concluding remarks in Section 3.5.
"Gridness" can be thought of as a continuum. At one end, there is the simple interconnection or network of resources, as in the dictionary definitions of "grid". We can think of such a network as a "loose grid", if we must use the term "grid" for these networks at all. At the other end, there are the systems that allow the interconnected resources to function as well-integrated units, as in the systems described in Sections 2.1-2.3 (particularly the CoABS grid as described in [CoA98]). We can refer to these systems, which exhibit the characteristics described above (and possibly other defining characteristics not yet identified) in the strongest sense, as "true grids". This "true grid" endpoint of the "gridness" continuum is, of course, an arbitrary designation. Systems will exist at various points along the continuum, becoming "stronger grids" as they exhibit these "gridness" characteristics to a greater extent.
A key aspect of grids is composition of resources in a sense that goes beyond simply interconnecting them (although that is clearly required). The compositional facilities provided by grids can apply at all levels, including hardware/computational power, data and software (software including both individual components and services, and composition including such things as interoperability and formation of aggregates), and agents and people (e.g., formation of communities and teams). These compositions of resources are applied to "composed tasks" (i.e., tasks that go beyond separately accessing or invoking the individual resources): in the transportation grid, the composed task is generically "provide access to resources"; in the power grid, it's "provide power"; in computers, it's presumably "provide computation" (or, more abstractly, "perform service/task"). At the agent/human level, the tasks are suitably abstract (e.g., as "translate document" is enabled by the CoABS Grid knowing that there's a connected person who understands Arabic). Ideally, we want these compositions to exhibit a fractal property or, looking at them the other way around, we want composition to exhibit a closure property. This means that the resource compositions have the same characteristics as the individual resources at the same level of abstraction, so that we can treat the compositions as resources themselves. For example, the computational grid seamlessly forms a large virtual computer from individual computers in a network, forming something that looks like yet another computer (which itself could be further aggregated). Similarly, relational database theory emphasizes the idea that operations on data such as joins should exhibit a closure property, permitting newly formed aggregates of data to be operated on in the same way as the pieces from which they were formed. A similar idea can apply to agents. 
It should be possible to form teams or communities of agents that are interacted with as if they were single agents, with the group transparently dividing up any resulting work that has to be done. Grids also tend to emphasize the dynamic aspects of composition, i.e., that it should be possible to easily form compositions of resources, then break them up when the resources are no longer needed, for recomposition elsewhere. In addition, grids tend to involve some level of unified (but not necessarily centralized) management, since grids tend to be thought of as "units". However, care is needed to match the level of abstraction of the management with the level of abstraction of the grid. For example, the Internet has a certain amount of load management at the network level, but this does not make it a computational grid, even though it does connect numerous computers. A level of management at the computational level would be required for that.
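The closure property of composition discussed above can be made concrete with a small sketch in which a team of agents exposes the same interface as a single agent, so that teams can themselves be members of larger teams. The class names and the perform/capacity interface are assumptions chosen for illustration.

```python
class Agent:
    """A single resource that can perform units of work."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity

    def perform(self, units):
        """Accept up to `capacity` units; return a record of who did what."""
        done = min(units, self.capacity)
        return {self.name: done} if done else {}


class Team:
    """A composition of agents (or teams) with the same interface as Agent."""

    def __init__(self, name, members):
        self.name = name
        self.members = list(members)

    @property
    def capacity(self):
        # The composite's capacity is derived from its parts.
        return sum(m.capacity for m in self.members)

    def perform(self, units):
        """Divide the work among members; callers see only the combined result."""
        assignments = {}
        remaining = min(units, self.capacity)
        for member in self.members:
            share = member.perform(remaining)
            assignments.update(share)
            remaining -= sum(share.values())
        return assignments
```

Since Team offers the same perform and capacity interface as Agent, a Team can be passed wherever an Agent is expected, including as a member of another Team. That is the sense in which the composition is closed.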
Whether the composition of resources involves movement of the resources depends on the kind of grid and its applications (there is invariably movement of some sort, but not necessarily of the resources). For example, the composition of resources in a transportation grid necessarily involves moving those resources from where they are to where they are needed. In a computational grid, the resources are generically "computational capacity". In conventional computer networks, the capacity itself doesn't move; instead, the load is moved. However, specific groupings of capacity ("virtual capacity") can seem to move as sharing arrangements and interconnections are set up and torn down (as in the case of the Computing Fabric of Section 2.2). Data is moved in a computer network in the same way that resources are moved on a transportation grid. In the case of distributed object systems, there can be either movement of load alone (e.g., in CORBA systems, where objects are static, messages representing load are sent to them, and messages representing results are returned), movement of resources (in the case of Java objects), or both (e.g., even in a Java-based network, some services, or special purpose devices such as sensors, may not be able to move). Similar considerations apply to agent systems.
Grids involve the participants providing to the grid as well as taking from it. There is a great deal of asymmetry in some grid-related technologies that sometimes must be dealt with in order to build "true grids" from these technologies. For example, it is straightforward to think of connecting personal computers to the Internet in order to access information. It is less straightforward to think of these personal computers as being part of the Internet in the sense of having their file systems and computational facilities fully integrated with the Internet in order to form a "true grid". To do this, additional technical (and security) issues must be addressed. From another point of view, it is generally more straightforward to integrate data than it is computational capabilities. Typically this is because (a) the interfaces (for others to gain access to attached computing resources) are not as well developed as they are for data, and (b) the mechanisms for effectively using the added computation are not as well developed either (e.g., in a local network it may be possible to run an application located on someone else's machine, but it is not as easy to distribute a computation over several machines).
The relationship between a grid in a "loose" sense and a grid in the stronger sense of Sections 2.1-2.3 is often that the "loose" grid is or can be used in an organization that constitutes a "true grid". Finding the actual grid may sometimes require considering a wider context, or adding additional technology. For example, the transport grid (or a subset, like "the railroad grid") may be viewed as just the network of transport connections and the points connected. However, this grid was created in the context of higher-level desires by people to move/share resources (food and other goods). It is the unification of the transport links, together with the higher-level control mechanisms (and to some extent the economic system that provides the "tasking") that creates a grid in the stronger sense. Internet email is another example. At one level, the Internet may be thought of as a loose form of grid, because it provides network connectivity among multiple computers. However, via Internet email, it is possible for people to organize collaborative efforts, integrating the activities of widely-scattered people. In this case, considering the connected people as part of the "system" enables it to be thought of more realistically as a grid, with the Internet as a part, and with higher-level organizational strategy and goals being provided by the people involved. Similarly, distributed computer networks are at the heart of the computational grids described in Section 2.1, but additional mechanisms must be added to those networks in order to form grids in the stronger sense. Expanding the context can help us see both the grid that was intended, and also what additional components and mechanisms would be necessary to form a "true grid". 
This suggests that we might want to look at technologies, such as the Web and distributed object systems, that clearly exhibit certain characteristics that we associate with grids, but look at them not as grids, but as "proto-grids", and look carefully for the additional technologies that could be added to them to create grids in the stronger sense.
Finally, as stated in the final bullet above, "gridness" seems to imply that the system is "aware of itself" to a certain extent, and has the ability to carry out its tasks "itself", without a great deal of manual intervention. For example, any interconnected group of distributed computers could be used as a much larger "virtual computer" by employing programmers to cope with all of the distributed programming and other problems necessary to use these resources for specific problems. That does not mean that this set of distributed computers by itself is a grid. What differentiates a computational grid is the fact that the grid itself provides those services over and above the computers and network that provide the "virtual computer" illusion (possibly to a greater or lesser extent) without the detailed programming that would otherwise be necessary. Similar comments apply to grids at other levels.
We need not say much about grids at the level of computation, since the computational grid is our original, paradigmatic computer-related grid. Computational grids combine an interconnected network of computers with the necessary control and other technologies necessary to form a "true grid" from these computing resources, and the grid exists to form compositions that are bigger, virtual computers. The technologies that need to be added to the interconnected computers to form the grid have been introduced in Sections 2.1 and 2.2, and are thoroughly discussed in the cited references.
The dividing line between the technologies needed at this level and at other levels is necessarily fuzzy. For example, some of the technologies involved at this level are those that provide composition of "computation", not just of "computers", e.g., parallel and distributed programming technologies, such as those provided by PVM. The need for composition at the level of "computation", not just "computer" (but nevertheless at a fairly low level) is further illustrated by Jini's inclusion of a distributed transaction facility as an integral part of what is essentially a rather basic set of facilities. Transactions essentially define compositions of computations that are to be considered, from the outside, as single units, and hence help simplify the programming of distributed concurrent computations.
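The sense in which a transaction composes several computations into a single unit can be shown with a small all-or-nothing sketch. This is a single-process illustration under stated assumptions, not Jini's transaction API and not a distributed commit protocol; the Transaction class and the transfer function are hypothetical.

```python
class Transaction:
    """Groups several steps so that, from outside, they succeed or fail as one unit."""

    def __init__(self):
        self._undo_log = []

    def run(self, do, undo):
        """Perform one step and remember how to reverse it."""
        do()
        self._undo_log.append(undo)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            # Roll back completed steps in reverse order.
            while self._undo_log:
                self._undo_log.pop()()
        return False  # Never swallow the exception; the caller still sees it.


def transfer(accounts, src, dst, amount):
    """Move `amount` from src to dst as one unit (illustration only)."""
    with Transaction() as tx:
        tx.run(lambda: accounts.__setitem__(dst, accounts[dst] + amount),
               lambda: accounts.__setitem__(dst, accounts[dst] - amount))
        if accounts[src] < amount:
            raise ValueError("insufficient funds")
        tx.run(lambda: accounts.__setitem__(src, accounts[src] - amount),
               lambda: accounts.__setitem__(src, accounts[src] + amount))
```

A caller of transfer never observes the intermediate state in which the destination has been credited but the source not yet debited: either both steps take effect, or the first is undone before the error is reported.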
Grid-like systems also exist at the level of data. By analogy with general grid principles, data-level grids would interconnect pieces of data, and enable the interconnected collection of data to be treated as a unit for various purposes. An obvious candidate for "gridness" at this level is a database. A database constitutes a data grid in the loose sense, since it forms an interconnected collection of related pieces of data. However, a database system can also be thought of as more of a "true grid" by considering the compositional and other technologies typically associated with modern database systems. For example:
At the same time, conventional DBMSs are limited in their support to just data, and data of relatively limited types at that (object DBMSs are considered below). We might expect true "data grids" to have much wider coverage of data types than current DBMSs. In addition, DBMSs would more closely resemble "true grids" by incorporating additional self-management and organizing facilities. For example, an active DBMS that monitored its own content, and could automatically incorporate attached new data sources, would exhibit more "true grid" characteristics than current "static" DBMSs. Ideally, such capabilities would also be extended to allow the connection of heterogeneous databases into federations, based on common metadata, ontology, and conceptual schema concepts, much more readily than is now the case. DBMS functionality could also be distributed into the network so that "the network is the DBMS". This trend is related to the information mediator architectures of the DARPA I*3 and BADD programs, as well as to information agents [Tho98a (slide 14)].
The Web is in many respects a primitive form of distributed database (using its own particular data representations), similar in many respects to early network databases. Once a page is posted to a Web server, it potentially (assuming it points to other pages, and other pages point to it) becomes part of an interconnected collection of data whose component pages can be readily and uniformly accessed. However, the mechanisms needed for unifying this collection into a more coherent whole are at a relatively early stage. Examples of the additional technology needed to make the Web more of a "true grid" include:
The Web can increasingly be thought of as a form of object grid as well [Man98a,b; Man99], due to:
However, it is not enough for such services to be defined; they must also be implemented, and integrated in a seamless way in a given system, in order for that system to begin to have grid properties. This is a general issue with distributed object systems today: while the systems provide the basic distributed object interconnection facilities, the additional services which would allow the objects to be combined and used in flexible ways are generally either not very well developed, or not integrated in a very transparent way either with the objects themselves or with each other. Ideally, what is desired is a seamless "sea of objects" which eliminates or minimizes distinctions between local, persistent, or distributed objects, and in which services are transparently available. For example, an object DBMS attempts to both minimize the distinction between transient and persistent objects (including the largely-automatic movement of objects off and onto persistent storage) and seamlessly integrate services that can be used with such objects. A great deal of additional work must be done when using any of today's distributed object systems (including CORBA) to achieve even this level of seamlessness and integration, let alone transparently support such capabilities as load balancing or object replication (there is an OMG Replication Service RFP to which responses are currently being submitted).
Both the sets of higher-level services available with current distributed object systems (CORBA, DCOM, Java, and their developments) and the maturity of these services differ greatly. Some facilities being developed rapidly for Java are becoming available more slowly in CORBA (due in some respects to the need in CORBA to deal with platform and language heterogeneity). Also, for various technical reasons, many of the techniques used in these distributed object systems do not yet scale well to systems containing many millions of objects (although, e.g., such systems can be, and have been, implemented using CORBA-based technologies).
There is also a great deal of work needed on better object composition mechanisms, including improved techniques for forming basic objects from separate pieces of data (state) and code (software), and improved techniques for forming higher level components (or "business objects") or other object aggregations, complete with object interfaces, from collections of individual objects. Better facilities are also needed in many other areas, including:
At the agent level, a considerable amount of additional work also needs to be done, as illustrated by the existence of the CoABS program itself, and work on the CoABS grid. An agent grid exhibits all the general requirements (and associated services and issues) of the other grid levels, but "translated" into the agent level. For example, load balancing at the agent level involves balancing the loads of agents (and thus requires a way to describe the "load" of an agent, and how to tell if an agent is "overloaded"), and composition must address the requirements of agent composition (e.g., into teams), and agent-level division of labor. The references cited in Section 2.3 describe some of the many issues connected with the development of the CoABS grid, a particular agent-level grid.
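As a rough illustration of what describing the "load" of an agent might involve, the following minimal sketch assumes that load is the number of pending tasks relative to a capacity the agent declares for itself. The names AgentLoad, is_overloaded, and pick_least_loaded, and the 0.8 threshold, are hypothetical and are not taken from the CoABS grid.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentLoad:
    """Hypothetical load descriptor for one agent (not a CoABS structure)."""
    name: str
    pending_tasks: int  # tasks the agent has accepted but not yet finished
    capacity: int       # tasks the agent declares it can handle at once

    @property
    def utilization(self) -> float:
        # Fraction of declared capacity in use; a zero-capacity agent counts as full.
        return self.pending_tasks / self.capacity if self.capacity > 0 else 1.0

    def is_overloaded(self, threshold: float = 0.8) -> bool:
        # The 0.8 default is an arbitrary illustrative choice.
        return self.utilization >= threshold


def pick_least_loaded(agents: List[AgentLoad]) -> Optional[AgentLoad]:
    """Return the least-utilized agent that is not overloaded, or None."""
    candidates = [a for a in agents if not a.is_overloaded()]
    return min(candidates, key=lambda a: a.utilization, default=None)


team = [
    AgentLoad("planner-1", pending_tasks=9, capacity=10),
    AgentLoad("planner-2", pending_tasks=3, capacity=10),
    AgentLoad("planner-3", pending_tasks=5, capacity=10),
]
chosen = pick_least_loaded(team)
print(chosen.name if chosen else "all agents overloaded")
```

A real agent grid would probably need a richer notion of load (e.g., task difficulty, or deadlines), which is part of why this is harder at the agent level than at the level of CPUs and memory.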
At the same time, the technical demands of grid concepts at all levels require increasing amounts of "intelligence", collaborative ability, adaptability, component mobility, etc.; in other words, characteristics frequently associated with agents. For example, [Bra97b] discusses the use of agent technology in simplifying and enhancing distributed computing capabilities, and in particular enhancing intelligent interoperability in such systems. One such use is the incorporation of agents as resource managers. He notes: "A higher level of interoperability would require knowledge of the capabilities of each system, so that secure task planning, resource allocation, execution, monitoring, and possibly, intervention between the systems could take place. To accomplish this, an intelligent agent could function as a global resource manager." These functions can be distributed further among multiple agents: "A further step toward intelligent interoperability is to embed one or more peer agents within each cooperating system. Applications request services through these agents at a higher level corresponding more to user intentions than to specific implementations, thus providing a level of encapsulation at the planning level, analogous to the encapsulation provided at the lower level of basic communications protocols." Agents can also assist in providing better user interfaces for such distributed systems. As [Bra97b] observes, "In the future, assistant agents at the user interface and resource-managing agents behind the scenes will increasingly pair up to provide an unprecedented level of functionality to people."
[Gen97] also describes the role of agents in enabling interoperability in distributed systems. In his approach, agents and facilitators are organized into a federated system, in which agents surrender autonomy in exchange for the facilitator's services. Facilitators coordinate the activities of agents and provide other services such as locating other agents by name (white pages) or by capability (yellow pages), direct communication, content-based routing, message translation, problem decomposition, and monitoring. On startup, an agent initiates an ACL connection to the local facilitator and provides a description of its capabilities. It then sends the facilitator requests when it cannot supply its own needs, and is expected to act to the best of its ability to satisfy the facilitator's requests.
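The white-pages and yellow-pages lookups described above can be sketched as a small in-memory registry. This is only an illustration under stated assumptions: the Facilitator class and its method names are hypothetical, the ACL connection is not modeled, and "routing" is reduced to picking the first registered provider of a capability.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set


class Facilitator:
    """Hypothetical facilitator offering lookup by name and by capability."""

    def __init__(self) -> None:
        # White pages: agent name -> capabilities it advertised.
        self._by_name: Dict[str, Set[str]] = {}
        # Yellow pages: capability -> names of agents offering it.
        self._by_capability: Dict[str, Set[str]] = defaultdict(set)

    def register(self, agent_name: str, capabilities: List[str]) -> None:
        """Called by an agent on startup to describe what it can do."""
        self._by_name[agent_name] = set(capabilities)
        for capability in capabilities:
            self._by_capability[capability].add(agent_name)

    def lookup_by_name(self, agent_name: str) -> Optional[Set[str]]:
        """White-pages lookup: capabilities of a named agent, if registered."""
        return self._by_name.get(agent_name)

    def lookup_by_capability(self, capability: str) -> List[str]:
        """Yellow-pages lookup: sorted names of agents with a capability."""
        return sorted(self._by_capability.get(capability, set()))

    def route(self, capability: str) -> Optional[str]:
        """Pick one agent able to handle a request needing `capability`."""
        providers = self.lookup_by_capability(capability)
        return providers[0] if providers else None


facilitator = Facilitator()
facilitator.register("weather-agent", ["forecast", "translate-units"])
facilitator.register("map-agent", ["route-planning", "translate-units"])

print(facilitator.lookup_by_capability("translate-units"))
print(facilitator.route("forecast"))
```

An agent that cannot satisfy a need itself would send the request to the facilitator, which would use a lookup of this kind to decide where to forward it.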
The integration of agents with other levels requires the use of object/component technology, together with reflective (self-referencing) capabilities combined with extensive metadata. For example, [Bra97b] observes: "A key enabler is the packaging of data and software into components that can provide comprehensive information about themselves at a fine-grain level to the agents that act upon them. Over time, large undifferentiated data sets will be restructured into smaller elements that are well-described by rich metadata, and complex monolithic applications will be transformed into a dynamic collection of simpler parts with self-describing programming interfaces. Ultimately, all data will reside in a 'knowledge soup', where agents assemble and present small bits of information from a variety of data sources on the fly as appropriate to a given context. In such an environment, individuals and groups would no longer be forced to manage a passive collection of disparate documents to get something done. Instead, they would interact with active knowledge media that integrate needed resources and actively collaborate with them on their tasks." The Web, in its role as the beginnings of a data/object grid, can be said to be moving in this direction now. This is particularly true when technologies for addressing finer-grained portions of Web documents (XLink, XPointer) and for attaching behavior to Web data [Man98a,b] are considered. [Bra97b] also identifies the need for such agent systems to be able to interact with both object systems and more conventional software: "Ideally, each software component would be 'agent-enabled', however, for practical reasons components may at times still rely on traditional interapplication communication mechanisms rather than agent-to-agent protocols."
Objects provide a generic modeling or abstraction mechanism for looking at the wide range of resources that need to be included at all levels in such a combined system. An object in this sense is simply an encapsulated unit that has identity, an interface (possibly more than one), and communicates via messages with other objects and the "outside". This use of objects mirrors the use of objects as a general modeling mechanism in the ISO Reference Model of Open Distributed Processing [ISO95], which is intended to describe any distributed processing system (including, in some cases, the roles of humans that may be involved in the system), not just systems actually implemented using objects. Object abstractions need not be implemented using object-oriented programming techniques; however, modeling the system in terms of objects makes the integration of object technologies such as CORBA, Jini, etc. relatively straightforward.
Representing the computational and communication components of a computational grid as objects, as illustrated in the Legion system's reflective capabilities, allows these components to be both uniformly represented within the architecture, and managed in a straightforward way by higher level components. The approach of representing computer or network components as objects for management purposes is well-known in both network and computer system management technologies. Data can be represented as objects in a straightforward fashion, by defining object interfaces containing get (read) and set (write) operations. The World Wide Web Consortium Document Object Model is an example of a set of such interfaces designed to provide object-oriented interfaces to Web data. Such interfaces provide programs and agents with more uniform access to information represented both as data (e.g., in databases, on file systems, or in the Web) and in distributed object systems, and also support the integration of more "intelligence", in the form of behavior, with such data. Finally, object interfaces can encapsulate "smart things", e.g., more or less smart agents, and human beings. For example, agents can be modeled as objects (independently of whether they are implemented as objects), in the sense that they are encapsulated things with independent identity, present interfaces to the rest of the world, and communicate with anything outside them via messages sent to interfaces. Similarly, people can be modeled as objects: "fmanola@objs.com" is the identifier of an interface to which messages can be sent.
In some cases the messaging protocols between these various kinds of objects will be relatively simple (e.g., conventional object RPC between distributed software objects, or commands sent to hardware), while in other cases they will be more complicated (agent communication language (ACL) sent between agents, or the email flow between people); however, similar abstraction principles can apply to objects at all levels.
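The idea that data and people can both be treated as encapsulated things with an identity and a message interface can be sketched as follows. This is a minimal illustration, not an implementation of RM-ODP, CORBA, or the Document Object Model; the GridObject protocol, the class names, the dictionary message format, and the example address are all assumptions made for the example.

```python
from typing import Any, Dict, List, Protocol


class GridObject(Protocol):
    """Hypothetical common interface: an identity plus a way to send messages."""
    identity: str

    def send(self, message: Dict[str, Any]) -> Any: ...


class DataObject:
    """Wraps a plain record behind get (read) and set (write) operations."""

    def __init__(self, identity: str, record: Dict[str, Any]) -> None:
        self.identity = identity
        self._record = dict(record)  # private copy keeps the state encapsulated

    def send(self, message: Dict[str, Any]) -> Any:
        if message["op"] == "get":
            return self._record[message["field"]]
        if message["op"] == "set":
            self._record[message["field"]] = message["value"]
            return None
        raise ValueError(f"unsupported operation: {message['op']}")


class PersonObject:
    """Models a person as an interface identified by a mail address."""

    def __init__(self, address: str) -> None:
        self.identity = address
        self.inbox: List[Dict[str, Any]] = []

    def send(self, message: Dict[str, Any]) -> Any:
        # Delivery is simulated by appending to an in-memory inbox.
        self.inbox.append(message)
        return None


def deliver(target: GridObject, message: Dict[str, Any]) -> Any:
    """Send a message without knowing what kind of object receives it."""
    return target.send(message)


page = DataObject("doc-1", {"title": "Grids"})
person = PersonObject("someone@example.org")

deliver(page, {"op": "set", "field": "title", "value": "Agent Grids"})
print(deliver(page, {"op": "get", "field": "title"}))
deliver(person, {"op": "notify", "text": "title changed"})
print(len(person.inbox))
```

The caller uses the same deliver function for both targets; only the complexity of the message protocol behind each interface differs.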
In such an integrated architecture, there is also a need to define additional forms of organization on the available resources in addition to the technical levels already discussed, together with associated metadata. For example, large scale distributed object systems increasingly are being designed with 3- (or sometimes multi-) tier architectures [MGHH+98]. These architectures involve the division of the system's components (and object definitions) into functional tiers based on the different functional concerns they address. For example, a typical 3-tier architecture has a tier for objects representing user interface elements, a tier for business or application objects, and a tier for database servers. The business object tier separates out the common definitions of enterprise operations and semantics from the more specialized concerns addressed in the other tiers. Other examples of such organization include the use of Common Schema concepts or enterprise-level ontologies (enterprise-wide agreements on common semantics), and specialized schemas/ontologies for use by specialized user communities (together with mappings to the common definitions where possible).
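The division into tiers can be sketched in a few lines. The example below is only an illustration of the separation of concerns, assuming a toy account-balance domain; the names CustomerStore, AccountService, and render_deposit are hypothetical, and an in-memory dictionary stands in for a database server.

```python
from typing import Dict


class CustomerStore:
    """Database tier: owns storage (a dict stands in for a DBMS here)."""

    def __init__(self) -> None:
        self._balances: Dict[int, float] = {}

    def load_balance(self, customer_id: int) -> float:
        return self._balances.get(customer_id, 0.0)

    def save_balance(self, customer_id: int, amount: float) -> None:
        self._balances[customer_id] = amount


class AccountService:
    """Business tier: enterprise rules shared by every user interface."""

    def __init__(self, store: CustomerStore) -> None:
        self._store = store

    def deposit(self, customer_id: int, amount: float) -> float:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        new_balance = self._store.load_balance(customer_id) + amount
        self._store.save_balance(customer_id, new_balance)
        return new_balance


def render_deposit(service: AccountService, customer_id: int, amount: float) -> str:
    """User interface tier: formats results and contains no business rules."""
    balance = service.deposit(customer_id, amount)
    return f"Customer {customer_id} balance: {balance:.2f}"


service = AccountService(CustomerStore())
print(render_deposit(service, 7, 25.0))
```

Because the deposit rule lives only in the business tier, a second user interface (or an agent) could reuse it without duplicating the rule or touching the storage code.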
Semantics-based mappings between the different technical levels in such an architecture are also required. For example, the ATAIS architecture document [BFHH+98] describes a series of interoperability levels: isolated, co-habitable, syntactic, semantic, seamless, and adaptive. The computational grid idea can be characterized as emphasizing high levels of interoperability on this spectrum, but at a low level of abstraction (i.e., in terms of computing resources). The agent grid often involves a much higher level of abstraction. Other levels (e.g., data, objects) are, in a sense, in between these extremes. Raising the level of abstraction complicates providing "gridness" (deep integration) because the requirements on one side, and the available resources/services on the other, are more semantically heterogeneous (unlike, e.g., "memory" and "CPU bandwidth"), and thus both characterizing them, and matching requirements with resources, becomes harder. An example of this is the complexity of addressing quality-of-service (QoS) issues, which involves defining mappings between "quality" measures at higher levels, and resource allocations at lower levels.
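A minimal sketch of these two ideas follows. It assumes that the interoperability levels are ordered as listed above, and it uses an invented table to map agent-level "quality" labels to low-level resource requests; the labels, the numbers, and the function names are hypothetical and do not come from [BFHH+98].

```python
from enum import IntEnum
from typing import Dict


class InteropLevel(IntEnum):
    """Levels named in the text, assumed ordered from least to most integrated."""
    ISOLATED = 0
    CO_HABITABLE = 1
    SYNTACTIC = 2
    SEMANTIC = 3
    SEAMLESS = 4
    ADAPTIVE = 5


def can_interoperate(a: InteropLevel, b: InteropLevel, required: InteropLevel) -> bool:
    """Two systems meet a requirement only if the weaker of them reaches it."""
    return min(a, b) >= required


# Hypothetical mapping from a high-level quality label to low-level resources.
QOS_TO_RESOURCES: Dict[str, Dict[str, float]] = {
    "best-effort": {"cpu_share": 0.1, "memory_mb": 64},
    "interactive": {"cpu_share": 0.4, "memory_mb": 256},
    "mission-critical": {"cpu_share": 0.8, "memory_mb": 1024},
}


def allocate(quality: str, available_cpu: float) -> Dict[str, float]:
    """Translate a quality label into a resource request, or fail if unmet."""
    request = QOS_TO_RESOURCES[quality]
    if request["cpu_share"] > available_cpu:
        raise RuntimeError(f"cannot satisfy {quality!r} with cpu {available_cpu}")
    return request


print(can_interoperate(InteropLevel.SEMANTIC, InteropLevel.SYNTACTIC, InteropLevel.SEMANTIC))
print(allocate("interactive", available_cpu=0.5))
```

The fixed table is the weak point of the sketch: at higher levels of abstraction, the difficulty lies precisely in deciding what the entries of such a mapping should be.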
Such additional levels of organization provide the basis for more interoperability among the various technical levels of objects contained in the system, and hence help enhance the ability of these objects to operate as a "true grid".
I wish to acknowledge the helpful discussions and input of Craig
Thompson, Venu Vasudevan, and Paul Pazandak, all of OBJS, as well as the
cited references, for important contributions to the ideas in this paper.
[Bra97a] J. M. Bradshaw (ed.), Software Agents, American Assn. for Artificial Intelligence/MIT Press, 1997.
[Bra97b] J. M. Bradshaw, "An Introduction to Software Agents", in [Bra97a].
[CoA98] DARPA CoABS Read Ahead Package and CoABS Kickoff Meeting, Pittsburgh, July 22-23, 1998.
[FF97a] G. Fox and W. Furmanski, "Petaops and Exaops: Supercomputing on the Web", IEEE Internet Computing 1(2), March-April 1997.
[FF97b] G. Fox and W. Furmanski, "HPcc as High Performance Commodity Computing", Technical Report, December 1997, http://www.npac.syr.edu/users/gcf/hpdcbook/HPcc.html.
[FF97c] G. Fox and W. Furmanski, "High-Performance Commodity Computing", in [FK99a].
[FK99a] I. Foster and C. Kesselman (eds.), The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, 1999. ISBN 1-55860-475-8.
[FK99b] I. Foster and C. Kesselman, "Computational Grids", in [FK99a].
[FK99c] I. Foster and C. Kesselman, "The Globus Toolkit", in [FK99a].
[Gen97] M. R. Genesereth, "An Agent-Based Framework for Interoperability", in [Bra97a].
[GLS94] W. Gropp, E. Lusk, and A. Skjellum, Using MPI: Portable Parallel Programming with the Message Passing Interface, MIT Press, Cambridge, 1994.
[GFLH98] A. Grimshaw, A. Ferrari, G. Lindahl, and K. Holcomb, "Metasystems", Comm. ACM 41(11), November 1998.
[GG99] D. Gannon and A. Grimshaw, "Object-Based Approaches", in [FK99a].
[HS98] M. Huhns and M. Singh (eds.), Readings in Agents, Morgan Kaufmann, 1998.
[ISO95] ISO/IEC JTC1/SC21/WG7 (1995), Reference Model of Open Distributed Processing <http://www.iso.ch:8000/RM-ODP/> (see also <http://www-cs.open.ac.uk/~m_newton/odissey/RMODP.html> and <http://www.dstc.edu.au/AU/research_news/odp/ref_model/ref_model.html>).
[Ket98] B. Kettler, The CoABS Grid Vision, draft 2.1, 11/18/98, Brian Kettler, ISX Corporation.
[KT98] N. Karnik and A. Tripathi, "Design Issues in Mobile-Agent Programming Systems", IEEE Concurrency 5(3), July-September 1998.
[Man98a] F. Manola, Towards a Web Object Model, Technical Report, Object Services and Consulting, Inc., <http://www.objs.com/OSA/wom.htm>, 1998.
[Man98b] F. Manola, Some Web Object Model Construction Technologies, Technical Report, Object Services and Consulting, Inc., <http://www.objs.com/OSA/wom-II.htm>, 1998.
[Man99] F. Manola, "Technologies for a Web Object Model", to appear, IEEE Internet Computing, January/February, 1999.
[MGHH+98] F. Manola, et al., "Supporting Cooperation in Enterprise-Scale Distributed Object Systems", in M. P. Papazoglou and G. Schlageter (eds.), Cooperative Information Systems: Trends and Directions, Academic Press, 1998.
[Paz98a] P. Pazandak, Best of Class Agent System Features, <http://www.objs.com/agility/tech-reports/9809-best-of-class-capabilities.htm>, 1998.
[Paz98b] P. Pazandak, Next Generation Agent Systems & the CoABS Grid, draft Technical Report, <http://www.objs.com/agility/tech-reports/9810-NGAS.htm>, 1998.
[Pis98a] A. Piszcz, "Background on Agents for DARPA's NGII Architecture", Mitre Technical Report MTR 98W0000085, August 1998.
[Pis98b] A. Piszcz, Grid Metaservice Considerations for Control of Agent Based Systems, draft, 3 September, 1998.
[Sho93] Y. Shoham, "Agent-Oriented Programming", Artificial Intelligence 60(1), 51-92, 1993.
[Tho98a] C. Thompson, Strawman Agent Reference Architecture, slide presentation, <http://www.objs.com/agility/tech-reports/9808-agent-ref-arch-draft2.ppt>, 1998.
[Tho98b] C. Thompson, Characterizing the CoABS Grid, Technical Report, <http://www.objs.com/agility/tech-reports/9811-grid.html>, 1998.
[VV95] W. Van de Velde, "Cognitive Architectures--From Knowledge Level to Structured Coupling", in L. Steels (ed.), The Biology and Technology of Intelligent Autonomous Agents, Springer Verlag, Berlin, 1995.
[WWWK94] J. Waldo, G. Wyant, A. Wollrath, and S. Kendall, A Note on Distributed Computing, SMLI TR-94-29, Sun Microsystems Laboratories, Inc., November 1994 <http://www.smli.com/techrep/1994/abstract-29.html>.
© Copyright 1998 Object Services and Consulting, Inc. (OBJS)
© Copyright 1998 Institute for Defense Analyses (IDA)
Permission is granted to copy this document provided this copyright statement is retained in all copies.
Disclaimer: Neither OBJS nor IDA warrant the accuracy or completeness of the information in this report.