Middleware Requirements for Remote Monitoring and Control

Frank Olken, H.-Arno Jacobsen, Chuck McParland
Lawrence Berkeley National Laboratory,
Berkeley, CA 94720
{olken,hajacobsen,cpmcparland}@lbl.gov
 

1.0 Introduction

We discuss middleware requirements for remote monitoring and control applications. These requirements reflect our experiences with the RBMO Project (remote building monitoring and operations) of commercial buildings HVAC systems (heating, ventilating, and air conditioning) and our participation in standards efforts related to control of large electric utility grids. Similar requirements exist for process control, and communications network monitoring and control applications.

The RBMO project is developing software systems to support remote monitoring and control of multiple buildings with heterogeneous legacy Energy Management Control Systems. across the Internet, using CORBA protocols. More details on the RBMO project can be found in Olken (Olken et al. 96) or on the RBMO web page. We have also recently become involved in some electric utility standards work, e.g., EPRI's UCA (Utility Communication Architecture and the newly formed OMG SIG on utility applications. The requirements described below reflect our experiences with these two application areas. We discuss only requirements which we deem (1) not largely and sufficiently supported by existing midddleware platforms and (2) mainly incurred by the remote monitoring and operation character of our system.

Key requirements we identified for this class of application:

We have classified these requirements into the more general categories: basic services, security, distributed system management, and performance. The remainder of the paper is organized as follows. Sections 2 to 7 discuss in greater detail the above listed requirements for remote operation of devices in distributed systems. Section 3 relates these requirements to existing middleware platforms.

2.0 Basic Services

2.1 Event Notification Services

Central to any remote monitoring and control system is some mechanism for the remote sites to notify the monitoring sites of "interesting" events. Such event notifications need to be asynchronous, persistent, and (often) multicast. Ideally, events should be typed (so that certain processes can listen for specific types of events).

Event notification services address the need to propagate notification of significant occurrences to relevant system components. Depending on system functional requirements, this notification may be considered unreliable (e.g. periodic monitoring value updates) or may be required to be reliable (e.g. equipment status changes). In a distributed environment where system components can fail or become unreachable, the technical requirements for implementing these services can be substantial.

Although low level event delivery protocols are defined in CORBA, the semantics of describing and specifying higher level event service behavior remains (thus far) vendor-specific. Event service reliability and delivery persistence capabilities are not yet consistently defined and implemented. It is therefore important to identify and describe a significant subset of high level event service functionality. These event service functions should be sufficiently well described to allow multiple implementations that exhibit interoperable behavior. The ultimate goal will be to have a well understood set of high level event services available to any distributed control application in much the same way as TCP/IP naming services now are.

2.2 Data Push Architectures

Data push architectures are becoming an increasingly important method for distributing data to a large and variable set of remote applications. When combined with publish/subscribe techniques, the push model allows multiple applications to subscribe to data published by a single platform. Subscribing applications can be local to the publisher (i.e. located on it's LAN) or at some remote Internet site. Since the network topology between the publisher and subscriber can be determined, the publisher can use the most appropriate network protocol (i.e. unicast, multicast, unicast tunnel to multicast daemon, etc.).

Unfortunately, there are some conceptual problems in matching this model to the pure, synchronous, remote object method invocation model found in most ORBs. Ideally, control systems designers will find it most convenient to map individual data source and control objects onto single software objects. Readout of a data source by a remote application consists of a remote method invocation on the data source's software object. Unfortunately, data push architectures perform object method invocations at their own instigation and push data out at their own pace. Furthermore, for efficiency sake, data collected from a variety of objects are collected together and pushed out as a single message. Since the semantics that "tie" software object methods and data together becomes lost, these data collections end up being treated in various special, ad hoc ways.

As data push architectures grow in popularity, it will become increasingly important to bridge the gap between these two models. One method would be to formalize the data publishing protocol by making specific references to ORB interface repository entries. These references would "tie" elements of the data message to specific method invocations on specific objects. Upon reception, elements of the data message would unmarshalled in accordance with interface repository information.

2.3 Time Synchronization

Electric Power Grid control applications have stringent requirements for time synchronization across the distributed system (e.g., the entire Western United States). Phase differences among generator are key performance indicators for electric power systems. Assuming 60 Hz power generation (16.7 ms/cycle) 1 degree phase difference amounts to 46 microseconds. To reliably measure this we will need time synchronization of about 10 microseconds. Recent practice (in the electric power industry and some network monitoring applications) is to use GPS satellite time bases for time synchronization.

3.0 Security and Privacy

Remote monitoring and operation of devices over the Internet requires strict security measures for several reasons: (1) to protect the monitored data from being stolen, corrupted and intentionally falsified; (2) to protect the device from malicious use (abuse) by impersonators; (3) to protect the device from unauthorized use; (4) commercial or national security concerns may require steps to preserve the privacy of monitoring data.

Most of our attention has focused on authentication concerns. Unauthorized operation of HVAC or electric power grid devices involves potential damage to expensive equipment and to the security of the electric power grid. It is therefore imperative to prevent unauthorized operation of this equipment.

To enforce these security requirements the following middleware functionality is needed:

Whereas the first four items are covered by SSL (Secure Socket Layer), delegation (needed for item 5), policies and roles (needed for 5 and 6) are not covered by these protocols. SSL is a simple and widely available security mechanism, developed particularly for web applications.

Secure invocation based on policies and roles, delegation of credentials and non-repudiation are part of more elaborate security mechanisms. The OMG CORBA Security Service specification, for example, identifies techniques to achieve such functionality. Vendor implementations of these services are unfortunately lagging behind, often not fully standard conform implemented, and modified through proprietary solutions.

Note that current practice typically employs private (physically separate) networks for building automation and power grid operations. If (to reduce cost) shared communications networks (e.g., Internet) are to be employed then security services will have to be available at the device controller level.

Privacy considerations may arise in building monitoring data, inasmuch as some of this data could disclose information about unusual levels on activity by a firm or agency, e.g., merger preparations or war planning. Detailed information on electricity consumption of competitors may also be useful for competitive intelligence. Conventional encryption services should suffice for these concerns.

4.0 Semantic Heterogeneity

4.1 Naming and Directory Services

Effective configuration description and control is the bane of distributed systems. Distributed systems are constructed, in part, by binding together objects distributed over many platforms. Without a mechanism for describing these bindings, the resulting system lacks flexibility and it's complexity remains hidden to all but the original implementors. Naming services provide a way to bind meaningful human names to object instances.

These problems take on greater importance in real-time control environments where systems must accommodate equipment replacement and/or upgrade cycles as well as "live" system reconfiguration. Stable and ubiquitous naming and directory services are a key tool in the design of systems that will accommodate these changes. Examples include the CORBA naming services and LDAP Directory services.

The function of these services goes well beyond the nuts and bolts issues of publishing specific object names and attributes. An effective distributed naming and directory service can expose, through the use of multiple hierarchical views, objects aggregated by function, location, control sub-system, etc. In addition, these services can provide a single, ubiquitous repository for object meta information. In the case of control systems, this can include information about sensors and instrumentation used in generating object method values, resolution of object name aliases due to differing local and supervisory object namespaces, and provide automated mechanisms for locating physical and/or logical control system components.

4.2 Mediation Services

While standards efforts offer some hope in the long run for reconciling semantic heterogeneity among various network devices and controllers (HVAC or electric power grid), the existence of many incompatible legacy systems requires that some facilities for mapping between the global schema and local schemas be provided. At present, e.g., in RBMO (and other systems) this is often done in an ad hoc fashion with hand-crafted mapping tables. Such facilities are variously referred to as mediators (partial mappings, late binding) or schema integration tools (nearly complete mappings, early binding). Such tools will require access to metadata (e.g. DB or object schemas). Examples of related work include OMG Object Interface Repository, and Meta Object Facilities.

4.2 Unit Conversion Services

We have found it necessary to provide measurement unit conversion capabilities for our RBMO project. Our problems arise in part from heterogeneous use communities employing variously English or metric units (SI). However, even among users of metric units there are often idiosyncratic units conventions in particular disciplines (e.g., the use of Angstroms vs. nanometers). Units information either needs to accompany measurements, or being readily attainable via metadata queries. The ubiquity of these issues (in both industrial measurement and commercial trading) and their invariance suggests the provision of standard services. Note that unit conversions may be either: invariant (true equivalences), or material and state dependent. The latter type of conversions typically involve material properties, e.g., density, and conversions between mass and volume.

5.0 System Management

5.1 Static vs. Dynamic Schemas

Traditional database designs specify complete detailed schemas statically. Such static schemas facilitate systematic design, query optimization, etc. However, for many monitoring and control applications (shipped as shrink wrap software), it is desirable to be able to alter the schema (e.g., to accommodate new types of equipment) without rebuilding the DB or recompiling the code. Examples of such approaches include: frame-based knowledge representation systems, and the dynamic invocation interface of CORBA, together with interface repositories. Because such dynamic schema capabilities are typically significantly slower than static interfaces, some groups have built hybrid schemas - examples include the International Alliance for Interoperability (building data model) which provide both static schemas (e.g., for geometry) and dynamic schemas (frames) for extensibility. Conceivably, on-the-fly compilation techniques might provide a reasonable performance compromise.

5.2 Configuration management and component distribution

The remote entity to be monitored and controlled, may often be located at great distances from the central system development and installation site, and it may not always be possible to travel to the remote site to install, upgrade and maintain the system. It is therefore essential to provide support for static and dynamic software distribution over the network onto heterogeneous target platforms.

With static software distribution we mean the distribution of patches, upgrades, new versions, and entire releases, as well as their automatic installation. For intranets and homogeneous environments this problem has already been addressed and tools are available to support the developer with the tasks of configuration management and version control; automatic installation, however, is still an ad hoc procedure not globally supported. Moreover, the more general issue of wide-area network software distribution aiming at highly heterogeneous environments is still an open question. Additionally, critical sites will often be secured by firewalls which opens up the question of secure software distribution over the Internet supported by middleware services.

Unless the software has been designed for runtime maintenance, it is often not feasible to halt a system in operation to exchange components, since continuous operation is required (e.g., for security reasons, due to continuous data acquisition.) With dynamic software distribution we mean the substitution of software components, modules and objects at runtime and design to permit such use.

5.2 Message logging, monitoring and debugging support

In the development process of a distributed application it becomes often necessary to monitor and debug the evolving system. Monitoring serves mainly to pinpoint performance bottlenecks and fine tune the system. In later stages monitoring may also be applied to build network aware applications, i.e., applications which adapt their communication pattern to available resources.

Little support is available that aids the developer monitor and debug the distributed system under development. Only, recently have tools been announced for specific middleware implementations (OrbixManager [Iona 97] and Object/ Observer, [Black and White 97]). A standards effort aiming to specify a monitoring service (including a set of performance metrics) and debugging hooks in the middleware platform would greatly benefit the system developer. Within the OMG Utility SIG effort is underway to design a logging facility for events to better captured causality relationships of the components involved.

6.0 Performance Issues

Our experience with existing middleware systems (i.e., CORBA) was that the facilities were too slow to permit the modeling of individual measurement points as separate objects with separate operations. It proved necessary to aggregate many of these measurement points together into named collections and to query the groups of points. This dramatically reduced the number of CORBA invocations (messages) and improved performance. We conclude that we need both faster ORBs and automatic (or convenient semi-automatic) tools for facilitating such aggregations.

7.0 Discussion and conclusion

In this section we summarize our experience by comparing the capabilities of existing middleware platforms with respect to the issues raised in the previous section. These platforms are: CORBA, DCOM, MMS (Manufacturing Message Specification), Java/RMI, DCE. The following two table constitute the comparison.
Requirement/ Platform Event Notific- ation  Data Push Time Sync.  Security  Naming/ Directory  Unit conversion 
CORBA  Event Service, limited event typing  no  not yet  Security service (roles, policies, delegation)  yes  no 
DCOM  no  comparable to DCE  yes  no 
MMS  Event service with typed events  no explicit support no  no (primarily used in LANs)  use of other's services possible  no 
Java/RMI supported; EventClass  supported; JavaBeans via events  no  Java's security model use of other's services possible  no 
DCE  No Event Service available  no explicit support  Time service & synchronization  Security service (ACLs, delegation)  yes  no 
 
Requirement/ Platform Dynamic Schemata  Config. Manag.  Logging & Debugging  Real time capabilities 
CORBA  Reflexive sys. arch.; IR; dyn. invocation  no  vendor specific support  in progress 
DCOM  yes  yes (versioning only) 
MMS  no  no  in progress  goal: real-time operation & control 
Java/RMI JavaBeans: Discovery & Registration  no  in progress  no 
DCE  no  Distributed System Management specifications (in progress)  RFPs issued  in progress 
 

Bibliography