How to Componentize Data-Intensive Legacy Applications:

Issues and Initial Approaches

Arnon Rosenthal, Robert Hyland [attendee], Eric Hughes
{arnie, hyland, hughes}@mitre.org
The MITRE Corporation
Bedford MA 01730, USA

Abstract

Compositional software approaches are particularly difficult to employ when most of the necessary function resides in legacy systems. We describe some large-scale efforts toward very basic composability, discuss our experiences in migrating legacy functions to a compositional architecture, and pass along some questions that the architecture and developer communities have asked us. The topics discussed are:

some efforts in the Department of Defense (DOD) developer mainstream
experiences in moving legacy functionality to component frameworks
design patterns and frameworks to help in migrating legacy systems' databases and electronic data interchange

Finally, we ask the community to identify best practices for development groups who need to make decisions in the next few years, in the absence of an accepted compositional architecture.

1. Current Status in the DOD Mainstream

The DOD has an enormous inventory of legacy systems, and is well aware of their shortcomings. Hopes for improvement have rested on standardization, but now a more sophisticated notion is gaining sway: the ability to configure instances of a general architecture, employing reusable components, to meet new mission needs.

For the Air Force, the Chief Architects Office, with significant MITRE involvement, has described configurability as a prime requirement for software architectures. Implicitly, these architects are adopting the compositional software perspective. However, the current compositional perspective exists only in foil-ware. It still needs to be sold up the hierarchy, and to be refined to provide architectural, design, and implementation frameworks. Certainly it will be several years before DOD begins to settle on a global architecture for compositional software. In the meantime, existing development efforts continue.

The DOD's Common Operating Environment (COE) [DISA97], a mainstream effort that is intended to govern most fielded systems, is currently exploring a less ambitious vision. It aims to make (subsets of) a standard collection of services available on each machine, but also addresses one componentization issue: assuring that installed applications do not interfere with each other.

Question 1: What specifications and practices are needed to ensure that newly-added components do not interfere with the operation of existing components?

The COE provides a technically conservative infrastructure (operating systems, relational DBMSs, TCP/IP), with object-orientation as an option rather than the foundation. The basic component (called a segment) is a module of software, database schema, or data that can be installed on a COE workstation. The installer service is responsible for removing conflicts (e.g., of directory or table names). There are also guidelines on how a good segment should behave, and some tools for testing compliance with these guidelines.
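The kind of name-conflict check that an installer service performs can be sketched as follows. This is a minimal illustration only, not the COE installer: the Segment class, the find_conflicts function, and the sample directory and table names are all hypothetical, and a real installer would presumably track more kinds of resources than these two.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a segment: a named bundle that claims resources."""
    name: str
    directories: frozenset = field(default_factory=frozenset)
    tables: frozenset = field(default_factory=frozenset)


def find_conflicts(installed, candidate):
    """Return (kind, resource, owner) triples where the candidate
    claims a directory or table name that an installed segment already holds."""
    conflicts = []
    for seg in installed:
        for d in sorted(seg.directories & candidate.directories):
            conflicts.append(("directory", d, seg.name))
        for t in sorted(seg.tables & candidate.tables):
            conflicts.append(("table", t, seg.name))
    return conflicts


# Sample data (invented for illustration).
installed = [
    Segment("mapping", frozenset({"/opt/maps"}), frozenset({"ROUTES"})),
    Segment("logistics", frozenset({"/opt/logi"}), frozenset({"SUPPLY"})),
]
candidate = Segment(
    "planner",
    frozenset({"/opt/plan", "/opt/maps"}),
    frozenset({"ROUTES", "PLANS"}),
)

conflicts = find_conflicts(installed, candidate)
for kind, resource, owner in conflicts:
    print(f"{kind} {resource!r} already claimed by segment {owner!r}")
```

Detection is the easy part; what to do about a conflict (rename, reject, or share the resource) is the policy question raised in the text.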

A richer infrastructure model (e.g., everything is an object, separation of processes, clearer control of naming and execution contexts) would simplify and enhance the COE. However, shared resources (e.g., data, screen space) could benefit from even further flexibility in conflict-handling. Also, for performance reasons, it may be desirable to break down the wall between components (e.g., for "data blades" added to a DBMS). Organizations will need to specify policies to govern such decisions.

2. Experiment in Migrating to a Framework

Future technical architectures are called on to support a high degree of configurability and scalability. Software being acquired and developed on current contracts (in mutual ignorance) will at least need to coexist, and, to an ever increasing degree, interoperate with each other. In light of these trends, component frameworks seem to be a reasonable technology to pursue, but much learning needs to take place. We are developing expertise to give advice on legacy migration, and we also wish to help our customers design some of their special-purpose software to utilize frameworks. Our situation is probably shared by advanced technology groups in other organizations.

We started small, experimenting with porting legacy software to an existing framework, while simultaneously upgrading a crucial service. This year, we have begun learning how to develop frameworks by developing mini-frameworks for patterns that describe tasks in the database migration process.

Few treatments in the literature [CACM97] actually define frameworks, i.e., give a full list of criteria for recognizing frameworks and excluding other sorts of software. Our definition (elaborated in [RH97]) says that a framework should: be guided by a metaphor, describe obligations on components, be extensible, and be (nearly) a complete application.
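As a rough illustration of these four criteria, the toy below uses a pipeline as its metaphor, an abstract base class as the obligation on components, an add method as the extension point, and an execute method that owns the control flow. It is a sketch under our own assumptions, not the definition elaborated in [RH97]; every name in it (MigrationStep, Pipeline, DropEmpty, UppercaseKeys) is hypothetical.

```python
from abc import ABC, abstractmethod


class MigrationStep(ABC):
    """Obligation on components: each step must transform a list of records."""

    @abstractmethod
    def run(self, records):
        ...


class Pipeline:
    """The (nearly) complete application: it owns control flow; components fill in steps."""

    def __init__(self):
        self.steps = []

    def add(self, step):
        # Extension point: the obligation is checked when a component is plugged in.
        if not isinstance(step, MigrationStep):
            raise TypeError("step must implement MigrationStep")
        self.steps.append(step)
        return self

    def execute(self, records):
        # Feed the output of each step into the next one.
        for step in self.steps:
            records = step.run(records)
        return records


class DropEmpty(MigrationStep):
    """Example component: discard empty records."""

    def run(self, records):
        return [r for r in records if r]


class UppercaseKeys(MigrationStep):
    """Example component: normalize field names to upper case."""

    def run(self, records):
        return [{k.upper(): v for k, v in r.items()} for r in records]


result = (
    Pipeline()
    .add(DropEmpty())
    .add(UppercaseKeys())
    .execute([{"id": 1}, {}, {"id": 2}])
)
print(result)
```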

There seem to be two main questions in migrating our very large legacy systems to component architectures:

How can we use legacy functionality to specify or implement new components?
How can we incrementally substitute new components for legacy systems?

Like corporate conglomerates, DOD has a huge, redundant installed base of software. Each branch has its own systems for each major task (e.g., logistics, personnel). Many differences must persist: the procedure for turning a ship differs from that for turning a regiment. Even so, it frequently becomes necessary to combine multiple systems into one. Often one uses the "best of breed" as the starting point, and migrates to a structure that reuses many of its capabilities. In the future, the "best" may be chosen partly based on friendliness to compositional architecture. Our initial goal, as advisers to large programs, was to gain experience with componentware, and to answer questions like:

Question 2: What methods should be used, and risks expected, in migrating legacy software to a component framework?

We began working with OpenDoc for its elegance, to educate ourselves as component developers. We also did a less thorough study of JavaBeans and ActiveX. We then set up an experiment: port to OpenDoc an existing application that determines conflicts among air routes, and simultaneously see whether it could employ a new, improved map-display service. The experiment was redirected to JavaBeans when development of OpenDoc ceased.

Most of our lessons learned concern migrating legacy software to CORBA, and are described in [DOMIS96, Hughes97]. We also learned a few lessons about migration to component frameworks. For example, since the framework controlled display calls, most of the application's display calls (originally X-Windows) needed to be rewritten. That is, even though the map display service was reasonably modular, the calls to it were sprinkled throughout client applications, so substitution was tough. We also learned the limitations of these frameworks for complex displays (maps) that are composed of layers.
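One common way to ease this kind of substitution is to route client display calls through a single narrow interface, so that the display implementation can be swapped without touching client logic. The sketch below illustrates that idea only; it is not the code from our experiment, and the MapDisplay interface, both implementations, and the route data are hypothetical. The implementations record calls instead of drawing.

```python
from abc import ABC, abstractmethod


class MapDisplay(ABC):
    """Hypothetical narrow interface that client code calls
    instead of calling a concrete display API directly."""

    @abstractmethod
    def draw_route(self, route_id, points):
        ...


class LegacyDisplay(MapDisplay):
    """Stands in for the original display code; records calls rather than drawing."""

    def __init__(self):
        self.log = []

    def draw_route(self, route_id, points):
        self.log.append(f"legacy: route {route_id} with {len(points)} points")


class FrameworkDisplay(MapDisplay):
    """Stands in for a display service controlled by a component framework."""

    def __init__(self):
        self.log = []

    def draw_route(self, route_id, points):
        self.log.append(f"framework: route {route_id} with {len(points)} points")


def show_routes(display, routes):
    """Client logic depends only on MapDisplay, so either display can be passed in."""
    for route_id, points in routes.items():
        display.draw_route(route_id, points)


routes = {"R1": [(0, 0), (1, 1)], "R2": [(0, 1), (1, 0), (2, 2)]}
legacy, replacement = LegacyDisplay(), FrameworkDisplay()
show_routes(legacy, routes)
show_routes(replacement, routes)
print(legacy.log)
print(replacement.log)
```

Retrofitting such an interface onto a legacy application still means finding and rewriting every scattered call once, which is the cost described above; the benefit arrives only with the next substitution.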

3. Learning to Build Mini-frameworks (for Database Migration Problems)

This year, we want to learn how to develop our own frameworks. In the long term, MITRE may do this for Air Force operational tasks. For now, we will experiment with mini-frameworks in a domain we know better: database migration. In this arena, commercial software exists for specific tasks (e.g., extraction), but we know of no general-purpose frameworks to tie these together. We are hoping to perform several experiments in the next few years.

Experiment: Incrementally migrate a system to a new data manager, while keeping the system operational.

In our candidate area (operations planning), systems currently communicate by multicasting formatted messages (akin to Electronic Data Interchange). One day's tentative operational plan is sent out as a formatted message, and other systems read it into their own private databases before processing it. Each transmitted message is, in effect, a read-only database, with a copy sent to each of many systems. The software that builds and parses messages requires a separate system of data administration and expensive maintenance (estimated at $3M in one Air Force planning arena over the next few years). Also, data quality can be poor, due to a lack of data controls.

We would like to impose constraints (both required and advisory), to improve data quality. We would also like to permit a shared representation to be annotated or otherwise updated, with propagation to other copies. The constraint subsystem, the annotation subsystem, and updatability each could be described using design patterns, and each might be a mini-framework or extension to an existing framework.
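The distinction between required and advisory constraints can be sketched as below: a violated required constraint rejects the record, while a violated advisory constraint only produces a warning. This is a minimal illustration under our own assumptions; the Constraint class, the validate function, and the field names (mission_id, altitude_ft, callsign) are hypothetical and are not drawn from any actual message format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Constraint:
    name: str
    check: Callable[[dict], bool]  # returns True when the record satisfies the constraint
    required: bool                 # True: violation rejects; False: violation only warns


def validate(record, constraints):
    """Return (accepted, errors, warnings) for one record."""
    errors, warnings = [], []
    for c in constraints:
        if not c.check(record):
            (errors if c.required else warnings).append(c.name)
    return (not errors, errors, warnings)


# Hypothetical constraints on a plan record.
constraints = [
    Constraint("mission_id present", lambda r: bool(r.get("mission_id")), required=True),
    Constraint("altitude non-negative", lambda r: r.get("altitude_ft", 0) >= 0, required=True),
    Constraint("callsign supplied", lambda r: "callsign" in r, required=False),
]

record = {"mission_id": "M-001", "altitude_ft": 25000}
accepted, errors, warnings = validate(record, constraints)
print(accepted, errors, warnings)
```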

The constraint and annotation capabilities would be easier to provide if one could see the plan as a database. Development is proceeding in that direction. However, it is not feasible to rewrite all applications (US and Allied) simultaneously, so it will be necessary to maintain both the database and the message representations side by side. Since we want the plan to be updatable (at least, annotatable), we will also need to propagate updates to the message representation. Fortunately, there are techniques and design patterns to govern this. Hence migration itself seems a good target for a mini-framework.
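A minimal sketch of keeping the two representations in step follows: the database form is the one that gets updated, and each annotation regenerates the message form and pushes it to subscribers that still consume messages. Everything here is an assumption made for illustration, including the PlanStore class, the slash-delimited message layout, and the field names. The layout assumes that mission ids and routes contain no "/" character.

```python
class PlanStore:
    """Hypothetical plan held as rows keyed by mission id, with a derived message form."""

    def __init__(self, rows):
        self.rows = {r["mission_id"]: dict(r) for r in rows}
        self.listeners = []  # callables that receive the regenerated message text

    def to_message(self):
        """Render the rows as a simple formatted message (the layout is invented)."""
        lines = []
        for mid in sorted(self.rows):
            r = self.rows[mid]
            lines.append(f"MSN/{mid}/{r['route']}/{r.get('note', '')}")
        return "\n".join(lines)

    def annotate(self, mission_id, note):
        """Update the database form, then propagate the regenerated message form."""
        self.rows[mission_id]["note"] = note
        message = self.to_message()
        for listener in self.listeners:
            listener(message)
        return message


def parse_message(text):
    """Inverse of to_message, as a legacy message-based consumer would need it."""
    rows = []
    for line in text.splitlines():
        # Split at most three times so that a note may itself contain "/".
        _, mid, route, note = line.split("/", 3)
        rows.append({"mission_id": mid, "route": route, "note": note})
    return rows


store = PlanStore([
    {"mission_id": "M1", "route": "ALPHA"},
    {"mission_id": "M2", "route": "BRAVO"},
])
received = []                            # stands in for a legacy system's inbox
store.listeners.append(received.append)
store.annotate("M2", "delayed 30 min")
print(received[-1])
print(parse_message(received[-1]))
```

Regenerating the whole message on every annotation is the simplest policy and is plausible only because the messages are small; an incremental scheme would need more machinery.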

We believe that the selected experiment addresses important sore points in this and other DOD systems, sore points painful enough that executives would be willing to bear the risks of new technologies to resolve them. However, we avoid many technical challenges of the general database migration problem [BS95]. 7x24 operation is not required, and efficiency is not paramount. A typical multicast message is small (<10 MB), and updates are currently disallowed. The coupling among the applications, via multicast, is quite loose, so any improvement would be appreciated.

Component software tends to be based on business objects, but legacy systems are based largely on the schemas of implemented databases. We are therefore contemplating an experiment that would use legacy databases and functionality to incrementally provide support for business objects. In particular, the data portions of business objects might be (relatively) easy to extract from the legacy databases.

4. Interim Advice

It will be years before DOD has devised, approved, and mandated complete architectural guidelines for compositional software. The process will include design of such an architecture, provision of the essential supporting tools, formal approvals from appropriate authorities, support contracts, and acceptance by development programs.

In the meantime, billions are being spent on software development. Technology insertion groups like ours have some small leverage to affect current development, especially if suggested actions are low risk, low cost, and (ideally) provide benefits even in the short term. The concerns of this paper are summarized in our final question:

Question 3: What advice can we give now, so tomorrow's new systems will fit better into day-after-tomorrow's compositional software architectures?

References

[BS95] M. L. Brodie, M. Stonebraker, Migrating Legacy Systems: Gateways, Interfaces, and the Incremental Approach, Morgan Kaufmann, 1995.

[CACM97] "Object-Oriented Application Frameworks", Special section of Communications of the ACM, Oct. 1997.

[DISA97] The Common Operating Environment (COE) is described on the Defense Information Systems Agency web pages at http://spider.osfl.disa.mil/dii/.

[DOMIS96] The Distributed Object Management Integration System project investigated CORBA and framework technologies for legacy systems during FY94-96. Web pages at http://www.mitre.org/research/domis/.

[Hughes97] E. Hughes, R. Hyland, S. Litvintchouk, A. Rosenthal, A. Schafer, S. Surer, A Methodology for Migration of Legacy Applications to Distributed Object Management, presented at Enterprise Distributed Object Computing Workshop (EDOC97), Gold Coast, Australia, Oct. 1997.

[RH97] A. Rosenthal and E. Hughes, What is an Application Framework? submitted for publication (soon to be on WWW).

Last Modified: 12/22/97

Rob Hyland <hyland@mitre.org>