1 Project Objectives

1.1 Introduction

Software-based products are rarely without relatives. Over time, products evolve to accommodate specialised demands either by becoming variants of the same product or by becoming separate but similar products. Examined in its entirety, this set of products presents a “family” of related products.

The product line approach, which is based on this insight, dates back to the seminal work of [Parnas 76]. By characterising the commonalities among these products, it provides a methodological basis for finding reusable product components. The essential components of the approach are a feature model to characterise the variabilities among the products and an architecture that reconciles commonalities and variabilities. In recent years it has become a topic of intensive research and of widespread industrial interest and trial, as it offers seductive benefits for industry:

► Reduced cost

► Better quality

► Shorter time-to-market.

The product line approach divides the development of software-based products into two processes: domain engineering and application engineering.

Domain Engineering – Development for Reuse

In domain engineering, the domain or domains relevant to the product line are characterized and the commonality and variability in the affiliated products are abstracted to produce generic domain feature or requirements models, a platform architecture and various ancillary models. These models are used in turn to produce the core assets, which include, among other things, requirements templates, generic software components, test suites and documentation.

The intention is that these core assets be (re)used to accelerate and shorten the development or creation of all the products used in the lifecycle of a software product, including, naturally, the product itself. In order to accomplish this as efficiently as possible, another part of the domain engineering process is dedicated to tracing and documenting relationships between related assets so that the assets appropriate to the task of building a specific product can be quickly identified, understood and used.

Note that the product line models separate commonalities and variabilities. Variability with respect to a particular aspect does not mean that a product is unique within the product family with respect to that aspect. There are groups of products with the same or similar features. Therefore, a third task of domain engineering (after the construction of the assets and their traceability), is the grouping of variants. This grouping is then expressed as parameters with respect to the individual variation points in the feature models, architecture models, etc.

Application Engineering – Development with Reuse

In application engineering, the specific product goal has been defined, and the first task is to trace the relationships to find the appropriate assets. Ideally, the developer should be able to identify the points of modification or development for a new product by comparing the requirements for the product with a feature model, as features are, quite naturally, the aspects that differentiate products from one another. This process of determining, or deriving, the set of components to be assembled by selecting the appropriate features from the set of all features in the product family is often termed product derivation.

The optimal situation is when the (new) product requirements correspond to the variability foreseen in the domain models, as the components that have to be assembled are then already known. The developer need only assemble the components and set the parameters with the methods foreseen by the domain analysis. The process of assembling the components and setting the parameters is termed configuration. In routine configuration the generic software components are not modified. Configuration of slightly or fundamentally modified components is termed innovative or creative configuration, respectively.
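The derivation step described above can be illustrated with a minimal sketch. It assumes a deliberately simplified feature model in which each feature maps directly to a set of components; the feature names, the component names and the `derive_product` helper are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass, field

# Hypothetical feature model: each foreseen feature maps to the
# core-asset components that realise it. All names are illustrative.
FEATURE_TO_COMPONENTS = {
    "alarm": {"alarm_controller", "bell_driver"},
    "logging": {"log_service"},
    "remote_access": {"network_stack", "auth_module"},
}


@dataclass
class Derivation:
    components: set = field(default_factory=set)
    unforeseen: set = field(default_factory=set)

    @property
    def is_routine(self) -> bool:
        # Routine configuration: every requested feature was foreseen in
        # the domain model, so existing components are assembled unchanged.
        return not self.unforeseen


def derive_product(requested_features):
    """Select the components for the requested features (product derivation)."""
    result = Derivation()
    for feature in requested_features:
        if feature in FEATURE_TO_COMPONENTS:
            result.components |= FEATURE_TO_COMPONENTS[feature]
        else:
            # Not covered by the foreseen variability: needs development.
            result.unforeseen.add(feature)
    return result
```

A request such as `derive_product(["alarm", "video"])` would report `video` as unforeseen, signalling that the product cannot be obtained by routine configuration alone.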

1.2 Problem

Note the rationale of the product line approach. By investing in the design and implementation of a platform common to a number of products, the unit cost of development should be reduced in comparison to developing them using ad-hoc reuse. Mastery of this development process is therefore absolutely key to reaping the benefits of the approach.

While in the area of domain engineering, framework methodologies such as feature modelling, use cases, views or architectural patterns have been defined or refined to structure the realisation of product lines, corresponding framework methodologies for application engineering are still scarce.

The basic problems faced by the practitioner are:

► Mapping specific product requirements to the generic requirements foreseen in the platform

► Mapping the generic requirements identified to specific core assets

► Verifying that the set of assets chosen actually fulfills the foreseen purpose

1.3 Approach

This project will apply and adapt configuration methodologies developed in artificial intelligence to the problem of configuration of product lines. These methodologies permit the selection of components and parameter values for a specific product based on a generic product structure.

There are similar approaches such as FORM [Kang 98] or FAST [Weiss & Lai 99] in the product line community, but they lack a well-founded basis, and unlike the AI approaches, have yet to benefit from the experience of application for product derivation. There are also a number of commercial tools that support some or all aspects of the AI approach.

1.4 Objectives

The main objective of the project is to define and validate a product line product derivation methodology that is practicable in industrial application. Sub-objectives relate to the fulfillment of the following criteria / constraints associated with that objective.

► Benefit Feature Models, Artefact Models and Intermediate Representations:
There are two basic types of models. One type, which we term a benefit feature model, specifies the effects of the functionality of a product, e.g. this product sounds an alarm. The other, which we term an artefact model, specifies the components that realise that benefit, e.g. an alarm could be a blinking light or a bell. These two models rarely synchronise entirely. Experience has shown that mixing the two types of models in generic specifications causes maintainability problems as products evolve. The related objective is to define models at the benefit and artefact levels and to define intermediate representations that allow translation from the benefit level to the artefact level.

► Industrial Realism:
Methodologies often sound plausible with intuitively manageable examples. Only when the examples reflect an industrial scale is a methodology truly tested. We do not dispute that methodologies or approaches for product line derivation already exist for manageable situations, but the situations at the industrial partners are not of manageable size, and those methodologies do not scale to them. Our objective is that the methodology be appropriate to an industrial scale.

► Configuration Description Language:
The expressiveness of the notations of conventional CASE tools is insufficient to handle the amount of variability in industrial product lines. Thus another objective of the project is to define a representation of the selected components and parameter values that is readily verifiable.

► Maximise routine configuration:
In practice, this objective has two aspects. Firstly, the objective is that, as early as possible, those products that can be developed quickly and with little effort be differentiated from those which require development effort. This has significant impact on reducing the number of missed delivery dates and cost-overruns. Secondly, the objective is ultimately to offer the sales force support in determining where the boundaries lie in the product specifications, so that customers can be steered to specifying products that can be built “off the shelf” or can explicitly choose to have a customised product.
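The separation of benefit and artefact levels named in the first objective above can be sketched as follows. This is only an illustration under simplifying assumptions: both models are reduced to plain sets, the intermediate representation is a single mapping, and all identifiers (`BENEFITS`, `ARTEFACTS`, `REALISED_BY`, `check_mapping`, `artefact_options`) are hypothetical. The alarm example is taken from the text.

```python
# Benefit level: effects the product has for its user (hypothetical names).
BENEFITS = {"raises_alarm"}

# Artefact level: components that can be built into a product.
ARTEFACTS = {"bell", "blinking_light"}

# Intermediate representation: which artefacts can realise which benefit.
REALISED_BY = {"raises_alarm": {"bell", "blinking_light"}}


def check_mapping(benefits, artefacts, realised_by):
    """Return a list of problems; an empty list means the mapping is sound."""
    problems = []
    for benefit in sorted(benefits):
        if not realised_by.get(benefit):
            problems.append(f"benefit '{benefit}' has no realising artefact")
    for benefit, options in sorted(realised_by.items()):
        if benefit not in benefits:
            problems.append(f"unknown benefit '{benefit}'")
        for artefact in sorted(options - artefacts):
            problems.append(f"unknown artefact '{artefact}'")
    return problems


def artefact_options(benefit):
    """Translate one benefit-level feature into its artefact-level options."""
    return sorted(REALISED_BY[benefit])
```

Keeping the mapping as a separate structure means that either model can evolve on its own, and a check such as `check_mapping` can flag where the two have drifted apart.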

2 List of Participants

Partic. Role*	Partic. no.	Participant name	Participant short name	Country	Date enter project**	Date exit project**
C	CO1	Robert Bosch GmbH	Bosch	D	Start of Project	End of Project
P	Thales	Thales Nederland B.V.	Thales	NL	Start of Project	End of Project
P	RuG	University of Groningen	RuG	NL	Start of Project	End of Project
P	LKI	Universitaet Hamburg	LKI	D	Start of Project	End of Project

3 Contribution to the programme/Key Action objectives

Key Action IV: Essential technologies and infrastructures

Action Line IV.3.1: Software architecture

The main objective of the project is to develop methods for supporting the design and configuration process of software product lines. This is done by considering applications from the automotive as well as the defence command and control domains. These application domains have in common:

► the variability of requirements, features, components,

► the need to derive quality product lines,

► and a technical nature, allowing a high degree of descriptive modelling.

Thus, the methodologies developed will apply within the context spanned by these aspects.

The methodologies are generic, i.e. can be applied to any technical domain in that context. This is done by making a clear distinction between configuration related inference processes and declarative domain knowledge. The inference processes are common for each domain and application while the specific domain knowledge and data can be changed from domain to domain. Thus, the new technology will be applicable to more than one application, instead of focussing on a single product design.

The basic methodologies being considered in the project are product line and configuration. The project’s emphasis lies on the combination of these methods in a coherent methodology and architecture. Since configuration technology is already relatively mature, tools already exist and can be used for fast prototyping to demonstrate possible applications quickly. Furthermore, the combination of methods will enhance the applicability of the technologies, as well as the trust and confidence in that technology.

The project methodology will yield software architectures in which control structures and inference algorithms are inherent in the configuration system and need not be programmed explicitly by application modellers. Instead, domain modellers will develop declarative descriptions of knowledge, using description languages for modelling ontologies as well as control knowledge.

Software and hardware artefacts will be described in knowledge bases. Thus, architecting with such components and specifying the structural variability of systems with semantically clear languages will be ensured. Furthermore representing application requirements and using them to infer artefacts ensures that the final software systems will meet those requirements.

Relations between functional and non-functional aspects are handled by reconciling those aspects with user-defined requirements and features. Thus, not only are components assembled, but conflicting requirements are also resolved a priori. By this means, standardised complex reusable functions can be integrated, yielding architecture-based engineering.

Summarizing, the results of the project will improve the development of integrated hardware/software systems for distinct devices used by consumers or workers.

4 Innovation

Rapid development of high-quality software products with little effort is a long-standing ambition of the software engineering community. Software reuse has been and still is considered to be the most promising approach to achieving this ambition. Several paradigms have claimed to be the silver bullet for achieving high degrees of reuse, with the object-oriented paradigm being most explicit about it. One of the successful projects in this respect was the REBOOT project [Karlsson 95]. A highly useful distinction made in that project is the difference between design for reuse and design with reuse.

Although the two types of activities are intrinsically related, our position is that most research in this field has focussed on design for reuse, making claims about design with reuse without actually providing any evidence that the proposals actually provide the claimed benefits.

4.1 Positioning with respect to Product Line and Software Architecture

In addition to the object-oriented frameworks put forward as a reuse approach, work on domain analysis, especially focussed on the use of features has taken place. The first approach using features in the context of software reuse can be found in the FODA method discussed in [Kang et al. 90]. In this domain analysis method, feature graphs play a central role. The FORM method presented in [Kang 98] can be seen as an elaboration of this method. In FORM, feature diagrams are recognized as a tool for identifying commonality between products. We take the point of view that it is more important to identify the variability between architectures than to identify the commonalities since the goal of developing a software product line is to be able to vary the resulting system. In order to do that, the system has to be flexible enough to support the changes. The FORM method uses four layers to classify features (capability, operating environment, domain technology and implementation technique).

Later work, e.g. [Griss et al. 98], identified that features are suitable for defining commonalities and variabilities, but that other representations are to be preferred for the software artefacts. [Griss et al. 98] employed concepts from FODA and combined them with use-case modelling. In [Griss 2000] the feature graph notation is used as an important asset in a method for implementing software product lines. [Gurp et al. 01] extended this notation, aside from graphical differences, with, among others, the notion of external features and the explicit specification of earliest binding time.

Generally, approaches that have included consideration of the application engineering process have assumed a deep and encompassing understanding of the software components that have to be assembled, such as the mix-in approaches developed by Batory [Batory & O’Malley 92], or have developed Domain Specific Languages (DSL) [Weiss & Lai 99] [Basset, Frames], which are based on concepts found in the domain.

In recent years, the notion of software product families has emerged as a promising approach to intra-organizational software reuse. Software product family engineering also considers domain engineering (design for reuse) and application engineering (design with reuse) as independent activities. Outside Europe, the primary location where product line engineering is studied is the Software Engineering Institute (SEI) at Carnegie Mellon University. The approach proposed by SEI consists of a substantial number (close to 30) of practice areas organized in software engineering practice areas, technical management practice areas and organizational management practice areas. We refer to [Clements & Northrop 01] for a more detailed description.

Over the last decade, a number of European research projects have studied different aspects of software product lines: among others, the ARES [ARES] and PRAISE [PRAISE] projects, funded by the European Commission, and ESAPS [ESAPS] and CAFÉ [CAFÉ], nationally funded projects through the Eureka/ITEA initiative.

The ARES project focussed on the software architecture for the product line, in particular, an architectural framework for dealing with variation, description for product families and dealing with quality requirements, e.g. real-time issues in a product family. The PRAISE project primarily addressed the different processes surrounding product line centric software development, defining the product line scope, variability and commonality within the product line and traceability between assets in a product line. One of the major findings of the PRAISE project was that current CASE tools are woefully inadequate for managing the interplay between commonality and variability endemic in product lines. This was not a problem of implementation: their underlying methodology lacked the necessary expressiveness. The ESAPS project built upon the ideas of ARES and PRAISE. The project addressed the process-related aspects, but used an explicit architecture that involves a component-based design. Topics explicitly addressed in ESAPS included architecture analysis and verification, including aspect analysis for system families, domain analysis, improvement of development processes for system families, and building assets such as reference architecture, platform and components. The CAFÉ project started in summer 2001 and extends the ESAPS work by explicitly addressing the aspects related to organisation. It assumes that an explicit architecture and process are in place and, within that context, it studies, among others, adoption of a system-family approach in the organisation, asset management, validation, and testing of the assets produced.

4.2 Positioning with respect to Configuration within Artificial Intelligence

The configuration of technical systems is one of the most successful application areas of knowledge-based systems. Note that most tasks from the principle, variant and adaptive construction areas can be equated with configuration. [Günter & Kühn 99] made a general analysis of configuration problems in which four central aspects (of knowledge types) concerned with configuration tasks were identified:

► A set of objects in the application domain and their properties (parameters).

► A set of relations between the domain objects. Taxonomical and compositional relations are of particular importance for configuration.

► A task specification (configuration objectives) that specifies the demands a created configuration must fulfil.

► Control knowledge about the configuration process.

In the so-called structure-based approach, a compositional, hierarchical structure of the domain objects serves as a guideline for controlling the solution process. The constraint-based approach consists of representing restrictions between objects or their properties with constraints and evaluating these by constraint propagation. This approach does not conflict with the structure-based approach but is frequently combined with it. Other approaches are resource-based and case-based configuration. Concepts used in the area of configuration have well-defined, system-independent semantics, which are manifested in implementations termed configuration systems (CS). Such systems provide a more formal notion of consistency and completeness than software configuration management (SCM) systems [Männistö et al. 01]. For instance, the consistency of the hierarchy is well-defined and will be checked by a specific module of a CS. Constraint solving uses methods that have been proven correct, and property values of components can be inferred. The control mechanism (fourth point above) ensures that all open issues (e.g. properties, parts) of the configuration are handled by the configuration process. All these modules of a CS are general and thus domain-independent. A domain-specific configuration system can be obtained by implementing a domain-specific application user interface over the domain-specific configuration model. Such a user interface maps those kinds of knowledge and inference processes to interface artefacts that are suitable for the domain-specific configuration process being supported. For example, in the case of software development a user interface may consist of presentation tools for software objects using the concepts mentioned, presenting the evolutionary process etc. A CS maps the configuration process to an operational computer-supported process. Thus, a CS provides a generic way of describing and (automatically) determining configurations in distinct domains, as far as traditional domains (traditional for configuration systems) such as hardware components are concerned.
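To make the constraint-based approach more concrete, the following is a minimal sketch of constraint propagation over component properties. It is not the algorithm of any particular configuration system: it assumes binary constraints given as predicates and simply prunes unsupported values until a fixed point is reached. The property names and the `propagate` function are hypothetical.

```python
def propagate(domains, constraints):
    """Prune values that have no supporting partner until nothing changes.

    domains:     dict mapping a property name to its set of possible values
    constraints: list of (prop_a, prop_b, predicate) triples, where
                 predicate(value_a, value_b) is True for allowed pairs
    """
    domains = {prop: set(values) for prop, values in domains.items()}
    changed = True
    while changed:
        changed = False
        for a, b, allowed in constraints:
            keep_a = {x for x in domains[a]
                      if any(allowed(x, y) for y in domains[b])}
            keep_b = {y for y in domains[b]
                      if any(allowed(x, y) for x in keep_a)}
            if keep_a != domains[a] or keep_b != domains[b]:
                domains[a], domains[b] = keep_a, keep_b
                changed = True
    return domains


# Illustrative configuration model (all names and values are assumptions).
CONSTRAINTS = [
    # A colour display requires the large CPU.
    ("display", "cpu", lambda d, c: d != "colour" or c == "large"),
    # The large CPU requires at least 256 MB of memory.
    ("cpu", "memory_mb", lambda c, m: c != "large" or m >= 256),
]

# The user has already chosen the colour display; the rest is still open.
selection = {
    "display": {"colour"},
    "cpu": {"small", "large"},
    "memory_mb": {64, 256},
}

result = propagate(selection, CONSTRAINTS)
```

In this sketch, fixing the display to `colour` lets the remaining property values be inferred rather than chosen by hand, and an empty domain after propagation would indicate an inconsistent selection.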

For software product lines, which are the objects of our attention, the non-static aspect of running software is important. Research on these aspects is just beginning [Männistö 98] [Frühauf et al 99], and they must be thoroughly examined. The main challenge is to understand the specific concepts of the software development process in terms of the general concepts of the logic-based configuration terminology.

4.3 Positioning with respect to Software Configuration Management

Conventional software configuration management systems (SCM) focus on controlling the whole evolutionary process of software system development. This process is seen as a continuous process which does not stop as long as the software is in use. To support this constantly changing process, more elaborate SCM systems have been developed. A number of SCM systems support the main concepts of SCM to a greater or lesser degree. These concepts are:

► The representation of software objects as atoms by giving them a rudimentary name and a not further specified content "description" (e.g. "program code", "documentation", "test result" etc.), or as configurations by enumerating independently changing atoms or other configurations.

► Version control examines how interim products are produced in the course of product development and how development of different aspects of the product can proceed in parallel.

► Change control examines how changes to the software objects are more or less formally described by giving information about the change like the objective of the change, the state of the change (e.g. open, rejected) etc.

► Process control supports the whole software development process, by describing the process in terms of “completion, acceptance, integration & test, takeover”.

► Distribution is related to distributed development of software by developers at different locations.

In considering this project in relation to conventional configuration management, note that product line engineering differs from conventional software engineering in its emphasis on a feature model that expresses the common and variable aspects of the capabilities of product line products and a platform architecture that identifies and separates common and variable components. In product line engineering, a specific configuration is selected by traversing the feature tree.

The make file in conventional configuration management systems, such as ClearCase, automates the product generation process by using dependency and implication rules to generate the components and ensures that all components are built from up-to-date source while minimising unnecessary build steps. The make procedures are triggered from “targets” which specify a product to be built. Note that the targets are static. Theoretically, in order to generate all members of a product family, each member would have to be entered as a separate target.
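The consequence of static targets can be illustrated with a small sketch. It assumes, purely for illustration, that the variation points are independent and that each family member would need its own named target; the variation points, their values and the `enumerate_targets` helper are hypothetical and do not describe the behaviour of any specific SCM tool.

```python
from itertools import product


def enumerate_targets(variation_points):
    """Yield one static target name per family member.

    variation_points: dict mapping a variation point to its possible values.
    Assumes the variation points are independent of one another.
    """
    names = sorted(variation_points)
    for choice in product(*(variation_points[name] for name in names)):
        yield "-".join(f"{name}_{value}" for name, value in zip(names, choice))


# Hypothetical product family with three variation points.
VARIATION_POINTS = {
    "display": ["mono", "colour"],
    "language": ["de", "en", "nl"],
    "alarm": ["bell", "light"],
}

targets = list(enumerate_targets(VARIATION_POINTS))
```

Even this small family yields 2 × 3 × 2 = 12 separate targets, and every additional variation point multiplies the number again, which is why enumerating family members as static targets does not scale.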

The system Adele [Estublier, et al., 94], (see also the discussion in [Dart, 91]) is based on an entity relationship database with more elaborate data modeling capabilities than those offered by file lists. However, drawbacks of such a system are: non-object-oriented modeling, incomplete support of attributes (the user has to manage some declared attributes instead of the system), and the underlying semantics are not system independent, e.g. logic-based. Also the inference methods (based on so-called interfaces) are only centered around attributes and relations, not around software objects as a whole. However, it is possible with Adele to capture the evolution of all architectural elements in a single system model. Similar systems are the tools developed in Proteus [Tryggeseth et al., 95] or ShapeTools [Kuusela et al., 99] which also include modeling capabilities, but do not support system level modeling [van der Hoek et al. 01]. As such, SCM systems are specific systems dealing with software products oriented at the practical needs of software developers. In Proteus, development decisions are supported but in general not enforced by the SCM tools. This means in effect that requirements and constraints are not met based on formal inferences but as a result of an informal user-based process.

Conventional configuration management systems offer the ability to express some amount of the variability through versioning mechanisms. These mechanisms sometimes allow the embedding of editor commands in source files to derive a particular version from a common core, somewhat similar to Basset’s framing techniques. The difference, however, is that, as the term versioning implies, the differences cascade from the earliest version, in this case the core commonality, to the latest, which in this case is a specific variant. The branching mechanism allows for the selection of different variants.

Summarising, conventional configuration management tools are intended to aid the production of single products, or products with limited and specifiable variability. Although they offer some support for configuration, using them for families of products, which developers are currently forced to do, quickly leads to considerable adaptation effort through support utilities. Over and above this, there is no support for checking the consistency of configurations or for using inference to generate parts of the configuration. For these reasons the configuration facilities of conventional configuration management tools are best termed “vestigial”.

4.4 Integrative and Innovative Aspects of ConIPF

The research results discussed above share as a common characteristic that their main focus is on design for reuse, i.e. domain engineering, based on assumptions about application engineering. This can lead to the quite unsatisfactory situation where effort invested during domain engineering cannot be recouped during application or product engineering because it was based on wrong assumptions.

We make the distinction between three types of product derivation in software product families: routine, innovative and creative derivation. Routine derivation is concerned with deriving a standard product and typically requires little effort. Innovative derivation addresses more complicated or deviating products that require more effort. Creative derivation creates novel products that exploit the existing functionality in ways not intended by the original developers and typically requires substantial effort from the organization. Since in virtually all organizations exploiting software product families the number of staff concerned with application or product engineering by far exceeds the number of staff working on domain engineering, we believe research has to start from the perspective of product engineering, i.e. design with reuse.

Through our extensive experience with product-line based development and evolution we have come to realize that one of the major challenges is the derivation of individual members of the product line. Most of the activities in the aforementioned projects address the domain engineering phase, i.e. the phase during which the software artefacts are developed from which products can be derived. Although these artefacts are developed with the intention to be reused in individual products, we have learned that too little attention has been paid to application engineering during which product derivation takes place. We therefore identify the following issues which our approach will resolve:

► Dependencies between features: Typically, feature models are used as a means to communicate between marketing and development. These feature models are logical descriptions of the functionality that a product may provide. However, dependencies between features in the system are not visible. Consequently, when deriving new products or planning the evolution of existing products, decisions may be taken that can only be implemented with substantial effort.

► Dependencies between variation points: The necessary variability is implemented in the software product line artefacts through so-called variation points. However, typically the implementation of these variation points is not fully orthogonal. Thus, dependencies between the variation points also exist. The problem is that these dependencies have no first-class, explicit representation. This typically leads to substantial overhead due to inconsistencies during the derivation that surface late in the development process and require repair with substantial effort.

► Evolution of realisation: In the DSL (Domain Specific Language) approaches, the language is usually defined through “concept mining” from a domain dictionary of terms. These methodologies do not provide specific differentiation between declarative knowledge and inference-based knowledge thus leaving it to the practitioner to either recognise the root problem or compensate for it with semantics. This ultimately leads to maintainability problems as the systems evolve and new implementations or combinations of functionality smear the boundaries in the descriptive parts of the languages.

► Overwhelming complexity: The increasing innate complexity of software-intensive systems, combined with the complexity inherent in a large number of variations, is taxing the abilities of the best engineers. Strategies to divide and conquer by creating specialists for specific problem sub-domains are foundering on the subsequent integration problems.

► Traceability of features: A subsequent problem, due to the difficulty of visualising variation points, is the fact that features are difficult to trace from the feature model to the software artefacts that implement the variant feature.

► The difference between software and hardware as artefacts: The derivation of software and hardware artefacts of a product requires extensive modelling of and inference about artefacts. The difference lies in the dynamic nature of running software compared to the static nature of hardware components. This non-static aspect must be taken into account when configuring product lines.

► Software as subject of configuration methodologies: ConIPF will apply methods proven in the field of configuration to software product lines. Doing so would be highly innovative for the configuration community, which has not yet dealt with software configuration in a formal way.

► Maximizing routine derivation: In product line application engineering, all three types of derivation, routine, innovative and creative, occur. Routine derivation is the most cost-effective form and we therefore intend to maximise routine derivation.

► Using, improving, and applying configuration methods for software configuration: Existing configuration methods deal mainly with component based artefacts. They are mostly applied in industrial applications e.g. in production engineering. Configuration methods’ generic nature may allow them to be applied to software engineering. For instance, additional mechanisms like behaviour analysis, model-based simulation, or verification techniques might be incorporated in software configuration methods. Thus, further innovations will result in techniques for configuration methods as well as software configuration.

► Applying description logics to software configuration: Description logics give a formalised approach to representing and inferring configurations. In the project these techniques will be applied to software configuration, thus gaining the advantages which they offer: automatic assessment of the correctness and completeness of the configured software.

► Development of a configuration description language: The knowledge types necessary to support inference will be identified in the analysis of the software product line domain. A configuration description language which includes aspects of software is a further innovation. Findings about how usable known representation formalisms such as UML are for modelling and inferring about software product lines will also be documented.

The contribution of this proposal is that it addresses the aforementioned problems and issues, and advances solutions.