CORDIS - EU research results

Open Source Software Reuse Service for SMEs

Final Report Summary - OPEN-SME (Open Source Software Reuse Service for SMEs)

Executive summary:

The main idea of OPEN-SME is to introduce a reuse service operated by SME Association Groups (AGs) on behalf of their SME software development members. Software experts of the SME AGs will run this service: they will produce components from OSS projects, test them, generate documentation, resolve licensing issues, etc., asynchronously to and independently of application development by the SMEs. The components will belong to domains that are relevant to the SMEs, so when the SMEs want to reuse them, the components will already be available.

The OPEN-SME project collectively provides two processes and three tools, namely:
1.The Reuse-Oriented Domain Engineering (RODE) process.
2.The Application Engineering process.
3.The OCEAN tool. OCEAN is a meta-search engine that queries existing OSS code search engines.
4.The COPE tool. COPE is a tool for extracting, testing, documenting and packaging software components originating from OSS projects.
5.The COMPARE tool. COMPARE is a repository for storing the extracted component packages and delivering them to SMEs.

Project Context and Objectives:
Overview

Open Source Software (OSS) reuse has the potential to improve software quality, shorten time-to-market and bring competitive advantages to software development small and medium-sized enterprises (SMEs).

However, currently OSS reuse is restricted to:
-Whole OSS projects (e.g. Apache web server, MySQL Database)
-Opportunistic reuse of isolated classes (i.e. copy-paste-adapt reuse).
-Well-known selected infrastructure components (e.g. Apache Commons)

The OPEN-SME proposal is to extend the landscape of OSS reuse to domain-specific components extracted from arbitrary OSS projects. Achieving this goal, however, involves a number of challenges:
-Valuable OSS components exist in every OSS project. However it is difficult to recognize them, extract them, test them, document them etc.
-During software development, usually there is no time for the aforementioned activities. Developers often prefer to develop new code from scratch although this code has been written before many times by many others.
-Even when developers recognize the opportunity to reuse OSS code, they face uncertainties about the functionality and quality it provides:
-What exactly does the component do?
-How well does it do it?

OCEAN
The Open-Source Search Engine (OCEAN) is a meta-search engine that provides unified access to existing Open Source Software (OSS) search engines. This allows the reuse engineer to find open source software assets (i.e. projects, packages, files, etc.) that satisfy certain criteria, such as being written in a specific programming language, containing certain keywords or having a specific license. Moreover, it allows the re-user to identify a software asset of value and place an order with the reuse engineer to adapt that specific asset. OCEAN is a web portal (see http://ocean.gnomon.com.gr/web/guest/home , root/test online) mainly used for locating and browsing open-source files and projects that are available on popular open-source search engines. OCEAN is extensible and can incorporate any available open-source search engine regardless of the integration strategy: an arbitrary search engine can be integrated in any way possible (e.g. use of a provided API, web scraping, etc.).
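
The meta-search idea can be illustrated with a minimal sketch. It assumes a hypothetical SearchEngineAdapter interface with one adapter per search engine; all class, field and function names are illustrative and are not taken from the OCEAN code base. The in-memory engines stand in for real back ends, which would call an API or scrape result pages.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """A software asset (project, package, file) reported by a search engine."""
    name: str
    language: str
    license_id: str
    source: str  # name of the engine that reported the asset


class SearchEngineAdapter(ABC):
    """Hypothetical integration point: one adapter per OSS search engine."""

    @abstractmethod
    def search(self, keywords: str) -> list[Asset]:
        ...


class InMemoryEngine(SearchEngineAdapter):
    """Stand-in back end; a real adapter would use an API or web scraping."""

    def __init__(self, assets: list[Asset]):
        self._assets = assets

    def search(self, keywords: str) -> list[Asset]:
        terms = keywords.lower().split()
        return [a for a in self._assets if all(t in a.name.lower() for t in terms)]


def meta_search(engines, keywords, language=None, license_id=None):
    """Query every engine, drop duplicates, then apply the optional filters."""
    seen, merged = set(), []
    for engine in engines:
        for asset in engine.search(keywords):
            key = (asset.name, asset.language, asset.license_id)
            if key not in seen:  # keep the first engine's copy of a duplicate
                seen.add(key)
                merged.append(asset)
    return [
        a for a in merged
        if (language is None or a.language == language)
        and (license_id is None or a.license_id == license_id)
    ]


engine_a = InMemoryEngine([
    Asset("json parser", "Java", "Apache-2.0", "A"),
    Asset("xml parser", "Java", "GPL-3.0", "A"),
])
engine_b = InMemoryEngine([
    Asset("json parser", "Java", "Apache-2.0", "B"),
    Asset("fast parser", "C", "MIT", "B"),
])
results = meta_search([engine_a, engine_b], "parser", language="Java")
for asset in results:
    print(asset.name, asset.license_id, asset.source)
```

Keeping the integration strategy behind a single search method is what lets a new engine be added without touching the merging and filtering logic.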

COPE
The Component Adaptation Environment (COPE) is a tool-chain that provides an environment for the enactment of the domain engineering process of OPEN-SME, thus allowing the reuse engineer to produce reusable components for the domain(s) of interest (see http://opensme.eu/deliverables/86-deliverable-d32b , trialsuser/opensmeuser online). COPE is a desktop application that supports the following tasks to achieve this result:
-Identify and model primary concepts of the domain (using: Knowledge Manager)
-Analyse the different aspects (using: Static Analysis, Design-pattern Analysis, etc.) of an Open-Source project
-Comprehend the project (using: the outcome of the Analysis, Documentation Generation, in-project Search)
-Detect candidate components (using: the outcomes of the project Analysis (ii) and project Comprehension (iii) )
-Generate components (using: the various Component Makers)
-Validate them (using: Dynamic Analysis)
-Classify the produced component (using: Knowledge Manager)
-Upload the Component to COMPARE component repository and search engine.
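
The candidate detection and component generation tasks above can be illustrated with a small sketch. It assumes, purely for illustration, that static analysis yields a class dependency graph and that a component is a class together with the transitive closure of the classes it depends on; the source does not state that the COPE Component Makers work this way, and all names below are hypothetical.

```python
def dependency_closure(graph, root):
    """Return root plus every class reachable from it via dependencies."""
    component, stack = set(), [root]
    while stack:
        cls = stack.pop()
        if cls in component:
            continue  # already visited, also protects against cycles
        component.add(cls)
        stack.extend(graph.get(cls, ()))
    return component


def detect_candidates(graph, max_size):
    """Keep classes whose closure is small enough to extract as a component."""
    candidates = {}
    for cls in sorted(graph):
        closure = dependency_closure(graph, cls)
        if len(closure) <= max_size:
            candidates[cls] = closure
    return candidates


# Hypothetical outcome of static analysis: class -> classes it depends on.
project = {
    "HttpClient": {"Socket", "UrlParser"},
    "UrlParser": {"StringUtils"},
    "StringUtils": set(),
    "Socket": {"Buffer"},
    "Buffer": set(),
    "App": {"HttpClient", "Gui"},
    "Gui": {"App"},
}

candidates = detect_candidates(project, max_size=2)
for root, classes in candidates.items():
    print(root, sorted(classes))
```

Small, self-contained closures such as UrlParser with StringUtils are cheap to extract, whereas App drags in the whole project and is therefore rejected by the size threshold.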

COMPARE
The Component Repository and Search Engine (COMPARE) is a web portal (see http://www.teletel-projects.net/compare , demo/1234 online) that allows SME software re-users to search for and discover software artefacts, technical documents and test suites related to open-source software. In addition, it allows the stakeholders of the Domain Engineering Process (reuse engineers, domain experts, component testers and certifiers) to manage and maintain the assets stored in the repository. End-users benefit from the advanced filtering capabilities and from access to information about the verification and certification attributes of a component. Finally, it provides a communication mechanism between the re-users and the reuse engineers.
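
As a rough sketch of the filtering described above, the following in-memory repository stores component packages and filters them by free text and by attribute values such as domain or certification status. The ComponentPackage fields and the Repository API are assumptions made for this example and do not reflect the actual COMPARE data model.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentPackage:
    """Hypothetical metadata record for a stored component."""
    name: str
    domain: str
    license_id: str
    validated: bool = False
    certified: bool = False
    artefacts: list = field(default_factory=list)  # e.g. docs, test suites


class Repository:
    def __init__(self):
        self._packages = {}

    def upload(self, package):
        """Store a package, replacing any earlier version with the same name."""
        self._packages[package.name] = package

    def search(self, text="", **criteria):
        """Return packages whose name contains text and whose attributes
        equal every given criterion, sorted by name."""
        hits = []
        for package in self._packages.values():
            if text.lower() not in package.name.lower():
                continue
            if all(getattr(package, key) == value for key, value in criteria.items()):
                hits.append(package)
        return sorted(hits, key=lambda p: p.name)


repo = Repository()
repo.upload(ComponentPackage("iban-validator", "banking", "LGPL-2.1", True, True))
repo.upload(ComponentPackage("ledger-core", "banking", "GPL-3.0", True, False))
repo.upload(ComponentPackage("can-bus-driver", "automotive", "MIT", True, True))

certified_banking = repo.search(domain="banking", certified=True)
print([p.name for p in certified_banking])
```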

System Architecture as a whole
The system architecture is based on a decentralized topology, chosen so that end-users perceive the system as robust and fault-tolerant. If one of the servers malfunctions, the remaining functionality of the system stays available, and users can still perform their tasks without being affected by a malfunction that is unrelated to what they are doing. Moreover, this architecture makes the evolution of the services independent of each other, which is both desirable and necessary. It is desirable not only for robustness and fault tolerance but also for tracking and maintenance. It is necessary because at any moment during the trials the end-users may require additions or enhancements in order to use the services successfully, so the services should be easily maintainable and thus independent of each other. Nevertheless, these services can be hosted on a single physical server and thus do not impose additional costs on the SME AGs.
Project Results:

The main idea of OPEN-SME is to introduce a reuse service operated by SME Association Groups (AGs) (e.g. Greek Association of Computer Engineers, Vasteras Science Park etc.) on behalf of their SME software development members. Software experts of the SME AGs will run this service: they will produce components from OSS projects, test them, generate documentation, resolve licensing issues, etc., asynchronously to and independently of application development by the SMEs. The components will belong to domains that are relevant to the SMEs, so when the SMEs want to reuse them, the components will already be available.

The OPEN-SME project collectively provides two processes and three tools that we will describe in some detail in the following sections:
Domain Engineering Process (RODE)
In this section we discuss a domain engineering process for the creation of domain models based on existing OSS projects. We believe OSS projects provide not only a quality alternative to commercial software but also a knowledge resource that can be exploited to develop the domain knowledge required for domain engineering. Domain engineering methods invariably propose the use of so-called exemplar projects, that is, existing projects used during domain analysis and design. We propose a domain engineering process that uses OSS projects as exemplars during all phases of domain engineering, including the domain implementation phase, in which existing OSS components are reused for the partial implementation of the domain artifacts. The process is suitable for Small and Medium Enterprises (SMEs) that have limited resources and a limited portfolio of owned projects, and that therefore have difficulty applying systematic reuse methods based on domain engineering approaches.
Introduction
Systematic software reuse is divided into a) activities and/or processes related to building reusable assets, referred to as domain engineering processes or methodologies, and b) activities and/or processes related to reusing these assets in the context of software application development, referred to as application engineering processes. The authors in [1] define domain engineering as “the set of activities involved in developing reusable assets across an entire application domain, or family of applications”. In domain engineering a number of applications belonging to a specific domain are identified, and their similarities and variabilities are analysed in order to produce a domain model. Thereafter the model is designed and implemented, and concrete artefacts of the implemented model are produced to be reused in a number of applications.

RODE: A domain engineering process based on OSS projects

The RODE process comprises distinct phases. Each phase is performed only once, with the exception of the Evolution Phase. In the RODE process, we build all the necessary tools, reusable assets, artefacts, documents, models, etc. until they reach a level of maturity that allows SME AGs to provide the OPEN-SME services and to perform a continuous, ongoing evolution of the assets.

Process Definition Phase
This phase aims at organizing the usage of resources and the way the process as a whole will be carried out. In this phase the reuse Engineer should create, document and execute a domain engineering plan including standards, methods, activities, assignments, and responsibilities for performing domain engineering including the candidate stakeholders. S/he will also select any additional representation forms to be used for the domain models.

Process Configuration Phase
The purpose of this phase is to configure (if necessary) the process itself to address the specificities of the domain of interest by performing the following activities:

1.OSS Search Engine selection: Refers to the selection of the most suitable Open Source Software Search Engines for the domain of interest. Selected engines will be the only ones used in order to discover OSS Projects.
2.OSS Search Engine Integration plan: In this (optional) activity the reuse engineer decides whether any OSS search engine, identified in the 'OSS Search Engines selection' activity, should be integrated into OCEAN tool or used 'as-is'. The reuse engineer should design the integration of the OSS search engine into OCEAN, or design how the results of the OSS Search Engines can be exploited by COPE, respectively.
3.OSS Search Engine Integration: In this (optional) activity the reuse engineer implements either the integration of the selected OSS search engines into OCEAN or the process and tool, if required, to exploit the search engine externally.
4.Tool selection: The reuse engineer selects any additional tools that might be necessary for the implementation of the Domain Engineering Process and/or for instantiation of COPE. For the selection of the most appropriate tools, the reuse engineer can use a decision analysis method.
5.Tool Integration plan: In this (optional) activity the reuse engineer decides whether any additional tools should be used independently or integrated into COPE. The reuse engineer should design how the results of the additional tools can be exploited by COPE or design the integration of the additional tools into COPE, respectively.
6.Tool Integration: In this (optional) activity the reuse engineer implements the integration of any additional tools with COPE (resulting in a new instance of COPE) or the process and tool, if required, to exploit the assets produced by the specific tool externally.

Domain Analysis Phase
In this phase the reuse Engineer has to analyze the domain(s) of interest by performing the following activities:
1.Domain Boundary Definition: The reuse engineer, assisted by the domain expert, should define the boundaries of the domain.
2.Primary Concept Identification and Modelling: In this iterative activity, the reuse engineer while analyzing the domain of interest identifies primary concepts of the domain and models them in the Ontology provided by the Knowledge Manager of COPE.
3.Exemplar Selection Plan: In this activity the reuse Engineer should create and document in which way the exemplars will be selected. S/he should identify and document the criteria, as well as their relative importance by which an exemplar is more suitable to be selected for reuse. These should include functional, technical, business criteria. Finally, s/he should estimate the number of exemplars required to cover the primary concepts.

Domain Design Phase

In this phase the reuse Engineer selects exemplar projects for the domain of interest and validates whether they are within the domain scope by performing the following activities:

1.Exemplar Selection: In this activity the reuse engineer executes the Exemplar Selection Plan and discovers, selects and retrieves the most representative OSS projects to be used as exemplars. Based on the criteria defined in the Exemplar Selection Plan, s/he evaluates them using a decision analysis method and selects the most appropriate ones.
2.Domain Validation: While the reuse engineer searches for exemplars, the domain expert should validate whether the exemplars fall outside the domain boundaries or whether the domain boundaries are too strict. In either case, the reuse engineer can exclude the exemplar or modify the domain boundaries at his/her discretion.
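
The source does not name a specific decision analysis method. As one possible illustration, the sketch below ranks candidate exemplars with a simple weighted sum over the functional, technical and business criteria of the Exemplar Selection Plan. The weights, the 1 to 5 rating scale and the project names are invented for the example.

```python
def rank_exemplars(candidates, weights):
    """Rank candidates by the weighted average of their criterion ratings."""
    total_weight = sum(weights.values())
    scores = {
        name: sum(weights[c] * ratings[c] for c in weights) / total_weight
        for name, ratings in candidates.items()
    }
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Relative importance of each criterion (hypothetical values).
weights = {"functional": 5, "technical": 3, "business": 2}

# Ratings on a 1-5 scale for three hypothetical OSS projects.
candidates = {
    "project-a": {"functional": 5, "technical": 2, "business": 4},
    "project-b": {"functional": 3, "technical": 5, "business": 3},
    "project-c": {"functional": 2, "technical": 3, "business": 2},
}

ranking = rank_exemplars(candidates, weights)
for name, score in ranking:
    print(f"{name}: {score:.1f}")
```

Documenting the weights in the selection plan makes the choice of exemplars repeatable when new candidate projects are evaluated later, for instance during the Evolution Phase.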

Domain Implementation Phase
In this phase the reuse Engineer has to assimilate the selected exemplars and implement all the resulting assets. This phase is broken down into the following activities:
1.Exemplar Assimilation: This iteration, performed mainly by the reuse engineer, aims at the assimilation of each exemplar by following the activities 2 - 7.
2.Component identification: Using reverse engineering tools, static and dynamic analysis tools provided by the instantiation of COPE, the reuse engineer identifies reusable components within the project.
3.Component Analysis and Evaluation: Afterwards, the reuse engineer analyzes each component, identifies concepts of the components related to the domain of interest, and evaluates its suitability using decision analysis methods.
4.Component Adaptation: Using model-driven development tools and the adaptation pattern library of COPE, the reuse engineer adapts the component and documents the resulting asset.
5.Component Validation: In this task, the tester validates the component making use of the validation tools provided by COPE.
6.Component Certification: In this (optional) task the Certifier using advanced certification techniques, such as model-checking, certifies that a specific component has a set of desired properties.
7.Asset Storage: Upon successful completion of the previous activities, the reuse engineer models into the Ontology the concepts that are related to the component and gathers all the produced artifacts. S/he then stores the component into COMPARE along with its metadata or other assets (Metrics, Use cases, UML Diagrams, Test Cases, etc.)
8.Redefine Domain Boundaries: While the reuse engineer executes the 'Exemplar Assimilation' s/he may have to redefine the domain boundaries.

Evolution Phase
In this perpetual and iterative phase the reuse engineer assimilates new projects into the Reuse Repository while maintaining the already embodied assets and thus evolves the domain engineering process as a whole. This is performed by following the activities described below.

1.New OSS Search Engine Discovery and Integration: In this (optional) activity the reuse engineer performs the corresponding activities described in the 'Process Configuration' phase.
2.New Tool Discovery and Integration: In this (optional) activity the reuse engineer performs the corresponding activities described in the 'Process Configuration' phase.
3.Exemplar Selection: The reuse engineer performs the corresponding activities described in the 'Domain Design' phase.
4.Project Assimilation: In this iteration, the reuse engineer performs the corresponding activities described in the 'Domain Implementation' phase.
5.Component Certification: In this task the Certifier using advanced certification techniques, such as model-checking, certifies that a specific component has a set of desired properties.

Application Engineering Process
Component-based software engineering has received significant attention from the research community during the last decade, and several interesting models have been proposed. At the same time, open source software development has also become popular, thanks to the dedicated efforts of the developer community. Both communities have a lot to learn from each other, and a proper blending of their processes and methods could provide software developers with greater opportunities as well as cost efficiency.

Introduction
In spite of the large research efforts on component-based software engineering (CBSE) and the growing development efforts of the open source software community, we have yet to see any strong effort to bring about synergy between these two communities. We believe that understanding the models and processes proposed by CBSE and blending them carefully with open-source processes and models could provide the open-source community with much greater reuse capability and hence cost efficiency.

Current technology limitations being addressed by the OPEN-SME project are:
1.Absence of component-based Application Engineering Process specifications that consider a cross-organisation software reuse environment: In the context of the OPEN-SME use cases, a component-based application engineering process should be centered on the exploitation of the outcomes (use case models, feature models, software artefacts, architecture metamodels, etc.) of an external domain engineering process.
2.Limitations of existing software reuse repository solutions: The Reuse repositories are an essential factor for the success of any component-based and reuse-oriented application engineering effort, since they allow searching and retrieval of reusable software artefacts.

Overview
This overview also describes domain engineering from the point of view of the application engineering process. The OPEN-SME Application Engineering Process will form a generic software development and lifecycle methodology that is component-based, reuse-oriented and applicable (customizable) across different Application Domains.

PROPOSED APPLICATION ENGINEERING PROCESS
In the following subsections we define the OPEN-SME Application Engineering process in detail. Based on the type of application domain under consideration (whether embedded system or Enterprise application), the process will include a specific set of phases and activities from those described.

Inputs to the application engineering process
At a high level, the inputs to the application engineering process are a) the application requirements and b) the available components produced by domain engineering and stored in the reuse repositories. The application requirements either come as a specification in an order for product development or evolve through discussions between domain experts and the system developer. Since components (the assets stored in COMPARE) are the major inputs to the application engineering process, we provide more details on what a component contains and try to exemplify.

Application engineering phases
The main phases of the Application Engineering/Development in comparison with “classical” software development and lifecycle phases, and in relation to the outcomes of the domain engineering activities (as described in Section B) are as follows:
P1. Application requirements phase
P2. Physical architecture definition phase
P3. Application Design Phase
P4. Implementation - Component Realization
P5. System integration phase
P6. System testing phase
P7. Release Phase
P8. Maintenance Phase
The above phases are described in detail in the following subsections along with the main activities, inputs, roles and outputs in each one of them.

Phase#1: Application Requirements

In a non-component-based approach the requirements specification is the main input for development of the system. In a component-based approach the requirements specification also considers the availability of existing components. Within OPEN-SME, the requirements should correlate with the assortment of available components, i.e. the requirements specification is not only an input to further development, but also a result of the activities that took place during both the Domain Analysis and Domain Implementation phases. For example, certain requirements are not essential for a project and/or can be slightly modified in order to reuse as-is an existing component that would be too difficult or too expensive to implement from scratch. However, this search focuses mainly on the internal component repository, and the goal is to identify a set of candidate (potential) components by looking at compatibility at a macro level. In this phase the reuser performs the following activities:

Phase#2: Physical architecture definition

The role of the Physical Architecture is to provide a model-level description of the relevant hardware of the system. A physical architecture specification can assist in decision making during component search and selection, and it can later be refined based on the software reference architecture. In the physical architecture the following elements are described:
1.Processing units are units that have a general-purpose processing capability.
2.Equipments / Instruments / Remote terminals
3.The interconnection between the elements above, in terms of buses or point-to-point links, etc.
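
A model-level description of this kind could be captured as a small data structure. The sketch below is an assumption made for illustration, not the notation used by OPEN-SME: it records processing units, equipment and the links between them, and rejects links that refer to undeclared elements. All element names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalArchitecture:
    processing_units: set = field(default_factory=set)
    equipment: set = field(default_factory=set)  # instruments, remote terminals
    links: list = field(default_factory=list)    # (kind, element_a, element_b)

    def connect(self, kind, a, b):
        """Add a bus or point-to-point link between two declared elements."""
        known = self.processing_units | self.equipment
        missing = {a, b} - known
        if missing:
            raise ValueError(f"unknown element(s): {sorted(missing)}")
        self.links.append((kind, a, b))

    def neighbours(self, element):
        """Return the elements directly linked to the given element."""
        result = set()
        for _, a, b in self.links:
            if a == element:
                result.add(b)
            elif b == element:
                result.add(a)
        return result


arch = PhysicalArchitecture(
    processing_units={"cpu-main", "cpu-io"},
    equipment={"gps-receiver"},
)
arch.connect("bus", "cpu-main", "cpu-io")
arch.connect("point-to-point", "cpu-io", "gps-receiver")
print(sorted(arch.neighbours("cpu-io")))
```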

Phase#3: Application Design

The OPEN-SME application design phase will follow the same pattern as a design phase of software in general; it will start with a system analysis and a conceptual design providing the system overall architecture and continue with the detailed design. However, a major deviation from traditional approaches will be taken as the system architecture will need to adhere to the Domain Software Architecture and incorporate assemblies of the existing components stored in the Reuse Repository.

Phase#4: Implementation - Component Realization
The component realization activities will only partially consist of coding; in fact, the purer the component-based and reuse-oriented approach, the less coding will be needed. The main emphasis is put on component selection and adaptation into the system. This process can require additional effort. First, the selection process should ensure that appropriate components have been selected with respect to their functional and extra-functional properties. This requires verification of the component specification, or testing of some of the component's properties that are important but possibly not documented in the Reuse Repository. Provided that the system architecture adheres to the Domain Architecture, the effort required for the adaptation of components (from the reuser perspective) will be very small or ideally zero. In any case, using the already tested and documented components from the COMPARE reuse repository will significantly reduce the burden on the reusers. In this phase the reuser will perform the following activities:
1.Component Selection - the reuser selects the most appropriate components among the component candidates from the domain component repository. The existing components that are closest to the component specification from the design phase will be selected. Note that this specification considers both functional and non-functional properties. Note also that the selection process considers not only the component candidates but also different component versions and variants. The candidate components found will be compared and ranked.
2.Component adaptation - When a particular component has been selected, it may not comply with the specification (in either its functional or its non-functional properties). Such components should be adapted to meet the specification. The simplest form of adaptation is the creation of adapters. Adapters are mediators between components whose goal is to make the components compatible. A typical adapter will change the type of the interface but not the interface itself. The next level of adaptation is a so-called wrapper: a new 'component' that adjusts the interface of the selected component to the component specification from the design phase.
3.In-house development - in some cases no components for a specific service will be found in the domain repository. In other cases the company developing the application encapsulates its business advantage in a component and does not want to share this knowledge with the domain or with competitors. This implies that the application engineer (the reuser) will develop specific components using the application development tool.
4.Component verification - when a component is selected and adapted according to the requirements from the design phase, or when developed, it must be verified. This verification corresponds to a unit test, so it includes the verification of the functional properties. In addition some of the non-functional properties can also be verified (for example memory size, response time, and other component attributes).
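
The wrapper form of adaptation can be illustrated with a short sketch. The interface required by the design specification (TemperatureSource) and the reused component (LegacyThermometer) are both hypothetical. The wrapper exposes the specified interface and delegates to the reused component, which itself is left unchanged.

```python
from typing import Protocol


class TemperatureSource(Protocol):
    """Interface required by the design-phase specification (hypothetical)."""

    def read_celsius(self) -> float:
        ...


class LegacyThermometer:
    """Stand-in for a reused component whose interface does not match."""

    def __init__(self, fahrenheit: float):
        self._fahrenheit = fahrenheit

    def get_temp_f(self) -> float:
        return self._fahrenheit


class ThermometerWrapper:
    """Wrapper: adjusts the reused component to the specified interface."""

    def __init__(self, wrapped: LegacyThermometer):
        self._wrapped = wrapped

    def read_celsius(self) -> float:
        # Convert both the operation name and the unit expected by the spec.
        return (self._wrapped.get_temp_f() - 32.0) * 5.0 / 9.0


def is_overheating(source: TemperatureSource, limit_celsius: float) -> bool:
    """Application code depends only on the specified interface."""
    return source.read_celsius() > limit_celsius


sensor = ThermometerWrapper(LegacyThermometer(212.0))
print(sensor.read_celsius(), is_overheating(sensor, 90.0))
```

The wrapper can also be exercised on its own during component verification, before the component is integrated into the application.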

Phase#5: System integration

This phase includes activities that support the integration of the selected or newly created components into the application. In the component-based approach this phase consists of many complex activities, but most of them are integrated parts of component technologies and are performed automatically or semi-automatically.

The list of the integration activities is detailed as follows:
1.Component Instance definition - Component instances are defined from selected or new component implementations. The component instance is the component entity that gets concrete values for its functional and non-functional properties. For example, a component can have a parameterised interface, which gets concrete values in the instantiation process.
2.Allocation of component instances - The allocation of the components is supposed to be decided in the design phase. Here, according to that input, the component instances are allocated to the physical structure, whereby the component instances get concrete values for some properties. Instances of components are allocated to processing units defined in the physical architecture.
3.Component deployment - this is the activity that integrates the component into the application, i.e. it creates a connection to the underlying platform, middleware or component containers. This is usually a matter of the component technology. Containers are a special type of component/wrapper that carry certain properties (for example, they implement authentication mechanisms that are activated when the components from that container are accessed).

We distinguish four types of support for the management of extra-functional properties (EFPs):
(i) Exogenous Management. The EFP management is provided outside the components;
(ii) Endogenous Management. The EFP management is implemented in the components, i.e. the component developers are responsible for implementing it;
(iii) Management per Collaboration. The EFP management is realized in direct interactions between components;
(iv) System-wide Management. The EFP management is provided by the component framework or the underlying middleware.

By combining these types we get four possible approaches to EFP support:
-Approach A (endogenous per collaboration). A component model does not provide any support for EFP management, but it is expected that a component developer implements it. This approach makes it possible to include EFP management policies that are optimized towards a specific system, and also can cater for adopting multiple policies in one system.
-Approach B (endogenous system-wide). In this approach, there is a mechanism in the component execution platform that contains policies for managing EFPs for individual components as well as for EFPs involving multiple components.
-Approach C (exogenous per collaboration). In this approach, components are designed such that they address only functional aspects and are oblivious to EFP. Consequently, in the execution environment, these components are surrounded by a container. This container contains the knowledge on how to manage EFPs. In this approach, containers are connected to other containers. Connected containers can manage the EFPs for the components that they encapsulate. The container approach is a way of realizing the separation of concerns in which components concentrate on functional aspects and containers concentrate on extra-functional aspects.
-Approach D (exogenous system-wide). This approach is similar to approach C, except that the system can coordinate the management of an EFP from a global system-wide perspective (e.g. global load balancing). Consequently, more complex support needs to be built into the component execution platform.
4.Component binding - this is the activity in which connections between components are established. Component bindings are established between component instances: the binding connects the required interface of one component instance to the provided interface of another component instance. The binding is subject to a static check to ensure that the candidate provided interface fulfils the functional needs of the client's required interface.
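
The static check performed at binding time can be sketched as follows. The sketch assumes, for illustration only, that an interface is a named set of operation signatures and that a provided interface fulfils a required one when it offers every required operation. Real component technologies typically also check parameter and return types. All names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentInstance:
    name: str
    provided: dict = field(default_factory=dict)  # interface name -> set of operations
    required: dict = field(default_factory=dict)  # interface name -> set of operations


class BindingError(Exception):
    """Raised when a provided interface cannot satisfy a required one."""


def bind(client, required_name, server, provided_name):
    """Bind client's required interface to server's provided interface."""
    needed = client.required[required_name]
    offered = server.provided[provided_name]
    missing = needed - offered
    if missing:
        raise BindingError(
            f"{server.name}.{provided_name} lacks {sorted(missing)} "
            f"required by {client.name}.{required_name}"
        )
    return (client.name, required_name, server.name, provided_name)


logger = ComponentInstance("logger-1", provided={"ILog": {"write(str)", "flush()"}})
billing = ComponentInstance("billing-1", required={"ILog": {"write(str)"}})
audit = ComponentInstance("audit-1", required={"ILog": {"write(str)", "rotate()"}})

binding = bind(billing, "ILog", logger, "ILog")
print(binding)

try:
    bind(audit, "ILog", logger, "ILog")
except BindingError as error:
    print("rejected:", error)
```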

Phase#6: System testing

Tests performed on isolated components are usually not enough, since component behaviour can differ in assemblies and in other environments; thorough system and subsystem tests therefore need to be performed. In the case of embedded systems, multiple levels of verification and validation often need to be performed using simulations, hardware-in-the-loop, etc., before the system can be deployed in actual operational environments. In the waterfall model, testing is performed after system integration, whereas in component-based development (CBD) tests are present in all phases. Tests are performed on isolated components (unit testing), on component assemblies and finally on the system. In this phase the developed system is verified against the system specification.

Phase#7: Release Phase

The release phase includes packaging of the software in forms suitable for delivery and installation. The component-based development release phase will not be significantly different from that of a 'classical' software development process.

Phase#8: Maintenance Phase

The maintenance of a software system is a necessity mainly due to changes in the environment that the software operates in. Even if a system functions properly, as time goes by, it has to be maintained. The approach of a component-based development process is to provide maintenance by replacing old components with new components or by adding new components to the system. The paradigm of the maintenance process is similar to that of development: find a proper component, test it, adapt it if necessary, and integrate it into the system. These activities are essentially those discussed earlier as part of component realization and hence are not repeated here.

OCEAN
Source code search engines assist software development by letting programmers search code repositories for free source code. Although each engine is fairly straightforward to use, several of them exist, and the differences in how they index and expose their assets mean that using them all demands considerable time and effort from the programmer.

Introduction
The concepts of Software Reuse [16] and Rapid Development have been adopted by large software development companies, small and medium enterprises (SMEs), research institutes and freelancers. According to a survey conducted in [17], software reuse in general and Free/Libre Open Source Software (F/LOSS) reuse in particular are important for the software development SMEs for a series of reasons:
-Reuse has a positive effect on lowering the development costs (91%), shortening the development and testing time (83%), increasing the quality of the final product (76%) and shortening time to market (72%).
-In relation to the different artifacts that can be reused, source code is the most important (87%), followed by design (80%) and documentation (75%).
-Almost half of the organizations (51%) have an in-house reuse repository whereas 39% have some formal process for reusing components they develop.
-The vast majority of the respondents (80%) said that their organization supports OSS reuse.

OCEAN High Level Design
The federated code search engine we propose should be flexible enough to incorporate individual existing or future code search engines. Typically, one can retrieve data from another web source either through an API or via web content extraction. An API is preferred because it is faster and more reliable; Merobase belongs to this case. However, for engines such as Koders and Krugle, which return their answers as HTML pages only, web extraction is the only option. Web content extraction is the non-trivial process of collecting unstructured web data and storing them in a database or an XML file [31].

Implementation Details
In this section, we give implementation details of the Query Engine subsystem, which actually implements the federation. It supports two types of foreign search engine integration: API-based and Extraction-based.
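
The federation idea can be sketched as follows. This is a minimal Python illustration and not the OCEAN implementation: the `CodeResult` record, the adapter signature and the two stub engines are hypothetical. Each foreign engine, whether API-based or extraction-based, is wrapped in an adapter with the same signature, and an adapter that fails is skipped so that the remaining sources still answer.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class CodeResult:
    engine: str
    title: str
    url: str


# An adapter takes a keyword and a result limit and returns results.
Adapter = Callable[[str, int], Iterable[CodeResult]]


def federated_search(
    adapters: dict[str, Adapter], keyword: str, limit: int = 10
) -> tuple[list[CodeResult], dict[str, str]]:
    """Query every adapter; a failing source is recorded and skipped."""
    results: list[CodeResult] = []
    failures: dict[str, str] = {}
    for name, adapter in adapters.items():
        try:
            results.extend(adapter(keyword, limit))
        except Exception as exc:  # an unavailable source must not break the search
            failures[name] = str(exc)
    return results, failures


def stub_working_engine(keyword: str, limit: int) -> list[CodeResult]:
    """Hypothetical adapter that returns one canned result."""
    hits = [CodeResult("working-engine", f"{keyword}.java", "http://example.invalid/1")]
    return hits[:limit]


def stub_retired_engine(keyword: str, limit: int) -> list[CodeResult]:
    """Hypothetical adapter for a service that is no longer available."""
    raise ConnectionError("service discontinued")


if __name__ == "__main__":
    found, failed = federated_search(
        {"working-engine": stub_working_engine, "retired-engine": stub_retired_engine},
        "Stack",
    )
    print(found, failed)
```

Keeping every engine behind the same callable signature means a new source can be added, or a discontinued one dropped, without touching the merging logic.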

API-based Integration
Merobase
Merobase [22] integration belongs to this case and was implemented by means of a JAR search client provided by the Merobase creators. A Perl web service was written utilizing this API and returning the results for a user-specified query in a suitable XML format. The Merobase API supports 2 parameters: s for the search keyword and n for the number of results requested. An upper limit of 30 results per query has been set by Merobase developers.

Google Code Search
Google Code Search was integrated through its API. Soon after the integration, however, Google announced that the service would no longer be available. This is a good example of the value of a federated search, which continues to serve its users even when some sources are unavailable. Given this situation, we do not give further details on this case.

Extraction-based Integration
When APIs are not available, integration is done through web extraction. This requires an easy-to-use, robust and flexible web content extraction framework. DEiXTo was the tool of choice; it is briefly described below.
DEiXTo - A web content extraction framework
DEiXTo [27] is a powerful web data extraction tool that is based on the W3C Document Object Model (DOM). It provides the user with an arsenal of features aiming at the construction of well-engineered extraction rules that describe what pieces of data to scrape from a website.
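
To illustrate what an extraction rule does, the following Python sketch pulls result titles and links out of a result page. It does not use DEiXTo and does not build a DOM tree; it relies on the event-based `html.parser` module of the Python standard library, and the sample page and the `class="result"` convention are assumptions made for illustration.

```python
from html.parser import HTMLParser


class ResultLinkExtractor(HTMLParser):
    """Collects (title, href) pairs from <a class="result"> elements."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._href = None   # href of the result link currently open, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("class") == "result":
            self._href = attributes.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.records.append(("".join(self._text).strip(), self._href))
            self._href = None


# Hypothetical result page; real engines use their own markup.
SAMPLE_PAGE = """
<html><body>
  <a class="result" href="/src/Stack.java">Stack.java</a>
  <a class="nav" href="/next">Next page</a>
  <a class="result" href="/src/Queue.java"> Queue.java </a>
</body></html>
"""


def extract_results(page):
    """Apply the extraction rule to one HTML page."""
    extractor = ResultLinkExtractor()
    extractor.feed(page)
    extractor.close()
    return extractor.records


if __name__ == "__main__":
    for title, href in extract_results(SAMPLE_PAGE):
        print(title, href)
```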

Koders
Koders [21] integration was smooth, in the sense that the HTML result pages were fully accessible by DEiXTo. The service supports 4 URL parameters: s for the search keyword, li for license type, la for language and n for the number of results requested.

Krugle
Krugle [20] integration, on the other hand, raised some difficulties, mostly due to the heavy use of AJAX calls in its search results pages. Currently, DEiXTo does not support JavaScript automation. We therefore used Selenium [28], which automates a Firefox instance, to obtain Krugle's HTML results properly, and then forwarded them to DEiXTo for the actual extraction. Again, OCEAN sees Krugle as a web service supporting 4 URL parameters: s for the search keyword, pro for the target project, lic for the desired code license and n for the number of results desired.

The System in Use
The main screen of the search facility of OCEAN consists of a form. In the textbox entitled "Search" the user can specify the keyword(s) of the search, separated by spaces. The search space can then be narrowed down by using the three combo boxes labelled language, license and type, respectively. Language refers to the programming language of the source code the user wishes to retrieve (e.g. Java, Perl, PHP, etc.). License refers to the type of license under which the retrieved source code was originally published. Finally, type refers to the type of file the user is looking for: class, interface or enum (enumeration type). If a criterion does not apply to a particular search engine, it is simply omitted from the query sent to that engine.
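
The omission of criteria that an engine does not support can be sketched in Python as below. The URL parameter names (s, n, li, la, pro, lic) and the Merobase limit of 30 results are the ones reported above; the criterion names, the `build_query` function and the mapping table are hypothetical, and no real endpoint URLs are assumed.

```python
from urllib.parse import urlencode

# Generic criterion -> engine-specific URL parameter.
ENGINE_PARAMETERS = {
    "merobase": {"keywords": "s", "limit": "n"},
    "koders": {"keywords": "s", "license": "li", "language": "la", "limit": "n"},
    "krugle": {"keywords": "s", "project": "pro", "license": "lic", "limit": "n"},
}

# Upper limit on results per query, where one is known.
MAX_RESULTS = {"merobase": 30}


def build_query(engine, **criteria):
    """Translate generic criteria into the query string of one engine.

    Criteria the engine has no parameter for are silently omitted.
    """
    mapping = ENGINE_PARAMETERS[engine]
    params = {}
    for criterion, value in criteria.items():
        if value is None or criterion not in mapping:
            continue  # not supported by this engine: omit it
        params[mapping[criterion]] = value
    cap = MAX_RESULTS.get(engine)
    if cap is not None and "n" in params:
        params["n"] = min(int(params["n"]), cap)
    return urlencode(params)


if __name__ == "__main__":
    for engine in ENGINE_PARAMETERS:
        print(engine, build_query(engine, keywords="binary tree", language="Java", limit=50))
```

In this sketch a criterion such as the file type, for which none of the listed engines exposes a parameter, is dropped for every engine.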

COPE
The Component Adaptation Environment (COPE) tool is used by reuse engineers of SME AGs to recognize, extract, test, document etc. components from OSS projects. The extracted components are then placed in the Component Repository and Search Engine (COMPARE) tool that is used by SMEs to discover the extracted components in the context of the application engineering process.

Component Makers
Based on the analysis and recommendations carried out earlier, the Reuse Engineer can now produce independent software components and then place them in the repository using the 'Knowledge Manager' feature of COPE. Four different kinds of component makers are currently provided. The Interface Maker uses as input the clusters produced by the 'Dependencies Recommender'. The Dependency Maker presents all the classes of the project along with their reusability assessment; the reuse engineer can select a class and extract a component providing the functionality of the selected class. The Adapter Pattern Maker presents the clusters produced by the 'Pattern Recommender' and displays clusters involved in Adapter pattern instances.

Component Testing and Validation

After the component source files have been extracted the reuse engineer will process the component further in an IDE. This is an essential program comprehension step in which unit tests or execution scenarios examining a specific functionality are created. Also it is important to resolve additional dependencies, such as data dependencies, that are required for the component to work.

Component Packaging and Classification
The component package that is generated from the usage of COPE includes the following:
(a) A top directory with the component name,
(b) A readme.txt file which contains information such as: A short description of the component, the originating OSS Project, license or licenses, the programming language and technology, other components it uses if any, and the domain and main concept of the domain the component provides,
(c) Component source files,
(d) Required Libraries,
(e) Component Documentation generated by UML commercial or open source tools, and
(f) The test HTML report which includes separate subdirectories for each test case along with the test results (coverage etc.).
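
The package layout listed above can be sketched in Python as follows. This is an illustration only, not COPE's packaging code: the subdirectory names (`src`, `lib`, `doc`, `test-report`), the readme field names and the `create_package_skeleton` function are assumptions.

```python
from pathlib import Path
import tempfile


def create_package_skeleton(root: Path, name: str, metadata: dict[str, str]) -> Path:
    """Create an empty component package: top directory, readme and subdirectories."""
    top = root / name  # (a) top directory named after the component
    # (c) source files, (d) required libraries, (e) documentation, (f) test report
    for sub in ("src", "lib", "doc", "test-report"):
        (top / sub).mkdir(parents=True, exist_ok=True)
    # (b) readme.txt with one "key: value" line per metadata field
    lines = [f"{key}: {value}" for key, value in metadata.items()]
    (top / "readme.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return top


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        package = create_package_skeleton(
            Path(tmp),
            "example-component",
            {
                "Description": "Hypothetical example component",
                "Originating OSS project": "(unknown)",
                "License": "(unknown)",
                "Language": "Java",
            },
        )
        print(sorted(p.name for p in package.iterdir()))
```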

COMPARE
Introduction
COMPARE (Component Repository and Search Engine) is a tool that allows SME software re-users to search and discover the assets (software artefacts, technical documents, test suites, metamodels) produced by the Domain Engineering Process. COMPARE features an advanced search engine that can be used for searching among the components, according to specific needs and selection criteria ranging from desired features (functionality) to programming languages, execution frameworks, etc.

Technology platform
The COMPARE application is developed using the open source Apache-MySQL-PHP software stack. COMPARE also makes extensive reuse of existing open source frameworks and web applications that provide various types of functionality.

Architecture Overview
The key objective of the application described in this document is to allow users to search for and discover Software Components, to upload new Software Components and to use community features related to them. From the user's point of view the application is a series of dynamic web pages, accessible through a web browser. On the backend, the application searches and retrieves data from a database and from other external sources through its external interfaces. COMPARE is built on top of the Joomla framework and its architecture is heavily based on it. The Joomla architecture is decomposed into three tiers.

The Infrastructure Module
The Infrastructure Module provides a set of infrastructure services which are utilised by all other components of the COMPARE system. Specifically, it comprises the Asset Metadata Repository, the Asset Manager and the Notifier. The Infrastructure Module also provides access to statistical information about the platform and provides the template user interface on top of which the user interfaces of all the other components are rendered. Finally, the Infrastructure Module controls user access and permissions to the platform through the User Module.

Asset Metadata Repository
The Asset Metadata Repository is a relational database which COMPARE uses to store information about components. It is a MySQL database using the InnoDB engine and contains the tables of the Joomla framework, the tables of the third-party integrated applications and the tables of the COMPARE extensions.

Asset Manager
The Asset Manager provides access to the assets of each component. It is composed of the Component Page, the SVN access component and the social modules (Forum and Wiki components).

Notifier module
The Notifier Module receives and records the recent activity of the hosted components. It also generates an 'activity index' based on the number of updates made in the last month. Any of the following actions counts as a component update:
-A change is made to a property of the component
-A new file is uploaded in the component's repository
-A change is made in the component's wiki page
-A new thread is started in the forum
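
A minimal Python sketch of such an activity index is given below. It is not the COMPARE implementation (which is built on PHP and Joomla): the event kind names, the `ActivityEvent` record and the choice of a 30-day window to approximate "the last month" are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# The four actions listed above that count as a component update (names are hypothetical).
UPDATE_KINDS = {"property_change", "file_upload", "wiki_change", "forum_thread"}


@dataclass(frozen=True)
class ActivityEvent:
    kind: str
    timestamp: datetime


def activity_index(events, now, window=timedelta(days=30)):
    """Count the qualifying updates made within the window ending at `now`."""
    cutoff = now - window
    return sum(
        1
        for event in events
        if event.kind in UPDATE_KINDS and cutoff <= event.timestamp <= now
    )


if __name__ == "__main__":
    now = datetime(2012, 6, 30)
    events = [
        ActivityEvent("file_upload", datetime(2012, 6, 20)),
        ActivityEvent("wiki_change", datetime(2012, 6, 5)),
        ActivityEvent("forum_thread", datetime(2012, 4, 1)),   # older than the window
        ActivityEvent("page_view", datetime(2012, 6, 25)),     # not an update
    ]
    print(activity_index(events, now))
```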

User Module
The User Module handles all the functionality regarding the users of the platform. It is composed of the User Extension Module, which is an extension to the User Component of the Joomla framework, and the Component Rating Module, which holds the ratings that users give to each component.

Consumption Module
The aim of the Consumption Module is to allow the software re-user to search, provide feedback and retrieve the software components that are hosted by the platform. The Consumption Module provides its features through a set of web pages which can be accessed via the World Wide Web (WWW). The Consumption Module comprises the Asset Searching Module, the Asset Retrieval Module and the Interest Management Module.

The Asset Searching Module
The Asset Searching Module provides methods for a software re-user to search and filter the software components hosted by COMPARE. The module is accessible to the re-user via a web page where search terms are submitted; the module receives the terms, analyses them, uses them to search the Asset Metadata Repository and presents the results. The Asset Searching Module also provides mechanisms for filtering the search results based on various criteria.

Component Search Module
The Component Search Module is used by the Search page to search for software components asynchronously, using AJAX. The Search page also uses the model of this module while it is being generated.

Search page
The search page presents a search field which the user can use to search for CHSCs. Keywords entered in the search field are matched first against component names, then against descriptions and finally against platforms. For example, the keywords 'COMPARE tool' will first produce a search for components containing both 'COMPARE' and 'tool', and then for components containing either 'COMPARE' or 'tool', in the name, description and platform fields.
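
The matching order described above can be sketched in Python as follows. This is an illustration, not the COMPARE search code: the exact interleaving of the passes is an assumption (all-keyword matches over name, description and platform first, then any-keyword matches over the same fields), as are case-insensitive substring matching and the `search_components` function itself.

```python
FIELDS = ("name", "description", "platform")


def search_components(components, query):
    """Rank components: all-keyword matches first, then any-keyword matches.

    Within each pass the fields are tried in the order name, description,
    platform. Each component appears at most once in the result.
    """
    keywords = [word.lower() for word in query.split()]
    if not keywords:
        return []
    ranked = []
    seen = set()
    for combine in (all, any):          # AND pass, then OR pass
        for field in FIELDS:
            for index, component in enumerate(components):
                if index in seen:
                    continue
                text = component.get(field, "").lower()
                if combine(keyword in text for keyword in keywords):
                    seen.add(index)
                    ranked.append(component)
    return ranked


# Hypothetical repository content.
COMPONENTS = [
    {"name": "Parser tool", "description": "Parses logs", "platform": "Java"},
    {"name": "COMPARE tool", "description": "Repository", "platform": "PHP"},
    {"name": "Repo", "description": "The COMPARE search tool", "platform": "PHP"},
    {"name": "Other", "description": "Nothing relevant", "platform": "C"},
]

if __name__ == "__main__":
    for hit in search_components(COMPONENTS, "COMPARE tool"):
        print(hit["name"])
```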

Dissemination
Project Web Site

The OPEN-SME consortium established a website (see http://opensme.eu online) to support the dissemination activities. The site provides public access to general information on the project (objectives, partners, scope, etc.) and to its public deliverables and presentations. It also contains restricted sections accessible only to consortium members. The project website is updated with information and content on a regular basis.

Dissemination Events

During the project, most of the partners carried out a large number of dissemination activities, covering all kinds of dissemination.
-A member of the OPEN-SME team participated in the DSM-TP 2010 summer school. The main themes of the school were Domain-Specific Modelling (DSM) and Domain-Specific Languages (DSL), which are an important aspect of the OPEN-SME project regarding the role of the reuse engineer. Details on the topics of the school can be found at the DSM-TP 2010 summer school webpage (see http://ctp.di.fct.unl.pt/DSM-TP/ online).
-A member of the OPEN-SME team participated in the ADAPT 2010 summer school. The central theme of the school was software adaptation, which is an important aspect of the OPEN-SME project. Details on the topics of the school can be found at the ADAPT 2010 summer school webpage (see http://userpages.uni-koblenz.de/~adapt/summerschool2010/ online).
-VSP Workshop

The workshop took place at VSP, Västerås, Sweden on 25 and 26 January 2012, with the participation of AUTH, MDU and UM-MERIT, which delivered a whole-day seminar on the OPEN-SME business models. The agenda of the workshop also included a presentation of the OPEN-SME OSS Reuse Platform and Repository, VSP's plans to make use of OPEN-SME, usage preconditions (skills and capacities) and roles / collaboration.

-Second OPEN-SME workshop

The Greek Association of Computer Engineers (EMYPEE) organized the Athens OPEN-SME Workshop on Friday 17 February 2012 at the premises of the Technical Chamber of Greece (TEE). The workshop attracted more than 40 participants from SMEs, academia and public organizations in Greece. A welcome speech was given by Mr. Spyridon Zanias (member of the TEE management board). The workshop program contained 10 presentations and a round table discussion.

-Third OPEN-SME workshop

The 3rd OPEN-SME workshop took place on the 30th of May in Nicosia, organized by ETEK. The workshop was attended by 30 members of the IT Community of Cyprus and was addressed by the General Cashier of ETEK, Mr. Antonis Valanides. There was considerable interest from the participants in both the OPEN-SME toolset and the VSP business practices.

-Final OPEN-SME workshop

The final workshop was sponsored by ACM and organised as part of the ACM SIGSOFT CompArch 2012 conference (see http://opensme.eu/ross online), bringing together researchers and industrial experts to present and discuss issues related to the reuse of open-source components from technical, process, organizational, legal and business points of view. The focus was on the potential benefits for Small and Medium Enterprises (SMEs). The workshop combined presentations of submitted papers with open discussions and was held in Bertinoro, Italy on 26 June 2012.

Publications

The consortium achieved the following publications:

Potential Impact

Based on the capacities of the OPEN-SME repository and tools, a number of (bundles of) products and services can be offered to each customer segment.

The OPEN-SME tools and repository allow analysis services and quality assurance. If services that are exclusively based on the tools are considered, OPEN-SME can offer help to solve legacy issues. The repository only allows offering components. Finally, the tools themselves can potentially be sold.

Exploitation plans
Overview
As laid out in Deliverable D2.6a the various actors in the OSS value network play different roles. In our case, there are two key actors in the value network of the OPEN-SME toolset: the technical academic partners of the OPEN-SME project provide the developers of a toolset for OSS reuse and reuse services, and the OPEN-SME-AGs in the consortium provide the distributors of the toolset and these services. The technical/academic partners, primarily AUTH and TELETEL, compile and analyse a set of existing tools for the identification and evaluation of reusable OSS code and OSS components. These existing tools are transferred into a suite that allows fast and comprehensive reusability checks of OSS code and components, which is not offered by any single tool underlying the suite. This act provides the key value creation process within the OPEN-SME project.

Proposed SME-AGs Business Strategy
Based on the analysis of the position and role of the three SME-AGs that belong to the OPEN-SME project consortium in business ecosystems and OSS value networks, first recommendations of suitable OSS reuse business models for these (and similar) SME-AGs can be given. Overall, VSP shows a very commercial orientation and must be considered an integral and important part of the business ecosystem in Västerås, in which OSS development and reuse are widespread. Consequently, VSP plans to take on an active, commercial role in the distribution and implementation of the OSS reuse tools and the services based on them by developing into a software vendor for the OPEN-SME tools / suite. Business models developed for this sort of SME-AG should put the SME-AG at the centre of the model and strive to generate sustainable revenues directly for the SME-AG.

OPEN-SME Business Model and Exploitation Strategy
Aware that the market introduction of a complex product like the OPEN-SME repository and tools needs time and a strategy, the partners have agreed to start the 'OPEN-SME business' on a rather small scale, with VSP as the key player for familiarizing, testing and implementing the OPEN-SME repository and tools in the robotics domain of the Science Park. In this initial phase, training and consultancy shall be provided by AUTH. The roll-out, which constitutes the second phase, is intended to happen in different directions. The first one is collaboration with the SMEs and SME associations in the OPEN-SME consortium.

Customer Segments
A number of relevant customers have been identified. In the initial phase, the most important customers will be the VSP members, specifically those in the field of robotics. This approach has been chosen in order to become familiar with the OPEN-SME repository and tools in a controllable area. The robotics domain of VSP is particularly useful for the introduction and testing of the OPEN-SME tools and repository because these members of VSP have a lot of knowledge of OSS, so the learning curve is assumed to be less steep than in other domains. In the second phase, when VSP has accumulated enough knowledge about the OPEN-SME tools and repositories, other Science Parks and Incubators will be approached.

Channels
There are three types of channels - distribution, communication and sales - that serve different purposes and play a role at different points in time.

The OPEN-SME partners identified the following channels through which potential customers (target groups) presumably want to be reached:
-Internet (webpage, email)
--Software communities
--SME clusters / groups
--Thematic forums
-Social media (Facebook, LinkedIn, Twitter etc.)
--Registered 'followers' from industry, academia and software communities
-Phone
--Companies
--Science Parks
--EU networks
--Industry Associations
-Face-to-face
--VSP
-Teaching / courses
--Academia
--Industry associations / chambers of commerce
-Academia and industry collaboration
--Master theses
--Internships
-Events
--Industry events
--Software community events, e.g. FOSDEM (fosdem.org)
--Domain-specific events (e.g. conferences in the robotics area)

Customer Relationships
The establishment of a self-sustained OSS reuse community is considered to be the key for all customer relations in the OPEN-SME business model. Regarding the types of relationships, the partners agreed that fully and semi-automated relationships should be avoided, as the complexity of the tasks probably does not allow for the level of standardization that would be necessary for these types of relationships. Within the community itself, self-service relationships may be an option, as the level of expertise within the community should be high enough.

Key Activities
The key activities that must be performed in order to run the OPEN-SME business model successfully are twofold: on the one hand, they have to help prepare the market for the OPEN-SME tools and repository and the services based on them; on the other hand, they have to secure and advance the value propositions offered to the target groups. One key activity that is important in the initial phase is a survey / overview of OSS activities within the portfolio of the SME-AGs and SMEs of the OPEN-SME consortium. This survey would provide an initial overview of the markets for the OPEN-SME tools, repository and services and contact points for entering these markets.

Key Partnerships
There are different types of key partnerships that serve different purposes.

The key partners in the OPEN-SME business model are, in the initial phase, the partners of the OPEN-SME consortium and the VSP member companies (especially in the field of robotics). These partnerships can currently be considered informal (i.e. not based on a contract) strategic alliances between non-competitors. At a later stage, when a critical level of OSS reuse expertise has been built up at VSP and the OPEN-SME consortium partners, additional contact points in relevant domains (which have to be identified by the partners), in particular other Science Parks, have to be integrated in the business model as key partners. In this case, other forms of partnership may be chosen, and the relationships might become formal (i.e. based on contracts).

Key Resources
There are a number of key resources required by the OPEN-SME value propositions. In the first place, there is an essential need for domain experts, first in the field of robotics and later in other domains too. In addition, hardware is needed for server and storage capacity; cloud computing was considered to be an inexpensive, efficient and flexible option in this regard. Other key resources are assistance in building the OSS reuse community / network and in clarifying IPR conditions (rights to the OPEN-SME repository and tools).

Revenue Streams
Revenue streams can be generated in various ways.

Given the interview results, it is obvious that customers are not easily willing to pay for OSS reuse analysis and services. However, the workshops have identified a number of values that appear attractive enough to be paid for by the target groups. The first value in this regard is certification, as this service provides a sort of guarantee that the software or component does what it is supposed to do. The idea of the OPEN-SME partners is to provide a medium-level certification that can be issued on the basis of extensive testing, without going through the time-consuming procedure of strictly formal certification, such as certification against ISO standards. Another value that target groups are expected to pay for is tested components. Here, customers pay for the tests, not for the components, as these are OSS.

List of Websites:
http://opensme.eu