
Developer-Centric Knowledge Mining from Large Open-Source Software Repositories

Periodic Reporting for period 2 - CROSSMINER (Developer-Centric Knowledge Mining from Large Open-Source Software Repositories)

Reporting period: 2018-07-01 to 2019-12-31

Open source software (OSS) is computer software provided under a license that permits developers to study, change, and improve the software for free. A report by Standish Group states that adoption of open source software models has resulted in savings of about €58 billion per year to consumers. Unlike commercial software, which is developed with a business-motivated commitment to provide support and updates, OSS technologies are most often developed in a public, collaborative, and loosely coordinated manner. This has important implications for the quality of different OSS technologies, as well as for the level of support that different OSS communities provide to software developers.

There are several high-quality and mature OSS projects that deliver stable and well-documented products, often backed by a vibrant community of experts and users who answer questions and address reported defects in the software. However, there are also many OSS projects that are dysfunctional due to one or more of the following:
+ The development team behind the OSS project invests little time in further development and support
+ Development of the OSS project has been discontinued due to lack of commitment or motivation
+ Documentation is limited or of poor quality, making the software difficult to understand and to update
+ The community around the OSS project is small, questions receive late or no responses, and defects are repaired slowly or ignored

Consequently, developing new software systems by reusing existing OSS components raises challenges in:
+ Searching for OSS candidate components
+ Evaluating and selecting the most suitable OSS components from amongst identified candidates
+ Adapting selected OSS components to fit specific requirements of new software products and services

Solving these challenges brings substantial benefits to European software developers and to Europe’s increasingly digital community overall: reusing OSS technologies yields important savings in development, which enables increased investment in unique and forward-looking innovations for enterprises and individuals.

The project has completed all of the tasks as planned and developed an innovative set of tools for developers working with open source software (OSS) projects. These tools provide the following capabilities: a) Mining Source Code; b) Mining Natural Language Sources; c) Mining System Configurations; d) Workflow-Based Knowledge Extraction; and e) Mining Cross-Project Relationships. The tools include support for operating with different industry Integrated Development Environments (IDEs), with the Eclipse IDE implementation already made available by the project. The tools have been integrated into a framework that can be deployed locally by software development organisations, or in the Cloud by OSS forge providers. All of the project technologies have been made available in open source at the SCAVA site hosted by the Eclipse Research Labs to enable broad industry access and take-up to improve European software development.

In addition to open source dissemination of the project results, six industrial organisations have developed demonstrators and carried out evaluations of the project technologies. Each of these organisations is moving forward with exploitation, using the tools to offer new software development services, to enhance commercial code analysis offerings, and to provide advanced OSS metrics for the widely used Eclipse Foundation and OW2 forges for OSS. The research and development partners will continue to evolve the technologies through an open and transparent process that welcomes contributions from others interested in tools and technologies that enable greater use of OSS and improved development process performance for organisations that adopt OSS technologies for their products and services.

The CROSSMINER project has delivered an integrated platform for development of complex software systems that (1) enables monitoring, in-depth analysis and evidence-based selection of OSS components; and (2) facilitates knowledge extraction from large open source software repositories. The specific technological capabilities that have been developed within the CROSSMINER project are the following:
+ OSS analysis tools to extract and store actionable knowledge from a collection of OSS projects
+ Natural language analysis tools to extract quality metrics related to community communication channels and bug tracking systems of OSS projects
+ System configuration analysis tools to provide an integrated development and operational view of OSS projects
+ Workflow-based extractors that simplify creation of customised analysis and knowledge extraction from OSS projects
+ Cross-project analysis tools for understanding a wide range of OSS project relationships (e.g. dependencies and conflicts) based on developer-defined similarity measures
+ Advanced integrated development environments allowing developers to easily adopt the CROSSMINER knowledge base and analysis tools while receiving OSS project alerts, recommendations, and feedback to improve developer productivity
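
To make the idea of a developer-defined similarity measure more concrete, the following is a minimal sketch that ranks projects by how much their dependency sets overlap (Jaccard similarity). It is purely illustrative and is not the CROSSMINER implementation: the function names `jaccard_similarity` and `rank_similar`, the choice of Jaccard overlap, and the sample project data are all assumptions made for this example.

```python
from __future__ import annotations


def jaccard_similarity(deps_a: set[str], deps_b: set[str]) -> float:
    """Share of dependencies two projects have in common (0.0 to 1.0)."""
    union = deps_a | deps_b
    if not union:
        # Assumption: two projects without any dependencies count as not similar.
        return 0.0
    return len(deps_a & deps_b) / len(union)


def rank_similar(target: str, projects: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Rank all other projects by dependency overlap with the target project."""
    target_deps = projects[target]
    scores = [
        (name, jaccard_similarity(target_deps, deps))
        for name, deps in projects.items()
        if name != target
    ]
    # Highest similarity first.
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical projects and dependency names, for illustration only.
projects = {
    "app-a": {"junit", "guava", "slf4j"},
    "app-b": {"junit", "guava", "log4j"},
    "app-c": {"numpy"},
}
ranking = rank_similar("app-a", projects)
```

A developer could swap `jaccard_similarity` for any other measure with the same signature (for example, one that weights shared API usage) without changing the ranking logic.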

CROSSMINER has established the conditions that will enable the take-up and further enhancement of the code base by third-party developers, especially SMEs, including development of new applications to be commercialised. Each of the six industrial pilot partners has adopted the technologies in offering new software development services, new OSS analysis metrics, improved open source forge monitoring and assessments, and adaptive online user experiences when browsing OSS repositories. The project results will substantially impact Europe’s software development community and digital economy. Measured improvements include: 80% of developers indicated CROSSMINER technologies made it easier to work with open source projects and technologies, and over 85% indicated expected testing time reductions of at least 10% due to CROSSMINER. In addition, CROSSMINER technologies were able to provide a 90% reduction in the time needed to discover what changes to their product offer were needed when an important third-party open source component used for their product line was updated, and also provided a 75% reduction in the time needed to find out how obsolete or deprecated API elements (classes, methods, etc.) can be replaced. These quantified figures verify that overall use of the CROSSMINER technologies should achieve its goal of providing a 50% increase in the productivity of software developers and a 50% reduction in development effort, through the increased use of OSS projects and increased accuracy of project dependency mining to reduce errors in adopting and deploying OSS projects.