CORDIS - EU research results

Projection of Security Vulnerabilities caused by Exploits in Dependencies

Periodic Reporting for period 1 - ProSVED (Projection of Security Vulnerabilities caused by Exploits in Dependencies)

Reporting period: 2022-06-21 to 2024-06-20

ProSVED (Projection of Security Vulnerabilities caused by Exploits in Dependencies) aims to forecast software vulnerabilities originating from security exploits in third-party libraries. In modern software projects, the code that developers manage directly, for example by applying security patches, represents only a small portion of the entire codebase. The majority resides in external dependencies, which pose significant security risks to the entire project. These risks can be mitigated through strategic update policies, but identifying optimal policies requires solving a complex prognosis problem: detecting the critical vulnerabilities hidden among vast amounts of third-party code.

ProSVED introduces an innovative approach to address this challenge, facilitating the selection of the most effective update policies to minimize security risks from external code. This project advances the field of software security analysis beyond traditional empirical methods into the realm of formal risk modeling for prediction and mitigation.

Estimating the quantity and severity of security vulnerabilities in code is crucial for software quality and control. Empirical methods are limited to detection, and traditional formal approaches rely on assumptions that do not hold in this context, such as the independence of codebases. Statistical Model Checking is a relevant formal method, but it is often ineffective here because the events of interest are rare.
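
To give a sense of why rare events are a problem for simulation-based methods, the sketch below computes how many independent runs a crude Monte Carlo estimator would need to reach a given relative error. It uses the standard binomial variance formula and is not taken from the project; the function name and the example probabilities are illustrative assumptions.

```python
import math


def samples_for_relative_error(p: float, rel_err: float) -> int:
    """Runs needed so that the standard deviation of a crude Monte Carlo
    estimate of p is at most rel_err * p.

    The estimator's standard deviation is sqrt(p * (1 - p) / n), so the
    condition becomes n >= (1 - p) / (p * rel_err ** 2).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if rel_err <= 0.0:
        raise ValueError("rel_err must be positive")
    return math.ceil((1.0 - p) / (p * rel_err ** 2))


# Illustrative probabilities only (not project data).
for p in (1e-2, 1e-4, 1e-6):
    print(p, samples_for_relative_error(p, rel_err=0.1))
```

For an event with probability around one in a million and a 10% relative error target, the count is on the order of 10^8 runs, which illustrates why plain sampling becomes impractical as events get rarer.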

ProSVED presents a new formalism to accurately model the propagation of vulnerabilities from third-party libraries to the main codebase. This enables the development of a risk analysis theory to assess and optimize software-update policies for different codebases, significantly enhancing the ability to predict and mitigate security vulnerabilities.
### Work Performed and Main Achievements

**Development of Formal Models to Quantify the Probability of Emergence of Security Vulnerabilities:**

1. **Time Dependency Trees (TDTs):**
- Developed minimal directed acyclic graphs (DAGs) to represent the evolution of interdependent software codebases over time.

2. **Attack Trees (ATs):**
- Established a bijection between TDTs and ATs, allowing slices of TDTs to be represented as ATs.
- Developed algorithms for efficient metric computation on ATs, including the probability and time of attack/vulnerability.

3. **Probability Density Functions (PDFs):**
- Created time-series representations of library releases versus CVE publications.
- Estimated the probability of CVE publication as a function of time from the release of a library instance using empirical methods like Kernel Density Estimates.
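
As a rough illustration of metric computation on an attack tree, the sketch below propagates probabilities bottom-up through AND and OR gates. It is not the project's algorithm: it assumes a plain tree (no shared sub-nodes) and statistically independent leaves, which is a simplification, and the `Node` class, gate names, and numbers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    gate: str = "LEAF"      # one of "LEAF", "AND", "OR"
    prob: float = 0.0       # only read for leaves
    children: List["Node"] = field(default_factory=list)


def attack_probability(node: Node) -> float:
    """Probability that the node is reached, assuming independent leaves."""
    if node.gate == "LEAF":
        return node.prob
    child_probs = [attack_probability(child) for child in node.children]
    if node.gate == "AND":
        # All children must succeed.
        result = 1.0
        for p in child_probs:
            result *= p
        return result
    if node.gate == "OR":
        # At least one child succeeds: complement of "all children fail".
        all_fail = 1.0
        for p in child_probs:
            all_fail *= 1.0 - p
        return 1.0 - all_fail
    raise ValueError(f"unknown gate: {node.gate}")


# Hypothetical tree: the root is compromised through lib_a alone,
# or through lib_b and lib_c together.
root = Node("project", "OR", children=[
    Node("lib_a", prob=0.1),
    Node("lib_b_and_lib_c", "AND", children=[
        Node("lib_b", prob=0.5),
        Node("lib_c", prob=0.2),
    ]),
])
print(attack_probability(root))  # approximately 0.19
```

When sub-trees share nodes, as they can in a DAG, this simple recursion would double-count shared events, so a different evaluation strategy would be needed.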

**Embedding in Current Practices, Technologies, and Communities:**

1. **Applications in Development Environments:**
- Implemented models in Java/Maven and Python/PyPI development environments.

2. **Communications with Practitioners:**
- Engaged with developers and researchers to assess the usability of proposed models.
- Gathered feedback on desired functionalities and perspectives on the use and training of professionals.

3. **Connections to EC Policies:**
- Linked project findings and methodologies to European Commission policies on cybersecurity sovereignty and development.

**Communication and Dissemination to Academia and Industry:**

1. **Exercisable Experimental Reproduction Packages:**
- Published experimental reproduction packages alongside the main publications, so that third parties can reproduce and reuse the results in different environments.

2. **Invited Talks and Workshops:**
- Delivered presentations at industrial events such as the Vuln4Cast technical colloquium, Free Software Conference, and Privacy Symposium.
- Participated in research workshops and conferences, including sessions on formal models for security vulnerabilities in Smart Contracts, the Lorentz workshop, and the final dissemination event of ProSVED.
### Results Beyond the State of the Art

**Projection of Security Vulnerabilities caused by Exploits in Dependencies (ProSVED)** generates quantitative forecasts regarding the emergence of security vulnerabilities in third-party open-source code, which can propagate through software dependencies and pose threats to entire projects. The project has introduced several innovative theories and technologies:

**Time Dependency Trees (TDTs):**
- ProSVED has developed Time Dependency Trees, which enhance traditional dependency trees with minimized code-evolution representations. TDTs are directed acyclic graphs that depict the evolution of dependency trees over time. This advancement enables forecasting analyses that leverage historical vulnerability data, implementing robust time-series studies where nodes in the TDT DAG are labeled with quantitative estimates.
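
One way to picture a graph of dated releases and their dependencies is sketched below. This is only an illustrative data structure under stated assumptions, not the project's TDT definition (in particular, it performs no minimization): the `Release` and `TimeDependencyGraph` names, the library names, and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Release:
    library: str
    version: str
    released: date


class TimeDependencyGraph:
    """Directed graph from each release to the releases it depends on."""

    def __init__(self) -> None:
        self.depends_on: Dict[Release, List[Release]] = {}

    def add_release(self, release: Release,
                    dependencies: Iterable[Release] = ()) -> None:
        deps = list(dependencies)
        self.depends_on[release] = deps
        for dep in deps:
            # Register dependencies that were not added explicitly.
            self.depends_on.setdefault(dep, [])

    def transitive_dependencies(self, release: Release) -> Set[Release]:
        """All releases reachable from `release` via dependency edges."""
        seen: Set[Release] = set()
        stack = list(self.depends_on.get(release, []))
        while stack:
            dep = stack.pop()
            if dep not in seen:
                seen.add(dep)
                stack.extend(self.depends_on.get(dep, []))
        return seen

    def latest_before(self, library: str, when: date) -> Optional[Release]:
        """Most recent release of `library` published on or before `when`."""
        candidates = [r for r in self.depends_on
                      if r.library == library and r.released <= when]
        return max(candidates, key=lambda r: r.released, default=None)


# Hypothetical libraries and dates.
codec_1 = Release("codec", "1.0", date(2020, 1, 1))
parser_1 = Release("parser", "1.0", date(2020, 2, 1))
app_1 = Release("app", "1.0", date(2020, 3, 1))
app_2 = Release("app", "1.1", date(2021, 3, 1))

graph = TimeDependencyGraph()
graph.add_release(codec_1)
graph.add_release(parser_1, [codec_1])
graph.add_release(app_1, [parser_1])
graph.add_release(app_2, [parser_1])

print(graph.transitive_dependencies(app_1))
print(graph.latest_before("app", date(2020, 12, 31)))
```

Because every node carries a release date, quantitative labels such as a per-release vulnerability estimate could be attached to nodes and queried for any point in time.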

**Probability Density Functions (PDFs):**
- A key innovation of ProSVED is the forecasting of future vulnerabilities specific to individual codebases, distinct from global CVE databases such as those maintained by FIRST. The project proposes statistical fittings of time series to estimate the duration between the release of a library or code and the first publication of a CVE originating from that code. This approach results in Probability Density Functions (PDFs) that quantitatively assess the likelihood of a new CVE being published for the code under scrutiny.
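
The sketch below illustrates the general idea of turning observed release-to-CVE delays into a density, using a hand-written Gaussian kernel density estimate. It is a minimal example under stated assumptions rather than the project's fitting procedure: the delay values are invented, the bandwidth is picked by hand, and the helper names are hypothetical.

```python
import math
from typing import Callable, Sequence


def gaussian_kde(samples: Sequence[float],
                 bandwidth: float) -> Callable[[float], float]:
    """Return a density function built from Gaussian kernels centred on
    each sample. Note that Gaussian kernels also place some mass on
    negative times; a bounded kernel could be swapped in to avoid that."""
    if len(samples) == 0 or bandwidth <= 0.0:
        raise ValueError("need at least one sample and a positive bandwidth")
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(t: float) -> float:
        return norm * sum(
            math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in samples
        )

    return pdf


def probability_between(pdf: Callable[[float], float],
                        start: float, end: float, steps: int = 1000) -> float:
    """Integrate the density over [start, end] with the trapezoidal rule."""
    width = (end - start) / steps
    total = 0.5 * (pdf(start) + pdf(end))
    for i in range(1, steps):
        total += pdf(start + i * width)
    return total * width


# Invented delays, in days, between a release and its first CVE.
days_to_first_cve = [120, 200, 340, 365, 410, 700, 900]
pdf = gaussian_kde(days_to_first_cve, bandwidth=100.0)

# Estimated probability that the first CVE appears within the first year.
print(probability_between(pdf, 0.0, 365.0))
```

The bandwidth controls how smooth the estimated density is; in practice it would be chosen by a data-driven rule rather than fixed by hand.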

**Library Classification:**
- ProSVED addresses the challenge of viewing security vulnerabilities in individual libraries as rare events by aggregating libraries with similar characteristics from a security perspective. This classification approach is pioneering in the field, providing a protocol to identify the main functionalities that significantly influence the discovery of vulnerabilities. Such characterizations are crucial for enhancing the proactive management of software security risks.
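
The effect of aggregation can be illustrated with a small sketch that pools libraries by class before estimating how often a CVE appears. The class labels, library names, and records below are invented for illustration, and the function is a hypothetical helper, not part of the project's protocol.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def pooled_cve_rate(
    records: Iterable[Tuple[str, str, bool]]
) -> Dict[str, float]:
    """Fraction of libraries with at least one CVE, per class label.

    Each record is (library name, class label, had a CVE). A single
    library offers too few events to estimate anything; pooling by
    class gives each estimate more observations.
    """
    totals: Dict[str, int] = defaultdict(int)
    with_cve: Dict[str, int] = defaultdict(int)
    for _library, label, had_cve in records:
        totals[label] += 1
        if had_cve:
            with_cve[label] += 1
    return {label: with_cve[label] / totals[label] for label in totals}


# Invented records: (library, class, had a CVE).
records = [
    ("http-client", "network-facing", True),
    ("web-server", "network-facing", True),
    ("tls-wrapper", "network-facing", False),
    ("rpc-stub", "network-facing", False),
    ("string-utils", "local-only", False),
    ("date-format", "local-only", True),
    ("collections-x", "local-only", False),
    ("math-helpers", "local-only", False),
]
print(pooled_cve_rate(records))
```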

Together, these advances improve the understanding and prediction of security vulnerabilities in software ecosystems. By combining time-series forecasting, dependency modeling, and library classification, ProSVED strengthens the ability to preemptively mitigate risks posed by third-party dependencies in software development.

**Figures:**
- Time Dependency Tree example from the Maven library jira-core
- Probability of vulnerabilities in functions with high exposure to the Internet
- Probability of vulnerability as a function of time
- Probability of vulnerabilities in libraries with little exposure to the Internet
- From Time Dependency Tree to Attack Tree
- Time Dependency Trees (TDTs) are directed acyclic graphs that mode…
- Developers' view and hindsight of vulnerabilities in code development chains