Periodic Reporting for period 3 - DICE (Developing Data-Intensive Cloud Applications with Iterative Quality Enhancements)
Reporting period: 2017-02-01 to 2018-01-31
This shortcoming is significant, since the rush to hit the Big Data market with new products leads companies to pay less attention to quality aspects, even though these are critical to avoiding project failures. The issue is exacerbated by the fact that quality engineering of data-intensive software systems is still in its infancy, which makes it difficult today to analyze, predict and guarantee quality-of-service for this class of applications.
DICE is the first open source framework that offers a quality-aware methodology to develop and operate Big Data applications. With DICE, software vendors and developers can efficiently prototype new data-intensive applications at low cost, quickly creating business cases and proofs-of-concept for Big Data technologies within their organisations. DICE encompasses quality assessment, architecture enhancement, testing and agile delivery, relying on principles of the emerging DevOps paradigm. At a high level, the DICE paradigm aims at:
- Tackling skill shortages and learning curves in quality-driven development and Big Data technologies through open source development tools, models, and methods.
- Shortening the time to market for data-intensive applications that meet quality requirements, reducing costs for independent software vendors (ISVs) and increasing value for end-users.
- Reducing the number and severity of quality incidents by iteratively learning the quality levels of the application at runtime and feeding this information back to the design environment.
A free WikiBook is available that describes the DICE methodology in more detail (http://www.dice-h2020.eu/book/).
Open source and commercial versions of the DICE framework can be obtained from the project website (http://www.dice-h2020.eu/). Experimental assessments of the DICE methodology using these tools have been carried out against three industrial pilots involving:
- Stream-processing systems for social media data analysis
- Batch processing for tax fraud detection
- Cloud-based management of real-time port operations
Results indicate substantial productivity gains thanks to DICE, particularly a reduction in deployment and configuration time for Big Data platforms compared to manual approaches. The DICE framework is also able to identify several violations and anti-patterns in the application designs, and it consistently reduces the time spent on manual testing and system evaluation.
In particular, the open source release of the DICE framework is available free of charge and offers development and operations teams:
- An Eclipse-based IDE that implements the DICE DevOps methodology and guides the user step by step by means of cheat sheets
- A new UML profile to design data-intensive applications taking into account quality-of-service requirements and featuring privacy-by-design methods
- Quality analysis tools to simulate, verify, and optimize the application design and identify possible anti-patterns
- OASIS TOSCA-compliant deployment and orchestration on cloud VMs and containers
- Monitoring and anomaly detection tools based on the Elasticsearch-Logstash-Kibana stack
- Runtime methods for configuration optimization, testing and fault injection
- Native support for open-source Apache platforms such as Storm, Spark, Hadoop, and Cassandra
The DICE framework is also available in commercial versions focused on real-time applications and batch processing system development.
The DICE tools have been presented to, and are actively downloaded by, a diverse group of stakeholders. Videos illustrating the cross-cutting benefits of the solution for different needs and use case scenarios are available on the DICE YouTube channel (https://www.youtube.com/channel/UC1EcaiuK-7Ztbj5n8n4MeFQ), together with tutorials on the DICE blog (http://www.dice-h2020.eu/blog/) and regular announcements on the DICE Twitter feed (https://twitter.com/diceh2020).
In the design of data-intensive applications, existing software engineering approaches face a number of limitations, even if one considers only the basic specification of requirements. For example, with model-driven engineering (MDE) it is possible to express entity-relationship models, basic dependencies between components and data, field types and values, and data semantics. In the operation of a data-intensive application, DICE offers methods for deployment, monitoring, testing, configuration and anomaly detection for the aforementioned data-intensive technologies.