SCALing by means of Ubiquitous Storage

Final Report Summary - SCALUS (SCALing by means of Ubiquitous Storage)

Storage research is gaining importance because of the tremendous demand for storage capacity and I/O performance. Over the past years, several trends have considerably changed the design of storage systems, from new storage media and the widespread use of storage area networks to grid and cloud storage concepts. Furthermore, to achieve cost efficiency, storage systems are increasingly assembled from commodity components. We are thus in the middle of an evolution towards a new storage architecture made of many decentralized commodity components with increased processing and communication capabilities, and new concepts are required to benefit from the resulting architectural opportunities.

The Marie Curie Initial Training Network (MCITN) "SCALing by means of Ubiquitous Storage (SCALUS)" aimed at advancing education, research, and development in this exciting area, with a focus on cluster, grid, and cloud storage. The vision of this MCITN has been to deliver the foundation for ubiquitous storage systems that can be scaled along arbitrary dimensions such as capacity, performance, distance, and security. Providing this ubiquitous storage has become a major requirement for today's IT systems, and leadership in this area can have a significant impact on European competitiveness in information technology.

To attain this leadership, it is necessary to invest in storage education and research and to bridge the current gap between local storage, cluster storage, grid storage, and cloud storage. The consortium moved in this direction by building the first interdisciplinary teaching and research network on storage issues. SCALUS consisted of top European institutes and companies in storage and cluster technology, forming a demanding but rewarding interdisciplinary environment for young researchers.

The research within this MCITN aimed at shaping the evolution towards active, scalable, reliable, and grid-enabled storage systems. Storage systems with these properties will be able to scale from small, local storage devices through cluster storage up to globally distributed storage, and will be based on unified concepts across these complexity levels. An example is the development of a local file system for Facebook-like demands, whose underlying concepts take into account that the file system could be scaled to HPC and cloud storage systems based on a single architecture. This MCITN therefore blurred the boundaries between the different complexity levels to enable remotely collaborating storage that behaves as similarly to local storage as possible.

The gap between the different complexity levels has been bridged by a joint effort of researchers from theoretical computer science, informatics, and electrical engineering, collaborating with experts from industry at different levels of abstraction. An example is the transfer of the theory of hypergraphs to cloud computing, which led to an algorithm that is able to calculate the optimal data placement in data clouds.
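The idea of modelling data placement with a hypergraph can be illustrated by a small sketch. It is not the algorithm developed in SCALUS, which this summary does not describe. It only assumes a common modelling choice: data items are vertices, each request is a hyperedge over the items it reads, and the cost of a placement is the number of storage nodes each request has to contact. The function names, the capacity constraint, and the sample data are hypothetical, and the exhaustive search is only feasible for toy instances.

```python
from itertools import product


def span_cost(placement, requests):
    """Sum, over all requests (hyperedges), of the number of distinct nodes touched."""
    return sum(len({placement[item] for item in req}) for req in requests)


def best_placement(items, requests, num_nodes, capacity):
    """Try every assignment of items to nodes and keep the cheapest valid one.

    Assumption: each node can hold at most `capacity` items.
    The search is exponential in the number of items (toy sizes only).
    """
    best, best_cost = None, None
    for assignment in product(range(num_nodes), repeat=len(items)):
        # Skip assignments that overload any node.
        if any(assignment.count(n) > capacity for n in range(num_nodes)):
            continue
        placement = dict(zip(items, assignment))
        cost = span_cost(placement, requests)
        if best_cost is None or cost < best_cost:
            best, best_cost = placement, cost
    return best, best_cost


# Hypothetical workload: four data items and three requests.
items = ["a", "b", "c", "d"]
requests = [{"a", "b"}, {"c", "d"}, {"a", "b", "c"}]
placement, cost = best_placement(items, requests, num_nodes=2, capacity=2)
print(placement, cost)
```

In this toy workload, items that are requested together end up on the same node, so only the request spanning three items has to contact both nodes.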

Storage technology has become a complex scientific research area that can only be pushed forward by close interdisciplinary cooperation. The consortium of this MCITN consisted of experts in algorithm development and analysis, autonomic management of storage environments, the development of storage protocols (including file systems and object-level protocols), and the design of storage architectures as well as hardware for storage systems. The program of this MCITN did not cover the development of storage media itself, but it did make use of the latest developments in disk technology, such as flash-based solid-state disks (SSDs), for example by designing and implementing new overwrite-compatible algorithms that help to increase the lifetime of SSDs for database-like applications.

The close interdisciplinary cooperation is reflected in our research methodology, which has been based on the concept of Algorithm Engineering. Algorithm Engineering is an iterative, cyclic process consisting of the design, analysis, implementation, and experimental evaluation of algorithms. Realistic models of computers and applications, together with algorithm libraries and collections of real input data, allow a close coupling to the applications, as has been done, for example, for MapReduce environments.

From the point of view of our early-stage researchers (ESRs), SCALUS offered bright prospects for their future. The skills taught in SCALUS are rarely found but nevertheless very important in Europe, and we have already seen competition for our ESRs after graduation.

The results of SCALUS and their potential impact have to be measured against the importance of storage research and the application of its results in European industry. The members and partners of the SCALUS network successfully increased their visibility worldwide, both through publications in the top storage conferences and journals and by serving as organizers of cluster, cloud, and storage conferences.

Technological leadership in the field of ubiquitous storage environments, as addressed in SCALUS, has a major impact on the competitiveness of the European IT industry and on European storage research. SCALUS members founded the exascale I/O initiative E10, which has built up a European community around results from SCALUS. A visible influence on European IT competitiveness can become a reality within a very short period of time in the innovative field of storage technology, where research ideas have often turned out to be commercially successful in the past. Examples range from the introduction of RAID technology in 1987, through network-attached storage and storage area network technology, to the introduction of storage management and storage virtualization technologies, all of which became multi-billion-euro businesses within a very few years.