Optimising cloud computing
Computer users are increasingly faced with the challenge of storing vast amounts of data. Larger hard drives meet some of these needs, but there is a growing trend towards saving data on off-site storage systems. Within just a few years, companies have switched from in-house hardware to such third-party cloud services. The advent of cloud infrastructures has also made it feasible to analyse massive data sets with parallel processing integrated into the new virtual environment. The 'Cloud-based indexing and query processing' (CLOUDIX) project adopted MapReduce to process and generate large data sets, and the cutting-edge research work conducted during the two-year project significantly increased the performance of MapReduce.

MapReduce is a programming model widely used for special-purpose computations involving large amounts of data, such as web request logs. It is also used to derive various kinds of data, including inverted indices. A "map" function is applied to each logical "record" to compute a set of intermediate key/value pairs. A "reduce" function then merges all intermediate values that share the same key, combining the derived data appropriately (a toy sketch of the model appears below).

The CLOUDIX researchers provided mechanisms for accessing only a subset of the input data, instead of scanning all of it, while still producing the same result. Specifically, advanced algorithms support early termination of data processing once sufficient data for producing the correct result has been accessed (the general principle is illustrated below). Decisive first steps have also been made towards integrating efficient ranking techniques that sort results according to their relevance.

During the CLOUDIX project, different approaches were combined to address the shortcomings of MapReduce, the most prominent framework for parallel query processing in the cloud, while preserving its merits: scalability, fault tolerance, load balancing and, most importantly, simplicity. The CLOUDIX results, published in peer-reviewed scientific journals, are expected to help scientists and professionals save working hours when analysing large data sets.
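To make the MapReduce model described above concrete, the following minimal sketch builds a toy inverted index in plain Python. The map function emits an intermediate (word, document) pair per word occurrence, a shuffle step groups pairs by key, and the reduce function merges each group into a posting list. The corpus, function names and single-process driver are illustrative stand-ins; a real deployment distributes the same phases across many machines.

```python
from collections import defaultdict

# Toy corpus: each record is (document id, text). In the real model this
# input would be partitioned across many machines; a list stands in here.
RECORDS = [
    ("doc1", "cloud data processing"),
    ("doc2", "parallel data analysis"),
    ("doc3", "cloud query processing"),
]

def map_fn(doc_id, text):
    """Map: emit an intermediate (key, value) pair per word occurrence."""
    for word in text.split():
        yield word, doc_id

def reduce_fn(word, doc_ids):
    """Reduce: merge all values sharing the same key into a posting list."""
    return word, sorted(set(doc_ids))

def map_reduce(records, mapper, reducer):
    groups = defaultdict(list)          # shuffle: group values by key
    for doc_id, text in records:
        for key, value in mapper(doc_id, text):
            groups[key].append(value)
    return dict(reducer(k, vs) for k, vs in groups.items())

index = map_reduce(RECORDS, map_fn, reduce_fn)
print(index["cloud"])   # ['doc1', 'doc3']
print(index["data"])    # ['doc1', 'doc2']
```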
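CLOUDIX's exact early-termination algorithms are detailed in the project's publications; as an illustration of the general principle only, the sketch below uses the classic threshold algorithm of Fagin et al. over two score-sorted lists. Scanning stops as soon as the k best aggregate scores seen so far provably cannot be beaten by any unseen item, so only part of the input is ever read. The lists and scores are invented for the example.

```python
import heapq

# Invented example data: two attribute indices, each a list of
# (score, item) pairs sorted in descending score order.
LIST_A = [(9, "x"), (8, "y"), (3, "z"), (1, "w")]
LIST_B = [(7, "y"), (6, "w"), (5, "x"), (2, "z")]

def threshold_top_k(list_a, list_b, k):
    """Return the k items with the highest summed score, reading as few
    rows of the sorted lists as correctness allows."""
    score_a = {item: s for s, item in list_a}  # stands in for random access
    score_b = {item: s for s, item in list_b}
    seen = {}                                  # item -> aggregate score
    rows_read = 0
    for (sa, item_a), (sb, item_b) in zip(list_a, list_b):
        rows_read += 2
        for item in (item_a, item_b):
            if item not in seen:
                seen[item] = score_a[item] + score_b[item]
        # No unseen item can beat the sum of the scores at the current
        # scan depth; once k seen items reach it, terminate early.
        threshold = sa + sb
        best = heapq.nlargest(k, seen.values())
        if len(best) == k and best[-1] >= threshold:
            break
    top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
    return top, rows_read

top, rows_read = threshold_top_k(LIST_A, LIST_B, k=2)
print(top)        # [('y', 15), ('x', 14)]
print(rows_read)  # 4 -- only half of the 8 rows were accessed
```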
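The source does not specify how ranking was integrated, but one common pattern is to rank inside the reduce step itself, emitting only the k most relevant results per key rather than every value. The sketch below assumes a hypothetical intermediate list of (document, relevance score) pairs for a single key.

```python
import heapq

# Hypothetical intermediate values for one reduce key (a query term):
# (document, relevance score) pairs as a map phase might emit them.
INTERMEDIATE = [("doc1", 0.42), ("doc2", 0.91), ("doc3", 0.17),
                ("doc4", 0.77), ("doc5", 0.63)]

def ranking_reduce(pairs, k):
    """Reduce step that emits only the k most relevant results for its
    key, ranked by score, instead of materialising every value."""
    return heapq.nlargest(k, pairs, key=lambda pair: pair[1])

print(ranking_reduce(INTERMEDIATE, k=3))
# [('doc2', 0.91), ('doc4', 0.77), ('doc5', 0.63)]
```

Because heapq.nlargest maintains a bounded heap, the reducer's memory footprint stays proportional to k regardless of how many values share the key.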