ARMADA INRIA Project, 2014 - 2016
Dynamic Big Data Processing
Managing and processing Big Data is now a major problem for the scientific community, and for computer science in particular. Big Data appears in most industry sectors, such as manufacturing, natural resources, healthcare, and finance and insurance, and represents a major challenge there. Big Data is also used in intelligent systems, scientific problems, entertainment, and sports. Several solutions for managing Big Data already exist. Traditional ones are usually based on a cluster or grid architecture, and mainly use the well-known MapReduce algorithm designed by Google Inc. and deployed for its crawler. Other solutions are based on large-scale distributed systems such as Peer-to-Peer networks, and may also use the MapReduce paradigm. However, these solutions are likely to be short-lived in terms of their cost/benefit ratio. Managing and processing Dynamic Big Data, where new data is produced continuously, is much more complex. Static cluster- or grid-based solutions are prone to bottlenecks, and are therefore ill-suited to this context. In this project we aim to design and implement a Reliable Large Scale Distributed Framework for the Management and Processing of Dynamic Big Data.