Platform for supporting disaster management applications.

FONDEF IDeA code ID15I10560

This project proposes to build and promote a simple stream processing platform designed to process massive streams of events in real time, for applications that support management and assistance in natural disaster emergencies. Its key design goal is to provide applications with services such as parallel and distributed processing of data (e.g., text messages), while hiding the complex details of the communication and processing infrastructure from software developers.



Basal funds FB0001, Conicyt, Chile. Computer science researcher.

Dynamic Big Data Processing

Managing and processing Big Data is now a major problem for the scientific community, especially in computer science. Big Data appears in most industry sectors, such as manufacturing, natural resources, healthcare, and finance and insurance, and represents a major challenge. Big Data is also used in intelligent systems, scientific problems, entertainment, and sports. Several solutions exist for managing Big Data. Traditional ones are usually based on a cluster or grid architecture, and mainly use the well-known MapReduce algorithm designed by Google Inc. and deployed for its crawler. Other solutions are based on large-scale distributed systems such as Peer-to-Peer networks, and may also use the MapReduce paradigm. However, these solutions are likely to be short-lived in terms of the cost/benefit ratio. Managing and processing Dynamic Big Data, where new data is produced continuously, is much more complex. Static cluster- or grid-based solutions are prone to bottleneck problems, and are therefore ill-suited in this context. In this project we aim to design and implement a Reliable Large Scale Distributed Framework for the Management and Processing of Dynamic Big Data.
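
The MapReduce paradigm mentioned above can be illustrated with a minimal single-process sketch: a map phase emits key-value pairs and a reduce phase aggregates them by key. This is only an illustration of the programming model, not of any particular framework; in a real cluster deployment the pairs would be shuffled across machines between the two phases.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield word, 1

def reduce_phase(pairs):
    # Reduce: sum the counts emitted for each distinct key.
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

docs = ["big data big", "data stream"]
result = reduce_phase(map_phase(docs))
# result == {"big": 2, "data": 2, "stream": 1}
```

The same map/reduce decomposition is what lets a framework parallelize each phase independently, which is why the paragraph's cluster- and P2P-based systems can both reuse it.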

Automatic Stream Applications Deployment

This project aims to create an infrastructure to deploy stream applications in an easy manner. Stream Processing Engines (SPEs) are designed to handle online processing of large volumes of data streams in a scalable manner; however, their configuration and programming paradigm restrict their use to expert users. The proposed infrastructure aims to bring SPEs to all kinds of users and to extend their use to other applications.

Real-time Stream Processing

This project proposes research aimed at building software products for efficient and scalable processing of large event streams in real time. Feasible applications of this technology include domains as diverse as crisis management and finance. In many cases, these applications require efficient parallelization and distribution of multiple tasks to achieve response times on the order of a fraction of a second per event, a requirement driven by the ever-increasing availability of massive event sources such as message channels and geographically distributed sensors. Implementing these complex systems is very demanding in terms of software development. Consequently, this project takes advantage of a unique opportunity for research collaboration and international visibility around an open-source product called S4 (Simple Scalable Streaming System), which facilitates the development of such applications. Our project proposes to extend S4 to deliver better performance and to enhance it with additional software tools that represent a clear advantage over competing products. S4 is currently an incubation project at the Apache Software Foundation, a fact that directs monetization of the technology to the open-source industry under Apache licensing. This is a market of billions of dollars that has been growing steadily in recent years. Companies monetize through consulting, training, application development, and services with premium versions of open-source products and/or licensing of related software tools.
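
S4's programming model routes each event to a Processing Element (PE) instance keyed by an attribute of the event, so state for each key stays local to one PE. The following is a simplified, single-process sketch of that keyed-PE idea (the class names and the word-count task are illustrative, not S4's actual API; in S4 the routing happens across distributed nodes):

```python
class WordCountPE:
    # Hypothetical processing element: one instance per key,
    # holding the state (here, a counter) for that key alone.
    def __init__(self, key):
        self.key = key
        self.count = 0

    def process(self, event):
        self.count += 1

class Stream:
    # Routes each event to the PE instance responsible for its key,
    # creating the PE lazily on the first event for that key.
    def __init__(self, pe_class, key_fn):
        self.pes = {}
        self.pe_class = pe_class
        self.key_fn = key_fn

    def emit(self, event):
        key = self.key_fn(event)
        pe = self.pes.setdefault(key, self.pe_class(key))
        pe.process(event)

stream = Stream(WordCountPE, key_fn=lambda e: e["word"])
for w in ["quake", "alert", "quake"]:
    stream.emit({"word": w})
# stream.pes["quake"].count == 2
```

Because each key's state lives in exactly one PE, a runtime can place PE instances on different machines and parallelize the stream by key, which is the property that makes sub-second per-event latencies feasible at scale.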

Reputation and Energy aware Search for SupPOrting Natural Disasters

Upon the occurrence of a natural disaster in an urban area, a sudden rise is expected in the number of people making intensive use of mobile devices to get information. At the same time, a huge amount of information is expected to be generated from many different sources. In this project we aim to answer the following questions: How can we provide an infrastructure that is tolerant to disruptions and delays, considering the constraints of a network composed of mobile devices? How can we obtain and disseminate relevant and diverse information from these sources in an effective way, considering constraints such as temporary lack of communication, energy consumption, and the bounded capacity of nodes to store information?

E-mail Analysis Based on Human Behavior

E-mail has become a standard tool for communication between people. This project aims to create a software tool that allows users to classify their e-mail texts to obtain profiles describing the type of communication they have had with recipients over time. Two software tools will be developed to analyze the content of e-mails and classify them using machine learning techniques fed with training data generated with psychology and human-behavior techniques. As a result, tool prototypes will be built and tested for two market targets: one for the banking sector, where a specialized customer-service tool will be developed to assess the executive-client relationship; and another for general users, who can download a free plug-in for their personal e-mail accounts.
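
The kind of supervised text classification the project describes can be sketched with a minimal multinomial Naive Bayes classifier. The labels and training phrases below are purely illustrative stand-ins for the psychology-derived training data the project would use; this is a sketch of the technique, not the project's actual tool.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    # Minimal multinomial Naive Bayes with Laplace smoothing,
    # illustrating the train-then-classify workflow for e-mail text.
    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # per-label word frequencies
        self.label_counts = Counter(labels)      # class priors
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}

    def predict(self, text):
        def score(label):
            total = sum(self.word_counts[label].values())
            s = math.log(self.label_counts[label])
            for w in text.lower().split():
                # Add-one smoothing so unseen words do not zero out a class.
                s += math.log((self.word_counts[label][w] + 1)
                              / (total + len(self.vocab)))
            return s
        return max(self.label_counts, key=score)

clf = NaiveBayes()
clf.fit(["thanks for your help", "please fix this now", "great work thanks"],
        ["friendly", "demanding", "friendly"])
label = clf.predict("thanks again")
# label == "friendly"
```

Aggregating such per-message labels over time is what would yield the per-recipient communication profiles the project targets; production-grade tools would typically use a richer feature pipeline and an established ML library rather than this hand-rolled classifier.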

Real-Time and Scalable Web Observatory

The concept of observatories on the Web is generic and of interest to a wide variety of individuals and businesses. In Chile alone, over the past five years, dozens of observatories have been dedicated to discovering information relevant to a wide range of national affairs and communicating it to their users. These systems classify information manually and present content statically, and their content-search techniques are still based on keywords, which limits the usefulness of the information collected by the observatory. This project aims to improve the user experience of observatories by providing software tools that allow us to observe the dynamics of the Web, combining techniques from spatio-temporal databases with recent web-mining strategies such as topic detection and the discovery of emerging social communities, among others. The innovative component of the project lies in the projection in time and space (sources of publication and/or geographic locations) of the different views of the information detected on the Web, as well as the traces detected in the interactions of the users of national companies with applications that attract millions of users. Basically, the business model of the project has the following components: (a) a Web portal of “Observatories”, scalable to millions of users and configurable to multiple domains according to the needs of individual and corporate users, where funding comes via subscription payments and downloads of plug-ins for personal computers; and (b) an advanced, module-based system that can be installed in the data centers of business organizations interested in monitoring the Web and better understanding their users.