WP2 – Information Extraction

This work package will be led by KAIST. Information landscapes are rapidly converging towards Big Data landscapes marked by increased variety, value and velocity. The aim of this work package is to address the variety of data that can be relevant to the user by transforming the data into a format of choice (RDF in our case), and thereby to release the value of the data by making it available to the end user. Because mobile devices cannot cope with the velocity of current data sources (e.g., Twitter streams), we will follow a hybrid approach in which rapid, high-quality offline extraction techniques developed within this WP process the high-velocity data sources. Moreover, given that semi-structured and structured data can be easily converted to RDF, we will focus mainly on extracting RDF from unstructured (i.e., textual) data.
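To illustrate why structured data is regarded as the easy case, the following minimal sketch maps a flat record to N-Triples lines. It is an illustration under stated assumptions, not the extraction pipeline of this WP: the `http://example.org/` namespace, the `PREDICATES` mapping and the function names are hypothetical, and every value is emitted as a plain string literal.

```python
# Hypothetical namespace and vocabulary, chosen for illustration only.
BASE = "http://example.org/"
VOCAB = BASE + "vocab#"

# Assumed mapping from record fields to predicate IRIs.
PREDICATES = {
    "name": VOCAB + "name",
    "city": VOCAB + "city",
}


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record_id: str, record: dict, predicates: dict) -> list:
    """Turn one structured record into a list of N-Triples lines.

    Fields without a mapped predicate are skipped.
    """
    subject = f"<{BASE}{record_id}>"
    lines = []
    for field, value in record.items():
        predicate = predicates.get(field)
        if predicate is None:
            continue
        literal = escape_literal(str(value))
        lines.append(f'{subject} <{predicate}> "{literal}" .')
    return lines


if __name__ == "__main__":
    contact = {"name": "Alice", "city": "Daejeon", "internal_id": 17}
    for line in record_to_ntriples("contact/1", contact, PREDICATES):
        print(line)
```

Unstructured text offers no such field-to-predicate mapping, which is why it is the main focus of this work package.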

Portions of the extracted data that are relevant to the user’s profile will be pushed periodically to the user’s mobile device. In addition, on-device processing will extract RDF from local data (including e-mail, contact information, address book entries, GPS sensor data and accelerometer data), so that local data can be combined with the external data pushed to the device. Finally, the on-device indexing mechanisms described in WP5 will facilitate time-efficient question answering based solely on the data available on the device.
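The data flow described above can be sketched as follows. This is a simplified illustration only: the `Triple` type, the representation of the user profile as a set of IRIs of interest, the relevance rule and the function names are all assumptions made for this sketch, and the WP5 indexing mechanisms are not modelled.

```python
from typing import NamedTuple, Set


class Triple(NamedTuple):
    """A simplified RDF triple with all three terms held as strings."""
    subject: str
    predicate: str
    obj: str


def select_for_profile(extracted: Set[Triple], interests: Set[str]) -> Set[Triple]:
    """Offline step: keep triples whose subject or object is of interest.

    Assumed relevance rule; the actual criterion is not defined here.
    """
    return {t for t in extracted if t.subject in interests or t.obj in interests}


def merge_on_device(local: Set[Triple], pushed: Set[Triple]) -> Set[Triple]:
    """On-device step: combine locally extracted and pushed triples."""
    return local | pushed


def objects_of(store: Set[Triple], subject: str, predicate: str) -> Set[str]:
    """Answer a simple lookup using only the data held on the device."""
    return {t.obj for t in store if t.subject == subject and t.predicate == predicate}


if __name__ == "__main__":
    extracted = {
        Triple("ex:Concert42", "ex:locatedIn", "ex:Daejeon"),
        Triple("ex:Match7", "ex:locatedIn", "ex:Leipzig"),
    }
    local = {Triple("ex:me", "ex:currentCity", "ex:Daejeon")}
    store = merge_on_device(local, select_for_profile(extracted, {"ex:Daejeon"}))
    print(objects_of(store, "ex:Concert42", "ex:locatedIn"))
```

Treating the store as a plain set keeps the merge a simple union; a real deployment would replace the linear scan in `objects_of` with an index.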