from 30 November 2017 to 1 December 2017
The Square Meeting Centre
Europe/Brussels timezone
Connecting the building blocks for Open Science

Contribution Posters


Diffusion Phenomena Simulation at SoBigData Research Infrastructure


Primary authors

  • Vittorio ROMANO


One of the most pressing and fascinating challenges today is understanding the complexity of the globally interconnected society we inhabit. Our desires, opinions and sentiments, together with the records of our mobile phone calls and GPS tracks, leave traces of our behaviours. In this perspective, data science is an interdisciplinary and pervasive paradigm that aims to turn data into knowledge and value. Data may be structured or unstructured, static or streaming. Knowledge and value are provided in the form of predictions, automated decisions and models learned from data. The exploitation of results has ethical, social, legal and business implications. Building on these observations, this presentation shows how the SoBigData Research Infrastructure (RI) serves a large community of data scientists. The first part introduces the main concepts of the RI, starting with the data catalogue, where the products of the RI are described and searchable. At the moment, the catalogue contains 47 datasets and 41 methods. We then show how the concept of virtual research environment (VRE) is used to support research. Currently, SoBigData provides a VRE called SoBigData Lab, where a user can run several methods. Finally, we introduce the concept of exploratory: a virtual thematic place related to a specific research line. An exploratory guides users in discovering data, methods and workflows for performing their own experiments. The number of exploratories is growing; the RI currently has 5, covering human mobility, well-being, societal debates, migration flows and sports. Each exploratory is supported by a specific VRE. Furthermore, the SoBigData RI hosts several services (e.g. M-Atlas and TAG-ME) for text and mobility mining. This [link][1] lists the VREs available inside the RI. The second part of the presentation illustrates SoBigData features through a specific application tailored to data scientists working on complex network analysis.
To this end, the social mining analytics capabilities offered by SoBigData include the NDlib framework, an ecosystem designed to provide analytic support for the simulation of diffusion phenomena. Several real-world phenomena can be modeled and studied as diffusion processes: virus spreading, word of mouth and the diffusion of innovations are only a few examples. NDlib offers its users two interfaces for studying diffusion: a developer interface, implemented as a Python library, which allows programmers to simulate the diffusion of viruses, ideas or behaviours over a social graph by selecting among a predefined set of models or by defining novel ones; and an analyst interface, a web-oriented visual platform that provides an abstraction over the NDlib library, allowing its users to set up and execute remote experiments and visualize simulation results. In conclusion, this presentation gives an overview of how SoBigData supports data science towards the new frontiers of big data exploitation, and provides an example of how a tool developed inside our RI can also be useful for training a new generation of data scientists. [1]:
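
To make the idea of simulating a diffusion process over a social graph more concrete, the following is a minimal sketch in plain Python of a discrete-time SIR (susceptible, infected, recovered) process. It does not use NDlib and does not reproduce its API. The function names `ring_graph` and `simulate_sir`, the parameters `beta` and `gamma`, and the ring-shaped toy graph are all assumptions chosen purely for illustration.

```python
import random


def ring_graph(n, k=2):
    """Toy 'social graph' (illustrative assumption): n nodes on a ring,
    each linked to its k nearest neighbours on each side."""
    graph = {i: set() for i in range(n)}
    for i in range(n):
        for offset in range(1, k + 1):
            j = (i + offset) % n
            graph[i].add(j)
            graph[j].add(i)
    return graph


def simulate_sir(graph, seeds, beta, gamma, steps, rng):
    """Run a synchronous, discrete-time SIR process.

    graph: dict mapping each node to the set of its neighbours
    seeds: nodes that start as infected
    beta:  probability that an infected node infects a given
           susceptible neighbour during one step
    gamma: probability that an infected node recovers during one step
    rng:   a random.Random instance (pass a seeded one for repeatability)

    Returns a list of (susceptible, infected, recovered) counts,
    starting with the initial state, followed by one entry per step.
    """
    status = {node: "S" for node in graph}
    for node in seeds:
        status[node] = "I"

    def counts():
        values = list(status.values())
        return (values.count("S"), values.count("I"), values.count("R"))

    trend = [counts()]
    for _ in range(steps):
        # Collect changes first so every decision in this step is based
        # on the state at the start of the step.
        updates = {}
        for node in sorted(graph):
            if status[node] != "I":
                continue
            for neighbour in sorted(graph[node]):
                if status[neighbour] == "S" and rng.random() < beta:
                    updates[neighbour] = "I"
            if rng.random() < gamma:
                updates[node] = "R"
        status.update(updates)
        trend.append(counts())
    return trend


if __name__ == "__main__":
    graph = ring_graph(50, k=2)
    trend = simulate_sir(graph, seeds=[0], beta=0.3, gamma=0.1,
                         steps=40, rng=random.Random(7))
    for step, (s, i, r) in enumerate(trend):
        print(step, s, i, r)
```

In this sketch, choosing a different diffusion model would mean swapping the update rule inside the loop, which loosely mirrors the idea of selecting among predefined models or defining new ones described above.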