Mr Daniel Garcia (IFCA-UC)
In our digital era, a growing number of providers are producing ever more data, and the volume is growing particularly fast in Environmental Sciences and Ecology. Because Earth systems are complex, addressing grand challenges such as global warming, biodiversity loss and extreme events requires the combined use of these heterogeneous, interdisciplinary data sources. One of the problems to be addressed in the coming years is ensuring freshwater availability. Eutrophication is a widespread problem that affects many reservoirs and lakes around the world and has a direct impact on water quality. It is associated with blooms of different algae species and is affected by several processes and parameters: water temperature, nutrients, solar radiation, etc. Predicting and preventing algae blooms therefore requires the use of different data sources.
The eXtreme-DataCloud (XDC) project, under the umbrella of the H2020 programme, aims to develop a scalable environment for data lifecycle management and computing, and to integrate different services and tools based on Cloud Computing resources to manage Big Data. Supported by XDC, one of the project's Use Cases, representing LifeWatch ERIC (the European Research Infrastructure for Ecosystems and Biodiversity), is integrating environmental data from heterogeneous sources such as satellites (NASA Landsat, ESA Sentinel), meteorological stations, in-situ instrumentation and Internet of Things sensors in order to feed hydrodynamic and water quality models in a Cloud Computing environment. Some of these data sources, such as satellite or meteorological data, are often open, but they need a complex integration process before they can be exploited. They also produce data in different formats, such as NetCDF4, HDF5 and CSV.
The goal of this use case is to automate the different stages of the data lifecycle in order to integrate these data sources and to model and simulate water environments such as a reservoir. To make these big data sources interoperable, metadata standards such as the Ecological Metadata Language will play a very important role in supporting FAIR (Findable, Accessible, Interoperable, Reusable) data production. Proper lifecycle management for big, heterogeneous data sources, integrated into a computing environment, will allow scientists to exploit these data resources in order to address these important challenges. The European Open Science Cloud will be the perfect environment to manage this data lifecycle in an integrated way, providing resources and tools to deal with these problems.
The poster will show the approach adopted to manage heterogeneous data sources to feed forecasting models.
Type of abstract: Poster
Mr Daniel Garcia (IFCA-UC) Dr Fernando Aguilar (CSIC) Prof. Jesus Marco de Lucas (CSIC) Dr María Castrillo (IFCA-UC)