Big Data Biodiversity? The state of biodiversity research as a data-driven science.

Roughly one decade ago, the notion of data-intensive, or data-driven, science as a new paradigm for biodiversity research was first introduced. The data science approach has had a great impact in fields as different as astronomy, robotics and climate science. Furthermore, the concept of data management and publishing is well established in the biodiversity research community. The history of biodiversity informatics can be traced back to the 1970s, and large-scale digitization projects have been on the rise since the late 1990s. It therefore appears likely that biodiversity research should benefit greatly from data-driven approaches. This poster delivers an overview of how data is published and utilized for research into biodiversity. We review GBIF, the biggest open data portal focused on biodiversity, and how its data is used by researchers. Additionally, we identify the most common applications of novel analytical tools commonly associated with data-driven science (such as machine learning algorithms). These tools are still only rarely applied directly to answering biodiversity questions. We argue that this is at least in part a result of the kind of data available and the way it is published. Thirdly, we take the point of view of a decision maker in the field of energy production. Managers of power plants have an interest in monitoring their on-site biodiversity and in predicting the effect their actions have on the local wildlife. For these applications, currently available open data does not cover a wide enough spatial and temporal range. Combining primary biodiversity data with secondary data sources (such as remote sensing) via modern data analytics tools could help to fill the gap in data at least partially. Furthermore, concerted efforts to create worldwide biodiversity observation networks that publish regularly updated data sets of biodiversity indicators need to be established and funded.


Although biodiversity research has a long tradition of collecting, publishing and analysing data, there are currently no applications of what one might call a "data-driven" approach to biodiversity. We argue that data-driven methods will be increasingly important for establishing temporally and spatially extensive monitoring schemes. Such monitoring schemes will, however, be necessary for planners and decision makers to consider the effect their actions have on global biodiversity.

