Streamlining data cleaning and modelling using BCCVL and ALA

  •  December 7, 2016
  •  Tags:  Tools & Apps

The Biodiversity and Climate Change Virtual Laboratory (BCCVL) is a ‘one stop modelling shop’ that simplifies the process of biodiversity-climate change modelling. The ALA is a collaborative open infrastructure that pulls together biodiversity data from multiple sources. This week we have launched a collaborative project to streamline processes for people using ALA data and BCCVL modelling tools.

There are currently more than 60 million occurrence records in the ALA, based on specimens, field observations and surveys. Through the Spatial Portal, the ALA also provides powerful mapping and analysis tools that let users explore the information in new ways. For example, users can pull in species occurrence records and clean them based on a number of factors, such as location uncertainty. Once the records are cleaned, users can also carry out preliminary data investigation, for example with histograms and scatter plots, to better understand their data before using it in models.
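To make the idea of cleaning on location uncertainty concrete, here is a minimal Python sketch. It is not the Spatial Portal's implementation. It assumes occurrence records have already been exported as dictionaries keyed by the Darwin Core terms `decimalLatitude`, `decimalLongitude` and `coordinateUncertaintyInMeters`. The sample records, the 1000 m threshold and the `clean_occurrences` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical sample records (made-up values); keys follow Darwin Core terms.
records = [
    {"scientificName": "Phascolarctos cinereus", "decimalLatitude": -27.47,
     "decimalLongitude": 153.02, "coordinateUncertaintyInMeters": 50.0},
    {"scientificName": "Phascolarctos cinereus", "decimalLatitude": -33.87,
     "decimalLongitude": 151.21, "coordinateUncertaintyInMeters": 25000.0},
    {"scientificName": "Phascolarctos cinereus", "decimalLatitude": None,
     "decimalLongitude": 147.30, "coordinateUncertaintyInMeters": 10.0},
    {"scientificName": "Phascolarctos cinereus", "decimalLatitude": -37.81,
     "decimalLongitude": 144.96, "coordinateUncertaintyInMeters": None},
]


def clean_occurrences(records, max_uncertainty_m=1000.0, keep_unknown=False):
    """Return records with valid coordinates and acceptable location uncertainty.

    max_uncertainty_m is an assumed threshold; a suitable value depends on
    the resolution of the environmental layers used in the model.
    """
    cleaned = []
    for rec in records:
        lat = rec.get("decimalLatitude")
        lon = rec.get("decimalLongitude")
        # Drop records with missing or out-of-range coordinates.
        if lat is None or lon is None:
            continue
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue
        uncertainty = rec.get("coordinateUncertaintyInMeters")
        if uncertainty is None:
            # Uncertainty was not recorded: keep only if the caller opts in.
            if keep_unknown:
                cleaned.append(rec)
            continue
        if uncertainty <= max_uncertainty_m:
            cleaned.append(rec)
    return cleaned


cleaned = clean_occurrences(records)
print(f"Kept {len(cleaned)} of {len(records)} records")
```

Whether to keep records with no recorded uncertainty is a judgment call, so the sketch exposes it as the `keep_unknown` flag instead of deciding silently.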

BCCVL connects the biodiversity and climate change research community to Australia’s national computation infrastructure by integrating a suite of tools in a coherent online environment. The goal of the BCCVL is to integrate these tools and data sets with high-performance computers and major data storage facilities.

Currently the BCCVL offers its users live access to the raw ALA occurrence records. However, to clean this data users have to download it, clean it manually, and then re-upload it to the BCCVL. At the end of our collaborative project, users will be able to find, investigate and clean ALA occurrence records using the tools in the ALA and then, at the click of a button, push the data set directly to the BCCVL, ready to be used in models. Data can also flow the other way: a data set from the BCCVL can be cleaned, refined or augmented in the ALA and then returned to the BCCVL. This integration promotes good data practices and will encourage more robust model outputs. It also allows the two NCRIS facilities to focus on their core strengths: BCCVL on modelling, and ALA on data aggregation, exploration and visualisation.

The collaboration between BCCVL and ALA is ongoing. The BCCVL is supported by the National eResearch Collaboration Tools and Resources (NeCTAR) project, an initiative of the Australian Government being conducted as part of the Super Science Initiative and financed from the Education Investment Fund, Department of Industry, Innovation, Science, Research and Tertiary Education. The University of Melbourne is the lead agent for the delivery of the NeCTAR project and Griffith University is the sub-contractor. The ALA is funded by the Australian Government's National Collaborative Research Infrastructure Strategy (NCRIS) and hosted by CSIRO.

For more information on the BCCVL-ALA collaboration, visit ‘Fit for purpose’ data with the ALA.