Knowledge Base

How to work with data

This section of our site is dedicated to data sets. Here you will find information on uploading and submitting data sets, and how to decide on the type of license that you want to use. This page also includes information on how we integrate data and how we handle sensitive data.

How we upload data sets

If you already have a collection of species sightings, you can submit them as a data set rather than one at a time through the Record a Sighting form. A data set needs to be in a structured format (spreadsheet or database), with an identifier for each record and the information about each sighting in individual columns. There are some example templates below. For more information on the data standard used by the ALA, the full Darwin Core schema can also be found below. When records are submitted as a data set, they are given an information page that describes the data and its terms of use and shows its usage via the ALA.
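As a rough illustration of the "identifier for each record" requirement, the sketch below checks a CSV export for missing or repeated identifiers before submission. It assumes the identifier column is named with the Darwin Core term `occurrenceID`; the function name `find_identifier_problems` and the sample rows are made up for this example and are not part of any ALA tool.

```python
import csv
import io


def find_identifier_problems(rows, id_column="occurrenceID"):
    """Return messages describing missing or duplicated record identifiers."""
    problems = []
    seen = {}  # identifier -> spreadsheet row where it first appeared
    # Row 1 is the header, so the first data row is row 2.
    for row_no, row in enumerate(rows, start=2):
        value = (row.get(id_column) or "").strip()
        if not value:
            problems.append(f"row {row_no}: missing {id_column}")
        elif value in seen:
            problems.append(
                f"row {row_no}: duplicate {id_column} {value!r} "
                f"(first seen on row {seen[value]})"
            )
        else:
            seen[value] = row_no
    return problems


# Made-up sample data: one duplicated and one missing identifier.
sample = io.StringIO(
    "occurrenceID,scientificName,eventDate\n"
    "rec-001,Dacelo novaeguineae,2014-03-02\n"
    "rec-001,Macropus giganteus,2014-03-05\n"
    ",Vombatus ursinus,2014-03-09\n"
)
problems = find_identifier_problems(csv.DictReader(sample))
for message in problems:
    print(message)
```

A check like this only catches structural problems in the file; it does not replace a trial load in the sandbox.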

View full Darwin Core schema

How to submit a data set

The first stage in submitting a data set is to produce a file suitable for loading into the ALA. If you manage your data in a spreadsheet, you can create a copy with Darwin Core column headings or copy your data into one of the templates below.
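One way to create a copy with Darwin Core column headings is to rename the columns of a CSV export with a small script. The sketch below is only an illustration: the source headings in `COLUMN_MAP` ("Record No", "Species", and so on) and the function `to_darwin_core` are hypothetical, while the target names are standard Darwin Core terms. Adjust the mapping to match your own spreadsheet.

```python
import csv
import io

# Hypothetical mapping from a custodian's own headings to Darwin Core terms.
COLUMN_MAP = {
    "Record No": "occurrenceID",
    "Species": "scientificName",
    "Date seen": "eventDate",
    "Lat": "decimalLatitude",
    "Long": "decimalLongitude",
    "Observer": "recordedBy",
}


def to_darwin_core(source, target, column_map=COLUMN_MAP):
    """Copy CSV rows from source to target, renaming mapped column headings.

    Columns without a mapping are copied unchanged, and their names are
    returned so they can be reviewed by hand.
    """
    reader = csv.DictReader(source)
    names = reader.fieldnames or []
    unmapped = [name for name in names if name not in column_map]
    fieldnames = [column_map.get(name, name) for name in names]
    writer = csv.DictWriter(target, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Skip the None key that DictReader uses for surplus cells in a row.
        writer.writerow(
            {column_map.get(k, k): v for k, v in row.items() if k is not None}
        )
    return unmapped


# Made-up sample export with one column ("Notes") that has no mapping.
source = io.StringIO(
    "Record No,Species,Date seen,Lat,Long,Notes\n"
    "rec-001,Dacelo novaeguineae,2014-03-02,-35.2763,149.1287,seen at dusk\n"
)
target = io.StringIO()
unmapped = to_darwin_core(source, target)
print(target.getvalue())
print("Columns left unchanged:", unmapped)
```

Returning the unmapped columns rather than dropping them keeps all of the original information in the file, so nothing is lost before the data is tested.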

If you manage your data in a database, you’ll need to produce and export a file. Once you have a suitably formatted file, you can use the sandbox (link below) to run a trial load of your data, without it being loaded into the main ALA, to test whether it displays correctly. The ALA Sandbox also provides an upload capability. The sandbox can be used to map and test data; when the data is ready for loading, it can be described and submitted for review before it is loaded into the ALA.

If you have any questions about submitting data to the ALA, please contact the ALA Data Management Team below.

Download a template:

Test data in the sandbox Contact Data Management

Decide on a license for your data

The Atlas of Living Australia offers an integrated set of biological observations to Australian and international researchers and the community. This requires each dataset to be licensed in a way that allows researchers and the community to reuse the data you submit.

Our preferred license is the Creative Commons Attribution license, which requires users of your data to attribute their use of it to you.

We also support the Creative Commons Zero license, by which you can dedicate your data to the public domain. This makes reuse of your data very simple, which matters when the Atlas holds thousands of datasets.

If you have commercial concerns about your data, the Creative Commons Attribution-NonCommercial license withholds permission to use your data for commercial purposes unless you provide additional permission.

We do not accept data under any of the Creative Commons licenses that include the “No Derivatives” term, as a key purpose of the Atlas is to facilitate the re-use of data and the creation of derivative products.

How we integrate data

Integrating or aggregating data is the process of bringing multiple, disparate datasets together and combining them into a single data structure. Combining and standardising the different data sets allows them to be searched as a single unit using common terms. The ALA brings together hundreds of data sets and makes them available through a common interface. The ALA uses Darwin Core, an internationally developed biodiversity data interchange standard, as its core data model and point of standardisation.

How we do it

Data sets submitted for loading into the ALA are reviewed by the Data Management Team for the type of content and the ability to be mapped to Darwin Core. Then, in communication with the owner or custodian, the data is processed into a format suitable for loading using data transformation software. If there is going to be an ongoing load schedule, the processing is saved for future repeat use.

Contact the Data Management Team below if you have questions about the ALA data integration process.

Submit a data set Contact Data Management

How we handle sensitive data

A small proportion of species in the ALA are considered sensitive. There are many reasons a species might be considered sensitive, such as being highly endangered or at risk of collection or disturbance. The data associated with these species is also considered sensitive, because individuals of the species might be put at risk if their exact location were known. To protect these species, the ALA has a sensitive data service that reduces the accuracy of location data in sensitive species records.

How we do it

Each state and territory supplies the ALA with a list of sensitive species in their jurisdiction and the rules for processing records of those species.

When a record of a species on a sensitive species list is uploaded via Record a Sighting, the location of the record is modified according to the relevant rules. The location may be completely withheld, or the accuracy of the location coordinates may be adjusted. The ALA’s data integration tool, Sandbox (http://sandbox.ala.org.au/datacheck/), also runs sensitive data processing software.
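To make the two outcomes described above concrete (withholding a location, or adjusting the accuracy of its coordinates), here is a minimal sketch. It is not the ALA's sensitive data service: the rule format, the species names in `RULES`, and the function `apply_sensitivity_rule` are all assumptions made for illustration, and rounding to fewer decimal places is just one simple way a coordinate could be made less accurate.

```python
# Hypothetical rules keyed by species name. Real rules are supplied by each
# state and territory and are not reproduced here.
RULES = {
    "Example species A": {"action": "generalise", "decimal_places": 1},
    "Example species B": {"action": "withhold"},
}


def apply_sensitivity_rule(species, latitude, longitude, rules=RULES):
    """Return the coordinates to publish for a record, or None if withheld.

    Species with no rule are returned unchanged.
    """
    rule = rules.get(species)
    if rule is None:
        return (latitude, longitude)
    if rule["action"] == "withhold":
        return None
    # Generalise by rounding both coordinates to fewer decimal places.
    places = rule["decimal_places"]
    return (round(latitude, places), round(longitude, places))


print(apply_sensitivity_rule("Example species A", -35.27634, 149.12873))
print(apply_sensitivity_rule("Example species B", -35.27634, 149.12873))
print(apply_sensitivity_rule("Unlisted species", -35.27634, 149.12873))
```

Keeping the rules in a lookup table separate from the code mirrors the idea that each jurisdiction supplies its own list and rules, so they can change without touching the processing logic.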

To discuss the inclusion or removal of species on a sensitive species list, please contact the list owner specified on the sensitive species lists.

Contact Data Management View the Sensitive Species Lists