Data Annotation Services

This is a DRAFT – please send us your comments and suggestions.

The ALA is a project to support the integration of data from a wide range of sources. This work can be significantly enhanced if users are able to store comments of various kinds on data items. Examples include:

  • Plain text annotations providing comments or proposed corrections for any data item
  • Structured annotations proposing corrections for data items with well-known structures and formats
  • Annotations providing links to other data items or vocabulary terms
  • Responses from data providers or other users to any annotation

The ALA has received additional funding in the period 2008-2010 from the NCRIS Platforms for Collaboration capability’s NeAT programme to assist in the development of its Metadata Repository and Data Annotation Services. See Data Integration and Annotation Services in Biodiversity.

The aim is to provide a common and consistent set of services for storing and accessing annotations, and to use these services to provide an environment within which observations, statements and assertions about any species can easily be captured and managed for future use. Each annotation will consist of a block of text or structured data and will reference the original data item via a globally unique identifier which reliably allows the item to be accessed again. Annotations will be stored in a central service and can be retrieved by user applications and tools (including web sites) by supplying the identifier for the original data item.
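
The model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Annotation` and `AnnotationStore`, the field names, and the example identifier are hypothetical, and the in-memory dictionary merely stands in for a central service. It does not describe the actual ALA service interface.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List
import uuid


@dataclass
class Annotation:
    """Hypothetical annotation: a body plus the identifier of the item it refers to."""
    target_id: str  # globally unique identifier of the annotated data item
    body: Any       # a block of text or a piece of structured data
    annotation_id: str = field(default_factory=lambda: f"urn:uuid:{uuid.uuid4()}")


class AnnotationStore:
    """In-memory stand-in for a central annotation service (an assumption)."""

    def __init__(self) -> None:
        self._by_target: Dict[str, List[Annotation]] = {}

    def add(self, annotation: Annotation) -> Annotation:
        # File the annotation under the identifier of the item it refers to.
        self._by_target.setdefault(annotation.target_id, []).append(annotation)
        return annotation

    def find(self, target_id: str) -> List[Annotation]:
        # Retrieve annotations by supplying the original item's identifier.
        return list(self._by_target.get(target_id, []))


store = AnnotationStore()
record_id = "urn:example:occurrence:12345"  # made-up identifier for illustration
store.add(Annotation(record_id, "Locality appears to be misspelled."))
store.add(Annotation(record_id, {"field": "locality", "proposed": "Canberra"}))
found = store.find(record_id)
```

Keying the store on the target identifier reflects the retrieval pattern in the text: a client application only needs the identifier of the original data item to fetch everything that has been said about it.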

Uses for such a service include:

  • Providing a consistent model for associating any comment with any data item and for managing threads of such comments
  • Allowing users (or automated data validation tools) to propose structured corrections to a data item which appears to be in error, and allowing other users (or the original data provider) to decide whether to keep the original values or to adopt the corrected values in their own analyses, etc.
  • Allowing users to store additional information about a particular species. For example, a user could store a small piece of structured data indicating that a particular species (identified via a globally unique identifier) feeds upon another species (also identified via a globally unique identifier). Similarly, a user could make a link between a species and properties in an ontology (e.g. to provide descriptive data for use by Online Identification Services).
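
The three uses listed above can be illustrated with one small Python sketch. Everything here is an assumption made for illustration: the `Annotation` fields, the `type`, `field`, `proposed`, `predicate` and `object` keys, the `feedsUpon` label, and all identifiers are hypothetical, not part of any published ALA format. A reply is modelled as an annotation whose target is another annotation, and a correction is applied only to a copy of the record, so each user decides whether to adopt it.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Set


@dataclass(frozen=True)
class Annotation:
    annotation_id: str
    target_id: str  # a data item, or another annotation when this is a reply
    body: Any       # plain text or structured data


def thread(annotations: List[Annotation], root_id: str) -> List[Annotation]:
    """Return every annotation on root_id, each followed by its replies."""
    children: Dict[str, List[Annotation]] = {}
    for a in annotations:
        children.setdefault(a.target_id, []).append(a)
    result: List[Annotation] = []
    stack = list(reversed(children.get(root_id, [])))
    while stack:
        a = stack.pop()
        result.append(a)
        stack.extend(reversed(children.get(a.annotation_id, [])))
    return result


def apply_corrections(record: Dict[str, Any], annotations: List[Annotation],
                      accepted: Set[str]) -> Dict[str, Any]:
    """Return a copy of record with only the accepted corrections applied."""
    corrected = dict(record)
    for a in annotations:
        is_correction = isinstance(a.body, dict) and a.body.get("type") == "correction"
        if is_correction and a.annotation_id in accepted:
            corrected[a.body["field"]] = a.body["proposed"]
    return corrected


record_id = "urn:example:occurrence:1"  # made-up identifiers throughout
record = {"locality": "Canbera", "year": 1987}
annotations = [
    # A structured correction proposed for the record.
    Annotation("a1", record_id,
               {"type": "correction", "field": "locality", "proposed": "Canberra"}),
    # A reply to that correction, forming a thread.
    Annotation("a2", "a1", "Agreed, the specimen label reads Canberra."),
    # A structured statement that one species feeds upon another.
    Annotation("a3", "urn:example:species:A",
               {"type": "relation", "predicate": "feedsUpon",
                "object": "urn:example:species:B"}),
]

discussion = thread(annotations, record_id)
adopted = apply_corrections(record, annotations, accepted={"a1"})
```

Because `apply_corrections` never modifies the original record, one user can analyse the corrected values while the data provider, or another user, continues to work with the values as originally published.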

Further reading

J. Hunter, I. Khan, A. Gerber, HarVANA – Harvesting Community Tags to Enrich Collection Metadata, Joint Conference on Digital Libraries, JCDL 2008. Pittsburgh, PA, USA, June 16 – 20, 2008.

Last modified: May 12, 2009 at 2:15 pm