Microorganism data schema

Background

The microorganism data schema was developed by the Atlas of Living Australia (Atlas), in association with a Technical Working Group established by the Council of Heads of Australian Microorganism Collections (CHACM), to support the sharing of microorganism data. The schema is an extension to Darwin Core (DwC) specific to microorganisms.

Detailed information about Darwin Core is available at http://rs.tdwg.org/dwc/terms/index.htm, and more information about sharing data is available through the Atlas. The additional fields in the extension are described below.

Example shared data files:

Recommended minimum data

The recommended layout for organisations wishing to share the minimum data that meets all requirements identified by CHACM members is available in the document “Microorganism Shared Data Layout” (PDF, 481 KB). A given collection may not record every field, and local policy may prevent some fields from being published.

The recommended minimum data set is:

Field name Description Data Rules
institutionCode The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record. The value should also be represented in the AMRiN/ALA “Collectory” list. For example, “National Pathology Institute”, “FMNH”, “AKN-CLO”.
DWC term: http://rs.tdwg.org/dwc/terms/institutionCode
Mandatory free text
collectionCode The name, acronym, code, or initials identifying the collection or data set from which the record was derived. The value should also be represented in the AMRiN/ALA “Collectory” list. If not supplied, ALA/AMRiN will default the institutionCode value. For example, “Bacteria”, “ANBC”, “42”.
DWC term: http://rs.tdwg.org/dwc/terms/collectionCode
Free text
catalogNumber The strain number. The institution or collection’s unique internal reference number for the specimen. For example “CBS 14”.
MCL term: http://www.straininfo.net/ns/mcl/2.0/strainNumber
DWC term: http://rs.tdwg.org/dwc/terms/catalogNumber
Mandatory free text
scientificName The species name, consisting of the genus and species epithet and, if applicable, the subspecies epithet.
MCL term: http://www.straininfo.net/ns/mcl/2.0/speciesName
DWC term: http://rs.tdwg.org/dwc/terms/scientificName
Mandatory free text
basisOfRecord The specific nature of the data record. For example, “LivingSpecimen”.
DWC term: http://rs.tdwg.org/dwc/terms/basisOfRecord
Values: “LivingSpecimen”, “Taxon”, “Occurrence”, “PreservedSpecimen”, “FossilSpecimen”, “HumanObservation”, “MachineObservation”, null.
otherCatalogNumbers A list of other, equivalent strain numbers, separated by semicolons. For example, “ATCC 128;IFO 278;PYCC 937;SCHLEIN”.
MCL term: http://www.straininfo.net/ns/mcl/2.0/otherStrainNumbers
DWC term: http://rs.tdwg.org/dwc/terms/otherCatalogNumbers
Free text
eventDate Date when sample was taken. Preferred format is YYYY-MM-DDThh:mm:ss±hhmm. For example:

  • “2010-04-01T12:00:00+1000” – sample was taken April 1 2010 at noon AEST
  • “2010-04-01” – sample was taken April 1 2010.

MCL term: http://www.straininfo.net/ns/mcl/2.0/sampleDate
DWC term: http://rs.tdwg.org/dwc/terms/eventDate

Datetime ISO 8601
sampleOccurrenceType Origin of the sample. Location information for Clinical records should not be provided for display on maps for public users. Values are:

  • Clinical – the sample was taken from a human patient or subject
  • Environmental – the sample was taken from a biome
  • Veterinary – the sample was taken from livestock, pets or wild animals.
Values: “Clinical”, “Veterinary”, “Environmental”, null
sampleSource The source or host the sample was taken from. For example, “chicken”, “horse”, “human”, “mud”.
MCL term: http://www.straininfo.net/ns/mcl/2.0/sampleHabitat
Free text
sampleSpecificSite The specific site the sample was taken from. For example, “fruit”, “egg”, “dairy product”, “meat”, “blood”, “urine”, “mucus”.
MCL term: similar to http://www.straininfo.net/ns/mcl/2.0/sampleHabitat
Free text
country Country where sample was taken. For example, “Australia”.
MCL term: http://www.straininfo.net/ns/mcl/2.0/sampleLocationCountry
DWC term: http://rs.tdwg.org/dwc/terms/country
Value must be listed at ISO3166 (including ISO3166-3, formerly used country names).
stateProvince The name of the next smaller administrative region than country (state, province, canton, department, region, etc.) in which the Location occurs. For example, “ACT”, “NSW”, “NT”, “QLD”, “SA”, “TAS”, “VIC”, “WA”. Value should be listed at ISO3166-2 (Country subdivision code).
DWC term: http://rs.tdwg.org/dwc/terms/stateProvince
Free text
verbatimLatitude Latitude at which sample was taken. The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. Positive values are north of the Equator, negative values are south of it. Legal values lie between -90 and 90, inclusive.
DWC term: http://rs.tdwg.org/dwc/terms/verbatimLatitude
Number
verbatimLongitude Longitude at which sample was taken. The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. Positive values are east of the Greenwich Meridian, negative values are west of it. Legal values lie between -180 and 180, inclusive.
DWC term: http://rs.tdwg.org/dwc/terms/verbatimLongitude
Number
coordinateUncertaintyInMeters The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location. Leave the value empty if the uncertainty is unknown, cannot be estimated, or is not applicable (because there are no coordinates). Zero is not a valid value for this term. For example: “30” (reasonable lower limit of a GPS reading under good conditions if the actual precision was not recorded at the time), “71” (uncertainty for a UTM coordinate having 100 meter precision and a known spatial reference system).
DWC term: http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters
Number
coordinatePrecision A decimal representation of the precision of the coordinates given in the decimalLatitude and decimalLongitude. For example, “0.00001” (normal GPS limit for decimal degrees), “0.000278” (nearest second), “0.01667” (nearest minute), “1.0” (nearest degree).
DWC term: http://rs.tdwg.org/dwc/terms/coordinatePrecision
Number
geodeticDatum The ellipsoid, geodetic datum, or spatial reference system (SRS) upon which the geographic coordinates given in decimalLatitude and decimalLongitude are based. For example, “EPSG:4326”, “WGS84”, “NAD27”, “Campo Inchauspe”, “European 1950”.
DWC term: http://rs.tdwg.org/dwc/terms/geodeticDatum
Free text
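
The data rules in the table above lend themselves to an automated check before a file is shared. The following is a minimal sketch in Python, not part of any Atlas tooling: the function names `parse_event_date` and `check_minimum_record`, and the representation of a record as a dict keyed by field name, are assumptions made for illustration. It checks the three mandatory fields, defaults collectionCode to institutionCode, parses eventDate in either preferred form, splits otherCatalogNumbers on semicolons, and range-checks the coordinate fields.

```python
from datetime import datetime

# Fields marked "Mandatory" in the minimum data set table.
MANDATORY = ("institutionCode", "catalogNumber", "scientificName")


def is_blank(value):
    """True for None or an empty/whitespace-only string."""
    return value is None or str(value).strip() == ""


def parse_event_date(value):
    """Parse 'YYYY-MM-DDThh:mm:ss±hhmm' or a plain 'YYYY-MM-DD' date."""
    for fmt in ("%Y-%m-%dT%H:%M:%S%z", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised eventDate: {value!r}")


def check_minimum_record(record):
    """Return (normalised copy of the record, list of problems found)."""
    rec = dict(record)
    problems = []

    for field in MANDATORY:
        if is_blank(rec.get(field)):
            problems.append(f"{field} is mandatory")

    # collectionCode falls back to institutionCode when not supplied.
    if is_blank(rec.get("collectionCode")):
        rec["collectionCode"] = rec.get("institutionCode")

    if not is_blank(rec.get("eventDate")):
        try:
            parse_event_date(rec["eventDate"])
        except ValueError as err:
            problems.append(str(err))

    # otherCatalogNumbers is a semicolon-separated list of strain numbers.
    others = rec.get("otherCatalogNumbers") or ""
    rec["otherCatalogNumbers"] = [s.strip() for s in others.split(";") if s.strip()]

    # Legal ranges: latitude -90..90, longitude -180..180, inclusive.
    for field, limit in (("verbatimLatitude", 90), ("verbatimLongitude", 180)):
        value = rec.get(field)
        if not is_blank(value) and not -limit <= float(value) <= limit:
            problems.append(f"{field} must lie between -{limit} and {limit}")

    # Zero is not valid; rejecting negative values as well is an assumption.
    uncertainty = rec.get("coordinateUncertaintyInMeters")
    if not is_blank(uncertainty) and float(uncertainty) <= 0:
        problems.append("coordinateUncertaintyInMeters must be positive or empty")

    return rec, problems
```

Returning a list of problems rather than raising on the first failure lets a collection report every issue in a record at once; whether the Atlas ingestion process behaves this way is not stated in this document.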

Extension field descriptions

The Atlas accepts the following fields IN ADDITION to Darwin Core.

Field name Description Data Rules
history History of the strain/specimen, showing culture and deposit events. For example, “CBS -> IFO -> NITE”, which reads as the strain was isolated at CBS, deposited to IFO, then deposited to NITE.
MCL term: http://www.straininfo.net/ns/mcl/2.0/history
Free text
sampleOccurrenceType Origin of the sample. Location information for Clinical records should not be provided for display on maps for public users. Values are:

  • Clinical – the sample was taken from a human patient or subject
  • Environmental – the sample was taken from a biome
  • Veterinary – the sample was taken from livestock, pets or wild animals.
Values: “Clinical”, “Veterinary”, “Environmental”, null
sampleSource The source or host the sample was taken from. For example, “chicken”, “horse”, “human”, “mud”.
MCL term: http://www.straininfo.net/ns/mcl/2.0/sampleHabitat
Free text
sampleSpecificSite The specific site the sample was taken from. For example, “fruit”, “egg”, “dairy product”, “meat”, “blood”, “urine”, “mucus”.
MCL term: similar to http://www.straininfo.net/ns/mcl/2.0/sampleHabitat
Free text
biohazardLevel The CDC/EU Biosafety/Pathogen/Protection Level for the microorganism. For example, “1” (causes only mild disease to humans, or is difficult to contract via aerosol in a lab setting).
Values: “1”, “2”, “3”, “4”, null
mediumName Common name of culture medium. For example, “Nutrient Agar (Oxoid CM3)”
MCL term: http://www.straininfo.net/ns/mcl/2.0/mediumName
Free text
mediumDescription Full description (including list of ingredients) of culture medium preparation.
MCL term: http://www.straininfo.net/ns/mcl/2.0/mediumDescription
Free text
dateLastChecked Date and time the culture was last checked and confirmed viable. Preferred format is YYYY-MM-DDThh:mm:ss±hhmm. For example:

  • “2010-04-01T12:00:00+1000” – culture was checked April 1 2010 at noon AEST
  • “2010-04-01” – culture was checked April 1 2010.
Datetime ISO 8601
actualGrowthTemperature The temperature the specimen was grown at (degree Celsius, do not include unit).
Similar to MCL term: http://www.straininfo.net/ns/mcl/2.0/optimalGrowthTemperature
Number
minimalGrowthTemperature Minimal temperature (degree Celsius, do not include unit) necessary to observe growth on culture medium.
MCL term: http://www.straininfo.net/ns/mcl/2.0/minimalGrowthTemperature
Number
optimalGrowthTemperature Temperature (degree Celsius, do not include unit) at which optimal growth on culture medium can be observed.
MCL term: http://www.straininfo.net/ns/mcl/2.0/optimalGrowthTemperature
Number
maximalGrowthTemperature Maximal temperature (degree Celsius, do not include unit) at which growth on culture medium can be observed.
MCL term: http://www.straininfo.net/ns/mcl/2.0/maximalGrowthTemperature
Number
oxygenRelationship One of AE, MA, FAN, AT, MAT, AN:

  • AE aerobic (100% air)
  • MA microaerophilic (5% air)
  • FAN facultative aerobic
  • AT aerotolerant (prefers anaerobic conditions for good growth, but tolerates aerobic conditions)
  • MAT microaerotolerant (anaerobic bacteria tolerating microaerophilic conditions)
  • AN anaerobic

MCL term: http://www.straininfo.net/ns/mcl/2.0/oxygenRelationship

Values: “AE”, “MA”, “FAN”, “AT”, “MAT”, “AN”, null
catalogURL Intended to support requests or purchases of a specimen. Link to the online order form for the strain.
MCL term: http://www.straininfo.net/ns/mcl/2.0/catalogURL
URL
availabilityRemarks Intended to support requests or purchases of a specimen. Information about how to order the strain if there is no online order form. For example, “Available on request, email dr.hofstedter@myco.edu.au for details.”
Free text
geneticLocus Location of a gene or DNA sequence on a chromosome. For example: COX1, TEF, Beta tubulin, ITS, 16S or 18S, 23S or 26S or 28S.
Free text
sequenceOrigin Whether the sequence is of DNA, protein, or RNA.
Values: “DNA”, “Protein”, “RNA”, null
sequence Correctly edited final sequence.
Free text
sequenceForwardPrimer Name of the oligonucleotide used as a primer for the ‘plus’ strand during amplification, or URL for its description. For example, “Universal Forward 20mer 5′ GTTGTAAAACGACGGCCAGT 3′”
Free text
sequenceReversePrimer Name of the oligonucleotide used as a primer for the ‘minus’ strand during amplification, or URL for its description. For example, “Universal Reverse 20mer 5′ CACAGGAAACAGCTATGACC 3′”
Free text
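
Several extension fields take values from a fixed list, the history field has a simple arrow-separated structure, and Clinical records carry a rule about public map display. The following Python sketch illustrates how these might be handled; it is not an Atlas implementation. The function names, the dict-based record layout, and in particular the choice of which fields count as "location information" (`LOCATION_FIELDS`) are assumptions made for illustration.

```python
# Controlled vocabularies from the extension table. A null value is
# represented here by None (or a missing key) and is always accepted.
ALLOWED = {
    "sampleOccurrenceType": {"Clinical", "Veterinary", "Environmental"},
    "biohazardLevel": {"1", "2", "3", "4"},
    "oxygenRelationship": {"AE", "MA", "FAN", "AT", "MAT", "AN"},
    "sequenceOrigin": {"DNA", "Protein", "RNA"},
}

# Assumption: only the coordinate-related fields are withheld from maps.
LOCATION_FIELDS = (
    "verbatimLatitude",
    "verbatimLongitude",
    "coordinateUncertaintyInMeters",
    "coordinatePrecision",
    "geodeticDatum",
)


def vocabulary_problems(record):
    """List every controlled field whose value is outside its allowed set."""
    problems = []
    for field, allowed in ALLOWED.items():
        value = record.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}: {value!r} is not one of {sorted(allowed)}")
    return problems


def parse_history(history):
    """Split 'CBS -> IFO -> NITE' into the ordered list of holders."""
    return [step.strip() for step in history.split("->") if step.strip()]


def public_view(record):
    """Return a copy of the record with location fields dropped if Clinical."""
    if record.get("sampleOccurrenceType") != "Clinical":
        return dict(record)
    return {k: v for k, v in record.items() if k not in LOCATION_FIELDS}
```

For example, `parse_history("CBS -> IFO -> NITE")` yields the list `["CBS", "IFO", "NITE"]`, where the first entry is the isolating collection and each later entry is a subsequent deposit.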
