DNA-derived Occurrence records should be tabular and shared as a text file. The preferred format is a CSV file but any other delimited text file is also acceptable (e.g. TSV). If your records are held in a spreadsheet we can accept XLS or XLSX files. We do not accept records held in any other formats (e.g. FASTA or FASTQ).

We do not accept unformatted records or raw sequencing output. The NBN Atlas follows the Darwin Core standard for sharing and storing biodiversity data; we also support a selection of terms from the DNA Derived Data Darwin Core Extension. All required fields in the NBN Atlas DNA-derived Occurrence Record Upload Template must be populated. The template only includes our required and desirable terms. Any terms listed below and under the additional Darwin Core terms heading can be included as a field in your data. These terms have been tested on the NBN Atlas and confirmed to process correctly.

Any Darwin Core term can be included as a field within your data, but we cannot guarantee terms that are not listed here will process correctly. The Darwin Core Quick Reference Guide is an easy-to-read reference of the currently (as of 2023-09-18) recommended terms maintained as part of the Darwin Core standard. Terms must be formatted and spelled exactly as provided here.
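
Because term names must match exactly, it can help to check a file's header row before sending it. The following is a minimal Python sketch, not an NBN Atlas tool. The names `REQUIRED_TERMS` and `missing_required_terms` are illustrative, and the list only covers terms this guide marks as unconditionally Required; conditionally required terms (for example gridReference or env_medium) are left out.

```python
import csv
import io

# Terms marked "Required" in this guide (conditionally required terms are not included).
REQUIRED_TERMS = [
    "occurrenceID", "datasetName", "license", "rightsHolder", "taxonID",
    "scientificName", "taxonRank", "identificationVerificationStatus",
    "eventDate", "recordedBy", "basisOfRecord", "occurrenceStatus",
    "preparations", "samplingProtocol", "organismQuantity",
    "organismQuantityType", "DNA_sequence", "target_gene",
    "pcr_primer_reference",
]


def missing_required_terms(csv_text, delimiter=","):
    """Return the required terms absent from the header row (case-sensitive)."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])  # an empty file yields an empty header
    return [term for term in REQUIRED_TERMS if term not in header]
```

For a tab-delimited file, pass `delimiter="\t"`. The comparison is deliberately case-sensitive, so a column named `occurrenceid` is reported as missing.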

Please send your occurrence records and completed metadata form to data@nbn.org.uk.

 

Darwin Core Term Definition and Examples
occurrenceID
Required
Definition: The unique and persistent row key or ID of the record that can follow any format.

Examples:
2877
record_89
c8d77be

datasetName
Required
Definition: The title given to the data resource (this should match the name provided in the metadata form).

license
Required
Definition: A legally binding licence giving official permission to do something with the record.

One of:
CC0
CC-BY
CC-BY-NC
OGL

Additional Information:
See the Data licence help guide.
See the definition of Creative Commons licences.
See the Open Government Licence.

rightsHolder
Required
Definition: A person or organisation owning or managing rights over the resource, usually the name of the Data Partner.

taxonID
Required
Definition: The Taxon Version Key (TVK) for the taxon supplied in the scientificName field.

Examples:
NBNSYS0000004130
NHMSYS0000516660
BMSSYS0000001357

Additional Information: TVKs can be obtained from the UKSI Sandbox.

scientificName
Required
Definition: The full scientific name. This should be the name of the lowest level taxonomic rank that can be determined.

Examples:
Lepidoptera
Lycaenidae
Polyommatus
Polyommatus icarus

taxonRank
Required
Definition: The taxonomic rank of the taxon provided in the scientificName field.

One of:
species
genus
family

Additional Information: The NBN Atlas will only accept DNA records for family, genus, or species level classifications.

identificationVerificationStatus
Required
Definition: A categorical indicator of the extent to which the taxonomic identification has been verified to be correct.

One of:
Accepted - correct
Accepted - considered correct
Unconfirmed - plausible
Unconfirmed - not reviewed

Additional Information:
See the Verification Status Terms help guide for more information.
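
The "One of" lists given above for license, taxonRank, and identificationVerificationStatus can be checked mechanically. This is a hedged sketch only: `ALLOWED_VALUES` and `invalid_controlled_values` are illustrative names, and the check assumes values must match the listed spelling and case exactly, which this guide does not state explicitly.

```python
# Allowed values copied from the "One of" lists in this guide.
ALLOWED_VALUES = {
    "license": {"CC0", "CC-BY", "CC-BY-NC", "OGL"},
    "taxonRank": {"species", "genus", "family"},
    "identificationVerificationStatus": {
        "Accepted - correct",
        "Accepted - considered correct",
        "Unconfirmed - plausible",
        "Unconfirmed - not reviewed",
    },
}


def invalid_controlled_values(row):
    """Return {term: value} for each field whose value is not in its allowed list.

    `row` is a dict for one record, e.g. as produced by csv.DictReader.
    A missing field is reported with the value None.
    """
    return {
        term: row.get(term)
        for term, allowed in ALLOWED_VALUES.items()
        if row.get(term) not in allowed
    }
```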

eventDate
Required
Definition: The date or range during which the occurrence was recorded.

Dates or date ranges can be supplied in the following formats:

a complete date
YYYY-MM-DD (2008-09-04)
DD/MM/YYYY (04/09/2008)

partial dates
a month and year in YYYY-MM (2008-09) format.
a year in YYYY (2008) format.

date ranges
a complete date range in YYYY-MM-DD/YYYY-MM-DD (2008-01-01/2009-02-25) format (preferred).
a day range within the same month in YYYY-MM-DD/DD (2008-09-04/05) format.
a month range within the same year in YYYY-MM/MM (2008-09/10) format.
a year range in YYYY/YYYY (2008/2009) format.

The year, month, and day can also be supplied in numeric format in fields labelled “year”, “month”, and “day”. These values do not need padding with zeros, e.g. “7” not “07”.

Additional Information: Not suitable for a time in a geological context.
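
The eventDate formats listed above can be expressed as regular expressions. The sketch below is illustrative rather than the check the NBN Atlas itself runs; `looks_like_event_date` is a hypothetical helper that tests the shape of a value only, so it does not catch impossible dates such as 2008-13-45 or ranges that end before they start.

```python
import re

# One pattern per format listed in this guide (shape only, no calendar validation).
EVENT_DATE_PATTERNS = [
    r"\d{4}-\d{2}-\d{2}",                    # YYYY-MM-DD
    r"\d{2}/\d{2}/\d{4}",                    # DD/MM/YYYY
    r"\d{4}-\d{2}",                          # YYYY-MM
    r"\d{4}",                                # YYYY
    r"\d{4}-\d{2}-\d{2}/\d{4}-\d{2}-\d{2}",  # YYYY-MM-DD/YYYY-MM-DD
    r"\d{4}-\d{2}-\d{2}/\d{2}",              # YYYY-MM-DD/DD
    r"\d{4}-\d{2}/\d{2}",                    # YYYY-MM/MM
    r"\d{4}/\d{4}",                          # YYYY/YYYY
]
_EVENT_DATE_RE = re.compile("|".join(f"(?:{p})" for p in EVENT_DATE_PATTERNS))


def looks_like_event_date(value):
    """Return True if the value has the shape of one of the listed formats."""
    return _EVENT_DATE_RE.fullmatch(value.strip()) is not None
```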

recordedBy
Required
Definition: Names of people, groups, or organisations responsible for recording the original Occurrence.

Examples:
John Rodwell
Dennis, R. L. H. | Shreeve, T. G.
Volunteer
Not Provided

gridReference
Required if decimalLatitude and decimalLongitude are not supplied
Definition: Location of the record using the Ordnance Survey National Grid reference system. British (OSGB) and Irish (OSI) National Grid references are both accepted. Records from Northern Ireland should be supplied with an OSI grid reference or latitude/longitude coordinates.

Channel Islands National Grid references are not accepted; CI records should be supplied with latitude/longitude coordinates.

Examples:
SU122421
SP5405
NT22

Additional Information: Values should not include spaces or any other symbols. This term is not part of the Darwin Core standard.

decimalLatitude
Required if gridReference is not supplied
Must be used in conjunction with decimalLongitude
Definition: The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a Location.

Examples:
52.952516

decimalLongitude
Required if gridReference is not supplied
Must be used in conjunction with decimalLatitude
Definition: The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a Location.

Examples:
-1.144200

coordinateUncertaintyInMeters
Required if decimalLatitude and decimalLongitude are supplied
Definition: The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the location.

Additional Information: See the Grid and coordinate based records on the NBN Atlas for more information.
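
The location rules above are conditional: a record needs either a gridReference, or decimalLatitude and decimalLongitude together with coordinateUncertaintyInMeters. A minimal sketch of that logic follows. `location_problems` is a hypothetical helper, and it only checks which fields are filled in and that the grid reference contains no spaces or symbols; it does not validate the grid reference or coordinates themselves.

```python
def location_problems(row):
    """Return a list of problems with the location fields of one record (a dict)."""

    def filled(name):
        # Treat missing, None, and whitespace-only values as not supplied.
        return bool((row.get(name) or "").strip())

    has_grid = filled("gridReference")
    has_lat = filled("decimalLatitude")
    has_lon = filled("decimalLongitude")

    problems = []
    if has_lat != has_lon:
        problems.append("decimalLatitude and decimalLongitude must be supplied together")
    if not has_grid and not (has_lat and has_lon):
        problems.append("supply gridReference or both decimalLatitude and decimalLongitude")
    if has_lat and has_lon and not filled("coordinateUncertaintyInMeters"):
        problems.append("coordinateUncertaintyInMeters is required with coordinates")
    if has_grid and not row["gridReference"].isalnum():
        problems.append("gridReference should not contain spaces or symbols")
    return problems
```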

locality
Desirable
Definition: The specific description of the place or place name where the record was made.

basisOfRecord
Required
Definition: The nature of the record.

The default for DNA data is MaterialSample.

occurrenceStatus
Required
Definition: The presence of individuals in the occurrence.

The default for DNA data is present.

Additional Information: The NBN Atlas will only accept presence records for DNA-derived occurrence records.

preparations
Required
Definition: A list (concatenated and separated) of preparations and preservation methods.

The default for DNA data is DNA extract.

samplingProtocol
Required
Definition: The name of, reference to, or description of the method or protocol used during the event.

Examples:
5ml surface water sampling

organismQuantity
Required
Definition: A number or enumeration value for the quantity of organisms. The type of value is described by the organismQuantityType term.

Examples:
15346

organismQuantityType
Required
Definition: The type of quantification system used for the quantity of organisms. The value is quantified by the organismQuantity term.

Examples:
DNA sequence reads

associatedSequences
Desirable
Definition: A list (concatenated and separated) of identifiers (publication, global unique identifier, URI) of genetic sequence information associated with the record.

Examples:
https://www.ncbi.nlm.nih.gov/nuccore/U34853.1

DNA_sequence
Required
Definition: The DNA sequence.

Examples:
GTGGGTTTGGAGCACCGCCAAGTCCTTAGAGTTTTAAGCGTTTGTGCTCGTAGTTCTCAGGCGAATACTTTGGTGGGGAGAAGTATTTAGATTTAAGGCCAA

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

target_gene
Required
Definition: Targeted gene or marker name for marker-based studies.

Examples:
CO1

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

pcr_primer_reference
Required
Definition: Reference for the primers.

Examples:
https://doi.org/10.1186/1742-9994-10-34

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

env_medium
Required for eDNA data
Definition: The environmental medium which surrounded your sample or specimen prior to sampling.

Any of:
liquid water [ENVO:00002006]
fresh water [ENVO:00002011]
brackish water [ENVO:00002019]
seawater [ENVO:00002149]
estuarine water [ENVO:01000301]
contaminated water [ENVO:00002186]

air [ENVO:00002005]

organic material [ENVO:01000155]
wood [ENVO:00002040]
fecal material [ENVO:00002003]
biofilm material [ENVO:01000156]
planktonic material [ENVO:01000063]

sediment [ENVO:00002007]
clay [ENVO:01000015]
silt [ENVO:01000016]
sand [ENVO:01000017]
gravel [ENVO:01000018]
mud [ENVO:00002160]

soil [ENVO:00001998]
peat [ENVO:00002160]
compost [ENVO:00005785]
topsoil [ENVO:00005786]

or any subclasses of ENVO’s environmental material class: http://purl.obolibrary.org/obo/ENVO_00010483

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.
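
The values listed above all share one shape: a label followed by an ENVO identifier in square brackets. If you want to check or split such values in your own data, something along these lines may help. It is a sketch under assumptions: `split_envo_value` is a hypothetical helper, the pattern assumes the identifier is "ENVO:" followed by digits, and it does not confirm that the identifier exists in ENVO or matches the label.

```python
import re

# A label, one space, then an ENVO identifier in square brackets.
_ENVO_VALUE_RE = re.compile(r"(?P<label>.+?) \[(?P<envo_id>ENVO:\d+)\]")


def split_envo_value(value):
    """Return (label, envo_id), or None if the value does not have that shape."""
    match = _ENVO_VALUE_RE.fullmatch(value.strip())
    if match is None:
        return None
    return match.group("label"), match.group("envo_id")
```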

env_broad_scale
Required for eDNA data
Definition: The broad-scale environment the sample or specimen came from.

One of:
terrestrial biome [ENVO:00000446]
freshwater biome [ENVO:00000873]
estuarine biome [ENVO:01000020]
marine biome [ENVO:00000447]

or any subclasses of ENVO’s biome class: http://purl.obolibrary.org/obo/ENVO_00000428

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

otu_db
Desirable
Definition: The OTU database (i.e. sequences not generated as part of the current study) used to assign taxonomy to OTUs or ASVs.

Examples:
NCBI

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

otu_seq_comp_appr
Desirable
Definition: The OTU sequence comparison approach, such as tools and thresholds used to assign “species-level” names to OTUs or ASVs.

Examples:
blast version 2.12.0+

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

otu_class_appr
Desirable
Definition: The OTU classification approach / algorithm and clustering level (if relevant) when defining OTUs or ASVs.

Examples:
standard Linux tools

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

env_local_scale
Definition: The local environmental context the sample or specimen came from. Please use terms that are present in ENVO and which are of smaller spatial grain than your entry for env_broad_scale.

any subclasses of ENVO’s biome class: http://purl.obolibrary.org/obo/ENVO_00000428

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

target_subfragment
Definition: Name of subfragment of a gene or marker.

Examples:
V5

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

pcr_primer_name_forward
Definition: Name of the forward PCR primer that was used to amplify the sequence of the targeted gene, locus or subfragment.

Examples:
Riaz_12S_V5F

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

pcr_primer_forward
Definition: Forward PCR primer that was used to amplify the sequence of the targeted gene, locus or subfragment.

Examples:
TAGAACAGGCTCCTCTAG

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

pcr_primer_name_reverse
Definition: Name of the reverse PCR primer that was used to amplify the sequence of the targeted gene, locus or subfragment.

Examples:
Riaz_12S_V5R

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.

pcr_primer_reverse
Definition: Reverse PCR primer that was used to amplify the sequence of the targeted gene, locus or subfragment.

Examples:
TTAGATACCCCACTATGC

Additional Information: This term is part of the DNA Derived Data Darwin Core Extension.