Data publishing

Most institutions publishing data to GBIF need to convert their data into a format suitable for GBIF to process, typically Darwin Core Archive.
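To make the target format concrete, the following is a minimal sketch of how a Darwin Core Archive could be assembled by hand. It assumes the simplest layout: one tab-separated occurrence file plus a meta.xml descriptor that maps each column to a Darwin Core term. The function name build_occurrence_archive and the sample record are illustrative only, and the dataset metadata document that a published dataset also needs is left out here.

```python
import io
import zipfile

DWC_TERMS = "http://rs.tdwg.org/dwc/terms/"


def build_occurrence_archive(columns, rows):
    """Return zip bytes containing occurrence.txt and a matching meta.xml."""
    # Data file: tab-separated with a single header line.
    lines = ["\t".join(columns)]
    lines += ["\t".join(str(value) for value in row) for row in rows]
    data_txt = "\n".join(lines) + "\n"

    # Descriptor: every column maps to a Darwin Core term;
    # column 0 doubles as the record identifier.
    fields = "\n".join(
        f'    <field index="{i}" term="{DWC_TERMS}{name}"/>'
        for i, name in enumerate(columns)
    )
    meta_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">\n'
        '  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"\n'
        '        fieldsEnclosedBy="" ignoreHeaderLines="1"\n'
        f'        rowType="{DWC_TERMS}Occurrence">\n'
        '    <files><location>occurrence.txt</location></files>\n'
        '    <id index="0"/>\n'
        f'{fields}\n'
        '  </core>\n'
        '</archive>\n'
    )

    # Write both files into an in-memory zip.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("occurrence.txt", data_txt)
        archive.writestr("meta.xml", meta_xml)
    return buffer.getvalue()


# Illustrative record; the identifier and values are made up.
archive_bytes = build_occurrence_archive(
    ["occurrenceID", "basisOfRecord", "scientificName", "eventDate"],
    [["urn:example:occ:1", "HumanObservation", "Puma concolor", "2020-05-17"]],
)
```

In practice, tools such as the IPT generate these files, so a hand-built archive like this is mainly useful for understanding what the tools produce.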

Tools including the GBIF IPT and BioCASe can convert data stored in spreadsheets and databases to the appropriate formats. The IPT is the most common way to publish data to GBIF.

Some institutional collections management systems, such as Symbiota or EarthCape, can export all or part of their data to GBIF.

Users or institutional systems (custom software) that can generate Darwin Core Archives and make them available on a web server have two options:

Further discussion of the options can be found in this blog post.

Dataset classes

Datasets can be published in four different classes: metadata-only, checklist, occurrence, and sampling event.

Generally, the data quality increases from metadata-only to sampling event datasets.

Data quality recommendations

You can familiarize yourself with the requirements for the various types of datasets here.

Tools to quality check your publication

Dataset validator

The dataset validator can be used to validate zipped Darwin Core Archive datasets.
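Before uploading an archive to the validator, a quick local check can catch packaging mistakes. The sketch below is not the GBIF validator and does not reproduce its checks; it only assumes that the archive uses a meta.xml descriptor and confirms that the zip opens, that meta.xml parses, and that every data file it lists is present. The function name precheck_archive is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"


def precheck_archive(zip_bytes):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    try:
        archive = zipfile.ZipFile(io.BytesIO(zip_bytes))
    except zipfile.BadZipFile:
        return ["not a readable zip file"]

    with archive:
        names = set(archive.namelist())
        if "meta.xml" not in names:
            return ["meta.xml is missing"]
        try:
            root = ET.fromstring(archive.read("meta.xml"))
        except ET.ParseError as error:
            return [f"meta.xml is not well-formed XML: {error}"]

        # The descriptor must declare a core data file.
        if root.find(f"{DWC_TEXT_NS}core") is None:
            problems.append("meta.xml has no <core> element")

        # Every file listed in the descriptor must exist in the zip.
        for location in root.iter(f"{DWC_TEXT_NS}location"):
            name = (location.text or "").strip()
            if name not in names:
                problems.append(f"data file listed in meta.xml not found: {name}")
    return problems
```

An empty result only means the package is structurally plausible; the dataset validator remains the place to check the content itself.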

Species matching

The species matching tool can be used to normalize species names from a CSV file against the GBIF backbone.

Name usage, search and parsing can be carried out with the species API.
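As an illustration of scripted access, the sketch below reads names from a CSV file and builds request URLs for the name matching and name parsing endpoints of the public GBIF API at api.gbif.org/v1. The column name scientificName and the helper names are assumptions made for this example. The fetch_json helper needs network access and is not called here.

```python
import csv
import io
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.gbif.org/v1"


def match_url(name):
    """URL that matches a name against the GBIF backbone."""
    return f"{API_BASE}/species/match?" + urllib.parse.urlencode({"name": name})


def parser_url(name):
    """URL that parses a scientific name string into its parts."""
    return f"{API_BASE}/parser/name?" + urllib.parse.urlencode({"name": name})


def names_from_csv(text, column="scientificName"):
    """Return the non-empty values of one column of CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (row.get(column) or "").strip()
        for row in reader
        if (row.get(column) or "").strip()
    ]


def fetch_json(url):
    """Fetch a URL and decode the JSON response (requires network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


# Illustrative input; in practice this would be read from a file.
sample_csv = "scientificName\nPuma concolor\nFelis catus\n"
urls = [match_url(name) for name in names_from_csv(sample_csv)]
```

Calling fetch_json on one of these URLs returns the match as a dictionary; which fields to keep depends on how the names are to be normalized.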

Flags and issues

When records are published to GBIF, they may receive various data quality flags and issues. The meaning of each issue, and how to deal with it, is documented for occurrence and checklist datasets.
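To see which issues matter most for a dataset, it can help to count how often each one occurs. The sketch below assumes records shaped like the results of the GBIF occurrence search API, where each record carries an issues list of flag names. The sample records and their keys are made up for illustration, and tally_issues is a hypothetical helper.

```python
from collections import Counter


def tally_issues(records):
    """Count how many records carry each issue flag."""
    counts = Counter()
    for record in records:
        # Records without an "issues" entry contribute nothing.
        counts.update(record.get("issues", []))
    return counts


# Illustrative records, not real GBIF data.
sample_records = [
    {"key": 1, "issues": ["COORDINATE_ROUNDED", "GEODETIC_DATUM_ASSUMED_WGS84"]},
    {"key": 2, "issues": ["COORDINATE_ROUNDED"]},
    {"key": 3, "issues": []},
    {"key": 4},
]

issue_counts = tally_issues(sample_records)
for issue, count in issue_counts.most_common():
    print(f"{issue}: {count}")
```

Sorting by frequency makes it easier to decide which issues to look up in the documentation first.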