Strain sequence data should be described using the following metadata fields in tab-separated format (for an example file, see metadata_table.template.tsv):

genome_ID | source | technology | NCBI_ID | NCBI_name | contact | other

If the value of an optional field is unknown, you may leave it empty. The mandatory fields are genome_ID, source, technology and contact. The fields have the following meanings:

  • 'genome_ID' specifies an identifier for a sequence sample from a particular strain; a sample may include multiple sequences.
  • 'source' specifies your sample (e.g. Arabidopsis thaliana root sample).
  • 'technology' specifies the sequencing technology used (e.g. Illumina paired end).
  • 'NCBI_ID' specifies an identifier for a strain in a reference taxonomy. This can also be an identifier at a higher taxonomic rank, if the strain is not represented in the taxonomy yet.
  • 'NCBI_name' specifies the respective name of the taxon in the reference taxonomy.
  • 'contact' specifies the email address of the owner of the contributed assembled isolate strain sequence sample.
  • 'other' is a field for comments.
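
The field rules above can be checked before submission with a short script. The following is only a minimal sketch in Python, not an official validator. It assumes that the first non-empty line of the file is a header row containing the seven field names in the order listed above, which you should confirm against metadata_table.template.tsv. The function name validate_metadata and the sample values are hypothetical.

```python
import csv
import io

# Field names in the order given in this document.
COLUMNS = ["genome_ID", "source", "technology", "NCBI_ID",
           "NCBI_name", "contact", "other"]
# Fields that must not be left empty.
MANDATORY = ["genome_ID", "source", "technology", "contact"]


def validate_metadata(text):
    """Return a list of problems found in tab-separated metadata text.

    Assumption: the first non-empty line is a header row with the
    field names in the documented order.
    """
    problems = []
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter="\t", quoting=csv.QUOTE_NONE)
    header_seen = False
    for line_no, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if not header_seen:
            header_seen = True
            if row != COLUMNS:
                problems.append(f"line {line_no}: unexpected header {row}")
                break
            continue
        if len(row) != len(COLUMNS):
            problems.append(
                f"line {line_no}: expected {len(COLUMNS)} fields, "
                f"found {len(row)}")
            continue
        record = dict(zip(COLUMNS, row))
        for name in MANDATORY:
            if not record[name].strip():
                problems.append(
                    f"line {line_no}: mandatory field '{name}' is empty")
    if not header_seen:
        problems.append("no header line found")
    return problems


if __name__ == "__main__":
    # Hypothetical sample: optional fields NCBI_ID, NCBI_name and other are empty.
    sample = "\t".join(COLUMNS) + "\n" + "\t".join(
        ["G1", "Arabidopsis thaliana root sample", "Illumina paired end",
         "", "", "owner@example.org", ""]) + "\n"
    for problem in validate_metadata(sample) or ["no problems found"]:
        print(problem)
```

Optional fields are accepted when empty, but the tab that separates them must still be present so that every row has seven fields.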