The Global Genome Biodiversity Network: Leveraging DNA and Tissue collections globally Gabi Dröge, Katharine Barker, Walter G. Berendsohn Botanic Garden and Botanical Museum Berlin-Dahlem
Global Genome Biodiversity Network Network for non-human biobanks (e.g. DNA, tissue) Founded in 2011 Precursor project DNA Bank Network founded in 2007 providing virtual infrastructure General secretariat: Smithsonian Institution Technical secretariat: Botanic Garden and Botanical Museum Berlin-Dahlem http://www.ggbn.org
Goals: Data standard for sharing tissue and DNA information; Portal to make biobank sample data available; Institutional directory; Knowledge platform. (Speaker notes: Gabi will expand on progress toward accomplishing the first goal; here I will outline progress on the other four goals. Emphasize task forces responsible.)
Goals: Best practices related to management and stewardship of genomic samples; Recruit partners with different regional and taxonomic focus; Identify gaps in GGBN collections
Audiences Biorepositories (contributors) Organizations with living or preserved specimens (contributors) Researchers (users)
Today: 37 GGBN members worldwide
Developing Best Practices Recommendation for Biodiversity Biorepositories Collaboration with ISBER* April 2013, Submitted to ISBER To be included into next version of ISBER Best Practices GGBN Document Library Collaboration with ESBB** May 2016, beta-release for member use Knowledge platform for non-human biobanking * International Society for Biological and Environmental Repositories ** European, Middle Eastern and African Society for Biopreservation and Biobanking
Developing Best Practices Access and Benefit Sharing (ABS) compliance July 2015, Documentation available for member use Material Transfer Agreements Code of Conduct Statement of Use of Genomic Material Collaboration with CETAF* Provide trusted and transparent access to genomic samples for users and contributors through an ABS framework * Consortium of European Taxonomic Facilities
Business Model Will come into force 01/2016 Core and Associate members made a commitment to become financial or in-kind contributors Current funding by: German Research Foundation (DFG) SYNTHESYS (EC) In-kind contributions
Business Model: Pre-order samples in the Portal Order DNA or tissue samples through GGBN portal Download of sample information Request forwarded to GGBN member holding the sample(s) Institution responsible for all further steps Checking availability and loaning conditions Provide price offer to scientist Request signing of Material Transfer Agreement Shipping samples Citation guidelines for samples coming soon (available at ggbn.org)
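The hand-over described on this slide can be sketched as a simple ordered sequence of steps. This is only an illustration: the names `OrderStep` and `next_step` are hypothetical and do not come from the GGBN portal.

```python
from enum import Enum, auto


class OrderStep(Enum):
    """Steps after a pre-order, in the order listed on the slide (names are hypothetical)."""
    REQUEST_FORWARDED = auto()     # portal forwards the request to the holding member
    AVAILABILITY_CHECKED = auto()  # institution checks availability and loan conditions
    PRICE_OFFERED = auto()         # institution sends a price offer to the scientist
    MTA_SIGNED = auto()            # Material Transfer Agreement is signed
    SHIPPED = auto()               # samples are shipped


# Enum members iterate in definition order.
ORDER_SEQUENCE = list(OrderStep)


def next_step(current):
    """Return the step that follows `current`, or None once the samples are shipped."""
    idx = ORDER_SEQUENCE.index(current)
    return ORDER_SEQUENCE[idx + 1] if idx + 1 < len(ORDER_SEQUENCE) else None
```

Everything after `REQUEST_FORWARDED` is handled by the institution holding the sample, not by the portal.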
The GGBN Data Portal
Goal (December 2015): New Data Portal Expanded functionality Corporate Design Based on feedback, review and requirements of the GGBN community Public beta-release July 2015 http://data.ggbn.org/ggbn_new Final release December 2015
Data Portal Architecture: Primary goal Do not re-invent the wheel!
Data Portal Architecture Molecular analysis data Source material / specimens DNA & Tissue
Portal: Basic Architecture GBIF checklist bank, CITES IPT Provider Store raw and cleaned data Harvester (B-HIT)* Data Cleaning Index (MySQL) Provider Get full access to original record Create SOLR index External sources Query Login, User settings GGBN web service *Berlin Harvesting and Indexing Toolkit
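One possible reading of the diagram is a harvest, clean, index, query pipeline that keeps the original provider record next to the cleaned one. The sketch below is a toy in-memory version of that idea; the function names and the `scientificName` cleaning rule are assumptions for illustration, not B-HIT or portal code.

```python
def clean(record):
    """Return a cleaned copy of a record; the raw record is left untouched."""
    cleaned = dict(record)
    # Assumed cleaning rule: collapse repeated whitespace in the name.
    cleaned["scientificName"] = " ".join(record["scientificName"].split())
    return cleaned


def harvest(provider_records):
    """Store both the raw and the cleaned version of every harvested record."""
    return [{"raw": r, "cleaned": clean(r)} for r in provider_records]


def build_index(store):
    """Map lower-cased cleaned names to positions in the store (stand-in for a search index)."""
    index = {}
    for pos, entry in enumerate(store):
        key = entry["cleaned"]["scientificName"].lower()
        index.setdefault(key, []).append(pos)
    return index


def query(index, store, name):
    """Look up a name in the index and return the original provider records."""
    key = " ".join(name.split()).lower()
    return [store[pos]["raw"] for pos in index.get(key, [])]
```

Searching runs against the cleaned values, while results still give access to the original record as delivered by the provider.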
Portal: Taxonomic Backbone Goals: Query expansion: get synonyms and accepted names Keep names used by providers Sources: Certain datasets from GBIF checklist bank web service GBIF backbone, CoL, NCBI Prokaryotic Nomenclature up-to-date (PNU) web service EOL web service (under consideration) The Plant List web service (under consideration)
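Query expansion with retained provider names might look roughly like the sketch below. The backbone here is a hard-coded dictionary with placeholder names ("Aus bus" etc.); in the portal this information would come from the web services listed above, and all function and field names are hypothetical.

```python
# Hypothetical backbone: accepted name -> known synonyms (placeholder data only).
BACKBONE = {
    "Aus bus": ["Aus cus", "Xus bus"],
}


def expand_query(name, backbone=BACKBONE):
    """Return every name to search for: the accepted name plus its synonyms."""
    for accepted, synonyms in backbone.items():
        if name == accepted or name in synonyms:
            return {accepted, *synonyms}
    return {name}  # unknown to the backbone: search for the name as given


def search(records, name):
    """Match records by the name the provider used; provider names are never rewritten."""
    names = expand_query(name)
    return [r for r in records if r["providerName"] in names]
```

A search for a synonym then finds records stored under the accepted name and vice versa, while each record still shows the name its provider supplied.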
Portal: Aggregate data from multiple sources Explore Chenopodium ficifolium: 1908 specimens, 42 nucleotide sequences, taxon page, 3 DNA samples, 4 tissue samples. Getting live counts from other biodiversity portals for each record via web services
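Collecting live counts from several portals can be sketched as calling one fetch function per source and tolerating failures. The fetchers are passed in as plain callables so that no real web service API has to be assumed; `aggregate_counts` and the source labels are hypothetical.

```python
def aggregate_counts(taxon, sources):
    """Call each source's count function for `taxon`.

    `sources` maps a label to a callable returning an integer count.
    A source that raises is recorded as None instead of failing the whole page.
    """
    counts = {}
    for label, fetch in sources.items():
        try:
            counts[label] = fetch(taxon)
        except Exception:
            counts[label] = None  # portal unavailable: show "no count" for this source
    return counts
```

In a real deployment each callable would wrap a web service request with a timeout; that part is deliberately left out here.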
CITES @ GGBN Warning and request for CITES registration number when ordering the sample both for curator and user.
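The check behind this warning could be as simple as the sketch below. It assumes a set of CITES-listed taxon names is available; the function name, the message text and the placeholder taxon names are illustrative only.

```python
def cites_check(taxon, cites_listed, registration_number=None):
    """Return warnings to show to both the ordering user and the curator.

    `cites_listed` is an assumed set of CITES-listed taxon names.
    """
    warnings = []
    if taxon in cites_listed and not registration_number:
        warnings.append(
            f"{taxon} is CITES-listed: please provide a CITES registration number."
        )
    return warnings
```

An empty list means the order can proceed without a CITES prompt.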
Portal: Statistics Example: Samples from Caryophyllales Above: all records (DNA, tissue, specimens) Right: DNA / tissue samples
White Paper on Data Portal published Droege et al. 2014
GGBN Data Standard http://terms.tdwg.org/wiki/GGBN_Data_Standard Based on ABCDDNA Is meant to be used with ABCD or DwC -> all occurrence terms are excluded (geography, scientificname etc.) Include elements of other standards (e.g. MIxS, SPREC) Collaboration with GBIF, Genomic Standards Consortium, GenBank, EMBL, ESBB, TDWG, and others
GGBN Data Standards
GGBN Data Standard http://terms.tdwg.org/wiki/GGBN_Data_Standard Vocabulary for sample and sequencing data, ABS, loan information
GGBN Data Standard Implementation for ABCD and Darwin Core-Archive available Supported by IPT (v2.2) and BioCASe (v3.5.3) Tests performed by BGBM, NMNH, CSIRO, ZFMK, DSMZ Submission to TDWG as a standard in 12/2015 Submission to GSC* as a project in 08/2015 Support provided by BGBM White paper submitted * Genomic Standards Consortium
Use Case: Environmental Samples and DNA Environmental Samples already at GGBN (bird fecal samples) Works well with DwC-A and ABCD Environmental DNA at GGBN currently work in progress Challenges: Identification based on sequences Hundreds of taxa Proper search results and display genomic DNA (bird, plant) vs. environmental DNA (what bird has eaten) Users should find what they are looking for
Use Case: Environmental Samples and DNA Solution for ABCD: ABCDGGBN-Enviro http://data.ggbn.org/schemas/ggbn/Enviro/ABCDGGBN_Enviro.html GGBN extension @ Identification Sequences on identification level instead of on unit level
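The idea of attaching sequences to identifications rather than to the unit can be pictured with a small nested data structure. The class and field names below are hypothetical and much simpler than the actual ABCDGGBN-Enviro schema; the taxon names and accessions are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Identification:
    """One taxon found in the sample, with the sequences that support it."""
    taxon: str
    sequence_accessions: list = field(default_factory=list)


@dataclass
class Unit:
    """One environmental sample; it may carry many identifications."""
    unit_id: str
    identifications: list = field(default_factory=list)


def sequences_by_taxon(unit):
    """Return a mapping of taxon name to the sequences that identified it."""
    return {i.taxon: list(i.sequence_accessions) for i in unit.identifications}
```

Because each sequence hangs off one identification, a sample with hundreds of taxa still records which sequence supports which name.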
Use Case: Environmental Samples and DNA Solution for DwC-A: work in progress Current star schema structure does not allow 1:n:n relations Collaboration with GBIF and GSC/MIxS to find a solution
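The 1:n:n limitation can be shown with toy data. In a star schema every extension row links only to the core record, so identification rows and sequence rows of the same unit cannot be linked to each other. The row layouts below are simplified placeholders, not real Darwin Core Archive files.

```python
# One core record with two extensions; each extension row points only at the core.
core = [{"id": "U1"}]
identification_ext = [
    {"coreid": "U1", "taxon": "Taxon A"},
    {"coreid": "U1", "taxon": "Taxon B"},
]
sequence_ext = [
    {"coreid": "U1", "accession": "SEQ-1"},
    {"coreid": "U1", "accession": "SEQ-2"},
]


def sequences_for(core_id, taxon):
    """Best possible answer when `coreid` is the only join key.

    If the unit has the taxon at all, every sequence of the unit is returned,
    because nothing ties a sequence row to one particular identification row.
    """
    has_taxon = any(
        r["coreid"] == core_id and r["taxon"] == taxon for r in identification_ext
    )
    if not has_taxon:
        return []
    return [s["accession"] for s in sequence_ext if s["coreid"] == core_id]
```

Both taxa come back with the same two sequences, which is exactly the ambiguity an environmental DNA sample cannot tolerate.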
Join GGBN!
Vouchers, traceability, deposition Source: Droege et al. 2014 Every biodiversity biorepository is welcome to join GGBN. Researchers: deposit your samples and data in a GGBN collection if you don't have a DNA or tissue bank. GGBN provides a virtual and physical infrastructure to make your research traceable for the future.
Tracing back information go to http://data.ggbn.org to find available DNA from this specimen
Portal: Number of Samples Online by Year
Portal: Number of Species Online by Year
Second GGBN International Conference 21-24 June 2016, Berlin
Thank you http://www.ggbn.org ggbn@si.edu GGBN Interim Executive Committee GGBN Members GGBN Collaborators GGBN Task Forces DFG SYNTHESYS