Overview of EBI Data Resources and Services
Dominic Clark, Industry Programme Manager, [email protected]
http://www.ebi.ac.uk/industry
07.02.20

What is EMBL-EBI?
  Part of the European Molecular Biology Laboratory
  Based on the Wellcome Trust Genome Campus near Cambridge, UK
  Non-profit organisation

Why do we need EMBL-EBI services?

  Data growth
  Global context
  Very large user community: need to preserve data and make them accessible to all
  Impact on medicine & agriculture
  Impact on society & bioindustries

New types of data
  Literature and ontologies
  Genomes
  Protein sequence
  DNA & RNA sequence
  Protein structure
  Gene expression
  Chemical entities
  Protein families, motifs and domains
  Protein interactions
  Pathways
  Systems

EMBL-EBI's mission
  To provide freely available data and bioinformatics services to all facets of the scientific community in ways that promote scientific progress
  To contribute to the advancement of biology through basic investigator-driven research in bioinformatics
  To provide advanced bioinformatics training to scientists at all levels, from PhD students to independent investigators
  To help disseminate cutting-edge technologies to industry

Services
www.ebi.ac.uk/services

Key facts about services
  European node for globally coordinated data collection and dissemination projects
  Core databases produced in collaboration with other world leaders, including NCBI (US), National Institute of Genetics (Japan), Swiss Institute of Bioinformatics, Cold Spring Harbor Laboratory (US)
  One of the world's most comprehensive collections of molecular databases

Principles of service provision

  Accessibility: all data and tools freely available without restriction
  Compatibility: we develop and promote the use of standards in bioinformatics
  Comprehensive data sets: agreements with other data providers ensure that our resources contain comprehensive and up-to-date data; agreements with publishers ensure that published data are placed in a public repository at the earliest opportunity
  Portability: data and software can be downloaded and installed locally
  Quality: our databases are enhanced through annotation and cross-referencing

Databases: molecules to systems
  Genomes: Ensembl, Ensembl Genomes, EGA
  Literature and ontologies: CiteXplore, GO
  Protein families, motifs and domains: InterPro
  Nucleotide sequence: EMBL-Bank
  Microarray & gene expression data: ArrayExpress
  Protein interactions: IntAct
  Proteomes: UniProt, PRIDE
  Protein structure: PDB
  Pathways: Reactome
  Chemical entities: ChEBI, ChEMBL
  Systems: BioModels

Database collaborations

Standards development: international collaborations
  Genomics Standards Consortium (GSC): gensc.org
  Genome annotation: www.geneontology.org
  Protein sequence: www.uniprot.org
  Nucleotide sequence: www.insdc.org
  Microarray and Gene Expression Data (MGED): www.mged.org
  HUPO Proteomics Standards Initiative (PSI): psidev.sf.net
  Protein structure: www.wwpdb.org
  Cheminformatics: www.ebi.ac.uk/chebi
  Pathways: www.reactome.org, www.biopax.org
  Metabolomics Standards Initiative (MSI): www.metabolomicssociety.org
  Systems modelling standards: www.sbml.org

EBI website and search engine: EB-eye
  Search all main databases in one go

  Advanced search: drill down to specific fields in specific databases
  Refine your search

Genomes 1: Ensembl
  Genomic alignments
  Chromosomes
  Genes
  Pick a genome
  Synteny
  Gene families
  SNPs
  Across species
  Orthology
  Within species

Genomes 2: Ensembl Genomes
  Ensembl Metazoa
  Ensembl-like genome browser for non-vertebrate species
  Ensembl Bacteria
  Select Orthologue view to see putative orthologues.
  Across species
  View options: using view options, you can select to view only the current gene or the entire expanded gene tree.

Nucleotides: EMBL-Bank
  DDBJ, GenBank: www.insdc.org
  Direct submissions
  Patents
  Genome sequencing projects
  EMBL-Bank
  Updates
  Third-party annotation
  Keyword and sequence searching
  Map-based search of environmental samples
  Downloads

Structures: PDBe
  Sequence mapping
  Linking to domain data
  Ligands
  Assemblies
  Electron density visualization
  Active sites
  Fold matching
  Surface matching
  Worldwide Protein Data Bank: www.wwpdb.org

PDB FTP Traffic

User support
  2Can bioinformatics user support: www.ebi.ac.uk/2Can
  Online help pages: www.ebi.ac.uk/help
  E-mail support: www.ebi.ac.uk/support

Research
www.ebi.ac.uk/groups

Key facts about research
  Dedicated research groups aim to understand biology through new approaches to interpreting biological data
  Services teams also carry out R&D to enhance existing services and develop new ones
  The research programme complements the services, and the two are mutually supportive

Research groups
  Text mining: Rebholz-Schuhmann
  Genome analysis: Birney, Flicek, Enright, Goldman
  Structural bioinformatics: Thornton
  Transcriptome analysis: Brazma, Huber
  Protein annotation: Apweiler
  Regulatory networks: Luscombe
  Cheminformatics: Steinbeck, Overington
  Pathways, networks, systems: Le Novère
  Differentiation and development: Bertone

Training
www.ebi.ac.uk/training

A tripartite user-training programme

  eLearning programme: training any time, anywhere, at any pace
    www.ebi.ac.uk/training/elearning
  Bioinformatics Roadshow: training comes to you
    www.ebi.ac.uk/training/roadshow
  Hands-on training at EMBL-EBI: hands-on user training on all our core data resources for lab-based researchers
    www.ebi.ac.uk/training/handson

Hands-on training for all levels of experience
  Interactive training in our purpose-built IT training suite at EMBL-EBI, Hinxton, Cambridge
  Learn from the EBI's experts through a combination of talks and practical exercises
  Take a tour of all our core data resources, or focus in on specific data types
  Full programme at www.ebi.ac.uk/training/handson
  Image: Wellcome Images

http://www.ebi.ac.uk/training/handson/
  Genomics, proteomics, transcriptomics, protein structures

Consolidating Bioinformatics in Europe
EU-funded projects coordinated by the EBI

SLING: Serving life science information in the next generation
  Providing unrestricted access to some of the world's most important biological databases
  Bioinformatics roadshows provide hands-on training for users
  Funded by the European Commission within the Research Infrastructure Programme of its FP7 programme
  4 partners in 4 countries

ENFIN Network of Excellence
  Brings together experimentalists and computational biologists to develop the next generation of informatics resources for systems biology
  Funded by the European Commission within its FP6 programme under the thematic area "Life sciences, genomics and biotechnology for health"
  20 partners in 13 countries
  www.enfin.org

EMBRACE Network of Excellence
  Aims to enable bioinformatics research through better interoperability of servers, databases and services
  Funded by the European Commission within its FP6 programme under the thematic area "Life sciences, genomics and biotechnology for health"
  17 partners in 11 countries
  www.embracegrid.info

ELIXIR: European life sciences infrastructure for biological information
  To build a sustainable European infrastructure for biological information supporting life science research and its translation to medicine, the environment, the bioindustries, and society
  32 participants in 13 countries

