1 of 57

INTRODUCTION TO DATABASES

2 of 57

DATABASE

  • Store useful information
  • Store data for sharing and reuse
  • When and where to use the information

3 of 57

STORING INFORMATION

4 of 57

NCBI

https://www.ncbi.nlm.nih.gov/

5 of 57

NCBI

6 of 57

PUBMED

  • Widely used platform for searching the biomedical literature.

https://pubmed.ncbi.nlm.nih.gov/

7 of 57

PUBMED

8 of 57

PUBMED

9 of 57

GENBANK

https://www.ncbi.nlm.nih.gov/genbank/

10 of 57

GENBANK

11 of 57

GENBANK

  • Locus: ID
  • Source: species
  • Features: gene, CDS …
    • source: whole genome
    • gene
      • Position
      • Gene name
    • CDS
      • Note: function
      • Protein name
      • Start codon
      • Sequence
    • Regulatory
      • class
      • Affected gene
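
As a rough illustration of the layout above, the sketch below reads a tiny GenBank-style record using only the Python standard library. The record (TOY0001, demoA, "demo protein") is invented for this example and is not real data, and the parser handles only single-line qualifiers. The mapping from the bullets to GenBank fields is an assumption: Locus to the LOCUS line, Source to SOURCE, gene name to /gene, protein name to /product, start codon to /codon_start, sequence to /translation, and regulatory class to /regulatory_class.

```python
# Toy GenBank-style record. All names and values are made up for illustration.
TOY_RECORD = """\
LOCUS       TOY0001      60 bp    DNA     linear   SYN 01-JAN-2000
SOURCE      Example organism
FEATURES             Location/Qualifiers
     source          1..60
                     /organism="Example organism"
     gene            10..45
                     /gene="demoA"
     CDS             10..45
                     /gene="demoA"
                     /note="hypothetical function"
                     /product="demo protein"
                     /codon_start=1
                     /translation="MKTAYIAKQRQ"
     regulatory      1..9
                     /regulatory_class="promoter"
ORIGIN
//
"""


def parse_toy_genbank(text):
    """Collect the locus ID, the source, and the feature table of a toy record."""
    record = {"locus": None, "source": None, "features": []}
    in_features = False
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            record["locus"] = line.split()[1]
        elif line.startswith("SOURCE"):
            record["source"] = line.split(None, 1)[1].strip()
        elif line.startswith("FEATURES"):
            in_features = True
        elif line.startswith("ORIGIN") or line.startswith("//"):
            in_features = False
        elif in_features and line.strip():
            stripped = line.strip()
            if stripped.startswith("/"):
                # Qualifier line: /name=value, attached to the latest feature.
                name, _, value = stripped[1:].partition("=")
                record["features"][-1]["qualifiers"][name] = value.strip('"')
            else:
                # Feature line: feature key followed by its location.
                key, location = stripped.split(None, 1)
                record["features"].append(
                    {"type": key, "location": location, "qualifiers": {}}
                )
    return record


rec = parse_toy_genbank(TOY_RECORD)
print(rec["locus"], "|", rec["source"])
for feat in rec["features"]:
    print(feat["type"], feat["location"], feat["qualifiers"])
```

For real records, an established parser such as Biopython's SeqIO module is the safer choice, since actual GenBank files contain multi-line qualifiers and compound locations that this sketch ignores.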

12 of 57

REFSEQ

https://www.ncbi.nlm.nih.gov/refseq/

13 of 57

REFSEQ

14 of 57

REFSEQ

15 of 57

ENSEMBL

https://asia.ensembl.org/index.html

16 of 57

ENSEMBL

17 of 57

ENSEMBL

18 of 57

EMBL-EBI

https://www.ebi.ac.uk/

19 of 57

EMBL-EBI

20 of 57

RNA-SEQ

21 of 57

REFSEQ FTP

https://ftp.ncbi.nlm.nih.gov/refseq/

22 of 57

REFSEQ FTP

23 of 57

REFSEQ FTP

24 of 57

GFF FORMAT

  1. seqname - name of the chromosome or scaffold
  2. source - database or project name
  3. feature - feature type name, e.g. Gene, Variation, Similarity
  4. start - start position of the feature
  5. end - end position of the feature
  6. score - a floating point value
  7. strand - defined as + (forward) or - (reverse)
  8. frame - one of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.
  9. attribute - a semicolon-separated list of tag-value pairs, providing additional information about each feature
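
The nine columns can be read with a few lines of Python. This is a minimal sketch, not a full GFF parser: it assumes tab-separated columns, treats "." as a missing value, and accepts both tag=value (GFF3 style) and tag "value" (GTF style) attributes. The example line and the names GffRecord, ExampleDB, gene0001 and demoGene are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class GffRecord:
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    frame: Optional[int]
    attributes: Dict[str, str]


def parse_gff_line(line: str) -> GffRecord:
    """Split one GFF line into its nine columns and convert the types."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqname, source, feature, start, end, score, strand, frame, attribute = cols

    attributes: Dict[str, str] = {}
    for pair in attribute.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        if "=" in pair:
            tag, value = pair.split("=", 1)      # GFF3 style: tag=value
        else:
            tag, _, value = pair.partition(" ")  # GTF style: tag "value"
        attributes[tag.strip()] = value.strip().strip('"')

    return GffRecord(
        seqname=seqname,
        source=source,
        feature=feature,
        start=int(start),
        end=int(end),
        score=None if score == "." else float(score),
        strand=strand,
        frame=None if frame == "." else int(frame),
        attributes=attributes,
    )


# Hypothetical example line, not taken from a real annotation file.
example = "chr1\tExampleDB\tgene\t1000\t2000\t.\t+\t.\tID=gene0001;Name=demoGene"
record = parse_gff_line(example)
print(record.feature, record.start, record.end, record.strand, record.attributes["Name"])
```

In a real file, lines starting with "#" are comments or directives and should be skipped before calling parse_gff_line.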

25 of 57

GEO

https://www.ncbi.nlm.nih.gov/geo/

26 of 57

GEO

27 of 57

UCSC GENOME BROWSER

https://genome.ucsc.edu/

28 of 57

UCSC GENOME BROWSER

29 of 57

UCSC GENOME BROWSER

30 of 57

PROTEOMICS

31 of 57

UNIPROT

https://www.uniprot.org/

32 of 57

UNIPROT

33 of 57

UNIPROT

34 of 57

PRIDE

https://www.ebi.ac.uk/pride/

35 of 57

PRIDE

36 of 57

PROTEIN STRUCTURE

37 of 57

PROTEIN DATA BANK

https://www.rcsb.org/

38 of 57

PROTEIN DATA BANK

39 of 57

PROTEIN DATA BANK

40 of 57

PROTEIN DATA BANK

41 of 57

FUNCTIONS

42 of 57

GO – GENE ONTOLOGY

http://geneontology.org/

43 of 57

GO – GENE ONTOLOGY

https://www.biostars.org/p/450726/

44 of 57

KEGG

https://www.genome.jp/kegg/

45 of 57

KEGG

46 of 57

KEGG

47 of 57

PFAM

http://pfam.xfam.org/

48 of 57

PFAM

49 of 57

RFAM

https://rfam.xfam.org/

50 of 57

RFAM

51 of 57

RFAM

52 of 57

CANCER OMICS DATA

53 of 57

TCGA

https://portal.gdc.cancer.gov/

54 of 57

TCGA

55 of 57

TCGA

56 of 57

CPTAC

https://proteomics.cancer.gov/programs/cptac

57 of 57

CPTAC