1 of 106

BlobToolKit

Interactive quality assessment of genome assemblies

2 of 106

Interactive Exercises

Command line examples will look like this

Follow the URLs for browser-based examples

$ cd ~/blobtoolkit

$ ls

blobtools2

insdc-pipeline

specification

taxdump

viewer

3 of 106

Overview

  • Why Blobs?
  • BlobToolKit
  • Using the Viewer — browser
  • Running BlobToolKit — command line
  • Programmatic Access — browser & command line

4 of 106

Why blobs?

We want to sequence a tardigrade

– but it comes with a soup of other organisms

5 of 106

Why blobs?

All tardigrade DNA will be at the same molarity*

* Organellar genomes

* Sex chromosomes

* Allelic divergence

* Repeats

Contaminant DNA will be at different molarities*

* Single-copy symbionts

* Or not
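
A rough worked example of why molarity shows up as coverage: the mean coverage of a contig is the number of sequenced bases mapped to it divided by its length. The function name and all read counts below are invented for illustration only.

```python
def mean_coverage(read_count: int, read_length: int, contig_length: int) -> float:
    """Mean per-base coverage of a contig from the reads mapped to it."""
    return read_count * read_length / contig_length


# Hypothetical contigs: one from the target genome, one from a rarer contaminant.
target = mean_coverage(read_count=300, read_length=100, contig_length=10_000)
contaminant = mean_coverage(read_count=20, read_length=100, contig_length=5_000)
print(target, contaminant)  # 3.0 0.4
```

Contigs from one genome should cluster around a single value; the exceptions listed above shift individual contigs away from it.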

6 of 106

Why blobs?

[Plot: relative frequency against coverage (<1x, 1x, 2x, 4x)]

8 of 106

Why blobs?

All tardigrade DNA will have similar GC content*

* Organellar genomes

* Localised variation

* Repeats

Contaminant DNA GC content may differ*

* Or not
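
A minimal sketch of the GC axis, assuming GC proportion is measured over unambiguous bases only. The function name `gc_proportion` and the handling of N bases are assumptions for illustration, not taken from BlobToolKit.

```python
def gc_proportion(sequence: str) -> float:
    """Fraction of G/C bases among the A, C, G and T bases of a sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


# Two made-up contigs; N bases are ignored in the denominator.
contigs = {"contig_1": "ATGCGCGCNNAT", "contig_2": "ATATATATGC"}
gc_by_contig = {name: round(gc_proportion(seq), 4) for name, seq in contigs.items()}
print(gc_by_contig)  # {'contig_1': 0.6, 'contig_2': 0.2}
```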

9 of 106

Why Blobs?

[Plot: relative frequency against GC proportion (0% to 100%)]

10 of 106

Why Blobs?

[Plot: coverage against GC proportion (0% to 100%)]

11 of 106

Why Blobs?

Add taxonomic annotation to each contig

[Plot: coverage against GC proportion (0% to 100%)]

12 of 106

Why Blobs?

[Plot: coverage against GC proportion (0% to 100%), contigs labelled by taxon: Hypsibius dujardini, Chitinophaga, Pseudomonas, Stenotrophomonas, alphaproteobacterium]

13 of 106

Blobology Sujai Kumar and colleagues 2013

15 of 106

BlobTools Dominik Laetsch and colleagues 2017

17 of 106

BlobToolKit

University of Edinburgh &

Wellcome Sanger Institute

  • Richard Challis
  • Mark Blaxter

European Nucleotide Archive

European Bioinformatics Institute

  • Edward Richards
  • Jeena Rajan
  • Guy Cochrane

18 of 106

BlobToolKit

19 of 106

BlobToolKit

BlobDir dataset

DatasetID

|— meta.json

|— identifiers.json

|— gc.json

|— length.json

|— ncount.json

|— {LIBRARYNAME}_cov.json

|— {LIBRARYNAME}_read_cov.json

|— {TAXRULE}_positions.json

|— {TAXRULE}_{RANK}.json

|— {TAXRULE}_{RANK}_cindex.json

|— {TAXRULE}_{RANK}_positions.json

|— {TAXRULE}_{RANK}_score.json

|— {LINEAGE}_busco.json
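
A minimal sketch of reading field files from a BlobDir, assuming each file holds a `values` array with one entry per contig in the same order as `identifiers.json`. The helper `load_field` and the stand-in data are hypothetical; a temporary directory is used so the example runs without a real dataset.

```python
import json
import tempfile
from pathlib import Path


def load_field(blobdir: Path, field_id: str) -> dict:
    """Load one field file (e.g. gc.json) from a BlobDir directory."""
    with open(blobdir / f"{field_id}.json") as handle:
        return json.load(handle)


# Build a tiny stand-in BlobDir (assumed layout, made-up values).
with tempfile.TemporaryDirectory() as tmp:
    blobdir = Path(tmp) / "DatasetID"
    blobdir.mkdir()
    (blobdir / "identifiers.json").write_text(json.dumps({"values": ["c1", "c2", "c3"]}))
    (blobdir / "length.json").write_text(json.dumps({"values": [1200, 800, 5000]}))
    (blobdir / "gc.json").write_text(json.dumps({"values": [0.41, 0.62, 0.39]}))

    ids = load_field(blobdir, "identifiers")["values"]
    lengths = load_field(blobdir, "length")["values"]
    gc = load_field(blobdir, "gc")["values"]
    # Join the per-field arrays into one record per contig.
    records = list(zip(ids, lengths, gc))

print(records)
```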

21 of 106

BlobToolKit

BlobDir dataset

BlobTools2

DatasetID

|— meta.json

...

DatasetID

|— meta.json

|— identifiers.json

|— gc.json

|— length.json

|— ncount.json

|— {LIBRARYNAME}_cov.json

|— {LIBRARYNAME}_read_cov.json

|— {TAXRULE}_positions.json

|— {TAXRULE}_{RANK}.json

|— {TAXRULE}_{RANK}_cindex.json

|— {TAXRULE}_{RANK}_positions.json

|— {TAXRULE}_{RANK}_score.json

|— {LINEAGE}_busco.json

$ ./blobtools create --fasta ACVV01.fasta \

... /path/to/BlobDir

Pipeline

23 of 106

BlobToolKit

BlobDir dataset

BlobTools2

DatasetID

|— meta.json

...

$ ./blobtools create --fasta ACVV01.fasta \

... /path/to/BlobDir

ENA browser

Pipeline

Viewer

24 of 106

BlobToolKit Pipeline

27 of 106

BlobToolKit Pipeline

Pipeline configuration file

assembly:
  accession: GCA_00029833$
  alias: DroAlb_1.0
  bioproject: PRJNA39511
  level: scaffold
  span: 253560284
  prefix: ACVV01
taxon:
  taxid: 7291
  name: Drosophila albomi$

28 of 106

BlobToolKit Pipeline

Pipeline configuration file

similarity:
  defaults:
    evalue: 1e-25
    max_target_seqs: 10
    root: 1
    mask_ids: [7215]
  databases:
    - {name: nt_v5, local$
    - {name: reference_pr$
  taxrule: bestsumorder

29 of 106

BlobToolKit Pipeline

Pipeline configuration file

reads:
  paired:
    - [SRR01,ILLUMINA,482$
    - [SRR02,ILLUMINA,552$
  single:
    - [SRR03,PACBIO]
  coverage:
    max: 100
    min: 0.5

30 of 106

BlobToolKit Pipeline

Pipeline configuration file

busco:
  lineages:
    - diptera_odb9
    - arthropoda_odb9
    - eukaryota_odb9
  lineage_dir: /busco/lin$

31 of 106

BlobToolKit Pipeline

Snakemake command to run the Pipeline

snakemake -p \
    --use-conda \
    --conda-prefix $CONDA_DIR \
    --directory $WORKDIR/ \
    --configfile $WORKDIR/$ASSEMBLY.yaml \
    --stats $ASSEMBLY.snakemake.stats \
    -j $THREADS \
    --resources btk=1 \
    -n

32 of 106

BlobToolKit Pipeline

33 of 106

BlobToolKit Pipeline

Cluster configuration

__default__:
  mem: 100
  queue: 'small'
bamtools_stats:
  threads: 1
  mem: 1000
run_blastn:
  threads: 16
  mem: 100000
  queue: 'normal'

34 of 106

BlobToolKit Pipeline

Snakemake command for running the Pipeline on a cluster

snakemake -p --cluster-config cluster.yaml \
    --drmaa " -o {log}.o \
        -e {log}.e \
        -R \"select[mem>{cluster.mem}] rusag$
        -M {cluster.mem} \
        -n {cluster.threads} \
        -q {cluster.queue}" \
    ...

35 of 106

Using the Viewer

  • Finding datasets
  • BlobToolKit Views
  • Indicators of assembly quality
  • Exploring non-target data
  • Digging deeper
  • Customising plots
  • Reproducibility

38 of 106

Finding datasets

40 of 106

BlobToolKit views

42 of 106

Indicators of assembly quality

44 of 106

Exploring non-target data

46 of 106

Digging deeper

48 of 106

Customising plots

50 of 106

Reproducibility

51 of 106

Using the Viewer

Find assemblies with good and bad conventional metrics to compare plots

Search for assemblies with cobionts, e.g.:

  • Apicomplexa in Chordata
  • Proteobacteria in Arachnida

Take a look at some of these assemblies:

  • Crypturellus cinnamomeus PTEZ01 (bird)
  • Colinus virginianus AWGT02 (bird)
  • Onchocerca ochengi FJNM01 (nematode parasite)
  • Brugia timori UZAG01 (nematode parasite)
  • Sciurus vulgaris mSciVul1_1 (mammal)

52 of 106

Running BlobToolKit

  • Hosting a local Viewer instance
  • Running BlobTools2
  • Extending BlobTools2
  • Dataset validation

53 of 106

Running BlobToolKit

Check BlobToolKit has been downloaded

$ cd ~/blobtoolkit

$ ls

blobtools2

insdc-pipeline

specification

taxdump

viewer

54 of 106

Running BlobToolKit

Check the Conda package manager has been installed

$ conda activate btk_env

(btk_env) $ which python3

/home/username/miniconda3/envs/btk_env/bin/python3

56 of 106

Hosting a local Viewer instance

57 of 106

Hosting a local Viewer instance

Download a BlobDir dataset from the public Viewer

$ mkdir -p ~/blobtoolkit/datasets

$ cd ~/blobtoolkit/datasets

$ curl http://blobtoolkit.genomehubs.org/download/AC/ACVV01/ACVV01.blobdir.tar.gz | tar xzf -

58 of 106

Hosting a local Viewer instance

Viewer configuration environment variables

NODE_ENV=local

BTK_CLIENT_PORT=8080

BTK_API_PORT=8000

BTK_API_URL=http://localhost:8000/api/v1

BTK_BASENAME=/view

BTK_ORIGINS='http://localhost:8080 http://localhost null'

BTK_HOST=localhost

BTK_FILE_PATH=/home/username/blobtoolkit/datasets

BTK_USE_DEFAULT_LINKS=true

BTK_STATIC_THRESHOLD=100000

BTK_NOHIT_THRESHOLD=1000000
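
A small sketch for sanity-checking a Viewer `.env` file before starting the servers. The parser and the port consistency check are illustrative assumptions, not part of the Viewer; the variable names and values are those listed above.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and comments, stripping quotes."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


example = """
NODE_ENV=local
BTK_API_PORT=8000
BTK_API_URL=http://localhost:8000/api/v1
BTK_ORIGINS='http://localhost:8080 http://localhost null'
BTK_FILE_PATH=/home/username/blobtoolkit/datasets
"""
settings = parse_env(example)
# The API URL should point at the port the API is configured to listen on.
port_matches = f":{settings['BTK_API_PORT']}/" in settings["BTK_API_URL"]
print(settings["BTK_FILE_PATH"], port_matches)
```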

59 of 106

Hosting a local Viewer instance

Create file with Viewer environment variables

$ cd ~/blobtoolkit/viewer

$ cp .env.dist .env

$ pwd

/home/username/blobtoolkit/viewer

$ nano .env

...

BTK_FILE_PATH=/home/username/blobtoolkit/datasets

...

60 of 106

Hosting a local Viewer instance

Start the Viewer API (back end server)

$ cd ~/blobtoolkit/viewer

$ conda activate btk_env

(btk_env) $ npm start

...

Start the Viewer (front end server)

$ cd ~/blobtoolkit/viewer

$ conda activate btk_env

(btk_env) $ npm run client

...

61 of 106

Hosting a local Viewer instance

62 of 106

Hosting a local Viewer instance

Environment variables for publicly available site

NODE_ENV=production

BTK_API_URL=https://blobtoolkit.genomehubs.org/api/v1

BTK_HTTPS=true

BTK_ORIGINS='https://localhost:8080 https://blobtoolkit.genom$

BTK_HOST='blobtoolkit.genomehubs.org'

BTK_KEYFILE='/path/to/privkey.pem'

BTK_CERTFILE='/path/to/cert.pem'

BTK_GDPR_URL=https://genomehubs.org/gdpr

BTK_DATASET_TABLE=true

64 of 106

Running BlobTools2

69 of 106

Running BlobTools2

Command to create a BlobDir dataset

$ ./blobtools create --fasta ACVV01.fasta \
    --meta ACVV01.yaml \
    --hits ACVV01.blastn.out \
    --hits ACVV01.diamond.out \
    --cov ACVV01.SR01.bam \
    --busco ACVV01.busco.diptera_odb9.tsv \
    --taxid 7291 \
    --taxdump /path/to/taxdump \
    /path/to/BlobDir

70 of 106

Running BlobTools2

Options when adding BLAST results to a BlobDir dataset

$ ./blobtools add --hits ACVV01.blastn.vs.custom.db.out \
    --taxdump /path/to/taxdump \
    --taxrule bestsum=myTaxruleName \
    --bitscore 500 \
    --evalue 1e-75 \
    --hit-count 5 \
    /path/to/BlobDir
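
To illustrate how these options could interact, here is a simplified sketch of a "bestsum"-style rule: sum bitscores per taxon over the hits that pass the thresholds, then report the taxon with the highest total. This is an assumption-laden illustration, not the BlobTools2 implementation; the function, the order in which filters are applied, and the example hits are all invented.

```python
from collections import defaultdict


def bestsum(hits, min_bitscore=1.0, max_evalue=1.0, hit_count=10):
    """Pick the taxon with the highest summed bitscore for one contig.

    `hits` is a list of (taxon, bitscore, evalue) tuples, best hit first.
    """
    totals = defaultdict(float)
    for taxon, bitscore, evalue in hits[:hit_count]:
        if bitscore >= min_bitscore and evalue <= max_evalue:
            totals[taxon] += bitscore
    if not totals:
        return "no-hit"
    return max(totals, key=totals.get)


# Made-up hits for a single contig.
hits = [
    ("Arthropoda", 900.0, 1e-120),
    ("Proteobacteria", 700.0, 1e-90),
    ("Proteobacteria", 650.0, 1e-80),
    ("Chordata", 120.0, 1e-10),
]
print(bestsum(hits, min_bitscore=500, max_evalue=1e-75, hit_count=5))
print(bestsum(hits, min_bitscore=1000))
```

With these hits, two moderate Proteobacteria hits outscore the single best Arthropoda hit, and a stricter bitscore cutoff leaves the contig without an assignment.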

71 of 106

Running BlobTools2

Download BLAST results from the public Viewer

$ cd ~/blobtoolkit

$ conda activate btk_env

(btk_env) $ curl http://blobtoolkit.genomehubs.org/download/AC/ACVV01/ACVV01.blastn.nt.root.1.minus.7215.out.gz | gunzip > ACVV01.blastn.out

72 of 106

Running BlobTools2

Import BLAST results with non-default settings

$ cd ~/blobtoolkit

$ ./blobtools2/blobtools add --hits ACVV01.blastn.out --taxrule bestsum=alt --taxdump ./taxdump --bitscore 500 ./datasets/ACVV01

$ ls ./datasets/ACVV01/alt_p*

datasets/ACVV01/alt_phylum.json

datasets/ACVV01/alt_phylum_cindex.json

datasets/ACVV01/alt_phylum_positions.json

datasets/ACVV01/alt_phylum_score.json

datasets/ACVV01/alt_positions.json

74 of 106

Extending BlobTools2

Generic datatypes

  • Identifier
  • Variable
  • Category
  • Array
  • MultiArray

One parser per analysis type

  • fasta.py for --fasta
  • hits.py for --hits
  • cov.py for --cov
  • busco.py for --busco
  • trnascan.py for --trnascan
  • txt.py for --txt (coming soon)

76 of 106

BlobDir validation

77 of 106

BlobDir validation

Validate a BlobDir dataset

$ cd ~/blobtoolkit

$ ./specification/validate.py ./datasets/ACVV01/meta.json

VALID

78 of 106

Running BlobToolKit

Which method seems most useful to you?

  • Pipeline
  • BlobTools2

Which analyses would you like to see incorporated?

Where do you think development efforts should be focused?

  • Viewer?
  • Pipeline?
  • BlobTools2?
  • documentation?

79 of 106

Programmatic Access

  • Viewer API — browser & command line
  • Viewer plots — command line
  • Filtering datasets — command line
  • Filtering data files

81 of 106

Viewer API

82 of 106

Viewer API

Access API endpoints on the command line

$ curl -s https://blobtoolkit.genomehubs.org/api/v1/dataset/id/ACVV01/assembly/span

253560284

83 of 106

Viewer API

Use jq to process API data

$ curl -s https://blobtoolkit.genomehubs.org/api/v1/field/ACVV01/length | jq '.values | add'

253560284

$ curl -s https://blobtoolkit.genomehubs.org/api/v1/field/ACVV01/length | jq '.values | map(select(. > 5000)) | add'

214698551
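
The same calculation can be sketched in Python with the standard library. The endpoint pattern is the one used in the curl commands above; the helper names `fetch_field` and `span` are hypothetical, and the lengths below are a small offline stand-in so the example runs without network access.

```python
import json
from urllib.request import urlopen

API = "https://blobtoolkit.genomehubs.org/api/v1"


def fetch_field(dataset: str, field: str) -> dict:
    """Fetch a field from the Viewer API (needs network access)."""
    with urlopen(f"{API}/field/{dataset}/{field}") as response:
        return json.load(response)


def span(values, min_length=0):
    """Sum the lengths greater than min_length, like the jq filters above."""
    return sum(v for v in values if v > min_length)


# Offline stand-in for fetch_field("ACVV01", "length")["values"].
lengths = [1836, 2833, 23979, 7262926]
print(span(lengths), span(lengths, 5000))  # 7291574 7286905
```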

85 of 106

Viewer plots

X11

Generate a plot using blobtools view

$ ./blobtools2/blobtools view --ports 8000-8099 \
    --param gc--Min=0.3 --param plotShape=hex \
    ./datasets/ACVV01

Initializing viewer |███████████████████████████████████████| 15/15 seconds

Loading http://localhost:8013/view/dataset/ACVV01/blob?staticThreshold=Infinity&nohitThreshold=Infinity&plotGraphics=svg&gc--Min=0.3&plotShape=hex

...

waiting for file 'ACVV01.blob.hex.png'

87 of 106

Viewer plots

Command we’d like to use to generate a taxon-filtered plot

$ ./blobtools2/blobtools view \
    --host https://blobtoolkit.genomehubs.org \
    --view snail \
    --param bestsumorder_phylum--Keys=Proteobacteria \
    ACVV01

89 of 106

Viewer plots

Use jq to view bestsumorder_phylum category keys

$ jq '.keys' ./datasets/ACVV01/bestsumorder_phylum.json

[

"no-hit",

"Proteobacteria",

"Arthropoda",

"undef",

"Ascomycota",

"Chordata",

"Mollusca",

...

90 of 106

Viewer plots

Use jq to view bestsumorder_phylum category keys

$ curl -s https://blobtoolkit.genomehubs.org/api/v1/field/ACVV01/bestsumorder_phylum | jq '.keys'

[

"no-hit",

"Proteobacteria",

"Arthropoda",

"undef",

"Ascomycota",

...
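
A short sketch of turning a category name into its position in the `keys` list, which can then be used as the value of a `--Keys` parameter. The helper name `key_index` is hypothetical; the keys are the first five listed above.

```python
def key_index(field: dict, name: str) -> int:
    """Return the position of a category name in a field's keys list."""
    return field["keys"].index(name)


field = {"keys": ["no-hit", "Proteobacteria", "Arthropoda", "undef", "Ascomycota"]}
param = f"bestsumorder_phylum--Keys={key_index(field, 'Proteobacteria')}"
print(param)  # bestsumorder_phylum--Keys=1
```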

91 of 106

Viewer plots

Use the key value to generate a plot from a publicly hosted dataset

$ ./blobtools2/blobtools view --view snail \
    --host https://blobtoolkit.genomehubs.org \
    --param bestsumorder_phylum--Keys=1 ACVV01

Loading https://blobtoolkit.genomehubs.org/view/dataset/ACVV01/snail?staticThreshold=Infinity&nohitThreshold=Infinity&plotGraphics=svg&bestsumorder_phylum--Keys=1

Fetching ACVV01.snail.png

waiting for element snail_save_png

waiting for file 'ACVV01.snail.png'

93 of 106

Viewer plots

Alternate command to host a local Viewer instance

$ ./blobtools2/blobtools view --ports 8000-8099 --remote \
    ./datasets/ACVV01

Initializing viewer |███████████████████████████████████████| 15/15 seconds

Open dataset at http://localhost:8001/view/dataset/BlobDir/blob?

For remote access use:

ssh -L 8001:127.0.0.1:8001 -L 8000:127.0.0.1:8000 username@remote_host

95 of 106

Filtering datasets

Filter a local BlobDir dataset

$ ./blobtools2/blobtools filter --param length--Min=3000000 --table STDOUT ./datasets/ACVV01

[

["index","identifiers","gc","length","SRR026696_cov","best$

[17958,"JH855722.1",0.3877,3161164,0.6789,"Arthropoda"],

[21431,"JH859027.1",0.3881,7262926,0.753,"Arthropoda"]

]

96 of 106

Filtering datasets

Use filters to compare alternative taxonomic inferences

$ ./blobtools2/blobtools filter \
    --param bestsumorder_phylum--Keys=no-hit \
    --table ACVV01.alt_taxrule.tsv \
    --table-fields bestsumorder_phylum,alt_phylum \
    ./datasets/ACVV01

$ head ACVV01.alt_taxrule.tsv

index identifiers bestsumorder_phylum alt_phylum

1 JH838199.1 Proteobacteria no-hit

4 JH838202.1 Arthropoda no-hit

...

97 of 106

Filtering datasets

Generate multiple outputs from a single filter command

$ ./blobtools2/blobtools filter \
    --param bestsumorder_phylum--Keys=Proteobacteria \
    --param bestsumorder_phylum--Inv=true \
    --table ACVV01.proteobacteria.tsv \
    --table-fields length,bestsumorder_genus,SRR026696_cov \
    --summary ACVV01.proteobacteria.json \
    --summary-rank genus \
    --out ./datasets/ACVV01_proteobacteria \
    ./datasets/ACVV01

98 of 106

Filtering datasets

Inspect the genus-level taxonomy of scaffolds in the filtered dataset

$ head ACVV01.proteobacteria.tsv

index identifiers length bestsumorder_genus SRR026696_cov

1 JH838199.1 1836 Acetobacter 0.0403

37 JH838235.1 2833 Gluconobacter 0.041600000000000005

46 JH838244.1 1575 Acetobacter 0.0407

69 JH838267.1 2008 Gluconobacter 0

91 JH838288.1 23979 Acetobacter 0.1859

99 JH838296.1 1326 Gluconobacter 0.0463

118 JH838315.1 1342 Acetobacter 0

124 JH838321.1 1445 Gluconobacter 0.0333
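
A sketch of summarising such a table per genus with the Python standard library, assuming the file is tab-delimited with the header shown above. The rows are the eight printed by `head`; on a real file, `io.StringIO(tsv)` would be replaced by an open file handle.

```python
import csv
import io
from collections import Counter

tsv = (
    "index\tidentifiers\tlength\tbestsumorder_genus\tSRR026696_cov\n"
    "1\tJH838199.1\t1836\tAcetobacter\t0.0403\n"
    "37\tJH838235.1\t2833\tGluconobacter\t0.041600000000000005\n"
    "46\tJH838244.1\t1575\tAcetobacter\t0.0407\n"
    "69\tJH838267.1\t2008\tGluconobacter\t0\n"
    "91\tJH838288.1\t23979\tAcetobacter\t0.1859\n"
    "99\tJH838296.1\t1326\tGluconobacter\t0.0463\n"
    "118\tJH838315.1\t1342\tAcetobacter\t0\n"
    "124\tJH838321.1\t1445\tGluconobacter\t0.0333\n"
)

rows = list(csv.DictReader(io.StringIO(tsv), delimiter="\t"))
# Number of scaffolds and total span assigned to each genus.
counts = Counter(row["bestsumorder_genus"] for row in rows)
span_by_genus = Counter()
for row in rows:
    span_by_genus[row["bestsumorder_genus"]] += int(row["length"])

print(dict(counts))
print(dict(span_by_genus))
```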

99 of 106

Filtering datasets

View summary data for scaffolds assigned to Acetobacter

$ jq '.summaryStats.hits.Acetobacter' ACVV01.proteobacteria.json

{ "span": 4272045,

"count": 695,

"gc": [0.564,0.5814,0.4864,0.6416,0.3952,0.6496],

"cov": [0.1937,0.2394,0.0637,0.5886,0.0216,2.2387],

"n50": 48752,

"l50": 14,

"n90": 1756,

"l90": 366 }

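The `n50`, `l50`, `n90` and `l90` values follow the usual assembly definitions. As a minimal sketch, assuming the standard definition (sort lengths longest first and stop when the running total reaches the chosen fraction of the span), they can be computed as below; the function name `nx` and the toy lengths are illustrative only.

```python
def nx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of scaffold lengths.

    Nx is the length of the scaffold at which the running total, taken
    longest first, reaches `fraction` of the total span; Lx is the number
    of scaffolds needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    target = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    return 0, 0


lengths = [10, 8, 6, 4, 2]
print(nx(lengths), nx(lengths, 0.9))  # (8, 2) (4, 4)
```
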
100 of 106

Filtering datasets

Generate a blob plot from the filtered dataset

$ ./blobtools2/blobtools view --ports 8000-8099 \
    --param plotShape=circle \
    --param catField=bestsumorder_genus \
    --param bestsumorder_genus--Active=true \
    ./datasets/ACVV01_proteobacteria

Initializing viewer |███████████████████████████████████████| 15/15 seconds

Loading

...

waiting for file 'ACVV01_proteobacteria.blob.circle.png'

103 of 106

Filtering assembly files

Filter a FASTA file based on taxonomic inference

$ ./blobtools2/blobtools filter --fasta ACVV01.fasta \
    --param bestsumorder_phylum--Keys=no-hit \
    --suffix with_taxonomy ./datasets/ACVV01

$ ./blobtools2/blobtools filter --fasta ACVV01.fasta \
    --param bestsumorder_phylum--Keys=no-hit \
    --param bestsumorder_phylum--Inv=true \
    --suffix without_taxonomy ./datasets/ACVV01

$ ls

ACVV01.fasta ACVV01.without_taxonomy.fasta

ACVV01.with_taxonomy.fasta
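
Conceptually, filtering a FASTA file comes down to keeping the records whose identifiers survive the filter. The sketch below shows that idea only; it is not how BlobTools2 implements it, and the function name and the toy sequences are hypothetical.

```python
def filter_fasta(lines, keep_ids):
    """Yield FASTA lines for records whose ID (first word of the header) is in keep_ids."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            keep = bool(parts) and parts[0] in keep_ids
        if keep:
            yield line


# Toy records; the identifiers are borrowed from the tables above.
fasta = [">JH838199.1 scaffold", "ACGT", "GGCC", ">JH838202.1", "TTAA"]
with_taxonomy = list(filter_fasta(fasta, {"JH838202.1"}))
print(with_taxonomy)  # ['>JH838202.1', 'TTAA']
```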

104 of 106

Filtering assembly files

Filter FASTQ files based on taxonomic inference

$ ./blobtools2/blobtools filter \
    --fastq SRR01_1.fastq.gz \
    --fastq SRR01_2.fastq.gz \
    --cov ACVV01.SRR01.bam \
    --param bestsumorder_phylum--Keys=no-hit \
    --suffix with_taxonomy ./datasets/ACVV01

$ ls

ACVV01.fastq.gz ACVV01.fastq.with_taxonomy.gz

105 of 106

Exploring Further

Download a BlobDir dataset from the public Viewer

$ cd ~/blobtoolkit/datasets

$ curl http://blobtoolkit.genomehubs.org/download/AC/ACVV01/ACVV01.blobdir.tar.gz | tar xzf -

Try alternate taxrule parameters

Filter and compare tables and summaries

Reproduce interactive plots

106 of 106

BlobToolKit

University of Edinburgh &

Wellcome Sanger Institute

  • Richard Challis (rc28@sanger.ac.uk)
  • Mark Blaxter

European Nucleotide Archive

European Bioinformatics Institute

  • Edward Richards
  • Jeena Rajan
  • Guy Cochrane