1 of 5

Bioconductor HCA Group:

Marioni WIP

Aaron Lun

20th February 2019

2 of 5

Interchangeable methods (VP trees, KMKNN, Annoy, HNSW)

out <- findKNN(X=some_data, BNPARAM=VptreeParam())

out <- findKNN(X=some_data, BNPARAM=AnnoyParam(ntrees=50))

out <- findKNN(X=some_data, BNPARAM=HnswParam(nlinks=20))

Prebuilt indices for multiple queries

idx <- buildIndex(X=some_data, BNPARAM=KmknnParam())

fout <- findKNN(BNINDEX=idx)

qout <- queryKNN(query=some_other_data, BNINDEX=idx)

(Bioc)Parallelization

fout <- findKNN(X=some_data, BPPARAM=MulticoreParam(4))

Support for range finding, Manhattan distance
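The slide does not show the calls for range finding or Manhattan distance, so the following is only a brute-force sketch in base R of what such a search computes, not the BiocNeighbors API. The data, the threshold value and the variable names are assumptions for illustration.

```r
# Brute-force range search under Manhattan distance (illustration only).
set.seed(1)
some_data <- matrix(runif(20 * 3), nrow=20, ncol=3)  # hypothetical data: 20 points, 3 dimensions
threshold <- 0.5                                     # hypothetical search radius

# All pairwise Manhattan (L1) distances; fine for small data, quadratic in general.
dmat <- as.matrix(dist(some_data, method="manhattan"))
diag(dmat) <- Inf  # a point is not reported as its own neighbor

# For each point, the indices of all other points within the threshold.
in_range <- lapply(seq_len(nrow(dmat)), function(i) unname(which(dmat[i, ] <= threshold)))
```

Unlike a k-nearest-neighbor search, the number of neighbors per point varies, which is why the result is a list rather than a matrix.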

3 of 5

Interchangeable methods (exact, IRLBA, randomized SVD)

out <- runSVD(some_data, BSPARAM=ExactParam())

out <- runSVD(some_data, BSPARAM=IrlbaParam(extra.work=10))

out <- runSVD(some_data, BSPARAM=RandomParam(q=2))

Automatic cross-product for fat/thin matrices

idx <- runSVD(some_data, BSPARAM=IrlbaParam(fold=5))
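A minimal base R sketch of the cross-product idea, assuming a tall matrix: decompose the small square cross-product and recover the left singular vectors afterwards. The helper svd_via_crossprod() is hypothetical and is not a BiocSingular function; the package's own implementation may differ.

```r
# Hypothetical helper: top-k SVD of a tall matrix via its cross-product.
svd_via_crossprod <- function(x, k) {
    cp <- crossprod(x)                        # t(x) %*% x, only ncol(x) by ncol(x)
    eig <- eigen(cp, symmetric=TRUE)          # eigenvalues are the squared singular values
    d <- sqrt(pmax(eig$values[seq_len(k)], 0))
    v <- eig$vectors[, seq_len(k), drop=FALSE]
    u <- sweep(x %*% v, 2, d, "/")            # left singular vectors: x v / d
    list(d=d, u=u, v=v)
}

set.seed(100)
tall <- matrix(rnorm(2000 * 10), nrow=2000, ncol=10)  # hypothetical 200-fold tall matrix
approx <- svd_via_crossprod(tall, k=3)
ref <- svd(tall, nu=3, nv=3)                          # direct decomposition for comparison
```

The singular vectors can differ in sign between the two routes, so comparisons are best made on the singular values or on the rank-k reconstruction. For a fat matrix the same trick applies to the transpose.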

(Bioc)Parallelization

fout <- runPCA(some_data, BSPARAM=IrlbaParam(), BPPARAM=MulticoreParam(4))

Deferred and low rank matrix representations (DelayedArrays)

defmat <- DeferredMatrix(some_data, center=centers, scale=stdevs)

lrmat <- LowRankMatrix(rotation, components)

idx <- runSVD(some_data, BSPARAM=RandomParam(deferred=TRUE))
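To illustrate why deferring helps, here is a base R sketch of deferred centering during matrix multiplication. It only shows the algebra under assumed toy data; it is not the DeferredMatrix implementation, and all variable names are illustrative.

```r
# Deferred centering: (X - 1 c') v equals X v - (c' v), so X is never modified.
set.seed(42)
some_data <- matrix(rpois(50 * 8, lambda=5), nrow=50, ncol=8)  # hypothetical count data
centers <- colMeans(some_data)
v <- rnorm(ncol(some_data))                                    # any vector to multiply by

# Explicit route: materialize the centered matrix (dense even if the input were sparse).
explicit <- sweep(some_data, 2, centers, "-") %*% v

# Deferred route: multiply the original matrix, then subtract one scalar from every row.
deferred <- some_data %*% v - sum(centers * v)
```

Both routes give the same product, but the deferred one leaves the original matrix untouched, so any sparsity or file backing it has is preserved.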

4 of 5

Native C++ support for community-defined representations

library(SomeOtherPackage)

# Package defines its own matrix representation

X <- SomeOtherMatrixRepresentation()

# Package supports C++ access to this matrix representation

beachmat::supportCppAccess(X) # TRUE

# When beachmat encounters this matrix, it will look for

# SomeOtherPackage’s C++ shared library, and use routines

# from that library to access X’s contents.

BeachmatUsingFunction(X)

# This enables representation-specific methods to be used

# for data access, improving efficiency.

5 of 5

Other HCA-funded stuff:

  • batchelor: single-cell batch correction
    • Interoperable wrappers for different methods
    • Improvements to existing MNN methods
    • Builds off BiocSingular, BiocNeighbors

  • compareSingleCell:
    • Workflow for comparative scRNA-seq data analysis

  • DropletUtils:
    • Consistent PRNG with parallelization in emptyDrops

  • General dimensionality reduction speed-ups
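On the "consistent PRNG with parallelization" point above: one common way to get results that do not depend on the number of workers is to give each task its own seed up front. The base R sketch below illustrates that idea only; it is an assumption about the general approach, not a description of how emptyDrops implements it, and simulate_one() is a hypothetical stand-in for one Monte Carlo iteration.

```r
# Hypothetical per-task work: the result depends only on the task's own seed.
simulate_one <- function(seed) {
    set.seed(seed)
    mean(rnorm(100))
}

# Draw one seed per task before any work is distributed.
set.seed(2019)
task_seeds <- sample.int(.Machine$integer.max, 8)

# Serial run over all tasks.
serial <- vapply(task_seeds, simulate_one, numeric(1))

# Emulate two workers, each handling a chunk of tasks in order.
chunks <- split(task_seeds, rep(1:2, each=4))
chunked <- unlist(lapply(chunks, function(s) vapply(s, simulate_one, numeric(1))),
                  use.names=FALSE)
```

Because each result is tied to its task seed rather than to the worker's random stream, the serial and chunked runs agree exactly, whatever the chunking.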