1 of 15

Single-cell RNA-seq analysis using R

Getting set up: infrastructure terms

Oct-04-2022

Iguaracy Pinheiro-de-Sousa

(iguaracy@ebi.ac.uk)

Daniel O’Hanlon

(dohanlon@ebi.ac.uk)

2 of 15

Single cell RNA-seq big data

3 of 15

Single cell RNA-seq big data

Current Opinion in Systems Biology 2017, 4:85–91

Databases

4 of 15

Single cell RNA-seq – Data generation – cell capture

Levitin HM, et al. Trends Cancer. 2018 Apr;4(4):264-268.

5 of 15

Single cell RNA-seq – Data generation – transcript quantification

Hwang et al. Experimental & Molecular Medicine (2018) 50:96

tag-based (10x Chromium)

full-length (SMART-seq2)

From: https://www.10xgenomics.com/instruments/chromium-controller

MATER METHODS 2013;3:203

6 of 15

Single cell RNA-seq – Data processing

A classical scRNA-seq workflow contains four main steps:

  • Mapping the cDNA fragments to a reference;

  • Assigning reads to genes;

  • Assigning reads to cells (cell barcode demultiplexing);

  • Counting the number of unique RNA molecules (UMI deduplication).

https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/algorithms/overview#alignment
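As an illustration of the last two steps, here is a minimal Python sketch of counting unique molecules per (cell barcode, gene) pair; the function name and example tuples are hypothetical, not part of any real pipeline:

```python
from collections import defaultdict

def count_umis(reads):
    """Count unique RNA molecules per (cell, gene).

    `reads` is an iterable of (cell_barcode, gene, umi) tuples for
    reads that have already been mapped and assigned to a gene.
    Reads sharing the same cell barcode, gene, and UMI are PCR
    duplicates of one molecule, so we count distinct UMIs.
    """
    umis = defaultdict(set)
    for cell, gene, umi in reads:
        umis[(cell, gene)].add(umi)
    return {key: len(umi_set) for key, umi_set in umis.items()}

reads = [
    ("AAAC", "GeneA", "TTG"),
    ("AAAC", "GeneA", "TTG"),   # PCR duplicate: same cell, gene, UMI
    ("AAAC", "GeneA", "CCA"),   # a second molecule of GeneA in cell AAAC
    ("GGGT", "GeneA", "TTG"),   # same UMI, but in a different cell
]
counts = count_umis(reads)
# counts[("AAAC", "GeneA")] == 2, counts[("GGGT", "GeneA")] == 1
```

Real pipelines such as Cell Ranger additionally correct sequencing errors in barcodes and UMIs before collapsing duplicates.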

7 of 15

Hardware requirements

  • 8-core Intel or AMD processor (16 cores recommended)
  • 64GB RAM (128GB recommended)
  • 1TB free disk space
  • 64-bit CentOS/RedHat 7.0 or Ubuntu 14.04

You won’t be running this on your laptop!

(probably….)

support.10xgenomics.com/single-cell-gene-expression/software/overview/system-requirements

NB: The minimum requirement of 64GB RAM will allow `cellranger aggr` to aggregate up to 250k cells; more memory will be required beyond that.

8 of 15

Thanks for the memories

[Chart: peak RAM (GB), wallclock time (h), and CPU count (Amazon EC2) for Cell Ranger 'multi' on 60k cells, 'count' on 20k cells, and 'aggr' on 250k cells]

  • RAM amount matters for speed, but only up to ~ 256GB

support.10xgenomics.com/single-cell-gene-expression/software/overview/system-requirements

9 of 15

Batch computing

10 of 15

Batch computing

[Diagram: a cluster of many compute nodes]

  • Multi-core CPUs with lots of RAM (typically at least 32 cores and 128 GB of RAM)
  • Connected to one or more storage servers holding files accessible to all nodes (typically many TB)
  • Also connected to each other to pool resources even further (although this is not often used in bioinformatics)
  • Access to these 'compute' nodes is commonly through a batch scheduling system running on a 'login' node, which implements a system for submitting jobs and sharing resources fairly amongst users

11 of 15

Job submission

  • For example, for LSF, a submission might look like:

      bsub -q normal -n 8 -M 16000 -o job.log ./pipeline.sh

    i.e. the submission command (bsub), queue type (-q), number of CPUs (-n), amount of RAM in MB (-M), output log (-o), and the executable

  • You might also specify other resources (GPUs, etc)
  • You might submit many of these jobs over subsets of data
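To illustrate the last bullet, a small Python sketch that composes one such LSF command per subset of data; the script name and sample names are placeholders:

```python
def bsub_command(sample, cpus=8, ram_mb=16000, queue="normal"):
    """Compose an LSF submission command for one sample.

    -n: number of CPUs, -M: RAM in MB, -q: queue, -o: output log.
    (The units of -M depend on your site's LSF configuration.)
    """
    return (
        f"bsub -q {queue} -n {cpus} -M {ram_mb} "
        f"-o {sample}.log ./process_sample.sh {sample}"
    )

samples = ["sample1", "sample2", "sample3"]
commands = [bsub_command(s) for s in samples]
print(commands[0])
# bsub -q normal -n 8 -M 16000 -o sample1.log ./process_sample.sh sample1
```

In practice you would pass each command to the shell (or write them into a script), and the scheduler queues and runs them as nodes become free.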

12 of 15

Interfacing with clusters

  • Impossible to make many general statements - hardware and software configurations differ institute to institute
  • There are workflow configuration packages that abstract this to some degree: specify your input files, scripts, etc., and these will split and run jobs across nodes and cores, but they require some initial configuration

www.nextflow.io

snakemake.readthedocs.io

www.nextflow.io/docs/latest/executor.htm

github.com/Snakemake-Profiles/
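As a sketch of what that initial configuration looks like, a minimal `nextflow.config` along these lines (queue name and resource numbers are placeholders) tells Nextflow to hand each process to an LSF scheduler rather than running it locally:

```groovy
// nextflow.config - submit every process as an LSF job
process {
    executor = 'lsf'      // submit via bsub instead of running locally
    queue    = 'normal'   // cluster queue name (site-specific)
    cpus     = 8
    memory   = '16 GB'
}
```

Snakemake achieves the same separation of workflow from cluster details through its profiles (second link above).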

13 of 15

Graphics processing units (GPUs)

  • Originally invented for rendering and video games
  • Essentially big ‘vector’ processors

A ‘warp’ of threads executes the same code on different processors in parallel:

[Diagram: four threads each running the same add-then-subtract instruction sequence on different operands, e.g. 2 + 7 − 9, 4 + 8 − 2, −4 + 1 − 3, 3 + 2 − 18]

  • Have to be specifically programmed for each task - not all tasks are suitable!
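The warp idea can be mimicked on a CPU with NumPy, where one vectorized expression applies the same add-then-subtract to every element. This is an analogy for the SIMD pattern, not actual GPU execution:

```python
import numpy as np

# Each "lane" holds different operands, but the same instruction
# sequence (add, then subtract) is applied to all lanes at once -
# the pattern a GPU warp executes across its threads.
a = np.array([2, 4, -4, 3])
b = np.array([7, 8, 1, 2])
c = np.array([9, 2, 3, 18])

result = a + b - c          # one vectorized add and subtract
# result == [0, 10, -6, -13]
```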

14 of 15

Graphics processing units (GPUs)

  • One dominant manufacturer - Nvidia
  • Drivers and platform (CUDA) specific to manufacturer
  • Consumer and enterprise grade hardware:

GeForce RTX 3090

  • 24GB GDDR6X, 936 GB/s
  • 10496 CUDA cores (FP32)
  • 30 TFLOPS (FP32)

Ampere A100

  • 80GB HBM2e, 1935 GB/s
  • 6912 CUDA cores* (FP32)
  • 19.5 TFLOPS (FP32)

*More are dedicated to FP64

  • More RAM = Bigger models, less copying data from system memory (faster)

15 of 15

Graphics processing units (GPUs)

  • Anything that does matrix/vector calculations can be sped up with data parallelism on GPUs

  • Neural networks in particular can take advantage of this!

ml-cheatsheet.readthedocs.io/en/latest/forwardpropagation.html
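As a sketch of why, the forward pass described at that link reduces to a few matrix products; a toy NumPy version (layer sizes are arbitrary) makes this concrete:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer network: every step is a matrix product,
# which is exactly the data-parallel work GPUs accelerate.
X  = rng.standard_normal((64, 10))   # 64 samples, 10 features
W1 = rng.standard_normal((10, 32))   # input -> hidden weights
b1 = np.zeros(32)
W2 = rng.standard_normal((32, 1))    # hidden -> output weights
b2 = np.zeros(1)

hidden = np.maximum(0, X @ W1 + b1)  # ReLU(X W1 + b1)
output = hidden @ W2 + b2            # predictions, shape (64, 1)
```

On a GPU the same computation would run with the matrices resident in device memory (hence the earlier point: more GPU RAM means bigger models and less copying).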