1 of 12

Who am I?

1. PhD in Visual Neuroscience (King’s College London)

Image registration, stitching, bootstrapping, dimension reduction (PCA and PLS) to interpret the 3D anatomy of the LGN

2. Post-doctoral research on brain activity during development (RIKEN)

Signal analysis, motion correction, Arduino, and building analysis pipelines to determine the role of neural activity in brain development

Python, image segmentation, image processing, dimension reduction (t-SNE, UMAP), modelling, machine learning and clustering algorithms to develop automated neural connectomic pipelines

3. Assistant Professor + Data Science (Kyushu University)

4. Data Scientist and Technical Project Manager (Metacell)

Consulting data scientist, familiarity with cloud working (K8s), SQL, and modern business practices (e.g. Agile methodology, user stories)

2 of 12

Case Study - Neural Connectomics

Multi-colour labelling is a strategy to uniquely identify hundreds of neurons at the same time.

This means we can begin to build a comprehensive connectome (brain map) more cheaply and more quickly than ever before.

3 of 12

The problem - We need a lot of colours

Image created by randomwire.com

4 of 12

For a lot of colours we need to go beyond the human eye

2ⁿ-1 colour combinations

7 (2³-1) colour combinations

3 (2²-1) colour combinations

1 (2¹-1) colour combination
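The 2ⁿ-1 count can be checked by listing every non-empty subset of n dyes. A minimal sketch, assuming each dye is simply present or absent in a neuron (which is what the formula implies); the dye names are placeholders, not the labels used in the study.

```python
from itertools import combinations

def colour_combinations(dyes):
    """Return every non-empty subset of the given dyes (2**n - 1 in total)."""
    combos = []
    for size in range(1, len(dyes) + 1):
        combos.extend(combinations(dyes, size))
    return combos

# Placeholder dye names, for illustration only.
for n in range(1, 8):
    dyes = [f"dye{i}" for i in range(1, n + 1)]
    print(n, "dyes:", len(colour_combinations(dyes)), "combinations")
```

Each extra dye roughly doubles the number of distinguishable labels, which is why going beyond three channels matters.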

5 of 12

Identifying individual neurons

“QDyeFinder” Pipeline

Chromatic Aberration correction
Linear Unmixing
Signal to Noise evaluation
Signal consistency
Colour transformation
Colour identification
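As an illustration of the linear unmixing step only: a minimal sketch, assuming each detection channel records a weighted sum of the dye signals. The mixing matrix, function name and values are hypothetical, and this is not the QDyeFinder implementation.

```python
import numpy as np

# Hypothetical mixing matrix: rows = detection channels, columns = dyes.
# Entry [c, d] is the fraction of dye d's signal that appears in channel c.
MIXING = np.array([
    [0.9, 0.2, 0.0],
    [0.1, 0.7, 0.1],
    [0.0, 0.1, 0.9],
])

def unmix(measured, mixing):
    """Least-squares estimate of dye abundances.

    measured: (n_pixels, n_channels) channel intensities
    returns:  (n_pixels, n_dyes) non-negative abundances
    """
    # Solve mixing @ abundances.T = measured.T for all pixels at once.
    abundances, *_ = np.linalg.lstsq(mixing, measured.T, rcond=None)
    # Negative abundances are not physical, so clip them to zero.
    return np.clip(abundances.T, 0.0, None)

# Two example pixels with known dye content, mixed and then recovered.
true_abundances = np.array([[1.0, 0.0, 0.5],
                            [0.0, 2.0, 0.0]])
measured = true_abundances @ MIXING.T
recovered = unmix(measured, MIXING)
print(np.round(recovered, 3))
```

In practice the mixing matrix would have to be measured from single-dye reference samples.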

6 of 12

Image segmentation reduces processing

Pixel/Voxel based analyses

Segmentation/ROI based analyses
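To show why segmentation reduces processing: a minimal sketch, assuming a label image in which 0 is background and 1..N mark the ROIs. Each ROI collapses to one mean colour vector, so later steps handle N rows instead of every pixel. The function name and toy data are illustrative.

```python
import numpy as np

def roi_mean_colours(image, labels):
    """Mean colour per ROI.

    image:  (H, W, C) float array of channel intensities
    labels: (H, W) int array, 0 = background, 1..N = ROI ids
    returns (N, C) array, row i-1 holding the mean colour of ROI i
    """
    n_rois = int(labels.max())
    flat_labels = labels.ravel()
    # Pixel count per ROI, dropping the background bin.
    counts = np.bincount(flat_labels, minlength=n_rois + 1)[1:]
    means = np.empty((n_rois, image.shape[-1]))
    for c in range(image.shape[-1]):
        sums = np.bincount(flat_labels,
                           weights=image[..., c].ravel(),
                           minlength=n_rois + 1)[1:]
        # Guard against empty ROI ids to avoid division by zero.
        means[:, c] = sums / np.maximum(counts, 1)
    return means

# Toy 4x4 image with two ROIs: 16 pixels reduce to 2 colour vectors.
image = np.zeros((4, 4, 3))
labels = np.zeros((4, 4), dtype=int)
labels[:2, :2] = 1
labels[2:, 2:] = 2
image[:2, :2] = [1.0, 0.0, 0.0]
image[2:, 2:] = [0.0, 0.0, 1.0]
roi_colours = roi_mean_colours(image, labels)
print(roi_colours)
```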

7 of 12

Essentially we now have a clustering problem

8 of 12

Existing clustering algorithms are sub-optimal

Clustering Algorithm	K-means	Mean Shift Clustering	DBSCAN
What it measures	Distance	Density	Density and distance
Input required	Number of clusters (k)	A density kernel (bandwidth)	Minimum number of points and neighbourhood distance
Advantage	Standard, fast	Outliers have a limited effect	Considers both the density and distance of the points (new gold standard)
Disadvantage	We don’t know the final number of clusters	Density of true clusters may be variable	Clusters produced can cover a large colour space
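The K-means disadvantage in the table can be seen on toy data. A minimal sketch of plain K-means (Lloyd's algorithm) in NumPy, with synthetic 2D points standing in for colour vectors. With the right k the three groups separate; with k too small, two distinct "colours" are merged. This is a generic illustration and not the dCrawler algorithm.

```python
import numpy as np

def kmeans(points, init_centroids, n_iter=20):
    """Plain Lloyd's algorithm; the number of clusters is fixed by the
    number of initial centroids supplied."""
    centroids = init_centroids.astype(float).copy()
    for _ in range(n_iter):
        # Distance from every point to every centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(len(centroids)):
            if np.any(labels == k):
                centroids[k] = points[labels == k].mean(axis=0)
    return labels, centroids

# Three well-separated synthetic groups of 20 points each.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.vstack([c + 0.1 * rng.standard_normal((20, 2)) for c in centres])

# Correct k = 3: one starting centroid taken from each group.
labels_k3, _ = kmeans(points, points[[0, 20, 40]])
# Wrong k = 2: the third group is forced into an existing cluster.
labels_k2, _ = kmeans(points, points[[0, 20]])

print("k=3 clusters found:", len(set(labels_k3.tolist())))
print("k=2 clusters found:", len(set(labels_k2.tolist())))
```

Because the number of labelled neurons in an image is not known in advance, k has to be guessed, and a wrong guess silently merges or splits colours.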

9 of 12

dCrawler: A new distance-based clustering algorithm

10 of 12

dCrawler: A new distance-based clustering algorithm

11 of 12

dCrawler: Effective in sub-optimal conditions too

12 of 12

dCrawler: Able to separate colours in images too