1 of 12

Who am I?

1. PhD in Visual Neuroscience (King’s College London)

Image registration, stitching, bootstrapping, dimension reduction (PCA and PLS) to interpret the 3D anatomy of the LGN

2. Post-doctoral research on brain activity during development (RIKEN)

Signal analysis, Motion correction, Arduino, building analysis pipelines to determine the role of neural activity in brain development

Python, image segmentation, image processing, dimension reduction (t-SNE, UMAP), modelling, machine learning, clustering algorithms to develop automated neural connectomic pipelines

3. Assistant Professor + Data Science (Kyushu University)

4. Data Scientist and Technical Project Manager (Metacell)

Consulting data scientist, familiarity with cloud working (K8s), SQL, absorbed modern business practices (e.g. Agile methodology, user stories)

2 of 12

Case Study - Neural Connectomics

Multi-colour labelling is a strategy to uniquely identify hundreds of neurons at the same time.

This means we can begin to build a comprehensive connectome (brain map) more cheaply and quickly than ever before.

3 of 12

The problem - We need a lot of colours

Image created by randomwire.com

4 of 12

For a lot of colours we need to go beyond the human eye

2^n - 1 colour combinations

7 (2^3 - 1) colour combinations

3 (2^2 - 1) colour combinations

1 colour combination
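The counts above can be checked by listing every non-empty subset of n labels. This is a minimal sketch that assumes each label is simply present or absent in a neuron; the label names and the helper `colour_combinations` are illustrative, not part of the original slides.

```python
from itertools import combinations


def colour_combinations(labels):
    """Return every non-empty subset of the given labels (2**n - 1 of them)."""
    return [
        combo
        for size in range(1, len(labels) + 1)
        for combo in combinations(labels, size)
    ]


# Hypothetical label names, used only to illustrate the counting.
all_labels = ["red", "green", "blue"]

for n in (1, 2, 3):
    combos = colour_combinations(all_labels[:n])
    # The number of combinations should always equal 2**n - 1.
    print(n, len(combos), 2**n - 1)
```

Because the count grows exponentially, each extra label roughly doubles the number of neurons that could be told apart, provided the combinations can be resolved reliably.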

5 of 12

Identifying individual neurons

“QDyeFinder” Pipeline

  • Chromatic Aberration correction
  • Linear Unmixing
  • Signal to Noise evaluation
  • Signal consistency
  • Colour transformation
  • Colour identification
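As an illustration of one step in the list, linear unmixing is commonly posed as solving observed = mixing x abundances for the per-label abundances. The sketch below covers only the two-channel, two-label case with a made-up mixing matrix; it is not the QDyeFinder implementation, and `unmix_two_labels` is a hypothetical helper name.

```python
def unmix_two_labels(observed, mixing):
    """Solve mixing @ abundances = observed for a 2x2 mixing matrix.

    mixing[i][j] is the assumed contribution of label j to channel i.
    """
    (a, b), (c, d) = mixing
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("mixing matrix is singular; labels cannot be separated")
    o1, o2 = observed
    x = (d * o1 - b * o2) / det
    y = (a * o2 - c * o1) / det
    # Negative abundances are not physical, so clip them to zero.
    return (max(x, 0.0), max(y, 0.0))


# Made-up spectral bleed-through values (assumption, for illustration only).
mixing = ((0.9, 0.2),
          (0.1, 0.8))

# Simulate a pixel containing 2.0 units of label 1 and 1.0 unit of label 2.
observed = (0.9 * 2.0 + 0.2 * 1.0, 0.1 * 2.0 + 0.8 * 1.0)

print(unmix_two_labels(observed, mixing))
```

With more channels than labels, the same idea is usually solved as a least-squares problem rather than by direct inversion.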

6 of 12

Image segmentation reduces processing

Pixel/Voxel based analyses

Segmentation/ROI based analyses
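One way to see the reduction: once pixels are grouped into ROIs, each ROI can be summarised by a single mean colour vector, so later steps handle one data point per ROI rather than one per pixel. The sketch below assumes a flat list of integer ROI labels (0 for background) and a matching list of per-pixel channel values; the names and data layout are illustrative assumptions.

```python
def roi_mean_colours(labels, pixels):
    """Average the channel values of all pixels that share an ROI label.

    labels: flat list of ints, where 0 marks background (skipped).
    pixels: flat list of per-pixel channel tuples, same length as labels.
    """
    if len(labels) != len(pixels):
        raise ValueError("labels and pixels must have the same length")
    sums = {}
    counts = {}
    for label, values in zip(labels, pixels):
        if label == 0:
            continue
        if label not in sums:
            sums[label] = [0.0] * len(values)
            counts[label] = 0
        for i, value in enumerate(values):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: tuple(total / counts[label] for total in channel_sums)
        for label, channel_sums in sums.items()
    }


# Six pixels, two channels each, grouped into two ROIs plus background.
labels = [0, 1, 1, 2, 2, 2]
pixels = [(0, 0), (1, 3), (3, 5), (2, 2), (4, 4), (6, 6)]

print(roi_mean_colours(labels, pixels))
```

Here six pixels collapse to two colour vectors; on a real image volume the reduction would be from millions of voxels to however many ROIs the segmentation finds.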

7 of 12

Essentially we now have a clustering problem

8 of 12

Existing Clustering algorithms are sub-optimal

| Clustering algorithm | What it measures | Input required | Advantage | Disadvantage |
|---|---|---|---|---|
| K-means | Distance | Number of clusters (k) | Standard, fast | We don't know the final number of clusters |
| Mean Shift Clustering | Density | A density kernel | Outliers have a limited effect | Density of true clusters may be variable |
| DBSCAN | Density and distance | Minimum number of points and distance | Considers both the density and distance of the points (new gold standard) | Clusters produced can cover a large colour space |
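The K-means limitation can be shown with a small, dependency-free sketch of Lloyd's algorithm. The points are toy 2D "colour" coordinates forming three groups, and the centroids start at the first k points so the run is deterministic; both choices are assumptions made for illustration.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Plain Lloyd's algorithm; centroids start at the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = []
    for _ in range(max_iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return assignment


# Three toy groups near (0, 0), (5, 5) and (10, 0), interleaved.
points = [(0, 0), (5, 5), (10, 0),
          (0, 1), (5, 6), (10, 1),
          (1, 0), (6, 5), (11, 0)]

print(kmeans(points, 3))  # each group gets its own cluster
print(kmeans(points, 2))  # two distinct groups are forced into one cluster
```

When k is set too low, distinct colours are merged, and in the connectomics setting the true number of labelled neurons is not known in advance.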

9 of 12

dCrawler: A new distance-based clustering algorithm

10 of 12

dCrawler: A new distance-based clustering algorithm

11 of 12

dCrawler: Effective in sub-optimal conditions too

12 of 12

dCrawler: Able to separate colours in images too