1 of 10

Similarity Search with Deep Convolutional Neural Networks

DISA Lab, FI, Masaryk University, Brno

CEMI Meeting, 9. 12. 2014

http://disa.fi.muni.cz/

2 of 10

Deep Convolutional Neural Network

  • The ImageNet Challenge: recognize 1000 image categories
    • Training data: 1.2M images
      • manually labeled, obtained by crowdsourcing
  • Krizhevsky, Sutskever & Hinton solution (2012):

3 of 10

Hidden Layers as Image Descriptors

  • We take responses from the last hidden layer
    • Krizhevsky's trained model
      • Caffe framework
      • No retraining or adaptation
    • 4096-dim visual “semantic” descriptor
    • Compared by Euclidean distance
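
To make the comparison step concrete, here is a minimal sketch of Euclidean-distance k-NN over such descriptors. The descriptors are random stand-ins, not real network activations. The image ids and the `knn` helper are hypothetical names used only for illustration. A linear scan like this suits small collections only.

```python
import heapq
import math
import random

DIM = 4096  # descriptor length stated on the slide


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length descriptors."""
    return math.dist(a, b)


def knn(query, descriptors, k):
    """Return the k (distance, image_id) pairs closest to the query.

    `descriptors` maps a hypothetical image id to its descriptor.
    """
    scored = ((euclidean(query, vec), image_id)
              for image_id, vec in descriptors.items())
    return heapq.nsmallest(k, scored)


# Random stand-ins for real last-hidden-layer responses (assumption).
rng = random.Random(0)
descriptors = {f"img{i}": [rng.random() for _ in range(DIM)] for i in range(50)}

query = descriptors["img7"]
result = knn(query, descriptors, k=3)
print(result[0])  # the query image itself, at distance 0.0
```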

4 of 10

Indexing and Searching

  • 4096-dim vectors extracted from 20 million images
    • Profimedia dataset (Profiset)
    • Extraction ran on CPUs in a grid: 10 s per image
      • About 2000 CPU-days
      • Latest Caffe CPU implementation: 4 s per image
  • Indexing by PPP-Codes (best paper at DEXA 2014)
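
The extraction cost can be checked with simple arithmetic. The sketch below assumes exactly 20 million images and a constant per-image time, which is a simplification. It gives roughly 2300 CPU-days at 10 s per image, the same order as the figure quoted above, and under 1000 CPU-days at 4 s.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def cpu_days(num_images, seconds_per_image):
    """Total single-core compute time in days, assuming a constant per-image cost."""
    return num_images * seconds_per_image / SECONDS_PER_DAY


NUM_IMAGES = 20_000_000  # collection size from the slide

grid_estimate = cpu_days(NUM_IMAGES, 10)   # timing measured on the grid
newer_estimate = cpu_days(NUM_IMAGES, 4)   # timing of the newer Caffe CPU build

print(f"10 s per image: {grid_estimate:.0f} CPU-days")
print(f" 4 s per image: {newer_estimate:.0f} CPU-days")
```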

[Figure: k-NN(q) query processing in three numbered steps. Labels: PPP-Codes index on X; candidate set C_X ⊆ X; data storage; calculate δ(q, c), c ∈ C_X; refined answer.]
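
The filter-and-refine pattern in the figure can be sketched as follows. This is not PPP-Codes: the candidate filter here is a deliberately crude stand-in (distance on the first few dimensions), and `coarse_candidates`, `refine`, and the toy data are hypothetical. Only the refinement step, computing δ(q, c) for each candidate c and keeping the k best, mirrors the slide.

```python
import heapq
import math
import random


def coarse_candidates(query, data, size, prefix=4):
    """Stand-in for the index: rank objects by distance on the first `prefix` dims."""
    def cheap_distance(item):
        _, vec = item
        return math.dist(query[:prefix], vec[:prefix])
    return heapq.nsmallest(size, data.items(), key=cheap_distance)


def refine(query, candidates, k):
    """Compute the full distance for every candidate and keep the k closest."""
    scored = ((math.dist(query, vec), obj_id) for obj_id, vec in candidates)
    return heapq.nsmallest(k, scored)


# Toy data: 1000 random 32-dim vectors keyed by integer id (assumption).
rng = random.Random(1)
data = {i: [rng.random() for _ in range(32)] for i in range(1000)}

query = data[42]
candidates = coarse_candidates(query, data, size=100)  # candidate set
answer = refine(query, candidates, k=5)                # refined answer
```

The answer is approximate: a true neighbor that the filter drops from the candidate set cannot be recovered by refinement, so the candidate set size trades accuracy for speed.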

5 of 10

Online Search Demo

http://disa.fi.muni.cz/profimedia-neural_network-20M

HW configuration:

  • 8 cores (each query is parallelized)
  • 32 GB RAM (the in-memory index takes 2 GB)
  • SSD (132 GB of data on disk)

Response time: around 500 ms

6 of 10

CLEF 2014: Image Annotation Task

  • Task definition
    • Input: image + set of candidate concepts (40 to 207)
    • Expected result: set of relevant concepts
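
The task's input and output can be illustrated with a toy baseline. The slides do not state how relevant concepts were selected, so the neighbor-voting rule below is an assumption for illustration only, not the method that was used. The keyword lists, the `annotate` helper, and the `min_votes` threshold are hypothetical.

```python
from collections import Counter


def annotate(neighbor_keywords, candidate_concepts, min_votes=2):
    """Return the candidate concepts named by at least `min_votes` neighbors.

    `neighbor_keywords` holds one keyword list per visually similar image.
    """
    allowed = set(candidate_concepts)
    votes = Counter()
    for keywords in neighbor_keywords:
        # Each neighbor votes at most once per concept.
        votes.update(set(keywords) & allowed)
    return {concept for concept, count in votes.items() if count >= min_votes}


# Hypothetical keywords attached to three similar images.
neighbor_keywords = [
    ["dog", "grass", "park"],
    ["dog", "ball"],
    ["cat", "grass"],
]
candidate_concepts = {"dog", "grass", "car", "ball"}

relevant = annotate(neighbor_keywords, candidate_concepts)
```

Here "dog" and "grass" each receive two votes and are kept. "ball" has one vote, and "park" and "cat" are not candidate concepts at all.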

7 of 10

8 of 10

Image Annotation Task: Results

9 of 10

Online Annotation Demo

10 of 10

What’s Next

  • Comparison of indexing efficiency
    • PPP-Codes vs. FLANN (CVUT)
    • TODO: specify the paper and write it
  • Demo paper?
  • Publish the 20 million 4096-dim visual descriptors as a publicly available dataset