1 of 46

Visual Search as a Cloud Service by Large-Scale Commodity GPU Adoption

Ashwin Nanjappa

ViSenze

2 of 46

Outline

Who we are
What we do
How we do it (using GPUs)

4 of 46

Who we are

Mission: simplify the visual web
3 solutions

Visual search
Visual recognition
Search and recognition for videos

5 of 46

Our background

Spin-off from NExT Research Centre, National University of Singapore, Singapore
Closed $14M funding so far
Staff: 45 (10 PhDs)
Customers: Major ecommerce sites in SE Asia, Japan, India, UK, USA

6 of 46

Who we are
What we do
How we do it (using GPUs)

7 of 46

What we do

Visual Search as a Service and applications
Image/video recognition as a Service and applications

9 of 46

One-click visual search experience

object detection + visual search

10 of 46

60+% faster than text search

ViSenze visual search API helps people find visual information more easily

Lazada

Goodrich

Patsnap

11 of 46

ViDiscovery/ViSearch App

Visual discovery of everything
Only app that does both product search and image recognition
Supported by huge structured/unstructured database

14 of 46

What we do

Visual Search as a Service and applications
Image/video recognition as a Service and applications

15 of 46

75% less manual effort

ViSenze’s auto tagging helps sites tag their products more efficiently

16 of 46

ViSenze’s auto tagging solution helps sites tag their products more efficiently

17 of 46

Deep structured visual taxonomy for specific verticals

19 of 46

Video Recognition API service

20 of 46

Application: Video + shopping user experience

21 of 46

Who we are
What we do
How we do it (using GPUs)

22 of 46

Technologies

Computer vision and Deep learning

Classification
Detection
Search

Large-scale data crawling

Distributed web service development

Java, Golang, Scala, C++
Docker
Vagrant
Zookeeper
Apache Thrift

23 of 46

Visual search infrastructure

Training pipeline: Train and update models for online systems
Indexing pipeline: Accept image feed, extract visual features, and build index
Search pipeline: Search for the image within the index in real time

27 of 46

Offline training

Fine-tuning a CNN
Train model from scratch

28 of 46

Offline training: Siamese network

A loss function more suitable for visual search
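The slide does not say which loss is used. As a rough sketch only, here is the contrastive loss that is commonly paired with Siamese networks: pairs showing the same item are pulled together and other pairs are pushed at least a margin apart. The function name, the `margin` parameter, and the choice of contrastive loss itself are assumptions for illustration, not ViSenze's actual training code.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Mean contrastive loss over a batch of embedding pairs.

    emb_a, emb_b: (n, d) embeddings from the two shared-weight branches.
    same: (n,) array, 1 if the pair shows the same item, else 0.
    margin: hypothetical hyperparameter; dissimilar pairs farther
            apart than this contribute no loss.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    same = np.asarray(same, dtype=float)

    dist = np.linalg.norm(emb_a - emb_b, axis=1)
    pull = same * dist ** 2                                   # similar pairs: shrink distance
    push = (1.0 - same) * np.maximum(margin - dist, 0.0) ** 2  # dissimilar: enforce margin
    return float(np.mean(pull + push) / 2.0)
```

A loss of this shape optimizes distances between embeddings directly, which is what nearest-neighbour search relies on, rather than class probabilities.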

29 of 46

Offline training infrastructure

Servers with Pascal GPUs (Titan X)

Past: 980 Ti, Titan

Xeon CPUs
Customized Caffe
Training pipeline software (Python)
Inference on CPU and GPU
Visualization and debugging
Evaluation and experiment system (online)

30 of 46

Experiment as a service

31 of 46

Experiment infrastructure

Pascal GPU servers
Custom Caffe
Jenkins for job scheduling
Minio for data store
Custom web service for visualization

32 of 46

Online search with feature and recognition

Visual recognition: trained CNN model
Visual feature: intermediate-level activation (e.g. fc6 from AlexNet)
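To illustrate how a single forward pass can yield both outputs, the toy sketch below uses a two-layer network with random weights as a stand-in for a trained CNN. The class label comes from the final layer and the search feature from the intermediate activation. The layer sizes, names, and the L2 normalization step are all assumptions; a real system would read the fc6 blob from the trained model instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained CNN (random weights, hypothetical sizes).
W_hidden = rng.standard_normal((64, 16))  # plays the role of an fc6-like layer
W_out = rng.standard_normal((16, 5))      # classifier over 5 made-up classes

def forward(x):
    """Return (intermediate activation, class scores) for one input vector."""
    hidden = np.maximum(x @ W_hidden, 0.0)  # ReLU activation of the hidden layer
    logits = hidden @ W_out
    return hidden, logits

def recognize_and_describe(x):
    """One pass gives a predicted label and a unit-length feature vector."""
    hidden, logits = forward(x)
    label = int(np.argmax(logits))
    norm = np.linalg.norm(hidden)
    feature = hidden / norm if norm > 0 else hidden
    return label, feature
```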

33 of 46

Online infrastructure

34 of 46

Online infrastructure

Index

35 of 46

Online infrastructure: indexing

API load balancers, index queues, customer queues

36 of 46

Online infrastructure: indexing

AWS Cx instances, 200 max, throughput: 1M images/hour
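A back-of-the-envelope reading of these numbers, assuming "200 max" is the cap on instance count and that the 1M images/hour figure is reached at that cap (the slide states neither explicitly):

```python
IMAGES_PER_HOUR = 1_000_000  # throughput figure from the slide
MAX_INSTANCES = 200          # assumption: "200 max" means the instance cap

per_instance_per_hour = IMAGES_PER_HOUR / MAX_INSTANCES   # images/hour per instance
per_instance_per_second = per_instance_per_hour / 3600    # images/second per instance
seconds_per_image = 1 / per_instance_per_second           # time budget per image

print(per_instance_per_hour, round(per_instance_per_second, 2), round(seconds_per_image, 2))
```

Under those assumptions each instance handles about 5,000 images per hour, or one image roughly every 0.72 seconds.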

37 of 46

Online infrastructure: indexing

Features on S3, hashes loaded into memory on region servers

38 of 46

Online infrastructure: search

Load balancers and API servers for low latency

39 of 46

Online infrastructure: search

AWS G2 instances to detect and extract features

40 of 46

Online infrastructure: search

Distance search using hash and features
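One common way to combine hashes and features is a two-stage search: a cheap Hamming-distance pass over compact binary codes builds a shortlist, which is then re-ranked by exact feature distance. The sketch below uses random-hyperplane hashing. The hashing scheme, function names, and parameters are assumptions for illustration, since the slides do not describe the actual method.

```python
import numpy as np

def make_hasher(dim, bits, seed=0):
    """Random-hyperplane hasher: maps (n, dim) features to (n, bits) boolean codes."""
    planes = np.random.default_rng(seed).standard_normal((dim, bits))
    return lambda feats: (np.atleast_2d(feats) @ planes) > 0

def search(query, feats, codes, hasher, shortlist=50, top_k=5):
    """Return indices of the top_k nearest items to query.

    Stage 1: rank all items by Hamming distance between binary codes.
    Stage 2: re-rank the shortlist by exact Euclidean feature distance.
    """
    q_code = hasher(query)[0]
    hamming = np.count_nonzero(codes != q_code, axis=1)
    shortlist = min(shortlist, len(feats))
    candidates = np.argsort(hamming, kind="stable")[:shortlist]
    exact = np.linalg.norm(feats[candidates] - query, axis=1)
    order = np.argsort(exact, kind="stable")[:top_k]
    return candidates[order]

# Usage with synthetic data: the query is item 42 itself, so it should rank first.
rng = np.random.default_rng(1)
feats = rng.standard_normal((1000, 32))
hasher = make_hasher(dim=32, bits=64)
codes = hasher(feats)
hits = search(feats[42], feats, codes, hasher)
```

The binary codes are small enough to hold in memory, while the full features are only needed for the shortlisted candidates.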

41 of 46

Online infrastructure: search

Latency: <400ms, Bottleneck: Getting the image!

42 of 46

Amazon GPU instances

43 of 46

Amazon GPU instances

Not available in all regions (US only)

44 of 46

Hybrid infrastructure

Everything on cloud GPU: not there yet

45 of 46

GPUs @ ViSenze

Economical for training and evaluation

Can be easily deployed to Amazon GPU instances later

Easy to program

Caffe, TensorFlow

Scalable across GPU generations
Fast and powerful

5-10x faster compared to CPU
100s of training runs and experiments every day/week!

AWS GPU perf per $ still a bit higher than CPU instances

Indexing: CPU
Search: GPU
