A | B | C | D | E | F | G | H | I | |
---|---|---|---|---|---|---|---|---|---|
1 | Results for the BODY_25 model (OpenPose >= v1.5.0). The COCO model (the default until OpenPose 1.4.0) is about 35% slower on GPU but about 3x faster on CPU. | ||||||||
2 | OpenPose 1.5.0 benchmark | Leave a comment on this document if you have run it on a different graphics card and measured its running time! | |||||||
3 | Server graphics card model | #GPUs | Millisec / frame body-only+ | FPS body-only+ | GPU memory body (MiB)*+ | Fps all (body,hands,face)+ | GPU memory all (MiB)*+ | Flags | 3rd party and OS for that measurement |
4 | Nvidia V100 | | | | | | | | cuDNN 7, CUDA 9, OpenCV 3.0 |
5 | Nvidia P100 | | | | | | | | cuDNN 5.1, CUDA 8 |
6 | Nvidia K80 | ||||||||
7 | |||||||||
8 | Graphics card model | #GPUs | Millisec / frame body-only+ | FPS body-only+ | GPU memory body (MiB)*+ | Fps all (body,hands,face)+ | GPU memory all (MiB)*+ | Flags | 3rd party and OS for that measurement |
9 | GTX 1080 Ti (11GB)** | 1 | 46.1 | 21.7 | 2413 | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe custom |
10 | Titan X Pascal (12GB) | 1 | 56.8 | 17.6 | 2221 | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0 |
11 | GTX 1080 (8GB) | 1 | | | | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0 |
12 | GTX 1070 (8GB)** | 1 | | | | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "640x480" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0, Windows |
13 | Quadro M6000 (24GB)** | 1 | | | | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0, Windows |
14 | GTX 980 (8GB)** | 1 | | | | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0 |
15 | GTX 1060 Ti (3GB)** | 1 | ~8.9 | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 6, CUDA 8, Caffe 1.0.0, Windows | ||
16 | GTX 780 Ti** | 1 | | | | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0, Windows |
17 | GTX 1050 Ti (4GB) | 1 | | | | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0 |
18 | GTX 960 (4GB)** | 1 | | | | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 6.0, CUDA 8, Caffe 1.0.0 |
19 | GTX 965 (2GB)** | 1 | Out of memory | Out of memory | Out of memory | Out of memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0 | |
20 | GTX 750 Ti** | 1 | Out of memory | Out of memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0 | |||
21 | GTX 860 (2GB) | 1 | Out of memory | Out of memory | Out of memory | Out of memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0 | |
22 | GeForce 940MX (4GB)** | 1 | | | | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0, Windows |
23 | GTX 660 (2GB)** | 1 | Out of memory | Out of memory | Out of memory | Out of memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 6.0, CUDA 8, Caffe 1.0.0 | |
24 | Quadro K3000M (10GB)** | 1 | | | | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0, Windows |
25 | GeForce GT 740 (2GB)** | 1 | Out of memory | Out of memory | Out of memory | Out of memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 6.0, CUDA 8, Caffe 1.0.0 | |
26 | AMD Radeon RX 560 (4GB) | 1 | Out of memory | Out of memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | ROCM 1.8, Ubuntu 16.04, Windows | |||
27 | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 6.0, CUDA 8, Caffe 1.0.0 | |||||||
28 | Graphics card combination | #GPUs | Millisec / frame body-only+ | FPS body-only+ | GPU memory body (MiB)*+ | Fps all (body,hands,face)+ | GPU memory all (MiB)*+ | Flags | 3rd party and OS for that measurement |
29 | GTX 1080 Ti + Titan X Pascal | 2 | 26.0 | 38.4 | | ~1/2 or 1/3 of body-only | ~2 x body memory | --resolution "1280x720" --net_resolution "656x368" --num_scales 1 | cuDNN 5.1, CUDA 8, Caffe 1.0.0 |
30 | |||||||||
31 | Intel CPU version (no GPU) (alpha version) | #Cores | Millisec / frame body-only+ | FPS body-only+ | RAM memory (MB) | Fps all (body,hands,face)+ | GPU memory all (MiB)*+ | Flags | 3rd party and OS for that measurement |
32 | i9-7900X (MKL - 8 OMP - 8) | 20 | | ~0.4 | 40000 | | | --net_resolution "-1x368" --model_pose COCO | cuDNN 5.1, CUDA 8, Intel Caffe (MKL) |
33 | i9-7900X (MKL - 1 OMP - 1) | 20 | | ~0.2 | 6020 | | | --net_resolution "-1x368" --model_pose COCO | cuDNN 5.1, CUDA 8, Intel Caffe (MKL) |
34 | Intel Core i7-6850K | 12 | 2293.0 | 0.4 | 9244 | | | --net_resolution "-1x368" --model_pose COCO | cuDNN 5.1, CUDA 8, Intel Caffe (MKL), OpenCV 3.3 |
35 | i7-6700K (MKL - 4 OMP - 4) | 8 | | ~0.3 | 7231 | | | --net_resolution "-1x368" --model_pose COCO | cuDNN 5.1, CUDA 8, Intel Caffe (MKL) |
36 | i7-6700K (MKL - 1 OMP - 1) | 8 | | ~0.1 | 2521 | | | --net_resolution "-1x368" --model_pose COCO | cuDNN 5.1, CUDA 8, Intel Caffe (MKL) |
37 | i7-4710HQ | 8 | 5190.0 | 0.2 | 6723 | | | --net_resolution "-1x368" --model_pose COCO | cuDNN 5.1, CUDA 8, Intel Caffe (MKL), OpenCV 3.3 |
38 | i7-4710HQ | 8 | 7173.0 | 0.1 | 8030 | | | --net_resolution "-1x368" --model_pose COCO | cuDNN 5.1, CUDA 8, Caffe 1.0.0, Windows |
39 | |||||||||
40 | Intel CPU (lower accuracy to increase speed) | #Cores | Millisec / frame body-only+ | FPS body-only*** | RAM memory (MB) | Fps all (body,hands,face)+ | GPU memory all (MiB)*+ | Flags | 3rd party and OS for that measurement |
41 | i9-7900X (MKL - 8 OMP - 8) | 20 | | ~1.7 | 5820 | | | --net_resolution "-1x256" --model_pose COCO (lower accuracy) | cuDNN 5.1, CUDA 8, Intel Caffe (MKL) |
42 | i9-7900X (MKL - 1 OMP - 1) | 20 | | ~0.7 | 1620 | | | --net_resolution "-1x256" --model_pose COCO (lower accuracy) | cuDNN 5.1, CUDA 8, Intel Caffe (MKL) |
43 | i7-6700K (MKL - 4 OMP - 4) | 8 | | ~0.9 | 4100 | | | --net_resolution "-1x256" --model_pose COCO (lower accuracy) | cuDNN 5.1, CUDA 8, Intel Caffe (MKL) |
44 | i7-6700K (MKL - 1 OMP - 1) | 8 | | ~0.3 | 2381 | | | --net_resolution "-1x256" --model_pose COCO (lower accuracy) | cuDNN 5.1, CUDA 8, Intel Caffe (MKL) |
45 | i7-4710HQ | 8 | 3553.0 | 0.3 | 5322 | | | --net_resolution "-1x256" --model_pose COCO (lower accuracy) | cuDNN 5.1, CUDA 8, Intel Caffe (MKL), OpenCV 3.3 |
46 | i7-4710HQ | 8 | 2149.0 | 0.5 | 3650 | | | --net_resolution "-1x256" --model_pose COCO (lower accuracy) | cuDNN 5.1, CUDA 8, Intel Caffe (MKL), OpenCV 3.3 |
47 | |||||||||
48 | Ubuntu (non Intel) and Mac OS | ||||||||
49 | i7-4710HQ | 0.1 | 20000 | --net_resolution "-1x368" --model_pose COCO | cuDNN 5.1, CUDA 8, Caffe | ||||
50 | |||||||||
51 | |||||||||
52 | |||||||||
53 | * As reported by the `nvidia-smi` command | ||||||||
54 | + Values depend on third-party libraries (OpenCV, Caffe, ...) | ||||||||
55 | ** Values added based on user comments | ||||||||
56 | *** Lower accuracy in order to speed up the code | ||||||||
57 | |||||||||
58 | Other Notes | ||||||||
59 | cuDNN 6.0 is ~15% slower than 5.1 in our tests | ||||||||
60 | The ~ symbol means "approximately" | ||||||||
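
The "Millisec / frame" and "FPS" body-only columns appear to be two views of the same measurement, related by FPS = 1000 / (ms per frame). The following is a minimal sketch under that assumption, useful for filling in one column when only the other was measured. The names `fps_from_ms` and `ROWS` are illustrative and not part of OpenPose; the sample values are copied from the rows above that report both columns.

```python
def fps_from_ms(ms_per_frame: float) -> float:
    """Return the frames per second implied by a per-frame time in milliseconds."""
    if ms_per_frame <= 0:
        raise ValueError("ms_per_frame must be positive")
    return 1000.0 / ms_per_frame


# (label, ms per frame, reported FPS), taken from rows that list both values.
ROWS = [
    ("GTX 1080 Ti (11GB)", 46.1, 21.7),
    ("Titan X Pascal (12GB)", 56.8, 17.6),
    ("GTX 1080 Ti + Titan X Pascal", 26.0, 38.4),
    ("Intel Core i7-6850K", 2293.0, 0.4),
]

if __name__ == "__main__":
    for label, ms, reported_fps in ROWS:
        computed = fps_from_ms(ms)
        print(f"{label}: {computed:.2f} FPS computed, {reported_fps} FPS reported")
```

Small differences between the computed and reported values are expected, since the sheet rounds to one decimal place.
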