LCZero Benchmarks
Hardware | Threads | Engine version/type | Neural Net size | Speed (nps) | Neural Net Name | Remark
RTX 2070 Super & 2070 & 2060 | 12 | lc0 v0.22.0 | 20x256 | 88878 | 11248 | -t 12 --backend=multiplexing "--backend-opts=(backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1)" --nncache=2000000 --minibatch-size=1024; go nodes 5000000
2x RTX TITAN | 4 | lc0 v0.20 dev (with PR 619) | 20x256 | 80000 | | --threads=4 --backend=roundrobin --nncache=10000000 --cpuct=3.0 --minibatch-size=256 --max-collision-events=64 --max-prefetch=64 --backend-opts=(backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1) go infinite; NPS checked after 100 seconds (peak was over 100k, then it starts dropping)
4x V100 | 4 | lc0 cuda92 cudnn714 ubuntu | 20x256 | 78700 | 10040 | ./lc0 --backend=multiplexing --backend-opts="x(backend=cudnnhalf,gpu=0,max_batch=512),y(backend=cudnnhalf,gpu=1,max_batch=512),yy(backend=cudnnhalf,gpu=2,max_batch=512),yyy(backend=cudnnhalf,gpu=3,max_batch=512)" --no-smart-pruning --minibatch-size=1024 --threads=4
2x RTX TITAN | 4 | lc0 v0.20 dev (with PR 619) | 20x256 | 78401 | 32332 | same as above
RTX 2070 Super & 2070 | 12 | lc0 v0.22.0 | 20x256 | 76052 | 42425 | -t 12 --backend=multiplexing "--backend-opts=(backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1)" --nncache=2000000 --minibatch-size=1024; go nodes 5000000
RTX 2070 & 2060 | 2 | lc0 v0.20.3 | 20x256 | 54150 | 32930 | -t 2 --backend=cudnn-fp16 --minibatch-size=1024 --nncache=20000000; go nodes 5000000
mkl | 3 | lc0 v0.20.4 | 20x256 | 53838 | 11248 | --threads=3 --backend=roundrobin --nncache=10000000 --cpuct=3.0 --minibatch-size=256 --max-collision-events=64 --max-prefetch=64 --backend-opts=(backend=cudnn-fp16,gpu=0) go infinite; NPS checked at 100 seconds
2x RTX 2060 | 4 | lc0 v0.22.0 | 20x256 | 52410 | 40685 | -t 4 -backend=demux -nncache=1000000 -minibatch-size=512 -max-prefetch=32 -backend-opts=(backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1) go nodes 5000000
2x RTX 2060 | 4 | lc0 v0.22.0 | 20x320 | 52340 | T40B.2-106 | -t 4 -backend=demux -nncache=1000000 -minibatch-size=512 -max-prefetch=32 -backend-opts=(backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1) go nodes 5000000
RTX Titan | 3 | lc0 v0.20.1-rc1 | 20x256 | 50558 | 32392 | --minibatch-size=512 -t 3 --backend=cudnn-fp16 --nncache=10000000; go infinite; note down NPS after 1 min
RTX 2080 Ti @ 338W (~1785 MHz) | 2 | lc0 v0.20.2 (linux: fedora 29, 415.27, cuda 10.0, cudnn 7.4.2.24) | 20x256 | 50456 | 32930 | -t 2 --backend=cudnn-fp16 --minibatch-size=1024 --nncache=20000000; go nodes 5000000
RTX Titan | 2 | lc0 v0.20.1-rc1 | 20x256 | 46446 | 32392 | --backend=cudnn-fp16 --nncache=10000000; go infinite; note down NPS after 1 min
RTX 2080 Ti | 3 | lc0 v0.18.1 | 20x256 | 42915 | 11250 | --minibatch-size=512 -t 3 --backend=cudnn-fp16 --nncache=2000000; go nodes 5000000
RTX 2080 Ti @ 1290 MHz (~169W) | 2 | lc0 v0.20.2 (linux: fedora 29, 415.27, cuda 10.0, cudnn 7.4.2.24) | 20x256 | 39390 | 32930 | -t 2 --backend=cudnn-fp16 --minibatch-size=1024 --nncache=20000000; go nodes 5000000
RTX 2080 Ti | 2 | lc0 v0.18.1 | 20x256 | 37499 | 11250 | --minibatch-size=512 -t 2 --backend=cudnn-fp16 --nncache=2000000; go nodes 1000000
RTX 2080 | | lc0-v0.19.1.1-windows-cuda.zip | 20x256 | 31723 | 32085 | --minibatch-size=1024 -t 2 --backend=multiplexing --backend-opts="x(backend=""cudnn-fp16"",gpu=0)"; go nodes 1000000
RTX 2070 (slight OC, +66 MHz core) | 2 | lc0 v0.18.1 | 20x256 | 31433 | 11250 |
TITAN V | 2 | lc0-win-20180708-cuda92-cudnn714 | 20x256 | 31004 | 10048 | --minibatch-size=512 -t 2 --backend=cudnn-fp16 --nncache=2000000; go nodes 1000000
10x GTX 1080 Ti | 20 | lc0 cuda92 cudnn714 ubuntu | 20x256 | 30302 | 10040 |
RTX 2070 & GTX 1080 | 6 | lc0-v0.19.0 | 20x256 | 29329 | 31748 | lc0 -t 6 -w weights_31748.txt --backend=multiplexing "--backend-opts="a(backend=cudnn-fp16,gpu=0,minibatch-size=512,nncache=2000000),b(backend=cudnn,gpu=1)"
RTX 2080 | 2 | lc0-v0.18.1-windows-cuda10.0-cudnn7.3-for-2080.zip | 20x256 | 26135 | 11250? | --futile-search-aversion=0 --minibatch-size=1024 -t 2 --backend=multiplexing --backend-opts="x(backend=""cudnn-fp16"",gpu=0)"
RTX 2080 (laptop) | 2 | lc0 v0.22.0 | 20x256 | 24142 | T40B.2-106 | lc0.exe -b cudnn-fp16 -w T40B.4-160; go nodes 100000
 | 2 | lc0-v0.20.2 | ? | 21797 | 32742 | --minibatch-size=512 -t 2 --backend=cudnn-fp16 --nncache=2000000; go nodes 1000000
RTX 2060 | 4 | lc0-v0.18.1-windows-cuda10.0-cudnn7.3-for-2080.zip | 20x256 | 21413 | 11250 | .\lc0 --weights=weights_run1_11248.pb.gz --threads=4 --minibatch-size=256 --allowed-node-collisions=256 --cpuct=2.8 --nncache=10000000 --backend=multiplexing --backend-opts="(backend=cudnn,gpu=0),(backend=cudnn,gpu=1)"
 | 4 | lc0-v0.18.1 | 20x256 | 21413 | 11248 |
GTX 1080 Ti | | LC0 V17.2 dev | 20x256 | 9208 | 10954 | GPU load 98-99%, GPU temp <= 82°C, fan speed 95%, go movetime 130000
TITAN XP | 2 | lc0-win-20180708-cuda92-cudnn714 | 20x256 | 8686 | 10751 | This benchmark was done in Arena Chess; Leela would likely reach an even higher NPS under cutechess-cli.
GTX 1080 | 2 | ? | | 8000 | 11250 |
2x GTX 1060 (6GB) | 1 | lc0-win-20180526 (cuda 9.2) | 20x256 | 7596 | kb1-256x20-2000000 | -t 12 --backend=multiplexing "--backend-opts=(backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1)" --nncache=2000000 --minibatch-size=1024; go nodes 5000000
 | 4 | LC0 Ver 17 RC2 (Windows) | 20x256 | 7005 | 10970 | -t 4 --minibatch-size=512 --backend=multiplexing --backend-opts=(backend=cudnn,gpu=0,max_batch=1024),(backend=cudnn,gpu=1,max_batch=1024)
GTX 1070 @ stock | 2 | lc0-cudnn (batch size 256, node collisions 32) | 20x256 | 6672 | kb1-256x20-2100000 |
GTX 1070 Ti | 2 | lc0 v0.20.1 (cuda) | 20x256 | 6496 | 32965 | --nncache=8000000 --max-collision-events=256 --minibatch-size=256 --backend=multiplexing --cpuct=3.1
GTX 1070 Ti | 2 | lc0 v0.17 | 20x256 | 5657 | 11149 | --futile-search-aversion=0 (the equivalent of --no-smart-pruning), otherwise default settings
GTX 980M (10% underclock) | 2 | LC0 Ver 17 RC2 (Windows) | 20x256 | 3598 | 10970 | --threads=1 --fpu-reduction=0.2 --cpuct=1.2 --slowmover=1.5 --move-overhead=10 --no-smart-pruning
 | 1 | lc0-win-20180522 (cuda 9.2) | 20x256 | 2258 | kb1-256x20-2000000 |
GTX 960 | 2 | lc0 v0.18.1 windows cuda | 20x256 | 2123 | 11248 |
GTX 980M (10% underclock) | 2 | | 20x256 | 1855 | kb1-256x20-2000000 | --no-smart-pruning (value adjusted from 1093 to 1855 after rerun)
GTX 750 Ti @ 1350/1350 MHz | 1 | lc0 v0.18.1 windows cuda | 20x256 | 1658 | 11258 | GPU load 98%, GPU temp 59°C, fan speed 39%, unknown command line flag --no-smart-pruning; -nncache 300000; go nodes 725000
AMD R9 Fury X (core + mem 20% overclock) | 4 | v0.21.2-rc3 (Manjaro, OpenCL 2.1 AMD-APP (2841.4)) | 20x256 | 1518 | 42512 | result at 130,000 nodes; peaks at 550,000 nodes at 1880 nps, then starts decreasing
GTX 950 | 1 | v0.16.0 custom build for Debian (cuda 9.1.85) | 20x256 | 1501 | 10687 | --no-smart-pruning --minibatch-size=1024 --threads=2
GTX 750 Ti @ stock | 2 | lc0 v0.18.1 windows cuda | 20x256 | 1314 | 11258 | GPU load 100%, GPU temp 50°C, fan speed 35%, unknown command line flag --no-smart-pruning
Nvidia Quadro K2200 | 2 | LC0 Ver 20.1 rc1 (Windows) | 20x256 | 1304 | 32194 |
AMD RX 480 (core + mem 20% overclock) | 4 | v0.21.2-rc3 (Manjaro, OpenCL 2.1 AMD-APP (2841.4)) | 20x256 | 1133 | 42512 | result at 130,000 nodes; peaks at 540,000 nodes at 1385 nps, then starts decreasing
MX150 | 2 | LC0 Ver 17 RC2 (Windows) | 20x256 | 805 | 11045 | --minibatch-size=256 --backend=cudnn
Nvidia GeForce 840M | 4 | v0.21.2-rc3 (Manjaro, cuda 10.1.0, cudnn 7.5.1) | 20x256 | 433 | 42512 |
AMD Ryzen 7 2700X (liquid cooled) | 16 | v0.21.2 OpenBlas (Windows) | 20x256 | 288 | 42699 |
GTX 470 | 2 | lczero v0.10 OpenCL (Windows) | 20x256 | 130 | 10021 |
AMD Ryzen 3 1200 @ stock | 4 | lczero v0.10 OpenBlas (Windows) | 20x256 | 35 | 10021 | tested using bumblebee + optirun; peaks at 570,000 nodes at 539 nps, then starts decreasing and stays around 527-528 nps
Intel Core i7-4900MQ 4x 2.80 GHz, 500GB, 16GB, Quadro K2100M RW | | LC0 Ver 20.1 rc1 (Windows) | 20x256 | | 32805 | This was performed at the starting position. Whole-game average was 63k (see latest paper and talkchess clarifications)
TITAN V | 3 | lc0 v0.20.1-rc2 | 20x256 | | 11248 |
4x TPU | | | 20x256 | ? | | --threads=4 --backend=roundrobin --nncache=10000000 --cpuct=3.0 --minibatch-size=256 --max-collision-events=64 --max-prefetch=64 --backend-opts=(backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1) go infinite; NPS checked after 100 seconds (peak was over 100k, then it starts dropping)
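Most of the multi-GPU entries above split the work across cards with the multiplexing (or demux/roundrobin) backend, listing one sub-backend per GPU in --backend-opts. As a rough sketch assembled from the remarks above (the weights file and GPU indices are placeholders, not a specific submission), a two-GPU fp16 run would look roughly like:

./lc0 --weights=weights_run1_11248.pb.gz --threads=2 --backend=multiplexing --backend-opts="(backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1)" --minibatch-size=1024 --nncache=2000000

Each (backend=...,gpu=N) group drives one device; adding a third card means appending another group and, usually, raising --threads.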
(For the cudnn client, start with the --no-smart-pruning flag, run "go nodes 130000", and wait until it finishes.) For faster GPUs, let it run for 1M or even 5M nodes.
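For illustration only, such a run could look roughly like the following (the weights file name is a placeholder; on newer builds the flag was renamed, e.g. --futile-search-aversion=0 as noted in the table above):

./lc0 --no-smart-pruning --weights=weights_run1_11248.pb.gz
position startpos
go nodes 130000

When the search stops, record the nps value from the last "info ... nps ..." line the engine prints.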
I have cleaned this sheet up for the last time! After that I will close it!
LCZero Benchmark Nodes/sec. GPU & CPU
Please add your own benchmark scores here, sorted by NPS if you can. If you don't know the engine type: for GPU it is OpenCL, for CPU it is OpenBLAS. Also include the network ID!
Run go infinite from the start position, abort after depth 30, and report the NPS output.
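A minimal interactive session for this procedure (engine output omitted; the weights file is a placeholder) would be:

./lc0 --weights=weights_run1_11248.pb.gz
position startpos
go infinite

Watch the "info depth ..." lines, send "stop" once depth 30 is reported, and record the nps value from the last info line.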