I put this together while trying to understand a few things that were not clear to me after reading the RetinaNet paper. Some of the questions I had were:
- Why are there (x, y) coordinates that are negative or too big to fit in the original image?
- Why does the upper-left anchor box not start at (0, 0)?
- What would anchor boxes look like when the image is not square?
- When they say "scales", are they talking about the scale of the area or of the edge length?

Here is what the coordinates look like for the activations of P7 when the original image has height 428 and width 640. As you can see, there are 3 aspect ratios and 3 scales (making k=9 anchors per location). The "Anchor Area" for P7 is set to 512 by 512 in the paper. I used these numbers to check that I can explain the coordinates of the anchor boxes and that they are what I expect.
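
The widths and heights in this sheet can be reproduced with a short sketch. It assumes the aspect ratio is w/h and that each scale multiplies the edge length (512), not the area; the numbers in the sheet are consistent with that reading. The function name `anchor_sizes` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def anchor_sizes(base_size=512,
                 ratios=(0.5, 1.0, 2.0),
                 scales=(2 ** 0, 2 ** (1 / 3), 2 ** (2 / 3))):
    """Return (width, height) pairs, one per (ratio, scale) combination.

    Assumption: ratio = width / height, and scale multiplies the edge
    length, so the anchor area is always (base_size * scale) ** 2.
    """
    sizes = []
    for ratio in ratios:          # 1:2, 1:1, 2:1
        for scale in scales:      # 2^0, 2^(1/3), 2^(2/3)
            edge = base_size * scale
            width = edge * math.sqrt(ratio)
            height = edge / math.sqrt(ratio)
            sizes.append((width, height))
    return sizes
```

With the defaults this gives k=9 sizes, starting with roughly (362.04, 724.08) for ratio 1:2 at scale 2^0.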

Here is some visualization: https://gist.github.com/anonymous/a9abce58edcd20d29d4a8180ac40132a
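
Anchor ctr_x, ctr_y with respect to feature map (P7)

| width | height | scale (1/128) |
|---|---|---|
| 5 | 4 | 0.0078125 |

Anchor ctr_x, ctr_y with respect to the original image

| width | height | actual w scale | actual h scale |
|---|---|---|---|
| 640 | 428 | 128 | 107 |

Aspect Ratios (w:h)

| 1:2 | 1:1 | 2:1 |
|---|---|---|
| 0.5 | 1 | 2 |

Area for P7

| 512*512 |
|---|
| 262144 |

Scales

| 2^0 | 2^(1/3) | 2^(2/3) |
|---|---|---|
| 1 | 1.25992105 | 1.587401052 |

Anchor center x, per feature map column

| | column 1 | column 2 | column 3 | column 4 | column 5 |
|---|---|---|---|---|---|
| x, feature map (P7) | 0.5 | 1.5 | 2.5 | 3.5 | 4.5 |
| x, original image | 64 | 192 | 320 | 448 | 576 |

Anchor center y, per feature map row

| | row 1 | row 2 | row 3 | row 4 |
|---|---|---|---|---|
| y, feature map (P7) | 0.5 | 1.5 | 2.5 | 3.5 |
| y, original image | 53.5 | 160.5 | 267.5 | 374.5 |

Anchor width [w] and height [h] by scale and aspect ratio

| Scales | 1:2 | 1:1 | 2:1 |
|---|---|---|---|
| 2^0 [w] | 362.038672 | 512 | 724.0773439 |
| 2^0 [h] | 724.0773439 | 512 | 362.038672 |
| 2^(1/3) [w] | 456.1401437 | 645.0795775 | 912.2802874 |
| 2^(1/3) [h] | 912.2802874 | 645.0795775 | 456.1401437 |
| 2^(2/3) [w] | 574.7005687 | 812.7493386 | 1149.401137 |
| 2^(2/3) [h] | 1149.401137 | 812.7493386 | 574.7005687 |
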
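The center coordinates above follow from placing each anchor at the middle of its feature map cell and scaling back to the image. Below is a minimal sketch, assuming (as this sheet does) that the per-axis step is the image size divided by the feature map size, which gives 128 in x and 107 in y; other implementations may use a fixed stride of 128 in both directions instead. The name `anchor_centers` and its parameters are hypothetical.

```python
def anchor_centers(image_w=640, image_h=428, fmap_w=5, fmap_h=4):
    """Return anchor centers in original-image coordinates.

    Assumption: the step per feature map cell is image size divided by
    feature map size (128 in x, 107 in y for the defaults).
    """
    step_x = image_w / fmap_w
    step_y = image_h / fmap_h
    # Cell i spans [i, i + 1) on the feature map, so its center is i + 0.5.
    xs = [(i + 0.5) * step_x for i in range(fmap_w)]
    ys = [(j + 0.5) * step_y for j in range(fmap_h)]
    # x varies slowest, matching the order of the center list in this sheet.
    return [(x, y) for x in xs for y in ys]
```

Because the first center sits half a cell in from the corner, the upper-left anchor is centered at (64, 53.5) rather than (0, 0).
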
Anchors

All center x, center y combinations in feature map P7

| center x | center y |
|---|---|
| 64 | 53.5 |
| 64 | 160.5 |
| 64 | 267.5 |
| 64 | 374.5 |
| 192 | 53.5 |
| 192 | 160.5 |
| 192 | 267.5 |
| 192 | 374.5 |
| 320 | 53.5 |
| 320 | 160.5 |
| 320 | 267.5 |
| 320 | 374.5 |
| 448 | 53.5 |
| 448 | 160.5 |
| 448 | 267.5 |
| 448 | 374.5 |
| 576 | 53.5 |
| 576 | 160.5 |
| 576 | 267.5 |
| 576 | 374.5 |

anchor_wh

| width | height |
|---|---|
| 362.038672 | 724.0773439 |
| 456.1401437 | 912.2802874 |
| 574.7005687 | 1149.401137 |
| 512 | 512 |
| 645.0795775 | 645.0795775 |
| 812.7493386 | 812.7493386 |
| 724.0773439 | 362.038672 |
| 912.2802874 | 456.1401437 |
| 1149.401137 | 574.7005687 |

Visualization:
https://gist.github.com/anonymous/a9abce58edcd20d29d4a8180ac40132a

| center x | center y | width | height | xmin | ymin | xmax | ymax |
|---|---|---|---|---|---|---|---|
| 64 | 53.5 | 362.038672 | 724.0773439 | -117.019336 | -308.538672 | 245.019336 | 415.538672 |
| 64 | 53.5 | 456.1401437 | 912.2802874 | -164.0700718 | -402.6401437 | 292.0700718 | 509.6401437 |
| 64 | 53.5 | 574.7005687 | 1149.401137 | -223.3502844 | -521.2005687 | 351.3502844 | 628.2005687 |
| 64 | 53.5 | 512 | 512 | -192 | -202.5 | 320 | 309.5 |
| 64 | 53.5 | 645.0795775 | 645.0795775 | -258.5397888 | -269.0397888 | 386.5397888 | 376.0397888 |
| 64 | 53.5 | 812.7493386 | 812.7493386 | -342.3746693 | -352.8746693 | 470.3746693 | 459.8746693 |
| 64 | 53.5 | 724.0773439 | 362.038672 | -298.038672 | -127.519336 | 426.038672 | 234.519336 |
| 64 | 53.5 | 912.2802874 | 456.1401437 | -392.1401437 | -174.5700718 | 520.1401437 | 281.5700718 |
| 64 | 53.5 | 1149.401137 | 574.7005687 | -510.7005687 | -233.8502844 | 638.7005687 | 340.8502844 |
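
The corner columns are the center shifted by half the width and half the height in each direction. A minimal sketch of that conversion follows; the helper name `to_corners` is hypothetical.

```python
def to_corners(center_x, center_y, width, height):
    """Convert a center/size box to (xmin, ymin, xmax, ymax)."""
    half_w = width / 2
    half_h = height / 2
    return (center_x - half_w, center_y - half_h,
            center_x + half_w, center_y + half_h)
```

For the 512 by 512 anchor at center (64, 53.5), `to_corners(64, 53.5, 512, 512)` returns `(-192.0, -202.5, 320.0, 309.5)`. An anchor this large, centered near the image border, necessarily extends past the edges of the 640 by 428 image, which is where the negative and oversized coordinates come from.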