Object Detection API Assertion failed: [maximum box coordinate value is larger than 1.01: ] for resn
Question:
I'm using TensorFlow's <a href="https://github.com/tensorflow/models/tree/master/object_detection" rel="nofollow">Object Detection API</a>, but I get the following error when training:
<blockquote>InvalidArgumentError (see above for traceback): assertion failed: [maximum box coordinate value is larger than 1.01: ] [1.47]
</blockquote>I get the error when I use any of the following:
<ul><li>faster_rcnn_inception_resnet_v2_atrous_coco </li> <li>rfcn_resnet101_coco</li> </ul>But NOT when I use:
<ul><li>ssd_inception_v2_coco </li> <li>ssd_mobilenet_v1_coco</li> </ul>My training images are a mixture of 300x300 and 450x450 pixels. I don't believe any of my bounding boxes fall outside the image bounds. Even if some did, why would the last two models work but not the ResNet models?
Answer1: After looking at my raw bounding box data, it turns out there were a few instances where the bounding box coordinates were either very large or negative (I'm not sure how that happened in the first place). I deleted these entries and can now train any of the models without issue.
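A check along the following lines can surface such entries before training. This is only a minimal sketch, not part of the Object Detection API: the helper name `find_invalid_boxes` and the `(xmin, ymin, xmax, ymax)` pixel-tuple layout are assumptions made for illustration.

```python
def find_invalid_boxes(boxes, image_width, image_height):
    """Return (index, box) pairs for boxes outside the image or with no area.

    Hypothetical helper: `boxes` is assumed to be a list of
    (xmin, ymin, xmax, ymax) tuples in pixel coordinates.
    """
    invalid = []
    for index, (xmin, ymin, xmax, ymax) in enumerate(boxes):
        # Every corner must lie within the image bounds.
        inside = (0 <= xmin <= image_width and 0 <= xmax <= image_width
                  and 0 <= ymin <= image_height and 0 <= ymax <= image_height)
        # The min corner must come strictly before the max corner.
        ordered = xmin < xmax and ymin < ymax
        if not (inside and ordered):
            invalid.append((index, (xmin, ymin, xmax, ymax)))
    return invalid


# Example annotations for a 300x300 image: one valid box, one with a
# negative coordinate, and one with a very large coordinate.
boxes = [(10, 20, 110, 220), (-5, 0, 50, 50), (30, 30, 4000, 90)]
bad = find_invalid_boxes(boxes, image_width=300, image_height=300)
print(bad)
```

Printing the offending indices rather than silently dropping them makes it easier to trace the bad values back to the annotation source.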
Answer2: The first two networks you mentioned seem to expect bounding box positions as values between 0 and 1, that is, normalized by the image size rather than given in pixels. I was getting the same error because my boxes were in pixel coordinates.
I had to change the script that creates the TFRecords from something like this:
<pre><code># Assuming `x` and `y` are floats with the pixel coordinates of the top-left corner:
xmin = x
ymin = y
# Assuming `width` and `height` are floats with the size of the box in pixels:
xmax = x + width
ymax = y + height
</code></pre>
To something like this:
<pre><code># Assuming `x` and `y` are floats with the pixel coordinates of the top-left corner,
# and `image_width` and `image_height` are the image dimensions in pixels:
xmin = x / image_width
ymin = y / image_height
# Assuming `width` and `height` are floats with the size of the box in pixels:
xmax = (x + width) / image_width
ymax = (y + height) / image_height
</code></pre>
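The conversion above could be wrapped in a helper that also rejects boxes that would still trip the assertion. This is a minimal sketch, assuming boxes are given as `(x, y, width, height)` in pixels; `normalize_box` is a hypothetical name, not an Object Detection API function.

```python
def normalize_box(x, y, width, height, image_width, image_height):
    """Convert a pixel box (top-left corner plus size) to normalized corners.

    Hypothetical helper. Returns (xmin, ymin, xmax, ymax), each in [0, 1].
    Raises ValueError if any corner falls outside the image.
    """
    xmin = x / image_width
    ymin = y / image_height
    xmax = (x + width) / image_width
    ymax = (y + height) / image_height
    corners = (("xmin", xmin), ("ymin", ymin), ("xmax", xmax), ("ymax", ymax))
    for name, value in corners:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value:.3f} is outside [0, 1]")
    return xmin, ymin, xmax, ymax


# A 120x90 pixel box whose top-left corner is at (30, 60) in a 300x300 image.
print(normalize_box(30.0, 60.0, 120.0, 90.0, image_width=300, image_height=300))
```

A box whose right edge lies past the image, for example `x + width = 441` on a 300-pixel-wide image, raises an error here instead of writing a value such as 1.47 into the record.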