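Prepare COCO datasets
COCO is a large-scale object detection, segmentation, and captioning dataset. This tutorial walks through the steps of preparing this dataset for GluonCV.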

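Hint
You need 42.7 GB of disk space to download and extract this dataset. An SSD is preferred over an HDD because of its better performance.
The total time to prepare the dataset depends on your Internet speed and disk performance. For example, it often takes 20 minutes on AWS EC2 with EBS.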
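Before starting a long download, it can help to confirm that the target disk has enough room. The following is a minimal sketch using only the standard library. The helper names `free_space_gb` and `has_enough_space` are hypothetical, and it assumes the 42.7 GB figure is in decimal gigabytes (10^9 bytes).

```python
import shutil
from pathlib import Path

REQUIRED_GB = 42.7  # figure quoted in the hint; assumed to be decimal GB


def free_space_gb(path):
    """Return the free space, in GB (10**9 bytes), on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path, required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    target = Path.home()  # the dataset is extracted under the home directory by default
    print(f"Free space under {target}: {free_space_gb(target):.1f} GB")
    print("Enough room for COCO:", has_enough_space(target))
```

If the download directory and the extraction directory sit on different disks, each one would need its own check.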
Prepare the dataset
We need the following four files from COCO:
| File name | Size | SHA-1 |
|---|---|---|
|  | 18 GB | 10ad623668ab00c62c096f0ed636d6aff41faca5 |
|  | 778 MB | 4950dc9d00dbe1c933ee0170f5797584351d2a41 |
|  | 241 MB | 8551ee4bb5860311e79dace7e79cb91e432e78b3 |
|  | 401 MB | e7aa0f7515c07e23873a9f71d9095b06bcea3e12 |
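The SHA-1 values in the table can be used to confirm that an archive downloaded intact. The following is a minimal sketch using the standard `hashlib` module. The helper names `sha1_of_file` and `matches_sha1` are hypothetical and are not part of GluonCV.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Reading in chunks keeps memory use flat even for an 18 GB archive.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha1(path, expected_hex):
    """Return True if the file's SHA-1 equals `expected_hex` (case-insensitive)."""
    return sha1_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_sha1(archive_path, "10ad623668ab00c62c096f0ed636d6aff41faca5")` would check the 18 GB archive, where `archive_path` stands for wherever you saved that file.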
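The easiest way to download and extract these files is to download the helper script mscoco.py and run the following command:
python mscoco.py
which will automatically download and extract the data into ~/.mxnet/datasets/coco.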
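If you already have the above files on disk, you can set --download-dir to point to the directory that holds them. For example, assuming the files are saved in ~/coco/, you can run: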
python mscoco.py --download-dir ~/coco
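After the script finishes, a quick check can confirm that the extracted tree is complete. The sketch below assumes the extracted layout contains an `annotations` folder with one JSON file per split, plus `train2017` and `val2017` image folders. That layout and the helper name `missing_entries` are assumptions, so adjust the list if your copy differs.

```python
from pathlib import Path

# Assumed layout after extraction (not confirmed by this tutorial).
EXPECTED_ENTRIES = (
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "train2017",
    "val2017",
)


def missing_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root).expanduser()
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries("~/.mxnet/datasets/coco")
    print("Missing:", missing if missing else "nothing, layout looks complete")
```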
Read with GluonCV
Loading images and labels is straightforward with gluoncv.data.COCODetection.
from gluoncv import data, utils
from matplotlib import pyplot as plt
train_dataset = data.COCODetection(splits=['instances_train2017'])
val_dataset = data.COCODetection(splits=['instances_val2017'])
print('Num of training images:', len(train_dataset))
print('Num of validation images:', len(val_dataset))
Out:
loading annotations into memory...
Done (t=17.99s)
creating index...
index created!
loading annotations into memory...
Done (t=1.11s)
creating index...
index created!
Num of training images: 117266
Num of validation images: 4952
Now let's visualize one example.
train_image, train_label = train_dataset[0]
bounding_boxes = train_label[:, :4]
class_ids = train_label[:, 4:5]
print('Image size (height, width, RGB):', train_image.shape)
print('Num of objects:', bounding_boxes.shape[0])
print('Bounding boxes (num_boxes, x_min, y_min, x_max, y_max):\n',
      bounding_boxes)
print('Class IDs (num_boxes, ):\n', class_ids)
utils.viz.plot_bbox(train_image.asnumpy(), bounding_boxes, scores=None,
                    labels=class_ids, class_names=train_dataset.classes)
plt.show()

Out:
Image size (height, width, RGB): (480, 640, 3)
Num of objects: 8
Bounding boxes (num_boxes, x_min, y_min, x_max, y_max):
[[ 1.08 187.69 611.67 472.53]
[311.73 4.31 630.01 231.99]
[249.6 229.27 564.84 473.35]
[ 0. 13.51 433.48 387.63]
[376.2 40.36 450.75 85.89]
[465.78 38.97 522.85 84.64]
[385.7 73.66 468.72 143.17]
[364.05 2.49 457.81 72.56]]
Class IDs (num_boxes, ):
[[45.]
[45.]
[50.]
[45.]
[49.]
[49.]
[49.]
[49.]]
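To make the label layout concrete, the sketch below rebuilds the printed label array in NumPy and derives each box's width, height, and area, along with the number of objects per class ID. The values are copied from the output above. Unlike the tutorial code, which slices `4:5` to keep a column, this sketch flattens the class column to a 1-D integer array for counting.

```python
import numpy as np

# Label array copied from the printed output:
# columns are x_min, y_min, x_max, y_max, class_id.
label = np.array([
    [1.08, 187.69, 611.67, 472.53, 45],
    [311.73, 4.31, 630.01, 231.99, 45],
    [249.6, 229.27, 564.84, 473.35, 50],
    [0.0, 13.51, 433.48, 387.63, 45],
    [376.2, 40.36, 450.75, 85.89, 49],
    [465.78, 38.97, 522.85, 84.64, 49],
    [385.7, 73.66, 468.72, 143.17, 49],
    [364.05, 2.49, 457.81, 72.56, 49],
])

boxes = label[:, :4]
class_ids = label[:, 4].astype(int)

# Corner format (x_min, y_min, x_max, y_max) gives sizes by subtraction.
widths = boxes[:, 2] - boxes[:, 0]
heights = boxes[:, 3] - boxes[:, 1]
areas = widths * heights

# Count how many objects carry each class ID.
ids, counts = np.unique(class_ids, return_counts=True)
objects_per_class = dict(zip(ids.tolist(), counts.tolist()))

print("Widths:", np.round(widths, 2))
print("Heights:", np.round(heights, 2))
print("Objects per class ID:", objects_per_class)
```

The class IDs are indices into `train_dataset.classes`, which is how `plot_bbox` turns them into readable names.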
Finally, to use both train_dataset and val_dataset for training, we can pass them through data transformations and load them with mxnet.gluon.data.DataLoader. See train_ssd.py for more information.
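One practical detail when batching detection data is that each image has a different number of boxes, so label arrays cannot be stacked directly. A common approach is to pad every label array to the largest box count in the batch with a sentinel value. The NumPy-only sketch below illustrates that idea. It is not GluonCV's implementation, and the helper name `pad_labels` and the `-1` sentinel are assumptions made for illustration.

```python
import numpy as np


def pad_labels(labels, pad_value=-1.0):
    """Stack (num_boxes_i, k) arrays into (batch, max_boxes, k), padding with pad_value."""
    max_boxes = max(lab.shape[0] for lab in labels)
    width = labels[0].shape[1]
    batch = np.full((len(labels), max_boxes, width), pad_value, dtype=np.float32)
    for i, lab in enumerate(labels):
        # Copy the real boxes; the remaining rows keep the sentinel value.
        batch[i, :lab.shape[0]] = lab
    return batch


# Two toy label arrays with different object counts (values are illustrative).
a = np.array([[0, 0, 10, 10, 45], [5, 5, 20, 20, 49]], dtype=np.float32)
b = np.array([[1, 2, 3, 4, 50]], dtype=np.float32)

batch = pad_labels([a, b])
print(batch.shape)
```

Downstream code can then ignore any row whose class ID equals the sentinel.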