Prepare COCO datasets

COCO is a large-scale object detection, segmentation, and captioning dataset. This tutorial walks through the steps of preparing this dataset for GluonCV.

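Hint

You need 42.7 GB of disk space to download and extract this dataset. An SSD is preferred over an HDD because of its better performance.

The total time to prepare the dataset depends on your Internet speed and disk performance. For example, it typically takes about 20 minutes on AWS EC2 with EBS.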
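Before starting the download, it can help to confirm that the target disk has enough room. The following is a minimal sketch using only the Python standard library. The helper name has_enough_space is hypothetical, and treating 1 GB as 10^9 bytes is an assumption.

```python
import shutil
from pathlib import Path

REQUIRED_GB = 42.7  # figure quoted in the hint above


def has_enough_space(path, required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has at least `required_gb` free.

    Assumption: 1 GB is counted as 10**9 bytes.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1e9


if __name__ == "__main__":
    # The default extraction directory (~/.mxnet/datasets/coco) is under the home directory.
    target = Path.home()
    free_gb = shutil.disk_usage(target).free / 1e9
    print(f"Free space under {target}: {free_gb:.1f} GB")
    print("Enough for COCO:", has_enough_space(target))
```

If the check fails, point the download at a larger volume before running the preparation script.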

Prepare the dataset

We need the following four files from COCO:

Filename	Size	SHA-1
train2017.zip	18 GB	10ad623668ab00c62c096f0ed636d6aff41faca5
val2017.zip	778 MB	4950dc9d00dbe1c933ee0170f5797584351d2a41
annotations_trainval2017.zip	241 MB	8551ee4bb5860311e79dace7e79cb91e432e78b3
stuff_annotations_trainval2017.zip	401 MB	e7aa0f7515c07e23873a9f71d9095b06bcea3e12
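If you download the archives by hand, the SHA-1 values in the table can be used to check that each file arrived intact. The sketch below uses hashlib from the standard library. The function names sha1_of_file and matches_expected are hypothetical helpers, not part of GluonCV.

```python
import hashlib

# Expected SHA-1 digests, copied from the table above.
EXPECTED_SHA1 = {
    "train2017.zip": "10ad623668ab00c62c096f0ed636d6aff41faca5",
    "val2017.zip": "4950dc9d00dbe1c933ee0170f5797584351d2a41",
    "annotations_trainval2017.zip": "8551ee4bb5860311e79dace7e79cb91e432e78b3",
    "stuff_annotations_trainval2017.zip": "e7aa0f7515c07e23873a9f71d9095b06bcea3e12",
}


def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so an 18 GB archive never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, filename):
    """Return True if the file at `path` has the digest listed for `filename`."""
    return sha1_of_file(path) == EXPECTED_SHA1[filename]
```

For example, matches_expected("val2017.zip", "val2017.zip") should return True for an uncorrupted copy of the validation archive in the current directory.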

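The easiest way to download and unpack these files is to download the helper script mscoco.py and run the following command:

python mscoco.py

The command automatically downloads the data and extracts it to ~/.mxnet/datasets/coco.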

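If you already have the above files on disk, you can set --download-dir to point to the directory that contains them. For example, assuming the files are saved in ~/coco/, you can run: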

python mscoco.py --download-dir ~/coco
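When reusing a local directory this way, a quick check of which archives are actually present can save a failed run. This is a small sketch with a hypothetical helper, missing_archives. It only looks for the four file names from the table and does not inspect their contents.

```python
from pathlib import Path

# The four archive names listed in the table above.
ARCHIVES = [
    "train2017.zip",
    "val2017.zip",
    "annotations_trainval2017.zip",
    "stuff_annotations_trainval2017.zip",
]


def missing_archives(download_dir):
    """Return the archive names that are not present in `download_dir`."""
    root = Path(download_dir).expanduser()  # expands a leading "~"
    return [name for name in ARCHIVES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_archives("~/coco")
    print("Missing archives:", missing if missing else "none")
```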

Read with GluonCV

Loading images and labels is straightforward with gluoncv.data.COCODetection.

from gluoncv import data, utils
from matplotlib import pyplot as plt

train_dataset = data.COCODetection(splits=['instances_train2017'])
val_dataset = data.COCODetection(splits=['instances_val2017'])
print('Num of training images:', len(train_dataset))
print('Num of validation images:', len(val_dataset))

Out:

loading annotations into memory...
Done (t=17.99s)
creating index...
index created!
loading annotations into memory...
Done (t=1.11s)
creating index...
index created!
Num of training images: 117266
Num of validation images: 4952

Now let's visualize one example.

train_image, train_label = train_dataset[0]
bounding_boxes = train_label[:, :4]
class_ids = train_label[:, 4:5]
print('Image size (height, width, RGB):', train_image.shape)
print('Num of objects:', bounding_boxes.shape[0])
print('Bounding boxes (num_boxes, x_min, y_min, x_max, y_max):\n',
      bounding_boxes)
print('Class IDs (num_boxes, ):\n', class_ids)

utils.viz.plot_bbox(train_image.asnumpy(), bounding_boxes, scores=None,
                    labels=class_ids, class_names=train_dataset.classes)
plt.show()

Out:

Image size (height, width, RGB): (480, 640, 3)
Num of objects: 8
Bounding boxes (num_boxes, x_min, y_min, x_max, y_max):
 [[  1.08 187.69 611.67 472.53]
 [311.73   4.31 630.01 231.99]
 [249.6  229.27 564.84 473.35]
 [  0.    13.51 433.48 387.63]
 [376.2   40.36 450.75  85.89]
 [465.78  38.97 522.85  84.64]
 [385.7   73.66 468.72 143.17]
 [364.05   2.49 457.81  72.56]]
Class IDs (num_boxes, ):
 [[45.]
 [45.]
 [50.]
 [45.]
 [49.]
 [49.]
 [49.]
 [49.]]
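The boxes printed above use corner coordinates (x_min, y_min, x_max, y_max), whereas COCO annotation files store boxes as [x, y, width, height]. The following plain-Python sketch converts the printed boxes to that layout and checks that each one lies inside the 480 x 640 image. The helpers to_xywh and fits_in_image are hypothetical names introduced for illustration.

```python
# Boxes copied from the printed output above: (x_min, y_min, x_max, y_max) in pixels.
BOXES = [
    (1.08, 187.69, 611.67, 472.53),
    (311.73, 4.31, 630.01, 231.99),
    (249.6, 229.27, 564.84, 473.35),
    (0.0, 13.51, 433.48, 387.63),
    (376.2, 40.36, 450.75, 85.89),
    (465.78, 38.97, 522.85, 84.64),
    (385.7, 73.66, 468.72, 143.17),
    (364.05, 2.49, 457.81, 72.56),
]
IMAGE_HEIGHT, IMAGE_WIDTH = 480, 640  # from the printed image shape


def to_xywh(box):
    """Convert corner format to (x, y, width, height)."""
    x_min, y_min, x_max, y_max = box
    return (x_min, y_min, x_max - x_min, y_max - y_min)


def fits_in_image(box, height, width):
    """Return True if the box is well-formed and lies within the image bounds."""
    x_min, y_min, x_max, y_max = box
    return 0 <= x_min <= x_max <= width and 0 <= y_min <= y_max <= height


for box in BOXES:
    x, y, w, h = to_xywh(box)
    inside = fits_in_image(box, IMAGE_HEIGHT, IMAGE_WIDTH)
    print(f"x={x:.2f} y={y:.2f} w={w:.2f} h={h:.2f} inside={inside}")
```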

Finally, to use both train_dataset and val_dataset for training, we can pass them through data transformations and load them with mxnet.gluon.data.DataLoader. See train_ssd.py for more information.
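One practical detail when batching detection data is that each image has a different number of objects, so the label arrays cannot be stacked directly. A common approach is to pad every label array to the same number of rows with a sentinel value such as -1. The sketch below illustrates that idea in plain Python. It is not GluonCV's batchify implementation, and pad_labels and count_real_objects are hypothetical names.

```python
ROW_WIDTH = 5  # x_min, y_min, x_max, y_max, class_id


def pad_labels(label_list, pad_val=-1.0):
    """Pad per-image label lists so every image has the same number of rows.

    Each element of `label_list` is a list of rows
    [x_min, y_min, x_max, y_max, class_id] for one image.
    """
    max_objects = max((len(labels) for labels in label_list), default=0)
    padded = []
    for labels in label_list:
        n_missing = max_objects - len(labels)
        filler = [[pad_val] * ROW_WIDTH for _ in range(n_missing)]
        padded.append([list(row) for row in labels] + filler)
    return padded


def count_real_objects(padded_labels, pad_val=-1.0):
    """Count rows whose class_id is not the padding value."""
    return sum(1 for row in padded_labels if row[4] != pad_val)


batch = [
    [[1.08, 187.69, 611.67, 472.53, 45.0], [249.6, 229.27, 564.84, 473.35, 50.0]],
    [[376.2, 40.36, 450.75, 85.89, 49.0]],
]
for labels in pad_labels(batch):
    print(len(labels), "rows,", count_real_objects(labels), "real objects")
```

Downstream code then has to skip the padded rows, for example by testing whether class_id equals the padding value as count_real_objects does.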

Total running time of the script: ( 4 minutes 26.396 seconds)
