
01. Load web datasets with GluonCV Auto Module

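This tutorial introduces the basic dataset preprocessing used to download and load arbitrary custom datasets, as long as they follow one of the supported data formats.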

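The current version supports loading datasets for:
- Image classification (CSV lists with raw images, or raw images separated into folders)
- Object detection (Pascal VOC format or COCO JSON annotations)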

Stay tuned for new applications and formats. Contributions that bring new formats to GluonCV are also welcome!

With the introduction out of the way, let's see how to load web datasets into the recommended formats supported by the GluonCV auto module.

Image Classification

Managing the labels of an image classification dataset is straightforward. This example shows a few ways to organize them.

First, labels can be inferred automatically from a nested folder structure, for example:

root/car/0001.jpg
root/car/xxxa.jpg
root/car/yyyb.jpg
root/bus/123.png
root/bus/023.jpg
root/bus/wwww.jpg

or, with train/val/test splits:

root/train/car/0001.jpg
root/train/car/xxxa.jpg
root/train/bus/123.png
root/train/bus/023.jpg
root/test/car/yyyb.jpg
root/test/bus/wwww.jpg

Here root is the root folder, and the car and bus classes are each organized in their own sub-directory.
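To make the folder convention concrete, here is a minimal standard-library sketch of how labels could be inferred from such a layout. It is not GluonCV's implementation: the helper name infer_labels, the sorted-name label order, and the throwaway files are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def infer_labels(root, exts=('.jpg', '.jpeg', '.png')):
    """Return (rows, classes); each row is (image_path, label_index)."""
    root = Path(root)
    # Each immediate sub-directory of root is treated as one class.
    classes = sorted(p.name for p in root.iterdir() if p.is_dir())
    rows = []
    for index, name in enumerate(classes):
        for path in sorted((root / name).iterdir()):
            # Keep only files whose extension is in the allowed list.
            if path.suffix.lower() in exts:
                rows.append((str(path), index))
    return rows, classes


# Build a throwaway folder tree that mirrors the layout above.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ('car/0001.jpg', 'car/xxxa.jpg', 'bus/123.png', 'bus/notes.txt'):
        target = Path(tmp) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    rows, classes = infer_labels(tmp)

print(classes)                       # ['bus', 'car']
print([label for _, label in rows])  # [0, 1, 1]
```

The non-image file notes.txt is skipped because its extension is not in exts, which mirrors the role of the exts argument in the tutorial's from_folders call.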

from gluoncv.auto.tasks import ImageClassification

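We can use ImageClassification.Dataset to load a dataset from a folder. Here root can be a local path or a URL. If it is a URL, the archive is by default downloaded and extracted automatically to ~/.gluoncv. To change this default behavior, edit ~/.gluoncv/config.yml.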

train, val, test = ImageClassification.Dataset.from_folders(
    'https://autogluon.s3.amazonaws.com/datasets/shopee-iet.zip',
    train='train', val='val', test='test', exts=('.jpg', '.jpeg', '.png'))

Out:

Downloading /root/.gluoncv/archive/shopee-iet.zip from https://autogluon.s3.amazonaws.com/datasets/shopee-iet.zip...

100%|##########| 40895/40895 [00:02<00:00, 20447.05KB/s]
data/
├── test/
└── train/

Train split

print('train', train)

Out:

train                                                  image  label
  /root/.gluoncv/datasets/shopee-iet/data/train/...      0
  /root/.gluoncv/datasets/shopee-iet/data/train/...      0
  /root/.gluoncv/datasets/shopee-iet/data/train/...      0
  /root/.gluoncv/datasets/shopee-iet/data/train/...      0
  /root/.gluoncv/datasets/shopee-iet/data/train/...      0
..                                                 ...    ...
/root/.gluoncv/datasets/shopee-iet/data/train/...      3
/root/.gluoncv/datasets/shopee-iet/data/train/...      3
/root/.gluoncv/datasets/shopee-iet/data/train/...      3
/root/.gluoncv/datasets/shopee-iet/data/train/...      3
/root/.gluoncv/datasets/shopee-iet/data/train/...      3

[800 rows x 2 columns]

Test split

print('test', test)

Out:

test                                                 image  label
 /root/.gluoncv/datasets/shopee-iet/data/test/B...      0
 /root/.gluoncv/datasets/shopee-iet/data/test/B...      0
 /root/.gluoncv/datasets/shopee-iet/data/test/B...      0
 /root/.gluoncv/datasets/shopee-iet/data/test/B...      0
 /root/.gluoncv/datasets/shopee-iet/data/test/B...      0
..                                                ...    ...
/root/.gluoncv/datasets/shopee-iet/data/test/w...      3
/root/.gluoncv/datasets/shopee-iet/data/test/w...      3
/root/.gluoncv/datasets/shopee-iet/data/test/w...      3
/root/.gluoncv/datasets/shopee-iet/data/test/w...      3
/root/.gluoncv/datasets/shopee-iet/data/test/w...      3

[80 rows x 2 columns]

You may notice that the dataset is a pandas DataFrame, which is convenient. It is also fine for some splits to be empty. In this case, the validation split is empty:

print('validation', val)

Out:

validation Empty ImageClassificationDataset
Columns: [image, label]
Index: []

You can split the training set into train and val for training and validation:

train, val, _ = train.random_split(val_size=0.1, test_size=0)
print(len(train), len(val))

Out:

721 79
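As a rough illustration of what a random split does, the sketch below shuffles the rows and cuts them by fraction, using only the standard library. It is an assumption-laden stand-in, not GluonCV's random_split: the function, its seed parameter, and the exact-fraction rounding are hypothetical, and the 721/79 result above shows that the library's split sizes need not match the requested fractions exactly.

```python
import random


def random_split(rows, val_size=0.1, test_size=0.1, seed=0):
    """Shuffle rows and cut them into train/val/test lists."""
    if not 0 <= val_size + test_size < 1:
        raise ValueError('val_size + test_size must be in [0, 1)')
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_size)
    n_test = round(len(shuffled) * test_size)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# 800 stand-in rows, matching the size of the training set above.
train_rows, val_rows, test_rows = random_split(range(800), val_size=0.1, test_size=0)
print(len(train_rows), len(val_rows), len(test_rows))  # 720 80 0
```

Every row lands in exactly one split, so no example can leak from training into validation.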

In some cases you may get a raw folder without splits. You can use from_folder instead:

dataset = ImageClassification.Dataset.from_folder('https://s3.amazonaws.com/fast-ai-imageclas/oxford-iiit-pet.tgz')

Out:

Downloading /root/.gluoncv/archive/oxford-iiit-pet.tgz from https://s3.amazonaws.com/fast-ai-imageclas/oxford-iiit-pet.tgz...

100%|##########| 792683/792683 [00:12<00:00, 65670.02KB/s]
oxford-iiit-pet/
├── annotations/
└── images/

print(dataset)

Out:

                                                  image  label
   /root/.gluoncv/datasets/oxford-iiit-pet/oxford...      1
   /root/.gluoncv/datasets/oxford-iiit-pet/oxford...      1
   /root/.gluoncv/datasets/oxford-iiit-pet/oxford...      1
   /root/.gluoncv/datasets/oxford-iiit-pet/oxford...      1
   /root/.gluoncv/datasets/oxford-iiit-pet/oxford...      1
...                                                 ...    ...
/root/.gluoncv/datasets/oxford-iiit-pet/oxford...      1
/root/.gluoncv/datasets/oxford-iiit-pet/oxford...      1
/root/.gluoncv/datasets/oxford-iiit-pet/oxford...      1
/root/.gluoncv/datasets/oxford-iiit-pet/oxford...      1
/root/.gluoncv/datasets/oxford-iiit-pet/oxford...      1

[7390 rows x 2 columns]

Visualize image classification dataset

You can plot sample images with show_images, for example:

train.show_images(nsample=16, ncol=4, shuffle=True, fontsize=64)

[Figure: a 4 x 4 grid of 16 sample training images, each titled with its class name and label index (BabyPants: 0, BabyShirt: 1, womencasualshoes: 2, womenchiffontop: 3).]

Object Detection

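Labels for object detection are slightly more complicated: additional information such as bounding box coordinates has to be stored in a specific format.

GluonCV supports loading from the common Pascal VOC and COCO formats.

The main difference between the VOC and COCO formats is how the annotations are stored.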

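For VOC, raw images and annotations are stored in their own directories, and annotations are usually kept per image. For example, JPEGImages/0001.jpeg and Annotations/0001.xml form a valid image-label pair.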
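For readers unfamiliar with the per-image VOC layout, the following standard-library sketch parses one annotation file into a plain dictionary. The XML content, the box values, and the helper name parse_voc are made up for illustration; this is not the parser used by from_voc.

```python
import xml.etree.ElementTree as ET

# A made-up, minimal VOC-style annotation for a single image.
VOC_XML = """<annotation>
  <filename>0001.jpeg</filename>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object>
    <name>motorbike</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Parse one VOC annotation into a dict with image size and boxes."""
    root = ET.fromstring(xml_text)
    size = root.find('size')
    record = {
        'image': root.findtext('filename'),
        'width': int(size.findtext('width')),
        'height': int(size.findtext('height')),
        'boxes': [],
    }
    # One <object> element per labeled instance in the image.
    for obj in root.iter('object'):
        box = obj.find('bndbox')
        entry = {'class': obj.findtext('name')}
        for key in ('xmin', 'ymin', 'xmax', 'ymax'):
            entry[key] = int(box.findtext(key))
        record['boxes'].append(entry)
    return record


voc_record = parse_voc(VOC_XML)
print(voc_record['image'], voc_record['boxes'])
```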

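In contrast, the COCO format stores all labels in a single annotation file. For example, all training annotations are stored in instances_train2017.json and the validation annotations in instances_val2017.json.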
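Because a COCO file holds every annotation in one JSON document, a loader first has to regroup boxes by image. The standard-library sketch below shows that regrouping on a tiny made-up annotation file. The sample values and the helper name group_coco are illustrative assumptions, not part of GluonCV; the [x, y, width, height] box layout is the one defined by the COCO format.

```python
import json
from collections import defaultdict

# A made-up, minimal COCO-style annotation file with a single image and box.
COCO_JSON = json.dumps({
    'images': [{'id': 1, 'file_name': '0001.jpg', 'width': 500, 'height': 375}],
    'categories': [{'id': 4, 'name': 'motorcycle'}],
    'annotations': [
        {'id': 10, 'image_id': 1, 'category_id': 4, 'bbox': [48, 240, 147, 131]},
    ],
})


def group_coco(json_text):
    """Group COCO annotations by image file name, as corner-style boxes."""
    coco = json.loads(json_text)
    names = {c['id']: c['name'] for c in coco['categories']}
    files = {im['id']: im['file_name'] for im in coco['images']}
    boxes = defaultdict(list)
    for ann in coco['annotations']:
        x, y, w, h = ann['bbox']  # COCO boxes are [x, y, width, height]
        boxes[files[ann['image_id']]].append({
            'class': names[ann['category_id']],
            'xmin': x, 'ymin': y, 'xmax': x + w, 'ymax': y + h,
        })
    return dict(boxes)


per_image = group_coco(COCO_JSON)
print(per_image['0001.jpg'])
```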

Apart from identifying the right format for the dataset at hand, loading datasets into gluoncv differs very little between the two:

from gluoncv.auto.tasks import ObjectDetection

A subset of Pascal VOC

dataset = ObjectDetection.Dataset.from_voc('https://autogluon.s3.amazonaws.com/datasets/tiny_motorbike.zip')

Out:

Downloading /root/.gluoncv/archive/tiny_motorbike.zip from https://autogluon.s3.amazonaws.com/datasets/tiny_motorbike.zip...

21273KB [00:01, 17295.73KB/s]
tiny_motorbike/
├── Annotations/
├── ImageSets/
└── JPEGImages/

The dataset is once again a pandas DataFrame

print(dataset)

Out:

                                                 image  ...                         image_attr
  /root/.gluoncv/datasets/tiny_motorbike/tiny_mo...  ...  {'width': 500.0, 'height': 375.0}
  /root/.gluoncv/datasets/tiny_motorbike/tiny_mo...  ...  {'width': 500.0, 'height': 375.0}
  /root/.gluoncv/datasets/tiny_motorbike/tiny_mo...  ...  {'width': 500.0, 'height': 333.0}
  /root/.gluoncv/datasets/tiny_motorbike/tiny_mo...  ...  {'width': 500.0, 'height': 375.0}
  /root/.gluoncv/datasets/tiny_motorbike/tiny_mo...  ...  {'width': 333.0, 'height': 500.0}
..                                                 ...  ...                                ...
/root/.gluoncv/datasets/tiny_motorbike/tiny_mo...  ...  {'width': 500.0, 'height': 333.0}
/root/.gluoncv/datasets/tiny_motorbike/tiny_mo...  ...  {'width': 500.0, 'height': 375.0}
/root/.gluoncv/datasets/tiny_motorbike/tiny_mo...  ...  {'width': 500.0, 'height': 375.0}
/root/.gluoncv/datasets/tiny_motorbike/tiny_mo...  ...  {'width': 500.0, 'height': 375.0}
/root/.gluoncv/datasets/tiny_motorbike/tiny_mo...  ...  {'width': 500.0, 'height': 331.0}

[220 rows x 3 columns]

The dataset also supports random splits

train, val, test = dataset.random_split(val_size=0.1, test_size=0.1)
print('train', len(train), 'val', len(val), 'test', len(test))

Out:

train 170 val 23 test 27

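For object detection, each entry of the rois column is a list of dictionaries describing the bounding boxes, and image_attr is an optional attribute that can speed up some image preprocessing functions. For example: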

print(train.loc[0])

Out:

image         /root/.gluoncv/datasets/tiny_motorbike/tiny_mo...
rois          [{'class': 'bicycle', 'xmin': 0.316, 'ymin': 0...
image_attr                    {'width': 500.0, 'height': 375.0}
Name: 0, dtype: object
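The xmin value of 0.316 in the output above suggests that box coordinates are stored as fractions of the image size. Under that assumption, a row can be converted to pixel coordinates with image_attr, as in this sketch. The sample row is a stand-in: the xmax and ymax keys and all values except xmin, width, and height are hypothetical, since the real output is truncated, and rois_to_pixels is not a GluonCV function.

```python
def rois_to_pixels(row):
    """Scale fractional box coordinates to pixels using image_attr."""
    width = row['image_attr']['width']
    height = row['image_attr']['height']
    pixel_boxes = []
    for roi in row['rois']:
        pixel_boxes.append({
            'class': roi['class'],
            # x coordinates scale with image width, y coordinates with height.
            'xmin': round(roi['xmin'] * width),
            'ymin': round(roi['ymin'] * height),
            'xmax': round(roi['xmax'] * width),
            'ymax': round(roi['ymax'] * height),
        })
    return pixel_boxes


# Hypothetical row shaped like the printed one; only xmin and image_attr
# are taken from the output above, the rest is made up for illustration.
sample_row = {
    'image': 'example.jpg',
    'rois': [{'class': 'bicycle', 'xmin': 0.316, 'ymin': 0.4, 'xmax': 0.75, 'ymax': 1.0}],
    'image_attr': {'width': 500.0, 'height': 375.0},
}
pixel_boxes = rois_to_pixels(sample_row)
print(pixel_boxes)
```

Having width and height stored in image_attr means the conversion does not need to open the image file, which is one way such an attribute can speed up preprocessing.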

Visualize object detection dataset

You can plot sample images together with their bounding boxes using show_images, for example:

train.show_images(nsample=16, ncol=4, shuffle=True, fontsize=64)

[Figure: a 4 x 4 grid of 16 sample training images with bounding boxes drawn, each titled by its image index, e.g. Image(54), Image(26), Image(130).]

What's next

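You can now work with arbitrary datasets, such as Kaggle competition datasets. To start training, see these tutorials:
- 02. Train Image Classification with Auto Estimator
- 03. Train classifier or detector with HPO using GluonCV Auto task

You may also check out the d8 dataset (http://preview.d2l.ai/d8/main/), which ships with built-in datasets. D8 datasets are fully compatible with gluoncv.auto: a dataset loaded from d8 can be plugged in directly and trained with the fit function.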

Total running time of the script: (0 minutes 46.529 seconds)
