AI Model Zoo

Introduction

This repository provides optimized deep learning models that speed up the deployment of deep learning inference on Xilinx™ platforms. The models cover a range of applications, including ADAS/AD, video surveillance, robotics, and data center. These pre-trained models are free to use, so you can start benefiting from deep learning acceleration right away.

Model Information

The following table lists, for each model, the application, framework, training and validation datasets, backbone, input size, computation per image, and the float and fixed-point accuracy.

No.	Application	Model	Name	Framework	Backbone	Input Size	OPS per image	Training Set	Val Set	Float (Top1, Top5)/mAP/mIoU	Fixed (Top1, Top5)/mAP/mIoU
1	Image Classification	resnet50	cf_resnet50_imagenet_224_224_7.7G	caffe	resnet50	224*224	7.7G	ImageNet Train	ImageNet Validation	0.74828/0.92135	0.7338/0.9130
2	Image Classification	Inception_v1	cf_inceptionv1_imagenet_224_224_3.16G	caffe	inception_v1	224*224	3.16G	ImageNet Train	ImageNet Validation	0.689/0.897	0.69882/0.894122
3	Image Classification	Inception_v2	cf_inceptionv2_imagenet_224_224_4G	caffe	bn-inception	224*224	4G	ImageNet Train	ImageNet Validation	0.7283/0.9109	0.7170/0.9033
4	Image Classification	Inception_v3	cf_inceptionv3_imagenet_299_299_11.4G	caffe	inception_v3	299*299	11.4G	ImageNet Train	ImageNet Validation	0.77058/0.93326	0.76264/0.930322
5	Image Classification	mobileNet_v2	cf_mobilenetv2_imagenet_224_224_0.59G	caffe	MobileNet_v2	224*224	608M	ImageNet Train	ImageNet Validation	0.6649/0.872362	0.635219/0.850701
6	Image Classification	tf_resnet50	tf_resnet50_imagenet_224_224_6.97G	tensorflow	resnet50	224*224	6.97G	ImageNet Train	ImageNet Validation	0.7520/0.9219	0.7420/0.9209
7	Image Classification	tf_inception_v1	tf_inceptionv1_imagenet_224_224_3G	tensorflow	inception_v1	224*224	3.0G	ImageNet Train	ImageNet Validation	0.6976/0.8963	0.6786/0.8885
8	Image Classification	tf_mobilenet_v2	tf_mobilenetv2_imagenet_224_224_1.17G	tensorflow	MobileNet_v2	224*224	1.17G	ImageNet Train	ImageNet Validation	0.7487/0.9250	0.2720/-
9	ADAS Vehicle Detection	ssd_adas_pruned_0.95	cf_ssdadas_bdd_360_480_0.95_6.3G	caffe	VGG-16	360*480	6.3G	bdd100k + private data	bdd100k + private data	0.426	0.424
10	ADAS Pedestrian Detection	ssd_pedestrain_pruned_0.97	cf_ssdpedestrian_coco_360_640_0.97_5.9G	caffe	VGG-bn-16	360*640	5.9G	coco2014_train_person and crowdhuman	coco2014_val_person	0.5899	0.585
11	Traffic Detection	ssd_traffic_pruned_0.9	cf_ssdtraffic_360_480_0.9_11.6G	caffe	VGG-16	360*480	11.6G	private data	private data	0.602	0.588
12	Object Detection	ssd_mobilnet_v2	cf_ssdmobilenetv2_bdd_360_480_6.57G	caffe	MobileNet_v2	360*480	6.57G	bdd100k train	bdd100k val	0.3186	0.3019
13	Object Detection	tf_ssd_voc	tf_ssd_voc_300_300_64.81G	tensorflow	VGG-bn-16	300*300	64.81G	voc07+12_trainval	voc07_test	0.7942(11 points)	0.7882(11 points)
14	Face Detection	densebox_320_320	cf_densebox_wider_320_320_0.49G	caffe	VGG-16	320*320	0.49G	wider_face	FDDB	0.8818	0.8768
15	Face Detection	densebox_360_640	cf_densebox_wider_360_640_1.11G	caffe	VGG-16	360*640	1.11G	wider_face	FDDB	0.8909	0.8909
16	ADAS Detection	yolov3_adas_prune_0.9	dk_yolov3_cityscapes_256_512_0.9_5.46G	darknet	darknet-53	256*512	5.46G	Cityscapes train	Cityscapes val	55.20%	53.00%
17	Object Detection	yolov3_voc	dk_yolov3_voc_416_416_65.42G	darknet	darknet-53	416*416	65.42G	voc07+12_trainval	voc07_test	82.4%(MaxIntegral)	81.5%(MaxIntegral)
18	Object Detection	tf_yolov3_voc	tf_yolov3_voc_416_416_65.63G	tensorflow	darknet-53	416*416	65.63G	voc07+12_trainval	voc07_test	78.46%(11 points)	77.38%(11 points)
19	Object Detection	refinedet_pruned_0.8	cf_refinedet_coco_360_480_0.8_25G	caffe	VGG-bn-16	360*480	25G	coco2014_train_person	coco2014_val_person	67.68%	67.47%
20	Object Detection	refinedet_pruned_0.92	cf_refinedet_coco_360_480_0.92_10.10G	caffe	VGG-bn-16	360*480	10.10G	coco2014_train_person	coco2014_val_person	64.60%	64.50%
21	Object Detection	refinedet_pruned_0.96	cf_refinedet_coco_360_480_0.96_5.08G	caffe	VGG-bn-16	360*480	5.08G	coco2014_train_person	coco2014_val_person	60.89%	60.65%
22	ADAS Segmentation	FPN	cf_fpn_cityscapes_256_512_8.9G	caffe	Google_v1_BN	256*512	8.9G	Cityscapes gtFineTrain(2975)	Cityscapes Val(500)	0.5669	0.5645
23	ADAS Lane Detection	VPGnet_pruned_0.99	cf_VPGnet_caltechlane_480_640_0.99_2.5G	caffe	VGG	480*640	2.5G	caltech-lanes-train-dataset	caltech lane	88.639%(F1-score)	87%(F1-score)
24	Pose Estimation	SP-net	cf_SPnet_aichallenger_224_128_0.54G	caffe	Google_v1_BN	128*224	548.6M	ai_challenger	ai_challenger	88.2%(PCKh0.5)	87.86%(PCKh0.5)
25	Pose Estimation	Openpose_pruned_0.3	cf_openpose_aichallenger_368_368_0.3_189.7G	caffe	VGG	368*368	49.88G	ai_challenger	ai_challenger	0.45067(OKS)	0.44287(OKS)
26	Object Detection	yolov2_voc	dk_yolov2_voc_448_448_34G	darknet	darknet-19	448*448	34G	voc07+12_trainval	voc07_test	78.45%(MaxIntegral)	77.39%(MaxIntegral)
27	Object Detection	yolov2_voc_pruned_0.66	dk_yolov2_voc_448_448_0.66_11.56G	darknet	darknet-19	448*448	11.56G	voc07+12_trainval	voc07_test	77%(MaxIntegral)	76%(MaxIntegral)
28	Object Detection	yolov2_voc_pruned_0.71	dk_yolov2_voc_448_448_0.71_9.86G	darknet	darknet-19	448*448	9.86G	voc07+12_trainval	voc07_test	76.7%(MaxIntegral)	75.3%(MaxIntegral)
29	Object Detection	yolov2_voc_pruned_0.77	dk_yolov2_voc_448_448_0.77_7.82G	darknet	darknet-19	448*448	7.82G	voc07+12_trainval	voc07_test	75.76%(MaxIntegral)	74.6%(MaxIntegral)
30	Image Classification	Inception-v4	cf_inceptionv4_imagenet_299_299_24.5G	caffe	inception	299*299	24.5G	ImageNet Train	ImageNet Validation	79.59%/94.70%	78.99%/94.45%
31	Image Classification	SqueezeNet	cf_squeeze_imagenet_227_227_0.76G	caffe	squeezenet	227*227	0.76G	ImageNet Train	ImageNet Validation	54.64%/78.20%	50.69%/77.01%
32	Face Recognition	face_landmark	cf_landmark_celeba_96_72_0.14G	caffe	lenet	96*72	0.14G	celebA	processed helen	0.03704(MAE)	0.03692(MAE)
33	Re-identification	reid	cf_reid_marketcuhk_160_80_0.95G	caffe	resnet18	160*80	0.95G	Market1501+CUHK03	Market1501	78.00%	77.60%
34	Object Detection	yolov3_bdd	cf_yolov3_bdd_288_512_53.7G	caffe	darknet-53	288*512	53.7G	bdd100k	bdd100k	50.60%	49.14%
35	Image Classification	tf_mobilenet_v1	tf_mobilenetv1_imagenet_224_224_1.14G	tensorflow	MobileNet_v1	224*224	1.14G	ImageNet Train	ImageNet Validation	71.06%/89.72%	67.87%/87.67%
36	Image Classification	resnet18	cf_resnet18_imagenet_224_224_3.65G	caffe	resnet18	224*224	3.65G	ImageNet Train	ImageNet Validation	68.44%/88.64%	66.94%/88.25%
37	Image Classification	resnet18_wide	tf_resnet18_imagenet_224_224_28G	tensorflow	resnet18	224*224	28G	ImageNet Train	ImageNet Validation	68.91%/88.63%	69.86%/88.96%

Naming Rules

Model name: F_M_D_H_W_(P)_C

F specifies training framework: cf is Caffe, tf is Tensorflow, dk is Darknet, pt is PyTorch
M specifies the model
D specifies the dataset
H specifies the height of input data
W specifies the width of input data
P specifies the pruning ratio, that is, the fraction of computation removed. It is optional and appears only when the model is pruned.
C specifies the computation of the model, in Gops per image

For example, cf_refinedet_coco_360_480_0.8_25G is a RefineDet model trained with Caffe on the COCO dataset. Its input size is 360*480 (height*width), it is 80% pruned, and its computation per image is 25 Gops.
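The naming rule can be applied mechanically. The following is a minimal sketch, not part of the Model Zoo tooling: `ModelName` and `parse_model_name` are hypothetical names, and the parser assumes that the pruning ratio is the only field before C that contains a decimal point. A few names in the Model Information table (for example cf_ssdtraffic_360_480_0.9_11.6G) omit the dataset field, so the sketch treats D as optional.

```python
from dataclasses import dataclass
from typing import Optional

# Framework prefixes as listed in the naming rules.
FRAMEWORKS = {"cf": "Caffe", "tf": "Tensorflow", "dk": "Darknet", "pt": "PyTorch"}


@dataclass
class ModelName:
    framework: str
    model: str
    dataset: Optional[str]          # None when the name omits the dataset field
    height: int
    width: int
    pruning_ratio: Optional[float]  # None when the model is not pruned
    computation: str                # kept as text, e.g. "25G"


def parse_model_name(name: str) -> ModelName:
    """Split an F_M_D_H_W_(P)_C name into its fields (hypothetical helper)."""
    parts = name.split("_")
    if len(parts) < 5 or parts[0] not in FRAMEWORKS:
        raise ValueError(f"not a recognised model name: {name!r}")
    prefix, *middle, computation = parts

    # Assumption: only the pruning ratio contains a dot among the middle fields.
    pruning_ratio = None
    if "." in middle[-1]:
        pruning_ratio = float(middle.pop())
    if len(middle) < 3:
        raise ValueError(f"missing model, height or width in {name!r}")

    # The last two remaining fields are height and width, in that order.
    width = int(middle.pop())
    height = int(middle.pop())
    model = middle[0]
    dataset = "_".join(middle[1:]) or None
    return ModelName(FRAMEWORKS[prefix], model, dataset,
                     height, width, pruning_ratio, computation)
```

Calling `parse_model_name("cf_refinedet_coco_360_480_0.8_25G")` yields a Caffe RefineDet entry with height 360, width 480 and pruning ratio 0.8.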

Model Download

The following table lists each model with its size, download link, and the MD5 checksum of its zip file.

Note: To download all the models, visit all_models.zip.

If you are a:

Linux user, use the get_model.sh script to download all the models.
Windows user, use the download link listed in the following table to download a model.

No.	Model	Size	Download link	Checksum
1	resnet50	226.61 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_resnet50_imagenet_224_224_7.7G.zip	a1158f0558254b94bbf05651b04893af
2	Inception_v1	86.47 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_inceptionv1_imagenet_224_224_3.16G.zip	9cad57664719e106d1dfe81f0730e1a2
3	Inception_v2	143.38 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_inceptionv2_imagenet_224_224_4G.zip	13439f7c01b769f72724d0d9bd5f1f87
4	Inception_v3	212.43 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_inceptionv3_imagenet_299_299_11.4G.zip	f6415422c49087dfbc933fd0d2e451ed
5	mobileNet_v2	33.17 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_mobilenetv2_imagenet_224_224_0.59G.zip	a698a297abc8607503e15f47ea5de539
6	tf_resnet50	204.41 MB	https://www.xilinx.com/bin/public/openDownload?filename=tf_resnet50_imagenet_224_224_6.97G.zip	ffce2c0461d0e914d6d1eb3e81b0c825
7	tf_inception_v1	53.44 MB	https://www.xilinx.com/bin/public/openDownload?filename=tf_inceptionv1_imagenet_224_224_3G.zip	64f58dd36e28726a62b964284bb91508
8	tf_mobilenet_v2	49.84 MB	https://www.xilinx.com/bin/public/openDownload?filename=tf_mobilenetv2_imagenet_224_224_1.17G.zip	47e70eae53af73e77664d9871456511f
9	ssd_adas_pruned_0.95	10.97 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_ssdadas_bdd_360_480_0.95_6.3G.zip	02c14f5b3a4641bef2f6713625f9bf95
10	ssd_pedestrain_pruned_0.97	7.32 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_ssdpedestrian_coco_360_640_0.97_5.9G.zip	d913a529e8885451b670f865bec21c3a
11	ssd_traffic_pruned_0.9	17.49 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_ssdtraffic_360_480_0.9_11.6G.zip	a978c750f14b879c45daf0379198c015
12	ssd_mobilnet_v2	98.48 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_ssdmobilenetv2_bdd_360_480_6.57G.zip	bbd9b6a5429db3341115df8eb19d30cc
13	tf_ssd_voc	209.66 MB	https://www.xilinx.com/bin/public/openDownload?filename=tf_ssd_voc_300_300_64.81G.zip	9f7081ec490148eb4709c0075b6db58e
14	densebox_320_320	4.64 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_densebox_wider_320_320_0.49G.zip	e7cf3260a84422640f115e4ae62bd963
15	densebox_360_640	4.64 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_densebox_wider_360_640_1.11G.zip	53da8c489d73c72ad94b38f624157380
16	yolov3_adas_prune_0.9	35.81 MB	https://www.xilinx.com/bin/public/openDownload?filename=dk_yolov3_cityscapes_256_512_5.46G.zip	20530268484ff9a2ff67804ad1c19b3b
17	yolov3_voc	940.03 MB	https://www.xilinx.com/bin/public/openDownload?filename=dk_yolov3_voc_416_416_65.42G.zip	d8265f80521da8e3251ea57798818c31
18	tf_yolov3_voc	500.07 MB	https://www.xilinx.com/bin/public/openDownload?filename=tf_yolov3_voc_416_416_65.63G.zip	c5923313c7570226d4a9249ea68b6fdd
19	refinedet_pruned_0.8	37.34 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_refinedet_coco_360_480_0.8_25G.zip	8ae8521ad5d754bb473a2527dfa5a805
20	refinedet_pruned_0.92	10.2 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_refinedet_coco_360_480_0.92_10.10G.zip	b3fa2804b699915e3dc6bf88478308d8
21	refinedet_pruned_0.96	5.07 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_refinedet_coco_360_480_0.96_5.08G.zip	51e8fb7639786a476829c8286b7e1843
22	FPN	55.98 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_fpn_cityscapes_256_512_8.9G.zip	2f29e526a604f81ae07654a5c5f50dc8
23	VPGnet_pruned_0.99	6.89 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_VPGnet_caltechlane_480_640_0.99_2.5G.zip	697672ac6d91418e16c19978889cb827
24	SP-net	17.32 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_SPnet_aichallenger_224_128_0.54G.zip	41769a269984a183362f2492f719a0d1
25	Openpose_pruned_0.3	315.37 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_openpose_aichallenger_368_368_0.3_189.7G.zip	3e2f9fac5dcdfbc30d663b2f218ebc6c
26	yolov2_voc	476.34 MB	https://www.xilinx.com/bin/public/openDownload?filename=dk_yolov2_voc_448_448_34G.zip	a6f439314bdf65d0d4684c8cdc96c3dd
27	yolov2_voc_pruned_0.66	223.22 MB	https://www.xilinx.com/bin/public/openDownload?filename=dk_yolov2_voc_448_448_0.66_11.56G.zip	9fa27b6cfe81e5f3a62004dc12cabbe7
28	yolov2_voc_pruned_0.71	202.25 MB	https://www.xilinx.com/bin/public/openDownload?filename=dk_yolov2_voc_448_448_0.71_9.86G.zip	6a67d3182cf52dae2023ef3255c128e6
29	yolov2_voc_pruned_0.77	146.51 MB	https://www.xilinx.com/bin/public/openDownload?filename=dk_yolov2_voc_448_448_0.77_7.82G.zip	662857523d9762c7fe74cc3597cf5fd6
30	Inception-v4	380.38 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_inceptionv4_imagenet_299_299_24.5G.zip	e75b600ca020446626b6700b04ba5f5f
31	SqueezeNet	11.27 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_squeeze_imagenet_227_227_0.76G.zip	20befe2e854d1e36230e77f283ee3d39
32	face_landmark	50.42 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_landmark_celeba_96_72_0.14G.zip	44236176d313f8a51098d060cf3ad07d
33	reid	98.33 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_reid_marketcuhk_160_80_0.95G.zip	bb2ca45bf1e57949a66cb3bf52adce8f
34	yolov3_bdd	944.14 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_yolov3_bdd_288_512_53.7G.zip	25802e6b0e0ae0ac3f0ccea105d2a829
35	tf_mobilenet_v1	42.43 MB	https://www.xilinx.com/bin/public/openDownload?filename=tf_mobilenetv1_imagenet_224_224_1.14G.zip	4337b02322441ce1686ce19fc1a36d82
36	resnet18	178.45 MB	https://www.xilinx.com/bin/public/openDownload?filename=cf_resnet18_imagenet_224_224_3.65G.zip	2380212df49e7c9584bdaef646c470f7
37	resnet18_wide	393.64 MB	https://www.xilinx.com/bin/public/openDownload?filename=tf_resnet18_imagenet_224_224_28G.zip	32f782a084f2f2de089c9eb4f1c3e364
/	All models	6.31GB	https://www.xilinx.com/bin/public/openDownload?filename=all_models.zip	0fc242102699cad110027ecfff453d91
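After downloading an archive, its MD5 checksum can be compared with the value in the table. The following is a minimal sketch using only the Python standard library; `md5_of_file` and `verify_download` are hypothetical helper names, not part of the get_model.sh script.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True when the file's MD5 matches the checksum from the table."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Example call, assuming the resnet50 archive was saved under its original name:
# verify_download("cf_resnet50_imagenet_224_224_7.7G.zip",
#                 "a1158f0558254b94bbf05651b04893af")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the archive should be fetched again.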

Model Directory Structure

Download and extract the model archive to your working area on the local hard disk. For the download link and MD5 checksum of each model's zip file, see Model Download.

Caffe Model Directory Structure

For a Caffe model, you should see the following directory structure:

├── labelmap.prototxt               # Detection class information for some models,
│                                     such as SSD and RefineDet.
├── readme.md                       # Environment requirements and data preprocessing information.
│                                     Refer to this file for how to create `float.prototxt` by adding
│                                     a data layer to `test.prototxt` in the `float` directory.
├── deploy
│   ├── deploy.caffemodel           # Input to the compiler. Same as deploy.caffemodel in the `fix` directory.
│   └── deploy.prototxt             # Input to the compiler. Based on deploy.prototxt in the `fix` directory,
│                                     modified to remove layers that are unnecessary or unsupported
│                                     for compilation.
├── fix
│   ├── deploy.caffemodel           # Quantized weights, the unmodified output of decent_q.
│   ├── deploy.prototxt             # Quantized prototxt, the unmodified output of decent_q.
│   ├── fix_test.prototxt           # Used to run evaluation with fix_train_test.caffemodel on GPU
│   │                                 using Python test code to be released in the near future. Models
│   │                                 converted from Darknet (Yolov2, Yolov3) or Pytorch (ReID), or
│   │                                 that have no Caffe Test (Densebox), do not include this file.
│   ├── fix_train_test.caffemodel   # Quantized weights that can be used for fixed-point training and evaluation.
│   └── fix_train_test.prototxt     # Used for fixed-point training and testing with fix_train_test.caffemodel
│                                     on GPU once the data layer is modified to point to the user's data path.
└── float
    ├── float.caffemodel            # Trained floating-point weights.
    ├── float.prototxt              # test.prototxt modified to serve as the input to decent_q along
    │                                 with float.caffemodel. decent_q is the Xilinx quantization tool,
    │                                 which quantizes a floating-point model to fixed-point with minimal
    │                                 accuracy loss.
    ├── test.prototxt               # Used to run evaluation with Python test code to be released in the near future.
    └── trainval.prototxt           # Used for training and testing with the caffe train/test command
                                      once the data layer is modified to point to the user's data path. Models
                                      converted from Darknet (Yolov2, Yolov3) or Pytorch (ReID), or
                                      that have no Caffe Test (Densebox), do not include this file.

Note: For more information on decent_q, see the DNNDK User Guide.

Tensorflow Model Directory Structure

For a Tensorflow model, you should see the following directory structure:

├── input_fn.py                     # Python function that reads images from the calibration dataset and preprocesses them.
├── readme.md                       # Environment requirements, input and output nodes, and
│                                     data preprocessing and postprocessing information.
├── fix
│   ├── deploy.model.pb             # Quantized model for the compiler (extended Tensorflow format).
│   └── quantize_eval_model.pb      # Quantized model for evaluation.
└── float
    └── frozen.pb                   # Floating-point frozen model, the input to `decent_q`.
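An extracted archive can be checked against the two layouts described in this section. The following is a minimal sketch: `REQUIRED_FILES` and `missing_files` are hypothetical names, and the file lists are an assumption that includes only the files the trees above do not describe as optional (for example, `labelmap.prototxt`, `fix_test.prototxt` and the trainval prototxt are left out because some models do not ship them).

```python
from pathlib import Path

# Assumption: files expected in every archive of the given framework.
REQUIRED_FILES = {
    "caffe": [
        "readme.md",
        "deploy/deploy.caffemodel",
        "deploy/deploy.prototxt",
        "fix/deploy.caffemodel",
        "fix/deploy.prototxt",
        "float/float.caffemodel",
        "float/float.prototxt",
    ],
    "tensorflow": [
        "readme.md",
        "input_fn.py",
        "fix/deploy.model.pb",
        "fix/quantize_eval_model.pb",
        "float/frozen.pb",
    ],
}


def missing_files(model_dir, framework):
    """Return the expected relative paths that are absent under model_dir."""
    root = Path(model_dir)
    return [rel for rel in REQUIRED_FILES[framework]
            if not (root / rel).is_file()]
```

An empty result means every expected file is present; any returned path points to an incomplete extraction or to a model that deviates from the layout.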

Model Performance

All the models in the Model Zoo have been deployed on Xilinx hardware with DNNDK™ (Deep Neural Network Development Kit) and Xilinx AI SDK. The following sections list the end-to-end throughput and latency of each model on various boards with different DPU configurations.

For more information about the DPU, see the DPU IP Product Guide.

Note: The performance numbers listed in the following sections were generated with DNNDK v3.1 and Xilinx AI SDK v2.0.x. A different DPU configuration is used for each board. DNNDK and Xilinx AI SDK can be downloaded for free from https://www.xilinx.com/products/design-tools/ai-inference/ai-developer-hub.html.

Performance on ZCU102 (0432055-04)

The following table lists the end-to-end throughput and latency of each model on the ZCU102 (0432055-04) board with a 3 * B4096 @ 287MHz V1.4.0 DPU configuration:

No.	Model	Name	E2E latency (ms), single thread	E2E throughput (fps), single thread	E2E throughput (fps), multi-thread
1	resnet50	cf_resnet50_imagenet_224_224_7.7G	12.85	77.8	179.3
2	Inception_v1	cf_inceptionv1_imagenet_224_224_3.16G	5.47	182.683	485.533
3	Inception_v2	cf_inceptionv2_imagenet_224_224_4G	6.76	147.933	373.267
4	Inception_v3	cf_inceptionv3_imagenet_299_299_11.4G	17	58.8333	155.4
5	mobileNet_v2	cf_mobilenetv2_imagenet_224_224_0.59G	4.09	244.617	638.067
6	tf_resnet50	tf_resnet50_imagenet_224_224_6.97G	11.94	83.7833	191.417
7	tf_inception_v1	tf_inceptionv1_imagenet_224_224_3G	6.72	148.867	358.283
8	tf_mobilenet_v2	tf_mobilenetv2_imagenet_224_224_1.17G	5.46	183.117	458.65
9	ssd_adas_pruned_0.95	cf_ssdadas_bdd_360_480_0.95_6.3G	11.33	88.2667	320.5
10	ssd_pedestrain_pruned_0.97	cf_ssdpedestrian_coco_360_640_0.97_5.9G	12.96	77.1833	314.717
11	ssd_traffic_pruned_0.9	cf_ssdtraffic_360_480_0.9_11.6G	17.49	57.1833	218.183
12	ssd_mobilnet_v2	cf_ssdmobilenetv2_bdd_360_480_6.57G	24.21	41.3	141.233
13	tf_ssd_voc	tf_ssd_voc_300_300_64.81G	69.28	14.4333	46.7833
14	densebox_320_320	cf_densebox_wider_320_320_0.49G	2.43	412.183	1416.63
15	densebox_360_640	cf_densebox_wider_360_640_1.11G	5.01	199.717	719.75
16	yolov3_adas_prune_0.9	dk_yolov3_cityscapes_256_512_0.9_5.46G	11.09	90.1667	259.65
17	yolov3_voc	dk_yolov3_voc_416_416_65.42G	70.51	14.1833	44.4
18	tf_yolov3_voc	tf_yolov3_voc_416_416_65.63G	70.75	14.1333	44.0167
19	refinedet_pruned_0.8	cf_refinedet_coco_360_480_0.8_25G	29.91	33.4333	109.067
20	refinedet_pruned_0.92	cf_refinedet_coco_360_480_0.92_10.10G	15.39	64.9667	216.317
21	refinedet_pruned_0.96	cf_refinedet_coco_360_480_0.96_5.08G	11.04	90.5833	312
22	FPN	cf_fpn_cityscapes_256_512_8.9G	16.58	60.3	203.867
23	VPGnet_pruned_0.99	cf_VPGnet_caltechlane_480_640_0.99_2.5G	9.44	105.9	424.667
24	SP-net	cf_SPnet_aichallenger_224_128_0.54G	1.73	579.067	1620.67
25	Openpose_pruned_0.3	cf_openpose_aichallenger_368_368_0.3_189.7G	279.07	3.58333	16.55
26	yolov2_voc	dk_yolov2_voc_448_448_34G	39.76	25.15	86.35
27	yolov2_voc_pruned_0.66	dk_yolov2_voc_448_448_0.66_11.56G	18.42	54.2833	211.217
28	yolov2_voc_pruned_0.71	dk_yolov2_voc_448_448_0.71_9.86G	16.42	60.9167	242.433
29	yolov2_voc_pruned_0.77	dk_yolov2_voc_448_448_0.77_7.82G	14.46	69.1667	286.733
30	Inception-v4	cf_inceptionv4_imagenet_299_299_24.5G	34.25	29.2	84.25
31	SqueezeNet	cf_squeeze_imagenet_227_227_0.76G	3.6	277.65	1080.77
32	face_landmark	cf_landmark_celeba_96_72_0.14G	1.13	885.033	1623.3
33	reid	cf_reid_marketcuhk_160_80_0.95G	2.67	375	773.533
34	yolov3_bdd	cf_yolov3_bdd_288_512_53.7G	73.89	13.5333	42.8833
35	tf_mobilenet_v1	tf_mobilenetv1_imagenet_224_224_1.14G	3.2	312.067	875.967
36	resnet18	cf_resnet18_imagenet_224_224_3.65G	5.1	195.95	524.433
37	resnet18_wide	tf_resnet18_imagenet_224_224_28G	33.28	30.05	83.4167
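The single-thread throughput in the table appears to be approximately the reciprocal of the single-thread latency: for resnet50, 1000 / 12.85 ms is about 77.8 fps. The following is a minimal sketch of that relationship and of the multi-thread scaling factor; the function names are hypothetical, and the reciprocal relationship is an observation from the table values, not a documented definition of how the numbers were measured.

```python
def fps_from_latency(latency_ms):
    """Frames per second implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms


def multi_thread_speedup(fps_single, fps_multi):
    """How many times faster the multi-thread run is than the single-thread run."""
    return fps_multi / fps_single


# resnet50 row of the ZCU102 (0432055-04) table: 12.85 ms, 77.8 fps, 179.3 fps.
implied_fps = fps_from_latency(12.85)
speedup = multi_thread_speedup(77.8, 179.3)
print(f"implied single-thread fps: {implied_fps:.1f}")
print(f"multi-thread speedup: {speedup:.2f}x")
```

For rows whose latency is rounded to two decimals, the implied value can differ slightly from the listed throughput.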

Performance on ZCU102 (0432055-05)

The following table lists the end-to-end throughput and latency of each model on the ZCU102 (0432055-05) board with a 3 * B4096 @ 287MHz V1.4.0 DPU configuration:

No.	Model	Name	E2E latency (ms), single thread	E2E throughput (fps), single thread	E2E throughput (fps), multi-thread
1	resnet50	cf_resnet50_imagenet_224_224_7.7G	12.98	77.0167	163.417
2	Inception_v1	cf_inceptionv1_imagenet_224_224_3.16G	5.51	181.65	452.4
3	Inception_v2	cf_inceptionv2_imagenet_224_224_4G	6.8	147	345.7
4	Inception_v3	cf_inceptionv3_imagenet_299_299_11.4G	17.11	58.45	144.9
5	mobileNet_v2	cf_mobilenetv2_imagenet_224_224_0.59G	4.13	241.9	587.25
6	tf_resnet50	tf_resnet50_imagenet_224_224_6.97G	12.07	82.85	173.267
7	tf_inception_v1	tf_inceptionv1_imagenet_224_224_3G	6.77	147.65	330.583
8	tf_mobilenet_v2	tf_mobilenetv2_imagenet_224_224_1.17G	5.52	181.067	422.15
9	ssd_adas_pruned_0.95	cf_ssdadas_bdd_360_480_0.95_6.3G	11.32	88.3167	306.267
10	ssd_pedestrain_pruned_0.97	cf_ssdpedestrian_coco_360_640_0.97_5.9G	12.96	77.1667	309.4
11	ssd_traffic_pruned_0.9	cf_ssdtraffic_360_480_0.9_11.6G	17.48	57.2	216
12	ssd_mobilnet_v2	cf_ssdmobilenetv2_bdd_360_480_6.57G	24.67	40.5333	124.733
13	tf_ssd_voc	tf_ssd_voc_300_300_64.81G	69.61	14.3667	46.9833
14	densebox_320_320	cf_densebox_wider_320_320_0.49G	2.46	406.2	1311.8
15	densebox_360_640	cf_densebox_wider_360_640_1.11G	5.04	198.533	645.567
16	yolov3_adas_prune_0.9	dk_yolov3_cityscapes_256_512_0.9_5.46G	11.16	89.6333	239.667
17	yolov3_voc	dk_yolov3_voc_416_416_65.42G	70.67	14.15	43.6167
18	tf_yolov3_voc	tf_yolov3_voc_416_416_65.63G	71.01	14.0833	43.0833
19	refinedet_pruned_0.8	cf_refinedet_coco_360_480_0.8_25G	29.94	33.4	107.533
20	refinedet_pruned_0.92	cf_refinedet_coco_360_480_0.92_10.10G	15.48	64.6167	210.817
21	refinedet_pruned_0.96	cf_refinedet_coco_360_480_0.96_5.08G	11.06	90.45	298.217
22	FPN	cf_fpn_cityscapes_256_512_8.9G	16.68	59.95	188.533
23	VPGnet_pruned_0.99	cf_VPGnet_caltechlane_480_640_0.99_2.5G	9.39	106.45	396.85
24	SP-net	cf_SPnet_aichallenger_224_128_0.54G	1.74	574.833	1516.78
25	Openpose_pruned_0.3	cf_openpose_aichallenger_368_368_0.3_189.7G	279.07	3.58333	16.6333
26	yolov2_voc	dk_yolov2_voc_448_448_34G	39.84	25.1	84.5667
27	yolov2_voc_pruned_0.66	dk_yolov2_voc_448_448_0.66_11.56G	18.44	54.2333	206.067
28	yolov2_voc_pruned_0.71	dk_yolov2_voc_448_448_0.71_9.86G	16.44	60.8167	238.017
29	yolov2_voc_pruned_0.77	dk_yolov2_voc_448_448_0.77_7.82G	14.48	69.0667	279.35
30	Inception-v4	cf_inceptionv4_imagenet_299_299_24.5G	34.46	29.0167	78.5
31	SqueezeNet	cf_squeeze_imagenet_227_227_0.76G	3.64	274.767	1012.17
32	face_landmark	cf_landmark_celeba_96_72_0.14G	1.15	871.333	1444.25
33	reid	cf_reid_marketcuhk_160_80_0.95G	2.7	370.317	702.8
34	yolov3_bdd	cf_yolov3_bdd_288_512_53.7G	74.07	13.5	42.0833
35	tf_mobilenet_v1	tf_mobilenetv1_imagenet_224_224_1.14G	3.23	309.65	809.5
36	resnet18	cf_resnet18_imagenet_224_224_3.65G	5.18	193.067	477.05
37	resnet18_wide	tf_resnet18_imagenet_224_224_28G	33.41	29.9333	80.0667

Performance on FPGA board: ZCU104

The following table lists the end-to-end throughput and latency of each model on the ZCU104 board with a 2 * B4096 @ 305MHz V1.4.0 DPU configuration:

No.	Model	Name	E2E latency (ms), single thread	E2E throughput (fps), single thread	E2E throughput (fps), multi-thread
1	resnet50	cf_resnet50_imagenet_224_224_7.7G	12.13	82.45	151.8
2	Inception_v1	cf_inceptionv1_imagenet_224_224_3.16G	5.07	197.333	404.933
3	Inception_v2	cf_inceptionv2_imagenet_224_224_4G	6.33	158.033	310.15
4	Inception_v3	cf_inceptionv3_imagenet_299_299_11.4G	16.03	62.3667	126.283
5	mobileNet_v2	cf_mobilenetv2_imagenet_224_224_0.59G	3.85	259.833	536.95
6	tf_resnet50	tf_resnet50_imagenet_224_224_6.97G	11.31	88.45	163.65
7	tf_inception_v1	tf_inceptionv1_imagenet_224_224_3G	6.35	157.367	305.467
8	tf_mobilenet_v2	tf_mobilenetv2_imagenet_224_224_1.17G	5.21	191.867	380.933
9	ssd_adas_pruned_0.95	cf_ssdadas_bdd_360_480_0.95_6.3G	10.69	93.5333	242.917
10	ssd_pedestrain_pruned_0.97	cf_ssdpedestrian_coco_360_640_0.97_5.9G	12.13	82.45	236.083
11	ssd_traffic_pruned_0.9	cf_ssdtraffic_360_480_0.9_11.6G	16.48	60.6667	159.617
12	ssd_mobilnet_v2	cf_ssdmobilenetv2_bdd_360_480_6.57G	37.78	26.4667	116.433
13	tf_ssd_voc	tf_ssd_voc_300_300_64.81G	75.09	13.3167	33.5667
14	densebox_320_320	cf_densebox_wider_320_320_0.49G	2.33	428.533	1167.35
15	densebox_360_640	cf_densebox_wider_360_640_1.11G	4.65	215.017	626.317
16	yolov3_adas_prune_0.9	dk_yolov3_cityscapes_256_512_0.9_5.46G	10.51	95.1667	228.383
17	yolov3_voc	dk_yolov3_voc_416_416_65.42G	66.37	15.0667	33
18	tf_yolov3_voc	tf_yolov3_voc_416_416_65.63G	66.74	14.9833	32.8
19	refinedet_pruned_0.8	cf_refinedet_coco_360_480_0.8_25G	28	35.7167	79.1333
20	refinedet_pruned_0.92	cf_refinedet_coco_360_480_0.92_10.10G	14.54	68.7833	160.6
21	refinedet_pruned_0.96	cf_refinedet_coco_360_480_0.96_5.08G	10.39	96.2333	241.783
22	FPN	cf_fpn_cityscapes_256_512_8.9G	15.72	63.6167	177.333
23	VPGnet_pruned_0.99	cf_VPGnet_caltechlane_480_640_0.99_2.5G	8.91	112.233	355.717
24	SP-net	cf_SPnet_aichallenger_224_128_0.54G	1.6	626.5	1337.33
25	Openpose_pruned_0.3	cf_openpose_aichallenger_368_368_0.3_189.7G	267.86	3.73333	12.1333
26	yolov2_voc	dk_yolov2_voc_448_448_34G	37.66	26.55	63.7833
27	yolov2_voc_pruned_0.66	dk_yolov2_voc_448_448_0.66_11.56G	17.51	57.1167	158.917
28	yolov2_voc_pruned_0.71	dk_yolov2_voc_448_448_0.71_9.86G	15.63	63.9667	186.867
29	yolov2_voc_pruned_0.77	dk_yolov2_voc_448_448_0.77_7.82G	13.78	72.55	224.883
30	Inception-v4	cf_inceptionv4_imagenet_299_299_24.5G	32.33	30.9333	64.6
31	SqueezeNet	cf_squeeze_imagenet_227_227_0.76G	3.52	284.033	940.917
32	face_landmark	cf_landmark_celeba_96_72_0.14G	1.02	977.683	1428.2
33	reid	cf_reid_marketcuhk_160_80_0.95G	2.45	407.583	702.717
34	yolov3_bdd	cf_yolov3_bdd_288_512_53.7G	69.77	14.3333	31.7
35	tf_mobilenet_v1	tf_mobilenetv1_imagenet_224_224_1.14G	3.03	330.25	728.35
36	resnet18	cf_resnet18_imagenet_224_224_3.65G	4.84	206.65	428.55
37	resnet18_wide	tf_resnet18_imagenet_224_224_28G	31.23	32.0167	62.7667
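The throughput figures can be related to the DPU configuration by estimating how much of the peak compute a model uses. The following is a rough sketch under two assumptions that this document does not state: that B4096 denotes a peak of 4096 operations per clock cycle per DPU core (check the DPU IP Product Guide), and that the "OPS per image" column counts operations the same way. The function names are hypothetical.

```python
def peak_gops(cores, ops_per_cycle, clock_mhz):
    """Theoretical peak in Gops/s for a DPU configuration (assumed model)."""
    # cores * ops/cycle * cycles/second, converted from ops/s to Gops/s.
    return cores * ops_per_cycle * clock_mhz / 1000.0


def achieved_gops(gops_per_image, fps):
    """Effective Gops/s delivered end to end at a given frame rate."""
    return gops_per_image * fps


# ZCU104: 2 * B4096 @ 305MHz; resnet50 needs 7.7 Gops per image
# and reaches 151.8 fps with multiple threads.
peak = peak_gops(cores=2, ops_per_cycle=4096, clock_mhz=305)
achieved = achieved_gops(gops_per_image=7.7, fps=151.8)
utilization = achieved / peak
print(f"peak {peak:.2f} Gops/s, achieved {achieved:.2f} Gops/s, "
      f"utilization {utilization:.1%}")
```

Because the throughput is measured end to end, the estimate also absorbs pre- and post-processing time on the CPU, so it is a lower bound on how busy the DPU itself is.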

Performance on Ultra96

The following table lists the end-to-end throughput and latency of each model on the Ultra96 board with a 1 * B1600 @ 287MHz V1.4.0 DPU configuration:

Note: The original power supply of the Ultra96 is not designed for high-performance AI workloads. The board may occasionally hang when running some models with multiple threads. In such cases, NA is shown in the following table.

No.	Model	Name	E2E latency (ms), single thread	E2E throughput (fps), single thread	E2E throughput (fps), multi-thread
1	resnet50	cf_resnet50_imagenet_224_224_7.7G	30.8	32.4667	33.4667
2	Inception_v1	cf_inceptionv1_imagenet_224_224_3.16G	13.98	71.55	75.0667
3	Inception_v2	cf_inceptionv2_imagenet_224_224_4G	17.16	58.2667	61.2833
4	Inception_v3	cf_inceptionv3_imagenet_299_299_11.4G	44.05	22.7	23.4333
5	mobileNet_v2	cf_mobilenetv2_imagenet_224_224_0.59G	7.34	136.183	NA
6	tf_resnet50	tf_resnet50_imagenet_224_224_6.97G	28.02	35.6833	36.6
7	tf_inception_v1	tf_inceptionv1_imagenet_224_224_3G	16.96	58.9667	61.2833
8	tf_mobilenet_v2	tf_mobilenetv2_imagenet_224_224_1.17G	10.17	98.3	104.25
9	ssd_adas_pruned_0.95	cf_ssdadas_bdd_360_480_0.95_6.3G	24.3	41.15	46.2
10	ssd_pedestrain_pruned_0.97	cf_ssdpedestrian_coco_360_640_0.97_5.9G	23.29	42.9333	50.8
11	ssd_traffic_pruned_0.9	cf_ssdtraffic_360_480_0.9_11.6G	35.5	28.1667	31.8
12	ssd_mobilnet_v2	cf_ssdmobilenetv2_bdd_360_480_6.57G	60.79	16.45	27.8167
13	tf_ssd_voc	tf_ssd_voc_300_300_64.81G	186.92	5.35	5.81667
14	densebox_320_320	cf_densebox_wider_320_320_0.49G	4.17	239.883	334.167
15	densebox_360_640	cf_densebox_wider_360_640_1.11G	8.55	117	167.2
16	yolov3_adas_prune_0.9	dk_yolov3_cityscapes_256_512_0.9_5.46G	22.79	43.8833	49.6833
17	yolov3_voc	dk_yolov3_voc_416_416_65.42G	185.19	5.4	5.53
18	tf_yolov3_voc	tf_yolov3_voc_416_416_65.63G	199.34	5.01667	5.1
19	refinedet_pruned_0.8	cf_refinedet_coco_360_480_0.8_25G	66.37	15.0667	NA
20	refinedet_pruned_0.92	cf_refinedet_coco_360_480_0.92_10.10G	32.17	31.0883	33.6667
21	refinedet_pruned_0.96	cf_refinedet_coco_360_480_0.96_5.08G	20.29	49.2833	55.25
22	FPN	cf_fpn_cityscapes_256_512_8.9G	36.34	27.5167	NA
23	VPGnet_pruned_0.99	cf_VPGnet_caltechlane_480_640_0.99_2.5G	13.9	71.9333	NA
24	SP-net	cf_SPnet_aichallenger_224_128_0.54G	3.82	261.55	277.4
25	Openpose_pruned_0.3	cf_openpose_aichallenger_368_368_0.3_189.7G	560.75	1.78333	NA
26	yolov2_voc	dk_yolov2_voc_448_448_34G	118.11	8.46667	8.9
27	yolov2_voc_pruned_0.66	dk_yolov2_voc_448_448_0.66_11.56G	37.5	26.6667	30.65
28	yolov2_voc_pruned_0.71	dk_yolov2_voc_448_448_0.71_9.86G	30.99	32.2667	38.35
29	yolov2_voc_pruned_0.77	dk_yolov2_voc_448_448_0.77_7.82G	26.29	38.03333	46.8333
30	Inception-v4	cf_inceptionv4_imagenet_299_299_24.5G	88.76	11.2667	11.5333
31	SqueezeNet	cf_squeeze_imagenet_227_227_0.76G	5.96	167.867	283.583
32	face_landmark	cf_landmark_celeba_96_72_0.14G	2.95	339.183	347.633
33	reid	cf_reid_marketcuhk_160_80_0.95G	6.28	159.15	166.633
34	yolov3_bdd	cf_yolov3_bdd_288_512_53.7G	193.55	5.16667	5.31667
35	tf_mobilenet_v1	tf_mobilenetv1_imagenet_224_224_1.14G	5.97	167.567	186.55
36	resnet18	cf_resnet18_imagenet_224_224_3.65G	13.47	74.2167	77.8167
37	resnet18_wide	tf_resnet18_imagenet_224_224_28G	97.72	10.2333	10.3833
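When the performance tables are processed by a script, the NA entries need explicit handling. The following is a minimal sketch that parses one tab-separated row and maps NA to `None`; `PerfRow` and `parse_perf_row` are hypothetical names, and the six-column, tab-separated layout is assumed from the tables in this section.

```python
from typing import NamedTuple, Optional


class PerfRow(NamedTuple):
    model: str
    name: str
    latency_ms: float
    fps_single: float
    fps_multi: Optional[float]  # None where the table shows NA


def parse_perf_row(line):
    """Parse one tab-separated row of a performance table."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}")
    _, model, name, latency, single, multi = fields
    # NA marks a multi-thread run that could not be completed on the board.
    fps_multi = None if multi.strip().upper() == "NA" else float(multi)
    return PerfRow(model, name, float(latency), float(single), fps_multi)


row = parse_perf_row(
    "5\tmobileNet_v2\tcf_mobilenetv2_imagenet_224_224_0.59G\t7.34\t136.183\tNA"
)
print(row.model, row.fps_multi)
```

Keeping the missing value as `None` rather than 0 avoids skewing averages or speedup ratios computed over the table.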

Contributing

We welcome community contributions. When contributing to this repository, first discuss the change you wish to make via:

You can also submit a pull request with details on how to improve the product. Prior to submitting your pull request, ensure that you can build the product and run all the demos with your patch. In case of a larger feature, provide a relevant demo.

License

Xilinx AI Model Zoo is licensed under Apache License Version 2.0. By contributing to the project, you agree to the license and copyright terms therein and release your contribution under these terms.
