xgen-report's People

Contributors

uniyushu, xipengshen

xgen-report's Issues

Downloaded model cannot be detected by XGen immediately

In XGen interactive mode, if the selected model is not present, XGen asks the user to download it. After I downloaded the model and relaunched XGen interactive mode, XGen still quit and asked me to download the model when I selected the same model.

If this is caused by Docker (i.e., something we cannot fix), it would be better to warn the user about the potential issue so that they are not confused.

By the way, I did not exit the container at any point in the process, since the notice appears to assume that the user downloads the model from within the container. It would be better to state explicitly whether the user should exit the container. (Otherwise the mix of interactive-mode operations and container shell operations is confusing to people who are not familiar with containers.)

How to enable fp16 quantization on XGen?

How does one execute quantized models with XGen? I tried manually modifying the ONNX file to convert the fp32 weights to fp16, which did not work. I then tried converting an fp16 PyTorch model to ONNX format, which gives me the error below.

Is quantization to fp16 not yet supported by XGen? The paper states that XGen used quantization to achieve speedups over TFLite, but I'm not sure whether this happens natively under the hood in XGen or whether it is a different type of quantization from fp32->fp16. Maybe I missed a detail in the paper or documentation, but I cannot figure out the root cause. As far as I know, FP16 should be possible hardware-wise on both the CPU (asimd asimdhp asimdrdm asimddp flags) and the GPU (Adreno 640) of the device I am using (Samsung S10e).

onnx_latency,fallback_latency,output_dir,slowest_device,all,input_shape,output_shape,error,device,file_name,file_path,pruning,time_cost,params,IR,MACs
inf,inf,unable to produce model.,R38M20BDTME,"{'R38M20BDTME': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': 'Operator ""float16"" is not supported in deepopt mode as tensor type for (conv1.weight)\nInput model is not supported in deepopt mode\nInput model is not supported in fallback mode\n'}}","{'input': [1, 1, 640, 360]}","{'output': [1, 1, 2560, 1440]}",unable to produce model.,R38M20BDTME,SuperResolutionTwitter.onnx,dnnModels/SuperResolutionTwitter.onnx,1.0,7.956344127655029,61680,88473600,14207385600.0
inf,inf,unable to produce model.,R38M20BDTME,"{'R38M20BDTME': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': 'Operator ""AddV2, Conv2D, DepthToSpace, Relu, Transpose"" is not supported in fallback mode.\nOperator ""float16"" is not supported in deepopt mode as tensor type for (conv1.weight)\nInput model is not supported in deepopt mode\nInput model is not supported in fallback mode\n'}}","{'input': [1, 1, 640, 360]}","{'output': [1, 1, 2560, 1440]}",unable to produce model.,R38M20BDTME,SuperResolutionTwitter.onnx,dnnModels/SuperResolutionTwitter.onnx,0.0,17.748647451400757,61690,88473600,14207385600
inf,inf,unable to produce model.,RF8M21Y9MNR,"{'RF8M21Y9MNR': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': 'Operator ""float16"" is not supported in deepopt mode as tensor type for (conv1.weight)\nInput model is not supported in deepopt mode\nInput model is not supported in fallback mode\n'}}","{'input': [1, 1, 640, 360]}","{'output': [1, 1, 2560, 1440]}",unable to produce model.,RF8M21Y9MNR,SuperResolutionTwitter.onnx,dnnModels/SuperResolutionTwitter.onnx,1.0,3.979487657546997,61680,88473600,14207385600.0
inf,inf,unable to produce model.,RF8M21Y9MNR,"{'RF8M21Y9MNR': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': 'Operator ""AddV2, Conv2D, DepthToSpace, Relu, Transpose"" is not supported in fallback mode.\nOperator ""float16"" is not supported in deepopt mode as tensor type for (conv1.weight)\nInput model is not supported in deepopt mode\nInput model is not supported in fallback mode\n'}}","{'input': [1, 1, 640, 360]}","{'output': [1, 1, 2560, 1440]}",unable to produce model.,RF8M21Y9MNR,SuperResolutionTwitter.onnx,dnnModels/SuperResolutionTwitter.onnx,0.0,17.093093633651733,61690,88473600,14207385600
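
The per-device error text is buried in the "all" column of the result CSV, so it may help to pull it out before reading it. Below is a minimal sketch that does this with the Python standard library only. The helper name `extract_errors` is hypothetical and not part of XGen, and the sample is the first data row quoted above; the sketch assumes the "all" column is always a Python-dict repr in which the only non-literal token is a bare `inf`.

```python
import ast
import csv
import io
import re

# Header and first data row copied from the report above (raw string keeps "\n" as two characters).
SAMPLE_CSV = r'''onnx_latency,fallback_latency,output_dir,slowest_device,all,input_shape,output_shape,error,device,file_name,file_path,pruning,time_cost,params,IR,MACs
inf,inf,unable to produce model.,R38M20BDTME,"{'R38M20BDTME': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': 'Operator ""float16"" is not supported in deepopt mode as tensor type for (conv1.weight)\nInput model is not supported in deepopt mode\nInput model is not supported in fallback mode\n'}}","{'input': [1, 1, 640, 360]}","{'output': [1, 1, 2560, 1440]}",unable to produce model.,R38M20BDTME,SuperResolutionTwitter.onnx,dnnModels/SuperResolutionTwitter.onnx,1.0,7.956344127655029,61680,88473600,14207385600.0
'''


def extract_errors(csv_text):
    """Hypothetical helper: list (file_name, pruning, device, error_lines) per CSV row."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: bare `inf` is the only token ast.literal_eval cannot parse,
        # so quote it before evaluating the dict repr.
        literal = re.sub(r"(?<=: )inf\b", "'inf'", row["all"])
        per_device = ast.literal_eval(literal)
        for device, info in per_device.items():
            lines = [ln for ln in info.get("error", "").splitlines() if ln]
            results.append((row["file_name"], row["pruning"], device, lines))
    return results


if __name__ == "__main__":
    for file_name, pruning, device, lines in extract_errors(SAMPLE_CSV):
        print(f"{file_name} (pruning={pruning}) on {device}:")
        for line in lines:
            print("  ", line)
```

For the row above, the first extracted line is the float16 tensor-type message for `conv1.weight`, which is the line that names the unsupported dtype.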

support for distributed training

XGen v1.1 seems to support training on only one GPU. For example, with custom AI it cannot handle the following command.

python -m torch.distributed.launch --nproc_per_node=8 train.py

It would be better to support distributed training with multiple GPUs in future versions.
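
For context, here is a minimal sketch of what a training script typically has to read when it is started by `torch.distributed.launch`: the launcher sets the `RANK` and `WORLD_SIZE` environment variables and passes a `--local_rank` argument (spelled `--local-rank` in newer PyTorch releases) to each process. The helper `read_dist_context` is hypothetical, uses only the standard library, and is not taken from XGen; how XGen would wire this into custom AI is an open question.

```python
import argparse
import os


def read_dist_context(argv=None, env=None):
    """Hypothetical helper: collect the rank information a launcher provides."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    # Accept both spellings of the launcher-supplied argument.
    parser.add_argument("--local_rank", "--local-rank", dest="local_rank",
                        type=int, default=None)
    # Ignore the script's own training arguments.
    args, _unknown = parser.parse_known_args(argv)

    local_rank = args.local_rank
    if local_rank is None:
        # Fall back to the environment variable; -1 means "not launched in distributed mode".
        local_rank = int(env.get("LOCAL_RANK", -1))
    world_size = int(env.get("WORLD_SIZE", 1))
    rank = int(env.get("RANK", -1))
    return {
        "local_rank": local_rank,
        "rank": rank,
        "world_size": world_size,
        "distributed": world_size > 1,
    }


if __name__ == "__main__":
    print(read_dist_context())
```

With `--nproc_per_node=8` on a single node, each of the eight processes would see `world_size` equal to 8 and a distinct `local_rank` from 0 to 7.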

Fail to use built-in Yolov6

Dear Author,

I found that YOLOv6 has a problem similar to the one I ran into two days ago.
Could you help me fix it?
Thanks.

Here is the training log file I have:

2023-10-25T13:26:25.661866+0000 - DEBUG - xgen_scripts.main:93 - XGen is running in /root/output/YOLOv6_CoCo2017/20231025132603_040_32_sim, xgen_logger_setup in xgen_scripts.py
2023-10-25T13:26:26.066104+0000 - INFO - core.training:374 - Your current workplace is /root/output/YOLOv6_CoCo2017/20231025132603_040_32_sim
2023-10-25T13:26:26.066511+0000 - INFO - core.training:390 - A new search is started!
2023-10-25T13:26:26.067789+0000 - DEBUG - task_gen.get_default_scenarios_plan:76 - default_plan: {'1': '1', '2': '2', '3': '1'}
xgen-config-path:  /root/output/YOLOv6_CoCo2017/20231025132603_040_32_sim/xgen_config.json
xgen-workplace:  /root/output/YOLOv6_CoCo2017/20231025132603_040_32_sim
xgen-resume:  False
xgen-mode:  scaling
xgen-pretrained-model-path:  /root/Projects/object-detection-yolov6/yolov6_xgen/yolov6_config/xgen.pt
detail args:  
{
    'origin': {
        'common_train_epochs': 300,
        'root_path': './Xgen/',
        'pretrain_model_weights_path': None,
        'train_data_path': '/data/object-detection-yolov6/coco',
        'train_label_path': None,
        'eval_data_path': '/data/object-detection-yolov6/coco',
        'eval_label_path': None,
        'learning_rate': 0.01,
        'batch_size': 16,
        'hyp': './data/hyps/hyp.scratch-high.yaml',
        'data': '/root/Projects/object-detection-yolov6/yolov6_xgen/data/coco.yaml',
        'conf_file': './configs/yolov6s.py',
        'weights': None,
        'device': None,
        'imgsz': 640,
        'width_multiple': 0.5,
        'depth_multiple': 0.33,
        'scaling_factor': 1,
        'workers': 16,
        'noplots': True,
        'num_classes': 80
    },
    'general': {'user_id': 'test', 'work_place': '/root/output/YOLOv6_CoCo2017/20231025132603_040_32_sim', 'random_seed': 3407, 'enable_ddp': False, 'CUDA_VISIBLE_DEVICES': '0', 'tran_scripts_path': None},
    'prune': {
        'sp_store_weights': None,
        'sp_lars': False,
        'sp_lars_trust_coef': 0.001,
        'sp_backbone': False,
        'sp_retrain': False,
        'sp_admm': False,
        'sp_admm_multi': False,
        'sp_retrain_multi': False,
        'sp_config_file': None,
        'sp_subset_progressive': False,
        'sp_admm_fixed_params': False,
        'sp_no_harden': False,
        'nv_sparse': False,
        'sp_load_prune_params': None,
        'sp_store_prune_params': None,
        'generate_rand_seq_gap_yaml': False,
        'sp_admm_update_epoch': 5,
        'sp_admm_update_batch': None,
        'sp_admm_rho': 0.001,
        'sparsity_type': 'block_punched',
        'sp_admm_lr': 0.01,
        'admm_debug': False,
        'sp_global_weight_sparsity': False,
        'sp_prune_threshold': -1.0,
        'sp_block_irregular_sparsity': '(0,0)',
        'sp_block_permute_multiplier': 2,
        'sp_admm_block': '(8,4)',
        'sp_admm_buckets_num': 16,
        'sp_admm_elem_per_row': 1,
        'sp_admm_tile': None,
        'sp_admm_select_number': 4,
        'sp_admm_pattern_row_sub': 1,
        'sp_admm_pattern_col_sub': 4,
        'sp_admm_data_format': None,
        'sp_admm_do_not_permute_conv': False,
        'sp_gs_output_v': None,
        'sp_gs_output_ptr': None,
        'sp_load_frozen_weights': None,
        'retrain_mask_pattern': 'weight',
        'sp_update_init_method': 'weight',
        'sp_mask_update_freq': 10,
        'retrain_mask_sparsity': -1.0,
        'retrain_mask_seed': None,
        'sp_prune_before_retrain': False,
        'output_compressed_format': False,
        'sp_grad_update': False,
        'sp_grad_decay': 0.98,
        'sp_grad_restore_threshold': -1,
        'sp_global_magnitude': False,
        'sp_pre_defined_mask_dir': None,
        'sp_prune_ratios': 0
    },
    'quantization': {
        'qt_aimet': False,
        'qat': True,
        'fold_layers': True,
        'cross_layer_equalization': False,
        'bias_correction': True,
        'rounding_mode': 'nearest',
        'num_quant_samples': 1000,
        'num_bias_correct_samples': 1000,
        'weight_bw': 8,
        'act_bw': 8,
        'quant_scheme': 'tf_enhanced',
        'layers_to_ignore': [],
        'auto_add_bias': True,
        'perform_only_empirical_bias_corr': True
    },
    'pas': {'pas_ratio': 0, 'pas': False, 'limit_loss_weights': 5.0, 'use_limit_loss': False, 'pas_debug': False, 'pas_rebuild': False, 'pas_finetune_epoch': 200, 'pas_pretrained_weight_path': None, 'pas_ignore': ['neck', 'detect', 'cv'], 'pas_searching_ratio': [0.1, 0.2, 0.3]},
    'task': {'specific_scenarios': 'BasicScaling', 'pretrained_model_path': '/root/Projects/object-detection-yolov6/yolov6_xgen/yolov6_config/xgen.pt', 'state': {'stage': 0, 'cycles': 0}, 'max_searching': 10, 'args_2': {'cycles': 10}},
    'user_requirements': {
        'power': None,
        'accuracy': 0.32,
        'accuracy_reverse_yn': 0,
        'model_size': None,
        'memory_size': None,
        'latency': 40.0,
        'margin': 4.0,
        'primary_type': 'latency',
        'primary_range': '<',
        'secondary_type': 'accuracy',
        'secondary_range': '>',
        'searching_variable': 'scaling_factor',
        'searching_range': [0.3, 1],
        'searching_step_size': 0.05,
        'searching_pas_variable': 'pas',
        'express_path': True,
        'target_type': 'latency',
        'searching_granularity': None,
        'using_default_dataset': True,
        'user_model': 'YOLOv6',
        'using_express_path': True,
        'express_mode': 0,
        'use_distillation': False,
        'use_default_distillation_model': True,
        'is_training': True
    },
    'train': {'common_save_best_yn': 1, 'trained_yn': False, 'larger_better': True},
    'compiler': {'input_shape': '(1,3,640,640)', 'opset_version': 11, 'devices': [], 'ios_devices': []},
    'distillation': {
        'distillation_method': 'classic_distillation',
        'enable_ddp': False,
        'enable_dp': False,
        'input_shape': None,
        'original_loss_weights': 0.1,
        'tag_loss_weights': 0.9,
        'tag_loss': 'kl',
        'tag_temperature': 4,
        'tag_loss_combination_method': 'avg',
        'feature_loss_weights': 0.9,
        'feature_default_temperature': 1,
        'advance_feature_mapping': {},
        'regularization_loss_weights': 1,
        'regularization_loss_types': [['tag_discriminator', 1]],
        'discriminator_lr': 0.0001
    }
}
Current search total stages:  3
           Current search stages info           
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Stage       ┃ Max search cycles              ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1           │ 1                              │
│ 2           │ 10                             │
│ 3           │ 1                              │
└─────────────┴────────────────────────────────┘
Using express path to find a suitable model...
2023-10-25T13:26:26.102893+0000 - INFO - core.training:427 - Current Session ID: session-xgen-ac03e012
2023-10-25T13:26:26.110832+0000 - DEBUG - task_gen.task_gen:48 - job_list: [['{"origin": {"common_train_epochs": 0, "root_path": "./Xgen/", "pretrain_model_weights_path": "/root/Projects/.checkpoints/yolov6/yolov6n_xgen2.pt", "train_data_path": "/data/object-detection-yolov6/coco", "train_label_path": null, "eval_data_path": "/data/object-detection-yolov6/coco", "eval_label_path": null, "learning_rate": 0.01, "batch_size": 16, "hyp": "./data/hyps/hyp.scratch-high.yaml", "data": "/root/Projects/object-detection-yolov6/yolov6_xgen/data/coco.yaml", "conf_file": "./configs/yolov6n.py", "weights": null, "device": null, "imgsz": 640, "width_multiple": 0.5, "depth_multiple": 0.33, "scaling_factor": 1, "workers": 16, "noplots": true, "num_classes": 80, "img_size": 640}, "general": {"user_id": "test", "work_place": "/root/output/YOLOv6_CoCo2017/20231025132603_040_32_sim", "random_seed": 3407, "enable_ddp": false, "CUDA_VISIBLE_DEVICES": "0", "tran_scripts_path": null}, "prune": {"sp_store_weights": null, "sp_lars": false, "sp_lars_trust_coef": 0.001, "sp_backbone": false, "sp_retrain": false, "sp_admm": false, "sp_admm_multi": false, "sp_retrain_multi": false, "sp_config_file": null, "sp_subset_progressive": false, "sp_admm_fixed_params": false, "sp_no_harden": false, "nv_sparse": false, "sp_load_prune_params": null, "sp_store_prune_params": null, "generate_rand_seq_gap_yaml": false, "sp_admm_update_epoch": 5, "sp_admm_update_batch": null, "sp_admm_rho": 0.001, "sparsity_type": "block_punched", "sp_admm_lr": 0.01, "admm_debug": false, "sp_global_weight_sparsity": false, "sp_prune_threshold": -1.0, "sp_block_irregular_sparsity": "(0,0)", "sp_block_permute_multiplier": 2, "sp_admm_block": "(8,4)", "sp_admm_buckets_num": 16, "sp_admm_elem_per_row": 1, "sp_admm_tile": null, "sp_admm_select_number": 4, "sp_admm_pattern_row_sub": 1, "sp_admm_pattern_col_sub": 4, "sp_admm_data_format": null, "sp_admm_do_not_permute_conv": false, "sp_gs_output_v": null, "sp_gs_output_ptr": null, 
"sp_load_frozen_weights": null, "retrain_mask_pattern": "weight", "sp_update_init_method": "weight", "sp_mask_update_freq": 10, "retrain_mask_sparsity": -1.0, "retrain_mask_seed": null, "sp_prune_before_retrain": false, "output_compressed_format": false, "sp_grad_update": false, "sp_grad_decay": 0.98, "sp_grad_restore_threshold": -1, "sp_global_magnitude": false, "sp_pre_defined_mask_dir": null, "sp_prune_ratios": 0}, "quantization": {"qt_aimet": false, "qat": true, "fold_layers": true, "cross_layer_equalization": false, "bias_correction": true, "rounding_mode": "nearest", "num_quant_samples": 1000, "num_bias_correct_samples": 1000, "weight_bw": 8, "act_bw": 8, "quant_scheme": "tf_enhanced", "layers_to_ignore": [], "auto_add_bias": true, "perform_only_empirical_bias_corr": true}, "pas": {"pas_ratio": 0, "pas": false, "limit_loss_weights": 5.0, "use_limit_loss": false, "pas_debug": false, "pas_rebuild": false, "pas_finetune_epoch": 200, "pas_pretrained_weight_path": null, "pas_ignore": ["neck", "detect", "cv"], "pas_searching_ratio": [0.1, 0.2, 0.3]}, "task": {"specific_scenarios": "BasicScaling", "pretrained_model_path": "/root/Projects/object-detection-yolov6/yolov6_xgen/yolov6_config/xgen.pt", "state": {"stage": 1, "cycles": 0}, "max_searching": 10, "args_2": {"cycles": 10}, "args_1": {"cycles": 1}}, "user_requirements": {"power": null, "accuracy": 0.32, "accuracy_reverse_yn": 0, "model_size": null, "memory_size": null, "latency": 40.0, "margin": 4.0, "primary_type": "latency", "primary_range": "<", "secondary_type": "accuracy", "secondary_range": ">", "searching_variable": "scaling_factor", "searching_range": [0.3, 1], "searching_step_size": 0.05, "searching_pas_variable": "pas", "express_path": true, "target_type": "latency", "searching_granularity": null, "using_default_dataset": true, "user_model": "YOLOv6", "using_express_path": true, "express_mode": 0, "use_distillation": false, "use_default_distillation_model": true, "is_training": true}, "train": 
{"common_save_best_yn": 1, "trained_yn": true, "larger_better": true, "uuid": "9ce87035-6997-4a"}, "compiler": {"input_shape": "(1,3,640,640)", "opset_version": 11, "devices": [], "ios_devices": []}, "distillation": {"distillation_method": "classic_distillation", "enable_ddp": false, "enable_dp": false, "input_shape": null, "original_loss_weights": 0.1, "tag_loss_weights": 0.9, "tag_loss": "kl", "tag_temperature": 4, "tag_loss_combination_method": "avg", "feature_loss_weights": 0.9, "feature_default_temperature": 1, "advance_feature_mapping": {}, "regularization_loss_weights": 1, "regularization_loss_types": [["tag_discriminator", 1]], "discriminator_lr": 0.0001}}'], ['{"origin": {"common_train_epochs": 0, "root_path": "./Xgen/", "pretrain_model_weights_path": "/root/Projects/.checkpoints/yolov6/yolov6n70_xgen.pt", "train_data_path": "/data/object-detection-yolov6/coco", "train_label_path": null, "eval_data_path": "/data/object-detection-yolov6/coco", "eval_label_path": null, "learning_rate": 0.01, "batch_size": 16, "hyp": "./data/hyps/hyp.scratch-high.yaml", "data": "/root/Projects/object-detection-yolov6/yolov6_xgen/data/coco.yaml", "conf_file": "./configs/yolov6n_70.py", "weights": null, "device": null, "imgsz": 640, "width_multiple": 0.5, "depth_multiple": 0.33, "scaling_factor": 1, "workers": 16, "noplots": true, "num_classes": 80, "img_size": 640}, "general": {"user_id": "test", "work_place": "/root/output/YOLOv6_CoCo2017/20231025132603_040_32_sim", "random_seed": 3407, "enable_ddp": false, "CUDA_VISIBLE_DEVICES": "0", "tran_scripts_path": null}, "prune": {"sp_store_weights": null, "sp_lars": false, "sp_lars_trust_coef": 0.001, "sp_backbone": false, "sp_retrain": false, "sp_admm": false, "sp_admm_multi": false, "sp_retrain_multi": false, "sp_config_file": null, "sp_subset_progressive": false, "sp_admm_fixed_params": false, "sp_no_harden": false, "nv_sparse": false, "sp_load_prune_params": null, "sp_store_prune_params": null, "generate_rand_seq_gap_yaml": 
false, "sp_admm_update_epoch": 5, "sp_admm_update_batch": null, "sp_admm_rho": 0.001, "sparsity_type": "block_punched", "sp_admm_lr": 0.01, "admm_debug": false, "sp_global_weight_sparsity": false, "sp_prune_threshold": -1.0, "sp_block_irregular_sparsity": "(0,0)", "sp_block_permute_multiplier": 2, "sp_admm_block": "(8,4)", "sp_admm_buckets_num": 16, "sp_admm_elem_per_row": 1, "sp_admm_tile": null, "sp_admm_select_number": 4, "sp_admm_pattern_row_sub": 1, "sp_admm_pattern_col_sub": 4, "sp_admm_data_format": null, "sp_admm_do_not_permute_conv": false, "sp_gs_output_v": null, "sp_gs_output_ptr": null, "sp_load_frozen_weights": null, "retrain_mask_pattern": "weight", "sp_update_init_method": "weight", "sp_mask_update_freq": 10, "retrain_mask_sparsity": -1.0, "retrain_mask_seed": null, "sp_prune_before_retrain": false, "output_compressed_format": false, "sp_grad_update": false, "sp_grad_decay": 0.98, "sp_grad_restore_threshold": -1, "sp_global_magnitude": false, "sp_pre_defined_mask_dir": null, "sp_prune_ratios": 0}, "quantization": {"qt_aimet": false, "qat": true, "fold_layers": true, "cross_layer_equalization": false, "bias_correction": true, "rounding_mode": "nearest", "num_quant_samples": 1000, "num_bias_correct_samples": 1000, "weight_bw": 8, "act_bw": 8, "quant_scheme": "tf_enhanced", "layers_to_ignore": [], "auto_add_bias": true, "perform_only_empirical_bias_corr": true}, "pas": {"pas_ratio": 0, "pas": false, "limit_loss_weights": 5.0, "use_limit_loss": false, "pas_debug": false, "pas_rebuild": false, "pas_finetune_epoch": 200, "pas_pretrained_weight_path": null, "pas_ignore": ["neck", "detect", "cv"], "pas_searching_ratio": [0.1, 0.2, 0.3]}, "task": {"specific_scenarios": "BasicScaling", "pretrained_model_path": "/root/Projects/object-detection-yolov6/yolov6_xgen/yolov6_config/xgen.pt", "state": {"stage": 1, "cycles": 0}, "max_searching": 10, "args_2": {"cycles": 10}, "args_1": {"cycles": 1}}, "user_requirements": {"power": null, "accuracy": 0.32, 
"accuracy_reverse_yn": 0, "model_size": null, "memory_size": null, "latency": 40.0, "margin": 4.0, "primary_type": "latency", "primary_range": "<", "secondary_type": "accuracy", "secondary_range": ">", "searching_variable": "scaling_factor", "searching_range": [0.3, 1], "searching_step_size": 0.05, "searching_pas_variable": "pas", "express_path": true, "target_type": "latency", "searching_granularity": null, "using_default_dataset": true, "user_model": "YOLOv6", "using_express_path": true, "express_mode": 0, "use_distillation": false, "use_default_distillation_model": true, "is_training": true}, "train": {"common_save_best_yn": 1, "trained_yn": true, "larger_better": true, "uuid": "19051967-0d0f-42"}, "compiler": {"input_shape": "(1,3,640,640)", "opset_version": 11, "devices": [], "ios_devices": []}, "distillation": {"distillation_method": "classic_distillation", "enable_ddp": false, "enable_dp": false, "input_shape": null, "original_loss_weights": 0.1, "tag_loss_weights": 0.9, "tag_loss": "kl", "tag_temperature": 4, "tag_loss_combination_method": "avg", "feature_loss_weights": 0.9, "feature_default_temperature": 1, "advance_feature_mapping": {}, "regularization_loss_weights": 1, "regularization_loss_types": [["tag_discriminator", 1]], "discriminator_lr": 0.0001}}'], ['{"origin": {"common_train_epochs": 0, "root_path": "./Xgen/", "pretrain_model_weights_path": "/root/Projects/.checkpoints/yolov6/yolov6n85_xgen.pt", "train_data_path": "/data/object-detection-yolov6/coco", "train_label_path": null, "eval_data_path": "/data/object-detection-yolov6/coco", "eval_label_path": null, "learning_rate": 0.01, "batch_size": 16, "hyp": "./data/hyps/hyp.scratch-high.yaml", "data": "/root/Projects/object-detection-yolov6/yolov6_xgen/data/coco.yaml", "conf_file": "./configs/yolov6n_85.py", "weights": null, "device": null, "imgsz": 640, "width_multiple": 0.5, "depth_multiple": 0.33, "scaling_factor": 1, "workers": 16, "noplots": true, "num_classes": 80, "img_size": 640}, "general": 
{"user_id": "test", "work_place": "/root/output/YOLOv6_CoCo2017/20231025132603_040_32_sim", "random_seed": 3407, "enable_ddp": false, "CUDA_VISIBLE_DEVICES": "0", "tran_scripts_path": null}, "prune": {"sp_store_weights": null, "sp_lars": false, "sp_lars_trust_coef": 0.001, "sp_backbone": false, "sp_retrain": false, "sp_admm": false, "sp_admm_multi": false, "sp_retrain_multi": false, "sp_config_file": null, "sp_subset_progressive": false, "sp_admm_fixed_params": false, "sp_no_harden": false, "nv_sparse": false, "sp_load_prune_params": null, "sp_store_prune_params": null, "generate_rand_seq_gap_yaml": false, "sp_admm_update_epoch": 5, "sp_admm_update_batch": null, "sp_admm_rho": 0.001, "sparsity_type": "block_punched", "sp_admm_lr": 0.01, "admm_debug": false, "sp_global_weight_sparsity": false, "sp_prune_threshold": -1.0, "sp_block_irregular_sparsity": "(0,0)", "sp_block_permute_multiplier": 2, "sp_admm_block": "(8,4)", "sp_admm_buckets_num": 16, "sp_admm_elem_per_row": 1, "sp_admm_tile": null, "sp_admm_select_number": 4, "sp_admm_pattern_row_sub": 1, "sp_admm_pattern_col_sub": 4, "sp_admm_data_format": null, "sp_admm_do_not_permute_conv": false, "sp_gs_output_v": null, "sp_gs_output_ptr": null, "sp_load_frozen_weights": null, "retrain_mask_pattern": "weight", "sp_update_init_method": "weight", "sp_mask_update_freq": 10, "retrain_mask_sparsity": -1.0, "retrain_mask_seed": null, "sp_prune_before_retrain": false, "output_compressed_format": false, "sp_grad_update": false, "sp_grad_decay": 0.98, "sp_grad_restore_threshold": -1, "sp_global_magnitude": false, "sp_pre_defined_mask_dir": null, "sp_prune_ratios": 0}, "quantization": {"qt_aimet": false, "qat": true, "fold_layers": true, "cross_layer_equalization": false, "bias_correction": true, "rounding_mode": "nearest", "num_quant_samples": 1000, "num_bias_correct_samples": 1000, "weight_bw": 8, "act_bw": 8, "quant_scheme": "tf_enhanced", "layers_to_ignore": [], "auto_add_bias": true, "perform_only_empirical_bias_corr": 
true}, "pas": {"pas_ratio": 0, "pas": false, "limit_loss_weights": 5.0, "use_limit_loss": false, "pas_debug": false, "pas_rebuild": false, "pas_finetune_epoch": 200, "pas_pretrained_weight_path": null, "pas_ignore": ["neck", "detect", "cv"], "pas_searching_ratio": [0.1, 0.2, 0.3]}, "task": {"specific_scenarios": "BasicScaling", "pretrained_model_path": "/root/Projects/object-detection-yolov6/yolov6_xgen/yolov6_config/xgen.pt", "state": {"stage": 1, "cycles": 0}, "max_searching": 10, "args_2": {"cycles": 10}, "args_1": {"cycles": 1}}, "user_requirements": {"power": null, "accuracy": 0.32, "accuracy_reverse_yn": 0, "model_size": null, "memory_size": null, "latency": 40.0, "margin": 4.0, "primary_type": "latency", "primary_range": "<", "secondary_type": "accuracy", "secondary_range": ">", "searching_variable": "scaling_factor", "searching_range": [0.3, 1], "searching_step_size": 0.05, "searching_pas_variable": "pas", "express_path": true, "target_type": "latency", "searching_granularity": null, "using_default_dataset": true, "user_model": "YOLOv6", "using_express_path": true, "express_mode": 0, "use_distillation": false, "use_default_distillation_model": true, "is_training": true}, "train": {"common_save_best_yn": 1, "trained_yn": true, "larger_better": true, "uuid": "e3383f89-a255-40"}, "compiler": {"input_shape": "(1,3,640,640)", "opset_version": 11, "devices": [], "ios_devices": []}, "distillation": {"distillation_method": "classic_distillation", "enable_ddp": false, "enable_dp": false, "input_shape": null, "original_loss_weights": 0.1, "tag_loss_weights": 0.9, "tag_loss": "kl", "tag_temperature": 4, "tag_loss_combination_method": "avg", "feature_loss_weights": 0.9, "feature_default_temperature": 1, "advance_feature_mapping": {}, "regularization_loss_weights": 1, "regularization_loss_types": [["tag_discriminator", 1]], "discriminator_lr": 0.0001}}']]
processing job 1/3
2023-10-25T13:26:26.113081+0000 - INFO - train_module.model_train_main:155 - MKL_THREADING_LAYER=GNU CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 python train_script_main.py
2023-10-25T13:26:26.113302+0000 - DEBUG - train_module.model_train_main:156 - dp mode
2023-10-25T13:26:26.113408+0000 - DEBUG - sys.run_cmd_with_logger:25 - Running command: MKL_THREADING_LAYER=GNU CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 python train_script_main.py
2023-10-25T13:26:30.053944+0000 - INFO - sys.run_cmd_with_logger:32 - training args are: Namespace(batch_size=16, bs_per_gpu=16, calib=False, check_images=False, check_labels=False, common_train_epochs=0, conf_file='./configs/yolov6n.py', config=None, data='/root/Projects/object-detection-yolov6/yolov6_xgen/data/coco.yaml', data_path='./data/coco.yaml', depth_multiple=0.33, device=None, dist_url='env://', distill=False, distill_feat=False, epochs=0, eval_data_path='/data/object-detection-yolov6/coco', eval_final_only=False, eval_interval=20, eval_label_path=None, gpu_count=0, heavy_eval_range=50, hyp='./data/hyps/hyp.scratch-high.yaml', img_size=640, imgsz=640, learning_rate=0.01, local_rank=-1, name='exp', noplots=True, num_classes=80, output_dir='./runs/train', pretrain_model_weights_path='/root/Projects/.checkpoints/yolov6/yolov6n_xgen2.pt', quant=False, rank=-1, resume=False, root_path='./Xgen/', save_ckpt_on_last_n_epoch=-1, save_dir='runs/train/exp', scaling_factor=1, stop_aug_last_n_epoch=15, teacher_model_path=None, temperature=20, train_data_path='/data/object-detection-yolov6/coco', train_label_path=None, weights=None, width_multiple=0.5, workers=16, world_size=1, write_trainbatch_tb=False)
2023-10-25T13:26:30.054645+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-25T13:26:45.438005+0000 - INFO - sys.run_cmd_with_logger:32 - Train: Final numbers of valid images: 118287/ labels: 118287.
2023-10-25T13:26:45.441744+0000 - INFO - sys.run_cmd_with_logger:32 - 15.4s for dataset initialization.
2023-10-25T13:26:45.462053+0000 - INFO - sys.run_cmd_with_logger:32 - args_ai before_tranier===>
2023-10-25T13:26:49.569788+0000 - INFO - sys.run_cmd_with_logger:32 - Val: Final numbers of valid images: 5000/ labels: 5000.
2023-10-25T13:26:49.578563+0000 - INFO - sys.run_cmd_with_logger:32 - 4.0s for dataset initialization.

...

2023-10-25T13:26:53.883393+0000 - INFO - sys.run_cmd_with_logger:32 - model is not a DataParallel model
2023-10-25T13:26:54.491226+0000 - INFO - sys.run_cmd_with_logger:32 - Load model failed,tensors used as indices must be long, byte or bool tensors
2023-10-25T13:26:54.491741+0000 - INFO - sys.run_cmd_with_logger:32 - Load model 1_1 failed,Error(s) in loading state_dict for Model:
2023-10-25T13:26:54.492148+0000 - INFO - sys.run_cmd_with_logger:32 - 	Unexpected key(s) in state_dict: 

....

2023-10-25T13:26:55.494718+0000 - INFO - sys.run_cmd_with_logger:32 - During handling of the above exception, another exception occurred:
2023-10-25T13:26:55.494916+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-25T13:26:55.495138+0000 - INFO - sys.run_cmd_with_logger:32 - Traceback (most recent call last):
2023-10-25T13:26:55.495299+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_tools-1.0.9-py3.7.egg/xgen_tools/xgen_load.py", line 743, in xgen_load
2023-10-25T13:26:55.495407+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1379, in load_state_dict
2023-10-25T13:26:55.495522+0000 - INFO - sys.run_cmd_with_logger:32 -     state_dict = state_dict.copy()
2023-10-25T13:26:55.495696+0000 - INFO - sys.run_cmd_with_logger:32 - AttributeError: 'Tensor' object has no attribute 'copy'
2023-10-25T13:26:55.495867+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-25T13:26:55.496168+0000 - INFO - sys.run_cmd_with_logger:32 - During handling of the above exception, another exception occurred:
2023-10-25T13:26:55.496287+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-25T13:26:55.496608+0000 - INFO - sys.run_cmd_with_logger:32 - Traceback (most recent call last):
2023-10-25T13:26:55.496730+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_tools-1.0.9-py3.7.egg/xgen_tools/xgen_load.py", line 768, in xgen_load
2023-10-25T13:26:55.497046+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_tools-1.0.9-py3.7.egg/xgen_tools/xgen_load.py", line 768, in <dictcomp>
2023-10-25T13:26:55.497160+0000 - INFO - sys.run_cmd_with_logger:32 - IndexError: tensors used as indices must be long, byte or bool tensors
2023-10-25T13:26:55.497476+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-25T13:26:55.497588+0000 - INFO - sys.run_cmd_with_logger:32 - During handling of the above exception, another exception occurred:
2023-10-25T13:26:55.497835+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-25T13:26:55.497947+0000 - INFO - sys.run_cmd_with_logger:32 - Traceback (most recent call last):
2023-10-25T13:26:55.498282+0000 - INFO - sys.run_cmd_with_logger:32 -   File "train_script_main.py", line 173, in <module>
2023-10-25T13:26:55.498578+0000 - INFO - sys.run_cmd_with_logger:32 -     training_main(args_ai=args_ai)
2023-10-25T13:26:55.498987+0000 - INFO - sys.run_cmd_with_logger:32 -   File "train_script_main.py", line 168, in training_main
2023-10-25T13:26:55.499271+0000 - INFO - sys.run_cmd_with_logger:32 -     main(args_ai)
2023-10-25T13:26:55.499395+0000 - INFO - sys.run_cmd_with_logger:32 -   File "train_script_main.py", line 154, in main
2023-10-25T13:26:55.499506+0000 - INFO - sys.run_cmd_with_logger:32 -     trainer = Trainer(opt, cfg, device, args_ai)
2023-10-25T13:26:55.499758+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/root/output/YOLOv6_CoCo2017/20231025132603_040_32_sim/yolov6_xgen/yolov6/core/engine.py", line 92, in __init__
2023-10-25T13:26:55.500076+0000 - INFO - sys.run_cmd_with_logger:32 -     xgen_load(self.model, args_ai=args_ai)
2023-10-25T13:26:55.500374+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_tools-1.0.9-py3.7.egg/xgen_tools/helper.py", line 94, in __call__
2023-10-25T13:26:55.500639+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_tools-1.0.9-py3.7.egg/xgen_tools/xgen_load.py", line 834, in xgen_load
2023-10-25T13:26:55.500919+0000 - INFO - sys.run_cmd_with_logger:32 - Exception: can't load pretrained weights, please double check the path or weights formate!
Traceback (most recent call last):
  File "xgen_scripts.py", line 107, in main
    training(training_main, training_script_path=training_script_path, log_path=log_path)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/training/core.py", line 449, in training
    internal_data = train_module(job, training_main)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/training/train_module.py", line 184, in train_module
    args_ai = model_train_main(job, training_main)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/training/train_module.py", line 163, in model_train_main
    raise Exception('Training failed')
Exception: Training failed
2023-10-25T13:26:57.305850+0000 - ERROR - xgen_scripts.main:116 - Error found. Please check log file at /root/output/YOLOv6_CoCo2017/20231025132603_040_32_sim/xgen-training.log
2023-10-25T13:26:57.306265+0000 - ERROR - xgen_scripts.main:117 - Cancel started session.
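
The nested traceback in the log above is easier to follow once the logger prefix is removed. Below is a minimal sketch, assuming every wrapped line carries the `timestamp - LEVEL - module:line - ` prefix seen in this log. `strip_log_prefix` and `extract_exceptions` are hypothetical helper names, not part of XGen.

```python
import re

# Matches the prefix seen in the log above, e.g.
# "2023-10-25T13:26:55.497160+0000 - INFO - sys.run_cmd_with_logger:32 - "
LOG_PREFIX = re.compile(
    r"^\d{4}-\d{2}-\d{2}T[\d:.]+\+\d{4} - [A-Z]+ - [\w.]+:\d+ - ?"
)


def strip_log_prefix(line):
    """Remove the logger prefix; lines without a prefix are returned unchanged."""
    return LOG_PREFIX.sub("", line, count=1)


def extract_exceptions(lines):
    """Return the stripped lines that look like 'SomeError: message'."""
    stripped = (strip_log_prefix(line.rstrip("\n")) for line in lines)
    return [s for s in stripped if re.match(r"^\w*(Error|Exception): ", s)]
```

Run over the log above, this should surface the `IndexError` raised inside `xgen_load` first, followed by the two wrapper exceptions.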

Best,
Hsin-Hsaun

XGen does not check for some invalid inputs before executing

Issues

XGen does not validate some of the inputs before executing.

Environment

  1. Do not connect phones to the host.

Reproduce Steps

  1. Type in the configurations shown in the following snapshots.
    [snapshot: invalid_input]
    [snapshot: invalid_input2]

Results

XGen still proceeds with execution.

Expectations

  1. XGen should reject negative values for the desired latency.
  2. In the 2nd snapshot, "10m" is accepted by XGen, but the unit "m" is not shown in the examples.
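
The kind of check expected here could look like the sketch below. It is only an illustration: `parse_requirement` and its `allowed_units` parameter are hypothetical names, and which units XGen actually accepts is an assumption the caller has to supply.

```python
import re

# A signed decimal number followed by an optional alphabetic unit, e.g. "75" or "10m".
_VALUE_RE = re.compile(r"^\s*([+-]?\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*$")


def parse_requirement(text, allowed_units=("",)):
    """Split text into (value, unit); reject negatives and unknown units."""
    match = _VALUE_RE.match(text)
    if match is None:
        raise ValueError(f"not a number with an optional unit: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    if value < 0:
        raise ValueError(f"value must not be negative: {text!r}")
    if unit not in allowed_units:
        raise ValueError(f"unknown unit {unit!r}; expected one of {allowed_units}")
    return value, unit
```

With this sketch, `parse_requirement("-75")` raises a `ValueError` before anything runs, and `parse_requirement("10m", allowed_units=("", "m"))` is accepted only because the unit is listed explicitly, which is also the list the on-screen examples could be generated from.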

Errors when testing ONNX models

I found some ONNX models on GitHub. Here is the link: https://github.com/onnx/models
It is an ONNX model zoo with many ONNX models. However, when I use XGen to test these models with the command:

XGen test-onnx-latency  --model-path /root/Projects/onnx/    --output-path /root/Projects/onnx_results/

I always get the same error. The results are attached.
result_github_onnx.csv
All the models were collected from the GitHub link above and keep their original model names.

Amazon AWS also provides some ONNX models.
I downloaded two ONNX models from Amazon AWS with the following commands:

wget https://s3.amazonaws.com/onnx-model-zoo/vgg/vgg16/vgg16.onnx
wget https://s3.amazonaws.com/onnx-model-zoo/resnet/resnet50v1/resnet50v1.onnx

Testing their latency with XGen still produces errors when I use the following command:

XGen test-onnx-latency  --model-path /root/Projects/onnx/    --output-path /root/Projects/onnx_results/

I put only the downloaded AWS ONNX models in the model-path directory and deleted the other models. The results are attached.
result_aws_resnet_onnx.csv
result_aws_vgg_onnx.csv
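
To compare the failures across the attached result files, a short script can list which models report an error. This is a minimal sketch, assuming the attached CSVs have the `error` and `file_name` columns that appear in the result header quoted in an earlier issue on this page. `summarize_errors` is a hypothetical helper, not an XGen command.

```python
import csv


def summarize_errors(csv_path):
    """Return (file_name, error) pairs for rows whose 'error' column is non-empty."""
    failures = []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            error = (row.get("error") or "").strip()
            if error:
                failures.append((row.get("file_name", ""), error))
    return failures


if __name__ == "__main__":
    # File name taken from the attachment list above.
    for name, error in summarize_errors("result_github_onnx.csv"):
        print(f"{name}: {error}")
```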

Error when training YOLOv8 with the built-in configuration

Dear author,

I keep getting this error after training YOLOv8 once.

Here is the xgen_train.log:

2023-10-24T01:08:24.499131+0000 - DEBUG - xgen_scripts.main:93 - XGen is running in /root/output/YOLOv8_CoCo2017/20231024010820, xgen_logger_setup in xgen_scripts.py
2023-10-24T01:08:25.196550+0000 - INFO - core.training:374 - Your current workplace is /root/output/YOLOv8_CoCo2017/20231024010820
2023-10-24T01:08:25.197096+0000 - INFO - core.training:390 - A new search is started!
2023-10-24T01:08:25.198375+0000 - DEBUG - task_gen.get_default_scenarios_plan:76 - default_plan: {'1': '1', '2': '2', '3': '1'}
xgen-config-path:  /root/output/YOLOv8_CoCo2017/20231024010820/xgen_config.json
xgen-workplace:  /root/output/YOLOv8_CoCo2017/20231024010820
xgen-resume:  False
xgen-mode:  scaling
xgen-pretrained-model-path:  /root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/xgen.pt
detail args:  
{
    'origin': {
        'common_train_epochs': 10,
        'root_path': './Xgen/',
        'pretrain_model_weights_path': None,
        'train_data_path': '/data/object-detection-yolov6/coco',
        'train_label_path': None,
        'eval_data_path': '/data/object-detection-yolov6/coco',
        'eval_label_path': None,
        'learning_rate': 0.01,
        'batch_size': 16,
        'data': '/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/coco.yaml',
        'conf_file': 'yolov8m.yaml',
        'weights': None,
        'device': None,
        'imgsz': 640,
        'width_multiple': 0.5,
        'depth_multiple': 0.33,
        'scaling_factor': 1,
        'workers': 16,
        'noplots': True,
        'num_classes': 80,
        'device_num': 1
    },
    'general': {'user_id': 'test', 'work_place': '/root/output/YOLOv8_CoCo2017/20231024010820', 'random_seed': 3407, 'enable_ddp': False, 'CUDA_VISIBLE_DEVICES': '0', 'tran_scripts_path': None},
    'prune': {
        'sp_store_weights': None,
        'sp_lars': False,
        'sp_lars_trust_coef': 0.001,
        'sp_backbone': False,
        'sp_retrain': False,
        'sp_admm': False,
        'sp_admm_multi': False,
        'sp_retrain_multi': False,
        'sp_config_file': None,
        'sp_subset_progressive': False,
        'sp_admm_fixed_params': False,
        'sp_no_harden': False,
        'nv_sparse': False,
        'sp_load_prune_params': None,
        'sp_store_prune_params': None,
        'generate_rand_seq_gap_yaml': False,
        'sp_admm_update_epoch': 5,
        'sp_admm_update_batch': None,
        'sp_admm_rho': 0.001,
        'sparsity_type': 'block_punched',
        'sp_admm_lr': 0.01,
        'admm_debug': False,
        'sp_global_weight_sparsity': False,
        'sp_prune_threshold': -1.0,
        'sp_block_irregular_sparsity': '(0,0)',
        'sp_block_permute_multiplier': 2,
        'sp_admm_block': '(8,4)',
        'sp_admm_buckets_num': 16,
        'sp_admm_elem_per_row': 1,
        'sp_admm_tile': None,
        'sp_admm_select_number': 4,
        'sp_admm_pattern_row_sub': 1,
        'sp_admm_pattern_col_sub': 4,
        'sp_admm_data_format': None,
        'sp_admm_do_not_permute_conv': False,
        'sp_gs_output_v': None,
        'sp_gs_output_ptr': None,
        'sp_load_frozen_weights': None,
        'retrain_mask_pattern': 'weight',
        'sp_update_init_method': 'weight',
        'sp_mask_update_freq': 10,
        'retrain_mask_sparsity': -1.0,
        'retrain_mask_seed': None,
        'sp_prune_before_retrain': False,
        'output_compressed_format': False,
        'sp_grad_update': False,
        'sp_grad_decay': 0.98,
        'sp_grad_restore_threshold': -1,
        'sp_global_magnitude': False,
        'sp_pre_defined_mask_dir': None,
        'sp_prune_ratios': 0
    },
    'quantization': {
        'qt_aimet': False,
        'qat': True,
        'fold_layers': True,
        'cross_layer_equalization': False,
        'bias_correction': True,
        'rounding_mode': 'nearest',
        'num_quant_samples': 1000,
        'num_bias_correct_samples': 1000,
        'weight_bw': 8,
        'act_bw': 8,
        'quant_scheme': 'tf_enhanced',
        'layers_to_ignore': [],
        'auto_add_bias': True,
        'perform_only_empirical_bias_corr': True
    },
    'pas': {'pas_ratio': 0, 'pas': False, 'limit_loss_weights': 5.0, 'use_limit_loss': False, 'pas_debug': False, 'pas_rebuild': False, 'pas_finetune_epoch': 200, 'pas_pretrained_weight_path': None, 'pas_ignore': ['neck', 'detect', 'cv'], 'pas_searching_ratio': [0.1, 0.2, 0.3]},
    'task': {'specific_scenarios': 'BasicScaling', 'pretrained_model_path': '/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/xgen.pt', 'state': {'stage': 0, 'cycles': 0}, 'max_searching': 10, 'args_2': {'cycles': 10}},
    'user_requirements': {
        'power': None,
        'accuracy': 0.35,
        'accuracy_reverse_yn': 0,
        'model_size': None,
        'memory_size': None,
        'latency': 75.0,
        'margin': 7.5,
        'primary_type': 'latency',
        'primary_range': '<',
        'secondary_type': 'accuracy',
        'secondary_range': '>',
        'searching_variable': 'scaling_factor',
        'searching_range': [0.2, 1],
        'searching_step_size': 0.05,
        'searching_pas_variable': 'pas',
        'express_path': True,
        'target_type': 'latency',
        'searching_granularity': None,
        'using_default_dataset': True,
        'user_model': 'YOLOv8',
        'using_express_path': True,
        'express_mode': 0,
        'use_distillation': False,
        'use_default_distillation_model': True,
        'is_training': True
    },
    'train': {'common_save_best_yn': 1, 'trained_yn': False, 'larger_better': True},
    'compiler': {
        'input_shape': '(1,3,640,640)',
        'opset_version': 11,
        'devices': [
            {
                'task_queue_size': 0,
                'device': {
                    'uuid': 'R5CRC1NFW2E',
                    'device_type': 'android',
                    'connection_status': 'available',
                    'task_status': 'idle',
                    'info': {'uuid': 'R5CRC1NFW2E', 'cpu': 'SM8350', 'gpu': 'Qualcomm, Adreno (TM) 660', 'memory': '5.24 GB', 'battery': '100', 'brand': 'samsung', 'model': 'SM-G990U1', 'os_type': 'android'}
                },
                'agent_id': 'agent-localhost'
            }
        ],
        'ios_devices': []
    },
    'distillation': {
        'distillation_method': 'classic_distillation',
        'enable_ddp': False,
        'enable_dp': False,
        'input_shape': None,
        'original_loss_weights': 0.1,
        'tag_loss_weights': 0.9,
        'tag_loss': 'kl',
        'tag_temperature': 4,
        'tag_loss_combination_method': 'avg',
        'feature_loss_weights': 0.9,
        'feature_default_temperature': 1,
        'advance_feature_mapping': {},
        'regularization_loss_weights': 1,
        'regularization_loss_types': [['tag_discriminator', 1]],
        'discriminator_lr': 0.0001
    }
}
Current search total stages:  3
           Current search stages info           
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Stage       ┃ Max search cycles              ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1           │ 1                              │
│ 2           │ 10                             │
│ 3           │ 1                              │
└─────────────┴────────────────────────────────┘
2023-10-24T01:08:25.236168+0000 - INFO - core.training:427 - Current Session ID: session-xgen-32a44e70
2023-10-24T01:08:25.246143+0000 - DEBUG - task_gen.task_gen:48 - job_list: [['{"origin": {"common_train_epochs": 0, "root_path": "./Xgen/", "pretrain_model_weights_path": "/root/Projects/.checkpoints/yolov8/yolov8s_xgen.pt", "train_data_path": "/data/object-detection-yolov6/coco", "train_label_path": null, "eval_data_path": "/data/object-detection-yolov6/coco", "eval_label_path": null, "learning_rate": 0.01, "batch_size": 16, "data": "/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/coco.yaml", "conf_file": "yolov8s.yaml", "weights": null, "device": null, "imgsz": 640, "width_multiple": 0.5, "depth_multiple": 0.33, "scaling_factor": 1, "workers": 16, "noplots": true, "num_classes": 80, "device_num": 1}, "general": {"user_id": "test", "work_place": "/root/output/YOLOv8_CoCo2017/20231024010820", "random_seed": 3407, "enable_ddp": false, "CUDA_VISIBLE_DEVICES": "0", "tran_scripts_path": null}, "prune": {"sp_store_weights": null, "sp_lars": false, "sp_lars_trust_coef": 0.001, "sp_backbone": false, "sp_retrain": false, "sp_admm": false, "sp_admm_multi": false, "sp_retrain_multi": false, "sp_config_file": null, "sp_subset_progressive": false, "sp_admm_fixed_params": false, "sp_no_harden": false, "nv_sparse": false, "sp_load_prune_params": null, "sp_store_prune_params": null, "generate_rand_seq_gap_yaml": false, "sp_admm_update_epoch": 5, "sp_admm_update_batch": null, "sp_admm_rho": 0.001, "sparsity_type": "block_punched", "sp_admm_lr": 0.01, "admm_debug": false, "sp_global_weight_sparsity": false, "sp_prune_threshold": -1.0, "sp_block_irregular_sparsity": "(0,0)", "sp_block_permute_multiplier": 2, "sp_admm_block": "(8,4)", "sp_admm_buckets_num": 16, "sp_admm_elem_per_row": 1, "sp_admm_tile": null, "sp_admm_select_number": 4, "sp_admm_pattern_row_sub": 1, "sp_admm_pattern_col_sub": 4, "sp_admm_data_format": null, "sp_admm_do_not_permute_conv": false, "sp_gs_output_v": null, "sp_gs_output_ptr": null, "sp_load_frozen_weights": null, "retrain_mask_pattern": 
"weight", "sp_update_init_method": "weight", "sp_mask_update_freq": 10, "retrain_mask_sparsity": -1.0, "retrain_mask_seed": null, "sp_prune_before_retrain": false, "output_compressed_format": false, "sp_grad_update": false, "sp_grad_decay": 0.98, "sp_grad_restore_threshold": -1, "sp_global_magnitude": false, "sp_pre_defined_mask_dir": null, "sp_prune_ratios": 0}, "quantization": {"qt_aimet": false, "qat": true, "fold_layers": true, "cross_layer_equalization": false, "bias_correction": true, "rounding_mode": "nearest", "num_quant_samples": 1000, "num_bias_correct_samples": 1000, "weight_bw": 8, "act_bw": 8, "quant_scheme": "tf_enhanced", "layers_to_ignore": [], "auto_add_bias": true, "perform_only_empirical_bias_corr": true}, "pas": {"pas_ratio": 0, "pas": false, "limit_loss_weights": 5.0, "use_limit_loss": false, "pas_debug": false, "pas_rebuild": false, "pas_finetune_epoch": 200, "pas_pretrained_weight_path": null, "pas_ignore": ["neck", "detect", "cv"], "pas_searching_ratio": [0.1, 0.2, 0.3]}, "task": {"specific_scenarios": "BasicScaling", "pretrained_model_path": "/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/xgen.pt", "state": {"stage": 1, "cycles": 0}, "max_searching": 10, "args_2": {"cycles": 10}, "args_1": {"cycles": 1}}, "user_requirements": {"power": null, "accuracy": 0.35, "accuracy_reverse_yn": 0, "model_size": null, "memory_size": null, "latency": 75.0, "margin": 7.5, "primary_type": "latency", "primary_range": "<", "secondary_type": "accuracy", "secondary_range": ">", "searching_variable": "scaling_factor", "searching_range": [0.2, 1], "searching_step_size": 0.05, "searching_pas_variable": "pas", "express_path": true, "target_type": "latency", "searching_granularity": null, "using_default_dataset": true, "user_model": "YOLOv8", "using_express_path": true, "express_mode": 0, "use_distillation": false, "use_default_distillation_model": true, "is_training": true}, "train": {"common_save_best_yn": 1, "trained_yn": true, "larger_better": 
true, "uuid": "9ae04efe-469c-42"}, "compiler": {"input_shape": "(1,3,640,640)", "opset_version": 11, "devices": [{"task_queue_size": 0, "device": {"uuid": "R5CRC1NFW2E", "device_type": "android", "connection_status": "available", "task_status": "idle", "info": {"uuid": "R5CRC1NFW2E", "cpu": "SM8350", "gpu": "Qualcomm, Adreno (TM) 660", "memory": "5.24 GB", "battery": "100", "brand": "samsung", "model": "SM-G990U1", "os_type": "android"}}, "agent_id": "agent-localhost"}], "ios_devices": []}, "distillation": {"distillation_method": "classic_distillation", "enable_ddp": false, "enable_dp": false, "input_shape": null, "original_loss_weights": 0.1, "tag_loss_weights": 0.9, "tag_loss": "kl", "tag_temperature": 4, "tag_loss_combination_method": "avg", "feature_loss_weights": 0.9, "feature_default_temperature": 1, "advance_feature_mapping": {}, "regularization_loss_weights": 1, "regularization_loss_types": [["tag_discriminator", 1]], "discriminator_lr": 0.0001}}'], ['{"origin": {"common_train_epochs": 0, "root_path": "./Xgen/", "pretrain_model_weights_path": "/root/Projects/.checkpoints/yolov8/yolov8n_xgen.pt", "train_data_path": "/data/object-detection-yolov6/coco", "train_label_path": null, "eval_data_path": "/data/object-detection-yolov6/coco", "eval_label_path": null, "learning_rate": 0.01, "batch_size": 16, "data": "/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/coco.yaml", "conf_file": "yolov8n.yaml", "weights": null, "device": null, "imgsz": 640, "width_multiple": 0.5, "depth_multiple": 0.33, "scaling_factor": 1, "workers": 16, "noplots": true, "num_classes": 80, "device_num": 1}, "general": {"user_id": "test", "work_place": "/root/output/YOLOv8_CoCo2017/20231024010820", "random_seed": 3407, "enable_ddp": false, "CUDA_VISIBLE_DEVICES": "0", "tran_scripts_path": null}, "prune": {"sp_store_weights": null, "sp_lars": false, "sp_lars_trust_coef": 0.001, "sp_backbone": false, "sp_retrain": false, "sp_admm": false, "sp_admm_multi": false, 
"sp_retrain_multi": false, "sp_config_file": null, "sp_subset_progressive": false, "sp_admm_fixed_params": false, "sp_no_harden": false, "nv_sparse": false, "sp_load_prune_params": null, "sp_store_prune_params": null, "generate_rand_seq_gap_yaml": false, "sp_admm_update_epoch": 5, "sp_admm_update_batch": null, "sp_admm_rho": 0.001, "sparsity_type": "block_punched", "sp_admm_lr": 0.01, "admm_debug": false, "sp_global_weight_sparsity": false, "sp_prune_threshold": -1.0, "sp_block_irregular_sparsity": "(0,0)", "sp_block_permute_multiplier": 2, "sp_admm_block": "(8,4)", "sp_admm_buckets_num": 16, "sp_admm_elem_per_row": 1, "sp_admm_tile": null, "sp_admm_select_number": 4, "sp_admm_pattern_row_sub": 1, "sp_admm_pattern_col_sub": 4, "sp_admm_data_format": null, "sp_admm_do_not_permute_conv": false, "sp_gs_output_v": null, "sp_gs_output_ptr": null, "sp_load_frozen_weights": null, "retrain_mask_pattern": "weight", "sp_update_init_method": "weight", "sp_mask_update_freq": 10, "retrain_mask_sparsity": -1.0, "retrain_mask_seed": null, "sp_prune_before_retrain": false, "output_compressed_format": false, "sp_grad_update": false, "sp_grad_decay": 0.98, "sp_grad_restore_threshold": -1, "sp_global_magnitude": false, "sp_pre_defined_mask_dir": null, "sp_prune_ratios": 0}, "quantization": {"qt_aimet": false, "qat": true, "fold_layers": true, "cross_layer_equalization": false, "bias_correction": true, "rounding_mode": "nearest", "num_quant_samples": 1000, "num_bias_correct_samples": 1000, "weight_bw": 8, "act_bw": 8, "quant_scheme": "tf_enhanced", "layers_to_ignore": [], "auto_add_bias": true, "perform_only_empirical_bias_corr": true}, "pas": {"pas_ratio": 0, "pas": false, "limit_loss_weights": 5.0, "use_limit_loss": false, "pas_debug": false, "pas_rebuild": false, "pas_finetune_epoch": 200, "pas_pretrained_weight_path": null, "pas_ignore": ["neck", "detect", "cv"], "pas_searching_ratio": [0.1, 0.2, 0.3]}, "task": {"specific_scenarios": "BasicScaling", "pretrained_model_path": 
"/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/xgen.pt", "state": {"stage": 1, "cycles": 0}, "max_searching": 10, "args_2": {"cycles": 10}, "args_1": {"cycles": 1}}, "user_requirements": {"power": null, "accuracy": 0.35, "accuracy_reverse_yn": 0, "model_size": null, "memory_size": null, "latency": 75.0, "margin": 7.5, "primary_type": "latency", "primary_range": "<", "secondary_type": "accuracy", "secondary_range": ">", "searching_variable": "scaling_factor", "searching_range": [0.2, 1], "searching_step_size": 0.05, "searching_pas_variable": "pas", "express_path": true, "target_type": "latency", "searching_granularity": null, "using_default_dataset": true, "user_model": "YOLOv8", "using_express_path": true, "express_mode": 0, "use_distillation": false, "use_default_distillation_model": true, "is_training": true}, "train": {"common_save_best_yn": 1, "trained_yn": true, "larger_better": true, "uuid": "00a9c858-6655-4e"}, "compiler": {"input_shape": "(1,3,640,640)", "opset_version": 11, "devices": [{"task_queue_size": 0, "device": {"uuid": "R5CRC1NFW2E", "device_type": "android", "connection_status": "available", "task_status": "idle", "info": {"uuid": "R5CRC1NFW2E", "cpu": "SM8350", "gpu": "Qualcomm, Adreno (TM) 660", "memory": "5.24 GB", "battery": "100", "brand": "samsung", "model": "SM-G990U1", "os_type": "android"}}, "agent_id": "agent-localhost"}], "ios_devices": []}, "distillation": {"distillation_method": "classic_distillation", "enable_ddp": false, "enable_dp": false, "input_shape": null, "original_loss_weights": 0.1, "tag_loss_weights": 0.9, "tag_loss": "kl", "tag_temperature": 4, "tag_loss_combination_method": "avg", "feature_loss_weights": 0.9, "feature_default_temperature": 1, "advance_feature_mapping": {}, "regularization_loss_weights": 1, "regularization_loss_types": [["tag_discriminator", 1]], "discriminator_lr": 0.0001}}'], ['{"origin": {"common_train_epochs": 0, "root_path": "./Xgen/", "pretrain_model_weights_path": 
"/root/Projects/.checkpoints/yolov8/yolov8m_xgen.pt", "train_data_path": "/data/object-detection-yolov6/coco", "train_label_path": null, "eval_data_path": "/data/object-detection-yolov6/coco", "eval_label_path": null, "learning_rate": 0.01, "batch_size": 16, "data": "/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/coco.yaml", "conf_file": "yolov8m.yaml", "weights": null, "device": null, "imgsz": 640, "width_multiple": 0.5, "depth_multiple": 0.33, "scaling_factor": 1, "workers": 16, "noplots": true, "num_classes": 80, "device_num": 1}, "general": {"user_id": "test", "work_place": "/root/output/YOLOv8_CoCo2017/20231024010820", "random_seed": 3407, "enable_ddp": false, "CUDA_VISIBLE_DEVICES": "0", "tran_scripts_path": null}, "prune": {"sp_store_weights": null, "sp_lars": false, "sp_lars_trust_coef": 0.001, "sp_backbone": false, "sp_retrain": false, "sp_admm": false, "sp_admm_multi": false, "sp_retrain_multi": false, "sp_config_file": null, "sp_subset_progressive": false, "sp_admm_fixed_params": false, "sp_no_harden": false, "nv_sparse": false, "sp_load_prune_params": null, "sp_store_prune_params": null, "generate_rand_seq_gap_yaml": false, "sp_admm_update_epoch": 5, "sp_admm_update_batch": null, "sp_admm_rho": 0.001, "sparsity_type": "block_punched", "sp_admm_lr": 0.01, "admm_debug": false, "sp_global_weight_sparsity": false, "sp_prune_threshold": -1.0, "sp_block_irregular_sparsity": "(0,0)", "sp_block_permute_multiplier": 2, "sp_admm_block": "(8,4)", "sp_admm_buckets_num": 16, "sp_admm_elem_per_row": 1, "sp_admm_tile": null, "sp_admm_select_number": 4, "sp_admm_pattern_row_sub": 1, "sp_admm_pattern_col_sub": 4, "sp_admm_data_format": null, "sp_admm_do_not_permute_conv": false, "sp_gs_output_v": null, "sp_gs_output_ptr": null, "sp_load_frozen_weights": null, "retrain_mask_pattern": "weight", "sp_update_init_method": "weight", "sp_mask_update_freq": 10, "retrain_mask_sparsity": -1.0, "retrain_mask_seed": null, "sp_prune_before_retrain": false, 
"output_compressed_format": false, "sp_grad_update": false, "sp_grad_decay": 0.98, "sp_grad_restore_threshold": -1, "sp_global_magnitude": false, "sp_pre_defined_mask_dir": null, "sp_prune_ratios": 0}, "quantization": {"qt_aimet": false, "qat": true, "fold_layers": true, "cross_layer_equalization": false, "bias_correction": true, "rounding_mode": "nearest", "num_quant_samples": 1000, "num_bias_correct_samples": 1000, "weight_bw": 8, "act_bw": 8, "quant_scheme": "tf_enhanced", "layers_to_ignore": [], "auto_add_bias": true, "perform_only_empirical_bias_corr": true}, "pas": {"pas_ratio": 0, "pas": false, "limit_loss_weights": 5.0, "use_limit_loss": false, "pas_debug": false, "pas_rebuild": false, "pas_finetune_epoch": 200, "pas_pretrained_weight_path": null, "pas_ignore": ["neck", "detect", "cv"], "pas_searching_ratio": [0.1, 0.2, 0.3]}, "task": {"specific_scenarios": "BasicScaling", "pretrained_model_path": "/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/xgen.pt", "state": {"stage": 1, "cycles": 0}, "max_searching": 10, "args_2": {"cycles": 10}, "args_1": {"cycles": 1}}, "user_requirements": {"power": null, "accuracy": 0.35, "accuracy_reverse_yn": 0, "model_size": null, "memory_size": null, "latency": 75.0, "margin": 7.5, "primary_type": "latency", "primary_range": "<", "secondary_type": "accuracy", "secondary_range": ">", "searching_variable": "scaling_factor", "searching_range": [0.2, 1], "searching_step_size": 0.05, "searching_pas_variable": "pas", "express_path": true, "target_type": "latency", "searching_granularity": null, "using_default_dataset": true, "user_model": "YOLOv8", "using_express_path": true, "express_mode": 0, "use_distillation": false, "use_default_distillation_model": true, "is_training": true}, "train": {"common_save_best_yn": 1, "trained_yn": true, "larger_better": true, "uuid": "96c7ab72-553a-4c"}, "compiler": {"input_shape": "(1,3,640,640)", "opset_version": 11, "devices": [{"task_queue_size": 0, "device": {"uuid": 
"R5CRC1NFW2E", "device_type": "android", "connection_status": "available", "task_status": "idle", "info": {"uuid": "R5CRC1NFW2E", "cpu": "SM8350", "gpu": "Qualcomm, Adreno (TM) 660", "memory": "5.24 GB", "battery": "100", "brand": "samsung", "model": "SM-G990U1", "os_type": "android"}}, "agent_id": "agent-localhost"}], "ios_devices": []}, "distillation": {"distillation_method": "classic_distillation", "enable_ddp": false, "enable_dp": false, "input_shape": null, "original_loss_weights": 0.1, "tag_loss_weights": 0.9, "tag_loss": "kl", "tag_temperature": 4, "tag_loss_combination_method": "avg", "feature_loss_weights": 0.9, "feature_default_temperature": 1, "advance_feature_mapping": {}, "regularization_loss_weights": 1, "regularization_loss_types": [["tag_discriminator", 1]], "discriminator_lr": 0.0001}}']]
Using express path to find a suitable model...
processing job 1/3
2023-10-24T01:08:25.249064+0000 - INFO - train_module.model_train_main:155 - MKL_THREADING_LAYER=GNU CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 python train_script_main.py
2023-10-24T01:08:25.249362+0000 - DEBUG - train_module.model_train_main:156 - dp mode
2023-10-24T01:08:25.249477+0000 - DEBUG - sys.run_cmd_with_logger:25 - Running command: MKL_THREADING_LAYER=GNU CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 python train_script_main.py
2023-10-24T01:08:29.192012+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-24T01:08:29.192599+0000 - INFO - sys.run_cmd_with_logger:32 -                    from  n    params  module                                       arguments
2023-10-24T01:08:29.212763+0000 - INFO - sys.run_cmd_with_logger:32 -   0                  -1  1       928  ultralytics.nn.modules.conv.Conv             [3, 32, 3, 2]
2023-10-24T01:08:29.213227+0000 - INFO - sys.run_cmd_with_logger:32 -   1                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]
2023-10-24T01:08:29.214555+0000 - INFO - sys.run_cmd_with_logger:32 -   2                  -1  1     29056  ultralytics.nn.modules.block.C2f             [64, 64, 1, True]
2023-10-24T01:08:29.215422+0000 - INFO - sys.run_cmd_with_logger:32 -   3                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]
2023-10-24T01:08:29.218726+0000 - INFO - sys.run_cmd_with_logger:32 -   4                  -1  2    197632  ultralytics.nn.modules.block.C2f             [128, 128, 2, True]
2023-10-24T01:08:29.221483+0000 - INFO - sys.run_cmd_with_logger:32 -   5                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128, 256, 3, 2]
2023-10-24T01:08:29.228793+0000 - INFO - sys.run_cmd_with_logger:32 -   6                  -1  2    788480  ultralytics.nn.modules.block.C2f             [256, 256, 2, True]
2023-10-24T01:08:29.237477+0000 - INFO - sys.run_cmd_with_logger:32 -   7                  -1  1   1180672  ultralytics.nn.modules.conv.Conv             [256, 512, 3, 2]
2023-10-24T01:08:29.251787+0000 - INFO - sys.run_cmd_with_logger:32 -   8                  -1  1   1838080  ultralytics.nn.modules.block.C2f             [512, 512, 1, True]
2023-10-24T01:08:29.258220+0000 - INFO - sys.run_cmd_with_logger:32 -   9                  -1  1    656896  ultralytics.nn.modules.block.SPPF            [512, 512, 5]
2023-10-24T01:08:29.258478+0000 - INFO - sys.run_cmd_with_logger:32 -  10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
2023-10-24T01:08:29.260018+0000 - INFO - sys.run_cmd_with_logger:32 -  11             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]
2023-10-24T01:08:29.264290+0000 - INFO - sys.run_cmd_with_logger:32 -  12                  -1  1    591360  ultralytics.nn.modules.block.C2f             [768, 256, 1]
2023-10-24T01:08:29.265057+0000 - INFO - sys.run_cmd_with_logger:32 -  13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
2023-10-24T01:08:29.265462+0000 - INFO - sys.run_cmd_with_logger:32 -  14             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]
2023-10-24T01:08:29.266906+0000 - INFO - sys.run_cmd_with_logger:32 -  15                  -1  1    148224  ultralytics.nn.modules.block.C2f             [384, 128, 1]
2023-10-24T01:08:29.268544+0000 - INFO - sys.run_cmd_with_logger:32 -  16                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]
2023-10-24T01:08:29.269048+0000 - INFO - sys.run_cmd_with_logger:32 -  17            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]
2023-10-24T01:08:29.273959+0000 - INFO - sys.run_cmd_with_logger:32 -  18                  -1  1    493056  ultralytics.nn.modules.block.C2f             [384, 256, 1]
2023-10-24T01:08:29.278476+0000 - INFO - sys.run_cmd_with_logger:32 -  19                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]
2023-10-24T01:08:29.279169+0000 - INFO - sys.run_cmd_with_logger:32 -  20             [-1, 9]  1         0  ultralytics.nn.modules.conv.Concat           [1]
2023-10-24T01:08:29.293942+0000 - INFO - sys.run_cmd_with_logger:32 -  21                  -1  1   1969152  ultralytics.nn.modules.block.C2f             [768, 512, 1]
2023-10-24T01:08:29.314642+0000 - INFO - sys.run_cmd_with_logger:32 -  22        [15, 18, 21]  1   2147008  ultralytics.nn.modules.head.Detect           [80, [128, 256, 512]]
2023-10-24T01:08:39.940493+0000 - INFO - sys.run_cmd_with_logger:32 - YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.7 GFLOPs
2023-10-24T01:08:39.941336+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-24T01:08:40.146653+0000 - INFO - sys.run_cmd_with_logger:32 - New https://pypi.org/project/ultralytics/8.0.200 available 😃 Update with 'pip install -U ultralytics'
2023-10-24T01:08:40.175484+0000 - INFO - sys.run_cmd_with_logger:32 - Ultralytics YOLOv8.0.172 🚀 Python-3.7.16 torch-1.9.1+cu111 CUDA:0 (NVIDIA TITAN V, 12064MiB)
2023-10-24T01:08:40.175983+0000 - INFO - sys.run_cmd_with_logger:32 - WARNING ⚠️ Upgrade to torch>=2.0.0 for deterministic training.
2023-10-24T01:08:40.561716+0000 - INFO - sys.run_cmd_with_logger:32 - engine/trainer: task=detect, mode=train, model=yolov8s.yaml, data=/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/coco.yaml, epochs=0, patience=0, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=0, workers=16, project=None, name=None, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, vid_stride=1, stream_buffer=False, line_width=None, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, boxes=True, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0, cfg=None, tracker=botsort.yaml, common_train_epochs=0, root_path=./Xgen/, pretrain_model_weights_path=/root/Projects/.checkpoints/yolov8/yolov8s_xgen.pt, train_data_path=/data/object-detection-yolov6/coco, train_label_path=None, eval_data_path=/data/object-detection-yolov6/coco, eval_label_path=None, learning_rate=0.01, batch_size=16, conf_file=yolov8s.yaml, weights=None, width_multiple=0.5, depth_multiple=0.33, scaling_factor=1, noplots=True, num_classes=80, device_num=1, 
args=Namespace(agnostic_nms=False, amp=True, augment=False, batch=16, batch_size=16, box=7.5, boxes=True, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, common_train_epochs=0, conf=None, conf_file='yolov8s.yaml', copy_paste=0.0, cos_lr=False, data='/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/coco.yaml', degrees=0.0, depth_multiple=0.33, deterministic=True, device=None, device_num=1, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, epochs=0, eval_data_path='/data/object-detection-yolov6/coco', eval_label_path=None, exist_ok=False, fliplr=0.5, flipud=0.0, format='torchscript', fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, label_smoothing=0.0, learning_rate=0.01, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode='train', model=None, momentum=0.937, mosaic=1.0, name=None, nbs=64, nms=False, noplots=True, num_classes=80, opset=None, optimize=False, optimizer='auto', overlap_mask=True, patience=0, perspective=0.0, plots=True, pose=12.0, pretrain_model_weights_path='/root/Projects/.checkpoints/yolov8/yolov8s_xgen.pt', pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, root_path='./Xgen/', save=True, save_conf=False, save_crop=False, save_hybrid=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, scaling_factor=1, seed=0, shear=0.0, show=False, show_conf=True, show_labels=True, simplify=False, single_cls=False, source=None, split='val', stream_buffer=False, task='detect', tracker='botsort.yaml', train_data_path='/data/object-detection-yolov6/coco', train_label_path=None, translate=0.1, val=True, verbose=True, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, weights=None, width_multiple=0.5, workers=16, workspace=4), args_ai={'origin': {'common_train_epochs': 0, 'root_path': './Xgen/', 
'pretrain_model_weights_path': '/root/Projects/.checkpoints/yolov8/yolov8s_xgen.pt', 'train_data_path': '/data/object-detection-yolov6/coco', 'train_label_path': None, 'eval_data_path': '/data/object-detection-yolov6/coco', 'eval_label_path': None, 'learning_rate': 0.01, 'batch_size': 16, 'data': '/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/coco.yaml', 'conf_file': 'yolov8s.yaml', 'weights': None, 'device': None, 'imgsz': 640, 'width_multiple': 0.5, 'depth_multiple': 0.33, 'scaling_factor': 1, 'workers': 16, 'noplots': True, 'num_classes': 80, 'device_num': 1}, 'general': {'user_id': 'test', 'work_place': '/root/output/YOLOv8_CoCo2017/20231024010820', 'random_seed': 3407, 'enable_ddp': False, 'CUDA_VISIBLE_DEVICES': '0', 'tran_scripts_path': None}, 'prune': {'sp_store_weights': None, 'sp_lars': False, 'sp_lars_trust_coef': 0.001, 'sp_backbone': False, 'sp_retrain': False, 'sp_admm': False, 'sp_admm_multi': False, 'sp_retrain_multi': False, 'sp_config_file': None, 'sp_subset_progressive': False, 'sp_admm_fixed_params': False, 'sp_no_harden': False, 'nv_sparse': False, 'sp_load_prune_params': None, 'sp_store_prune_params': None, 'generate_rand_seq_gap_yaml': False, 'sp_admm_update_epoch': 5, 'sp_admm_update_batch': None, 'sp_admm_rho': 0.001, 'sparsity_type': 'block_punched', 'sp_admm_lr': 0.01, 'admm_debug': False, 'sp_global_weight_sparsity': False, 'sp_prune_threshold': -1.0, 'sp_block_irregular_sparsity': '(0,0)', 'sp_block_permute_multiplier': 2, 'sp_admm_block': '(8,4)', 'sp_admm_buckets_num': 16, 'sp_admm_elem_per_row': 1, 'sp_admm_tile': None, 'sp_admm_select_number': 4, 'sp_admm_pattern_row_sub': 1, 'sp_admm_pattern_col_sub': 4, 'sp_admm_data_format': None, 'sp_admm_do_not_permute_conv': False, 'sp_gs_output_v': None, 'sp_gs_output_ptr': None, 'sp_load_frozen_weights': None, 'retrain_mask_pattern': 'weight', 'sp_update_init_method': 'weight', 'sp_mask_update_freq': 10, 'retrain_mask_sparsity': -1.0, 'retrain_mask_seed': None, 
'sp_prune_before_retrain': False, 'output_compressed_format': False, 'sp_grad_update': False, 'sp_grad_decay': 0.98, 'sp_grad_restore_threshold': -1, 'sp_global_magnitude': False, 'sp_pre_defined_mask_dir': None, 'sp_prune_ratios': 0}, 'quantization': {'qt_aimet': False, 'qat': True, 'fold_layers': True, 'cross_layer_equalization': False, 'bias_correction': True, 'rounding_mode': 'nearest', 'num_quant_samples': 1000, 'num_bias_correct_samples': 1000, 'weight_bw': 8, 'act_bw': 8, 'quant_scheme': 'tf_enhanced', 'layers_to_ignore': [], 'auto_add_bias': True, 'perform_only_empirical_bias_corr': True}, 'pas': {'pas_ratio': 0, 'pas': False, 'limit_loss_weights': 5.0, 'use_limit_loss': False, 'pas_debug': False, 'pas_rebuild': False, 'pas_finetune_epoch': 200, 'pas_pretrained_weight_path': None, 'pas_ignore': ['neck', 'detect', 'cv'], 'pas_searching_ratio': [0.1, 0.2, 0.3]}, 'task': {'specific_scenarios': 'BasicScaling', 'pretrained_model_path': '/root/Projects/object-detection-yolov8/yolov8_xgen/yolov8_config/xgen.pt', 'state': {'stage': 1, 'cycles': 0}, 'max_searching': 10, 'args_2': {'cycles': 10}, 'args_1': {'cycles': 1}}, 'user_requirements': {'power': None, 'accuracy': 0.35, 'accuracy_reverse_yn': 0, 'model_size': None, 'memory_size': None, 'latency': 75.0, 'margin': 7.5, 'primary_type': 'latency', 'primary_range': '<', 'secondary_type': 'accuracy', 'secondary_range': '>', 'searching_variable': 'scaling_factor', 'searching_range': [0.2, 1], 'searching_step_size': 0.05, 'searching_pas_variable': 'pas', 'express_path': True, 'target_type': 'latency', 'searching_granularity': None, 'using_default_dataset': True, 'user_model': 'YOLOv8', 'using_express_path': True, 'express_mode': 0, 'use_distillation': False, 'use_default_distillation_model': True, 'is_training': True}, 'train': {'common_save_best_yn': 1, 'trained_yn': True, 'larger_better': True, 'uuid': '9ae04efe-469c-42'}, 'compiler': {'input_shape': '(1,3,640,640)', 'opset_version': 11, 'devices': 
[{'task_queue_size': 0, 'device': {'uuid': 'R5CRC1NFW2E', 'device_type': 'android', 'connection_status': 'available', 'task_status': 'idle', 'info': {'uuid': 'R5CRC1NFW2E', 'cpu': 'SM8350', 'gpu': 'Qualcomm, Adreno (TM) 660', 'memory': '5.24 GB', 'battery': '100', 'brand': 'samsung', 'model': 'SM-G990U1', 'os_type': 'android'}}, 'agent_id': 'agent-localhost'}], 'ios_devices': []}}, save_dir=runs/detect/train
2023-10-24T01:08:40.600204+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-24T01:08:40.600468+0000 - INFO - sys.run_cmd_with_logger:32 -                    from  n    params  module                                       arguments
2023-10-24T01:08:40.601073+0000 - INFO - sys.run_cmd_with_logger:32 -   0                  -1  1       928  ultralytics.nn.modules.conv.Conv             [3, 32, 3, 2]
2023-10-24T01:08:40.601264+0000 - INFO - sys.run_cmd_with_logger:32 -   1                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]
2023-10-24T01:08:40.603408+0000 - INFO - sys.run_cmd_with_logger:32 -   2                  -1  1     29056  ultralytics.nn.modules.block.C2f             [64, 64, 1, True]
2023-10-24T01:08:40.604276+0000 - INFO - sys.run_cmd_with_logger:32 -   3                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]
2023-10-24T01:08:40.606853+0000 - INFO - sys.run_cmd_with_logger:32 -   4                  -1  2    197632  ultralytics.nn.modules.block.C2f             [128, 128, 2, True]
2023-10-24T01:08:40.608873+0000 - INFO - sys.run_cmd_with_logger:32 -   5                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128, 256, 3, 2]
2023-10-24T01:08:40.614586+0000 - INFO - sys.run_cmd_with_logger:32 -   6                  -1  2    788480  ultralytics.nn.modules.block.C2f             [256, 256, 2, True]
2023-10-24T01:08:40.623840+0000 - INFO - sys.run_cmd_with_logger:32 -   7                  -1  1   1180672  ultralytics.nn.modules.conv.Conv             [256, 512, 3, 2]
2023-10-24T01:08:40.634369+0000 - INFO - sys.run_cmd_with_logger:32 -   8                  -1  1   1838080  ultralytics.nn.modules.block.C2f             [512, 512, 1, True]
2023-10-24T01:08:40.638723+0000 - INFO - sys.run_cmd_with_logger:32 -   9                  -1  1    656896  ultralytics.nn.modules.block.SPPF            [512, 512, 5]
2023-10-24T01:08:40.639013+0000 - INFO - sys.run_cmd_with_logger:32 -  10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
2023-10-24T01:08:40.639503+0000 - INFO - sys.run_cmd_with_logger:32 -  11             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]
2023-10-24T01:08:40.643705+0000 - INFO - sys.run_cmd_with_logger:32 -  12                  -1  1    591360  ultralytics.nn.modules.block.C2f             [768, 256, 1]
2023-10-24T01:08:40.644311+0000 - INFO - sys.run_cmd_with_logger:32 -  13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
2023-10-24T01:08:40.644508+0000 - INFO - sys.run_cmd_with_logger:32 -  14             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]
2023-10-24T01:08:40.645914+0000 - INFO - sys.run_cmd_with_logger:32 -  15                  -1  1    148224  ultralytics.nn.modules.block.C2f             [384, 128, 1]
2023-10-24T01:08:40.647037+0000 - INFO - sys.run_cmd_with_logger:32 -  16                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]
2023-10-24T01:08:40.647294+0000 - INFO - sys.run_cmd_with_logger:32 -  17            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]
2023-10-24T01:08:40.651157+0000 - INFO - sys.run_cmd_with_logger:32 -  18                  -1  1    493056  ultralytics.nn.modules.block.C2f             [384, 256, 1]
2023-10-24T01:08:40.655821+0000 - INFO - sys.run_cmd_with_logger:32 -  19                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]
2023-10-24T01:08:40.656210+0000 - INFO - sys.run_cmd_with_logger:32 -  20             [-1, 9]  1         0  ultralytics.nn.modules.conv.Concat           [1]
2023-10-24T01:08:40.671960+0000 - INFO - sys.run_cmd_with_logger:32 -  21                  -1  1   1969152  ultralytics.nn.modules.block.C2f             [768, 512, 1]
2023-10-24T01:08:40.687185+0000 - INFO - sys.run_cmd_with_logger:32 -  22        [15, 18, 21]  1   2147008  ultralytics.nn.modules.head.Detect           [80, [128, 256, 512]]
2023-10-24T01:08:51.878774+0000 - INFO - sys.run_cmd_with_logger:32 - YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.7 GFLOPs
2023-10-24T01:08:51.879297+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-24T01:08:51.885047+0000 - INFO - sys.run_cmd_with_logger:32 - TensorBoard: Start with 'tensorboard --logdir runs/detect/train', view at http://localhost:6006/
2023-10-24T01:08:55.739030+0000 - INFO - sys.run_cmd_with_logger:32 - Freezing layer 'model.22.dfl.conv.weight'
2023-10-24T01:09:26.625978+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-24T01:09:26.626206+0000 - INFO - sys.run_cmd_with_logger:32 - train: Scanning /data/object-detection-yolov6/coco/labels/train2017.cache... 117266 images, 1021 backgrounds, 0 corrupt: 100%|██████████| 118287/118287 [00:00<?, ?it/s]
2023-10-24T01:09:28.232390+0000 - INFO - sys.run_cmd_with_logger:32 - train: Scanning /data/object-detection-yolov6/coco/labels/train2017.cache... 117266 images, 1021 backgrounds, 0 corrupt: 100%|██████████| 118287/118287 [00:00<?, ?it/s]
2023-10-24T01:09:30.203838+0000 - INFO - sys.run_cmd_with_logger:32 - 
2023-10-24T01:09:30.204406+0000 - INFO - sys.run_cmd_with_logger:32 - val: Scanning /data/object-detection-yolov6/coco/labels/val2017.cache... 4952 images, 48 backgrounds, 0 corrupt: 100%|██████████| 5000/5000 [00:00<?, ?it/s]
2023-10-24T01:09:30.258672+0000 - INFO - sys.run_cmd_with_logger:32 - val: Scanning /data/object-detection-yolov6/coco/labels/val2017.cache... 4952 images, 48 backgrounds, 0 corrupt: 100%|██████████| 5000/5000 [00:00<?, ?it/s]
2023-10-24T01:09:31.119089+0000 - INFO - sys.run_cmd_with_logger:32 - Plotting labels to runs/detect/train/labels.jpg...
2023-10-24T01:09:35.704396+0000 - INFO - sys.run_cmd_with_logger:32 - optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
2023-10-24T01:09:35.704949+0000 - INFO - sys.run_cmd_with_logger:32 - model is not a DataParallel model
2023-10-24T01:09:35.705215+0000 - INFO - sys.run_cmd_with_logger:32 - can't find file in /root/Projects/.checkpoints/yolov8/yolov8s_xgen.pt
2023-10-24T01:09:35.939356+0000 - INFO - sys.run_cmd_with_logger:32 - Traceback (most recent call last):
2023-10-24T01:09:35.939619+0000 - INFO - sys.run_cmd_with_logger:32 -   File "train_script_main.py", line 25, in <module>
2023-10-24T01:09:35.939834+0000 - INFO - sys.run_cmd_with_logger:32 -     training_main(args_ai=None)
2023-10-24T01:09:35.940261+0000 - INFO - sys.run_cmd_with_logger:32 -   File "train_script_main.py", line 21, in training_main
2023-10-24T01:09:35.940546+0000 - INFO - sys.run_cmd_with_logger:32 -     model.train(data=args.data, batch=args.batch, args=args, args_ai=args_ai, device=args_ai['general']['CUDA_VISIBLE_DEVICES'])
2023-10-24T01:09:35.940805+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/root/output/YOLOv8_CoCo2017/20231024010820/yolov8_xgen/ultralytics/engine/model.py", line 357, in train
2023-10-24T01:09:35.941075+0000 - INFO - sys.run_cmd_with_logger:32 -     self.trainer.train()
2023-10-24T01:09:35.941430+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/root/output/YOLOv8_CoCo2017/20231024010820/yolov8_xgen/ultralytics/engine/trainer.py", line 204, in train
2023-10-24T01:09:35.941700+0000 - INFO - sys.run_cmd_with_logger:32 -     self._do_train(world_size)
2023-10-24T01:09:35.941985+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/root/output/YOLOv8_CoCo2017/20231024010820/yolov8_xgen/ultralytics/engine/trainer.py", line 314, in _do_train
2023-10-24T01:09:35.942273+0000 - INFO - sys.run_cmd_with_logger:32 -     self._setup_train(world_size)
2023-10-24T01:09:35.942505+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/root/output/YOLOv8_CoCo2017/20231024010820/yolov8_xgen/ultralytics/engine/trainer.py", line 301, in _setup_train
2023-10-24T01:09:35.942840+0000 - INFO - sys.run_cmd_with_logger:32 -     xgen_load(self.model, args_ai=self.args_ai)
2023-10-24T01:09:35.943866+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_tools-1.0.9-py3.7.egg/xgen_tools/helper.py", line 94, in __call__
2023-10-24T01:09:35.944016+0000 - INFO - sys.run_cmd_with_logger:32 -   File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_tools-1.0.9-py3.7.egg/xgen_tools/xgen_load.py", line 846, in xgen_load
2023-10-24T01:09:35.944200+0000 - INFO - sys.run_cmd_with_logger:32 - FileNotFoundError
Traceback (most recent call last):
  File "xgen_scripts.py", line 109, in main
    training(training_main, training_script_path=training_script_path, log_path=log_path)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/training/core.py", line 449, in training
    internal_data = train_module(job, training_main)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/training/train_module.py", line 184, in train_module
    args_ai = model_train_main(job, training_main)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/training/train_module.py", line 163, in model_train_main
    raise Exception('Training failed')
Exception: Training failed
2023-10-24T01:09:37.585222+0000 - ERROR - xgen_scripts.main:116 - Error found. Please check log file at /root/output/YOLOv8_CoCo2017/20231024010820/xgen-training.log
2023-10-24T01:09:37.585508+0000 - ERROR - xgen_scripts.main:117 - Cancel started session.
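
The traceback above ends in a FileNotFoundError right after the message "can't find file in /root/Projects/.checkpoints/yolov8/yolov8s_xgen.pt". Below is a minimal sketch of a check that could be run before training starts, assuming the args_ai dictionary has the layout printed in the log (an 'origin' section holding 'pretrain_model_weights_path'). The helper name missing_pretrained_weights is hypothetical and is not part of XGen.

```python
import os
from typing import Optional


def missing_pretrained_weights(args_ai: dict) -> Optional[str]:
    """Return the configured weights path if it is not a file on disk, else None.

    Assumes the layout seen in the log:
    args_ai['origin']['pretrain_model_weights_path'].
    """
    path = args_ai.get("origin", {}).get("pretrain_model_weights_path")
    if path is None:
        # No pretrained weights configured, so there is nothing to check.
        return None
    return None if os.path.isfile(path) else path


if __name__ == "__main__":
    example = {
        "origin": {
            "pretrain_model_weights_path":
                "/root/Projects/.checkpoints/yolov8/yolov8s_xgen.pt"
        }
    }
    missing = missing_pretrained_weights(example)
    if missing:
        print(f"Pretrained weights not found: {missing}")
```

Failing early with the offending path would make the cause visible without reading the full training log.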

test only

This is just a test to ensure that there are no permission problems with issue reporting.

Clearer instructions for ADB settings

The documentation could be more explicit about where the commands are executed. The devices first have to be connected to a local machine (in most cases), which may confuse those unfamiliar with mobile development. Adding a phrase like "On the local/host machine" to the first figure would make this clearer.

First

image

Second

image

Support choosing the GPU for custom AI

Unlike the common AI tasks, where we can choose which GPU to use, custom AI does not provide this option and seems to always use GPU 0. Users cannot proceed with custom AI if GPU 0 is already in use. It would be better to provide the same GPU selection option as common AI.
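
For reference, the common AI log in the license report launches training as `CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 python train_script_main.py`. Below is a minimal sketch of the same idea for a custom training script that is started as a child process. build_gpu_env is a hypothetical helper, not an XGen API, and whether XGen's custom AI flow honors a pre-set CUDA_VISIBLE_DEVICES is not confirmed here.

```python
import os
from typing import Dict, List, Optional


def build_gpu_env(gpu_ids: List[int],
                  base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment that exposes only the given GPUs to CUDA."""
    if not gpu_ids:
        raise ValueError("at least one GPU index is required")
    env = dict(os.environ if base_env is None else base_env)
    # Order GPUs by PCI bus ID so indices match the order nvidia-smi lists them in.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


if __name__ == "__main__":
    env = build_gpu_env([1])
    print(env["CUDA_DEVICE_ORDER"], env["CUDA_VISIBLE_DEVICES"])
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when starting the training script, so the parent shell's own environment stays untouched.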

issues in custom ai

When I use custom ai, I see the following problem.
2022-09-17-04-12-44-log.txt
It seems to ask for Key: 'global_sparsity', which is not available.

The setting is attached.
1122
I tried other settings, such as compatibility test and pruning, but the problem still happens.

Here is my original train script and xgen.json.
scripts.zip

Redis Docker container fails to start

Dear Author,

I ran into a problem when starting XGen.
The registry.cn-beijing.aliyuncs.com/cocopie_development/redis:7.0 container fails to start.
It keeps restarting again and again.

Here is the docker ps -a result:

2adb82c19ca9   registry.cn-beijing.aliyuncs.com/cocopie_development/redis:7.0                                             "docker-entrypoint.s…"   11 seconds ago   Restarting (1) 2 seconds ago                                                                                                   xgen_redis

Here is the output of docker logs for the container:

*  Executing task: docker logs --tail 1000 -f 2adb82c19ca905fc8c370c7dcb983822b2b562166c65f22fabcfe26f78376336 

1:C 27 Oct 2023 20:35:51.365 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 27 Oct 2023 20:35:51.365 # Redis version=7.0.11, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 27 Oct 2023 20:35:51.365 # Configuration loaded
1:M 27 Oct 2023 20:35:51.365 * monotonic clock: POSIX clock_gettime
1:M 27 Oct 2023 20:35:51.368 * Running mode=standalone, port=6379.
1:M 27 Oct 2023 20:35:51.368 # Server initialized
1:M 27 Oct 2023 20:35:51.368 * Reading RDB base file on AOF loading...
1:M 27 Oct 2023 20:35:51.368 * Loading RDB produced by version 7.0.11
1:M 27 Oct 2023 20:35:51.368 * RDB age 47860 seconds
1:M 27 Oct 2023 20:35:51.368 * RDB memory usage when created 1.33 Mb
1:M 27 Oct 2023 20:35:51.368 * RDB is base AOF
1:M 27 Oct 2023 20:35:51.368 * Done loading RDB, keys loaded: 54, keys expired: 0.
1:M 27 Oct 2023 20:35:51.368 * DB loaded from base file appendonly.aof.8.base.rdb: 0.000 seconds
1:M 27 Oct 2023 20:35:51.686 # Bad file format reading the append only file appendonly.aof.8.incr.aof: make a backup of your AOF file, then use ./redis-check-aof --fix <filename.manifest>
1:C 27 Oct 2023 20:35:53.454 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 27 Oct 2023 20:35:53.454 # Redis version=7.0.11, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 27 Oct 2023 20:35:53.454 # Configuration loaded
1:M 27 Oct 2023 20:35:53.455 * monotonic clock: POSIX clock_gettime
1:M 27 Oct 2023 20:35:53.457 * Running mode=standalone, port=6379.
1:M 27 Oct 2023 20:35:53.457 # Server initialized
1:M 27 Oct 2023 20:35:53.457 * Reading RDB base file on AOF loading...
1:M 27 Oct 2023 20:35:53.457 * Loading RDB produced by version 7.0.11
1:M 27 Oct 2023 20:35:53.457 * RDB age 47862 seconds
1:M 27 Oct 2023 20:35:53.457 * RDB memory usage when created 1.33 Mb
1:M 27 Oct 2023 20:35:53.457 * RDB is base AOF
1:M 27 Oct 2023 20:35:53.457 * Done loading RDB, keys loaded: 54, keys expired: 0.
1:M 27 Oct 2023 20:35:53.457 * DB loaded from base file appendonly.aof.8.base.rdb: 0.000 seconds
1:M 27 Oct 2023 20:35:53.821 # Bad file format reading the append only file appendonly.aof.8.incr.aof: make a backup of your AOF file, then use ./redis-check-aof --fix <filename.manifest>
1:C 27 Oct 2023 20:35:54.393 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 27 Oct 2023 20:35:54.393 # Redis version=7.0.11, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 27 Oct 2023 20:35:54.393 # Configuration loaded
1:M 27 Oct 2023 20:35:54.393 * monotonic clock: POSIX clock_gettime
1:M 27 Oct 2023 20:35:54.396 * Running mode=standalone, port=6379.
1:M 27 Oct 2023 20:35:54.396 # Server initialized
1:M 27 Oct 2023 20:35:54.397 * Reading RDB base file on AOF loading...
1:M 27 Oct 2023 20:35:54.397 * Loading RDB produced by version 7.0.11
1:M 27 Oct 2023 20:35:54.397 * RDB age 47863 seconds
1:M 27 Oct 2023 20:35:54.397 * RDB memory usage when created 1.33 Mb
1:M 27 Oct 2023 20:35:54.397 * RDB is base AOF
1:M 27 Oct 2023 20:35:54.397 * Done loading RDB, keys loaded: 54, keys expired: 0.
1:M 27 Oct 2023 20:35:54.397 * DB loaded from base file appendonly.aof.8.base.rdb: 0.000 seconds
1:M 27 Oct 2023 20:35:54.655 # Bad file format reading the append only file appendonly.aof.8.incr.aof: make a backup of your AOF file, then use ./redis-check-aof --fix <filename.manifest>

Thanks.
Best,
Hsin-Hsuan
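
The Redis log above repeats the same fatal line on every restart: the incremental AOF file cannot be read, and Redis itself suggests backing up the file and running redis-check-aof --fix on the manifest. Below is a minimal sketch of extracting the affected file name from such log text, so that a restart loop like this can be recognized automatically. find_corrupt_aof_files is a hypothetical helper, and the regular expression only matches the message wording shown in this log.

```python
import re
from typing import List

# Matches the message wording seen in the log above (Redis 7.0.11).
_BAD_AOF = re.compile(r"Bad file format reading the append only file (\S+?):")


def find_corrupt_aof_files(log_text: str) -> List[str]:
    """Return the distinct AOF file names reported as unreadable, in first-seen order."""
    seen: List[str] = []
    for match in _BAD_AOF.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen
```

Because the container restarts with the same data directory each time, the same file name is reported repeatedly, which is why the helper removes duplicates.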

Inconsistent usage between the specification and the example code

In the manual at https://xgen.cocopie.ai/v1.1.0/4_Usage/#step-iii-prepare-training-scripts , under Step III: Prepare Training Scripts for custom AI, step 3 says that we should use the following:

originalArgs, args_ai = xgen_init(user_args, args_ai, COCOPIE_MAP)

But in the example code that follows, it becomes:

user_args,args_ai = xgen_init(orginalArgs,map = COCOPIE_MAP)

The two are inconsistent.

Besides, the example introduces a new variable, user_args, but the optimizer still uses lr=args.learning_rate, where args is neither user_args nor orginalArgs.
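
To make the naming problem concrete, here is a minimal sketch in which a stand-in function plays the role of xgen_init. The stand-in fake_xgen_init, its return values, and the Namespace fields are assumptions for illustration only; the real xgen_init signature is whichever of the two forms above the manual settles on. The point is only that the optimizer should read the learning rate from the object the call returns.

```python
from argparse import Namespace
from typing import Tuple

COCOPIE_MAP: dict = {}  # placeholder; the real map comes from the user's script


def fake_xgen_init(original_args: Namespace, map: dict) -> Tuple[Namespace, dict]:
    """Hypothetical stand-in for xgen_init: returns (user_args, args_ai)."""
    user_args = Namespace(**vars(original_args))
    args_ai = {"origin": dict(vars(original_args))}
    return user_args, args_ai


original_args = Namespace(learning_rate=0.01, batch_size=16)
user_args, args_ai = fake_xgen_init(original_args, map=COCOPIE_MAP)

# Use the returned object consistently, instead of an undefined `args`.
optimizer_kwargs = {"lr": user_args.learning_rate}
```

With one name used from the call site through to the optimizer, a reader copying the example would not hit a NameError on `args`.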

License expiration error

Problem

The XGen license should be valid: at startup, XGen shows "Your license has 90 days left". But at the end of the compatibility test, the following error appeared:

ERROR: test on device "RF8MA1SEARB" failed
The license for XGen inference engine has expired.

Setup

I was remotely logged into eb2-3226-lin07 and ran XGen on that remote machine. But the smartphone was connected to my local Mac laptop.

Log

[xshen5@eb2-3226-lin07:> run_xgen
Existing XGen containers found.
Running multiple instances could result in file damage if multiple users share one XGen Data or Projects directory.
Successfully entering XGen environment. Next, you can type command "XGen" to interact with the powerful toolchain.
(xgen) root@eb2-3226-lin07:> XGen

XGen CLI by CoCoPIE
XGen(v1.0.17)

A DNN optimizer built on compression-compilation co-design.
It makes AI deployment sweet as pie by simultaneously
optimizing model size, accuracy, speed, and more.

© 2022 CoCoPIE Inc. All Rights Reserved.

Third Party Software License Disclaimer In:
/root/third_party_license.txt

Your license has 90 days left.

Please choose your device(s):

[*] RF8MA1SEARB model:SM_G973U1
Press or for multi-selection, and or letter key and to move, to accept.

Please choose the base model (and the default dataset) to start with:
Image classification I: EfficientNet (ImageNet)
Image classification II: ResNet (ImageNet)

Image classification III: ResNet (Cifar)
Image classification IV: MobileNet (ImageNet)
Object detection: Yolov5 (CoCo)
Segmentation: UNet (ISBI-2012)
Video classification: R2+1d (UCF101)
Video super resolution: WDSR (DIV2K)
Your own model

Do you choose to download common AI model(ResNet) now?

Yes
No

Will you use the default dataset: Cifar?

Yes
No

What do you want to do? (Pruning recommended):

Compatibility test
Pruning
Scaling
Customized operation

Which GPU(s) do you want to use:
[ ] GPU 0: Quadro K600 (UUID: GPU-e08cba0c-cb67-dd6b-54ec-adaf92784537)

[*] GPU 1: NVIDIA GeForce GTX TITAN X (UUID: GPU-cc712512-b07f-d448-0baf-293e495924ec)
Press or for multi-selection, and or letter key and to move, to accept.

What is the batch size per GPU: 128
What is the learning rate: 0.01
How many epochs: 200
You chose "Compatibility test". We will create a temporary workplace for you, and will delete it after the testing.
2022-10-01 03:23:05,142 - root - INFO - AIMET
Your current workplace is /tmp/8b7f7da48acd4ed18a60d8d4ab3744c5
A new search is started!
config summary********************
xgen-config-path: /tmp/8b7f7da48acd4ed18a60d8d4ab3744c5/xgen_config.json
xgen-workplace: /tmp/8b7f7da48acd4ed18a60d8d4ab3744c5
xgen-resume: False
xgen-mode: compatible_testing
xgen-pretrained-model-path: ./checkpoint/ckpt.pth
Detail args: {'origin': {'common_train_epochs': 200, 'root_path': './workplace/', 'pretrain_model_weights_path': None, 'train_data_path': './data', 'train_label_path': None,
'eval_data_path': './data', 'eval_label_path': None, 'scaling_factor': 2, 'num_classes': 10, 'batch_size': 128, 'learning_rate': 0.01}, 'general': {'user_id': 'test', 'work_place':
'/tmp/8b7f7da48acd4ed18a60d8d4ab3744c5', 'random_seed': 3407, 'enable_ddp': False, 'CUDA_VISIBLE_DEVICES': '1', 'tran_scripts_path': None}, 'prune': {'sp_store_weights': None, 'sp_lars': False,
'sp_lars_trust_coef': 0.001, 'sp_backbone': False, 'sp_retrain': False, 'sp_admm': False, 'sp_admm_multi': False, 'sp_retrain_multi': False, 'sp_config_file': None, 'sp_subset_progressive':
False, 'sp_admm_fixed_params': False, 'sp_no_harden': False, 'nv_sparse': False, 'sp_load_prune_params': None, 'sp_store_prune_params': None, 'generate_rand_seq_gap_yaml': False,
'sp_admm_update_epoch': 5, 'sp_admm_update_batch': None, 'sp_admm_rho': 0.001, 'sparsity_type': 'block_punched', 'sp_admm_lr': 0.01, 'admm_debug': False, 'sp_global_weight_sparsity': False,
'sp_prune_threshold': -1.0, 'sp_block_irregular_sparsity': '(0,0)', 'sp_block_permute_multiplier': 2, 'sp_admm_block': '(8,4)', 'sp_admm_buckets_num': 16, 'sp_admm_elem_per_row': 1,
'sp_admm_tile': None, 'sp_admm_select_number': 4, 'sp_admm_pattern_row_sub': 1, 'sp_admm_pattern_col_sub': 4, 'sp_admm_data_format': None, 'sp_admm_do_not_permute_conv': False, 'sp_gs_output_v':
None, 'sp_gs_output_ptr': None, 'sp_load_frozen_weights': None, 'retrain_mask_pattern': 'weight', 'sp_update_init_method': 'weight', 'sp_mask_update_freq': 10, 'retrain_mask_sparsity': -1.0,
'retrain_mask_seed': None, 'sp_prune_before_retrain': False, 'output_compressed_format': False, 'sp_grad_update': False, 'sp_grad_decay': 0.98, 'sp_grad_restore_threshold': -1,
'sp_global_magnitude': False, 'sp_pre_defined_mask_dir': None, 'sp_prune_ratios': 0, 'admm_sparsity_type': 'block_punched', 'admm_block': '(8,4)', 'prune_threshold': -1.0}, 'quantization':
{'qt_aimet': False, 'qat': True, 'fold_layers': True, 'cross_layer_equalization': False, 'bias_correction': True, 'rounding_mode': 'nearest', 'num_quant_samples': 1000,
'num_bias_correct_samples': 1000, 'weight_bw': 8, 'act_bw': 8, 'quant_scheme': 'tf_enhanced', 'layers_to_ignore': [], 'auto_add_bias': True, 'perform_only_empirical_bias_corr': True}, 'task':
{'specific_scenarios': 'BasicTest', 'pretrained_model_path': './checkpoint/ckpt.pth', 'state': {'stage': 0, 'cycles': 0}, 'max_searching': 10}, 'user_requirements': {'power': None, 'accuracy':
None, 'accuracy_reverse_yn': 0, 'model_size': None, 'memory_size': None, 'latency': 0.1, 'margin': 0.1, 'primary_type': 'latency', 'primary_range': '>', 'secondary_type': 'accuracy',
'secondary_range': '<', 'searching_variable': 'scaling_factor', 'searching_range': [1, 23], 'searching_step_size': 1, 'target_type': 'latency'}, 'train': {'common_save_best_yn': 1, 'trained_yn':
False, 'larger_better': True}, 'compiler': {'input_shape': '(1, 3, 32, 32)', 'opset_version': 11, 'devices': ['RF8MA1SEARB']}, 'distillation': {'distillation_method': None, 'enable_ddp': False,
'enable_dp': False, 'input_shape': None, 'original_loss_weights': 0.1, 'tag_loss_weights': 0.9, 'tag_loss': 'kl', 'tag_temperature': 4, 'tag_loss_combination_method': 'avg',
'feature_loss_weights': 0.9, 'feature_default_temperature': 1, 'advance_feature_mapping': {}, 'regularization_loss_weights': 1, 'regularization_loss_types': [], 'discriminator_lr': 0.0001}}
Current search has 1 stages
Stage: 1
Max search cycles: 1
config summary********************
Current state: stage: 1/1| cycles: 1/1
Total jobs:1
processing job 1/1
Training...
MKL_THREADING_LAYER=GNU CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 python train_script_main.py
2022-10-01 03:23:06,951 - root - INFO - AIMET
==> Preparing data..
==> Building model..
load model from ./checkpoint/ckpt.pth
original:
testing top-1 accuracy is 95.47
2022-10-01 03:23:35,398 - CoLib - INFO - Start CoLib.init
2022-10-01 03:23:35,400 - PruneOptimizer - INFO - Start PruneOptimizer.init
2022-10-01 03:23:35,400 - QuantizationOptimizer - INFO - Start QuantizationOptimizer.init
2022-10-01 03:23:35,401 - CoLib - INFO - available compression algrithm is {'PruneOptimizer': ['Admm', 'Magnitude'], 'QuantizationOptimizer': ['AimetQt']}
2022-10-01 03:23:35,401 - CoLib - INFO - If you find some algorithm was not shown here, please check whether fully installed the co_lib or include all co_lib related package
2022-10-01 03:23:35,401 - CoLib - INFO - Applied algorithm is {}
after harden:
testing top-1 accuracy is 95.47
testing top-1 accuracy is 95.47
[W NNPACK.cpp:79] Could not initialize NNPACK! Reason: Unsupported hardware.
Compiling...
Onnx file path: /tmp/8b7f7da48acd4ed18a60d8d4ab3744c5/4c3ce2ae-9b6f-41.onnx
INFO: running model, _4c3ce2ae_9b6f_41, on device RF8MA1SEARB
[W init.h:137] Caffe2 GlobalInit should be run before any other API calls.
2022-10-01 03:23:45,170 - onnx_optimizer.base.onnx_model - INFO - Sort graphs in topological order
2022-10-01 03:23:45,171 - onnx_optimizer.base.onnx_model - INFO - Output model to /tmp/tmp99i1h8ee
2022-10-01 03:23:45,632 - onnx_optimizer.base.onnx_model - INFO - Sort graphs in topological order
2022-10-01 03:23:45,633 - onnx_optimizer.base.onnx_model - INFO - Output model to /tmp/tmp99i1h8ee
2022-10-01 03:23:45,810 - onnx_optimizer.base.onnx_model - INFO - Sort graphs in topological order
2022-10-01 03:23:45,811 - onnx_optimizer.base.onnx_model - INFO - Output model to /tmp/tmpyr20_2gw
ERROR: test on device "RF8MA1SEARB" failed
The license for XGen inference engine has expired.
(xgen) root@eb2-3226-lin07:~#

Compatibility issue between XGen and OpenCV

Steps to reproduce:

  1. Download OpenCV version 4.6.0 for the Android platform
  2. Go to the XGen Android demo repository
  3. Incorporate the downloaded OpenCV into the project
  4. Try using any OpenCV functionality by creating a new C++ native file
  5. The application will exit with a fatal error coming from libxgen.so

Could this be a dependency issue with the OpenMP version?

XGen folder is not created after install command

All previous steps worked well, except for the bash xgen_docker_deploy.sh --install command.
Output from running this command: Tools checking passed.
But no XGen folder was created under the home directory.

Failed to start the mongo and redis Docker containers

Dear authors,

I am using XGen v1.3.0.
After installation, I get this error when I execute the command "xgen_run":

Starting XGen environment...

Traceback (most recent call last):
  File "/usr/local/bin/init_aio_db", line 8, in <module>
    sys.exit(init_aio_db())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/xgen_controller_dashboard/init_db.py", line 25, in init_aio_db
    asyncio.run(do_init())
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/site-packages/xgen_controller_dashboard/init_db.py", line 17, in do_init
    await initiate_database()
  File "/usr/local/lib/python3.10/site-packages/xgen_controller_dashboard/app/database.py", line 25, in initiate_database
    await init_beanie(
  File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/init.py", line 530, in init_beanie
    await Initializer(
  File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/init.py", line 89, in __await__
    yield from self.init_class(model).__await__()
  File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/init.py", line 500, in init_class
    await self.init_document(cls)
  File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/init.py", line 324, in init_document
    build_info = await self.database.command({"buildInfo": 1})
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.10/site-packages/pymongo/_csot.py", line 105, in csot_wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/pymongo/database.py", line 805, in command
    with self.__client._socket_for_reads(read_preference, session) as (
  File "/usr/local/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1296, in _socket_for_reads
    server = self._select_server(read_preference, session)
  File "/usr/local/lib/python3.10/site-packages/pymongo/mongo_client.py", line 1257, in _select_server
    server = topology.select_server(server_selector)
  File "/usr/local/lib/python3.10/site-packages/pymongo/topology.py", line 272, in select_server
    server = self._select_server(selector, server_selection_timeout, address)
  File "/usr/local/lib/python3.10/site-packages/pymongo/topology.py", line 261, in _select_server
    servers = self.select_servers(selector, server_selection_timeout, address)
  File "/usr/local/lib/python3.10/site-packages/pymongo/topology.py", line 223, in select_servers
    server_descriptions = self._select_servers_loop(selector, server_timeout, address)
  File "/usr/local/lib/python3.10/site-packages/pymongo/topology.py", line 238, in _select_servers_loop
    raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: xgen_mongodb:27017: [Errno -3] Temporary failure in name resolution, Timeout: 30s, Topology Description: <TopologyDescription id: 653421f3231a643eb83f8ec8, topology_type: Unknown, servers: [<ServerDescription ('xgen_mongodb', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('xgen_mongodb:27017: [Errno -3] Temporary failure in name resolution')>]>

I can still run xgen_run a second time, but then I get a connection error with Redis:

Starting XGen environment...

Successfully entered the XGen environment. Next, you can type command "XGen" to interact with the powerful toolchain.
(xgen) root@hsung2-MS-7A45:~# XGen
Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 716, in connect
    lambda: self._connect(), lambda error: self.disconnect(error)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/retry.py", line 46, in call_with_retry
    return do()
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 716, in <lambda>
    lambda: self._connect(), lambda error: self.disconnect(error)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 781, in _connect
    raise err
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 769, in _connect
    sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/xgen/bin/XGen", line 33, in <module>
    sys.exit(load_entry_point('xgen-main==1.2.3', 'console_scripts', 'XGen')())
  File "/usr/local/miniconda3/envs/xgen/bin/XGen", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/importlib_metadata/__init__.py", line 209, in load
    module = import_module(match.group('module'))
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/main.py", line 5, in <module>
    from xgen.device_lab.cancel_job import cancel_job
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/device_lab/__init__.py", line 1, in <module>
    from .helper import get_all_devices
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/device_lab/helper.py", line 6, in <module>
    from xgen.utils.redis_ipc import RedisIPC
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/utils/redis_ipc.py", line 14, in <module>
    class RedisIPC:
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.2.3-py3.7.egg/xgen/utils/redis_ipc.py", line 24, in RedisIPC
    r.setnx(Config.device_status_key, json.dumps(None))
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/commands/core.py", line 2335, in setnx
    return self.execute_command("SETNX", name, value)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/client.py", line 1255, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 1481, in get_connection
    connection.connect()
  File "/usr/local/miniconda3/envs/xgen/lib/python3.7/site-packages/redis-4.5.1-py3.7.egg/redis/connection.py", line 721, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to localhost:12379. Connection refused.

Finally, I ran xgenctl status and it gave me this information:

root@8b9f28b24409:~# xgenctl status 
Settings check passed.
Redis connection failed.
Redis connection failed. Please ensure the xgen_redis container is running.
MQ connection failed.
Supervisor services check failed: ['queue-monitor', 'packer', 'unpacker', 'device-monitor', 'controllerd']
Supervisor services check failed. Please ensure the configuration file located at`$HOME/.xgen_controller/controller.conf` is correct. Then run `xgenctl restart`.

Here is the status of the mongo and redis containers from docker ps -a:

c334021f1b0a   271679491055.dkr.ecr.us-east-1.amazonaws.com/mongo:6                                               "docker-entrypoint.s…"   20 minutes ago   Restarting (14) 32 seconds ago                                                                                                   xgen_mongodb
0d7bdc7a8cc2   271679491055.dkr.ecr.us-east-1.amazonaws.com/redis:7.0                                             "docker-entrypoint.s…"   20 minutes ago   Restarting (1) 37 seconds ago                                                                                                    xgen_redis

What am I missing for a correct installation?
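As a starting point for diagnosing the restart loop shown above, here is a minimal sketch that picks out the restarting containers from `docker ps -a` output and prints the `docker logs` command to run for each. It assumes the output was produced with `docker ps -a --format '{{.Names}}\t{{.Status}}'`. The helper name `find_restarting` is hypothetical and is not part of XGen.

```python
from typing import List


def find_restarting(ps_output: str) -> List[str]:
    """Return names of containers whose status starts with 'Restarting'.

    Expects one container per line as '<name><TAB><status>'.
    """
    restarting = []
    for line in ps_output.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        name, status = line.split("\t", 1)
        if status.strip().startswith("Restarting"):
            restarting.append(name.strip())
    return restarting


# Names and statuses taken from the docker ps -a listing in this report.
sample = (
    "xgen_mongodb\tRestarting (14) 32 seconds ago\n"
    "xgen_redis\tRestarting (1) 37 seconds ago\n"
)

for name in find_restarting(sample):
    # Only prints the command to run next; nothing is executed here.
    print(f"docker logs --tail 50 {name}")
```

The container logs usually state why the entrypoint keeps exiting, which is the information missing from the tracebacks above.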

Connectivity issue during XGen installation.

Executing command:

bash xgen_docker_deploy.sh --install --region HK

Issues:

  1. The bash script that installs XGen does not reliably report errors caused by connectivity issues. In some cases the script quits without any error message even though the installation has not finished.
  2. Downloading the XGen docker image from mainland China takes hours at the very least.

Suggestions regarding the XGen installation documentation

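1. Deploy. Users should select the region where their server is located. If the server is in mainland China, use --region HK, not --region US. I initially chose --region US and the installation did not work.
2. Add chmod +x to the documentation to avoid "access denied" errors.
3. Add a telnet 127.0.0.1 28000 test. If the test fails, the user must do a multiple-machine installation rather than a single-machine one. This keeps users from choosing the single-machine installation when their setup does not meet its requirements.
4. Upgrades. If the controller and agent have been upgraded but XGen has not, XGen will not run, so in that case upgrading is mandatory rather than the user's choice. I suggest stating in the "How to start" section when an upgrade is required. With only a "How to Upgrade" section, users may assume that upgrading is optional.
5. The "uninstall_xgen" feature does not seem to exist in the latest version yet.
6. When I first installed the agent, the config file already contained "xgen_controller_port=55672 # do not change this". The port should actually be changed to 28000, so the "# do not change this" comment is confusing.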
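The port test suggested in item 3 can also be done without telnet. This is a minimal sketch using only the Python standard library. The function name `port_is_open` is hypothetical, and 28000 is simply the port quoted in the suggestions above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    ok = port_is_open("127.0.0.1", 28000)
    print("port 28000 reachable" if ok else "port 28000 not reachable")
```

A check like this could run at the start of the install script, so that a failed test steers the user toward the multiple-machine installation before anything is downloaded.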

The training procedure may report latency with a negative value.

Compiling...
Onnx file path: /root/output/ResNet_Cifar/20220915114338/a440521b-ac79-49.onnx
Current Phase
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Quality Score ┃ Latency(ms)         ┃ Model Size ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ 10.0          │ -1.1664394710837795 │ 11.174M    │
└───────────────┴─────────────────────┴────────────┘
Current state: stage: 3/3| cycles: 1/1
Total jobs:1
processing job 1/1
Training...
MKL_THREADING_LAYER=GNU CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 python train_script_main.py
2022-09-15 11:45:05,874 - root - INFO - AIMET
==> Preparing data..
==> Building model..
load model from /root/output/ResNet_Cifar/20220915114338/5352643d-643f-4e.pth

Error when testing speed of ONNX

When I try to evaluate the latency of a custom ONNX model with the command XGen test-onnx-latency --model-path [onnx-model-path] --output-path [output-path], it always fails. For reference, here is the log from the result.csv that XGen generates.

onnx_latency,fallback_latency,output_dir,slowest_device,all,input_shape,output_shape,error,device,file_name,file_path,pruning,time_cost,params,IR,MACs
,,,,,,,[Errno 2] No such file or directory: '/tmp/xgen/compiler/generated/deepvan_run',RFCR40CWRQD,yolov6l_relu.onnx,/data/onnx/yolov6l_relu.onnx,1.0,43.25611686706543,58582735,212868400,72127269600.0
inf,inf,/root/output/e5873d1c-50ea-48,RFCR40CWRQD,"{'RFCR40CWRQD': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': 'F deepvan/core/operator.cc:246]  pc 0x5c799e746c \nF deepvan/core/operator.cc:246]  pc 0x5c799e60e0 \nF deepvan/core/operator.cc:246]  pc 0x5c79763674 \nF deepvan/core/operator.cc:246]  pc 0x5c7975c5fc \nF deepvan/core/operator.cc:246]  pc 0x5c7975ba20 \nF deepvan/core/operator.cc:246]  pc 0x5c797471f0 \nF deepvan/core/operator.cc:246]  pc 0x5c79746a48 \nF deepvan/core/operator.cc:246]  pc 0x5c79740908 \nF deepvan/core/operator.cc:246]  pc 0x5c797421fc \nF deepvan/core/operator.cc:246]  pc 0x7987ae01e8 __libc_init\n\n\n     capability(CPU)        init      warmup     run_avg\n========================================================\n                 10          20.1       30.2        inf\n/tmp/xgen/compiler/generated/deepvan_run ~\n~\nINFO: Created TensorFlow Lite delegate for GPU.\nINFO: Created TensorFlow Lite XNNPACK delegate for CPU.\nINFO: Initialized TensorFlow Lite runtime.\nINFO: Replacing 409 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions.\nERROR: TfLiteGpuDelegate Init: ADD: Expected a 3D tensor of shape HxWxC or a 4D tensor of shape 1xHxWxC but got 8400x2\nINFO: Created 0 GPU delegate kernels.\nERROR: TfLiteGpuDelegate Prepare: delegate is not initialized\nERROR: Node number 409 (TfLiteGpuDelegateV2) failed to prepare.\nERROR: Restored original execution plan after delegate application failure.\nSegmentation fault \n/tmp/xgen/compiler/generated/deepvan_run ~\n~\n'}}","{'images': [1, 3, 640, 640]}","{'outputs': [1, 8400, 85]}","F deepvan/core/operator.cc:246]  pc 0x5c799e746c 
F deepvan/core/operator.cc:246]  pc 0x5c799e60e0 
F deepvan/core/operator.cc:246]  pc 0x5c79763674 
F deepvan/core/operator.cc:246]  pc 0x5c7975c5fc 
F deepvan/core/operator.cc:246]  pc 0x5c7975ba20 
F deepvan/core/operator.cc:246]  pc 0x5c797471f0 
F deepvan/core/operator.cc:246]  pc 0x5c79746a48 
F deepvan/core/operator.cc:246]  pc 0x5c79740908 
F deepvan/core/operator.cc:246]  pc 0x5c797421fc 
F deepvan/core/operator.cc:246]  pc 0x7987ae01e8 __libc_init


     capability(CPU)        init      warmup     run_avg
========================================================
                 10          20.1       30.2        inf
/tmp/xgen/compiler/generated/deepvan_run ~
~
INFO: Created TensorFlow Lite delegate for GPU.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
INFO: Initialized TensorFlow Lite runtime.
INFO: Replacing 409 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions.
ERROR: TfLiteGpuDelegate Init: ADD: Expected a 3D tensor of shape HxWxC or a 4D tensor of shape 1xHxWxC but got 8400x2
INFO: Created 0 GPU delegate kernels.
ERROR: TfLiteGpuDelegate Prepare: delegate is not initialized
ERROR: Node number 409 (TfLiteGpuDelegateV2) failed to prepare.
ERROR: Restored original execution plan after delegate application failure.
Segmentation fault 
/tmp/xgen/compiler/generated/deepvan_run ~
~
",RFCR40CWRQD,yolov6l_relu.onnx,/data/onnx/yolov6l_relu.onnx,0.0,157.5739607810974,58591515,212868400,72127695200
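Because the error column of result.csv holds multi-line quoted text, the file is hard to read by eye. Here is a minimal sketch of pulling out one summary line per row with Python's csv module, which handles quoted fields that span lines. The column names come from the header above. `SAMPLE` is abbreviated from the rows above (the "all" column is reduced to "{}" and the error text is shortened), and `summarize_results` is a hypothetical helper.

```python
import csv
import io

SAMPLE = """onnx_latency,fallback_latency,output_dir,slowest_device,all,input_shape,output_shape,error,device,file_name,file_path,pruning,time_cost,params,IR,MACs
,,,,,,,[Errno 2] No such file or directory: '/tmp/xgen/compiler/generated/deepvan_run',RFCR40CWRQD,yolov6l_relu.onnx,/data/onnx/yolov6l_relu.onnx,1.0,43.25611686706543,58582735,212868400,72127269600.0
inf,inf,/root/output/e5873d1c-50ea-48,RFCR40CWRQD,"{}","{'images': [1, 3, 640, 640]}","{'outputs': [1, 8400, 85]}","F deepvan/core/operator.cc:246]  pc 0x5c799e746c
Segmentation fault
",RFCR40CWRQD,yolov6l_relu.onnx,/data/onnx/yolov6l_relu.onnx,0.0,157.5739607810974,58591515,212868400,72127695200
"""


def summarize_results(csv_text):
    """Return (file_name, device, onnx_latency, first_error_line) per row."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        error = (row.get("error") or "").strip()
        first_line = error.splitlines()[0].strip() if error else ""
        rows.append((row["file_name"], row["device"], row["onnx_latency"], first_line))
    return rows


for name, device, latency, err in summarize_results(SAMPLE):
    print(f"{name} on {device}: latency={latency or 'n/a'} error={err!r}")
```

For a real run, the text could be read from the result.csv under the chosen output path instead of `SAMPLE`.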

Data organization probably needs some improvement in next version

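The main consideration is how users can conveniently bring in their own datasets.
It seems that custom AI data is placed directly under Projects, with model and data separated, whereas common AI data is placed under /Data and there is some coupling between data and model.
In addition, data management should also cover the case where the user's dataset lives elsewhere (a special case being a dataset mounted via Samba/NFS).

For the first issue, directory management, consider mounting the /Data volume directly into docker so that all datasets are under /Data. Users can then mount their own data under /Data as well (including NFS/Samba, although symlinks may not work).

For the second issue, the coupling between model and data, I am not clear on how data and models are currently organized. Still, if one dataset (including the data used by common AI) is used for multiple models, the current organization does not seem very convenient.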

Occasionally, XGen does not quit after a task has been executed

Issues

XGen does not quit after a task has been executed.

Environment

  1. Do not connect phones to the host.
  2. Multiple users use the same docker container.

Reproduce Steps

  1. User1 uses XGen to execute some tasks.
  2. Wait a long time until the tasks have finished. (Use "top" to check resource utilization: XGen is still running, but it has no Python subprocesses.)
    XGen_does_not_exit
  3. User2 logs in to the same docker container to run XGen.

Results

"Another active XGen instance is found. This XGen is a single-instance version. Please wait for it to complete."

Expectations

  1. XGen quits correctly after tasks are finished.

path issues in custom ai

In custom ai, we need to input the path of the train script and the xgen.json.
However, the xgen.json path must point to the file itself, while the training script path must point to the directory that contains the script. The two are inconsistent: one is a file path and the other is a directory path. I figured it out quickly, but it would be better if the two paths had consistent meanings.

In addition, custom ai requires users to provide an xgen.json file. The custom ai part of the manual does not mention this, so users do not know how to prepare this configuration file. I can reuse a configuration file from a common ai project, but without instructions I am not sure whether that configuration file is correct.

Negative values would confuse users

Issues

After performing some tasks, negative "Quality Score" and "Latency" values would confuse users.

Environment

  1. Do not connect phones to the host.

Reproduce Steps

  1. Run XGen, then select UNet, Pruning
  2. Select latency, input desired latency 200ms
  3. Input quality score >= 90
  4. Input batch size per GPU: 8

Results

negative_results

Expectations

Users may expect some explanations for these negative values.

Operator "DepthToSpace" is not supported in deepopt mode according to its properties.

When trying to compile multiple DNN models from ONNX format, I typically get this error:

inf,inf,testResults//83489944-8eb7-44,RF8M21Y9MNR,"{'RF8M21Y9MNR': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': 'Operator ""DepthToSpace"" is not supported in deepopt mode according to its properties.\n'}}","{'input': [1, 1, 300, 300]}","{'output': [1, 1, 1200, 1200]}","Operator ""DepthToSpace"" is not supported in deepopt mode according to its properties.
",RF8M21Y9MNR,SR.onnx,dnnModels/SR.onnx,1.0,8.405928611755371,1391745,525600000,125925120000.0

How should we interpret this XGen error? This is for Dr. @xipengshen's class, where we are tasked with compiling and optimizing a Super Resolution model. Since the XGen code base is closed source, I can't dig any further by myself. Any pointers on how to resolve or work around this would be greatly appreciated.

I'm blocked by this operator DepthToSpace which is supported by XGen on GPU.

I'm a little confused, because I followed the instructions our professor gave us for confirming whether a model is supported by the XGen opset, via python /usr/local/compiler/cocogen/xgen_check.py [onnx file]. I assume the log means success if the check passed?

INFO: ------------------------------------------
INFO: |        key        |       value        |
INFO: ==========================================
INFO: | DeepVan Model Path| builds/SR/model|
INFO: ------------------------------------------
INFO: 
INFO: Generate input file:  builds/SR/_tmp/arm64-v8a/model_input_input
INFO: Generate input file done.
INFO: builds/SR/model/SR.pb builds/SR/model/SR.data
INFO: builds/SR/_tmp/arm64-v8a/model_input_input
INFO: /tmp/cmd_file-cocogen-SR-1669170755.2465415
INFO: /tmp/cmd_file-fallback-SR-1669170755.2465415
INFO: Generated model and library in ./generated/deepvan_run/
INFO: ~/TryXGen
INFO: adb: more than one device/emulator
INFO: adb: error: failed to get feature set: more than one device/emulator
INFO: adb: more than one device/emulator
INFO: adb: more than one device/emulator
INFO: adb: more than one device/emulator
INFO: 
INFO: 
INFO:      capability(CPU)        init      warmup     run_avg
INFO: ========================================================
INFO:                  10          20.1       30.2        inf
INFO: /tmp/xgen/compiler/generated/deepvan_run ~/TryXGen
INFO: ~/TryXGen
INFO: adb: more than one device/emulator
INFO: /tmp/xgen/compiler/generated/deepvan_run ~/TryXGen
INFO: ~/TryXGen
INFO: Check passed.

Additional information: two Samsung S10e devices are connected to the machine we share for the class.
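The repeated "adb: more than one device/emulator" lines in the log above are what adb prints when several devices are attached and a command does not name one. Here is a minimal sketch of listing the attached serials and building an adb command pinned to one device with the standard `-s` option. The helper names are hypothetical. Whether XGen's own scripts pass a serial to adb is not known from this report.

```python
def parse_adb_devices(output):
    """Return serials of ready devices from `adb devices` output."""
    serials = []
    for line in output.splitlines()[1:]:  # skip the "List of devices attached" header
        parts = line.split()
        # Keep only entries in state "device" (skips "offline", "unauthorized").
        if len(parts) == 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def adb_command(serial, *args):
    """Build an adb command line that targets one device via -s."""
    return ["adb", "-s", serial, *args]


# Serials taken from the logs in this report.
sample = "List of devices attached\nR38M20BDTME\tdevice\nRF8M21Y9MNR\tdevice\n\n"
serials = parse_adb_devices(sample)
print(adb_command(serials[0], "shell", "getprop", "ro.product.model"))
```

For manual adb calls, setting the `ANDROID_SERIAL` environment variable has the same effect as `-s`.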

The model originates from PyTorch and is then converted to ONNX format. I experimented with both opset 9 and 11 (RDN_9 and RDN_11). I originally tried opset 11, but when falling back to opset 9, I get a different error that I can't really comprehend.

Full logs:

(xgen) root@eb2-3267-lin05:~/TryXGen# onnx_latency_benchmark --model-path dnnModels/ --output-path testResults/
INFO: running pruned model, RDN_9, on device R38M20BDTME
INFO: running model, RDN_9, on device R38M20BDTME
ERROR: test on device "R38M20BDTME" failed
INFO: running pruned model, RDN_9, on device RF8M21Y9MNR
INFO: running model, RDN_9, on device RF8M21Y9MNR
ERROR: test on device "RF8M21Y9MNR" failed
0/2
INFO: running pruned model, RDN_11, on device R38M20BDTME
INFO: running model, RDN_11, on device R38M20BDTME
ERROR: test on device "R38M20BDTME" failed
INFO: running pruned model, RDN_11, on device RF8M21Y9MNR
INFO: running model, RDN_11, on device RF8M21Y9MNR
ERROR: test on device "RF8M21Y9MNR" failed
1/2
Done
Results are in testResults/result.csv

Output of result.csv

onnx_latency,fallback_latency,output_dir,slowest_device,all,input_shape,output_shape,error,device,file_name,file_path,pruning,time_cost,params,IR,MACs
inf,inf,testResults//cbe828f3-566d-4d,R38M20BDTME,"{'R38M20BDTME': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf}}","{'input': [1, 1, 300, 300]}","{'output': [1, 1, 1200, 1200]}",,R38M20BDTME,RDN_9.onnx,dnnModels/RDN_9.onnx,1.0,17.332654237747192,1391745,709920000,125925120000.0
inf,inf,testResults//139fdea4-5af6-4f,R38M20BDTME,"{'R38M20BDTME': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': ""I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 34, total(ms): 1039.57\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 35, total(ms): 1040.67\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 36, total(ms): 1037.68\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 37, total(ms): 1040.06\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 38, total(ms): 1040.13\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 39, total(ms): 1038.21\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 40, total(ms): 1036.62\nI deepvan/run/deepvan_run.cc:137] 
Average latency (w/o 1 outliers): 1038.14 ms, boundaries: [1034.46, 1042.35]\n========================================================\n     capability(CPU)        init      warmup     run_avg\n========================================================\ntime          10.000    1695.605   10423.165    1038.138\n\n/tmp/xgen/compiler/generated/deepvan_run ~/TryXGen\n~/TryXGen\n""}}","{'input': [1, 1, 300, 300]}","{'output': [1, 1, 1200, 1200]}","I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 34, total(ms): 1039.57
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 35, total(ms): 1040.67
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 36, total(ms): 1037.68
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 37, total(ms): 1040.06
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 38, total(ms): 1040.13
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 39, total(ms): 1038.21
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 40, total(ms): 1036.62
I deepvan/run/deepvan_run.cc:137] Average latency (w/o 1 outliers): 1038.14 ms, boundaries: [1034.46, 1042.35]
========================================================
     capability(CPU)        init      warmup     run_avg
========================================================
time          10.000    1695.605   10423.165    1038.138

/tmp/xgen/compiler/generated/deepvan_run ~/TryXGen
~/TryXGen
",R38M20BDTME,RDN_9.onnx,dnnModels/RDN_9.onnx,0.0,70.93232417106628,1391755,709920000,125925120000
inf,inf,testResults//ef26a187-c7b7-40,RF8M21Y9MNR,"{'RF8M21Y9MNR': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf}}","{'input': [1, 1, 300, 300]}","{'output': [1, 1, 1200, 1200]}",,RF8M21Y9MNR,RDN_9.onnx,dnnModels/RDN_9.onnx,1.0,8.801038265228271,1391745,709920000,125925120000.0
inf,inf,testResults//85741514-9a60-4f,RF8M21Y9MNR,"{'RF8M21Y9MNR': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': ""I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 34, total(ms): 1045.03\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 35, total(ms): 1043.67\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 36, total(ms): 1044.02\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 37, total(ms): 1042.6\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 38, total(ms): 1042.45\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 39, total(ms): 1042.52\nI deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0\nI deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0\nI deepvan/run/deepvan_run.cc:1137] Run round: 40, total(ms): 1042.78\nI deepvan/run/deepvan_run.cc:137] 
Average latency (w/o 4 outliers): 1042.87 ms, boundaries: [1039.03, 1046.73]\n========================================================\n     capability(CPU)        init      warmup     run_avg\n========================================================\ntime          10.000    1695.588   10451.755    1042.869\n\n/tmp/xgen/compiler/generated/deepvan_run ~/TryXGen\n~/TryXGen\n""}}","{'input': [1, 1, 300, 300]}","{'output': [1, 1, 1200, 1200]}","I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 34, total(ms): 1045.03
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 35, total(ms): 1043.67
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 36, total(ms): 1044.02
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 37, total(ms): 1042.6
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 38, total(ms): 1042.45
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 39, total(ms): 1042.52
I deepvan/run/deepvan_run.cc:1234] Use input input's buffer for tensor serving_default_input:0
I deepvan/run/deepvan_run.cc:1255] Use output output's buffer for tensor PartitionedCall:0
I deepvan/run/deepvan_run.cc:1137] Run round: 40, total(ms): 1042.78
I deepvan/run/deepvan_run.cc:137] Average latency (w/o 4 outliers): 1042.87 ms, boundaries: [1039.03, 1046.73]
========================================================
     capability(CPU)        init      warmup     run_avg
========================================================
time          10.000    1695.588   10451.755    1042.869

/tmp/xgen/compiler/generated/deepvan_run ~/TryXGen
~/TryXGen
",RF8M21Y9MNR,RDN_9.onnx,dnnModels/RDN_9.onnx,0.0,67.71089172363281,1391755,709920000,125925120000
inf,inf,testResults//76752209-9084-46,R38M20BDTME,"{'R38M20BDTME': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': 'Operator ""DepthToSpace"" is not supported in deepopt mode according to its properties.\n'}}","{'input': [1, 1, 300, 300]}","{'output': [1, 1, 1200, 1200]}","Operator ""DepthToSpace"" is not supported in deepopt mode according to its properties.
",R38M20BDTME,RDN_11.onnx,dnnModels/RDN_11.onnx,1.0,8.57650113105774,1391745,525600000,125925120000.0
inf,inf,testResults//8b8e6e26-d9a3-46,R38M20BDTME,"{'R38M20BDTME': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': 'Operator ""DepthToSpace"" is not supported in deepopt mode according to its properties.\n'}}","{'input': [1, 1, 300, 300]}","{'output': [1, 1, 1200, 1200]}","Operator ""DepthToSpace"" is not supported in deepopt mode according to its properties.
",R38M20BDTME,RDN_11.onnx,dnnModels/RDN_11.onnx,0.0,67.88899064064026,1391745,525600000,125925120000
inf,inf,testResults//d9e6fbe9-561d-44,RF8M21Y9MNR,"{'RF8M21Y9MNR': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': 'Operator ""DepthToSpace"" is not supported in deepopt mode according to its properties.\n'}}","{'input': [1, 1, 300, 300]}","{'output': [1, 1, 1200, 1200]}","Operator ""DepthToSpace"" is not supported in deepopt mode according to its properties.
",RF8M21Y9MNR,RDN_11.onnx,dnnModels/RDN_11.onnx,1.0,8.639543294906616,1391745,525600000,125925120000.0
inf,inf,testResults//3c5ecaed-69df-4e,RF8M21Y9MNR,"{'RF8M21Y9MNR': {'latency': inf, 'faster': 'onnx', 'slower': 'fallback', 'slower-latency': inf, 'error': 'Operator ""DepthToSpace"" is not supported in deepopt mode according to its properties.\n'}}","{'input': [1, 1, 300, 300]}","{'output': [1, 1, 1200, 1200]}","Operator ""DepthToSpace"" is not supported in deepopt mode according to its properties.
",RF8M21Y9MNR,RDN_11.onnx,dnnModels/RDN_11.onnx,0.0,68.2154529094696,1391745,525600000,125925120000

For your convenience, I've also attached the ONNX graphs for RDN_9 and RDN_11.

Graphs

RDN_9
RDN_11
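
To make the result rows above easier to scan, here is a minimal sketch that parses them with Python's csv module and lists the runs whose latency is inf. The header is the one from the results file; the sample record is one of the RDN_11 rows, with the long "all" column abbreviated to "{...}" for brevity. The function name failed_runs is an illustrative choice, not part of XGen.

```python
import csv
import io
import math

# Header plus one record from the results above. The "all" column is
# abbreviated to "{...}" here; the error column spans two physical lines.
SAMPLE = '''onnx_latency,fallback_latency,output_dir,slowest_device,all,input_shape,output_shape,error,device,file_name,file_path,pruning,time_cost,params,IR,MACs
inf,inf,testResults//76752209-9084-46,R38M20BDTME,"{...}","{'input': [1, 1, 300, 300]}","{'output': [1, 1, 1200, 1200]}","Operator ""DepthToSpace"" is not supported in deepopt mode according to its properties.
",R38M20BDTME,RDN_11.onnx,dnnModels/RDN_11.onnx,1.0,8.57650113105774,1391745,525600000,125925120000.0
'''


def failed_runs(csv_text):
    """Return (file_name, device, pruning, error) for rows with infinite latency."""
    # newline="" keeps the line break inside the quoted error field intact.
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    failures = []
    for row in reader:
        if math.isinf(float(row["onnx_latency"])):
            failures.append(
                (row["file_name"], row["device"],
                 float(row["pruning"]), row["error"].strip())
            )
    return failures


for file_name, device, pruning, error in failed_runs(SAMPLE):
    print(f"{file_name} on {device} (pruning={pruning}): {error}")
```

Grouping the output by the error text shows quickly whether every failure traces back to the same unsupported operator.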

Add PixelShuffle operator

PixelShuffle is widely used in image and video transformations (e.g., super resolution). Could XGen add support for this operator?
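
For reference, the rearrangement that pixel shuffle performs can be sketched in a few lines of NumPy. It moves blocks of r*r channels into an r-times larger spatial grid, which is the same computation the ONNX specification defines for DepthToSpace in CRD mode. This is only an illustration of the operator's semantics, not XGen code; the function name pixel_shuffle and the toy input are chosen for the example.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange (N, C*r*r, H, W) into (N, C, H*r, W*r)."""
    n, c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    # Split the channel axis into (C, r, r), then interleave the two
    # r-sized axes with the height and width axes.
    x = x.reshape(n, c, r, r, h, w)
    x = x.transpose(0, 1, 4, 2, 5, 3)  # -> (N, C, H, r, W, r)
    return x.reshape(n, c, h * r, w * r)


# Toy input: 16 channels of a single pixel become one 4x4 image.
x = np.arange(16).reshape(1, 16, 1, 1)
y = pixel_shuffle(x, 4)
print(y.shape)
print(y[0, 0])
```

With an upscale factor of 4, a (1, 16, 300, 300) tensor becomes (1, 1, 1200, 1200), which matches the input and output shapes reported for the RDN models.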

some results are strange

We tried XGen, and some of the results we obtained look strange.

[Image: 1111]
The above shows the results for pruning with the default pruning arguments. After pruning, the latency for efficientnet-ImageNet increases, whereas pruning should reduce it.

In addition, the latency before pruning is larger on the S20 than on the S10. The S20 would normally be expected to be the faster device, with the smaller latency.

Custom AI XGen_init problem

Problem

When running the compatibility test on a custom AI model, xgen_init gave errors.

Files for reproduction

The model is the MNIST model from Pu. The files are here.

Log

xgen_log

About TFLite converter and CPU Threads setting

Dear Author,

For research purposes, I need to initialize my model with different configurations.
Currently, I use the TFLite interpreter, which offers a setting for the number of CPU threads.
Is there any mechanism in XGen to convert a .pb file into .tflite?
Or is it possible to initialize the model with a fixed number of CPU threads?

Thanks.

Best,
Hsin-Hsuan

Issues during Compatibility Test within XGen

I am trying to optimize a basic MNIST model, with scripts linked below.

The example is taken from MNIST.

After running run_xgen, I get into the XGen container, where I am able to run the model and create the ONNX model.
When I run the compatibility check,

python /usr/local/compiler/cocogen/xgen_check.py [the onnx file]

I get

INFO: Check passed.

However, when I try the same from within the XGen conda env by running XGen, I get the following error:

Traceback (most recent call last):
  File "xgen_scripts.py", line 96, in main
    xgen(training_main, run_mock, training_script_path=training_script_path, log_path=log_path)
  File "/root/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.0.11-py3.7.egg/xgen/xgen_run.py", line 402, in xgen
    internal_data = train_module(job, training_main)
  File "/root/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.0.11-py3.7.egg/xgen/train_module.py", line 160, in train_module
    args_ai = model_train_main(job, training_main)
  File "/root/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.0.11-py3.7.egg/xgen/train_module.py", line 145, in model_train_main
    raise Exception('Training failed')
Exception: Training failed

Complete Log File: Log
Train_script_main: Train_Script
Config: Config

check onnx file in custom ai

In the manual, the 'run the check' step does not work as written. Running the following command runs into problems:

/usr/local/compiler/cocogen/xgen_check.py [the onnx file]

Prefixing the command with python works:

python /usr/local/compiler/cocogen/xgen_check.py [the onnx file]
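
The issue does not say what the failure was, so the following is a guess: a script that runs under python but not when invoked directly is often missing the executable bit or a shebang line. The sketch below checks both for a given file. The helper name direct_run_problems is made up for this example, and the demo uses a temporary file rather than the real xgen_check.py.

```python
import os
import stat
import tempfile


def direct_run_problems(path):
    """List reasons why `path` may not run when invoked directly."""
    problems = []
    mode = os.stat(path).st_mode
    # Direct invocation needs at least one execute permission bit.
    if not mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        problems.append("file is not marked executable")
    # Without a shebang, the shell does not know to use the Python interpreter.
    with open(path, "rb") as f:
        if not f.readline().startswith(b"#!"):
            problems.append("file has no shebang line")
    return problems


# Demo on a throwaway script (assumes a POSIX filesystem).
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "check_demo.py")
    with open(script, "w") as f:
        f.write("print('INFO: Check passed.')\n")
    os.chmod(script, 0o644)
    print(direct_run_problems(script))
```

If either problem shows up for xgen_check.py inside the container, the manual should either use the python prefix or the script should be shipped with a shebang and execute permission.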

custom AI model loading error

Problem

When I tried to run the compatibility test of the latest XGen (without the expiration problem) on a custom AI model, I encountered a model loading error.

Files for reproduction

The custom AI model is the MNIST model that Pu posted:
https://github.com/CoCoPIE-Group/XGen-Report/files/9693404/xgen_mnist.zip

Log

Please choose your device(s):

[*] RF8M428R6AK model:SM_G973U
Press or for multi-selection, and or letter key and to move, to accept.

Please choose the base model (and the default dataset) to start with:
Image classification I: EfficientNet (ImageNet)
Image classification II: ResNet (ImageNet)
Image classification III: ResNet (Cifar)
Image classification IV: MobileNet (ImageNet)
Object detection: Yolov5 (CoCo)
Segmentation: UNet (ISBI-2012)
Video classification: R2+1d (UCF101)
Video super resolution: WDSR (DIV2K)

Your own model

Have you set up the environment needed to train your model in this container?

Yes
No

Have you revised your training script for XGen by following the XGen manual?

Yes
No

If you haven't run the compatibility testing, you are recommended to select that option in the following questions.
XGen config file (absolute path): /root/Projects/xgen_mnist/xgen.json
Training script folder (absolute path): /root/Projects/xgen_mnist/
What do you want to do?

Compatibility test
Pruning
Scaling
Customized operation

You chose "Compatibility test". We will create a temporary workplace for you, and will delete it after the testing.
2022-10-03 14:00:15,181 - root - INFO - AIMET
Your current workplace is /tmp/b3cc6ec24c4646129820d4d72401239f
A new search is started!
config summary********************
xgen-config-path: /tmp/b3cc6ec24c4646129820d4d72401239f/xgen_config.json
xgen-workplace: /tmp/b3cc6ec24c4646129820d4d72401239f
xgen-resume: False
xgen-mode: compatible_testing
xgen-pretrained-model-path: ./checkpoint/ckpt.pth
Detail args: {'origin': {'common_train_epochs': 200, 'root_path': './workplace/', 'pretrain_model_weights_path': None, 'train_data_path': './data', 'train_label_path': None,
'eval_data_path': './data', 'eval_label_path': None, 'scaling_factor': 2, 'num_classes': 10, 'batch_size': 128}, 'general': {'user_id': 'test', 'work_place':
'/tmp/b3cc6ec24c4646129820d4d72401239f', 'random_seed': None, 'enable_ddp': False, 'CUDA_VISIBLE_DEVICES': '0', 'tran_scripts_path': None}, 'prune': {'sp_store_weights': None, 'sp_lars':
False, 'sp_lars_trust_coef': 0.001, 'sp_backbone': False, 'sp_retrain': False, 'sp_admm': False, 'sp_admm_multi': False, 'sp_retrain_multi': False, 'sp_config_file': None,
'sp_subset_progressive': False, 'sp_admm_fixed_params': False, 'sp_no_harden': False, 'nv_sparse': False, 'sp_load_prune_params': None, 'sp_store_prune_params': None,
'generate_rand_seq_gap_yaml': False, 'sp_admm_update_epoch': 5, 'sp_admm_update_batch': None, 'sp_admm_rho': 0.001, 'sparsity_type': 'block_punched', 'sp_admm_lr': 0.01, 'admm_debug':
False, 'sp_global_weight_sparsity': False, 'sp_prune_threshold': -1.0, 'sp_block_irregular_sparsity': '(0,0)', 'sp_block_permute_multiplier': 2, 'sp_admm_block': '(8,4)',
'sp_admm_buckets_num': 16, 'sp_admm_elem_per_row': 1, 'sp_admm_tile': None, 'sp_admm_select_number': 4, 'sp_admm_pattern_row_sub': 1, 'sp_admm_pattern_col_sub': 4, 'sp_admm_data_format':
None, 'sp_admm_do_not_permute_conv': False, 'sp_gs_output_v': None, 'sp_gs_output_ptr': None, 'sp_load_frozen_weights': None, 'retrain_mask_pattern': 'weight', 'sp_update_init_method':
'weight', 'sp_mask_update_freq': 10, 'retrain_mask_sparsity': -1.0, 'retrain_mask_seed': None, 'sp_prune_before_retrain': False, 'output_compressed_format': False, 'sp_grad_update': False,
'sp_grad_decay': 0.98, 'sp_grad_restore_threshold': -1, 'sp_global_magnitude': False, 'sp_pre_defined_mask_dir': None, 'sp_prune_ratios': 0, 'admm_sparsity_type': 'block_punched',
'admm_block': '(8,4)', 'prune_threshold': -1.0}, 'quantization': {'qt_aimet': False, 'qat': True, 'fold_layers': True, 'cross_layer_equalization': False, 'bias_correction': True,
'rounding_mode': 'nearest', 'num_quant_samples': 1000, 'num_bias_correct_samples': 1000, 'weight_bw': 8, 'act_bw': 8, 'quant_scheme': 'tf_enhanced', 'layers_to_ignore': [], 'auto_add_bias':
True, 'perform_only_empirical_bias_corr': True}, 'task': {'specific_scenarios': 'BasicTest', 'pretrained_model_path': './checkpoint/ckpt.pth', 'state': {'stage': 0, 'cycles': 0},
'max_searching': 10}, 'user_requirements': {'power': None, 'accuracy': None, 'accuracy_reverse_yn': 0, 'model_size': None, 'memory_size': None, 'latency': 0.1, 'margin': 0.1,
'primary_type': 'latency', 'primary_range': '>', 'secondary_type': 'accuracy', 'secondary_range': '<', 'searching_variable': 'scaling_factor', 'searching_range': [1, 23],
'searching_step_size': 1, 'target_type': 'latency'}, 'train': {'common_save_best_yn': 1, 'trained_yn': False, 'larger_better': True}, 'compiler': {'input_shape': '(1, 1, 28, 28)',
'opset_version': 11, 'devices': ['RF8M428R6AK']}, 'distillation': {'distillation_method': None, 'enable_ddp': False, 'enable_dp': False, 'input_shape': None, 'original_loss_weights': 0.1,
'tag_loss_weights': 0.9, 'tag_loss': 'kl', 'tag_temperature': 4, 'tag_loss_combination_method': 'avg', 'feature_loss_weights': 0.9, 'feature_default_temperature': 1,
'advance_feature_mapping': {}, 'regularization_loss_weights': 1, 'regularization_loss_types': [], 'discriminator_lr': 0.0001}}
Current search has 1 stages
Stage: 1
Max search cycles: 1
config summary********************
Current state: stage: 1/1| cycles: 1/1
Total jobs:1
processing job 1/1
Training...
MKL_THREADING_LAYER=GNU CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 python train_script_main.py
2022-10-03 14:00:16,288 - root - INFO - AIMET
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ../data/MNIST/raw/train-images-idx3-ubyte.gz
9913344it [00:00, 33112378.88it/s]
Extracting ../data/MNIST/raw/train-images-idx3-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ../data/MNIST/raw/train-labels-idx1-ubyte.gz
29696it [00:00, 19239118.26it/s]
Extracting ../data/MNIST/raw/train-labels-idx1-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw/t10k-images-idx3-ubyte.gz
1649664it [00:00, 10896211.41it/s]
Extracting ../data/MNIST/raw/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz
5120it [00:00, 16869470.92it/s]
Extracting ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw

/root/miniconda3/envs/xgen/lib/python3.7/site-packages/torchvision/datasets/mnist.py:498: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:180.)
return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
Traceback (most recent call last):
  File "train_script_main.py", line 151, in
    training_main(args_ai)
  File "train_script_main.py", line 130, in training_main
    xgen_load(model,args_ai=args_ai)
  File "/root/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_tools-1.0.1-py3.7.egg/xgen_tools/xgen_load.py", line 263, in xgen_load
Exception: args_ai or path can not be both none
Traceback (most recent call last):
  File "xgen_scripts.py", line 147, in main
    xgen(training_main, run, training_script_path=training_script_path, log_path=log_path)
  File "/root/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.0.17-py3.7.egg/xgen/xgen_run.py", line 364, in xgen
    internal_data = train_module(job, training_main)
  File "/root/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.0.17-py3.7.egg/xgen/train_module.py", line 169, in train_module
    args_ai = model_train_main(job, training_main)
  File "/root/miniconda3/envs/xgen/lib/python3.7/site-packages/xgen_main-1.0.17-py3.7.egg/xgen/train_module.py", line 148, in model_train_main
    raise Exception('Training failed')
Exception: Training failed
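
Reading the traceback, the message suggests that args_ai was None when training_main called xgen_load, since no path argument was passed either. The sketch below reproduces that situation with a hypothetical stand-in for the loader. load_weights_sketch is not XGen's implementation, and the lookup of 'pretrained_model_path' under 'task' is only assumed from the 'Detail args' dump in the log. It shows how an early check in training_main would give a clearer error.

```python
def load_weights_sketch(model, args_ai=None, path=None):
    """Hypothetical loader: needs either a config dict or an explicit path."""
    if args_ai is None and path is None:
        # Same message as in the log above.
        raise Exception("args_ai or path can not be both none")
    if path is not None:
        return path
    # Assumed layout, taken from the 'Detail args' dump in the log.
    return args_ai["task"]["pretrained_model_path"]


def training_main(args_ai=None):
    """Fail early with a descriptive message if no config was passed in."""
    if args_ai is None:
        raise ValueError("training_main was called without args_ai")
    return load_weights_sketch(model=None, args_ai=args_ai)


config = {"task": {"pretrained_model_path": "./checkpoint/ckpt.pth"}}
print(training_main(config))
```

If the script's entry point calls training_main with a default of None when launched by XGen, that would explain why the loader sees neither argument.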
