tenset's Introduction

TenSet: A Large-scale Program Performance Dataset for Learned Tensor Compilers

TenSet is a large-scale multi-platform tensor program performance dataset. TenSet contains 52 million program performance records collected from 6 hardware platforms. This repo is based on a fork of TVM.

Dataset Information

  • Statistics

    Item                   Number
    Networks               120
    Hardware Platforms     6
    Tasks                  13,848
    Measurement records    51,577,248
  • Hardware Platforms

    Hardware Platform                             Cloud Instance     Other Comments
    Intel Platinum 8272CL @ 2.60GHz (16 cores)    Azure D32s_v4      AVX-512
    Intel E5-2673 v4 @ 2.30GHz (8 cores)          Azure F16s         AVX-2
    AMD EPYC 7452 @ 2.35GHz (4 cores)             Azure D16as_v4     AVX-2
    ARM Graviton2 (16 cores)                      AWS c6g.4xlarge    Neon
    NVIDIA Tesla K80                              AWS p2.xlarge      Kepler Architecture
    NVIDIA Tesla T4                               AWS g4dn.xlarge    Turing Architecture

Tutorials

Organization

Follow the above tutorial to download the dataset. The dataset is stored under the tenset/scripts/dataset folder.

  • dataset/network_info: The metadata for networks
    • *.relay.pkl: The relay IR of a network. One network per file.
      • For example, (resnet_50,[(1,3,224,224)]).relay.pkl contains the relay IR of resnet_50 with input shape (1, 3, 224, 224).
    • *.task.pkl: The tasks and their weights in a network. One (network, target) pair per file.
      • For example, ((resnet_50,[(1,3,224,224)]),llvm).task.pkl contains all tasks of resnet_50 on llvm backend.
    • all_tasks.pkl: A file containing all tasks. It is used as an index for all tasks (a minimal loading sketch follows this list).
  • dataset/to_measure_programs: The generated random programs for measurement.
    • *.json: The randomly generated programs (schedules) for measurement. One file per task.
  • dataset/measure_records: Collected measurement records.
    • e5-2673/*.json: measurement records collected on the Intel E5-2673. One file per task.
    • platinum-8272/*.json: measurement records collected on an Intel platinum-8272. One file per task.
    • ...: other hardware platforms
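As a quick sanity check, the pickles above can be opened directly from Python. The sketch below is only illustrative: the exact objects stored in each pickle are not documented here, so it just loads them and prints their types, and it assumes this fork of TVM is importable so that any pickled TVM objects can be reconstructed.

    import pickle

    # Task index: described above as "a file containing all tasks".
    with open("dataset/network_info/all_tasks.pkl", "rb") as f:
        all_tasks = pickle.load(f)
    print(type(all_tasks))
    if hasattr(all_tasks, "__len__"):
        print("number of tasks:", len(all_tasks))

    # Relay IR of one network; the pickled object layout is an assumption,
    # so only its type is inspected here.
    relay_path = "dataset/network_info/(resnet_50,[(1,3,224,224)]).relay.pkl"
    with open(relay_path, "rb") as f:
        relay_obj = pickle.load(f)
    print(type(relay_obj))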

Inspect Tasks and Programs in the Dataset

Follow the above tutorial to download the dataset. You can then inspect the tasks and programs in the dataset; a sketch for loading the record files programmatically follows the examples below.

  • Print a task

    cd scripts
    python3 print_all_tasks.py --idx 1264

    output:

    Index: 1264
    flop_ct: 115806208.0
    workload_key: ["12b88bedece6984af589a28b43e0f3c4", 1, 56, 56, 64, 3, 3, 64, 128, 1, 1, 1, 128, 1, 28, 28, 128]
    Compute DAG:
    placeholder = PLACEHOLDER [1, 56, 56, 64]
    PaddedInput(i0, i1, i2, i3) = tir.if_then_else(((((i1 >= 1) && (i1 < 57)) && (i2 >= 1)) && (i2 < 57)), placeholder[i0, (i1 - 1), (i2 - 1), i3], 0f)
    placeholder = PLACEHOLDER [3, 3, 64, 128]
    Conv2dOutput(nn, yy, xx, ff) += (PaddedInput[nn, ((yy*2) + ry), ((xx*2) + rx), rc]*placeholder[ry, rx, rc, ff])
    placeholder = PLACEHOLDER [1, 1, 1, 128]
    T_add(ax0, ax1, ax2, ax3) = (Conv2dOutput[ax0, ax1, ax2, ax3] + placeholder[ax0, 0, 0, ax3])
    T_relu(ax0, ax1, ax2, ax3) = max(T_add[ax0, ax1, ax2, ax3], 0f)
  • Print a program

    cd scripts
    python3 print_programs.py --filename 'dataset/measure_records/e5-2673/([12b88bedece6984af589a28b43e0f3c4,1,56,56,64,3,3,64,128,1,1,1,128,1,28,28,128],llvm).json' --idx 31

    output:

    Index: 31
    Time cost (second): [0.000990787, 0.000826989, 0.00082599, 0.00083999, 0.000827089, 0.000831189, 0.00083599, 0.000853589]
    Program:
    Placeholder: placeholder, placeholder, placeholder
    parallel ax0.0@ax1.0@ax2.0@ (0,4)
      for i1 (0,57)
        for i2 ((floormod(ax0.outer.outer.ax1.outer.outer.fused.ax2.outer.outer.fused, 4)*14),15)
          for i3 (0,64)
            PaddedInput = ...
      for ax3.0 (0,2)
        for ax2.1 (0,7)
          for ax3.1 (0,8)
            Conv2dOutput auto_unroll: 16
            for rx.0 (0,3)
              for rc.0 (0,4)
                for ry.1 (0,3)
                  for rc.1 (0,16)
                    for yy.3 (0,28)
                      vectorize ff.3 (0,8)
                        Conv2dOutput = ...
            for ax1.2 (0,28)
              vectorize ax3.2 (0,8)
                T_relu = ...
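Besides print_programs.py, the record files can be read programmatically. The sketch below assumes the files follow TVM's standard auto_scheduler log format (this repo is a fork of TVM), in which case tvm.auto_scheduler.load_records can parse them; the path is the one from the example above.

    from tvm import auto_scheduler

    filename = ("dataset/measure_records/e5-2673/"
                "([12b88bedece6984af589a28b43e0f3c4,1,56,56,64,3,3,64,128,"
                "1,1,1,128,1,28,28,128],llvm).json")

    # Iterate over (MeasureInput, MeasureResult) pairs and print the best
    # measured run time (in seconds) of the first few records.
    for i, (inp, res) in enumerate(auto_scheduler.load_records(filename)):
        if i >= 3:
            break
        costs = [c.value for c in res.costs]
        print(inp.task.workload_key, min(costs))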

Citation

@inproceedings{zheng2021tenset,
  title={Tenset: A large-scale program performance dataset for learned tensor compilers},
  author={Zheng, Lianmin and Liu, Ruochen and Shao, Junru and Chen, Tianqi and Gonzalez, Joseph E and Stoica, Ion and Ali, Ameer Haj},
  booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
  year={2021}
}

License

The code is licensed under an Apache-2.0 license.
The dataset is licensed under a CC BY 4.0 license.

tenset's People

Contributors

anijain2305, apivovarov, areusch, comaniac, eqy, frozengene, icemelon, jroesch, junrushao, jwfromm, kazum, kevinthesun, laurawly, liaopeiyuan, lixiaoquan, marisakirisame, masahi, merrymercy, nhynes, siju-samuel, srkreddy1238, tkonolige, tmoreau89, tqchen, vegaluisjose, vinx13, wweic, yzhliu, zhiics, zihengjiang


tenset's Issues

Meet error running measure_program.py on TX2

I encountered two bugs while collecting data on an NVIDIA TX2. The settings are:

Device: NVIDIA TX2 (cuda gpu)
Model:

    for batch_size in [1]:
        for image_size in [256]:
            for layer in [18]:
                network_keys.append((f'resnet_{layer}',
                                     [(batch_size, 3, image_size, image_size)]))
shell:
python measure_programs.py --target "cuda -keys=cudagpu -arch=SM_62" --target-host="llvm"

The bugs I encountered are:
root@kyrie-desktop:/home/kyrie/Desktop/ijcai/tenset/scripts# python measure_programs.py --target "cuda -keys=cudagpu -arch=SM_62" --target-host="llvm"

Load all tasks...
===== task: 0 programs: 0/193 =====
Get 128 programs to measure:
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
Time elapsed for measurement: 708.33 s
===== task: 0 programs: 128/193 =====
Get 65 programs to measure:
........TTTTTTTT
........TTTTTTTT
........TTTTTTTT
[10:37:55] /home/kyrie/Desktop/ijcai/tenset/src/auto_scheduler/measure.cc:337: warning: Too many errors happened during tuning. Switching to debug mode.

........TTTTTTTT

No: 153 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.00, Tstamp:1636079882.82)

Placeholder:

==================================================
No: 154 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.00, Tstamp:1636079888.05)

Placeholder:

==================================================
No: 155 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.00, Tstamp:1636079893.27)

Placeholder:

==================================================
No: 156 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.00, Tstamp:1636079898.49)

Placeholder:

==================================================
No: 157 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.01, Tstamp:1636079903.71)

Placeholder:

==================================================
No: 158 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.02, Tstamp:1636079908.93)

Placeholder:

==================================================
No: 159 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.02, Tstamp:1636079914.15)

Placeholder:

==================================================
No: 160 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:5.99, Tstamp:1636079919.38)

Placeholder:

[10:38:39] /home/kyrie/Desktop/ijcai/tenset/src/auto_scheduler/measure.cc:337: warning: Too many errors happened during tuning. Switching to debug mode.

........TTTTTTTT

No: 161 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.00, Tstamp:1636079927.01)

Placeholder:

==================================================
No: 162 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:5.99, Tstamp:1636079932.25)

Placeholder:

==================================================
No: 163 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.03, Tstamp:1636079937.47)

Placeholder:

==================================================
No: 164 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:5.98, Tstamp:1636079942.69)

Placeholder:

==================================================
No: 165 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.02, Tstamp:1636079947.91)

Placeholder:

==================================================
No: 166 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.00, Tstamp:1636079953.13)

Placeholder:

==================================================
No: 167 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.03, Tstamp:1636079958.34)

Placeholder:

==================================================
No: 168 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.02, Tstamp:1636079963.57)

Placeholder:

[10:39:23] /home/kyrie/Desktop/ijcai/tenset/src/auto_scheduler/measure.cc:337: warning: Too many errors happened during tuning. Switching to debug mode.

........TTTTTTTT

No: 169 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.00, Tstamp:1636079971.21)

Placeholder:

==================================================
No: 170 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.00, Tstamp:1636079976.44)

Placeholder:

==================================================
No: 171 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:5.99, Tstamp:1636079981.67)

Placeholder:

==================================================
No: 172 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.00, Tstamp:1636079986.89)

Placeholder:

==================================================
No: 173 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.02, Tstamp:1636079992.11)

Placeholder:

==================================================
No: 174 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.03, Tstamp:1636079997.34)

Placeholder:

==================================================
No: 175 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.02, Tstamp:1636080002.56)

Placeholder:

==================================================
No: 176 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.02, Tstamp:1636080007.78)

Placeholder:

[10:40:07] /home/kyrie/Desktop/ijcai/tenset/src/auto_scheduler/measure.cc:337: warning: Too many errors happened during tuning. Switching to debug mode.

........TTTTTTTT

No: 177 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.01, Tstamp:1636080015.53)

Placeholder:

==================================================
No: 178 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.04, Tstamp:1636080020.75)

Placeholder:

==================================================
No: 179 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:5.97, Tstamp:1636080025.98)

Placeholder:

==================================================
No: 180 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.01, Tstamp:1636080031.20)

Placeholder:

==================================================
No: 181 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.01, Tstamp:1636080036.43)

Placeholder:

==================================================
No: 182 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.00, Tstamp:1636080041.65)

Placeholder:

==================================================
No: 183 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.01, Tstamp:1636080046.86)

Placeholder:

==================================================
No: 184 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.03, Tstamp:1636080052.08)

Placeholder:

[10:40:52] /home/kyrie/Desktop/ijcai/tenset/src/auto_scheduler/measure.cc:337: warning: Too many errors happened during tuning. Switching to debug mode.

........TTTTTTTT

No: 185 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.00, Tstamp:1636080059.73)

Placeholder:

==================================================
No: 186 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:5.99, Tstamp:1636080064.95)

Placeholder:

==================================================
No: 187 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.01, Tstamp:1636080070.16)

Placeholder:

==================================================
No: 188 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:5.99, Tstamp:1636080075.38)

Placeholder:

==================================================
No: 189 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.02, Tstamp:1636080080.60)

Placeholder:

==================================================
No: 190 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.02, Tstamp:1636080085.83)

==================================================
Placeholder:

==================================================
No: 191 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.01, Tstamp:1636080091.05)

Placeholder:

==================================================
No: 192 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:6.01, Tstamp:1636080096.28)

Placeholder:

[10:41:36] /home/kyrie/Desktop/ijcai/tenset/src/auto_scheduler/measure.cc:337: warning: Too many errors happened during tuning. Switching to debug mode.

.*T

No: 193 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:RunTimeoutError, error_msg:, all_cost:5.87, Tstamp:1636080102.51)

Placeholder:

[10:41:42] /home/kyrie/Desktop/ijcai/tenset/src/auto_scheduler/measure.cc:337: warning: Too many errors happened during tuning. Switching to debug mode.

The error starts to happen at the 153rd program of each task.

Another bug happens later:
No: 3889 GFLOPS: 0.00 / 0.00 results: MeasureResult(error_type:CompileHostError, error_msg:Traceback (most recent call last):
File "/home/kyrie/Desktop/ijcai/tenset/python/tvm/auto_scheduler/measure.py", line 653, in _timed_func
sch, args, target=task.target, target_host=task.target_host
File "/home/kyrie/Desktop/ijcai/tenset/python/tvm/
...
ir::VarNode const*, tvm::PrimExpr)
0: tvm::codegen::CodeGenCUDA::PrintType(tvm::runtime::DataType, std::ostream&)
File "/home/kyrie/Desktop/ijcai/tenset/src/target/source/codegen_cuda.cc", line 351
TVMError: Cannot convert type float32x32 to CUDA type
, all_cost:0.68, Tstamp:1636105300.67)

Placeholder:

About the dependency version of tenset

After I ran setup.py to install the default dependencies and tried the 'get-started' example, I ran into a few issues, and it seems that the versions of torch, xgboost, and other dependencies need to be specified. Could anyone please show me the dependency versions? Thanks.

transfer-tune using mlp throwing NoneType error

Command (from tutorial): python3 tune_network.py --network resnet_50 --n-trials 100 --cost-model mlp-no-update --load-model mlp.pkl --transfer-tune

Error:

Traceback (most recent call last):
  File "tune_network.py", line 180, in <module>
    args.result_file, args.transfer_tune, args.search_type)
  File "tune_network.py", line 93, in tune_and_evaluate
    tuner.transfer_tune(tuning_opt, search_policy=policy)
  File "<pwd>/tenset/python/tvm/auto_scheduler/task_scheduler.py", line 574, in transfer_tune
    few_shot_learning='plus_mix_task'
  File "<pwd>/tenset/python/tvm/auto_scheduler/task_scheduler.py", line 121, in make_search_policies
    cost_model.model.fit_local(local_dataset)
  File "<pwd>/tenset/python/tvm/auto_scheduler/cost_model/mlp_model.py", line 464, in fit_local
    self.local_model[task] = diff_model
TypeError: 'NoneType' object does not support item assignment

Errors when collecting measurement records (RunTimeoutError)

@merrymercy Hi! This is a great job!

When I collect data in my own environment (Tesla V100-SXM2-16GB), I encounter the error shown in the attached screenshot. Is this normal, and how should I solve it?

When I enlarge the timeout, the error_type:RunTimeoutError error still occurs.

Looking forward to your reply.

nsight compute not able to profile the kernels

I want to profile the kernels using ncu --target-processes all python3 measure_programs.py --target cuda, but no kernels are profiled. Is this normal? How can I profile the kernels with an NVIDIA profiler (ncu, nsys, or nvprof)?

Error when collecting measurement records: ProgramMeasurer is not registered

Hi, I am collecting data on my desktop (2060 GPU) and an error occurs:

Traceback (most recent call last):
File "measure_programs.py", line 137, in
remeasure_file(i, task, target, args.target_host, args.batch_size, measurer_kwargs)
File "measure_programs.py", line 82, in remeasure_file
res_batch = measurer.measure(task, empty_policy, inp_batch)
File "/root/.local/lib/python3.6/site-packages/tvm-0.8.dev1882+g5a2eed60b-py3.6-linux-x86_64.egg/tvm/runtime/object.py", line 65, in getattr
return _ffi_node_api.NodeGetAttr(self, name)
File "/root/.local/lib/python3.6/site-packages/tvm-0.8.dev1882+g5a2eed60b-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/packed_func.py", line 237, in call
raise get_last_ffi_error()
TypeError: Traceback (most recent call last):
3: TVMFuncCall
2: _ZNSt17_Function_handlerI
1: tvm::NodeGetAttr(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
0: tvm::ReflectionVTable::GetAttr(tvm::runtime::Object*, tvm::runtime::String const&) const
File "/home/lenovo/Desktop/zzh/phd/tvm/tvm/include/tvm/node/reflection.h", line 390
TypeError: auto_scheduler.ProgramMeasurer is not registered via TVM_REGISTER_NODE_TYPE

I have tried several backbones (resnet18, xception, etc.); the same error occurs.
I am using tvm-0.8.

transfer-tune option tries only one round for each task

Hello everyone.

We are interested in optimizing vision DNN models for Jetson devices, so we tried to use the TenSet dataset to optimize DNN models on Jetson Xavier NX.
With some modifications to use auto_scheduler.RPCRunner in tune_network.py, we were able to tune networks on Jetson Xavier NX.
We evaluated the models available in tune_network.py with --n-trials 10000, and we found that Ansor with the TenSet-pretrained model finds better programs than Ansor without the TenSet model in the first thousands of trials, and the final results are slightly better.

We expected that enabling the --transfer-tune option would make the results better, because --transfer-tune appears to improve the cost model by using results measured on the real device. However, when we tried the --transfer-tune option, it produced slower programs for all models. The following are the results:

ResNet 18

    compiler            execution time   # trials
    w/o transfer tune   7.21             10044
    w/ transfer tune    10.24            1692

ResNet 50

    compiler            execution time   # trials
    w/o transfer tune   15.56            10044
    w/ transfer tune    22.62            1692

MobileNet v2

    compiler            execution time   # trials
    w/o transfer tune   2.49             10048
    w/ transfer tune    2.9              2048

MobileNet v3

    compiler            execution time   # trials
    w/o transfer tune   3.06             10048
    w/ transfer tune    3.53             3328

Wide ResNet 50

    compiler            execution time   # trials
    w/o transfer tune   35.32            10044
    w/ transfer tune    48.8             1692

DenseNet 121

    compiler            execution time   # trials
    w/o transfer tune   15.61            10044
    w/ transfer tune    17.88            4604

Inception v3

    compiler            execution time   # trials
    w/o transfer tune   29.08            10015
    w/ transfer tune    45.46            3487

We use the following commands for evaluation:

n_trials=10000
target="cuda -keys=cudagpu -arch=sm_72 -max_num_threads=1024 -max_threads_per_block=1024 -registers_per_block=65536 -shared_memory_per_block=49152 -thread_warp_size=32"
target_host="llvm -keys=arm_cpu -mtriple=aarch64-linux-gnu -mattr=+neon"
# w/o transfer tune
python3 tune_network.py --network ${model} --n-trials ${n_trials} --cost-model xgb-no-update --load-model xgb.pkl --target "$target" --target-host "$target_host"
# w/ transfer tune
python3 tune_network.py --network ${model} --n-trials ${n_trials} --cost-model xgb-no-update --transfer-tune --load-model xgb.pkl --target "$target" --target-host "$target_host"

To investigate the slower results of transfer tune, we read the code related to the transfer-tune option and found some seemingly strange points in its implementation:

  • It only tunes each task for one round, even if we give a much larger trial count. For ResNet 50, normal Ansor with the TenSet model tries 10044 trials, but transfer tune does only 1692.
  • It only uses fine-tuned models for the last half of the tasks. The first half of the tasks are always tuned with the given model.

Could you please tell me the intention of this implementation, or how to improve the results of transfer tuning?

Dump_network_info for bert_large leads to seg fault

I'm trying to run dump_network_info on an A100 machine running Red Hat. The script runs fine for all other networks, but causes a segmentation fault for bert_large.
Has anyone faced this before or do you have any idea as to why it might be happening?

Get started with cost model experiments

Get started

This tutorial contains a minimal example of training a cost model and using it for search.

Dataset

Install and Download

  1. Build and install this repo following the install guide of TVM.
  2. Download the dataset file (a Python sketch for doing this programmatically follows this list).
    • You can download it from Google Drive
    • Or you can use the command line
    pip3 install gdown
    gdown https://drive.google.com/uc?id=1hciRGyXcGY9fK_owgvlJow8P_l8xYIVJ
    
  3. Put dataset_v3.1.zip under tvm-cost-model/scripts and run unzip dataset_v3.1.zip
    A new folder dataset will appear in tvm-cost-model/scripts.
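If you prefer to script the download, a minimal Python sketch follows. It assumes gdown is installed (pip3 install gdown) and reuses the Google Drive id from the command above.

    import zipfile

    import gdown

    # Download dataset_v3.1.zip into the current directory, then unpack it;
    # this creates the `dataset` folder described in the next section.
    url = "https://drive.google.com/uc?id=1hciRGyXcGY9fK_owgvlJow8P_l8xYIVJ"
    gdown.download(url, "dataset_v3.1.zip", quiet=False)

    with zipfile.ZipFile("dataset_v3.1.zip") as zf:
        zf.extractall(".")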

Dataset Content

see this readme

Example experiment

Train a cost model and use it for search

Go to tvm-cost-model/scripts.

  1. Make a dataset
     You can either
     • create a sampled smaller dataset for fast experiments:
       python3 make_dataset.py --logs dataset/measure_records/e5-2673/*.json --sample-in-files 100
     • or create a complete dataset by using all files. This takes a longer time and requires more memory:
       python3 make_dataset.py --logs dataset/measure_records/e5-2673/*.json
  2. Train a cost model
     python3 train_model.py
  3. Use the model for search (a Python sketch chaining all three steps follows below)
     python3 tune_network.py --network resnet_50 --n-trials 100 --cost-model xgb-no-update --load-model xgb.pkl
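For reference, the three steps above can be chained from a small Python script. This is only a sketch that shells out to the documented commands; run it from tvm-cost-model/scripts, and note that the file name xgb.pkl is assumed to be what train_model.py produces (step 3 loads it).

    import glob
    import subprocess

    # Step 1: build a sampled dataset from the e5-2673 records (fast variant).
    logs = glob.glob("dataset/measure_records/e5-2673/*.json")
    subprocess.run(["python3", "make_dataset.py", "--logs", *logs,
                    "--sample-in-files", "100"], check=True)

    # Step 2: train a cost model (assumed to write xgb.pkl).
    subprocess.run(["python3", "train_model.py"], check=True)

    # Step 3: use the trained model to guide the search.
    subprocess.run(["python3", "tune_network.py", "--network", "resnet_50",
                    "--n-trials", "100", "--cost-model", "xgb-no-update",
                    "--load-model", "xgb.pkl"], check=True)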

References

When executing `python measure_programs.py "--target=cuda"`, I get some errors.

Hi, mercy

When executing python measure_programs.py "--target=cuda", I get some errors: a TVM error and a timeout error.
I tried to increase the build timeout and the run timeout to 30 seconds, but there is no difference.

What target should I specify when measuring programs on a 2080 Ti? I didn't find documentation on setting the target when using CUDA. Is it enough to just set --target=cuda?

Thank you, mercy~

  6: ffi_call_unix64
  5: TVMArrayFree
        at /root/tenset/src/runtime/ndarray.cc:295
  4: _ZN3tvm7runtime7NDArray8Int
  3: tvm::runtime::NDArray::FFIDecRef(DLTensor*)
        at /root/tenset/include/tvm/runtime/ndarray.h:383
  2: tvm::runtime::Object::DecRef()
        at /root/tenset/include/tvm/runtime/object.h:781
  1: tvm::runtime::NDArray::Internal::DefaultDeleter(tvm::runtime::Object*)
  0: tvm::runtime::CUDADeviceAPI::FreeDataSpace(DLContext, void*)
        at /root/tenset/src/runtime/cuda/cuda_device_api.cc:127
  File "/root/tenset/src/runtime/cuda/cuda_device_api.cc", line 127
TVMError: ---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
  Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading == false: CUDA: misaligned address
Exception ignored in: <function NDArrayBase.__del__ at 0x7f92d8afc820>
Traceback (most recent call last):
  File "/root/tenset/python/tvm/_ffi/_ctypes/ndarray.py", line 82, in __del__
    check_call(_LIB.TVMArrayFree(self.handle))
  File "/root/tenset/python/tvm/_ffi/base.py", line 346, in check_call
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  43: 0xffffffffffffffff
  42: __clone
  41: start_thread
        at /build/glibc-uZu3wS/glibc-2.27/nptl/pthread_create.c:463
  40: 0x0000000000619066
  39: 0x00000000006390f7
  38: PyObject_Call
  37: 0x0000000000537506
  36: _PyObject_Call_Prepend
  35: _PyEval_EvalFrameDefault
  34: PyVectorcall_Call
  33: 0x00000000005ce97f
  32: _PyFunction_Vectorcall
  31: 0x000000000045b756
  30: 0x0000000000609bbf
  29: _PyFunction_Vectorcall
  28: 0x000000000045b756
  27: 0x0000000000609bbf
  26: _PyFunction_Vectorcall
  25: _PyEval_EvalFrameDefault
  24: PyVectorcall_Call
  23: _PyFunction_Vectorcall
  22: _PyEval_EvalCodeWithName
  21: _PyEval_EvalFrameDefault
  20: PyVectorcall_Call
  19: _PyFunction_Vectorcall
  18: 0x0000000000500cb4
  17: 0x0000000000501310
  16: 0x0000000000535307
  15: PyObject_CallFinalizerFromDealloc
  14: 0x00000000005fa0c5
  13: 0x0000000000535c2b
  12: _PyFunction_Vectorcall
  11: 0x000000000045c107
  10: _PyObject_MakeTpCall
  9: 0x00007f939f8cc763
  8: _ctypes_callproc
  7: ffi_call
  6: ffi_call_unix64
  5: TVMArrayFree
        at /root/tenset/src/runtime/ndarray.cc:295
  4: _ZN3tvm7runtime7NDArray8Int
  3: tvm::runtime::NDArray::FFIDecRef(DLTensor*)
        at /root/tenset/include/tvm/runtime/ndarray.h:383
  2: tvm::runtime::Object::DecRef()
        at /root/tenset/include/tvm/runtime/object.h:781
  1: tvm::runtime::NDArray::Internal::DefaultDeleter(tvm::runtime::Object*)
  0: tvm::runtime::CUDADeviceAPI::FreeDataSpace(DLContext, void*)
        at /root/tenset/src/runtime/cuda/cuda_device_api.cc:127
  File "/root/tenset/src/runtime/cuda/cuda_device_api.cc", line 127
TVMError: ---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
  Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading == false: CUDA: misaligned address
Exception ignored in: <function NDArrayBase.__del__ at 0x7f92d8afc820>
Traceback (most recent call last):
  File "/root/tenset/python/tvm/_ffi/_ctypes/ndarray.py", line 82, in __del__
    check_call(_LIB.TVMArrayFree(self.handle))
  File "/root/tenset/python/tvm/_ffi/base.py", line 346, in check_call
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  42: 0xffffffffffffffff
  41: __clone
  40: start_thread
        at /build/glibc-uZu3wS/glibc-2.27/nptl/pthread_create.c:463
  39: 0x0000000000619066
  38: 0x00000000006390f7
  37: PyObject_Call
  36: 0x0000000000537506
  35: _PyObject_Call_Prepend
  34: _PyEval_EvalFrameDefault
  33: PyVectorcall_Call
  32: 0x00000000005ce97f
  31: _PyFunction_Vectorcall
  30: 0x000000000045b756
  29: 0x0000000000609bbf
  28: _PyFunction_Vectorcall
  27: 0x000000000045b756
  26: 0x0000000000609bbf
  25: _PyFunction_Vectorcall
  24: _PyEval_EvalFrameDefault
  23: PyVectorcall_Call
  22: _PyFunction_Vectorcall
  21: _PyEval_EvalCodeWithName
  20: _PyEval_EvalFrameDefault
  19: PyVectorcall_Call
  18: _PyFunction_Vectorcall
  17: 0x0000000000500cb4
  16: 0x0000000000535307
  15: PyObject_CallFinalizerFromDealloc
  14: 0x00000000005fa0c5
  13: 0x0000000000535c2b
  12: _PyFunction_Vectorcall
  11: 0x000000000045c107
  10: _PyObject_MakeTpCall
  9: 0x00007f939f8cc763
  8: _ctypes_callproc
  7: ffi_call
  6: ffi_call_unix64
  5: TVMArrayFree
        at /root/tenset/src/runtime/ndarray.cc:295
  4: _ZN3tvm7runtime7NDArray8Int
  3: tvm::runtime::NDArray::FFIDecRef(DLTensor*)
        at /root/tenset/include/tvm/runtime/ndarray.h:383
  2: tvm::runtime::Object::DecRef()
        at /root/tenset/include/tvm/runtime/object.h:781
  1: tvm::runtime::NDArray::Internal::DefaultDeleter(tvm::runtime::Object*)
  0: tvm::runtime::CUDADeviceAPI::FreeDataSpace(DLContext, void*)
        at /root/tenset/src/runtime/cuda/cuda_device_api.cc:127
  File "/root/tenset/src/runtime/cuda/cuda_device_api.cc", line 127
TVMError: ---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
  Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading == false: CUDA: misaligned address
terminate called after throwing an instance of 'tvm::runtime::InternalError'
  what():  [16:04:22] /root/tenset/src/runtime/cuda/cuda_module.cc:61: CUDAError: cuModuleUnload(module_[i]) failed with error: CUDA_ERROR_MISALIGNED_ADDRESS
Stack trace:
  0: tvm::runtime::CUDAModuleNode::~CUDAModuleNode()
        at /root/tenset/src/runtime/cuda/cuda_module.cc:61
  1: tvm::runtime::SimpleObjAllocator::Handler<tvm::runtime::CUDAModuleNode>::Deleter_(tvm::runtime::Object*)
        at /root/tenset/include/tvm/runtime/memory.h:138
  2: tvm::runtime::Object::DecRef()
        at /root/tenset/include/tvm/runtime/object.h:781
  3: tvm::runtime::ObjectPtr<tvm::runtime::Object>::reset()
        at /root/tenset/include/tvm/runtime/object.h:442
  4: tvm::runtime::ObjectPtr<tvm::runtime::Object>::~ObjectPtr()
        at /root/tenset/include/tvm/runtime/object.h:396
  5: tvm::runtime::ObjectRef::~ObjectRef()
        at /root/tenset/include/tvm/runtime/object.h:502
  6: tvm::runtime::Module::~Module()
        at /root/tenset/include/tvm/runtime/module.h:48
  7: void std::_Destroy<tvm::runtime::Module>(tvm::runtime::Module*)
        at /usr/include/c++/7/bits/stl_construct.h:98
  8: void std::_Destroy_aux<false>::__destroy<tvm::runtime::Module*>(tvm::runtime::Module*, tvm::runtime::Module*)
        at /usr/include/c++/7/bits/stl_construct.h:108
  9: void std::_Destroy<tvm::runtime::Module*>(tvm::runtime::Module*, tvm::runtime::Module*)
        at /usr/include/c++/7/bits/stl_construct.h:137
  10: void std::_Destroy<tvm::runtime::Module*, tvm::runtime::Module>(tvm::runtime::Module*, tvm::runtime::Module*, std::allocator<tvm::runtime::Module>&)
        at /usr/include/c++/7/bits/stl_construct.h:206
  11: std::vector<tvm::runtime::Module, std::allocator<tvm::runtime::Module> >::~vector()
        at /usr/include/c++/7/bits/stl_vector.h:434
  12: tvm::runtime::ModuleNode::~ModuleNode()
        at /root/tenset/include/tvm/runtime/module.h:114
  13: tvm::runtime::LibraryModuleNode::~LibraryModuleNode()
        at /root/tenset/src/runtime/library_module.cc:38
  14: tvm::runtime::SimpleObjAllocator::Handler<tvm::runtime::LibraryModuleNode>::Deleter_(tvm::runtime::Object*)
        at /root/tenset/include/tvm/runtime/memory.h:138
  15: tvm::runtime::Object::DecRef()
        at /root/tenset/include/tvm/runtime/object.h:781
  16: tvm::runtime::ObjectPtr<tvm::runtime::Object>::reset()
        at /root/tenset/include/tvm/runtime/object.h:442
  17: tvm::runtime::ObjectPtr<tvm::runtime::Object>::~ObjectPtr()
        at /root/tenset/include/tvm/runtime/object.h:396
  18: ~<lambda>
        at /root/tenset/src/runtime/library_module.cc:73
  19: _M_destroy
        at /usr/include/c++/7/bits/std_function.h:207
  20: _M_manager
        at /usr/include/c++/7/bits/std_function.h:231
  21: std::_Function_base::~_Function_base()
        at /usr/include/c++/7/bits/std_function.h:276
  22: std::function<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::~function()
        at /usr/include/c++/7/bits/std_function.h:389
  23: tvm::runtime::PackedFunc::~PackedFunc()
        at /root/tenset/include/tvm/runtime/packed_func.h:75
  24: ~<lambda>
        at /root/tenset/src/runtime/rpc/rpc_module.cc:370
  25: _M_destroy
        at /usr/include/c++/7/bits/std_function.h:207
  26: _M_manager
        at /usr/include/c++/7/bits/std_function.h:231
  27: std::_Function_base::~_Function_base()
        at /usr/include/c++/7/bits/std_function.h:276
  28: std::function<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::~function()
        at /usr/include/c++/7/bits/std_function.h:389
  29: tvm::runtime::PackedFunc::~PackedFunc()
        at /root/tenset/include/tvm/runtime/packed_func.h:75
  30: TVMFuncFree
        at /root/tenset/src/runtime/c_runtime_api.cc:463
  31: ffi_call_unix64
  32: ffi_call
  33: _ctypes_callproc
  34: 0x00007f939f8cc763
  35: _PyObject_MakeTpCall
  36: 0x000000000045c107
  37: _PyFunction_Vectorcall
  38: 0x0000000000535c2b
  39: 0x00000000005fa0c5
  40: PyObject_CallFinalizerFromDealloc
  41: 0x0000000000535307
  42: 0x00000000005ce6d4
  43: 0x000000000052cd24
  44: 0x00000000005d4882
  45: 0x0000000000500cb4
  46: _PyFunction_Vectorcall
  47: PyVectorcall_Call
  48: _PyEval_EvalFrameDefault
  49: _PyEval_EvalCodeWithName
  50: _PyFunction_Vectorcall
  51: PyVectorcall_Call
  52: _PyEval_EvalFrameDefault
  53: _PyFunction_Vectorcall
  54: 0x0000000000609bbf
  55: 0x000000000045b756
  56: _PyFunction_Vectorcall
  57: 0x0000000000609bbf
  58: 0x000000000045b756
  59: _PyFunction_Vectorcall
  60: 0x00000000005ce97f
  61: PyVectorcall_Call
  62: _PyEval_EvalFrameDefault
  63: _PyObject_Call_Prepend
  64: 0x0000000000537506
  65: PyObject_Call
  66: 0x00000000006390f7
  67: 0x0000000000619066
  68: start_thread
        at /build/glibc-uZu3wS/glibc-2.27/nptl/pthread_create.c:463
  69: __clone
  70: 0xffffffffffffffff


*T*T*T*T*T*T*E*T*T*T*T*E*T*T*E*T*T*T*E*T*E*T*T*T*T*E*T*E*T*T*T*E*T*T*T*T*E*T*E*E*T*EException ignored in: <function NDArrayBase.__del__ at 0x7f92d8afc820>
Traceback (most recent call last):
  File "/root/tenset/python/tvm/_ffi/_ctypes/ndarray.py", line 82, in __del__
    check_call(_LIB.TVMArrayFree(self.handle))
  File "/root/tenset/python/tvm/_ffi/base.py", line 346, in check_call
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  43: 0xffffffffffffffff
  42: __clone
  41: start_thread
        at /build/glibc-uZu3wS/glibc-2.27/nptl/pthread_create.c:463
  40: 0x0000000000619066
  39: 0x00000000006390f7
  38: PyObject_Call
  37: 0x0000000000537506
  36: _PyObject_Call_Prepend
  35: _PyEval_EvalFrameDefault
  34: PyVectorcall_Call
  33: 0x00000000005ce97f
  32: _PyFunction_Vectorcall
  31: 0x000000000045b756
  30: 0x0000000000609bbf
  29: _PyFunction_Vectorcall
  28: 0x000000000045b756
  27: 0x0000000000609bbf
  26: _PyFunction_Vectorcall
  25: _PyEval_EvalFrameDefault
  24: PyVectorcall_Call
  23: _PyFunction_Vectorcall
  22: _PyEval_EvalCodeWithName
  21: _PyEval_EvalFrameDefault
  20: PyVectorcall_Call
  19: _PyFunction_Vectorcall
  18: 0x0000000000500cb4
  17: 0x0000000000501310
  16: 0x0000000000535307
  15: PyObject_CallFinalizerFromDealloc
  14: 0x00000000005fa0c5
  13: 0x0000000000535c2b
  12: _PyFunction_Vectorcall
  11: 0x000000000045c107
  10: _PyObject_MakeTpCall
  9: 0x00007f939f8cc763
  8: _ctypes_callproc
  7: ffi_call
  6: ffi_call_unix64
  5: TVMArrayFree
        at /root/tenset/src/runtime/ndarray.cc:295
  4: _ZN3tvm7runtime7NDArray8Int
  3: tvm::runtime::NDArray::FFIDecRef(DLTensor*)
        at /root/tenset/include/tvm/runtime/ndarray.h:383
  2: tvm::runtime::Object::DecRef()
        at /root/tenset/include/tvm/runtime/object.h:781
  1: tvm::runtime::NDArray::Internal::DefaultDeleter(tvm::runtime::Object*)
  0: tvm::runtime::CUDADeviceAPI::FreeDataSpace(DLContext, void*)
        at /root/tenset/src/runtime/cuda/cuda_device_api.cc:127
  File "/root/tenset/src/runtime/cuda/cuda_device_api.cc", line 127
TVMError: ---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
  Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading == false: CUDA: misaligned address
Exception ignored in: <function NDArrayBase.__del__ at 0x7f92d8afc820>
Traceback (most recent call last):
  File "/root/tenset/python/tvm/_ffi/_ctypes/ndarray.py", line 82, in __del__
    check_call(_LIB.TVMArrayFree(self.handle))
  File "/root/tenset/python/tvm/_ffi/base.py", line 346, in check_call
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  43: 0xffffffffffffffff
  42: __clone
  41: start_thread
        at /build/glibc-uZu3wS/glibc-2.27/nptl/pthread_create.c:463
  40: 0x0000000000619066
  39: 0x00000000006390f7
  38: PyObject_Call
  37: 0x0000000000537506
  36: _PyObject_Call_Prepend
  35: _PyEval_EvalFrameDefault
  34: PyVectorcall_Call
  33: 0x00000000005ce97f
  32: _PyFunction_Vectorcall
  31: 0x000000000045b756
  30: 0x0000000000609bbf
  29: _PyFunction_Vectorcall
  28: 0x000000000045b756
  27: 0x0000000000609bbf
  26: _PyFunction_Vectorcall
  25: _PyEval_EvalFrameDefault
  24: PyVectorcall_Call
  23: _PyFunction_Vectorcall
  22: _PyEval_EvalCodeWithName
  21: _PyEval_EvalFrameDefault
  20: PyVectorcall_Call
  19: _PyFunction_Vectorcall
  18: 0x0000000000500cb4
  17: 0x0000000000501310
  16: 0x0000000000535307
  15: PyObject_CallFinalizerFromDealloc
  14: 0x00000000005fa0c5
  13: 0x0000000000535c2b
  12: _PyFunction_Vectorcall
  11: 0x000000000045c107
  10: _PyObject_MakeTpCall
  9: 0x00007f939f8cc763
  8: _ctypes_callproc
  7: ffi_call
  6: ffi_call_unix64
  5: TVMArrayFree
        at /root/tenset/src/runtime/ndarray.cc:295
  4: _ZN3tvm7runtime7NDArray8Int
  3: tvm::runtime::NDArray::FFIDecRef(DLTensor*)
        at /root/tenset/include/tvm/runtime/ndarray.h:383
  2: tvm::runtime::Object::DecRef()
        at /root/tenset/include/tvm/runtime/object.h:781
  1: tvm::runtime::NDArray::Internal::DefaultDeleter(tvm::runtime::Object*)
  0: tvm::runtime::CUDADeviceAPI::FreeDataSpace(DLContext, void*)
        at /root/tenset/src/runtime/cuda/cuda_device_api.cc:127
  File "/root/tenset/src/runtime/cuda/cuda_device_api.cc", line 127
TVMError: ---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.

Training a model with a big dataset reports "Segmentation fault (core dumped)"

Hi, mercy

I am running into some problems while training the model. When I make the dataset, if I add the option --sample-in-files 100, python train_model.py works fine. But if I add the option --hold-out all_five, python train_model.py reports errors. Below are the details:

(base) zhaiyi@linke8:~/tenset/scripts$ CUDA_VISIBLE_DEVICES='7' python /data/workspace/zhaiyi/tenset/scripts/train_model.py --dataset=dataset_all.pkl
Arguments: Namespace(dataset=['dataset_all.pkl'], models='mlp', seed=0, split_scheme='within_task', train_ratio=0.9, use_gpu=False)
Load all tasks...
Load dataset...
Train set: 7415211. Task 0 = LearningTask(workload_key='["142c0886579d3901e9f6db0e30878395", 1, 8, 8, 512, 3, 3, 512, 512, 1, 1, 1, 512, 1, 8, 8, 512, 1, 8, 8, 512]', target='llvm -keys=cpu -link-params=0 -mcpu=skylake-avx512 -model=platinum-8272')
Test set:  823967. Task 0 = LearningTask(workload_key='["142c0886579d3901e9f6db0e30878395", 1, 8, 8, 512, 3, 3, 512, 512, 1, 1, 1, 512, 1, 8, 8, 512, 1, 8, 8, 512]', target='llvm -keys=cpu -link-params=0 -mcpu=skylake-avx512 -model=platinum-8272')
Segmentation fault (core dumped)

Then I tried to debug with VS Code breakpoints, and it reported an error at this line:

if torch.cuda.device_count():

if device is None:
    if torch.cuda.device_count():
        device = 'cuda:0'
    else:
        device = 'cpu'
print(device)

Then I modified the source code as below, but it reported another error:

class MLPModelInternal:
    def __init__(self, device=None, few_shot_learning="base_only", use_workload_embedding=True, use_target_embedding=False,
                 loss_type='lambdaRankLoss'):
        if device is None:
            # if torch.cuda.device_count():
            #     device = 'cuda:0'
            # else:
            #     device = 'cpu'
            device = 'cuda:0'
        print(device)
(base) zhaiyi@linke8:~/tenset/scripts$ CUDA_VISIBLE_DEVICES='7' python /data/workspace/zhaiyi/tenset/scripts/train_model.py --dataset=dataset_all.pkl
Arguments: Namespace(dataset=['dataset_all.pkl'], models='mlp', seed=0, split_scheme='within_task', train_ratio=0.9, use_gpu=False)
Load all tasks...
Load dataset...
Train set: 7415211. Task 0 = LearningTask(workload_key='["142c0886579d3901e9f6db0e30878395", 1, 8, 8, 512, 3, 3, 512, 512, 1, 1, 1, 512, 1, 8, 8, 512, 1, 8, 8, 512]', target='llvm -keys=cpu -link-params=0 -mcpu=skylake-avx512 -model=platinum-8272')
Test set:  823967. Task 0 = LearningTask(workload_key='["142c0886579d3901e9f6db0e30878395", 1, 8, 8, 512, 3, 3, 512, 512, 1, 1, 1, 512, 1, 8, 8, 512, 1, 8, 8, 512]', target='llvm -keys=cpu -link-params=0 -mcpu=skylake-avx512 -model=platinum-8272')
cuda:0
============================================================
Fit a net. Train size: 7415211
malloc(): invalid next size (unsorted)
Aborted (core dumped)
Environment:
Ubuntu 20.04.1 LTS
CUDA Version: 11.4
torch                              1.8.2+cu111
torchaudio                         0.8.2
torchvision                        0.9.2+cu111
memory size: 504 GB

I don't think this error is related to torch, because there is no error during training if I add the option --sample-in-files 100.
Do you have any advice on this issue? Thank you, mercy.

Is there a way to look at the generated programs?

Is there a way to look at the generated programs/subgraphs? For example, if there is some CUDA/Python code that's generated and measured, is it possible to look at that code file?

Apologies if the answer to this is obvious; I'm really new to TVM.

Thanks,
Akash

Get "Only support 64x64 image" error when dumping network infor of dcgan

When I was dumping the network information of dcgan using dump_network_info.py (I modified this file to only dump the information of dcgan), I got the following error:

Traceback (most recent call last):
File "dump_network_info.py", line 243, in
dump_network(key, target)
File "dump_network_info.py", line 122, in dump_network
mod, params, inputs = get_network_with_key(network_key)
File "dump_network_info.py", line 103, in get_network_with_key
mod, params = relay.testing.dcgan.get_workload(
File "/usr/tvm/python/tvm/relay/testing/dcgan.py", line 170, in get_workload
net = get_net(batch_size, random_len, oshape=oshape, ngf=ngf, layout=layout, dtype=dtype)
File "/usr/tvm/python/tvm/relay/testing/dcgan.py", line 87, in get_net
assert oshape[-1] == 64, "Only support 64x64 image"
AssertionError: Only support 64x64 image

Can I directly delete the assert statements in the file "/usr/tvm/python/tvm/relay/testing/dcgan.py"?

Thanks!
