When I ran python run.py --is_finetune True, I got the following error.
C:\Users\94323\anaconda3\lib\site-packages\torch\nn\functional.py:1805: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [0,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [1,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [2,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [4,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [5,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [6,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [7,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [8,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [9,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [10,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [11,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [12,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [13,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [14,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [15,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [16,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [17,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [18,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [19,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [20,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [22,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [24,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [25,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: block: [0,0,0], thread: [26,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
0%| | 0/1431 [00:01<?, ?it/s]
Traceback (most recent call last):
File "run.py", line 916, in <module>
loop.train()
File "run.py", line 460, in train
self.backward(joint_loss)
File "run.py", line 854, in backward
loss.backward()
File "C:\Users\94323\anaconda3\lib\site-packages\torch\_tensor.py", line 255, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "C:\Users\94323\anaconda3\lib\site-packages\torch\autograd\__init__.py", line 147, in backward
Variable._execution_engine.run_backward(
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
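The last line of the error message suggests CUDA_LAUNCH_BLOCKING=1, which makes kernel launches synchronous so that the Python traceback points at the call that actually failed. A minimal sketch of one way to set it, assuming the variable has to be in place before CUDA is initialised and is therefore set before torch is imported (the commented lines are placeholders, not the actual contents of run.py):

```python
import os

# Assumption: the variable is read when CUDA is initialised, so it is set
# here before torch is imported anywhere in the process.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch          # import torch only after the variable is set
# ... build the model and run the training loop as usual ...

print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
```

Setting the variable in the shell before launching python run.py should have the same effect. Training runs slower in this mode, so it is only meant for debugging.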
I traced the error and found that it happens at model.py line 636.
I searched online for a fix. Some people say that the target values (movies_gth) must satisfy 0 <= target[i] <= C-1, but when I was debugging, movies_gth was
"tensor([28207, 22727, 64362, 4646, 64362, 64362, 8404, 64362, 51711, 40569,....])"
and C is 6924, so most of these values are far larger than C-1.
I don’t know if I understand the problem correctly, but I hope somebody can help me solve it.
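The condition in the assertion from the log (cur_target >= 0 && cur_target < n_classes) can be checked on the CPU before the loss is computed. The sketch below is only an illustration: find_out_of_range_targets is a hypothetical helper, not something from run.py or model.py, and the list holds just the ten values visible in the truncated tensor printout. With a real tensor one would pass movies_gth.tolist() instead.

```python
def find_out_of_range_targets(targets, n_classes):
    """Return (position, value) pairs for targets outside [0, n_classes - 1]."""
    return [(i, t) for i, t in enumerate(targets) if not 0 <= t < n_classes]


# The ten values visible in the truncated printout of movies_gth.
movies_gth = [28207, 22727, 64362, 4646, 64362, 64362, 8404, 64362, 51711, 40569]
n_classes = 6924  # the value of C reported while debugging

bad = find_out_of_range_targets(movies_gth, n_classes)
print(f"{len(bad)} of {len(movies_gth)} targets are out of range")
print("first offending (position, value) pairs:", bad[:3])
```

If a check like this reports out-of-range targets, the labels and the size of the model's output layer do not agree, for example because the labels are raw IDs rather than indices into the C classes. That is an assumption to verify against the data loading code, not a confirmed diagnosis.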