cogito2012 / dear

[ICCV 2021 Oral] Deep Evidential Action Recognition

License: Apache License 2.0

Python 90.30% Dockerfile 0.04% Shell 5.37% Makefile 0.01% C++ 1.10% C 2.77% Cython 0.42%

action-recognition debiasing evidential-deep-learning model-calibration ood-detection openset-recognition uncertainty-quantification video-understanding

dear's Issues

ImportError: cannot import name 'version' from 'mmaction' (unknown location)

cd experiments/i3d
bash finetune_i3d_edlnokl_avuc_debias_ucf101.sh 0
Error:
Traceback (most recent call last):
  File "tools/train.py", line 16, in <module>
    from mmaction import version
ImportError: cannot import name 'version' from 'mmaction' (unknown location)
Experiments finished!
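
The "(unknown location)" part of the message usually means that Python resolved mmaction to a directory without an __init__.py (a namespace package) rather than to the installed package. The following is a minimal diagnostic sketch, not part of the repository; describe_package is a hypothetical helper name, and it only reports how a package name resolves on the current sys.path.

```python
import importlib.util


def describe_package(name):
    """Return a short diagnosis of how `name` resolves on the current sys.path."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not importable"
    # A namespace package has no __init__.py, so its spec has no concrete origin file.
    if spec.origin in (None, "namespace"):
        return "namespace package (no __init__.py found)"
    return f"regular package at {spec.origin}"


if __name__ == "__main__":
    print(describe_package("mmaction"))
```

If this reports a namespace package, it may help to run it from the repository root and to check that the repository's own mmaction package was installed into the active environment.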

Error installing mmcv-full

Following the installation directions in the GitHub repository, I ran the command:

pip install mmcv-full==1.2.1 -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7.1/index.html

The command fails with the following error:

Looking in links: https://download.openmmlab.com/mmcv/dist/cu110/torch1.7.1/index.html
Collecting mmcv-full==1.2.1
  Using cached mmcv-full-1.2.1.tar.gz (251 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [22 lines of output]
      /tmp/pip-install-6jmensbp/mmcv-full_883ba128e6c24b62bdb5675c066a03d8/setup.py:8: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
      !!
      
              ********************************************************************************
              Requirements should be satisfied by a PEP 517 installer.
              If you are using pip, you can try `pip install --use-pep517`.
              ********************************************************************************
      
      !!
        dist.Distribution().fetch_build_eggs(['Cython', 'numpy>=1.11.1'])
      Traceback (most recent call last):
        File "<string>", line 36, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-6jmensbp/mmcv-full_883ba128e6c24b62bdb5675c066a03d8/setup.py", line 15, in <module>
          import torch
        File "/home/username/.conda/envs/mmaction/lib/python3.7/site-packages/torch/__init__.py", line 189, in <module>
          _load_global_deps()
        File "/home/username/.conda/envs/mmaction/lib/python3.7/site-packages/torch/__init__.py", line 142, in _load_global_deps
          ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
        File "/home/username/.conda/envs/mmaction/lib/python3.7/ctypes/__init__.py", line 364, in __init__
          self._handle = _dlopen(self._name, mode)
      OSError: /home/username/.conda/envs/mmaction/lib/python3.7/site-packages/torch/lib/../../../../libcublas.so.11: symbol free_gemm_select, version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Retrying with the suggested --use-pep517 option fails again:

Looking in links: https://download.openmmlab.com/mmcv/dist/cu110/torch1.7.1/index.html
Collecting mmcv-full==1.2.1
  Using cached mmcv-full-1.2.1.tar.gz (251 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error
  
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [51 lines of output]
      /home/username/.conda/envs/mmaction/bin/python: No module named pip
      /tmp/pip-install-wnszk78b/mmcv-full_641186a7621946769760a8dd87ff86a0/setup.py:8: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
      !!
      
              ********************************************************************************
              Requirements should be satisfied by a PEP 517 installer.
              If you are using pip, you can try `pip install --use-pep517`.
              ********************************************************************************
      
      !!
        dist.Distribution().fetch_build_eggs(['Cython', 'numpy>=1.11.1'])
      Traceback (most recent call last):
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/setuptools/installer.py", line 96, in _fetch_build_egg_no_warn
          subprocess.check_call(cmd)
        File "/home/username/.conda/envs/mmaction/lib/python3.7/subprocess.py", line 363, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['/home/username/.conda/envs/mmaction/bin/python', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmplek5d5rn', '--quiet', 'numpy>=1.11.1']' returned non-zero exit status 1.
      
      The above exception was the direct cause of the following exception:
      
      Traceback (most recent call last):
        File "/home/username/.conda/envs/mmaction/lib/python3.7/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/username/.conda/envs/mmaction/lib/python3.7/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/username/.conda/envs/mmaction/lib/python3.7/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/setuptools/build_meta.py", line 341, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/setuptools/build_meta.py", line 323, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/setuptools/build_meta.py", line 488, in run_setup
          self).run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/setuptools/build_meta.py", line 338, in run_setup
          exec(code, locals())
        File "<string>", line 8, in <module>
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/setuptools/dist.py", line 917, in fetch_build_eggs
          return _fetch_build_eggs(self, requires)
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/setuptools/installer.py", line 41, in _fetch_build_eggs
          replace_conflicting=True,
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/pkg_resources/__init__.py", line 828, in resolve
          req, best, replace_conflicting, env, installer, required_by, to_activate
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/pkg_resources/__init__.py", line 864, in _resolve_dist
          req, ws, installer, replace_conflicting=replace_conflicting
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/pkg_resources/__init__.py", line 1133, in best_match
          return self.obtain(req, installer)
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/pkg_resources/__init__.py", line 1145, in obtain
          return installer(requirement)
        File "/tmp/pip-build-env-9fxmm81s/overlay/lib/python3.7/site-packages/setuptools/installer.py", line 98, in _fetch_build_egg_no_warn
          raise DistutilsError(str(e)) from e
      distutils.errors.DistutilsError: Command '['/home/username/.conda/envs/mmaction/bin/python', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmplek5d5rn', '--quiet', 'numpy>=1.11.1']' returned non-zero exit status 1.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Am I doing something wrong, or have the installation instructions changed?
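
Both tracebacks above fail before mmcv-full itself is built: the first while setup.py runs "import torch" (the libcublas OSError), the second with "No module named pip" in the same environment. A minimal sketch for checking those two imports in a fresh interpreter before retrying the install is shown below; can_import is a hypothetical helper, and it does not fix the underlying library mismatch.

```python
import subprocess
import sys


def can_import(module_name):
    """Try `import module_name` in a fresh interpreter; return (ok, last stderr line)."""
    proc = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    lines = proc.stderr.strip().splitlines()
    return proc.returncode == 0, (lines[-1] if lines else "")


if __name__ == "__main__":
    # These are the two modules the logs above show failing.
    for name in ("torch", "pip"):
        ok, err = can_import(name)
        print(f"{name}: {'ok' if ok else 'FAILED: ' + err}")
```

If "torch" fails here with the same OSError, the problem most likely lies in the PyTorch/CUDA installation of the environment rather than in the mmcv-full command.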

It seems there are no 3D operations in DebiasHead.

It seems that DebiasHead is used to implement CED.
However, it does not appear to contain any truly 3D operations, even though some modules (self.f1_conv3d, self.f2_conv3d) have '3d' in their names, because the temporal size of their convolution kernels is 1.

As a result, shuffling the features along the temporal dimension does not actually have any effect. In fact, there seems to be no difference between these three branches: 1. (f1_conv3d-->avg_pool-->fc1), 2. (temporal shuffling-->f2_conv3d-->avg_pool-->fc2), 3. (reshape-->f3_conv2d-->avg_pool-->fc3).

Here is the code in

https://github.com/Cogito2012/DEAR/tree/master/mmaction/models/heads/debias_head.py:

@HEADS.register_module()
class DebiasHead(BaseHead):
    """Debias head.

    Args:
        num_classes (int): Number of classes to be classified.
        in_channels (int): Number of channels in input feature.
        loss_cls (dict): Config for building loss.
            Default: dict(type='EvidenceLoss')
        spatial_type (str): Pooling type in spatial dimension. Default: 'avg'.
        dropout_ratio (float): Probability of dropout layer. Default: 0.5.
        init_std (float): Std value for Initiation. Default: 0.01.
        kwargs (dict, optional): Any keyword argument to be used to initialize
            the head.
    """

    def __init__(self,
                 num_classes,
                 in_channels,
                 loss_cls=dict(type='EvidenceLoss'),
                 loss_factor=0.1,
                 hsic_factor=0.5,  # useful when alternative=True
                 alternative=False,
                 bias_input=True,
                 bias_network=True,
                 dropout_ratio=0.5,
                 init_std=0.01,
                 **kwargs):
        super().__init__(num_classes, in_channels, loss_cls, **kwargs)
        self.bias_input = bias_input
        self.bias_network = bias_network
        assert bias_input or bias_network, "At least one of the choices (bias_input, bias_network) should be True!"
        self.loss_factor = loss_factor
        self.hsic_factor = hsic_factor
        self.alternative = alternative
        self.f1_conv3d = ConvModule(
            in_channels,
            in_channels * 2, (1, 3, 3),
            stride=(1, 2, 2),
            padding=(0, 1, 1),
            bias=False,
            conv_cfg=dict(type='Conv3d'),
            norm_cfg=dict(type='BN3d', requires_grad=True))
        if bias_input:
            self.f2_conv3d = ConvModule(
                in_channels,
                in_channels * 2, (1, 3, 3),
                stride=(1, 2, 2),
                padding=(0, 1, 1),
                bias=False,
                conv_cfg=dict(type='Conv3d'),
                norm_cfg=dict(type='BN3d', requires_grad=True))
        if bias_network:
            self.f3_conv2d = ConvModule(
                in_channels,
                in_channels * 2, (3, 3),
                stride=(2, 2),
                padding=(1, 1),
                bias=False,
                conv_cfg=dict(type='Conv2d'),
                norm_cfg=dict(type='BN', requires_grad=True))
        self.dropout_ratio = dropout_ratio
        self.init_std = init_std
        if self.dropout_ratio != 0:
            self.dropout = nn.Dropout(p=self.dropout_ratio)
        else:
            self.dropout = None
        self.f1_fc = nn.Linear(self.in_channels * 2, self.num_classes)
        self.f2_fc = nn.Linear(self.in_channels * 2, self.num_classes)
        self.f3_fc = nn.Linear(self.in_channels * 2, self.num_classes)
        self.avg_pool = nn.AdaptiveAvgPool3d((1, 1, 1))
      

    .............

    def forward(self, x, num_segs=None, target=None, **kwargs):
        """Defines the computation performed at every call.

        Args:
            x (torch.Tensor): The input data. (B, 1024, 8, 14, 14)

        Returns:
            torch.Tensor: The classification scores for input samples.
        """
        feat = x.clone() if isinstance(x, torch.Tensor) else x[-2].clone()
        if len(feat.size()) == 4:  # for 2D recognizer
            assert num_segs is not None
            feat = feat.view((-1, num_segs) + feat.size()[1:]).transpose(1, 2).contiguous()
        # one-hot embedding for the target
        y = torch.eye(self.num_classes).to(feat.device)
        y = y[target]
        losses = dict()

        # f1_Conv3D(x)
        x = self.f1_conv3d(feat)  # (B, 2048, 8, 7, 7)
        feat_unbias = self.avg_pool(x).squeeze(-1).squeeze(-1).squeeze(-1)
        x = self.dropout(feat_unbias)
        x = self.f1_fc(x)
        alpha_unbias = self.exp_evidence(x) + 1
        # minimize the edl losses
        loss_cls1 = self.edl_loss(torch.log, alpha_unbias, y)
        losses.update({'loss_unbias_cls': loss_cls1})

        loss_hsic_f, loss_hsic_g = torch.zeros_like(loss_cls1), torch.zeros_like(loss_cls1)
        if self.bias_input:
            # f2_Conv3D(x)
            feat_shuffle = feat[:, :, torch.randperm(feat.size()[2])]
            x = self.f2_conv3d(feat_shuffle)  # (B, 2048, 8, 7, 7)
            feat_bias1 = self.avg_pool(x).squeeze(-1).squeeze(-1).squeeze(-1)
            x = self.dropout(feat_bias1)
            x = self.f2_fc(x)
            alpha_bias1 = self.exp_evidence(x) + 1
            # minimize the edl losses
            loss_cls2 = self.edl_loss(torch.log, alpha_bias1, y)
            losses.update({'loss_bias1_cls': loss_cls2})
            if self.alternative:
                # minimize HSIC w.r.t. feat_unbias, and maximize HSIC w.r.t. feat_bias1
                loss_hsic_f += self.hsic_factor * self.hsic_loss(feat_unbias, feat_bias1.detach(), unbiased=True) 
                loss_hsic_g += - self.hsic_factor * self.hsic_loss(feat_unbias.detach(), feat_bias1, unbiased=True)
            else:
                # maximize HSIC 
                loss_hsic1 = -1.0 * self.hsic_loss(alpha_unbias, alpha_bias1)
                losses.update({"loss_bias1_hsic": loss_hsic1})

        if self.bias_network:
            # f3_Conv2D(x)
            B, C, T, H, W = feat.size()
            feat_reshape = feat.permute(0, 2, 1, 3, 4).contiguous().view(-1, C, H, W)  # (B*T, C, H, W)
            x = self.f3_conv2d(feat_reshape)  # (64, 2048, 7, 7)
            x = x.view(B, T, x.size(-3), x.size(-2), x.size(-1)).permute(0, 2, 1, 3, 4)  # (B, 2048, 8, 7, 7)
            feat_bias2 = self.avg_pool(x).squeeze(-1).squeeze(-1).squeeze(-1)
            x = self.dropout(feat_bias2)
            x = self.f3_fc(x)
            alpha_bias2 = self.exp_evidence(x) + 1
            # minimize the edl losses
            loss_cls3 = self.edl_loss(torch.log, alpha_bias2, y)
            losses.update({'loss_bias2_cls': loss_cls3})
            if self.alternative:
                # minimize HSIC w.r.t. feat_unbias, and maximize HSIC w.r.t. feat_bias2
                loss_hsic_f += self.hsic_factor * self.hsic_loss(feat_unbias, feat_bias2.detach(), unbiased=True)
                loss_hsic_g += - self.hsic_factor * self.hsic_loss(feat_unbias.detach(), feat_bias2, unbiased=True)
            else:
                # maximize HSIC 
                loss_hsic2 = -1.0 * self.hsic_loss(alpha_unbias, alpha_bias2)
                losses.update({"loss_bias2_hsic": loss_hsic2})
        
        if self.alternative:
            # Here, we use odd iterations for minimizing hsic_f, and use even iterations for maximizing hsic_g
            assert 'iter' in kwargs, "iter number is missing!"
            loss_mask = kwargs['iter'] % 2
            loss_hsic = loss_mask * loss_hsic_f + (1 - loss_mask) * loss_hsic_g
            losses.update({'loss_hsic': loss_hsic})
            
        for k, v in losses.items():
            losses.update({k: v * self.loss_factor})
        return losses
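
The claim in this issue can be checked numerically. The sketch below uses NumPy as a stand-in for the PyTorch modules (toy channel sizes, random weights, batch norm omitted; all of these are assumptions for illustration): a (1, 3, 3) kernel with stride (1, 2, 2) and padding (0, 1, 1) processes each frame independently, so after global average pooling the result should not depend on the frame order.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def per_frame_conv(feat, weight):
    """Apply a (1, 3, 3) convolution with stride (1, 2, 2) and padding (0, 1, 1).

    feat:   (C_in, T, H, W) feature map for one sample
    weight: (C_out, C_in, 3, 3), the 3D kernel with its size-1 temporal axis dropped
    """
    padded = np.pad(feat, ((0, 0), (0, 0), (1, 1), (1, 1)))
    # windows: (C_in, T, H/2, W/2, 3, 3); each frame is handled independently
    windows = sliding_window_view(padded, (3, 3), axis=(2, 3))[:, :, ::2, ::2]
    out = np.einsum("cthwij,ocij->othw", windows, weight)
    return np.maximum(out, 0.0)  # ReLU


def pooled_feature(feat, weight):
    """Convolution followed by global average pooling over (T, H, W)."""
    return per_frame_conv(feat, weight).mean(axis=(1, 2, 3))


rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 8, 14, 14))          # (C_in, T, H, W), toy sizes
weight = rng.standard_normal((6, 4, 3, 3))          # (C_out, C_in, 3, 3)
shuffled = feat[:, rng.permutation(feat.shape[1])]  # shuffle frames along T

same = np.allclose(pooled_feature(feat, weight), pooled_feature(shuffled, weight))
print("pooled features identical after temporal shuffle:", same)
```

This only shows that a single branch of this shape is invariant to temporal order; the branches in DebiasHead still have separate weights, so their outputs can differ for that reason.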

Report an error

I get the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'experiments/i3d/results/I3D_EDLNoKLAvUCDebias_EDL_trainset_uncertainties.npz'
Experiments finished!
Did I do something wrong or miss a step in the procedure?

How to use the EDL loss for distillation?

Computation of HSIC

Hi,

Thank you for your great work. While going through the code, I noticed that here you intend to zero out only the diagonal elements, which is consistent with the original paper "Feature Selection via Dependence Maximization". However, the torch.diag function used here behaves differently depending on whether its input is a vector or a matrix. Since it receives a matrix here, its output is a vector containing the diagonal elements, so kernel_XX - torch.diag(kernel_XX) subtracts each diagonal element from every element in the corresponding column. The fix would be to change this line to kernel_XX - torch.diag(torch.diag(kernel_XX)). Could you please confirm this?
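
The behavior described above can be reproduced with a small example. This sketch uses NumPy's np.diag as a stand-in for torch.diag (an assumption made to keep the example dependency-light; both return the diagonal as a vector for a matrix input and build a diagonal matrix from a vector input).

```python
import numpy as np

K = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

d = np.diag(K)              # matrix in -> 1-D vector of diagonal entries [1, 5, 9]
broadcast = K - d           # vector is broadcast over rows: column j loses K[j, j]
zero_diag = K - np.diag(d)  # vector in -> diagonal matrix, so only the diagonal changes

print(broadcast)
# [[ 0. -3. -6.]
#  [ 3.  0. -3.]
#  [ 6.  3.  0.]]
print(zero_diag)
# [[0. 2. 3.]
#  [4. 0. 6.]
#  [7. 8. 0.]]
```

Only the second form leaves the off-diagonal kernel entries untouched.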

About evaluation protocols.

Hello.

Thanks for the interesting work.

I am curious about the design choice behind the evaluation protocol, specifically the 10 random trials of unknown-class selection.

How does this differ from using the whole HMDB51/MiT-v2 dataset as the open set?

Since training is conducted on the whole UCF101 dataset, I would expect that using the whole HMDB51/MiT-v2 dataset would make no difference under your evaluation protocol.

Thank you for the great work.

Question about the evidential loss term in Eq.(1)

Dear authors,

Thanks a lot for the interesting paper and the open-source repo.

It is great to see the evidential deep learning (EDL) approach applied to action recognition. I have a question about the EDL loss term, i.e., Eq. (1) in the DEAR paper. In this repo, the EDL loss with the log function is the default choice for running the DEAR algorithm. I wonder whether the digamma function works properly as well. I ran some experiments training with the digamma version of the EDL loss on the CIFAR100 dataset (based on the implementation in this repo) and found that the EDL loss term decreases slowly and with difficulty, and the model does not get well optimized. Have you observed the same behavior?

I look forward to your response.

Best,
Haiming
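
For reference, the two variants compared in this question differ only in the function applied to S and alpha in the term sum_k y_k * (func(S) - func(alpha_k)). The sketch below evaluates both for one sample in plain Python; the digamma approximation, the 100-class setup, and the evidence values are assumptions chosen for illustration, and it does not reproduce the training behavior reported above.

```python
import math


def digamma(x):
    """Approximate the digamma function for x > 0 (recurrence + asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift x upward using psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def edl_loss(func, y, alpha):
    """sum_k y_k * (func(S) - func(alpha_k)) for one sample, with S = sum_k alpha_k."""
    S = sum(alpha)
    return sum(yk * (func(S) - func(ak)) for yk, ak in zip(y, alpha))


num_classes = 100
y = [1.0] + [0.0] * (num_classes - 1)  # one-hot label for class 0
for evidence in (0.0, 1.0, 10.0):
    # Evidence only on the true class; all other classes stay at alpha = 1.
    alpha = [1.0 + evidence] + [1.0] * (num_classes - 1)
    print(evidence, edl_loss(math.log, y, alpha), edl_loss(digamma, y, alpha))
```

Printing the two values side by side for a few evidence levels gives a quick sense of how the loss scales differ at the start of training, when evidence is close to zero.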

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repository branches:

Repository        OpenMMLab 1.0 branch   OpenMMLab 2.0 branch
MMEngine          -                      0.x
MMCV              1.x                    2.x
MMDetection       0.x, 1.x, 2.x          3.x
MMAction2         0.x                    1.x
MMClassification  0.x                    1.x
MMSegmentation    0.x                    1.x
MMDetection3D     0.x                    1.x
MMEditing         0.x                    1.x
MMPose            0.x                    1.x
MMDeploy          0.x                    1.x
MMTracking        0.x                    1.x
MMOCR             0.x                    1.x
MMRazor           0.x                    1.x
MMSelfSup         0.x                    1.x
MMRotate          1.x                    1.x
MMYOLO            -                      0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

Evidential Uncertainty Calibration

Very nice paper. I've been facing similar issues as the ones reported in this figure when using evidential uncertainty quantification:

In your work, you discuss adding an Accuracy vs. Uncertainty loss function that looks like this:

You proposed a new version of it, as shown in the equation below:

I could find the implementation of the equation above here:

DEAR/mmaction/models/losses/edl_loss.py

Lines 113 to 144 in 2a64f6a

    
    def edl_loss(self, func, y, alpha, annealing_coef, target):
        """Used for both loss_type == 'log' and loss_type == 'digamma'
        func: function handler (torch.log, or torch.digamma)
        y: the one-hot labels (batchsize, num_classes)
        alpha: the predictions (batchsize, num_classes)
        epoch_num: the current training epoch
        """
        losses = {}
        S = torch.sum(alpha, dim=1, keepdim=True)
        A = torch.sum(y * (func(S) - func(alpha)), dim=1, keepdim=True)
        losses.update({'loss_cls': A})

        losses.update({'lambda': annealing_coef})
        if self.with_kldiv:
            kl_alpha = (alpha - 1) * (1 - y) + 1
            kl_div = annealing_coef * \
                self.kl_divergence(kl_alpha)
            losses.update({'loss_kl': kl_div})

        if self.with_avuloss:
            pred_scores, pred_cls = torch.max(alpha / S, 1, keepdim=True)
            uncertainty = self.num_classes / S
            acc_match = torch.reshape(torch.eq(pred_cls, target.unsqueeze(1)).float(), (-1, 1))
            if self.disentangle:
                acc_uncertain = - torch.log(pred_scores * (1 - uncertainty) + self.eps)
                inacc_certain = - torch.log((1 - pred_scores) * uncertainty + self.eps)
            else:
                acc_uncertain = - pred_scores * torch.log(1 - uncertainty + self.eps)
                inacc_certain = - (1 - pred_scores) * torch.log(uncertainty + self.eps)
            avu_loss = annealing_coef * acc_match * acc_uncertain + (1 - annealing_coef) * (1 - acc_match) * inacc_certain
            losses.update({'loss_avu': avu_loss})
        return losses

It is unclear to me when to use the disentangle case. Could you please provide me with any insights about when to use one over the other?

Thanks :)
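
To make the difference between the two branches concrete, here is a per-sample transcription of the quoted AvU terms in plain Python. It is a sketch for comparison only: the eps default and the sample values of p and u are assumptions, and it does not say which branch the authors recommend.

```python
import math


def avu_loss(pred_score, uncertainty, correct, annealing_coef,
             disentangle=False, eps=1e-10):
    """Per-sample AvU loss, transcribed from the two branches quoted above.

    pred_score:     max predicted probability p, in [0, 1]
    uncertainty:    evidential uncertainty u = num_classes / S, in (0, 1]
    correct:        1.0 if the prediction matches the label, else 0.0
    annealing_coef: weight in [0, 1] that shifts between the two terms
    """
    p, u = pred_score, uncertainty
    if disentangle:
        # p and u are multiplied inside the log
        acc_uncertain = -math.log(p * (1 - u) + eps)
        inacc_certain = -math.log((1 - p) * u + eps)
    else:
        # p only scales the log of the uncertainty term
        acc_uncertain = -p * math.log(1 - u + eps)
        inacc_certain = -(1 - p) * math.log(u + eps)
    return (annealing_coef * correct * acc_uncertain
            + (1 - annealing_coef) * (1 - correct) * inacc_certain)


for p, u in [(0.9, 0.1), (0.9, 0.8), (0.4, 0.1), (0.4, 0.8)]:
    for correct in (1.0, 0.0):
        plain = avu_loss(p, u, correct, annealing_coef=0.5)
        dis = avu_loss(p, u, correct, annealing_coef=0.5, disentangle=True)
        print(f"p={p} u={u} correct={correct}: plain={plain:.4f} disentangle={dis:.4f}")
```

One visible difference: in the non-disentangled form the accurate-but-uncertain penalty is scaled by p and shrinks as p goes to 0, whereas in the disentangled form a low p increases the penalty because p sits inside the logarithm.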

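Effect of the choice of non-negative evidence activation function on the results

Hello author, I have been reading your code. Is the activation function that keeps the evidence non-negative relu by default during training, and then switched to exp at test time? I am not sure whether my understanding is correct. Also, does the choice of this function have a large effect on the final results?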
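
As a small illustration of why the choice can matter, the sketch below maps the same logits to evidence with relu and with exp and computes the uncertainty u = K / S used elsewhere on this page (alpha = evidence + 1, S = sum of alpha). The function names, the clamp value, and the logits are assumptions for this toy example, not the repository's implementation.

```python
import math


def relu_evidence(logits):
    """Evidence as the positive part of each logit."""
    return [max(z, 0.0) for z in logits]


def exp_evidence(logits, clamp=10.0):
    """Evidence as exp(logit); clamping is an assumption here to avoid overflow."""
    return [math.exp(max(-clamp, min(clamp, z))) for z in logits]


def uncertainty(evidence):
    """u = K / S with alpha = evidence + 1 and S = sum(alpha)."""
    alpha = [e + 1.0 for e in evidence]
    return len(alpha) / sum(alpha)


logits = [2.0, -1.0, 0.5, -3.0]
print("relu:", uncertainty(relu_evidence(logits)))
print("exp: ", uncertainty(exp_evidence(logits)))
```

With relu, all-negative logits give zero evidence and therefore u = 1, while exp always yields strictly positive evidence, so the same network outputs translate into different uncertainty values.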

About joint training of CED

Hello, I am very interested in your debiasing research, and thank you very much for generously open-sourcing your code.
I noticed that the paper says: "In practice, we also implemented a joint training strategy which aims to optimize the objective of (4) and (5) jointly and we empirically found it can achieve a better performance."
I also found the setting alternative=False in the corresponding code. Is this the "joint training" you mentioned?
Also, if alternative=False, the lines loss_hsic_f += self.hsic_factor * self.hsic_loss(feat_unbias, feat_bias1.detach(), unbiased=True) and
loss_hsic_g += -self.hsic_factor * self.hsic_loss(feat_unbias.detach(), feat_bias1, unbiased=True), along with their corresponding formulas (4) and (5), no longer make sense.
In addition, I would like to ask whether it is necessary to use the simplified evidence_loss from DebiasHead on the closed set if we do not perform the open-set identification task. I would like to use NLLLoss instead (you also mentioned in the oral presentation that the two are similar). (I also found that edl_loss doesn't work if alternative=False, haha.) Of course, I have not run experiments to verify this.
Looking forward to your reply!

How to train the model using multiple GPUs?

Hello.

Your work is really amazing. I ran your code successfully on one GPU. When I then tried to run it on multiple GPUs, I got the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launch.py", line 260, in <module>
    main()
  File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launch.py", line 255, in main
    raise subprocess.CalledProcessError(returncode=process.returncode,
subprocess.CalledProcessError: Command '['/usr/bin/python3', '-u', 'tools/train.py', '--local_rank=3', 'configs/recognition/i3d/finetune_ucf101_i3d_edlnokl_avuc_debias.py', '--launcher', 'pytorch', '--work-dir', 'work_dirs/i3d/finetune_ucf101_i3d_edlnokl_avuc_debias', '--validate', '--seed', '0', '--deterministic', '--gpu-ids', '0', '1', '2', '3']' returned non-zero exit status 1.
Experiments finished!

I ran the code with the following command:

bash tools/dist_train.sh configs/recognition/i3d/finetune_ucf101_i3d_edlnokl_avuc_debias.py 4 \
	--work-dir work_dirs/i3d/finetune_ucf101_i3d_edlnokl_avuc_debias \
	--validate \
	--seed 0 \
	--deterministic \
	--gpu-ids 0 1 2 3

So I would like to know how you train with multiple GPUs. I'd appreciate it if you could give an example.

On the issue of random seeds in OOD tests.

Hello. Your work has been very valuable to our research and we appreciate it.

We found a small problem during our research on which we would like your guidance. After training on the closed-set dataset, we need to run OOD detection with the model to obtain the threshold. In this process, we found that identical experimental settings produce different results, and we suspect the problem lies in the random seed, as shown in the following code from ./DEAR/experiments/get_threshold.py.

def parse_args():
    parser = argparse.ArgumentParser(description='MMAction2 test')
    # model and data config
    parser.add_argument('--config', help='test config file path')
    parser.add_argument('--checkpoint', help='checkpoint file/url')
    parser.add_argument('--uncertainty', default='BALD', choices=['BALD', 'Entropy', 'EDL'], help='the uncertainty estimation method')
    parser.add_argument('--train_data', help='the split file of in-distribution training data')
    parser.add_argument('--forward_pass', type=int, default=10, help='the number of forward passes')
    parser.add_argument('--batch_size', type=int, default=8, help='the testing batch size')
    # env config
    parser.add_argument('--device', type=str, default='cuda:0', help='CPU/CUDA device option')
    parser.add_argument('--result_prefix', help='result file prefix')
    args = parser.parse_args()
    return args

The argument parser.add_argument('--forward_pass', type=int, default=10, help='the number of forward passes') and its related code are only used in run_stochastic_inference(), whereas our 'uncertainty' setting is EDL, which uses run_evidence_inference().

So we would like to know what could cause different results under a single experimental setup.

Looking forward to your answer.

    def edl_loss(self, func, y, alpha, annealing_coef, target):
        """Used for both loss_type == 'log' and loss_type == 'digamma'
        func: function handler (torch.log, or torch.digamma)
        y: the one-hot labels (batchsize, num_classes)
        alpha: the predictions (batchsize, num_classes)
        epoch_num: the current training epoch
        """
        losses = {}
        S = torch.sum(alpha, dim=1, keepdim=True)
        A = torch.sum(y * (func(S) - func(alpha)), dim=1, keepdim=True)
        losses.update({'loss_cls': A})

        losses.update({'lambda': annealing_coef})
        if self.with_kldiv:
            kl_alpha = (alpha - 1) * (1 - y) + 1
            kl_div = annealing_coef * \
                self.kl_divergence(kl_alpha)
            losses.update({'loss_kl': kl_div})

        if self.with_avuloss:
            pred_scores, pred_cls = torch.max(alpha / S, 1, keepdim=True)
            uncertainty = self.num_classes / S
            acc_match = torch.reshape(torch.eq(pred_cls, target.unsqueeze(1)).float(), (-1, 1))
            if self.disentangle:
                acc_uncertain = - torch.log(pred_scores * (1 - uncertainty) + self.eps)
                inacc_certain = - torch.log((1 - pred_scores) * uncertainty + self.eps)
            else:
                acc_uncertain = - pred_scores * torch.log(1 - uncertainty + self.eps)
                inacc_certain = - (1 - pred_scores) * torch.log(uncertainty + self.eps)
            avu_loss = annealing_coef * acc_match * acc_uncertain + (1 - annealing_coef) * (1 - acc_match) * inacc_certain
            losses.update({'loss_avu': avu_loss})
        return losses
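
Regarding the random-seed question above: the parse_args shown for get_threshold.py exposes no seed option, so one thing to try is seeding every random number generator explicitly at the start of the script. The following is a minimal sketch under the assumption that the run-to-run variation comes from unseeded RNGs (for example random sampling in the data pipeline or dropout left active at inference); seed_everything is a hypothetical helper, not a function from the repository.

```python
import random


def seed_everything(seed):
    """Seed the Python RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Trade speed for repeatable cuDNN kernel selection.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
print(first == second)  # True
```

If results still differ after seeding, the remaining candidates would be nondeterministic GPU kernels or differences in how the video frames are sampled and decoded.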