Comments (1)
I actually encountered a similar scenario.
The standard Huggingface bert-base-cased model trained with 16-bit mixed precision (using pytorch-lightning), a vocab size of 100K, and a seq len of 1024 uses around 34 GB of memory with a batch size of 8 (on an Nvidia A100). If I switch to full 32-bit precision, the memory usage almost doubles to 67 GB, which is expected.
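For reference, a minimal sketch of the kind of setup described above (the module layout, optimizer, and exact Trainer arguments are assumptions for illustration, not taken from the original comment):

```python
import pytorch_lightning as pl
import torch
from transformers import BertConfig, BertForMaskedLM

class BertModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # bert-base-cased architecture, but with the enlarged vocab and seq len
        config = BertConfig(
            vocab_size=100_000,
            max_position_embeddings=1024,
            hidden_size=768,
            num_hidden_layers=12,
            num_attention_heads=12,
        )
        self.bert = BertForMaskedLM(config)

    def training_step(self, batch, batch_idx):
        out = self.bert(input_ids=batch["input_ids"], labels=batch["labels"])
        return out.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=1e-4)

# "16-mixed" enables torch.autocast under the hood; "32-true" runs full FP32
trainer = pl.Trainer(accelerator="gpu", devices=1, precision="16-mixed")
```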
However, if I use the x-transformers "BERT-like" implementation (mimicking the Huggingface config):
```python
from x_transformers import TransformerWrapper, Encoder

self.bert = TransformerWrapper(
    num_tokens=100_000,
    max_seq_len=1024,
    emb_dropout=0.1,
    tie_embedding=True,
    use_abs_pos_emb=True,
    attn_layers=Encoder(
        dim=768,
        depth=12,
        heads=12,
        attn_flash=False,
        layer_dropout=0.1,  # stochastic depth - dropout entire layer
        attn_dropout=0.1,   # dropout post-attention
        ff_dropout=0.1,     # feedforward dropout
    ),
)
```
the memory usage does not change when I switch between 16-bit mixed and 32-bit precision. The overall usage (same batch size and hardware) remains at a constant 52 GB, which is substantially higher than the 34 GB of the HF model.
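For what it's worth, here is a minimal, generic way to read the peak allocation of a single step directly from PyTorch (an illustrative sketch with a toy model, not necessarily how the numbers above were obtained):

```python
import torch

# Illustrative toy model only; substitute the actual module and batch.
model = torch.nn.Linear(1024, 1024).cuda()
x = torch.randn(8, 1024, device="cuda")

torch.cuda.reset_peak_memory_stats()
model(x).sum().backward()
torch.cuda.synchronize()
print(f"peak allocated: {torch.cuda.max_memory_allocated() / 2**30:.2f} GiB")
```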
Why does the precision setting of the Lightning trainer not affect the x-transformers implementation?
I would love to use the x-transformers implementation because of the large number of new features. However, I am wondering where these significant GPU memory differences come from, and why torch.autocast, which I believe Lightning uses under the hood, shows no effect.
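One way to sanity-check whether autocast actually reaches the x-transformers forward pass is to run it inside an explicit torch.autocast context and inspect the activation dtype (a diagnostic sketch mirroring the config above; the batch shape and the reduced config are assumptions):

```python
import torch
from x_transformers import TransformerWrapper, Encoder

model = TransformerWrapper(
    num_tokens=100_000,
    max_seq_len=1024,
    attn_layers=Encoder(dim=768, depth=12, heads=12),
).cuda()

tokens = torch.randint(0, 100_000, (8, 1024), device="cuda")

# Under autocast, matmul-heavy layers should produce float16 activations;
# if the printed dtype is float32, autocast is not being applied here.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    logits = model(tokens)
print(logits.dtype)
```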
Related Issues (20)
- kv cache breaks generation HOT 5
- Question: How to load model trained on earlier version of x-transformers HOT 3
- Enhancement: Multi Input/Output transformers HOT 1
- XL-recurrence with RotaryEmbedding and mems not working correctly. HOT 34
- Removed biases breaks pre-trained models HOT 5
- Seq len missing in rotary embedding HOT 3
- Adding memmask to ContinuousTransformerWrapper HOT 3
- attn_num_mem_kv > 0 and attn_one_kv_head = True error HOT 8
- Question: How to implement rel_pos_bias in cross_attention? HOT 11
- How to build optimizer HOT 9
- [Minor; noob question] Uniform distribution instead of normal
- RotaryEmbedding XPOS doesn't work with mems HOT 5
- `layer_mem` is unbound (when called from `ContinuousTransformerWrapper`) HOT 6
- Generation for PaLI?
- Confusion about image->caption example HOT 1
- How can I add custom attention masks to a Decoder? HOT 3
- Question: rotary embeddings and bad length extrapolation HOT 1
- [Bug] XL-recurrence with AlibiPositionalBias and mems not working correctly HOT 17
- [Question] very small attention scores HOT 7
- Was it a clerical error? ScaleNorm.g init from dim ** -0.5. I think it should be dim ** 0.5 HOT 1