Comments (2)

lreiher commented on August 29, 2024

I have not tested Google Colab, but I can train uNetXST for one epoch in about 10 minutes on the 2_F dataset on our private NVIDIA A100 (see the log below). This is with the recommended Python environment:

  • Python 3.7
  • TensorFlow 2.5 (and everything else from requirements.txt)
  • CUDA 11.2
  • cuDNN 8.1
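To compare another machine (for example a Colab runtime) against the versions listed above, a minimal sketch along these lines may help. The names `EXPECTED`, `detect_versions`, and `report` are illustrative and not part of the repository; the sketch assumes `tf.sysconfig.get_build_info()` is available (TensorFlow 2.3 or later) and that it may report CUDA/cuDNN versions at a coarser granularity than listed here.

```python
import sys

# Versions quoted in the list above; this dictionary is illustrative only.
EXPECTED = {"python": "3.7", "tensorflow": "2.5", "cuda": "11.2", "cudnn": "8.1"}


def detect_versions():
    """Return the versions found in the current environment (None if unknown)."""
    found = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "tensorflow": None,
        "cuda": None,
        "cudnn": None,
    }
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed; only the Python version is known.
        return found
    found["tensorflow"] = ".".join(tf.__version__.split(".")[:2])
    # Build info describes the CUDA/cuDNN versions TensorFlow was built against.
    # CPU-only builds may not contain these keys, hence .get().
    build = tf.sysconfig.get_build_info()
    found["cuda"] = build.get("cuda_version")
    found["cudnn"] = build.get("cudnn_version")
    return found


def report(found, expected=EXPECTED):
    """Format one 'expected vs. found' line per component."""
    return [
        "{:<10} expected {:<5} found {}".format(name, want, found.get(name))
        for name, want in expected.items()
    ]


if __name__ == "__main__":
    print("\n".join(report(detect_versions())))
```

The sketch only prints both values side by side rather than declaring a match, since a differing patch version or a coarser build string does not necessarily mean the setup is broken.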

Please note that we are no longer actively upgrading this repository to more recent Python / TensorFlow / CUDA versions.

2023-05-24 08:36:24.910143: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Found 32246 training samples
Found 3172 validation samples
2023-05-24 08:36:26.031596: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2023-05-24 08:36:26.113491: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:41:00.0 name: NVIDIA A100-SXM4-40GB computeCapability: 8.0
coreClock: 1.41GHz coreCount: 108 deviceMemorySize: 39.59GiB deviceMemoryBandwidth: 1.41TiB/s
2023-05-24 08:36:26.113542: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2023-05-24 08:36:26.115711: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2023-05-24 08:36:26.115751: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2023-05-24 08:36:26.116522: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2023-05-24 08:36:26.116698: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2023-05-24 08:36:26.117295: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2023-05-24 08:36:26.117822: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2023-05-24 08:36:26.117932: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2023-05-24 08:36:26.121095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2023-05-24 08:36:26.121608: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-24 08:36:26.152714: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:41:00.0 name: NVIDIA A100-SXM4-40GB computeCapability: 8.0
coreClock: 1.41GHz coreCount: 108 deviceMemorySize: 39.59GiB deviceMemoryBandwidth: 1.41TiB/s
2023-05-24 08:36:26.156884: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2023-05-24 08:36:26.156937: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2023-05-24 08:36:27.210685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2023-05-24 08:36:27.210760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2023-05-24 08:36:27.210769: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2023-05-24 08:36:27.216527: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 37774 MB memory) -> physical GPU (device: 0, name: NVIDIA A100-SXM4-40GB, pci bus id: 0000:41:00.0, compute capability: 8.0)
Built data pipeline for training
Built data pipeline for validation
Compiled model uNetXST.py
Starting training...
2023-05-24 08:36:34.683173: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2023-05-24 08:36:34.711923: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2250035000 Hz
Epoch 1/100
2023-05-24 08:36:46.501740: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2023-05-24 08:36:48.682107: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2023-05-24 08:36:48.683088: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2023-05-24 08:36:50.143441: I tensorflow/stream_executor/cuda/cuda_dnn.cc:359] Loaded cuDNN version 8100
6449/6449 [==============================] - ETA: 0s - loss: 0.2305 - categorical_accuracy: 0.9356 - mean_io_u_with_one_hot_labels: 0.7583
2023-05-24 08:43:53.638021: I tensorflow/stream_executor/cuda/cuda_blas.cc:1838] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
6449/6449 [==============================] - 657s 99ms/step - loss: 0.2305 - categorical_accuracy: 0.9356 - mean_io_u_with_one_hot_labels: 0.7583 - val_loss: 0.1702 - val_categorical_accuracy: 0.9477 - val_mean_io_u_with_one_hot_labels: 0.7897
Epoch 2/100
6449/6449 [==============================] - 619s 96ms/step - loss: 0.1468 - categorical_accuracy: 0.9544 - mean_io_u_with_one_hot_labels: 0.8116 - val_loss: 0.1497 - val_categorical_accuracy: 0.9535 - val_mean_io_u_with_one_hot_labels: 0.8091
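For comparing epoch times from a different machine against the log above, the per-epoch summary lines can be parsed with a small script. This is a sketch only: the regular expression and the helper name `epoch_stats` are assumptions based on the Keras progress-bar format shown in this log, and the two sample lines are shortened copies of the epoch summaries above.

```python
import re

# Matches the Keras end-of-epoch summary, e.g.
# "6449/6449 [=====] - 657s 99ms/step - loss: ..."
# Intermediate lines with "ETA: ..." do not match, because no
# "<seconds>s <ms>ms/step" field follows the progress bar there.
EPOCH_LINE = re.compile(r"(\d+)/(\d+) \[=+\] - (\d+)s (\d+)ms/step")


def epoch_stats(line):
    """Extract step count and timing from one epoch summary line, or None."""
    match = EPOCH_LINE.search(line)
    if match is None:
        return None
    _done, total, seconds, ms_per_step = map(int, match.groups())
    return {
        "steps": total,
        "seconds": seconds,
        "minutes": seconds / 60,
        "ms_per_step": ms_per_step,
    }


# Shortened copies of the two epoch summaries in the log above.
log = [
    "6449/6449 [==============================] - 657s 99ms/step - loss: 0.2305",
    "6449/6449 [==============================] - 619s 96ms/step - loss: 0.1468",
]

for line in log:
    stats = epoch_stats(line)
    print("{steps} steps in {seconds} s ({minutes:.2f} min), "
          "{ms_per_step} ms/step".format(**stats))
```

Both epochs come out at between 10 and 11 minutes, which is where the "about 10 minutes" figure comes from.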

from cam2bev.

TBauer2000 commented on August 29, 2024

Alright, thank you for your quick response! Seems like I have to get your requirements running in a Colab notebook :)
