Update the Azure doc to address the following issues:
(Regarding the CIFAR accuracy statement) Can we rephrase this comment? It requires a subtle understanding of data parallelism, and I'm worried it will be misinterpreted as a critique of DeepSpeed.
(Before the Azure CLI details) Maybe add a first bullet about creating an Azure account, for users who are new?
I think it would be good to add a sentence or two telling users about the hardware that our sample configuration uses. We should also link to the SKU info page so pricing is clear before they follow the tutorial.