apple_m1_pro_python's Introduction

Apple Silicon DL benchmarks

Currently, both PyTorch and TensorFlow have a Metal backend.

Results

Results vary across frameworks:

TensorFlow ResNet50:

tf_resnet_50results.png

PyTorch ResNet50:

  • Difference between CPU and GPU: gpu_vs_cpu.png
  • Comparison with Nvidia GPUs: samples_sec.png

PyTorch BERT:

  • Running a BERT model from Hugging Face: pt_bert.png

PyTorch

We have official PyTorch support! Check the pytorch folder to start running your benchmarks.
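
Before running the benchmarks, it can help to confirm which device PyTorch will actually use. The following is a minimal sketch, not taken from the pytorch folder: the helper name `pick_device` is hypothetical, and it assumes a PyTorch version that exposes `torch.backends.mps` (it falls back to CPU otherwise, or if PyTorch is not installed).

```python
def pick_device() -> str:
    """Return "mps", "cuda", or "cpu" depending on what PyTorch can see.

    Hypothetical helper for illustration; not part of this repository.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so only the CPU label makes sense.
        return "cpu"

    # Older PyTorch builds have no torch.backends.mps attribute at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"  # Metal backend on Apple Silicon
    if torch.cuda.is_available():
        return "cuda"  # Nvidia GPU
    return "cpu"


device = pick_device()
print(f"Benchmarking on: {device}")
```

On an Apple Silicon machine with a recent PyTorch build this should print `mps`. If it prints `cpu`, the benchmark numbers will reflect the CPU rather than the GPU.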

TensorFlow

You can run the TensorFlow benchmarks by going to the tensorflow folder.
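
A similar check can be done for TensorFlow. This is a rough sketch under the assumption that the Metal backend is provided by the `tensorflow-metal` plugin, in which case the Apple GPU shows up as a regular `GPU` device. The helper name `list_tf_gpus` is hypothetical and not part of this repository.

```python
def list_tf_gpus() -> list:
    """Return the names of the GPU devices TensorFlow can see.

    Hypothetical helper for illustration. Returns an empty list when
    TensorFlow is not installed or no GPU is visible.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [d.name for d in tf.config.list_physical_devices("GPU")]


gpus = list_tf_gpus()
print(gpus if gpus else "No GPU visible to TensorFlow; running on CPU")
```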

apple_m1_pro_python's People

Contributors

soumik12345, tcapelle

apple_m1_pro_python's Issues

on batchsize in the wandb experiments

hey - great work, i just have a question regarding the batchsizes chosen for the benchmark experiment you shared on wandb

for example in the first table, if i expand all runs, i can see that it says

  • 1080 ti uses batchsize 160
  • A5000 uses batchsize 128
  • ...
  • M1Pro uses batchsize 96
  • M1_7 uses batchsize 128
  • M1Max uses batchsize 128

and this inconsistency persists throughout all tables in the benchmark, for example if you look at the resnet graph, and expand the train_full_model tab, it says batchsize-wise M1Pro uses 48, RTX 3090 uses 512, and M1Max uses 128, which are wildly inconsistent.

is this just a typo in the table, or is there something wrong with the underlying experiment too? if the batchsizes really do differ across devices, that will obviously affect the measured performance, no?
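
One way to make runs with different batch sizes easier to compare is to look at throughput in samples per second rather than time per step. Below is a minimal sketch of that normalization. The function name, the device labels, and the timings are all hypothetical and made up for illustration; they are not measurements from the benchmark.

```python
def samples_per_second(batch_size: int, step_seconds: float) -> float:
    """Throughput of one training step: samples processed per second."""
    if step_seconds <= 0:
        raise ValueError("step_seconds must be positive")
    return batch_size / step_seconds


# Hypothetical timings, for illustration only (not measured results).
runs = {
    "device_a": {"batch_size": 128, "step_seconds": 0.5},
    "device_b": {"batch_size": 64, "step_seconds": 0.25},
}

throughput = {name: samples_per_second(**cfg) for name, cfg in runs.items()}
for name, value in throughput.items():
    print(f"{name}: {value:.1f} samples/sec")
```

In this made-up example both devices reach the same throughput even though one step takes twice as long on `device_a`. Normalizing this way does not remove the effect of batch size entirely, since larger batches often use a GPU more efficiently, so matching batch sizes where memory allows is still the cleaner comparison.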
