deeplearning's Introduction

RAPIDS.AI Deep Learning Repo

This repository is the home of our efforts to integrate RAPIDS acceleration of dataframes on GPU into popular deep learning frameworks. The work can be broken down into three main sections:

  • Dataloaders and preprocessing functionality developed to help provide connectivity between RAPIDS cuDF dataframes and the different deep learning libraries available.
  • Improvements to optimizers through the fusion of GPU operations.
  • Examples of the use of each of the above in competitions or on real world datasets.

Each deep learning library is contained within its own subfolder, with the different dataloader options and examples in further subfolders. For now our focus is on PyTorch; however, we expect to add other libraries in the future.

deeplearning's People

Contributors

aleksficek, benfred, bschifferer, evenoldridge, jakirkham, jperez999, madsbk, mike-wendt

deeplearning's Issues

Question re: 05_1_TimeSeries_HistoricalEvents!

Good Afternoon! Thank you very much for all you have done with these repositories of knowledge! I had a question about the file: 05_1_TimeSeries_HistoricalEvents.ipynb

In the case of the solution code here:

############### Solution ###############
offset = '7D'

data_window = df_train[['product_id', 'date', 'target']].groupby(['product_id', 'date']).agg(['count', 'sum']).reset_index()
data_window.columns = ['product_id', 'date', 'count', 'sum']
data_window.index = data_window['date']

data_window_roll = data_window[['product_id', 'count', 'sum']].groupby(['product_id']).rolling(offset).sum().drop('product_id', axis=1)
data_window_roll = data_window_roll.reset_index()
data_window_roll.columns = ['product_id', 'date', 'count_' + offset, 'sum_' + offset]
data_window_roll[['count_' + offset, 'sum_' + offset]] = data_window_roll[['count_' + offset, 'sum_' + offset]].shift(1)
data_window_roll.loc[data_window_roll['product_id']!=data_window_roll['product_id'].shift(1), ['count_' + offset, 'sum_' + offset]] = 0
data_window_roll['avg_' + offset] = data_window_roll['sum_' + offset]/data_window_roll['count_' + offset]
data = df_train.merge(data_window_roll, how='left', on=['product_id', 'date'])
data

We are typically left with a np.nan value for the first row of each group's avg_7D. Would you all recode this to zero, or leave it as nan and drop the row? Additionally, would you typically include several of these in your model? Say, compute 3D and a 7D offset average?
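To make the multi-window part of the question concrete, here is a minimal pandas sketch that computes both a 3D and a 7D lagged average per product. It is not the course solution: the helper name add_rolling_means, the toy df_train, and the has_history_7D flag are all hypothetical, and rolling(..., closed='left') is swapped in for the shift(1) step so that each window simply ends before the current date. Rows with no prior history come out as NaN here; whether to fill them with zero, keep them, or drop them is a modeling choice the sketch deliberately leaves open.

```python
import pandas as pd


def add_rolling_means(df, offsets=("3D", "7D")):
    """Hypothetical helper: add one lagged rolling-mean column per offset."""
    # One row per (product_id, date) with the daily count and sum of the target.
    daily = (
        df.groupby(["product_id", "date"])["target"]
        .agg(["count", "sum"])
        .reset_index()
        .sort_values(["product_id", "date"])
    )
    out = df
    for offset in offsets:
        # closed="left" makes the window [date - offset, date), so the
        # current day's target never leaks into its own feature.
        rolled = (
            daily.set_index("date")
            .groupby("product_id")[["count", "sum"]]
            .rolling(offset, closed="left")
            .sum()
            .reset_index()
        )
        # An empty window yields NaN for both sum and count, hence a NaN average.
        rolled["avg_" + offset] = rolled["sum"] / rolled["count"]
        out = out.merge(
            rolled[["product_id", "date", "avg_" + offset]],
            how="left",
            on=["product_id", "date"],
        )
    return out


# Toy data standing in for df_train (an assumption, not the course dataset).
df_train = pd.DataFrame(
    {
        "product_id": [1, 1, 1, 2, 2],
        "date": pd.to_datetime(
            ["2020-01-01", "2020-01-02", "2020-01-05", "2020-01-01", "2020-01-03"]
        ),
        "target": [1, 0, 1, 1, 1],
    }
)

data = add_rolling_means(df_train)
# One option among several: keep the NaN and expose it as an explicit flag.
data["has_history_7D"] = data["avg_7D"].notna()
```

The same function would be applied unchanged to the validation and test frames, provided the history it aggregates over only contains dates earlier than the rows being scored.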

Separately, I take it you apply identical functions to the valid and test sets, as well, right?

Lastly, where/when might I learn more about similar courses that you might offer in the future?

Thank you for your time and consideration!

Batch Dataset/DataLoader

Hey @EvenOldridge, I stumbled upon your post about cudf dataloaders speeding up training a long time ago... and recently got around to actually trying my hand at it, so thanks for introducing me to the idea!

I'm just curious, but is there a specific reason you implemented the batchdataloaders like you did in this repo instead of using a custom sampler + regular DataLoader with 0 workers?

I wrote out a quick sketch of the implementation I'm thinking of here: https://gist.github.com/NegatioN/1f63c3a79dfe13b183d413123d37d4fa

I understand that your implementation might already have changed significantly since you mentioned integrating it with fast.ai, but I was curious if you ruled this out for any specific reason that I can't clearly see atm. I would think it has the same performance capabilities?

Edit: The biggest difference might be we can grab each batch as a single read from contiguous memory? Did you test how large the impact of this was?
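For readers without access to the gist, the general shape of the "custom sampler + regular DataLoader" idea can be sketched as follows. This is not the gist's code and not the repo's batch dataloader: TensorBatchDataset and the toy tensors are hypothetical, and plain CPU tensors stand in for data that would come from cuDF. It relies on the documented PyTorch pattern of passing a BatchSampler as sampler with batch_size=None, which disables automatic batching so the dataset receives a whole list of indices at once.

```python
import torch
from torch.utils.data import BatchSampler, DataLoader, Dataset, SequentialSampler


class TensorBatchDataset(Dataset):
    """Hypothetical dataset whose __getitem__ receives a full batch of indices."""

    def __init__(self, features, labels):
        self.features = features
        self.labels = labels

    def __len__(self):
        return len(self.features)

    def __getitem__(self, batch_indices):
        # One indexing operation per batch instead of one per row.
        idx = torch.as_tensor(batch_indices)
        return self.features[idx], self.labels[idx]


# Toy tensors standing in for data already resident in memory (an assumption).
features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10)
dataset = TensorBatchDataset(features, labels)

# The BatchSampler yields lists of indices; batch_size=None tells the
# DataLoader to pass each list straight to __getitem__ without collating rows.
sampler = BatchSampler(SequentialSampler(dataset), batch_size=4, drop_last=False)
loader = DataLoader(dataset, sampler=sampler, batch_size=None, num_workers=0)

batch_shapes = [tuple(x.shape) for x, _ in loader]
```

Note that indexing with a tensor of indices gathers rows into a new tensor; a sequential batch could instead be taken as a slice, which is the contiguous-read case raised in the edit above. How much that matters in practice is not measured here.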

/Joakim

Broken container

The Dockerfile specifies an entrypoint script which is not present in the container.
That file, entrypoint.sh, is present in https://github.com/rapidsai/docker, but changing the Dockerfile to COPY it to the right location only leads to another issue (a missing conda.sh).

What is the correct way to build the container then?
Thanks.
