
Comments (17)

idanov commented on May 18, 2024 9

Hi @ulli-snyman, thank you for contributing to Kedro by filing this feature request and potentially adding support for new datasets. GCS support is something we would love to have as part of kedro.contrib.io, and we will be more than happy to welcome contributions for the datasets. I will mark this with the good first issue label, so anyone interested in working on it can pick it up.

from kedro.

nakhan98 commented on May 18, 2024 4

Hi @ulli-snyman, thank you for your interest in contributing to Kedro! I'm the QA on the Kedro team. You should be able to mock out calls to the GCP library. An example is shown here. You can also see how the developers test their client code here. If you need any further assistance, please let us know.
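
A minimal sketch of the mocking approach suggested above, using only the standard library's unittest.mock. The helper load_text is hypothetical (it is not part of Kedro), the bucket/blob/download_as_bytes call chain is assumed to mirror the google-cloud-storage client, and the bucket and blob names are made up.

```python
from unittest import mock


def load_text(client, bucket_name, blob_name):
    """Hypothetical helper: read a blob as text through a GCS-style client."""
    blob = client.bucket(bucket_name).blob(blob_name)
    return blob.download_as_bytes().decode("utf-8")


# Stand-in for google.cloud.storage.Client; no network or credentials needed.
fake_client = mock.MagicMock()
fake_blob = fake_client.bucket.return_value.blob.return_value
fake_blob.download_as_bytes.return_value = b"col1,col2\n1,2\n"

text = load_text(fake_client, "my-test-bucket", "data/test.csv")

# The code under test saw the canned payload and made the expected calls.
assert text == "col1,col2\n1,2\n"
fake_client.bucket.assert_called_once_with("my-test-bucket")
fake_client.bucket.return_value.blob.assert_called_once_with("data/test.csv")
```

Because the client is passed in as an argument, the same function can be exercised with a real client in an integration test and with the mock in unit tests.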


Flid commented on May 18, 2024 2

Looks like there's still no reply, and sure, go for it! @plauto
Thank you in advance for the contribution.


ulli-snyman commented on May 18, 2024 1

Hey @idanov, I'm getting ready for my first PR towards this issue, covering the CSVDataSet method.
I am busy writing tests and have run into a bit of a roadblock: currently there is no functionality to mock a GCS bucket the way the moto package does for AWS.

The common approach for testing with GCP services is to actually read/write to the service.
I'm scratching my head a bit here: I can run the tests with my credentials in a testing project, but that is specific to me, and anyone else wanting to run the tests would need their own GCP project.

I've set the tests to take in GCP configuration from env vars; this is the best way I can see this working out... Would you be fine with this, or have you got any other ideas as to how we could test this?
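
For illustration, a rough sketch of what env-var-driven tests could look like, so that they are skipped rather than failing for anyone without a GCP project. The variable names GCP_TEST_PROJECT and GCP_TEST_BUCKET, the helper gcp_test_config and the test class are all hypothetical; this is not the code from the PR.

```python
import os
import unittest


def gcp_test_config(environ=None):
    """Return GCP test settings from env vars, or None if any are missing."""
    environ = os.environ if environ is None else environ
    project = environ.get("GCP_TEST_PROJECT")  # hypothetical variable name
    bucket = environ.get("GCP_TEST_BUCKET")    # hypothetical variable name
    if not project or not bucket:
        return None
    return {"project": project, "bucket": bucket}


@unittest.skipUnless(gcp_test_config(), "GCP test env vars are not set")
class TestCSVOnGCS(unittest.TestCase):
    """Runs only when the contributor has pointed the tests at a GCP project."""

    def test_config_is_complete(self):
        config = gcp_test_config()
        self.assertTrue(config["project"])
        self.assertTrue(config["bucket"])
        # A real save/load round trip against config["bucket"] would go here.
```

The trade-off is that CI without credentials only ever reports these tests as skipped, which is why mocking the client is usually preferred for the unit-test suite.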


ulli-snyman commented on May 18, 2024 1

Hey @lorenabalan,
Things have been busy my side, will try wrap things up in the PR by the end of the month.


plauto commented on May 18, 2024 1

Hey there,
if that's OK, can I start working on it? I have some experience with GCP and I'd be happy to implement those features :)



lorenabalan commented on May 18, 2024

I've updated the title with our internal ticket number to keep track of this more easily. :)
@ulli-snyman how is this coming along? Do you need any help from our side?


lorenabalan commented on May 18, 2024

"Things have been busy my side, will try wrap things up in the PR by the end of the month."

Totally fine, just wanted to check in and make sure that you're not stuck on something from our end. :)


yetudada commented on May 18, 2024

Hi @plauto! We would love the help! But it might be a good idea to just sync with @ulli-snyman as he mentioned that he has started working on a PR. Let's give him until the end of the week to reply about how far he's gotten and whether or not he needs help. If there's no status update then it's all yours.


plauto commented on May 18, 2024

Sounds good to me! Thanks @yetudada


plauto commented on May 18, 2024

Hey! If that’s ok, can I start working on it this week?


921kiyo commented on May 18, 2024

@plauto How's the development coming along? If you would like our early feedback/comments, feel free to open a draft PR so we can see if you are on the right track :)


plauto commented on May 18, 2024

@921kiyo I am going to push a draft PR. Sorry for being a bit late on this; I only found time to work on it at the end of last week. There are still a couple of things to finish (e.g. unit tests for the versioned dataset, which are a bit complex because of the way I have structured the unit tests). I look forward to getting feedback from you when you have some time. After that it shouldn't take long to finish up the rest!



yetudada commented on May 18, 2024

@ulli-snyman and everyone who has been watching this issue: we're excited to announce that Kedro 0.15.5 will have CSVGCSDataSet, ParquetGCSDataSet and JSONGCSDataSet.

In a following release of Kedro, we will have:

  • Support for Google BigQuery
  • A new series of file-storage-agnostic datasets (CSVDataSet, ParquetDataSet, JSONDataSet, ExcelDataSet, HDFDataSet and PickleDataSet), made possible because we stumbled into fsspec while we were looking at Dask integration; these datasets will support GCS, S3, etc. and simplify our data catalog

I'll close this issue when we have finished full support of GCS.


yetudada commented on May 18, 2024

@ulli-snyman This issue has been addressed, and full Google Cloud support will be available in the next release. The datasets are already available in the develop branch: https://github.com/quantumblacklabs/kedro/blob/develop/kedro/extras/datasets/

They all use fsspec to resolve the filepath: entry, and GCS is among the filesystems in fsspec's registry: https://filesystem-spec.readthedocs.io/en/latest/_modules/fsspec/registry.html
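
To illustrate the idea, here is a simplified, standard-library-only sketch of how a protocol prefix on a filepath can select the storage backend. The function split_protocol and the example paths are hypothetical stand-ins, not Kedro's or fsspec's actual code.

```python
from urllib.parse import urlsplit


def split_protocol(filepath):
    """Return (protocol, path); paths without a scheme default to "file".

    Simplified on purpose: Windows drive letters such as "C:" are not handled.
    """
    parts = urlsplit(filepath)
    if not parts.scheme:
        return "file", filepath
    return parts.scheme, parts.netloc + parts.path


for example in (
    "gcs://my-bucket/data/cars.csv",
    "s3://my-bucket/data/cars.csv",
    "data/01_raw/cars.csv",
):
    print(split_protocol(example))
```

With fsspec installed, the protocol string is what you would hand to fsspec.filesystem(protocol) to get the matching filesystem object; the "gcs" protocol additionally needs the gcsfs package.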
