
giorgiobasile / prefect-planetary-computer


Prefect integrations with Microsoft Planetary Computer.

Home Page: https://giorgiobasile.github.io/prefect-planetary-computer/

License: Apache License 2.0

Python 100.00%
dask-gateway data-engineering earth-observation planetary-computer prefect stac

prefect-planetary-computer's People

Contributors

dependabot[bot], giorgiobasile


prefect-planetary-computer's Issues

Add specialised TaskRunner for PC Dask Gateway

Expectation / Proposal

Add a PlanetaryComputerTaskRunner, inheriting from prefect_dask.DaskTaskRunner, with PC-related defaults.

Traceback / Example

import dask
from prefect import flow, task
from prefect_dask import get_dask_client  # needed by compute_task below
from prefect_planetary_computer import PlanetaryComputerCredentials
from prefect_planetary_computer.task_runners import PlanetaryComputerTaskRunner

pc_task_runner = PlanetaryComputerTaskRunner(
    credentials=PlanetaryComputerCredentials.load("BLOCK_NAME")
)

@task
def compute_task():
    with get_dask_client() as client:
        df = dask.datasets.timeseries("2000", "2001", partition_freq="4w")
        summary_df = client.compute(df.describe())
    return summary_df

@flow(task_runner=pc_task_runner)
def dask_flow():
    prefect_future = compute_task.submit()
    return prefect_future.result()

Pydantic v2 support from Prefect main branch

Expectation / Proposal

Prefect now supports pydantic v2, and this is breaking the tests that run against the Prefect main branch.

The easiest way to fix it is to import from pydantic.v1 as explained in the pydantic migration guide.

Traceback / Example

from pydantic import VERSION as PYDANTIC_VERSION
if PYDANTIC_VERSION.startswith("2."):
    from pydantic.v1 import Field, SecretStr
else:
    from pydantic import Field, SecretStr
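The check above compares the version string by prefix. A minimal sketch of the same decision made on the parsed major version is shown here; the helper name `needs_v1_shim` is hypothetical, and it assumes the `pydantic.v1` namespace remains the right import target for any major version from 2 onward, which the issue does not state.

```python
def needs_v1_shim(pydantic_version: str) -> bool:
    """Return True when the major version is 2 or later.

    Hypothetical helper: it only parses the version string and does not
    import pydantic itself.
    """
    major = int(pydantic_version.split(".", 1)[0])
    return major >= 2


# The caller would pass pydantic.VERSION; literal strings are used here
# so the sketch runs without pydantic installed.
v1_result = needs_v1_shim("1.10.12")
v2_result = needs_v1_shim("2.1.1")
print(v1_result, v2_result)
```

With this helper, the conditional import would branch on `needs_v1_shim(PYDANTIC_VERSION)` instead of `PYDANTIC_VERSION.startswith("2.")`.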

Windows Tests actions fail

Problem

The Windows Tests actions fail during the "Install dependencies" step.

Traceback / Example

Getting requirements to build editable: started
  Getting requirements to build editable: finished with status 'error'
  error: subprocess-exited-with-error
  
  Getting requirements to build editable did not run successfully.
  exit code: 1
  
  [21 lines of output]
  Traceback (most recent call last):
    File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
      main()
    File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 132, in get_requires_for_build_editable
      return hook(config_settings)
    File "C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-d5o5hepf\overlay\Lib\site-packages\setuptools\build_meta.py", line 450, in get_requires_for_build_editable
      return self.get_requires_for_build_wheel(config_settings)
    File "C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-d5o5hepf\overlay\Lib\site-packages\setuptools\build_meta.py", line 341, in get_requires_for_build_wheel
      return self._get_build_requires(config_settings, requirements=['wheel'])
    File "C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-d5o5hepf\overlay\Lib\site-packages\setuptools\build_meta.py", line 323, in _get_build_requires
      self.run_setup()
    File "C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-d5o5hepf\overlay\Lib\site-packages\setuptools\build_meta.py", line 487, in run_setup
      super(_BuildMetaLegacyBackend,
    File "C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-d5o5hepf\overlay\Lib\site-packages\setuptools\build_meta.py", line 338, in run_setup
      exec(code, locals())
    File "<string>", line 12, in <module>
    File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\encodings\cp1252.py", line 23, in decode
      return codecs.charmap_decode(input,self.errors,decoding_table)[0]
  UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 2394: character maps to <undefined>
  [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

Getting requirements to build editable did not run successfully.
exit code: 1

See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Error: Process completed with exit code 1.
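The traceback ends in the `cp1252` codec, called from line 12 of the setup script. One plausible cause, which the log does not confirm, is that the script reads a UTF-8 text file without an explicit encoding, so Windows falls back to its default code page. The sketch below reproduces that failure mode with a temporary file and shows the usual remedy of passing `encoding="utf-8"`; the file name and contents are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file standing in for whatever the setup script reads.
    readme = Path(tmp) / "README.md"
    # "č" encodes to the bytes 0xC4 0x8D in UTF-8; 0x8D is undefined in cp1252.
    readme.write_bytes("č".encode("utf-8"))

    # Decoding with cp1252 (a common default on Windows) fails on byte 0x8D.
    try:
        readme.read_text(encoding="cp1252")
        failed_with_cp1252 = False
    except UnicodeDecodeError:
        failed_with_cp1252 = True

    # Passing the encoding explicitly makes the read platform-independent.
    text = readme.read_text(encoding="utf-8")

print(failed_with_cp1252, text)
```

If this is indeed the cause, the equivalent change in the setup script would be to add `encoding="utf-8"` to the `open()` or `read_text()` call that loads the file.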

Add credentials block method to create a new Dask cluster

Expectation / Proposal

The PlanetaryComputerCredentials block should expose a method that creates a new Dask cluster, reusing the Dask Gateway factory already implemented.

Traceback / Example

from prefect_planetary_computer import PlanetaryComputerCredentials

pc_credentials_block = PlanetaryComputerCredentials(
    subscription_key="sub-key",
    hub_api_token="hub-token",
)

cluster = pc_credentials_block.new_dask_gateway_cluster()
cluster.adapt(minimum=2, maximum=10)

# use client for computations 
client = cluster.get_client()

Provide configured DaskTaskRunner instead of subclassed task runner

Expectation / Proposal

Instead of defining a new specific class for a DaskTaskRunner using the PC gateway, it would be better to provide the user with a credentials getter that accepts PC cluster options and returns a base DaskTaskRunner configured with the necessary defaults.

Traceback / Example

from prefect import flow
from prefect_planetary_computer import PlanetaryComputerCredentials

pc_credentials = PlanetaryComputerCredentials.load("BLOCK_NAME")
pc_runner = pc_credentials.get_dask_task_runner()

@flow(task_runner=pc_runner)
def my_flow():
    ...
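The design choice in this proposal, a getter that merges built-in defaults with user-supplied cluster options and returns a plain runner, can be sketched with stand-in classes. Nothing below is the real prefect-dask or prefect-planetary-computer API: `StandInTaskRunner`, `StandInCredentials`, the `gateway_address` placeholder, and the option keys are all assumptions made so the example runs on the standard library alone.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class StandInTaskRunner:
    """Stand-in for a generic Dask task runner (not the prefect_dask class)."""

    cluster_class: str
    cluster_kwargs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class StandInCredentials:
    """Stand-in for the credentials block described in the issue."""

    hub_api_token: str
    # Placeholder address, not a real endpoint.
    gateway_address: str = "https://gateway.example.invalid"

    def get_dask_task_runner(
        self, cluster_kwargs: Optional[Dict[str, Any]] = None
    ) -> StandInTaskRunner:
        # Illustrative default keys; real gateway parameters may differ.
        defaults = {"address": self.gateway_address, "token": self.hub_api_token}
        # User-supplied options override the defaults on key collisions.
        merged = {**defaults, **(cluster_kwargs or {})}
        return StandInTaskRunner(
            cluster_class="dask_gateway.GatewayCluster",
            cluster_kwargs=merged,
        )


credentials = StandInCredentials(hub_api_token="hub-token")
runner = credentials.get_dask_task_runner({"worker_cores": 2})
print(runner.cluster_kwargs)
```

Compared with a dedicated subclass, this keeps the returned object an instance of the base runner type, so anything that already accepts the base runner keeps working unchanged.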

Provide a PC credentials block and related clients

Provide a credentials block that handles the following:

Expectation / Proposal

The user should be able to load the credentials block and easily request a PySTAC client and a Dask Gateway client that are already configured to interact with PC data and services. The clients will be instantiated with PC-related defaults, but it should remain possible to pass any other parameters accepted by the underlying libraries, pystac-client and dask-gateway.

Traceback / Example

from prefect_planetary_computer import PlanetaryComputerCredentials
pc_credentials_block = PlanetaryComputerCredentials.load("BLOCK_NAME")

pc_stac_client = pc_credentials_block.get_pystac_client()
# do something with it (e.g. query a given collection)

gateway_client = pc_credentials_block.get_dask_gateway_client()
# do something with it (e.g. instantiate a new cluster)
