godatadriven / databricks-cdk
Deployment of databricks resources with cdk
License: MIT License
We should move towards databricks-sdk for all the API requests. This will make the project more maintainable in the future, as we won't need to write the API calls to deploy Databricks resources ourselves.
First let's merge #1052 so we can create new branches for moving the different resources. I think the best plan of attack is to create a new pull request for each resource, or at least a separate commit, so we can easily track which ones have changed.
Made a pull request for this feature:
#833
To define the permission levels I opted for an enum. This makes sure that on the implementation side you can never select the wrong level. This might be something useful to implement on the other permission types as well.
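As a rough illustration of the enum idea (not the code from the pull request), the sketch below assumes job-style permission levels. The names `JobPermissionLevel` and `access_control_entry` are hypothetical, and the level values are used here only as examples.

```python
from enum import Enum


class JobPermissionLevel(str, Enum):
    """Hypothetical enum; the members are example permission level names."""

    CAN_VIEW = "CAN_VIEW"
    CAN_MANAGE_RUN = "CAN_MANAGE_RUN"
    IS_OWNER = "IS_OWNER"
    CAN_MANAGE = "CAN_MANAGE"


def access_control_entry(user_name: str, level) -> dict:
    """Build one access-control entry (hypothetical helper).

    Accepts an enum member or its string value; anything else is rejected.
    """
    # Enum lookup by value raises ValueError for an undefined level.
    level = JobPermissionLevel(level)
    return {"user_name": user_name, "permission_level": level.value}
```

With this shape, a misspelled level fails when the entry is built instead of surfacing later as an API error during deployment.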
We should add support for volumes to Databricks CDK
Currently the job permissions API is missing. This is needed to set permissions on created jobs/workflows.
Currently I am not able to get the generated token_value from the Construct. Looking at, for example, the credentials construct, I see that there is a public function attached that returns an attribute string. Would this be required for the token construct too?
I wrote some code to be able to generate tokens in cdk.
The tricky thing with tokens is that they have to be created anew every time they are deployed. This is because a later request to the Databricks API cannot return the actual token_value (I assume this is for security reasons).
The implementation can be found here:
#808
When deploying, it is hard to manage the number of calls to the Databricks API endpoints. In particular, 429 (Too Many Requests) errors should be retried, with backoff of course.
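One way the retry could look is sketched below, assuming exponential backoff with a little jitter. The helper `call_with_backoff` and its parameters are hypothetical and not part of databricks-cdk; `send` stands for any function that performs one API call and returns a status code and a parsed body.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it stops returning 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        status, body = send()
        attempt += 1
        # Only 429 is retried; everything else goes back to the caller.
        if status != 429 or attempt >= max_attempts:
            return status, body
        # Delay doubles per attempt (1s, 2s, 4s, ...) and is capped.
        delay = min(max_delay, base_delay * 2 ** (attempt - 1))
        # Up to 10% jitter so parallel deployments do not retry in lockstep.
        sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` argument is injectable so the behaviour can be tested without waiting. If the API sends a `Retry-After` header, honouring it would be a reasonable refinement.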
Databricks workspaces have an instance of MLflow running. There are two levels of this API:
The Databricks-only API:
https://docs.databricks.com/dev-tools/api/latest/mlflow.html
This part is not part of the open-source MLflow package, but it can be used for some unique features of the MLflow experience within Databricks.
The open-source MLflow part of the API, which is documented here:
https://www.mlflow.org/docs/latest/rest-api.html#create-experiment
The MLflow API is called through the same endpoint as the Databricks-only features.
I want to create experiments in Databricks using cdk. Would it be OK to build this feature into the databricks-cdk package?
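For reference, a minimal sketch of what the underlying call might look like, assuming the open-source create-experiment route is served under the workspace's `/api` prefix. The function name, the bearer-token auth, and the argument names are assumptions for illustration, not part of databricks-cdk. The request is only built here, not sent.

```python
import json
import urllib.request


def build_create_experiment_request(
    host: str, token: str, name: str
) -> urllib.request.Request:
    """Build (but do not send) a create-experiment request (hypothetical helper)."""
    # Assumed path: the MLflow REST route 2.0/mlflow/experiments/create under /api.
    url = f"{host.rstrip('/')}/api/2.0/mlflow/experiments/create"
    payload = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it would be a matter of passing the result to `urllib.request.urlopen`. A deploy handler would probably also need to handle the case where an experiment with that name already exists.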
There seems to be an issue with creating the deploy lambda from the databricks-cdk code. For me this issue has appeared since, I think, the 1.0 release. I get the following error:
[100%] fail: docker build --build-arg version=v0.11.8 --build-arg randomHash= --tag cdkasset-843fff560e25ec2fb84c3a073380084c6619f6ce268f7b78a4d9dd7adc827858 --file Dockerfile . exited with error code 1: #2 [internal] load .dockerignore
#2 sha256:6a66d730e6a982f7732660169b9ffa2ccdaf4d3806b0e3bee5b0746de5fe62e0
#2 transferring context: 2B done
#2 DONE 0.0s
#1 [internal] load build definition from Dockerfile
#1 sha256:c4c27166159593b0e3be33682c967b34d4406e67c282f6e6c83dfe281ae86d06
#1 transferring dockerfile: 84B done
#1 DONE 0.0s
#3 [internal] load metadata for docker.io/godatadriven/databricks-cdk-lambda:v0.11.8
#3 sha256:80729af1dd5f6df163a36275e6b80108e9a5c7f671a23bf7ccfd50084ebab95a
#3 ERROR: failed to do request: Head "https://172.27.51.159:5000/v2/godatadriven/databricks-cdk-lambda/manifests/v0.11.8?ns=docker.io": http: server gave HTTP response to HTTPS client
------
> [internal] load metadata for docker.io/godatadriven/databricks-cdk-lambda:v0.11.8:
------
godatadriven/databricks-cdk-lambda:v0.11.8: failed to do request: Head "https://172.27.51.159:5000/v2/godatadriven/databricks-cdk-lambda/manifests/v0.11.8?ns=docker.io": http: server gave HTTP response to HTTPS client
❌ Building assets failed: Error: Building Assets Failed: Error: Failed to build one or more assets. See the error messages above for more information.
at buildAllStackAssets (/builds/datascience/platform/datascience-resources/deploy/node_modules/aws-cdk/lib/build.ts:21:11)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at CdkToolkit.deploy (/builds/datascience/platform/datascience-resources/deploy/node_modules/aws-cdk/lib/cdk-toolkit.ts:174:7)
at initCommandLine (/builds/datascience/platform/datascience-resources/deploy/node_modules/aws-cdk/lib/cli.ts:349:12)
I am not sure what's going on. I am able to pull the image, but the request for the manifests to "https://172.27.51.159:5000/v2/godatadriven/databricks-cdk-lambda/manifests" also does not work locally.
This is a bug. tokenName needs to be a property on the TypeScript side as well.