Comments (6)
@htaidirt, thank you for your contribution. Please find some comments below.
It is not clear how one can best deploy Kedro with credentials for production.
Currently Kedro supports deploying credentials for production via configuration environments. The default project template contains two of them: conf/base, which is intended for storing non-sensitive, shareable configuration, and conf/local, where you can put sensitive credentials. conf/local is in .gitignore by default. Please find more information on how the Kedro configuration module works in this section of the documentation.
I create a credentials.yml file in the config/base/ folder, because this folder is ignored by Git but is still packaged in the Kedro Docker image. This is still not good because the credentials are now located in the Docker image repository: anyone can pull that image and get prod credentials!
conf/base is indeed copied into the Docker image; however, as the documentation suggests, it is not intended to store any credentials. You should rather store your credentials in conf/local/credentials.yml, which is in .dockerignore by default.
Using environment variables will help standardize the deployment of Kedro like any app, thus reducing the learning curve for developers.
Currently you can manually construct/enrich your credentials dictionary in src/<package_name>/run.py with any data, including one coming from the environment variables.
In the long term we are considering adding templating capability for Kedro configs, which may handle environment variables; however, the exact specification hasn't been finalised yet.
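As an illustration of building a credentials dictionary from environment variables, here is a minimal sketch using only the standard library. The helper name credentials_from_env and the names my_credential and MY_ENV_VAR are hypothetical, and the sketch does not show how the resulting dictionary is handed to Kedro inside run.py, which depends on the Kedro version.

```python
import os
from typing import Dict, Mapping, Optional


def credentials_from_env(
    mapping: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Build a credentials dict from environment variables.

    `mapping` maps a credential name to the environment variable that
    holds its value. Both names are illustrative, not part of Kedro.
    """
    # Fall back to the real process environment when none is supplied.
    environ = os.environ if environ is None else environ

    # Fail early and list every missing variable at once.
    missing = sorted(var for var in mapping.values() if var not in environ)
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))

    return {name: environ[var] for name, var in mapping.items()}


# Example with an explicit environment so the result is deterministic.
example = credentials_from_env(
    {"my_credential": "MY_ENV_VAR"},
    environ={"MY_ENV_VAR": "secret-value"},
)
```

Raising on a missing variable, rather than silently storing None, makes a misconfigured deployment fail at start-up instead of at the first dataset load.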
from kedro.
I posted my solution in "How do I fill the credentials from environment variables" (#403).
Thanks for the help.
from kedro.
We have updated our docs regarding credentials in 5f4325f. So I am closing this for now, but feel free to re-open it if this is still an issue :)
from kedro.
Hello,
I am trying to implement the following solution.
Currently you can manually construct/enrich your credentials dictionary in src/<package_name>/run.py with any data, including one coming from the environment variables.
When I clone the project, the file conf/local/credentials.yml is not present.
I want it to be created from environment variables in the following situations:
- Running kedro run from the command line
- Running context = load_context(MY_PATH) from a notebook (not a Kedro notebook)
For this I modified run.py as follows:
import os
from typing import Dict

import yaml

# KedroContext, Pipeline and create_pipelines are imported as in the
# project template's run.py (those imports are omitted here).


class ProjectContext(KedroContext):
    project_name = "my_project"
    project_version = "0.16.1"
    package_name = "my_package"

    def __init__(self, project_path, **kwargs):
        super().__init__(project_path, **kwargs)
        self._set_credentials()

    def _set_credentials(self):
        # Write conf/local/credentials.yml from environment variables.
        kedro_project_dir = self.project_path
        credential_file = os.path.join(kedro_project_dir, "conf", "local", "credentials.yml")
        credentials = {
            "my_credential": os.getenv("MY_ENV_VAR"),
        }
        with open(credential_file, "w") as file:
            yaml.dump(credentials, file)

    def _get_pipelines(self) -> Dict[str, Pipeline]:
        return create_pipelines()
This does the job for now, but is it future-proof?
(I guess the __init__ arguments of the superclass KedroContext might change in the future.)
Is there a better way to do this?
from kedro.
@HugoPerrier Can you please open a new issue and cross-link to this thread in there? Otherwise we won't get much visibility on the updates in closed issues.
Regarding your question: There might be a better way of handling your credentials using TemplatedConfigLoader. You can create conf/local/credentials.yml manually and add templated credentials in there (similar to catalog.yml in the example above). That way the file won't contain any static secrets and can be committed to the repository. The actual secret values will come from the corresponding environment variables, which will be resolved by TemplatedConfigLoader at runtime.
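To make the idea concrete, here is a standard-library sketch of what resolving templated credentials means. It is not Kedro's implementation: the function resolve_placeholders, the ${NAME} placeholder syntax as handled here, and the names DB_USER and DB_PASSWORD are assumptions made for illustration. In a real project the substitution would be done by TemplatedConfigLoader itself, with the values supplied to it as described in the Kedro documentation for your version.

```python
import re
from typing import Any, Mapping

# Matches placeholders such as ${DB_USER}.
_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def resolve_placeholders(config: Any, values: Mapping[str, str]) -> Any:
    """Recursively replace ${NAME} placeholders in a nested config."""
    if isinstance(config, dict):
        return {key: resolve_placeholders(val, values) for key, val in config.items()}
    if isinstance(config, list):
        return [resolve_placeholders(item, values) for item in config]
    if isinstance(config, str):
        # A missing name raises KeyError rather than leaving the placeholder in.
        return _PLACEHOLDER.sub(lambda match: values[match.group(1)], config)
    return config


# What a committed credentials.yml could look like once parsed: no secrets.
templated = {
    "my_credential": {"username": "${DB_USER}", "password": "${DB_PASSWORD}"}
}

# In practice the values would come from os.environ; a plain dict is used
# here so the example is deterministic.
resolved = resolve_placeholders(
    templated, {"DB_USER": "alice", "DB_PASSWORD": "s3cret"}
)
```

Because only placeholders are stored in the file, the same credentials.yml works on a developer machine, in CI, and in production, as long as each environment sets the corresponding variables.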
from kedro.
I am not sure how I can use the credentials in a CI environment; I cannot just put the YAML file in the local/ folder.
from kedro.