Comments (16)
- +1 on setting the default level of the Kedro logger to
INFO
(if just for backwards compatibility), and then having-q
set it toWARNING
andERROR
And -v
to DEBUG
?
- I don't think the current
logging.yml
logic is the only way to achieve that though- I'm also ambivalent on whether we should change the global logging level to
INFO
Not sure whether @noklam filled you in on the chat we had about #3657 (there's a brief summary there but not all the details). But I think we should definitely not set the global logging level to INFO
. This is actually what used to be done on kedro and I think it was wrong - see #1732 for more.
Agree that the current logging.yml
isn't the only way to achieve setting kedro logger to INFO
. Also discussed in #1732, kedro could just add a Python setLevel
call manually rather than configuring logging from a file config.
from kedro.
Quick 2 cents
- +1 on setting the default level of the Kedro logger to
INFO
(if just for backwards compatibility), and then having-q
set it toWARNING
and-qq
toERROR
- I don't think the current
logging.yml
logic is the only way to achieve that though - I'm also ambivalent on whether we should change the global logging level to
INFO
- I don't think the current
- -1 on keeping the current
logging.yml
logic - whoever wants fine grained control of logs, file logging, rotation etc should be using journald, supervisor, Datadog, or whatever other solution. this is not Kedro's responsibility - +1 on directing users to current best-practices for logging, including creating official integrations and docs for Sentry, OpenTelemetry
from kedro.
Not sure about this. There's a myriad mechanisms to manage environment variables, for example:
- direnv https://direnv.net/ uses
.envrc
(and, only optionally,.env
) - rtx https://github.com/jdx/rtx uses its own config file:
$ cat .rtx.toml
[tools]
python = {version="3.10.11", virtualenv=".venv"}
[env]
GOOGLE_APPLICATION_CREDENTIALS = 'kedro-pypi-stats-dde2342d3c96.json'
- And then of course dotenv https://www.dotenv.org/ uses
.env
I'm not sure what makes .env
special. On one hand we could say "this is what we support", but what if folks are using something else? Also, what if there's a clash between Kedro loading .env
and then direnv reading .envrc
?
I think we should stay away from being too smart about the user environment, and let them manage it.
The core issue is:
With the new kedro new flow, when logging is selected, it is not automatically used until user set
KEDRO_LOGGING_CONFIG
manually.
which is a real user problem. But maybe we need to explore other solutions.
from kedro.
Updated the title to focus more on the logging problem.
from kedro.
@DimedS IIRC it's because of two conflicting features:
- ConfigLoader is able to load from
conf_source
which can be a zip file (default ofkedro package
) - Logging is decoupled from ConfigLoader
This means that we need to read logging without configloader, and it's very common to put logging.yml
in conf
folder even it's not used by the config loader. I agree that should be the default and the environment variable should be a workaround, thus I suggest to load this automatically with our best effort.
Cc @merelcht
from kedro.
Some extra reading material ahead of a TD session:
from kedro.
The solution is basically this: be3d28d in #3577. It should be handles the pattern slightly better, and try to access CONF_SOURCE
if possible (may be not).
from kedro.
(Not adding to the solution but more sharing my view...)
As a user for several years (+5 years) of kedro since the early days before it was even called kedro, I was a bit thrown off that logging was removed by default in 0.19. Personally would still prefer it out of the box enabled by default (instead of having to read the docs to find out how to enable it, I would rather read the docs on how to disable if not needed). Also appreciate my view is one of many out there.
To counter Juan's views:
(counter) +1 on keeping the current logging.yml logic - whoever wants fine grained control of logs, file logging, rotation etc should be using journald, supervisor, Datadog, or whatever other solution. this is not Kedro's responsibility.
This is super helpful to just get people setup with something basic.
(for) +1 on directing users to current best-practices for logging, including creating official integrations and docs for Sentry, OpenTelemetry
More docs on how to integrate is always helpful
Both would be good in my case but it was really nice that logging was configured out of the box for a standard project, just like how we can choose spark to be part of the starter.
Last resort: add documentation on how to configure logging.yaml to replicate old behaviour as we know after_context_created
hook would have the caveat of not logging anything before context is created. It's a one-time setup per new project which is fine but right now there's no advice on the best place to achieve this without the caveat above.
Thankssss!
from kedro.
Related: #3474
It seems that having user input for filepath is not the best idea. Besides, do we have a reason to do this? Why would we want user to have absolute freedom to put this logging file, when in reality they should just packaged with conf_source
? De-coupling the read of logging file from ConfigLoader doesn't mean that we want user to keep the logging config separately.
from kedro.
Supplement on this, this will be a breaking change - because the config will no longer accept a full path but only the name of the files (or only relative to conf_source
)
from kedro.
Another user confused by this https://linen-slack.kedro.org/t/16362029/hi-i-have-recently-updated-the-kedro-version-but-now-i-am-ge#f6a6d4d0-7615-496e-b1a5-18037b944eda
from kedro.
Why is this environment variable necessary? Can we instead observe the project directory for a 'logging.yml' file and use it if found, or default to standard logging options if not?
From what I understand, the environment variable might also introduce a security issue we're currently unable to address, as outlined here: (#3474)
from kedro.
I think we should consider @noklam's #3591 seriously, abstain from fiddling with user-specified logging settings (hence avoid #3474 instead of ignoring it), and if the user wants to provide a file-based logging config, they can always do so in their own code and place it wherever they like. It's clear that logging is an ongoing pain (almost 2 years since #1470 was opened), and we haven't yet fully solved it.
from kedro.
This issue is partially overlap to #3591
- Automatic enable logging.yml and
kedro run -v
is slightly overlap - logging.yml affect beyonds
kedro run
, do we need to have-v
for all kedro command? - The default of the
logging.yml
is more independent, as I am challenging the idea of having too manyLEVEL
default there. This video seems to support that everything should propagate toroot
unless necessary. Only for rare case you need non-root level logger filtering. (info-handler)
I am particularly interested in your opinion about this @antonymilne.
from kedro.
I'm a bit out of the loop here and will talk to @noklam to share my thoughts on it all in more detail, but some rough points:
- the current (I think?) environment variable solution was added after a lot of thought and consideration with @idanov. You should definitely pick his brains on the topic too since he will definitely have some useful things to say here
- the kedro handling of logging is indeed a direct departure from the 12 factor app guidance. I believe this was a conscious decision made in the early days of kedro since users wouldn't know how to e.g. show INFO level messages or direct their logs to a file easily. Hence the logging functionality was built into kedro out of the box for better UX. Possibly expertise of users and logging tools has changed since then, but the handling of logging in kedro has always been a matter of "practicality beats purity". 12FA is useful guidance but not gospel on kedro, e.g. our handling of run environments does exactly what 12FA says not to do
- personally I don't care much about the file-based logging, but being able to expose INFO level messages by default I think is important. There is an annoying conflict here where the default Python log level of WARNING contradicts with the fact that it's assumed to be good UX to update the user with INFO on progress as their pipeline runs. Of course this assumption could be questioned with some user research since AFAIK we've always just taken it for granted. Remember that kedro pipelines can take a loooong time, and unless there's some other way of demonstrating progress (e.g. a progress bar, which we did also try in a hackathon and I liked) then I feel like kedro should display INFO messages by default. So maybe a better way to handle it would instead be a
--quiet
option that changes the level from INFO to WARNING? - generally agree that kedro should interfere as little as possible with Python's default logging system, e.g. by delegating as much as possible to the root logger. But with the caveat that again this is a question of purity vs. practicality, and Python default's logging level of WARNING doesn't seem suitable on kedro, and the default format of log messages is also pretty gross.
from kedro.
I notice the discussion in this thread are mainly related to #3591, though they are related and maybe overlapping.
I have this comment in #3657 #3657 (comment)
Automatic loading logging is fine, but there are edge cases with CONF_SOURCE. I think it's still an improvement and it doesn't make anything worse in the current states.
The idea is almost the same what @DimedS suggested, which we try to locate the logging.yml
in the project directory. There are some complication of whether CONF_SOURCE
will be available. In any case, having a non-default CONF_SOURCE
is not the majority case, having logging.yml
not inside the conf
folder is even rarer. This is a 90% solution and maybe good enough unless we come up with something better.
from kedro.
Related Issues (20)
- Create a plugin for `ruff`/`flake8` that would check for bad practices specific to Kedro
- ci: Nightly build failure on `develop` HOT 7
- Deprecate micropackaging HOT 7
- DataCatalog.shallow_copy() destroys any `CustomDataCatalog` type object HOT 1
- ci: Nightly build failure on `main` HOT 3
- Backfill some old documentation versions with static assets HOT 15
- Assess Kedro performance for complex pipelines
- Release 0.19.6
- Broken link in docs guidelines/standards
- ci: Nightly build failure on `main`
- Use `click` choice options to simplify validation of project creation workflow
- Implement version fallback for starters & framework
- Make `cookiecutter` optional / not a core dependency of `kedro` HOT 1
- Default to `raise_errors=True` in `find_pipelines` HOT 1
- Allow returning modified dataset in `after_dataset_loaded` and `before_dataset_saved` hook HOT 5
- ci: Nightly build failure on `main` HOT 2
- Investigate performance of config loading for big projects HOT 1
- Logs for packaged Kedro projects are confusing
- Move PySpark to end of Tools prompt
- Monthly issue metrics report
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from kedro.