Comments (7)
- Invoke meltano with meltano run tap-my-source target-my-destination --run-id=abc123
- In the runs table, the record for this run has that persisted on the payload as a new "metadata": {"run-id": "abc123"} field.
@menzenski Would this have a different value to the run_id column in the runs table? If so, I can imagine it could lead to some confusion.
FWIW if you wanna check out the approach, I was able to experiment with a --run-id=... option in #8459 and I'm able to see the value correctly set in the runs table.
from meltano.
@edgarrmondragon sorry for my delayed response here, I was out of office and missed your update - the draft PR https://github.com/meltano/meltano/pull/8459/files looks awesome, that'd totally work for our use case. (I confirmed that Argo Workflows is using v4 UUID strings).
@edgarrmondragon I put Meltano 3.4.0 into production today - we're using this new --run-id flag to set the Meltano run ID to the workflow ID of the Argo Workflows workflow that runs Meltano.
It works great! Huge quality-of-life improvement for us. Thanks so much for implementing this!
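A minimal sketch of how an orchestrator step might assemble that invocation. The --run-id flag and the plugin names come from this thread; the ARGO_WORKFLOW_UID environment variable and the build_meltano_command helper are hypothetical names used only for illustration, and the fallback UUID is a placeholder.

```python
import os
import shlex


def build_meltano_command(run_id: str, blocks: list[str]) -> list[str]:
    """Build the argument list for `meltano run` with an explicit run ID.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return ["meltano", "run", *blocks, f"--run-id={run_id}"]


# Assumption: the workflow exposes its UUID to the step through an
# environment variable; the name ARGO_WORKFLOW_UID is made up for this sketch.
workflow_uid = os.environ.get(
    "ARGO_WORKFLOW_UID", "0b1f6c3e-6d0a-4c52-9d55-2f3c1a9e7b10"
)

cmd = build_meltano_command(
    workflow_uid, ["tap-my-source", "target-my-destination"]
)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) from inside a Meltano project.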
Thanks for filing @menzenski!
If meltano run could accept a --run-id=abc123 CLI argument or similar, that could be persisted as part of the runs table record for that run.
I can imagine this, though we'd prefer to keep the run ID as a UUID to avoid having to create an Alembic migration script, since in Postgres it uses the builtin UUID type.
Uniqueness of run_id is not enforced, but I wonder what problems could come from running two pipelines with the same run ID. Maybe they'd just use the same log file?
Let me know if those restrictions work for you and your workflow, or if you'd need support for arbitrary strings.
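To make the UUID restriction concrete, here is a small sketch of how a caller could check a candidate run ID before passing it along. It uses only the standard-library uuid module; parse_run_id is a hypothetical helper, not part of Meltano, and the sample UUID is a placeholder.

```python
import uuid


def parse_run_id(value: str) -> uuid.UUID:
    """Return the run ID as a UUID, or raise ValueError if it is not one."""
    try:
        return uuid.UUID(value)
    except (ValueError, AttributeError, TypeError) as exc:
        raise ValueError(f"run ID must be a UUID, got {value!r}") from exc


# A well-formed version 4 UUID string is accepted.
run_id = parse_run_id("0b1f6c3e-6d0a-4c52-9d55-2f3c1a9e7b10")
print(run_id, run_id.version)

# An arbitrary string such as "abc123" is rejected.
try:
    parse_run_id("abc123")
except ValueError as err:
    print(err)
```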
If meltano run would expose the run ID of the current job as an environment variable (MELTANO_RUN_ID or similar), we could capture that upon completion of the job and persist it in the argo workflows archive.
I'm certain we could pass down a MELTANO_RUN_ID env var to the plugin's subprocess, but I don't think that would be exposed outside of it, so I'm not sure it could be retrieved.
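The limitation described here is general process behaviour rather than anything Meltano-specific, and it can be sketched with the standard library alone. The example below is not Meltano code: it spawns a plain Python child process, gives it a MELTANO_RUN_ID variable (the name proposed in this thread), and shows that the parent's own environment is unaffected.

```python
import os
import subprocess
import sys

# Start from a parent environment that does not define the variable.
os.environ.pop("MELTANO_RUN_ID", None)

# Pass the variable down to the child process only.
child_env = {**os.environ, "MELTANO_RUN_ID": "abc123"}
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['MELTANO_RUN_ID'])"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)

child_value = result.stdout.strip()               # what the child saw
parent_value = os.environ.get("MELTANO_RUN_ID")   # still unset in the parent

print(f"child: {child_value!r}, parent: {parent_value!r}")
```

An orchestrator that launches the parent process sits one level further out, so it would not see the variable either unless the value were written somewhere it can read, such as a log line or a file.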
If meltano run could accept a --run-id=abc123 CLI argument or similar, that could be persisted as part of the runs table record for that run.
I can imagine this, though we'd prefer to keep the run ID as a UUID to avoid having to create an Alembic migration script, since in Postgres it uses the builtin UUID type.
Uniqueness of run_id is not enforced, but I wonder what problems could come from running two pipelines with the same run ID. Maybe they'd just use the same log file?
Let me know if those restrictions work for you and your workflow, or if you'd need support for arbitrary strings.
Sorry - I wasn't clear in my original message. I wasn't trying to propose that an orchestrator external to meltano should be able to set the meltano run ID. Rather, I was thinking about something like this:
- Invoke meltano with meltano run tap-my-source target-my-destination --run-id=abc123
- In the runs table, the record for this run has that persisted on the payload as a new "metadata": {"run-id": "abc123"} field.
Or similar - it seems that the payload column is "just a JSON-encoded dict" (per meltano/src/meltano/core/job/job.py, line 112 in 2988899; singer_state property).
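As an illustration of the proposal above, the sketch below adds a "metadata" entry to a JSON-encoded payload while leaving any existing keys in place. It is not how Meltano stores anything today: add_run_metadata is a hypothetical helper, and the sample payload contents are assumed for the example.

```python
import json


def add_run_metadata(payload_json: str, run_id: str) -> str:
    """Return a copy of the JSON payload with metadata["run-id"] set.

    Hypothetical helper for illustration; existing payload keys are preserved.
    """
    payload = json.loads(payload_json) if payload_json else {}
    payload.setdefault("metadata", {})["run-id"] = run_id
    return json.dumps(payload)


# Assumed sample payload holding some state under a singer_state key.
existing = json.dumps({"singer_state": {"bookmarks": {}}})
updated = add_run_metadata(existing, "abc123")
print(updated)
```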
If meltano run would expose the run ID of the current job as an environment variable (MELTANO_RUN_ID or similar), we could capture that upon completion of the job and persist it in the argo workflows archive.
I'm certain we could pass down a MELTANO_RUN_ID env var to the plugin's subprocess, but I don't think that would be exposed outside of it, so I'm not sure it could be retrieved.
For our use case, as long as it was available as an environment variable here, when the "block run completed" message is logged (on success or error): meltano/src/meltano/cli/run.py, lines 153 to 209 in 2988899
@edgarrmondragon sorry for my delayed response here, I was out of office and missed your update - the draft PR https://github.com/meltano/meltano/pull/8459/files looks awesome, that'd totally work for our use case. (I confirmed that Argo Workflows is using v4 UUID strings).
Thanks for confirming @menzenski. I'm already in the process of beta testing Meltano 3.4.0 but I could probably slip #8459 in if the team accepts it.
@edgarrmondragon I put Meltano 3.4.0 into production today - we're using this new --run-id flag to set the Meltano run ID to the workflow ID of the Argo Workflows workflow that runs Meltano.
It works great! Huge quality-of-life improvement for us. Thanks so much for implementing this!
I'm glad that it's helpful!