kestra-io / plugin-scripts

Home Page: https://kestra.io/plugins/

License: Apache License 2.0

Java 99.24% JavaScript 0.35% Python 0.41%
kestra groovy jython plugin nodejs python shell powershell r julia

plugin-scripts's Introduction

Kestra workflow orchestrator

Event-Driven Declarative Orchestrator

Kestra is an infinitely scalable orchestration and scheduling platform.

Get started with Kestra in 4 minutes.

Kestra Scripts Plugin

Plugin to orchestrate custom scripts in any language including Python, Node.js, R, Shell, PowerShell, and more.

Kestra orchestrator

Documentation

License

Apache 2.0 © Kestra Technologies

Stay up to date

We release new versions every month. Give the main repository a star to stay up to date with the latest releases and get notified about future updates.

Star the repo

plugin-scripts's People

Contributors

anna-geller, brian-mulier-p, dependabot[bot], fhussonnois, jinsyin, loicmathieu, npranav10, shrutimantri, skraye, smantri-moveworks, tchiotludo, ttmott, wrussell1999


plugin-scripts's Issues

Docker runner doesn't correctly output values in Script tasks

Describe the issue

Reproducer:

id: test_send_param
namespace: tests
tasks:
  - id: task1
    type: io.kestra.plugin.scripts.python.Script
    runner: DOCKER
    beforeCommands:
      - pip install kestra
    script: |
      from kestra import Kestra

      print("1234\n\n")

      Kestra.outputs({'secrets': "test string"})
  - id: task2
    type: io.kestra.core.tasks.log.Log
    message: "{{ outputs.task1.vars.secrets }}"
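For context, the kestra Python package communicates outputs by printing a marker line to stdout, which the runner's log parser then extracts. A minimal sketch of that marker format, assuming the `::{...}::` convention used by Kestra's log parser (the helper name is illustrative, not the package's actual API):

```python
import json

def emit_outputs(outputs: dict) -> str:
    """Build the stdout marker that Kestra's log parser scans for.

    Assumption: the marker is ::{"outputs": {...}}:: printed on a
    single line, as done by the kestra pip package.
    """
    return "::" + json.dumps({"outputs": outputs}) + "::"

# The Docker runner has to capture this stdout line intact for the
# downstream task to see outputs.task1.vars.secrets.
line = emit_outputs({"secrets": "test string"})
print(line)
```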

Environment

  • Kestra Version: develop

Bad Docker API call if a task's `containerImage` contains a tag

Describe the issue

When the containerImage property of a task contains a value that includes a tag (specified after `:`), the generated API call to the Docker (or, in our case, Podman) daemon contains something like the following:

/images/create?fromImage=ghcr.io%2Fkestra-io%2Fkestrapy%3Alatest&tag=latest

Notice that the tag is supplied in two places: once in the image name itself (fromImage) and once in the tag parameter. I assume Docker proper handles this more gracefully; otherwise it would never have made it into a stable Kestra release. The Podman daemon, however, normalizes this to ghcr.io/kestra-io/kestrapy:latest:latest and fails with "invalid reference format". This is logged by the Podman daemon:

time="2024-09-06T15:54:19Z" level=debug msg="Looking up image \"ghcr.io/kestra-io/kestrapy:latest:latest\" in local containers storage"
time="2024-09-06T15:54:19Z" level=info msg="Request Failed(Internal Server Error): normalizing image: normalizing name for compat API: invalid reference format"
@ - - [06/Sep/2024:15:54:19 +0000] "POST /images/create?fromImage=ghcr.io%2Fkestra-io%2Fkestrapy%3Alatest&tag=latest HTTP/1.1" 500 174 "" "Apache-HttpClient/5.3.1 (Java/21.0.4)"

No further communication between Kestra and the daemon occurs and Kestra fails the task execution. The log in Kestra is not very helpful and only mentions a "broken pipe" in the Docker client:

java.lang.RuntimeException: java.io.IOException: Broken pipe
 at com.github.dockerjava.httpclient5.ApacheDockerHttpClientImpl.execute(ApacheDockerHttpClientImpl.java:210)
 ...

This can be worked around by explicitly setting containerImage to a value without a tag, such as ghcr.io/kestra-io/kestrapy. However, this means a non-latest tag cannot be used at all.

This was discovered because the default value changed from a simple python to ghcr.io/kestra-io/kestrapy:latest (commit 88b7352).

Note that if a tag other than latest is provided, it gets reflected in both places in the call:

time="2024-09-06T21:18:53Z" level=debug msg="Looking up image \"python:3.12.5:3.12.5\" in local containers storage"
time="2024-09-06T21:18:53Z" level=info msg="Request Failed(Internal Server Error): normalizing image: normalizing name for compat API: invalid reference format"
@ - - [06/Sep/2024:21:18:53 +0000] "POST /images/create?fromImage=python%3A3.12.5&tag=3.12.5 HTTP/1.1" 500 174 "" "Apache-HttpClient/5.3.1 (Java/21.0.4)"
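A client-side fix would split the tag off the image reference before issuing the API call, passing the repository in fromImage and the tag separately. A minimal sketch of that normalization (not the actual Kestra implementation; the helper name is hypothetical). Note that a `:` only denotes a tag when it appears after the last `/`, since it can otherwise belong to a registry host:port prefix:

```python
def split_image_reference(image: str) -> tuple:
    """Split a Docker image reference into (repository, tag).

    A ':' separates a tag only if it occurs after the last '/';
    otherwise it is part of a registry host:port prefix.
    """
    last_segment = image.rsplit("/", 1)[-1]
    if ":" in last_segment:
        repo, tag = image.rsplit(":", 1)
        return repo, tag
    return image, "latest"

# The /images/create call should then use fromImage=repo and tag=tag,
# never the raw reference in both places.
repo, tag = split_image_reference("ghcr.io/kestra-io/kestrapy:latest")
```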

Environment

  • Kestra Version: 0.18.4 (docker.io/kestra/kestra:v0.18.4)
  • Operating System (OS/Docker/Kubernetes): Rocky Linux 9.4
$ podman version
Client:       Podman Engine
Version:      4.9.4-rhel
API Version:  4.9.4-rhel
Go Version:   go1.21.11 (Red Hat 1.21.11-1.el9_4)
Built:        Wed Jul 24 06:07:55 2024
OS/Arch:      linux/amd64

conda activate does not work as shown in the Python Commands task example

Expected Behavior

The examples shown on this page: https://kestra.io/plugins/tasks/io.kestra.plugin.scripts.python.commands should work seamlessly.

Actual Behaviour

The first example with Conda completes, but "conda activate myCondaEnv" does not actually run successfully.
The logs contain the warning "/bin/sh: 1: conda: not found", implying the conda command was not found.

Steps To Reproduce

Take the first example on the Commands task page:

id: python_venv
namespace: dev

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.python.Commands
    namespaceFiles:
      enabled: true
    runner: PROCESS
    beforeCommands:
      - conda activate myCondaEnv
    commands:
      - python etl_script.py

And execute the flow.
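The warning suggests that conda is either not on the PATH of the non-interactive /bin/sh running beforeCommands, or its shell hook has not been sourced, so the conda activate shell function is undefined. A hedged sketch of a workaround that prefixes the user's commands with the hook (the conda.sh path is an assumption that depends on the installation; the helper name is illustrative):

```python
def wrap_with_conda(commands, env_name,
                    conda_sh="/opt/conda/etc/profile.d/conda.sh"):
    """Prefix commands so `conda activate` works under a non-interactive
    /bin/sh, where conda's shell function is not defined by default.

    The conda.sh location is an assumption; adjust it to your install.
    """
    prefix = [f". {conda_sh}", f"conda activate {env_name}"]
    return prefix + list(commands)

# Commands a task could run instead of a bare `conda activate`.
cmds = wrap_with_conda(["python etl_script.py"], "myCondaEnv")
```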

Environment Information

  • Kestra Version: 0.14.4
  • Plugin version: 0.14.4
  • Operating System (OS / Docker / Kubernetes): Docker
  • Java Version (If not docker): N/A

Example flow

id: python_venv
namespace: dev

tasks:

  • id: hello
    type: io.kestra.plugin.scripts.python.Commands
    namespaceFiles:
    enabled: true
    runner: PROCESS
    beforeCommands:
    • conda activate myCondaEnv
      commands:
    • python etl_script.py

Docker runner split logs in multi-line if too long

Expected Behavior

No response

Actual Behaviour

See docker-java/docker-java#1647, docker-java/docker-java#1754, and docker-java/docker-java#2050, which have not been addressed (and likely won't be soon).

This is due to the internal engine of the docker-java client, which uses a fixed-size buffer that prevents a log frame from being more than 1024 bytes long (longer logs are split across multiple lines).
See the attached screenshot, where a log gets split into multiple lines.

This is particularly an issue when you want, for example, to send a token from task to task: tokens can be very long, causing the echoed ::{"outputs" marker to be split and hence not recognized by the current log parser.
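A client-side workaround would be to merge consecutive frames until a frame shorter than the buffer (or ending in a newline) arrives. A minimal sketch under the assumption that a full 1024-byte frame without a trailing newline continues in the next frame (the helper name is illustrative):

```python
def reassemble_frames(frames, max_frame=1024):
    """Merge consecutive log frames split by a fixed-size buffer.

    Assumption: a frame of exactly `max_frame` bytes that does not end
    in a newline continues in the next frame. This sketches a consumer-
    side workaround for docker-java's 1024-byte frame limit.
    """
    lines, buffer = [], ""
    for frame in frames:
        buffer += frame
        if len(frame) < max_frame or frame.endswith("\n"):
            lines.append(buffer)
            buffer = ""
    if buffer:
        lines.append(buffer)
    return lines

# A 1500-character log line arrives as two frames of 1024 + 477 bytes.
long_line = "x" * 1500 + "\n"
frames = [long_line[:1024], long_line[1024:]]
merged = reassemble_frames(frames)
```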

Steps To Reproduce

No response

Environment Information

  • Kestra Version:
  • Plugin version:
  • Operating System (OS / Docker / Kubernetes):
  • Java Version (If not docker):

Example flow

No response

Add support for uv in Python script plugins (python, dbt, singer)

Add support for uv:

Working example that would be great to support in a less verbose way:

id: script_in_venv
namespace: dev
tasks:
  - id: bash
    type: io.kestra.plugin.scripts.python.Commands
    runner: DOCKER
    docker: 
      image: python:3.12-slim
    inputFiles:
      main.py: |
        import requests
        from kestra import Kestra

        response = requests.get('https://google.com')
        print(response.status_code)
        Kestra.outputs({'status': response.status_code, 'text': response.text})
    beforeCommands:
      - pip install uv
      - uv venv --quiet
      - . .venv/bin/activate --quiet
      - uv pip install --quiet requests kestra
    commands:
      - python main.py

This should work:

id: dbtGitDockerDuckDB
namespace: blueprint

tasks:
  - id: dbt
    type: io.kestra.core.tasks.flows.WorkingDirectory
    tasks:
    - id: cloneRepository
      type: io.kestra.plugin.git.Clone
      url: https://github.com/kestra-io/dbt-demo
      branch: main

    - id: dbt-build
      type: io.kestra.plugin.dbt.cli.DbtCLI
      runner: PROCESS
      beforeCommands:
        - curl -LsSf https://astral.sh/uv/install.sh | sh
        - source $HOME/.cargo/env
        - uv venv
        - source .venv/bin/activate
        - uv pip install dbt-duckdb
      commands:
        - dbt deps
        - dbt build

Currently, there are some issues with using uv because /bin/sh is forced as the interpreter. It makes sense to create a virtual environment with uv by default in all Python-based plugins when the PROCESS runner is used. Alternatively, if easier, we can add a UV runner that creates and activates a uv venv by default.
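One way to picture the proposed UV runner is as a set of default beforeCommands injected ahead of the user's commands. A hypothetical sketch of what such defaults could look like (the function and exact command list are illustrative, derived from the working example above, not an existing Kestra API):

```python
def default_uv_before_commands(packages):
    """Hypothetical defaults a UV runner could inject to create and
    activate a uv virtual environment before the user's commands run."""
    return [
        "pip install uv",
        "uv venv --quiet",
        ". .venv/bin/activate",
        "uv pip install --quiet " + " ".join(packages),
    ]

# Defaults for a task that needs requests and the kestra client.
before = default_uv_before_commands(["requests", "kestra"])
```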

The `taskDefaults` doesn't apply to script tasks

Expected Behavior

Using this reproducer, you'll notice that the explicitly defined image (ghcr.io/kestra-io/pydata:latest) in the first Script task applies as expected, but the next task, which is expected to use the default image provided in taskDefaults, doesn't: it directly uses the python:latest image instead of first checking whether there are any taskDefaults for that task type.

id: pandasETLnew
namespace: blueprint

tasks:
  - id: wdir
    type: io.kestra.core.tasks.flows.WorkingDirectory
    tasks:
      - id: extractCsv
        type: io.kestra.plugin.scripts.python.Script
        script: |
          import pandas as pd
          data = {
              'Column1': ['A', 'B', 'C', 'D'],
              'Column2': [1, 2, 3, 4],
              'Column3': [5, 6, 7, 8]
          }
          df = pd.DataFrame(data)
          df.to_csv("data.csv", index=False)
        warningOnStdErr: false
        docker:
          image: ghcr.io/kestra-io/pydata:latest

      - id: outputFiles
        type: io.kestra.core.tasks.storages.LocalFiles
        outputs:
          - "*.csv"

      - id: inputFiles
        type: io.kestra.core.tasks.storages.LocalFiles
        inputs:
          data.csv: "{{outputs.outputFiles.uris['data.csv']}}"

      - id: transformAndLoadCsv
        type: io.kestra.plugin.scripts.python.Script
        script: |
          import pandas as pd
          df = pd.read_csv("data.csv")
          df['Column4'] = df['Column2'] + df['Column3']
          df.to_csv("final_result.csv", index=False)

taskDefaults:
  - type: io.kestra.plugin.scripts.python.Script
    values:
      warningOnStdErr: false
      docker:
        image: ghcr.io/kestra-io/pydata:latest

Actual Behaviour

The second Script task runs with the python:latest image instead of the image defined in taskDefaults (see attached screenshot).

Steps To Reproduce

No response

Environment Information

  • Kestra Version: develop-full 0.10.1

Example flow

No response

io.kestra.plugin.scripts.python.Commands - containerImage dynamic feature not working

Expected Behavior

...
  - id: fetch_data
    type: io.kestra.plugin.scripts.python.Commands
    beforeCommands:
      - export PYTHONPATH=src
      - export GCP_PROJECT_ID={{ secret('GCP_PROJECT_ID') }}
    commands:
      - python {{ inputs.fetch_command }} --uuid_str {{ inputs.uuid_str }}
    containerImage: "us-central1-docker.pkg.dev/{{ secret('GCP_PROJECT_ID') }}/kestra/kestra-woohoo-v2:latest"
...


expected container image:

`us-central1-docker.pkg.dev/my-gcp-project-id/kestra/kestra-woohoo-v2:latest`

Actual Behaviour

The Batch Job uses the following static container image, with no secret substitution:
`us-central1-docker.pkg.dev/{{ secret('GCP_PROJECT_ID') }}/kestra/kestra-woohoo-v2:latest`

Steps To Reproduce
  - id: fetch_data
    type: io.kestra.plugin.scripts.python.Commands
    beforeCommands:
      - export PYTHONPATH=src
      - export GCP_PROJECT_ID={{ secret('GCP_PROJECT_ID') }}
    commands:
      - python {{ inputs.fetch_command }} --uuid_str {{ inputs.uuid_str }}
    containerImage: "us-central1-docker.pkg.dev/{{ secret('GCP_PROJECT_ID') }}/kestra/kestra-woohoo-v2:latest"
    warningOnStdErr: false
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner	
 ....

Environment Information

  • Kestra Version: 0.16.0
  • Plugin version: 0.16.1
  • Operating System (OS / Docker / Kubernetes): Docker Compose GCP VM
  • Java Version (If not docker): n/a

Example flow

  - id: fetch_data
    type: io.kestra.plugin.scripts.python.Commands
    beforeCommands:
      - export PYTHONPATH=src
      - export GCP_PROJECT_ID={{ secret('GCP_PROJECT_ID') }}
    commands:
      - python {{ inputs.fetch_command }} --uuid_str {{ inputs.uuid_str }}
    containerImage: "us-central1-docker.pkg.dev/{{ secret('GCP_PROJECT_ID') }}/kestra/kestra-woohoo-v2:latest"
    warningOnStdErr: false
    taskRunner:
      type: io.kestra.plugin.gcp.runner.GcpBatchTaskRunner	
 ....

Stream sys.stdout in script tasks faster

Feature description

The following script using a logger will stream logs immediately:

id: loguru
namespace: blueprints

inputs:
  - name: nr_logs
    type: INT
    defaults: 1000

tasks:
  - id: reproducer
    type: io.kestra.plugin.scripts.python.Script
    warningOnStdErr: false
    docker:
      image: ghcr.io/kestra-io/pydata:latest
    script: |
        from loguru import logger
        from faker import Faker
        import time
        import sys

        logger.remove()
        logger.add(sys.stdout, level="INFO")
        logger.add(sys.stderr, level="WARNING")


        def generate_logs(fake, num_logs):
            logger.debug("This message will not show up as the log level is set to INFO")
            logger.warning("Starting to generate log messages")
            for _ in range(num_logs):
                log_message = fake.sentence()
                logger.info(log_message)
                time.sleep(0.1)
            logger.warning("Finished generating log messages")


        if __name__ == "__main__":
            faker_ = Faker()
            generate_logs(faker_, int("{{ inputs.nr_logs }}"))

But using print() will load them in batches roughly every 20 seconds:

id: printer
namespace: dev

inputs:
  - name: nr_logs
    type: INT
    defaults: 1000

tasks:
  - id: reproducer
    type: io.kestra.plugin.scripts.python.Script
    warningOnStdErr: false
    runner: DOCKER
    docker:
      image: ghcr.io/kestra-io/pydata:latest
    script: |
      from faker import Faker
      import time
      import sys

      def generate_logs(fake, num_logs):
          print("Starting to generate log messages", file=sys.stderr)
          for _ in range(num_logs):
              log_message = fake.sentence()
              print(log_message)
              time.sleep(0.1)
          print("Finished generating log messages", file=sys.stderr)

      if __name__ == "__main__":
          faker_ = Faker()
          generate_logs(faker_, int("{{ inputs.nr_logs }}"))

TODO: investigate if we can get logs immediately even using the second option
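The difference between the two reproducers is consistent with Python block-buffering stdout when it is a pipe rather than a TTY: loguru flushes per record, while bare print() does not. A minimal sketch of a script-side workaround, flushing each line explicitly (equivalently, run the script with python -u or PYTHONUNBUFFERED=1); the helper name is illustrative:

```python
import io
import sys

def stream_print(message: str, stream=None) -> None:
    """Print with an explicit flush so each line reaches the log
    consumer immediately instead of sitting in Python's block buffer."""
    stream = stream if stream is not None else sys.stdout
    print(message, file=stream, flush=True)

# Demonstrate against an in-memory stream; in a real task this writes
# to stdout and the runner sees each line as soon as it is printed.
buf = io.StringIO()
stream_print("Starting to generate log messages", stream=buf)
```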

Powershell process outputFiles not detecting files after updating to 0.18.2 from 0.17.

Expected Behavior

Thread from Slack: https://kestra-io.slack.com/archives/C03FQKXRK3K/p1724185082477709

0.17 version of flow:

  - id: "run_query"
    type: "io.kestra.plugin.scripts.powershell.Commands"
    failFast: false
    runner: PROCESS
    interpreter: [powershell]
    commands: 
      - "<< a powershell command that generates *.tab files in the ./ directory>>"
    outputFiles:
    - "*.tab"

0.18.2:

  - id: "run_query"
    type: "io.kestra.plugin.scripts.powershell.Commands"
    failFast: false
    taskRunner: 
      type: io.kestra.plugin.core.runner.Process
    interpreter: [powershell]
    targetOS: WINDOWS
    namespaceFiles: 
      enabled: false
    commands: 
      - "<< a powershell command that generates *.tab files in the ./ directory>>"
    outputFiles:
    - "*.tab"

outputFiles should be detected the same as in 0.17.

Actual Behaviour

Task doesn't capture outputs it used to capture:

2024-08-20 16:11:17.539 Captured 0 output(s).

Steps To Reproduce

No response

Environment Information

Example flow

No response

Credentials in the config seem broken

Expected Behavior

This should work (see attached screenshot).

Actual Behaviour

After the recent commit 1fdd68f, it seems that using credentials in the config no longer works (see attached screenshot).

Using the new credentials object fixes the issue, but the change should not break existing configurations.

Steps To Reproduce

No response

Environment Information

  • Kestra Version: 0.12.0-SNAPSHOT

Example flow

No response

OutputDir examples

Issue description

Write an example for each task including the usage of OutputDir.

`containerImage` doesn't work with `python.Commands` on 0.18

Expected Behavior

When executing a flow using io.kestra.plugin.scripts.python.Commands, and containerImage set to ghcr.io/kestra-io/pydata:latest, pandas should be included.

Actual Behaviour

When executing the flow, it errors as it can't find pandas. This suggests that containerImage is not being used, since pydata includes pandas.

This works if you explicitly set the task runner to Docker, despite Docker being the default.

id: gitPython
namespace: blueprint

tasks:
  - id: pythonScripts
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
    - id: cloneRepository
      type: io.kestra.plugin.git.Clone
      url: https://github.com/kestra-io/scripts
      branch: main
    
    - id: python
      type: io.kestra.plugin.scripts.python.Commands
      warningOnStdErr: false
      taskRunner:
        type: io.kestra.plugin.scripts.runner.docker.Docker
      containerImage: ghcr.io/kestra-io/pydata:latest
      commands:
        - python etl/global_power_plant.py

If taskRunner needs to be set, an error should be raised or the documentation updated to clear this up.

Steps To Reproduce

No response

Environment Information

  • Kestra Version: 0.18 Snapshot
  • Plugin version: Latest
  • Operating System (OS / Docker / Kubernetes):
  • Java Version (If not docker):

Example flow

This works:

id: gitPython
namespace: blueprint

tasks:
  - id: pythonScripts
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
    - id: cloneRepository
      type: io.kestra.plugin.git.Clone
      url: https://github.com/kestra-io/scripts
      branch: main
    
    - id: python
      type: io.kestra.plugin.scripts.python.Commands
      warningOnStdErr: false
      docker: 
        image: ghcr.io/kestra-io/pydata:latest
      commands:
        - python etl/global_power_plant.py

This also works:

id: gitPython
namespace: blueprint

tasks:
  - id: pythonScripts
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
    - id: cloneRepository
      type: io.kestra.plugin.git.Clone
      url: https://github.com/kestra-io/scripts
      branch: main
    
    - id: python
      type: io.kestra.plugin.scripts.python.Commands
      warningOnStdErr: false
      taskRunner:
        type: io.kestra.plugin.scripts.runner.docker.Docker
      containerImage: ghcr.io/kestra-io/pydata:latest
      commands:
        - python etl/global_power_plant.py

This causes it to error because pandas can't be found (containerImage used instead):

id: gitPython
namespace: blueprint

tasks:
  - id: pythonScripts
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
    - id: cloneRepository
      type: io.kestra.plugin.git.Clone
      url: https://github.com/kestra-io/scripts
      branch: main
    
    - id: python
      type: io.kestra.plugin.scripts.python.Commands
      warningOnStdErr: false
      containerImage: ghcr.io/kestra-io/pydata:latest
      commands:
        - python etl/global_power_plant.py

Blueprint #140 on serverless lambda does not run successfully

Expected Behavior

The blueprint should run successfully.

Actual Behaviour

The blueprint flow fails with an error.

Stacktrace:

2024-06-23 22:12:55.532 Provided 2 input(s).
2024-06-23 22:12:55.553 Starting command with pid 210 [/bin/sh -c set -e
sls deploy
sls invoke -f etl --log]
2024-06-23 22:12:55.561 /bin/sh: 2: sls: not found
2024-06-23 22:12:55.567 Command failed with code 127
2024-06-23 22:12:55.567 io.kestra.core.models.tasks.runners.TaskException: Command failed with code 127
	at io.kestra.plugin.core.runner.Process.run(Process.java:118)
	at io.kestra.plugin.scripts.exec.scripts.runners.CommandsWrapper.run(CommandsWrapper.java:159)
	at io.kestra.plugin.scripts.node.Commands.run(Commands.java:90)
	at io.kestra.plugin.scripts.node.Commands.run(Commands.java:18)
	at io.kestra.core.runners.WorkerTaskThread.doRun(WorkerTaskThread.java:77)
	at io.kestra.core.runners.AbstractWorkerThread.run(AbstractWorkerThread.java:56)

Steps To Reproduce

Put the blueprint flow in the Kestra environment and execute the flow.

Environment Information

  • Kestra Version: 0.17.4
  • Plugin version: 0.17.4
  • Operating System (OS / Docker / Kubernetes): Docker
  • Java Version (If not docker):

Example flow

id: lambda
namespace: company.team

tasks:
  - id: sls_commands
    type: io.kestra.plugin.scripts.node.Commands
    description: npm install -g serverless
    runner: PROCESS
    warningOnStdErr: false
    inputFiles:
      serverless.yml: |
        service: lambda
        frameworkVersion: '3'

        provider:
          name: aws
          runtime: python3.9
          region: eu-central-1
          memorySize: 512 # optional, in MB, default is 1024; can be 128, 256, 512, 1024, 2048, 4096, 5120, ...
          timeout: 10 # optional, in seconds, default is 6

        functions:
          etl:
            handler: handler.run

      handler.py: |
        import platform
        import sys

        def extract() -> int:
            print("Extracting data...")
            return 21


        def transform(x: int) -> int:
            print("Transforming data...")
            return x * 2


        def load(x: int) -> None:
            print(f"Loading {x} into destination...")


        def run(event=None, context=None):
            raw_data = extract()
            transformed = transform(raw_data)
            load(transformed)
            print("Hello from Kestra 🚀")
            print(f"Host's network name = {platform.node()}")
            print(f"Python version = {platform.python_version()}")
            print(f"Platform information (instance type) = {platform.platform()}")
            print(f"OS/Arch = {sys.platform}/{platform.machine()}")
    commands:
      - sls deploy 
      - sls invoke -f etl --log
      # - sls remove

Jython Eval flow example not working as expected

Expected Behavior

On executing the flow, I am expecting the "outputs.out" to have a file containing 2 rows:

555
666

Actual Behaviour

Getting an empty file in outputs.out.

Steps To Reproduce

  1. Run the flow example as mentioned on this page
  2. Check the outputs.out in the Outputs tab. The file will be empty.

Environment Information

  • Kestra Version: 0.13.8
  • Plugin version: 0.13.8
  • Operating System (OS / Docker / Kubernetes): Docker
  • Java Version (If not docker):

Example flow

id: "eval"
type: "io.kestra.plugin.scripts.jython.Eval"
outputs:
  - out
  - map
script: |
  from io.kestra.core.models.executions.metrics import Counter
  import tempfile
  from java.io import File
  
  logger.info('executionId: {}', runContext.render('{{ execution.id }}'))
  runContext.metric(Counter.of('total', 666, 'name', 'bla'))
  
  map = {'test': 'here'}
  tempFile = tempfile.NamedTemporaryFile()
  tempFile.write('555\n666\n')
  
  out = runContext.putTempFile(File(tempFile.name))

DinD permission issue: mounting local script files as volumes doesn't work when running Kestra itself in Docker

Expected Behavior

I was trying to mount a local directory to be used as a Kestra storage:

  kestra:
    volumes:
      - /Users/anna/.kestra/storage:/app/storage

Then I added the etl_script.py to that folder


Example flow:

id: pythonVolume
namespace: dev
tasks:
  - id: anyPythonScript
    type: io.kestra.plugin.scripts.python.Commands
    runner: DOCKER
    docker:
      image: ghcr.io/kestra-io/pydata:latest
      volumes:
        - /app/storage:/app
    commands:
      - python /app/etl_script.py
      # - python /app/storage/etl_script.py

Actual Behaviour

Image pulled [ghcr.io/kestra-io/pydata:latest]
Status 500: Mounts denied: 
The path /app/storage is not shared from the host and is not known to Docker.
You can configure shared paths from Docker -> Preferences... -> Resources -> File Sharing.
See https://docs.docker.com/desktop/mac for more info.
com.github.dockerjava.api.exception.InternalServerErrorException: Status 500: Mounts denied: 
The path /app/storage is not shared from the host and is not known to Docker.
You can configure shared paths from Docker -> Preferences... -> Resources -> File Sharing.
See https://docs.docker.com/desktop/mac for more info.

	at com.github.dockerjava.core.DefaultInvocationBuilder.execute(DefaultInvocationBuilder.java:247)
	at com.github.dockerjava.core.DefaultInvocationBuilder.post(DefaultInvocationBuilder.java:102)
	at com.github.dockerjava.core.exec.StartContainerCmdExec.execute(StartContainerCmdExec.java:31)
	at com.github.dockerjava.core.exec.StartContainerCmdExec.execute(StartContainerCmdExec.java:13)
	at com.github.dockerjava.core.exec.AbstrSyncDockerCmdExec.exec(AbstrSyncDockerCmdExec.java:21)
	at com.github.dockerjava.core.command.AbstrDockerCmd.exec(AbstrDockerCmd.java:33)
	at com.github.dockerjava.core.command.StartContainerCmdImpl.exec(StartContainerCmdImpl.java:42)
	at io.kestra.plugin.scripts.exec.scripts.runners.DockerScriptRunner.run(DockerScriptRunner.java:116)
	at io.kestra.plugin.scripts.exec.scripts.runners.CommandsWrapper.run(CommandsWrapper.java:123)
	at io.kestra.plugin.scripts.python.Commands.run(Commands.java:90)
	at io.kestra.plugin.scripts.python.Commands.run(Commands.java:20)
	at io.kestra.core.runners.Worker$WorkerThread.run(Worker.java:635)


Steps To Reproduce

No response

Environment Information

  • Kestra Version: develop-full image

Example flow

No response

Add `registry`, `username` and `password` to the `docker` property

Feature description

Current behavior

Currently, to pass credentials to a private container registry, the user has to pass the entire JSON config, which might be cumbersome. Examples:

GCP Artifact Registry:

      - id: analyzeSales
        type: io.kestra.plugin.scripts.python.Script
        script: ...
        docker:
          image: yourGcpRegion-docker.pkg.dev/YOUR_GCP_PROJECT_NAME/flows/python:latest
          config: |
            {
              "auths": {
                "europe-west3-docker.pkg.dev": {
                    "username": "oauth2accesstoken",
                    "password": "{{outputs.fetchAuthToken.accessToken.tokenValue}}"
                  }
              }
            }

AWS ECR:

  - id: py
    type: io.kestra.plugin.scripts.python.Commands
    docker:
      image: 338306982838.dkr.ecr.eu-central-1.amazonaws.com/data-infastructure:latest
      config: |
        {
          "auths": {
            "338306982838.dkr.ecr.eu-central-1.amazonaws.com": {
                "username": "AWS",
                "password": "{{outputs.aws.vars.token}}"
              }
          }
        }
    commands:
      - python --version # e.g. python yourscript.py

Proposal

Add registry, username and password to the docker property:

  - id: python
    type: io.kestra.plugin.scripts.python.Commands
    docker:
      image: 338306982838.dkr.ecr.eu-central-1.amazonaws.com/data-infastructure:latest
      credentials:
        registry: "338306982838.dkr.ecr.eu-central-1.amazonaws.com"
        username: AWS
        password: "{{outputs.aws.vars.token}}"
    commands:
      - python --version # e.g. python yourscript.py
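The proposed credentials shorthand maps directly onto the JSON config shown above. A minimal sketch of that mapping (the helper name is illustrative, not Kestra's implementation):

```python
import json

def docker_config(registry: str, username: str, password: str) -> str:
    """Expand the proposed registry/username/password shorthand into
    the equivalent Docker auth config JSON currently required."""
    return json.dumps(
        {"auths": {registry: {"username": username, "password": password}}}
    )

# Shorthand values from the AWS ECR example above.
cfg = docker_config(
    "338306982838.dkr.ecr.eu-central-1.amazonaws.com",
    "AWS",
    "{{outputs.aws.vars.token}}",
)
```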

GString objects are not properly serialized

Expected Behavior

I use a lot of Groovy scripts to transform datasets.

To give a little context: in Groovy, when we use string substitution, like "${key} ${value}", it creates an instance of GString (it's not String). Unfortunately, these instances are not properly serialized when written in ION files.

For example, if I have this script:

def value = "awesome"
row = "Kestra is ${value}!"

When serialized in ION, I would like to have this result:

"Kestra is awesome!"

Actual Behaviour

But, I find:

{values:["awesome"],strings:["Kestra is ","!"],empty:false,valueCount:1,bytes:{{S2VzdHJhIGlzIGF3ZXNvbWUh}}}

The GString object is serialized as a POJO instead of using its toString() representation.

For other needs, I created this Jackson module, registrable via mapper.registerModule(new GroovyModule()):

public class GroovyModule extends SimpleModule {

    public GroovyModule() {
        addSerializer(GString.class, new GStringSerializer());
    }

    private static class GStringSerializer extends ToStringSerializerBase {

        public GStringSerializer() {
            super(GString.class);
        }

        @Override
        public String valueToString(Object value) {
            return value.toString();
        }
    }
}

Steps To Reproduce

No response

Environment Information

  • Kestra Version: 0.13.4
  • Operating System (OS / Docker / Kubernetes): K8S

Example flow

id: bad-serdes
namespace: dev.yvrng
tasks:
  - id: transform
    type: io.kestra.plugin.scripts.groovy.FileTransform
    from: >
      [{"value":"awesome"}]
    script: >
      row = "Kestra is ${row.value}!"

Error logging is not clear when running an invalid script task.

Describe the issue

Task:

Running a Python script task.

Problem:

When running a Python script task, if no code is present, an error is thrown; however, it fails to clearly identify what the issue is.

Code:

id: clickhouse_tfl_status_update
namespace: company.team
tasks:
  - id: get-tfl-status
    type: io.kestra.plugin.scripts.python.Script
    docker:
      image: python:slim
    beforeCommands:
    - pip install requests 
    warningOnStdErr: false
    script: |

Error:

(see attached screenshot)

Environment

  • Kestra Version: 0.150.0
  • Operating System (OS/Docker/Kubernetes): Mac OS, Docker Desktop.
  • Java Version (if you don't run kestra in Docker): N/A

Allow users to set Workdir of the docker container

Feature description

Currently it's not possible to set the --workdir param of the container (which should be relative to the current working directory of the container).

Having this option would make some Docker images, such as the Terraform image, much easier to use.

In order to perform terraform plan in the correct location within our Kestra WorkingDirectory, we need to be able to specify this option to tell the container where our Terraform code lives (e.g. /environment/dev) in the mounted volume.

Powershell Commands flow example failing to run

Expected Behavior

The flow example should run successfully.

Actual Behaviour

Flow example execution fails with the following error: Object reference not set to an instance of an object.

2024-01-27 13:37:32,901 INFO  jdbc-queue_2 flow.powershell-command [namespace: dev] [flow: powershell-command] [execution: cIfkERVTtKcp2ucVFHDi2] Flow started
2024-01-27 13:37:32,967 INFO  worker_1     f.p.powershell_script [namespace: dev] [flow: powershell-command] [task: powershell_script] [execution: cIfkERVTtKcp2ucVFHDi2] [taskrun: 6TAlnOOMgqGEFb1gzhyBEi] [value: null] Type Commands started
2024-01-27 13:37:32,972 INFO  WorkerThread f.p.c.6TAlnOOMgqGEFb1gzhyBEi Provided 1 input(s).
2024-01-27 13:37:36,173 WARN  docker-java-stream--958154225 f.p.c.6TAlnOOMgqGEFb1gzhyBEi One or more errors occurred. (Object reference not set to an instance of an object.)
2024-01-27 13:37:37,864 WARN  docker-java-stream--958154225 f.p.c.6TAlnOOMgqGEFb1gzhyBEi Object reference not set to an instance of an object.
2024-01-27 13:37:38,016 WARN  docker-java-stream--958154225 f.p.c.6TAlnOOMgqGEFb1gzhyBEi 

2024-01-27 13:37:38,094 ERROR WorkerThread f.p.c.6TAlnOOMgqGEFb1gzhyBEi Command failed with code 1
io.kestra.plugin.scripts.exec.scripts.runners.ScriptException: Command failed with code 1
	at io.kestra.plugin.scripts.exec.scripts.runners.DockerScriptRunner.run(DockerScriptRunner.java:171)
	at io.kestra.plugin.scripts.exec.scripts.runners.CommandsWrapper.run(CommandsWrapper.java:159)
	at io.kestra.plugin.scripts.powershell.Commands.run(Commands.java:95)
	at io.kestra.plugin.scripts.powershell.Commands.run(Commands.java:20)
	at io.kestra.core.runners.Worker$WorkerThread.run(Worker.java:684)

Steps To Reproduce

Create a flow with the example given on this page and execute it.

Environment Information

  • Kestra Version: 0.13.8
  • Plugin version: 0.13.8
  • Operating System (OS / Docker / Kubernetes): Docker
  • Java Version (If not docker): N/A

Example flow

id: powershell
namespace: dev
tasks:
  - id: powershell_script
    type: io.kestra.plugin.scripts.powershell.Commands
    inputFiles:
      main.ps1: |
        Get-ChildItem | Format-List
    commands:
      - pwsh main.ps1

Add support for AWS ECR

Feature description

Usually, when working with private Docker registries in Script tasks, you can do:

id: pythonCommandsExample
namespace: dev

tasks:
  - id: wdir
    type: io.kestra.core.tasks.flows.WorkingDirectory
    tasks:
      - id: cloneRepository
        type: io.kestra.plugin.git.Clone
        url: https://github.com/kestra-io/examples
        branch: main

      - id: gitPythonScripts
        type: io.kestra.plugin.scripts.python.Commands
        warningOnStdErr: false
        runner: DOCKER
        docker:
          image: ghcr.io/kestra-io/pydata:latest
          config: |
            {
              "auths": {
                  "https://index.docker.io/v1/": {
                      "username": "annageller",
                      "password": "dckr_pat_xxxxx"
                  }
              }
            }
        beforeCommands:
          - pip install faker > /dev/null
        commands:
          - python scripts/etl_script.py
          - python scripts/generate_orders.py
      
      - id: outputFile
        type: io.kestra.core.tasks.storages.LocalFiles
        outputs:
          - orders.csv

  - id: loadCsvToS3
    type: io.kestra.plugin.aws.s3.Upload
    accessKeyId: "{{secret('AWS_ACCESS_KEY_ID')}}"
    secretKeyId: "{{secret('AWS_SECRET_ACCESS_KEY')}}"
    region: eu-central-1
    bucket: kestraio
    key: stage/orders.csv
    from: "{{outputs.outputFile.uris['orders.csv']}}"

However, this doesn't work with ECR, which requires a short-lived token retrieved using aws ecr get-login-password --region eu-central-1 | docker login --username AWS --password-stdin 338306982838.dkr.ecr.eu-central-1.amazonaws.com
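
For reference, a Docker config for ECR would follow the same shape as the Docker Hub example above, but with the ECR registry host as the key, AWS as the username, and the token as the password (the token placeholder below is deliberately not filled in; ECR tokens are short-lived, so they would need to be refreshed rather than stored statically):

```json
{
  "auths": {
    "338306982838.dkr.ecr.eu-central-1.amazonaws.com": {
      "username": "AWS",
      "password": "<output of aws ecr get-login-password>"
    }
  }
}
```

A built-in solution would presumably fetch this token at task start using the AWS credentials already configured on the task, rather than asking users to wire it up themselves.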

TBD: figure out a simple solution, optionally combining it with kestra-io/plugin-aws#199
