plugin-fs's Introduction

Kestra workflow orchestrator

Event-Driven Declarative Orchestrator

Get started with Kestra in 4 minutes.

Kestra HTTP API and File System Plugin

Plugin to interact with file systems and HTTP APIs

Documentation

License

Apache 2.0 © Kestra Technologies

Stay up to date

We release new versions every month. Give the main repository a star to stay up to date with the latest releases and get notified about future updates.

Star the repo

plugin-fs's People

Contributors

alexandrebrg, anna-geller, aurelienwls, brian-mulier-p, dependabot[bot], eregnier, fhussonnois, guiguir68, loicmathieu, pierrez, skraye, smantri-moveworks, tchiotludo, wrussell1999


plugin-fs's Issues

http.Request POST response body returns request body instead

Expected Behavior

When you make a POST request with io.kestra.plugin.fs.http.Request using a JSON body, you can access the response body through {{ outputs.task_id.body }}.

Actual Behaviour

When you make a POST request and access the output, the body returned is the same as the body property that was sent in the request, not the response body.

Steps To Reproduce

Run the following flow:

id: http_request_example
namespace: example
description: Make a HTTP Request and Handle the Output

inputs:
  - id: payload
    type: JSON
    defaults: |
      [{"name": "Will", "job": "Developer Advocate"}]

tasks:
  - id: send_data
    type: io.kestra.plugin.fs.http.Request
    uri: https://reqres.in/api/users
    method: POST
    contentType: application/json
    body: "{{ inputs.payload }}"

  - id: print_status
    type: io.kestra.core.tasks.log.Log
    message: "{{ outputs.send_data.body }}"

The output will look like this (the same as the request body):

(screenshot of the task output)

Documentation for the API

Postman is able to get a response body that matches the documentation.
(two Postman screenshots showing the documented response body)

Environment Information

  • Kestra Version: 0.16.1
  • Plugin version:
  • Operating System (OS / Docker / Kubernetes):
  • Java Version (If not docker):

Example flow

Example 1

id: http_request_example
namespace: example
description: Make a HTTP Request and Handle the Output

inputs:
  - id: payload
    type: JSON
    defaults: |
      [{"name": "Will", "job": "Developer Advocate"}]

tasks:
  - id: send_data
    type: io.kestra.plugin.fs.http.Request
    uri: https://reqres.in/api/users
    method: POST
    contentType: application/json
    body: "{{ inputs.payload }}"

  - id: print_status
    type: io.kestra.core.tasks.log.Log
    message: "{{ outputs.send_data.body }}"

Example 2

id: rest_api_with_inputs
namespace: dev

labels:
  env: prod
  country: US

tasks:
  - id: extract
    type: io.kestra.plugin.fs.http.Request
    uri: https://dummyjson.com/products
    method: GET

  - id: load
    type: io.kestra.plugin.fs.http.Request
    uri: https://reqres.in/api/products
    method: POST
    contentType: application/json
    body: "{{outputs.extract.body}}"

Allow the file detection trigger to fire only when a file was updated

Feature description

For now, the trigger doesn't maintain state (there is no way to know which file was already processed, and when). RealtimeTriggers, however, are always-on processes, so it seems feasible to keep state about which files have been scanned and which were updated (even if only as a local cache on the Worker).
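A minimal Python sketch (not the plugin's actual implementation) of the per-file state such a trigger could keep: cache each file's last-seen modification time and report only new or changed files.

```python
class FileUpdateTracker:
    """Keeps the last-seen modification time per file so a trigger can
    fire only for new or updated files (a sketch, not the plugin's code)."""

    def __init__(self):
        self._seen = {}  # path -> last processed mtime

    def updated(self, listing):
        """`listing` maps file path -> current mtime (e.g. from a directory scan)."""
        changed = []
        for path, mtime in listing.items():
            if self._seen.get(path) != mtime:
                self._seen[path] = mtime
                changed.append(path)
        return changed

tracker = FileUpdateTracker()
assert tracker.updated({"/in/a.csv": 100.0}) == ["/in/a.csv"]  # new file fires
assert tracker.updated({"/in/a.csv": 100.0}) == []             # unchanged: no fire
assert tracker.updated({"/in/a.csv": 160.0}) == ["/in/a.csv"]  # updated: fires again
```

On a Worker this cache would have to survive restarts (or be accepted as best-effort), which is the main design question the feature raises.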

Plugins appearing dark on dark background even when currentColor is used

Expected Behavior

All the images use currentColor, so they should appear light on a dark background.

Actual Behaviour

Plugins are appearing dark on dark background.

(screenshot: plugin icons rendered dark on the dark background)

Steps To Reproduce

N/A

Environment Information

  • Kestra Version: N/A
  • Plugin version:
  • Operating System (OS / Docker / Kubernetes):
  • Java Version (If not docker):

Example flow

No response

[HTTP] Ability to provide headers from templated string

Feature description

Today you can't construct your headers beforehand and inject them into your Request / Download task. This is unfortunate, as it would enable, for example, a subflow that handles all of the corporate authentication (calling the corporate authentication provider to get a token and injecting it directly into the headers).

Basically, a subflow using this feature would look like:

inputs:
  - name: headers
    type: JSON
  - name: body
    type: JSON
  ...
tasks:
  - id: oauth
    type: Request
    url: "https://auth-provider/oauth"
    method: POST
    headers:
      client_id: {{ secret('clientId') }}
      client_secret: {{ secret('clientSecret') }}
  - id: call
    type: Request
    url: "{{ inputs.url }}"
    method: "{{ inputs.method }}"
    headers: |
      [{% for header in to_entries(inputs.headers) %}
        {{header.key}}: {{header.value}},
      {% endfor %}
        Authorization: Bearer {{ outputs.oauth.token }}
      ]
    body: "{{ inputs.body }}"

That's not ideal syntax, but at least it removes the need to create a custom plugin to handle OAuth.

Ideally this should also apply to the Request method property, so it can be templated as well.

I know this removes type-checking at compile time, but in my opinion it brings a lot of value to the "Kestra on its own" mindset.

SSH Command plugin does not work when loading private key from a secret

Expected Behavior

Be able to establish a SSH connection when loading a private key from a secret

Actual Behaviour

Unable to establish a SSH connection when loading a private key from a secret

Steps To Reproduce

  1. launch kestra with docker
  2. attach private key with env variable that is base64 encoded
  3. run task with secret

I have validated that the environment variable is correct by decoding the key in the docker console and using ssh directly in the docker console to make a connection. The key also works when added to the flow directly in plain text; it only fails when loaded from a secret.
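One thing worth verifying (an assumption, since the root cause isn't confirmed here): the decoded secret must reproduce the key byte-for-byte, newlines included, because SSH parsers reject keys whose line breaks or trailing newline were lost during encoding. A quick Python roundtrip check:

```python
import base64

# The key as it must look after decoding (contents redacted as in the issue)
key = (
    "-----BEGIN OPENSSH PRIVATE KEY-----\n"
    "...redacted...\n"
    "-----END OPENSSH PRIVATE KEY-----\n"  # SSH parsers expect the trailing newline
)

# Encode the way `base64 -w 0 id_key` would, for use as a SECRET_* env variable
encoded = base64.b64encode(key.encode()).decode()

# Decoding must reproduce the key byte-for-byte, newlines included
assert base64.b64decode(encoded).decode() == key
```

If the encoding step stripped newlines (for example by echoing the key without quotes), the decoded value would be a single line and authentication would fail exactly as described.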

Environment Information

  • Kestra Version: 0.16.6
  • Plugin version: 0.16.0
  • Operating System (OS / Docker / Kubernetes): Docker
  • Java Version (If not docker):

Example flow

Flow that works

id: test
namespace: test
description: test

inputs:
  - id: host
    type: STRING

tasks:
  - id: install-docker-on-host
    type: io.kestra.plugin.fs.ssh.Command
    authMethod: PUBLIC_KEY
    commands:
      - pwd
    host: "{{ inputs.host }}"
    privateKey: |-
        -----BEGIN OPENSSH PRIVATE KEY-----
        ...redacted...
        -----END OPENSSH PRIVATE KEY-----
    username: root

Flow that doesn't

id: test
namespace: test
description: test
inputs:
  - id: host
    type: STRING

tasks:
  - id: install-docker-on-host
    type: io.kestra.plugin.fs.ssh.Command
    authMethod: PUBLIC_KEY
    commands:
      - pwd
    host: "{{ inputs.host }}"
    privateKey: "{{ secret('ID_KEY') }}"
    username: root

[HTTP] uri is not tagged as required

Expected Behavior

uri should (unless I'm missing something) be a required field

Actual Behaviour

uri is not tagged as a required field

Steps To Reproduce

No response

Environment Information

  • Kestra Version:
  • Plugin version:
  • Operating System (OS / Docker / Kubernetes):
  • Java Version (If not docker):

Example flow

No response

[HTTP] Retry policy based on HTTP code (body ?)

Feature description

It would be great to be able to define the retry policy of HTTP tasks based on the HTTP response code or body.
For example, we should be able to retry infinitely on 5XX errors but not on 4XX.
Maybe something like this:

retry:
  valueOf: {{outputs.myHttpRequest.code}}
  policies:
    '5\d*':
      type: constant
      interval: PT30S
    '400':
      type: none
  default:
    type: constant
    interval: PT30S
    maxAttempt: 3
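The policy selection proposed above could be sketched like this (a hypothetical illustration of the matching logic, not an existing Kestra feature): match the status code against regex keys, falling back to a default policy.

```python
import re

# Policies mirror the proposed YAML: regex over the HTTP status code
policies = {
    r"5\d*": {"type": "constant", "interval": "PT30S"},  # retry forever on 5XX
    r"400": {"type": "none"},                            # never retry a 400
}
default = {"type": "constant", "interval": "PT30S", "maxAttempt": 3}

def pick_policy(code: int) -> dict:
    """Return the first policy whose pattern fully matches the status code."""
    for pattern, policy in policies.items():
        if re.fullmatch(pattern, str(code)):
            return policy
    return default

assert pick_policy(503) == {"type": "constant", "interval": "PT30S"}
assert pick_policy(400) == {"type": "none"}
assert pick_policy(404) == default
```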

HTTP/Download: Allow empty response

Hi 👋

I have a use case where I'm downloading many files from an API that may return legitimate, empty files. Using the latest version of the plugin, an error is thrown here:

if (size == null) {
    throw new HttpClientResponseException("No response from server", HttpResponse.status(HttpStatus.SERVICE_UNAVAILABLE));
}

Would it be possible to:

  • avoid failing, conditionally based on an input property (the default keeps the current behavior),
  • report the size back in an output property so that the case can be handled if needed?

I would be happy to make the contribution if needed 😄

Download and Request tasks throw "No HttpClientFactory present on classpath, cannot create client"

Issue description

2024-02-06 10:19:21.939 java.lang.IllegalStateException: No HttpClientFactory present on classpath, cannot create client
	at io.micronaut.http.client.HttpClientFactoryResolver.resolveClientFactory(HttpClientFactoryResolver.java:50)
	at io.micronaut.http.client.HttpClientFactoryResolver.getFactory(HttpClientFactoryResolver.java:38)
	at io.micronaut.http.client.HttpClient.create(HttpClient.java:264)
	at io.micronaut.reactor.http.client.ReactorHttpClient.create(ReactorHttpClient.java:130)
	at io.kestra.plugin.fs.http.AbstractHttp.client(AbstractHttp.java:135)
	at io.kestra.plugin.fs.http.Request.run(Request.java:88)
	at io.kestra.plugin.fs.http.Request.run(Request.java:23)
	at io.kestra.core.runners.Worker$WorkerThread.run(Worker.java:720)

from the guided tour example:

id: welcome
namespace: company.team
description: Welcome to Kestra!

inputs:
- id: user
  type: STRING
  defaults: Kestra user

tasks:
- id: api
  type: io.kestra.plugin.fs.http.Request
  uri: https://dummyjson.com/products

also:

id: parse_pdf
namespace: blueprint

tasks:
  - id: download_pdf
    type: io.kestra.plugin.fs.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/resolve/main/pdf/app_store.pdf

Unable to use {{ trigger.date }} inside a trigger parameter string

Expected Behavior

I want to set a trigger to ingest a unique URL every 15 minutes. I would like one portion of the URL to change for each ingestion depending on the time of the trigger; this portion is formatted as 'yyyyMMddHHmm'. I cannot use now() because the run occasionally starts a minute late, causing that portion of the URL to end with 16 instead of 15, which does not exist. The ideal solution is to replace {{ now() }} with {{ trigger.date }} so that the URL always represents a 15-minute interval.
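The interval arithmetic involved can be sketched in Python: flooring a timestamp to the previous 15-minute boundary yields the interval start even when the run fires late, which is exactly why the scheduled {{ trigger.date }} (rather than the wall-clock now()) is the value wanted here.

```python
from datetime import datetime

def floor_to_15_minutes(dt: datetime) -> datetime:
    """Round down to the start of the current 15-minute interval."""
    return dt.replace(minute=dt.minute - dt.minute % 15, second=0, microsecond=0)

# A run that fires one minute late still maps to the slot that was scheduled
late = datetime(2024, 6, 13, 9, 16, 42)
assert floor_to_15_minutes(late).strftime("%Y%m%d%H%M") == "202406130915"
```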

Actual Behaviour

I get this error when I attempt to use {{ trigger.date }} instead of {{ now() }}:

"Evaluate Failed with error 'Missing variable: 'date' on 'http://...{{ trigger.date | date('yyyyMMddHHmm') }}00.CSV.zip' at line 1'"

Steps To Reproduce

  1. Begin creating a flow with a trigger in the kestra editor
  2. Make one of the inputs a url
  3. In the trigger, take a url as an input parameter.
  4. Use {{ trigger.date }} as a variable in the url
  5. Execute the flow and watch the logs in kestra for an error/success

Environment Information

  • Kestra Version:
  • Plugin version:
  • Operating System (OS / Docker / Kubernetes):
  • Java Version (If not docker):

Example flow

(screenshot of the example flow)

Set ENV variables for the `ssh` command

Feature description

I would like to initialize the ssh shell with environment variables
https://kestra.io/plugins/plugin-fs/tasks/ssh/io.kestra.plugin.fs.ssh.command

environment:
  - B2_ACCOUNT_ID: 0050d2528eaa6af0000000001
  - B2_ACCOUNT_KEY: 7005f3QteNYV4zrj8YaAJHThwJnb9JM
  - RESTIC_PASSWORD: 6RQ2P@txq79aa
  - RESTIC_REPOSITORY: b2:JHS-BucketName:/subfolder

for example the complete task would be:

id: "command"
type: "io.kestra.plugin.fs.ssh.Command"
host: localhost
port: 22
username: foo
password: pass
environment:
  - B2_ACCOUNT_ID: 0050d2528eaa6af0000000001
  - B2_ACCOUNT_KEY: 7005f3QteNYV4zrj8YaAJHThwJnb9JM
  - RESTIC_PASSWORD: 6RQ2P@txq79aa
  - RESTIC_REPOSITORY: b2:JHS-BucketName:/subfolder
commands:
  - restic backup /Users/auser/important_docs

"io.kestra.plugin.fs.ssh.Command" not giving output in var variable

Expected Behavior

When we run the code below, the output of ls should be displayed in the output section, but that is not happening. I am using the docker image kestra/kestra:latest-full.

id: "command"
type: "io.kestra.plugin.fs.ssh.Command"
host: localhost
port: "22"
authMethod: PASSWORD
username: foo
password: pass
commands: ['ls']

Actual Behaviour

When we run a flow with type: "io.kestra.plugin.fs.ssh.Command", whatever commands it runs should have their output stored in the var variable and displayed in the output section, but this is not happening.

Steps To Reproduce

Create a flow using the blueprint below; when you run it, it does not give proper output:
id: "command"
type: "io.kestra.plugin.fs.ssh.Command"
host: localhost
port: "22"
authMethod: PASSWORD
username: foo
password: pass
commands: ['ls']

Environment Information

  • Kestra Version: 0.16.1
  • Plugin version: io.kestra.plugin.fs.ssh.Command
  • Operating System (OS / Docker / Kubernetes): Amazon Linux release 2023.4.20240401 (Amazon Linux) / Docker Server Version: 25.0.3
  • Java Version (If not docker):

Example flow

No response

SECRET env variables seem to be broken with ssh logins

Expected Behavior

I expect to be able to access SECRET_-prefixed environment variables (from docker compose) in my Kestra instance, in order to log into a remote server via the ssh Command plugin.

Actual Behaviour

I get an auth fail error:

2024-04-02 11:50:19.953 Auth fail

With the following trace:

2024-04-02 11:50:19.953 com.jcraft.jsch.JSchException: Auth fail
	at com.jcraft.jsch.Session.connect(Session.java:519)
	at com.jcraft.jsch.Session.connect(Session.java:183)
	at io.kestra.plugin.fs.ssh.Command.run(Command.java:110)
	at io.kestra.plugin.fs.ssh.Command.run(Command.java:38)
	at io.kestra.core.runners.Worker$WorkerThread.run(Worker.java:710)

Steps To Reproduce

Here's my docker-compose (password changed, obv)

volumes:
  kestra-postgres-data:
    external: true
  kestra-data:
    external: true

services:
  postgres:
    image: postgres
    volumes:
      - kestra-postgres-data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: kestra
      POSTGRES_USER: kestra
      POSTGRES_PASSWORD: k3str4
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -d $${POSTGRES_DB} -U $${POSTGRES_USER}"]
      interval: 30s
      timeout: 10s
      retries: 10

  kestra:
    image: kestra/kestra:latest-full
    pull_policy: always
    # Note that this is meant for development only. Refer to the documentation for production deployments of Kestra which runs without a root user.
    user: "root"
    command: server standalone --worker-thread=128
    volumes:
      - kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp/kestra-wd:/tmp/kestra-wd
    environment:
      SECRET_IMAC_PASSWORD: UHVycG9zZWx5IE9iZnVzY2F0ZWQK
      SECRET_B2_ACCOUNT_ID: PDA1MGQyNTI4ZzZkNmFmMDAwMDAwMDAwMQo=
      SECRET_B2_ACCOUNT_KEY: QzAwNWY5UXRXTllzZXpyajhZYUFKSFRod0puYjlKTQo=
      SECRET_RESTIC_PASSWORD: ZzZzUTJQQHR4Cg==
      KESTRA_CONFIGURATION: |
        datasources:
          postgres:
            url: jdbc:postgresql://postgres:5432/kestra
            driverClassName: org.postgresql.Driver
            username: kestra
            password: k3str4
        kestra:
          server:
            basic-auth:
              enabled: false
              username: "[email protected]" # it must be a valid email address
              password: super_secret
          repository:
            type: postgres
          storage:
            type: local
            local:
              base-path: "/app/storage"
          queue:
            type: postgres
          tasks:
            tmp-dir:
              path: /tmp/kestra-wd/tmp
          url: http://localhost:8080/
    ports:
      - "8180:8080"
      - "8181:8081"
    depends_on:
      postgres:
        condition: service_started

and here's my flow:

id: paperless-export
namespace: backups
description: Exports the paperless settings and documents for backup (later)

tasks:
  # This task runs a restic backup
- id: ssh-paperless-export-backup-cmd
  type: "io.kestra.plugin.fs.ssh.Command"
  host: selfhostserver.lan
  port: "22"
  username: johnsmith
  password: "{{ secret('IMAC_PASSWORD') }}"
  commands: 
    - sudo docker exec -t paperless-ngx-webserver-1 document_exporter -d ../export

triggers:
- id: paperless-export-schedule-hourly
  type: io.kestra.core.models.triggers.types.Schedule
  cron: "15 * * * *"
  timezone: US/Pacific

Environment Information

  • Kestra Version: latest
  • Plugin version: latest
  • Operating System (OS / Docker / Kubernetes): 6.5.0-25-generic #25~22.04.1-Ubuntu
  • Java Version (If not docker):

Example flow

No response

Ability to use shared directories (network, Windows share, Samba protocol, etc.)

Feature description

In various projects, third parties generate files for various reasons. Today, this is handled by dropping files onto shared directories (using various protocols, including Windows shares, Samba, etc.). It would be useful to have a feature that triggers Kestra as soon as a file arrives in such a directory, and that can also retrieve the file in order to process it.

I hope I've gone into enough detail here.

HTTP Trigger with Encrypted Body doesn't trigger executions

Issue description

id: send_alert_price
namespace: dev

tasks:
  - id: slack
    type: io.kestra.core.tasks.log.Log
    message: "The price is now below the threshold: {{ json(trigger.body).products[0].price }}"

triggers:
  - id: http
    type: io.kestra.plugin.fs.http.Trigger
    uri: https://dummyjson.com/products/search?q=macbook-pro
    responseCondition: "{{ json(response.body).products[0].price < 1800 }}"
    interval: PT1S
    encryptBody: true
    stopAfter:
      - SUCCESS

This doesn't create any executions.

When you remove encryptBody, it works.

[HTTP] Allow configuring a keystore at the task level

Feature description

Currently, when you need to access an HTTPS server with a custom certificate, this certificate must be added to the JVM keystore, which requires building a custom Kestra image and setting up the certificate via the keytool command line.

The Micronaut client that we use under the covers allows setting a custom keystore/truststore, so we could provide a way to pass custom certificates and set them up. Note that this is not easy, as we need to decide on the best format (maybe a PEM file) and translate that into a Java keystore/truststore.

Missing "moveDirectory" property but action is set to "DELETE"

Expected Behavior

(screenshot of the error)

action is set to DELETE on task io.kestra.plugin.fs.sftp.Downloads, yet the moveDirectory property is reported as missing; moveDirectory should only be required for the MOVE action (as stated in the documentation).

Actual Behaviour

No response

Steps To Reproduce

No response

Environment Information

  • Kestra Version: 0.12.3
  • Plugin version: 0.12.0

Example flow

No response

Allow options to chunk logging of ssh outputs

Feature description

I am using an SSH command to run a JAR file that runs some batch logic on a remote server.
The output of the command is very verbose, with lots of data.
Currently, the plugin outputs one log record per line of shell output, which results in up to 30,000 database records being saved in Postgres per execution. I want to save these logs, but not so inefficiently that every curly brace or square bracket in a JSON output becomes a separate log record.
A workaround is to first save the log into a variable and echo the output, like this:

tasks:
  - id: start_batch
    type: io.kestra.plugin.fs.ssh.Command
    commands:
      - cd /app/instances/cm-bat
      - output=$(./springBatch.sh <batch jar with print output>)
      - exit_code=$?
      - echo $output
      - exit $exit_code
    host: "{{ render(namespace.batch.dev.host) }}"  
    password: "{{ render(namespace.batch.dev.password) }}"
    username: "{{ render(namespace.batch.dev.username) }}"

But then the log is unformatted, and looks like this in the Kestra output:

(screenshot of the unformatted log output)

I wish there was an option in the task to chunk the log output while keeping all the formatting.
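The requested batching could look like this sketch (a hypothetical helper, not an existing plugin option): group consecutive output lines into fixed-size chunks so each chunk becomes a single log record while line breaks inside it are preserved.

```python
def chunk_lines(lines, chunk_size=100):
    """Join consecutive output lines into fixed-size chunks, preserving line breaks."""
    for start in range(0, len(lines), chunk_size):
        yield "\n".join(lines[start:start + chunk_size])

# 30,000 one-line records collapse into 300 multi-line log entries
records = list(chunk_lines([f"line {i}" for i in range(30_000)], chunk_size=100))
assert len(records) == 300
assert records[0].count("\n") == 99  # 100 lines per chunk, formatting kept
```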

Standard regex fails with white space

Expected Behavior

No response

Actual Behaviour

The regex .*test Test_nbs_issuers_.+.csv should match the string "test Test_nbs_issuers_YYYYMMDD.csv", but the white space seems to cause an issue with the regExp property.

id: regex
namespace: dev

tasks:
  - id: fs
    host: localhost
    port: 21
    username: foo
    password: pass
    type: io.kestra.plugin.fs.ftp.List
    from: "/folder"
    regExp: .*test Test_nbs_issuers_.+.csv
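For what it's worth, the pattern itself is fine: checked with Python's re (Java's regex engine behaves the same for this simple pattern), it matches the filename, including the space, which suggests the failure lies in how the unquoted, space-containing YAML value is parsed rather than in the regex. Quoting the regExp value may be worth trying.

```python
import re

# The `.` before "csv" is unescaped in the issue's pattern, but that only
# makes it more permissive; a space is an ordinary literal in a regex.
pattern = r".*test Test_nbs_issuers_.+.csv"
assert re.fullmatch(pattern, "test Test_nbs_issuers_YYYYMMDD.csv") is not None
```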

Steps To Reproduce

No response

Environment Information

  • Kestra Version:
  • Plugin version:
  • Operating System (OS / Docker / Kubernetes):
  • Java Version (If not docker):

Example flow

No response

http.Download task is broken after Micronaut 4 migration -- No HttpClientFactory present on classpath, cannot create client

Feature description

Reproducer:

id: csv
namespace: dev
tasks:
  - id: hello
    type: io.kestra.plugin.fs.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/blob/main/csv/orders.csv

error:

No HttpClientFactory present on classpath, cannot create client
2024-02-17 18:16:06.839 java.lang.IllegalStateException: No HttpClientFactory present on classpath, cannot create client
	at io.micronaut.http.client.StreamingHttpClientFactoryResolver.resolveClientFactory(StreamingHttpClientFactoryResolver.java:50)
	at io.micronaut.http.client.StreamingHttpClientFactoryResolver.getFactory(StreamingHttpClientFactoryResolver.java:38)
	at io.micronaut.http.client.StreamingHttpClient.create(StreamingHttpClient.java:161)
	at io.micronaut.reactor.http.client.ReactorStreamingHttpClient.create(ReactorStreamingHttpClient.java:81)
	at io.kestra.plugin.fs.http.AbstractHttp.streamingClient(AbstractHttp.java:141)
	at io.kestra.plugin.fs.http.Download.run(Download.java:70)
	at io.kestra.plugin.fs.http.Download.run(Download.java:30)
	at io.kestra.core.runners.Worker$WorkerThread.run(Worker.java:710)

Add the possibility to add mass upload

Feature description

Description

Currently, Kestra allows uploading files via SFTP/FTP/..., but only a single file per connection.
Besides the connection supporting only a single upload, the filename must be specified in both the from and to locations.
To upload multiple files, we have to handle each file individually.

Solution

Allow from to be provided as a folder, or allow specifying an array of files; in both cases, to should point to a directory.

Upload the complete folder

- id: "uploadDir"
  type: "io.kestra.plugin.fs.sftp.Upload"
  host: ftp.server.com
  port: 22
  username: user
  password: secretpassword
  from: "{{ outputs.generateArray.outputDir }}"
  to: /ftpserver/output/

Upload an array of files

- id: "uploadArray"
  type: "io.kestra.plugin.fs.sftp.Upload"
  host: ftp.server.com
  port: 22
  username: user
  password: secretpassword
  from: "{{ outputs.generateArray.outputFiles }}"
  to: /ftpserver/output/
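The from handling proposed above could be sketched like this (a hypothetical helper, not the plugin's API): accept either a directory path or a list of files, and expand it to the individual files to upload over one connection.

```python
from pathlib import Path

def expand_sources(source):
    """Accept a directory path or a list of file paths for `from`,
    returning the individual files to upload."""
    if isinstance(source, (list, tuple)):
        return [Path(p) for p in source]
    src = Path(source)
    if src.is_dir():
        # A folder expands to all files directly inside it
        return sorted(p for p in src.iterdir() if p.is_file())
    return [src]  # single-file behavior stays unchanged
```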

Merge property naming between `fs.sftp` and `fs.http`

Feature description

fs.sftp and fs.http use different property names for the same element, namely:

proxyHost vs. proxyAddress
proxyUser vs. proxyUsername

It would be great to merge these (deprecating one or the other).
