exasol / bucketfs-python

This project forked from exasol/bucketfs-utils-python


BucketFS utilities for the Python programming language

Home Page: https://exasol.github.io/bucketfs-python

License: MIT License

Shell 4.29% Python 95.48% Dockerfile 0.23%
bucketfs exasol-integration foundation-library python

bucketfs-python's Issues

Create a connection object for a bucket

Background

  • Connection objects are used in Exasol to store credentials or configuration for UDFs
  • We often need to supply BucketFS credentials and BucketFS locations to UDFs
  • It would be good to have a function which generates a connection object create statement from a BucketFSLocation
  • This connection object can then be used inside a UDF to create a BucketFSLocation object via the BucketFSFactory
  • Process: BucketFSLocation -> Connection Object Create Statement -> Connection Object in UDF -> BucketFSFactory in UDF -> BucketFSLocation in UDF

Acceptance Criteria

  • BucketFSLocation can create Connection object create statement
  • BucketFSFactory can create BucketFSLocation from this connection object
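
A minimal sketch of what such a generator could look like; the function name generate_connection_statement and its parameters are illustrative assumptions, not the actual library API:

# Sketch only: the helper name and its parameters are illustrative, not the
# actual bucketfs-python API. Exasol connection objects store an address plus
# credentials; a UDF can later read them via exa.get_connection(<name>).
def generate_connection_statement(name, url, user, password):
    return (
        f"CREATE OR REPLACE CONNECTION {name} "
        f"TO '{url}' USER '{user}' IDENTIFIED BY '{password}'"
    )

statement = generate_connection_statement(
    "MY_BUCKETFS", "http://localhost:2580/default/my/path", "w", "write-password"
)
print(statement)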

Create a path builder

Create a universal path builder function that will return a PathLike object. The function should take different sets of arguments for different file backends. The following backends should be supported.

  • On-prem BucketFS
  • SaaS BucketFS
  • BucketFS file system as seen by a UDF
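
A rough sketch of the idea, assuming a hypothetical build_path function whose backend names and keyword arguments are illustrative only; the /buckets/<service>/<bucket> mount point is how BucketFS appears inside a UDF:

# Sketch only: build_path, its backend names and keyword arguments are
# hypothetical and just illustrate the dispatching idea.
from pathlib import PurePosixPath

def build_path(backend, **kwargs):
    if backend == "onprem":
        # e.g. kwargs: url, bucket_name, username, password, path
        return PurePosixPath(kwargs["bucket_name"]) / kwargs["path"]
    if backend == "saas":
        # e.g. kwargs: url, account_id, database_id, pat, path
        return PurePosixPath(kwargs["database_id"]) / kwargs["path"]
    if backend == "mounted":
        # BucketFS as seen from inside a UDF, mounted under /buckets/<service>/<bucket>
        return PurePosixPath("/buckets") / kwargs["service"] / kwargs["bucket_name"] / kwargs["path"]
    raise ValueError(f"unknown backend: {backend}")

print(build_path("mounted", service="bfsdefault", bucket_name="default", path="models/model.pkl"))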

Create bucketfs via XMLRPC

Background

  • At the moment, the user can only use an existing bucketfs but can't create one
  • Creating buckets requires admin access to the cluster configuration, which happens currently via XMLRPC

Acceptance Criteria

  • Add a function which creates a bucketfs

🔧 Remove old BucketFs API and Package

Summary

Currently the exasol-bucketfs package contains two packages (exasol.bucketfs, exasol_bucketfs_utils_python) containing the new and the old API respectively. Once all dependencies on the old API are cut, the exasol_bucketfs_utils_python package and its respective tests should be removed.

References

Requires these issues to be solved first:

Task(s)

  • Remove old API package exasol_bucketfs_utils_python
  • Remove tests for exasol_bucketfs_utils_python package
  • Remove deprecated dependency
    • Joblib
    • Typeguard?

Compute hash sum by downloading from HTTP without persisting the downloaded file

Background:

  • In the past we often had problems with corrupted or wrongly uploaded files
  • Checking the checksum helped a lot in the past

Acceptance Criteria:

  • Implement a function which downloads a file via HTTP / HTTPS and computes the checksum on the fly
  • The checksum computation and the download should be streamed, to reduce the memory footprint
  • Different checksums should be usable, such as SHA512, SHA256, MD5, ...
  • The checksum should be compatible with the checksum you get from the command line tools sha512sum, sha256sum, md5sum
  • We want to avoid storing anything on disk
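
A minimal sketch of the streamed approach using the requests package and hashlib; the URL is a placeholder and the function name is not part of the library API:

# Sketch only: streams the download and hashes it chunk by chunk, so nothing
# is persisted to disk and the memory footprint stays small.
import hashlib
import requests

def checksum_via_http(url, algorithm="sha256", chunk_size=64 * 1024):
    digest = hashlib.new(algorithm)
    with requests.get(url, stream=True) as response:
        response.raise_for_status()
        for chunk in response.iter_content(chunk_size=chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# The hex digest is comparable to the output of sha256sum / sha512sum / md5sum.
print(checksum_via_http("http://localhost:2580/default/some_file.bin", "sha256"))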

Download into byte string

Background

  • At the moment, we can only download to strings or files; however, sometimes you want to download binary data

Acceptance Criteria

  • We can download binary data from the BucketFS into a byte string without any encoding

Ability to create Bucket

In order to be able to integrate Exasol closely with ML pipelines, this feature would be highly helpful.

List buckets of a certain bucketfs

Background

  • Currently, the user needs to specify which bucket to use, but the user has no way to discover the bucket names

Acceptance Criteria

  • Add a function to list the names of the buckets of a bucketfs

Improve documentation

Summary

Make sure the documentation is easy to use (e.g. all parameters in the API documentation are shown properly).

Examples:
Good vs Bad

Tasks

  • Make sure everything is documented properly
  • Add an examples section
  • Make sure code snippets are run or taken as part of unit tests
    (Doctests may also be an option, if they can be displayed easily within the docs using Sphinx)
  • Adjust old references from bucketfs-utils-python to bucketfs-python
  • Add a PyPI-based installation guide

Resources

Add logging to bucketfs-python library

Summary

Currently, the bucketfs-python library lacks logging functionality, which makes it challenging for users to debug and trace errors effectively. Adding logging capabilities will enhance the usability of the library by providing valuable insights into its runtime behavior.

Proposed Solution

Integrate logging functionality into the bucketfs-python library to enable users to easily monitor and troubleshoot operations. This should include configurable logging levels and options to customize log output.

Expected Outcome

With logging incorporated, users will have improved visibility into the library's internal operations, making it easier to diagnose issues.

Additional Information

Note: Ensure that the logger is appropriately named and can be controlled/configured by the logging configuration of the library users.
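
A sketch of the conventional pattern, assuming the logger is named after the package (exasol.bucketfs) so that users can control it through their own logging configuration:

# Sketch only: standard library-logger pattern.
import logging

# Inside the library: name the logger after the package and attach a NullHandler
# so importing the library never emits "no handler" warnings.
logger = logging.getLogger("exasol.bucketfs")
logger.addHandler(logging.NullHandler())

# In user code: enable and tune the library's output.
logging.basicConfig(level=logging.INFO)
logging.getLogger("exasol.bucketfs").setLevel(logging.DEBUG)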

Compute hash sum for a file during upload

Background:

  • In the past we often had problems with corrupted or wrongly uploaded files
  • Checking the checksum helped a lot in the past

Acceptance Criteria:

  • You can enable this feature with an option, default is off
  • Add the checksum computation to the upload functions
  • The checksum computation should be streamed, to reduce the memory footprint
  • Different checksums should be usable, such as sha512, sha256, md5, ...
  • The checksum should be compatible with the checksum you get from the commandline tools sha512sum, sha256sum, md5sum
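
One possible approach, sketched here as a wrapper that hashes data while the uploader reads it; the class name and the commented upload call are illustrative assumptions:

# Sketch only: names are illustrative, not part of the bucketfs-python API.
import hashlib
import io

class HashingReader:
    """Wraps a binary file object and updates a digest on every read."""

    def __init__(self, fileobj, algorithm="sha256"):
        self._fileobj = fileobj
        self.digest = hashlib.new(algorithm)

    def read(self, size=-1):
        chunk = self._fileobj.read(size)
        self.digest.update(chunk)
        return chunk

reader = HashingReader(io.BytesIO(b"example payload"), "sha512")
# bucket.upload("path/in/bucket/data.bin", reader)  # hypothetical: the uploader would drain the reader
while reader.read(64 * 1024):  # drained here only to demonstrate the streamed hashing
    pass
print(reader.digest.hexdigest())  # comparable to the output of sha512sum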

🔧 Cleanup and migrate integration tests and setup

Summary

All important integration tests for the old API should be migrated to the new API and made part of the integration tests suite of the new API.

Tasks

  • Migrate and integrate UDF integration tests
  • Add pytest based Integration test settings/configuration
    • Buckets and DB settings used for integration tests
  • Add support for a pytest based setup of the integration tests (start db etc.)

๐Ÿž Uploading pickled model to BucketFS does not work

Summary

Uploading a pickled ... model to BucketFS fails with an exception.

Reproducing the Issue

Produce ML model

import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Create a dummy dataset with 10,000 rows
np.random.seed(42)
data = {'X1': np.random.rand(10000),
        'X2': np.random.rand(10000),
        'X3': np.random.rand(10000),
        'y': np.random.rand(10000)}
df = pd.DataFrame(data)

# Split the data into features (X) and target variable (y)
X = df[['X1', 'X2', 'X3']]
y = df['y']

# Create a linear regression model and fit it to the data
model = LinearRegression().fit(X, y)

# Generate 2,000 rows for X_new
np.random.seed(42)
X_new = pd.DataFrame({'X1': np.random.rand(2000),
                      'X2': np.random.rand(2000),
                      'X3': np.random.rand(2000)})

# Predict on new data
y_pred = model.predict(X_new)

# Print the predicted values
print("Predicted values:")
print(y_pred)

# Calculate the mean squared error
y_pred_train = model.predict(X)
mse = mean_squared_error(y, y_pred_train)
print("Mean Squared Error:", mse)

Pickled model

import pickle
from sklearn.linear_model import LinearRegression

# Save the model (the fitted LinearRegression from the previous snippet) to a file
filename = 'dummy_linear_regression_model.sav'
pickle.dump(model, open(filename, 'wb'))

print("Model saved successfully.")

Failing code

import io
import pickle

from exasol.bucketfs import Service

URL = "http://localhost:2581"
CREDENTIALS = {"default": {"username": "w", "password": "BBiSzwGaD6X7zLcjfpcP0OdGA317JABg"}}

bucketfs = Service(URL, CREDENTIALS)
bucket = bucketfs["default"]

filename = 'dummy_linear_regression_model.sav'
loaded_model = pickle.load(open(filename, 'rb'))


# Upload bytes
data = loaded_model
bucket["dummy/dummy_linear_regression_model.sav"] = data

# Upload file like object
# file_like = io.BytesIO(loaded_model)
# bucket.upload("dummy/dummy_linear_regression_model.sav", file_like)

# bucket.upload("dummy/dummy_linear_regression_model.sav", loaded_model)

Expected Behavior

Uploading model is successful.

Actual Behavior

Uploading the model fails with an exception.

Root Cause (optional)

unknown


Reported by: @exa-eswar

Check if bucketfs is reachable

Background

  • Currently, we simply try to download a file without checking if the bucketfs is reachable

Acceptance Criteria

  • Add a function to check if a bucketfs is reachable

Update typeguard version

  • typeguard 3.0.0 leads to TypeError: typechecked() got an unexpected keyword argument 'always'
  • this error is temporarily handled in #58
  • remove the version restriction after this error is fixed

Add minimal CLI

In the past, numerous attempts have been made to simplify interaction with Exasol's BucketFS:

  • โœ”๏ธ bucketfs-python: Python, active, used for tests automation of various python projects.
  • โœ”๏ธ bucketfs-java: Java, used for tests automation of various java projects.
  • โ“ bucketfs-client: Java, limited functionality, currently not develeoped actively
  • โŒ bucketfs-explorer: Java, GUI application, archived, deprecated
    Currently contained in official documentation (see DOC-2221)
  • โŒ bucketfs-utils-python: depecated, superseeded by bucketfs-python
  • โ“ shell functions for bash based on CURL requests: only for power users, limited functionality, see below
function bucketfs-password() {
    if [ -z "$1" ]; then
       echo "usage: bucketfs-password <container>"
       return 1
    fi
    BUCKETFS_PASSWORD=$(
	docker exec -it $1 \
	       grep WritePass /exa/etc/EXAConf \
	    | sed -e "s/.* = //" \
	    | tr -d '\r' \
	    | base64 -d)
}

function bucketfs-upload() {
    if [ -z "$1" ]; then
       echo "usage: bucketfs-upload <file> [path/in/bucket-fs]"
       return 1
    fi
    if [ -z "$BUCKETFS_PASSWORD" ]; then
       echo "Please set environment variable BUCKETFS_PASSWORD"
       return 1
    fi
    A=2580 # port
    B=$(echo default/$2 | sed -e 's/\/$//') # path
    curl -v -X PUT -T $1 http://w:$BUCKETFS_PASSWORD@localhost:$A/$B/$1
}

Still, users struggle and need help from time to time. This ticket therefore requests the creation of a minimal CLI solution that is:

  • convenient and easy to use
  • acceptable in usability, with sufficient functionality and at least some guidance for inexperienced users
  • lightweight with minimal footprint and prerequisites, e.g. frameworks and installations

In summary, bucketfs-python seems to be the best candidate. It is already used for test automation of various Python projects. Issue #4 currently asks to enhance bucketfs-python to list the contents of a folder in BucketFS, which could be another building block towards providing a minimal CLI with very limited effort. A possible shape of such a CLI is sketched below.
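
A rough sketch of what a minimal CLI entry point might look like; the program name and sub-commands are assumptions, and the actual dispatch into bucketfs-python is left out:

# Sketch only: argparse skeleton for a minimal BucketFS CLI.
import argparse

def main():
    parser = argparse.ArgumentParser(prog="bfs", description="Minimal BucketFS client")
    sub = parser.add_subparsers(dest="command", required=True)

    upload = sub.add_parser("upload", help="upload a local file into a bucket")
    upload.add_argument("file")
    upload.add_argument("remote_path")

    listing = sub.add_parser("ls", help="list the files in a bucket")
    listing.add_argument("path", nargs="?", default="")

    args = parser.parse_args()
    print(args)  # placeholder: dispatch to bucketfs-python calls here

if __name__ == "__main__":
    main()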

📚 API DesignDoc: "DirectoryBucket"

Summary

Write an API specification for a "DirectoryBucket" (type name to be defined) for the bucketfs library.
The DirectoryBucket acts as a wrapper around a Bucket, targeted at a specific subdirectory, to facilitate object storage operations within that subdirectory context.

Goals

  • Simplify Path Management: Enable components to operate in isolated subdirectories without manual path tracking.
  • Enhance Error Handling: Reduce errors stemming from manual path management

Functionality

  • read(path: string): Read the contents of a file located at path within the subdirectory.
  • write(path: string, content: any): Write content to a file located at path within the subdirectory.
  • delete(path: string): Delete a file or directory located at path within the subdirectory.
  • files(): List all files in the current subdirectory.
  • directories(): List all direct subdirectories as DirectoryBucket instances.
  • join_path(*paths: string[]): Safely join multiple path segments, ensuring proper navigation within the subdirectory.

Note: Consider having the path operations in a path object which will/can be used by the DirectoryBucket.
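
A minimal sketch of one possible shape for the wrapper; the underlying Bucket methods used here (upload, download, delete, files) are assumptions about the API, not a specification:

# Sketch only: confines all operations to one subdirectory of a bucket.
import posixpath

class DirectoryBucket:
    def __init__(self, bucket, directory):
        self._bucket = bucket
        self._directory = directory.strip("/")

    def join_path(self, *parts):
        return posixpath.join(self._directory, *parts)

    def read(self, path):
        return self._bucket.download(self.join_path(path))

    def write(self, path, content):
        self._bucket.upload(self.join_path(path), content)

    def delete(self, path):
        self._bucket.delete(self.join_path(path))

    def files(self):
        prefix = self._directory + "/"
        return [name for name in self._bucket.files() if name.startswith(prefix)]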

Related Issues

Download returns IOBase stream

Background

  • At the moment, we can only download files completely from the BucketFS, but sometimes it would be useful to download them in chunks and work on these chunks

Acceptance Criteria

  • We provide a new function which returns an object derived from IOBase (i.e. with a file-like API) that returns the data through the read function. Seek and write are not implemented.
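
A sketch of how such an object could be built on top of io.RawIOBase, assuming the download yields an iterator of byte chunks; how those chunks are fetched from BucketFS is omitted:

# Sketch only: read-only stream backed by an iterator of byte chunks.
# Seek and write are not implemented and raise io.UnsupportedOperation.
import io

class ChunkStream(io.RawIOBase):
    def __init__(self, chunk_iterator):
        self._chunks = iter(chunk_iterator)
        self._buffer = b""

    def readable(self):
        return True

    def readinto(self, b):
        while not self._buffer:
            try:
                self._buffer = next(self._chunks)
            except StopIteration:
                return 0
        n = min(len(b), len(self._buffer))
        b[:n] = self._buffer[:n]
        self._buffer = self._buffer[n:]
        return n

stream = ChunkStream([b"hello ", b"world"])
print(io.BufferedReader(stream).read())  # b'hello world'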

✨ Add better credentials support to new BucketFs API

Summary

Be more explicit and secure on how credentials are used within the bucketfs api.

Replace the default dict-in-dict credentials mapping passed to the service with a more sophisticated credentials provider
which, for example, does not accidentally leak authentication information when printed. Additionally, make it more explicit
that credentials are mapped to specific buckets.

Details

  • Add Credential classes/objects
  • Credential classes/objects should not leak information when printed
  • Credential classes/objects support an explicit request for unsecure output
  • Add a more explicit data structure / class for the global credentials mapping/store

Examples / Ideas

Secure & Unsecure Output

credentials = Credentials(username='foo', password='bar')


>>> print(credentials)
Credentials(username: ****, password: ****)

>>> print(f'{credentials:unsecure}')
Credentials(username: foo, password: bar)

Global Credentials Store

store = CredentialStore(
      [
          BucketCredentials(bucket='default', username='user', password='pw'),
          BucketCredentials(bucket='myudfs', username='u', password='secret'),
          ...
     ]
)

store = CredentialStore(
      [
          { 'bucket': 'default', 'username': 'user', 'password': 'pw' },
          { 'bucket': 'myudfs', 'username': 'u', 'password': 'secret' },
          ...
     ]
)

store = credentials.Store(
      [
          credentials.Bucket(name='default', username='user', password='pw'),
          credentials.Bucket(name='myudfs', username='u', password='secret'),
          ...
     ]
)

New Usage

from exasol.bucketfs import Service
from exasol.bucketfs import credentials

URL = "http://127.0.0.1:1234/"
STORE = credentials.Store(
    credentials.Bucket('default', username='w', password='w')
)
bucketfs = Service(URL, STORE)

Notes

  • Printing can/should be implemented by implementing __str__, __format__ and __repr__
  • Consider creating a sub module for the credentials code
  • Keep support for old credential usage but discourage it
  • The Store constructor should support a set of Credentials or just a single one (for simple use cases)
  • Think about for which parameters keyword argument passing should be enforced (e.g. username, password?)
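
A small sketch of the masked-output idea from the notes above, implemented via __repr__ and __format__; the class is a proposal, not the final API:

# Sketch only: masked output by default, plain output only on explicit request.
from dataclasses import dataclass

@dataclass(frozen=True)
class Credentials:
    username: str
    password: str

    def __repr__(self):
        return "Credentials(username: ****, password: ****)"

    __str__ = __repr__

    def __format__(self, spec):
        if spec == "unsecure":
            return f"Credentials(username: {self.username}, password: {self.password})"
        return repr(self)

credentials = Credentials(username="foo", password="bar")
print(credentials)                 # masked
print(f"{credentials:unsecure}")   # explicit opt-in to plain output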

Tasks

  • Add support for improved credentials
  • Add unit and integration tests for this feature(s)
  • Update the documentation to use new (more obvious) API for passing the credentials

✨ Conditional Running for SaaS Tests in CI

Summary

Add a control mechanism in our Continuous Integration pipeline to selectively execute the SaaS tests. Ideally, these specific tests should only run under certain conditions such as when explicitly invoked, either by a triggering commit or a manual activation via the workflow. This enhancement is intended to streamline our CI process, reducing unnecessary testing cycles and improving overall efficiency.

Furthermore, we should run them for only one Python version.
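
One way to sketch this on the pytest side, assuming the CI workflow sets an environment variable (here RUN_SAAS_TESTS, an illustrative name) only when the SaaS tests are explicitly requested:

# Sketch only: SaaS tests become opt-in via an environment variable.
import os
import pytest

saas_tests = pytest.mark.skipif(
    os.environ.get("RUN_SAAS_TESTS") != "true",
    reason="SaaS tests only run when explicitly requested",
)

@saas_tests
def test_saas_upload():
    ...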

User guide and Examples

It would be beneficial for the users to have a User guide (like other Exasol public repos) along with a few working examples.

Add python tooling (linter, formatters, ...)

Background

Acceptance Criteria

Upload stream of byte chunks from generator

Background

  • Currently, we only provide upload for file objects and strings; both functions upload the whole content at once. You can't generate the content on the fly.
  • Python generators allow generating data and then yielding execution to the consumer, which in our case would be the upload. This allows alternating between generation and upload.

Acceptance Criteria

  • We have a new upload function which accepts generators of bytes objects.
  • Check if requests package already buffers the data to optimize the upload, otherwise implement the buffering
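
For reference, requests already accepts a generator as the request body and sends it with chunked transfer encoding; a minimal sketch with placeholder URL and credentials:

# Sketch only: streaming upload of generated chunks via requests.
import requests

def chunks():
    for i in range(3):
        yield f"chunk {i}\n".encode()

response = requests.put(
    "http://localhost:2580/default/generated.txt",  # placeholder bucket URL
    data=chunks(),
    auth=("w", "write-password"),                    # placeholder credentials
)
response.raise_for_status()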

🔧 Refactor `examples` for more clarity

Summary

To enhance the accessibility of the example files in our documentation folder, we should refactor them.
The current approach, which utilizes a comment marker to distinguish between basic and advanced examples within a single file, has proven to be somewhat unclear.

Proposed Changes:

  • Split the Existing Example Files: Divide the current example files into two separate files:
    • xxx_basic.py: This file will contain the basic usage example.
    • xxx_advanced.py: This file will focus on more complex scenarios.

The primary goal of this refactoring is to make the example files less confusing when viewed in isolation. By creating distinct files for basic and advanced examples, we aim to facilitate a better understanding and improve the overall user experience.

Add a release workflow

Acceptance Criteria

  • Documentation generation
  • Release to PyPI
  • Release to Github Release

Simplify bucketfs package and API

Why

  • Simplifies API usage and API documentation
  • Improves readability of client and library code
  • Hides unnecessary internals
  • Provide a more pythonic experience to users of the library
  • Reduce the overall noise in the module
  • More clearly separate the internals from the actual "End User API"

How

Add a new package for the new API and structure to the workspace

The new package can make use of the old package and API (it can start as a "shim") and then bit by bit
integrate the required functionality without breaking existing code.

Reduce repetition in naming whenever the context is also providing that information

e.g.:

from exasol_bucket_fs_utils_python.bucketfs_location import BucketFSLocation

vs.

from exasol.bucketfs import Bucket

Note ℹ️: If this would increase the size of a single module too much, this can also be achieved by re-exporting.

Create new API

Example API

# import required functions and classes
from exasol.bucketfs import (
    BucketFs,
    Bucket,
# conversions likely should be implemented as functions
# as_file()
# as_string()
# ...
    AsFile,
    AsString,
    AsFileObject,
    AsJoblibObject
)


# Create bucketfs accessor object
# Note: Reading the available buckets etc. from the bucketfs service does not require credentials
bucketfs = BucketFs(
    host='localhost',
    port=1234,
)

# Create just a bucket accessor
# Note: consider taking the bucketfs service as a parameter instead of the host/port.
bucket = Bucket(
    host='localhost',
    port=1234,
    username='readuser',
    password='readpw',
    bucket='bucket',
    ...
)

# Access buckets
for bucket in bucketfs:
    print(bucket)

# Retrieve a specific bucket
bucket = bucketfs['bucketname']

# Upload data to a bucket
file_on_bucket = bucket.upload(content="Some String Content")
file_on_bucket = bucket.upload(content="Some String Content", name="explicit_filename.txt")

with open('/some/file.txt', 'r') as f:
    file_on_bucket = bucket.upload(content=f)


# List files in a bucket
for file in bucket:
    print(file)

# Download data from a bucket
file_content = bucket['my_text_file.txt']
file_content = bucket.download('my_text_file.txt')

# Conversion helpers
file = AsFile(file_content, '/home/my_file.txt')
string = AsString(bucket['myfile.txt'], encoding='utf-8')
joblib_obj = AsJoblibObject(file_content)

# Delete data from a bucket
bucket.delete('my_text_file.txt')
del bucket['my_text_file.txt']

Remove old API

Once the migration is complete, the old API and package can be deleted and a new version can be released.
(One may consider transitional releases with support for the old and new API + deprecation warnings)

Compute hash for all files in bucket via http download

Background:

  • In the past we often had problems with corrupted or wrongly uploaded files
  • Checking the checksum helped a lot in the past

Acceptance Criteria:

  • Implement a function which computes the checksum for all files in the bucket via http/s downloads
  • The checksum computation should be streamed, to reduce the memory footprint
  • Potentially parallelize this to speed up the computation
  • Provide a parameter to set the nrOfCores used for parallel computation
  • Different checksums should be usable, such as sha512, sha256, md5, ...
  • The checksum should be compatible with the checksum you get from the commandline tools sha512sum, sha256sum, md5sum

Wrong assumption of a path in a bucket always having subdirectories

Summary

If a bucket URL points to the bucket root, then the line below fails:

            base_path_in_bucket = PurePosixPath(url_path.parts[2]).joinpath(
                *url_path.parts[3:]
            )

This is in BucketFSFactory.create_bucketfs_location in bucketfs_factory.py, at the time of writing in line 45.
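
A sketch of a possible guard for the bucket-root case, assuming url_path is the path component of the BucketFS URL of the form /<bucket>[/path/in/bucket]:

# Sketch only: handle a URL that points at the bucket root (no sub-path).
from pathlib import PurePosixPath

url_path = PurePosixPath("/default")  # URL points at the bucket root

if len(url_path.parts) > 2:
    base_path_in_bucket = PurePosixPath(url_path.parts[2]).joinpath(*url_path.parts[3:])
else:
    base_path_in_bucket = None  # no sub-path inside the bucket
print(base_path_in_bucket)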

Related Issues

๐Ÿž Performance Regression in `bucketfs-python` Compared to `curl` and Previous API

@ahsimb reports bucketfs-python to be multiple times slower than curl.

Summary

The new bucketfs-python API is significantly slower when transferring large files (multiple MBs/GBs) compared to using curl and the previous API version.

Reproducing the Issue

Reproducibility: always

Steps to reproduce the behavior:

  1. Use the new bucketfs-python API to upload a large file (several MBs or GBs).
import exasol.bucketfs as bfs  # type: ignore

bucketfs = bfs.Service(buckfs_url, buckfs_credentials)
bucket = bucketfs[bucket_name]
bucket.upload(bfs_file_name, pickle.dumps(object))
  2. Compare the upload time with that of curl and the older bucketfs-python API method.
    Old API:
exasol_bucketfs_utils_python.bucketfs_location.BucketFSLocation.upload_fileobj_to_bucketfs

Expected Behaviour

The new bucketfs-python API should offer comparable performance to the old API and ideally also to methods like curl.

Actual Behaviour

The upload process with the new API is significantly slower than using curl and the previous API version, affecting efficiency and throughput for large file transfers.

Upload from byte string

Background

  • At the moment, we can only upload strings or files; however, sometimes you want to upload binary data that doesn't come from a file

Acceptance Criteria

  • We can upload binary data from a byte string to the BucketFS without any encoding

Move language container fixtures to own repository

Copies of the language container fixtures currently exist in multiple repositories of ours. It would be preferable to move them to their own repository, so we only use one centralized version.

  • Create new repository
  • Move files
  • Make sure all Projects use the files from the repository

Check if file exists in bucket

Background

  • Currently, you can only hope that a file you want to download exists

Acceptance Criteria

  • Add a function to check if a file or directory exists in the bucketfs

Upload directory to bucket

Background

  • We can currently only upload files to a bucket
  • Uploading directories with a complex directory structure is very difficult with this functionality.
  • We need to add functionality to upload a directory to a bucket directly.

Method

  • Directories could be zipped before uploading, so that we can then upload the zipped file to the bucket (see the sketch below).
  • After completing the upload operation, we have to unzip the file and extract the directory.
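
A sketch of the zip-based approach; the upload call at the end is a hypothetical placeholder for the actual BucketFSLocation API:

# Sketch only: zip a local directory in memory, then upload the archive.
import io
import zipfile
from pathlib import Path

def zip_directory(directory):
    buffer = io.BytesIO()
    root = Path(directory)
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in root.rglob("*"):
            if file.is_file():
                archive.write(file, arcname=file.relative_to(root))
    buffer.seek(0)
    return buffer

archive = zip_directory("my_model_dir")  # placeholder local directory
# bucketfs_location.upload_fileobj_to_bucketfs(archive, "my_model_dir.zip")  # hypothetical upload call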

Acceptance Criteria

  • Add upload_directory to BucketFSLocation and LocalBucketFSLocation
  • Add unit and integration tests

๐Ÿž Generating and deploying multi version documentation fails

Summary

Generating and deploying multi version documentation does not work properly in all scenarios and workflows.

Reproducing the Issue

Scenario 1:

  • Re-run a workflow which was run successfully for a push on master/main

Scenario 2:

  • Run CI-CD workflow (triggered by tag push)

Expected Behavior

Entire multi version documentation gets built and deployed to GitHub pages.

Actual Behavior

Documentation build/workflow fails.

Root Cause (optional)

No clear single root cause identified yet.

Leads

There are two major issues which have been identified so far:

  1. The docs build expects a setup/structure which causes a broken rendering of the API docs; otherwise it fails with
    /tmp/tmpywesr85d/worktrees/worktree_source/doc/api.rst:4:toctree contains reference to nonexisting document 'api/exasol_bucketfs_utils_python'
    
  2. The sgpg tool expects different parameters depending on whether a tag or a branch is used as the source, therefore the unparameterized
    GitHub workflow won't work properly in all cases.

Related Issues (optional)

Restrict typeguard version

  • typeguard 3.0.0 leads to TypeError: typechecked() got an unexpected keyword argument 'always'
  • use it as typeguard = "^2.11.1"

Compute hash sum for a file during download

Background:

  • In the past we often had problems with corrupted or wrongly uploaded files
  • Checking the checksum helped a lot in the past

Acceptance Criteria:

  • You can enable this feature with an option, default is off
  • Add the computation of the checksum to the download function
  • The checksum computation should be streamed, to reduce the memory footprint
  • Different checksums should be usable, such as sha512, sha256, md5, ...
  • The checksum should be compatible with the checksum you get from the commandline tools sha512sum, sha256sum, md5sum

The SSL certificate verification control doesn't work

There is now the verify parameter in the constructors of both the Service and the Bucket classes. However, when the bucket is accessed through the service, which is the conventional way of getting to it, the verify parameter is not passed on from the service to the bucket:

        return {
            name: Bucket(
                name=name,
                service=self._url,
                username=self._authenticator[name]["username"],
                password=self._authenticator[name]["password"],
                service_name=self._service_name
                # note: no verify=... is forwarded here, so the Bucket falls back to its own default
            )
            for name in buckets
        }

✨ Add extra features to new BucketFs API

Summary

Add missing extra features to new BucketFs API to match needs of UDF use cases

Details

Location / BucketFsLocation

Implement a location "type" which provides the ability to operate on a subpath within a bucket.
Ask @tkilias for more details.

UDF path

Provide an API to deduce the BucketFS path within a UDF.

Task(s)

  • Add Location support
  • Add UDF path/url support

Add DirectoryBucket (Pathlike, BucketPath, ...) to new API

Background

  • The new API is object-oriented
  • It uses objects like the Service, Bucket, MappedBucket
  • A MappedBucket is in that sense an Adapter for a Bucket. This way we can use composition instead of inheritance for the implementation.
  • However, Bucket and MappedBucket require the user to specify the path from the root
  • For more complex usage scenarios where multiple components of an application need to store objects in the BucketFS, we want that they can do that independent of each other.
    • For example, each component could use its own subdirectory. However, managing the absolute paths to subdirectories manually is error-prone.
  • For that reason, we need a DirectoryBucket which gets a Bucket and a path to a subdirectory and writes and reads objects below the subdirectory

Acceptance Criteria

  • Implement DirectoryBucket
    • read
    • write
    • delete
    • files
    • directories # return direct subdirectories as DirectoryBucket
    • join_path
  • Implement unit tests
  • Implement integration tests

Compute hash for files in glob in bucket from the filesystem of the UDFs

Background:

  • In the past we often had problems with corrupted or wrongly uploaded files
  • Checking the checksum helped a lot in the past

Acceptance Criteria:

  • Implement a function which computes the checksum for all files in a glob in the bucket from the file system in the UDFs
  • The checksum computation should be streamed, to reduce the memory footprint
  • Potentially parallelize this to speed up the computation
  • Provide a parameter to set the nrOfCores used for parallel computation
  • Different checksums should be usable, such as sha512, sha256, md5, ...
  • The checksum should be compatible with the checksum you get from the commandline tools sha512sum, sha256sum, md5sum
  • Two parts: one part runs in the UDF, the other part creates a UDF and runs it, but assume the Python package is already installed in the used language alias/container

Create bucket via XMLRPC

Background

  • At the moment, the user can only use existing buckets but can't create one
  • Creating buckets requires admin access to the cluster configuration, which happens currently via XMLRPC

Use Cases

In order to be able to integrate Exasol closely with ML pipelines, this feature would be highly helpful.
-- @exa-eswar

Acceptance Criteria

  • Add a function which creates a bucket in an existing bucketfs
