python-extension-common's Issues

Make temp_schema optional for wait_for_completion

At the moment, when a deployer wants to wait for completion, it always tries to create a temporary schema. This fails if the user has no permission to create schemas, and in that case the wait_for_completion option has to be disabled. It would be useful to be able to disable the temporary schema but still run wait_for_completion in the current schema. The dummy UDF should then perhaps be deleted when finished.
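A minimal sketch of how the schema choice could be made optional. The function name validation_schema, the parameter use_temp_schema, the temporary schema name, and the execute callable (standing in for a database connection) are all assumptions for illustration, not the existing PEC API.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def validation_schema(
    execute: Callable[[str], None],   # hypothetical stand-in for running SQL
    current_schema: str,
    udf_name: str,
    use_temp_schema: bool = True,     # hypothetical option name
    temp_schema_name: str = "TMP_SLC_VALIDATION",  # hypothetical name
) -> Iterator[str]:
    """Yield the name of the schema in which the dummy UDF should be created."""
    if use_temp_schema:
        execute(f'CREATE SCHEMA "{temp_schema_name}"')
        try:
            yield temp_schema_name
        finally:
            # Dropping the schema with CASCADE also removes the dummy UDF.
            execute(f'DROP SCHEMA IF EXISTS "{temp_schema_name}" CASCADE')
    else:
        try:
            # No schema is created here: reuse the current schema.
            yield current_schema
        finally:
            # Clean up only the dummy UDF, leaving the user's schema in place.
            execute(f'DROP SCRIPT IF EXISTS "{current_schema}"."{udf_name}"')
```

With use_temp_schema=False the only statement issued by the context manager is the final DROP SCRIPT, so the user's own schema is left in place.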

Make printing the activation commands optional

We don't always want to see the activation commands printed in the console. In AI-Lab in particular, they may confuse the user. She may think that they are still part of the installation routine, whereas AI-Lab runs the session-level activation itself when it opens a DB connection.

  • There should be a parameter that turns the printing of the activation commands on/off. The default value should be ON.
  • When printing the messages, only the session-level activation command should be printed in full. It should be followed by a warning that applying a similar activation command at the system level may deactivate languages that were activated after this message was printed.
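The two points above could look roughly like this. The function name report_activation, the parameter print_activation_statements, and the exact wording of the messages are hypothetical; the sketch only illustrates a switch that defaults to ON and a warning in place of the system-level command.

```python
from typing import Callable

SYSTEM_LEVEL_WARNING = (
    "Warning: applying a similar command at the SYSTEM level may deactivate "
    "languages that were activated after this message was printed."
)


def report_activation(
    session_command: str,
    print_activation_statements: bool = True,  # hypothetical name, default ON
    out: Callable[[str], None] = print,
) -> None:
    """Print the session-level activation command, unless printing is turned off."""
    if not print_activation_statements:
        return
    out("To activate the language container in the current session, run:")
    out(session_command)
    # The system-level command is deliberately not printed in full.
    out(SYSTEM_LEVEL_WARNING)
```

A caller such as AI-Lab would pass print_activation_statements=False, while interactive use would keep the default.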

Activate validation by verifying existence of file exasol-manifest.json

Ticket #42 requested enhancing the SLC Deployer to optionally wait until the SLC is deployed to all nodes in the database cluster. This is implemented by verifying the existence of the file exasol-manifest.json, which was added to all SLCs in the scope of ticket exasol/script-languages-container-tool#221.

It turned out that the integration tests of PEC failed because they used an SLC release that did not yet contain a manifest.

The developers decided to temporarily use the file /exaudf/exaudfclient_py3 as a workaround. This doesn't make PEC worse than before but will not provide enhanced validation.

The current ticket requests replacing the filename again as soon as the next release of script-languages-release is available, built with an updated version of SLCT that provides the file exasol-manifest.json.

Another option would be to

  • wait for PR #47 to be merged
  • create a new release 0.21.0 for SLCT
  • update dependencies of PEC
  • build a custom SLC incl. manifest file
  • use this SLC in the integration test of the ExtractValidator

Building the SLC is probably already implemented in a pytest fixture in the scope of ticket #45.

SLC Deployer: Add timeout option to CLI

In the scope of ticket #50 a new ExtractValidator has been implemented.

The current ticket requests adding options to the CLI so that an interactive user can control the timeout duration.
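As a rough illustration, a timeout option could be exposed as below. The sketch uses the standard library's argparse only to stay self-contained; the actual CLI may be built with a different library, and the option names --deploy-timeout-minutes and --display-progress as well as the default of 10 minutes are assumptions.

```python
import argparse
from datetime import timedelta
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="deploy-slc-sketch")
    parser.add_argument(
        "--deploy-timeout-minutes",  # hypothetical option name
        type=int,
        default=10,                  # hypothetical default
        help="Give up waiting for the SLC deployment after this many minutes.",
    )
    parser.add_argument(
        "--display-progress",        # hypothetical option name
        action="store_true",
        help="Report progress while waiting for the deployment.",
    )
    return parser


def timeout_from_args(argv: List[str]) -> timedelta:
    """Translate the CLI option into the duration passed on to the validator."""
    args = build_parser().parse_args(argv)
    return timedelta(minutes=args.deploy_timeout_minutes)
```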

Investigate how to delete an SLC uploaded by an integration test

Multiple integration tests use the same instance of the DockerDb. Most tests upload an SLC to the bucketfs of the ITDE. Currently, we have interference between tests, as the containers uploaded by previous tests are not removed.

Removing a container has side effects.

  • When a file gets deleted, then uploading another file with the same name immediately afterward fails. This means there should be a delay between running subsequent tests if they use the same bucketfs file name.
  • A language container is an archive that gets unpacked in the bucketfs. Apparently, deleting the tar file also deletes the unpacked directory. This needs to be confirmed. We also need to find out whether it affects the length of the required delay between tests.

PEC: Update to Python 3.10

Update

  1. minimum Python version in file pyproject.toml
  2. dependencies, by running poetry update
  3. matrix build definitions in the files in .github/workflows (proposal: update PTB and re-generate the workflows)
  4. actions/checkout to v4 in GitHub workflows

Optionally wait until SLC is deployed to all nodes in the database cluster

Tasks

  1. feature
    ckunki
  2. feature
    ckunki
  3. refactoring
    ckunki
  4. feature
    ckunki
  5. refactoring
  6. refactoring
    ckunki

SLC deployer: Convert CLI tests to unit tests with mock

What has happened before?

  • A new ExtractValidator has been implemented as requested by #50
  • And integrated into LanguageContainerDeployer as requested by #49

Scope of the current ticket

The current ticket additionally requests converting the CLI tests for the SLC deployer to unit tests with mock.

What is planned next?

In the scope of ticket #52, we want to add CLI options for the timeout settings.

Add a function validating the completion of the SLC deployment.

The function should try to create a UDF and run it at all nodes. These steps need to be repeated until success or timeout.
Since success is not 100% reliable proof that all files have been extracted from the archive (some packages not used by the UDF may not yet be extracted), the function should pause for a fixed time at the end.

The LanguageContainerDeployer should use this function in its container uploading process.
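The behaviour described here (repeat until success or timeout, then pause for a fixed time) can be sketched with the standard library alone. The function name wait_for_deployment, its parameters, and the default durations are assumptions; udf_succeeds_on_all_nodes stands for whatever code creates and runs the UDF.

```python
import time
from typing import Callable


def wait_for_deployment(
    udf_succeeds_on_all_nodes: Callable[[], bool],  # hypothetical probe
    timeout: float = 600.0,
    retry_interval: float = 2.0,
    final_pause: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Retry the probe until it succeeds or the timeout expires, then pause."""
    deadline = clock() + timeout
    while True:
        try:
            if udf_succeeds_on_all_nodes():
                break
        except Exception:
            # Assumption: creating or running the UDF may raise while the
            # container is still being extracted, so this counts as "not yet".
            pass
        if clock() >= deadline:
            raise TimeoutError("SLC deployment was not confirmed in time")
        sleep(retry_interval)
    # Success is not full proof of a complete extraction, hence the fixed pause.
    sleep(final_pause)
```

Passing clock and sleep as parameters is a design choice that keeps the sketch testable without real waiting.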

Error in the report workflow

The release workflow for the python-extension-common doesn't finish cleanly. It fails in the report.yml at the attempt to copy the coverage, as shown here.

A temporary solution is to disable the report.yml by commenting out the calls to it in ci.yml and ci-cd.yml.

SLC Deployer: Check integration tests

A new ExtractValidator has been implemented and integrated into class LanguageContainerDeployer in the scope of tickets #50 and #49.

The current ticket additionally requests checking the existing integration tests of the SLC Deployer

  • if they are actually waiting and
  • if we have a test without waiting

Create new Extract validator

Background of Current Implementation, Additional improvements and Features

  • Check for file manifest.json to be extracted from the SLC
    • (appended as last file to enable reliable detection of complete extraction)
    • check the file to exist on all nodes of the database cluster
  • Support a callback function for updates regarding validation in progress
  • More robust tests
    • Mocking time.monotonic() in tests involving tenacity's Retrying
    • Injecting the validator as a separate object into LanguageContainerDeployer
  • Support configuring timeouts in production usage and via CLI
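To illustrate the idea of mocking time.monotonic() so that timeout tests do not really wait, here is a small sketch. It uses a plain polling loop instead of tenacity to stay dependency-free; the function poll and the test are hypothetical examples, not code from PEC.

```python
import itertools
import time
from typing import Callable
from unittest import mock


def poll(check: Callable[[], bool], timeout: float) -> bool:
    """Call check() repeatedly until it returns True or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
    return False


def test_poll_times_out_without_waiting() -> None:
    # Fake clock: 0 s at the start, 1 s on the first loop check, then it
    # stays at 6 s, which is past the 5 s deadline.
    fake_times = itertools.chain([0.0, 1.0], itertools.repeat(6.0))
    with mock.patch("time.monotonic", side_effect=fake_times):
        assert poll(lambda: False, timeout=5.0) is False
```

The test finishes immediately although it simulates a five-second timeout, which is the main benefit of controlling the clock.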

Design

  • Create method in PEC language_container_deployer.py
  • Use method as_udf_path()
  • add an optional call-back for reporting progress (n of m succeeded)
  • Use Context Manager for database schema: with temp_schema(self._pyexasol_conn) as schema:
  • Create a UDF in this schema
  • Create a busy loop with tenacity, incl. number of retries or a global timeout
    • Run UDF on all nodes in parallel using GROUP BY IPROC()
    • Each UDF reports if the expected file exists
    • Aggregate all UDF results
    • Total success requires all UDFs to succeed
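The aggregation step and the optional progress callback from the list above might be sketched as follows. The function name all_nodes_ready and the shape of results (a mapping from node number to the UDF's boolean result) are assumptions.

```python
from typing import Callable, Mapping, Optional


def all_nodes_ready(
    results: Mapping[int, bool],
    progress: Optional[Callable[[int, int], None]] = None,
) -> bool:
    """Return True only if every node reported that the expected file exists."""
    succeeded = sum(1 for ok in results.values() if ok)
    total = len(results)
    if progress is not None:
        # Report "n of m succeeded" to the caller.
        progress(succeeded, total)
    # An empty result set is treated as "not ready" rather than as success.
    return total > 0 and succeeded == total
```

A retry loop would call this function once per attempt with the rows returned by the SELECT statement and stop as soon as it returns True.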

Path Construction in language_container_deployer.py

import re

# Strip the archive suffix to get the directory the SLC is extracted to.
name = re.sub(r"\.(tar|tar\.gz|zip|gzip)$", "", file_path.name)
manifest = file_path.parent / name / "exasol-manifest.json"
udf_path = manifest.as_udf_path()
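The effect of the regular expression can be checked in isolation. The sketch below uses PurePosixPath as a stand-in for the bucketfs path object, so it omits as_udf_path(); the helper name manifest_path is hypothetical.

```python
import re
from pathlib import PurePosixPath

ARCHIVE_SUFFIX = re.compile(r"\.(tar|tar\.gz|zip|gzip)$")


def manifest_path(archive: PurePosixPath) -> PurePosixPath:
    """Derive the expected manifest location from the uploaded archive's path."""
    # "sample.tar.gz" becomes "sample"; names with other suffixes stay unchanged.
    name = ARCHIVE_SUFFIX.sub("", archive.name)
    return archive.parent / name / "exasol-manifest.json"


print(manifest_path(PurePosixPath("/buckets/uploads/default/sample.tar.gz")))
# prints /buckets/uploads/default/sample/exasol-manifest.json
```

Note that with this pattern a name such as sample.tgz is left as is, because .tgz is not among the listed suffixes.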

Proposal for UDF

--/
CREATE OR REPLACE PYTHON3 SCALAR SCRIPT manifest(my_path VARCHAR(256)) RETURNS BOOL AS
import os
def run(ctx):
    return os.path.isfile(ctx.my_path)
/

SELECT statement

After uploading file sample.tar.gz:

SELECT IPROC() "Node", manifest('/buckets/uploads/default/sample/exasol-manifest.json') "Manifest"
FROM VALUES BETWEEN 0 AND 5 GROUP BY IPROC();

Result:

NODE  MANIFEST
0     true
1     true
2     true
3     true

Retrying a code block with tenacity

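from tenacity import RetryError, Retrying, stop_after_attempt

try:
    # Retry the block up to three times before giving up.
    for attempt in Retrying(stop=stop_after_attempt(3)):
        with attempt:
            raise Exception("My code is failing!")
except RetryError:
    # Raised once all attempts have failed.
    pass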

Fix validation UDF and the way it is called.

There are multiple problems with the dummy UDF in language_container_validation.py.
Firstly, it needs to be a SET UDF. Secondly, the way it is called doesn't actually make it run on all nodes.

The proposed change:

CREATE OR REPLACE {language_alias} SET SCRIPT {udf_name}(i DECIMAL(10, 0))
RETURNS DECIMAL(10, 0) AS

def run(ctx):
    return ctx.i
/

Then call this udf like this:

SELECT NPROC();
SELECT {udf_name}(i) FROM VALUES BETWEEN 1 AND {num_nodes} t(i) GROUP BY i;
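A small sketch of filling in the second statement once the number of nodes is known from SELECT NPROC(). The helper name udf_call_query and the identifier check are assumptions added for illustration; the statement text itself is the one proposed above.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def udf_call_query(udf_name: str, num_nodes: int) -> str:
    """Render the proposed SELECT for a given UDF name and node count."""
    # Guard against accidental SQL injection via the UDF name.
    if not _IDENTIFIER.match(udf_name):
        raise ValueError(f"not a plain identifier: {udf_name!r}")
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return (
        f"SELECT {udf_name}(i) FROM VALUES BETWEEN 1 AND {num_nodes} t(i) "
        "GROUP BY i;"
    )
```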

Create the documentation

Initially, the language container deployment documentation can be taken from the Transformers extension repo.

PEC: Use plugin pytest-saas

  • Replace dependencies on git repositories with dependencies on PyPI packages
  • Add a dependency on pytest-exasol-saas
  • Replace fixtures
