
elpiscloud's People

Contributors

aviraljain99, dependabot[bot], harrykeightley, mattchrlw, nicklambourne


elpiscloud's Issues

Make frontend api calls fail gracefully.

For example, within lib/api/files.ts, getSignedUploadURLs and getUserFiles are async methods that make network calls such as fetch and getDocs. These operations can fail, and the resulting errors (e.g. HTTP errors) need to be handled properly.

Similarly, the call sites of lib/api/files.ts, such as the dataset-loading logic in components/datasets/DatasetViewer.tsx, depend on this error handling being done correctly.

`libsndfile` C library missing in cloud functions, which causes the soundfile package to fail.

  • A new sound library needs to be found for performing conversions/audio processing in the cloud functions.
  • Currently we're using soundfile, but it relies on the underlying system having libsndfile installed (which cloud functions don't have, as far as I can see).
  • Librosa also uses soundfile for its audio I/O, so probably neither of these will work.
  • After this, the audio utilities in the cloud functions folder will need to be rewritten.
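Until a replacement library is chosen, one interim mitigation is to guard the import so a missing libsndfile surfaces as a clear runtime error when the audio utilities are called, rather than crashing the whole function at import/deploy time (as in the log dump below). This is only a sketch; `read_audio` is a hypothetical wrapper, not an existing elpiscloud function.

```python
# Guard the soundfile import: a missing libsndfile should produce a clear
# error at call time, not break module import for the entire deployment.
try:
    import soundfile as sf
except (ImportError, OSError):  # soundfile raises OSError('sndfile library not found')
    sf = None


def read_audio(path):
    """Hypothetical wrapper around soundfile.read with a clearer failure mode."""
    if sf is None:
        raise RuntimeError(
            "soundfile/libsndfile is unavailable in this runtime; "
            "audio processing needs a replacement library"
        )
    return sf.read(path)
```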

Log dump:

Traceback (most recent call last):
  File "/layers/google.python.pip/pip/bin/functions-framework", line 8, in <module>
    sys.exit(_cli())
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/functions_framework/_cli.py", line 37, in _cli
    app = create_app(target, source, signature_type)
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/functions_framework/__init__.py", line 288, in create_app
    spec.loader.exec_module(source_module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/workspace/main.py", line 4, in <module>
    from functions.datasets.process_file import process_dataset_file
  File "/workspace/functions/datasets/process_file.py", line 9, in <module>
    import utils.audio as audio
  File "/workspace/utils/audio.py", line 3, in <module>
    import soundfile as sf
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/soundfile.py", line 142, in <module>
    raise OSError('sndfile library not found')
OSError: sndfile library not found

`process_model` can't serialize the Model status enum.

The process_model cloud function fails when trying to serialize the model status enum, as seen in the logs below.

Traceback (most recent call last):
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/flask/app.py", line 2073, in wsgi_app
    response = self.full_dispatch_request()
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/flask/app.py", line 1518, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/flask/app.py", line 1516, in full_dispatch_request
    rv = self.dispatch_request()
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/flask/app.py", line 1502, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/functions_framework/__init__.py", line 171, in view_func
    function(data, context)
  File "/workspace/functions/process_model.py", line 31, in process_model
    publish_to_topic(PUBSUB_TOPIC, [model.to_dict()])
  File "/workspace/utils/pubsub.py", line 33, in publish_to_topic
    serialized = json.dumps(obj)
  File "/opt/python3.10/lib/python3.10/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/opt/python3.10/lib/python3.10/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/opt/python3.10/lib/python3.10/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/opt/python3.10/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type TrainingStatus is not JSON serializable
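A common fix for the error above is to pass a `default` hook to `json.dumps` that converts enum members to their underlying values before publishing. The sketch below assumes illustrative `TrainingStatus` values, not the actual elpiscloud enum:

```python
import json
from enum import Enum


class TrainingStatus(Enum):
    # Illustrative values; the real elpiscloud enum may differ.
    WAITING = "waiting"
    TRAINING = "training"
    FINISHED = "finished"


def serialize_default(obj):
    """json.dumps hook: represent enum members by their underlying value."""
    if isinstance(obj, Enum):
        return obj.value
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


payload = json.dumps({"status": TrainingStatus.TRAINING}, default=serialize_default)
# → '{"status": "training"}'
```

Alternatively, declaring the enum as `class TrainingStatus(str, Enum)` makes its members plain strings that `json.dumps` handles natively, with no custom hook needed.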

Trainer topic subscription idempotency.

The trainer subscription can fire the same training event multiple times, which would start multiple training runs on Cloud Run (wasting resources, overwriting models, etc.).

This is not a priority to fix at this point, but something to think about.

Copying over a reply below from Nick.

Maybe I just put it as a flag on the firestore Model that training has begun?

If Firebase has atomic transactions this might work; otherwise you're just going to have the same race condition in a different place. The ideal scenario is to have idempotent training jobs, where the end state isn't affected by multiple runs (either because the job realises that a training run with the given inputs has already occurred and short-circuits, or it does the training again in a non-destructive way; both of these typically involve hash comparisons between inputs/outputs). This isn't easy to pull off, and probably isn't a priority at the moment, but something to think about.

Originally posted by @nicklambourne in #89 (comment)
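As a sketch of the hash-comparison idea above, a training job could compute a deterministic fingerprint of its inputs and short-circuit when a completed job with the same fingerprint already exists. `job_fingerprint` is a hypothetical helper; the Firestore lookup that would use it is left out.

```python
import hashlib
import json


def job_fingerprint(dataset_files, options):
    """Deterministic hash of a training job's inputs (hypothetical helper).

    Two invocations with the same files and options yield the same
    fingerprint regardless of file ordering, so a worker can check for
    an existing completed job with this fingerprint and skip retraining.
    """
    payload = json.dumps(
        {"files": sorted(dataset_files), "options": options},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Firestore does support transactions, so the simpler "flag on the Model" approach could also be made atomic by reading and setting the flag inside one transaction.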

Sign in should redirect to the home page

Router improperly redirects on sign-in.

Currently, upon signing in to elpis.cloud, the router pops the last page off the stack, which can have the effect of navigating off the site.

We want the router to redirect to the home page instead after users log in.

Front-end view for creating a training job

TODO

  • When users visit /train, they should first of all be prompted to select a dataset to train with.
  • If the user has datasets but they are not yet processed, they should still see datasets, but there should be something to inform the user that they cannot select this until processing has completed.
  • If there are no datasets at all for a certain user, there should be an option to navigate to the datasets page to create one.
  • After selecting a dataset, the user should be provided with training options similar to the desktop version of elpis.
  • Once the options and dataset have been selected, the frontend should create a new training job in Firestore with information about the date, the dataset involved, and the selected options.

Model duplication across frontend, services and cloud functions.

Essentially we'll have four representations of models like Dataset, Model, etc. that could get out of sync in the future.

  • We could write some integration tests.
  • Or we could find some way of writing the model definitions once and propagating them through all parts of elpiscloud.

Pin cloud function dependencies

The cloud function dependencies aren't currently pinned at any version, which has the potential to create problems down the track if there are breaking changes (as outlined by Nick in a previous PR).

So essentially we just need to go through the requirements.txt file and find a set of compatible versions to freeze at.

Change tagging CSS to use Tailwind CSS

globals.css contains CSS for the tagging functionality; this needs to be changed to use Tailwind CSS. The tagging UI also needs to be changed to use a cross icon rather than the rotated plus sign it is using at the moment.

Trainer subscription broke during deploy with incompatible options

Arose from #86

This wasn't caught at the Terraform planning stage, but exactly-once delivery and push config are apparently incompatible options on a Pub/Sub subscription, and the build pipeline failed with:

│ Error: Error updating Subscription "projects/elpiscloud/subscriptions/trainer_subscription": googleapi: Error 400: A subscription cannot have push config or bigquery config set with exactly once delivery configured.
│ 
│   with module.trainer.google_pubsub_subscription.subscription,
│   on ../../modules/service/main.tf line 40, in resource "google_pubsub_subscription" "subscription":
│   40: resource "google_pubsub_subscription" "subscription" {

Resampling done improperly during dataset preprocessing.

After issues with the soundfile library (#71), I tried to rewrite the audio utilities using the Python built-in wave module.

When I did so, I assumed that resampling just meant modifying the sample_rate metadata on the audio file, which has caused some odd bugs with the processed dataset files.

The processed dataset files are nothing like what was expected, and instead of writing a custom resampling algorithm, we should find a better external audio utility library.
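For reference, actual resampling has to interpolate new sample values at the target rate, not just rewrite the header. A minimal linear-interpolation sketch (assuming mono float samples) illustrates the difference; it is only a stopgap compared to a proper library, since real resamplers low-pass filter first to avoid aliasing when downsampling:

```python
import numpy as np


def resample(samples, src_rate, dst_rate):
    """Naive linear-interpolation resampler for mono audio (sketch only).

    Computes how many output samples the same duration needs at the new
    rate, then interpolates sample values at the new time points.
    """
    samples = np.asarray(samples, dtype=np.float64)
    n_out = int(round(len(samples) * dst_rate / src_rate))
    src_times = np.arange(len(samples)) / src_rate
    dst_times = np.arange(n_out) / dst_rate
    return np.interp(dst_times, src_times, samples)
```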

Header navbar links are not proper links

  • Header links are <li> elements wrapped in a next/link component.
  • When right-clicking one of the nav links, we don't get the traditional browser options like "open in new tab", etc.

Show user a view of all files in a given dataset

Currently, DatasetViewer.tsx shows the user a list of all their datasets. Each row should contain a button to 'view' that dataset, which should open a simple view listing all of the user's files that belong to the dataset.

Refactor the elan processing/cleaning scripts taken from desktop elpis.

Not high priority, but the scripts taken from the desktop version could, and probably should, be improved in the following ways (stolen from @nicklambourne):

  • Dataclasses for utterances
  • Smaller functions
  • Significantly fewer parameters per function
  • Removing questionable default parameters ("", {None})
  • Replacing prints with calls to logging
  • Actioning or removing TODOs
  • Simplifying the "clean" function logic
  • Separating out the English-removal functionality.

Trainer pubsub subscription retries thousands of times before getting a response.

There's an error that occurs while running the trainer service, where some file isn't found where it's expected.

  • Pub/Sub messages are 'outstanding' until the subscriber sends an ack response.
  • If no ack response is received within a set time (the ack deadline, which defaults to 10 seconds), delivery is retried until it succeeds.
  • Currently the trainer service waits for the entire training run to complete before returning a response.
  • This means that although the Cloud Run service was running, the Pub/Sub subscription had no way to verify that its request had been received, and kept resending requests to the Cloud Run server.
  • Obviously this is a massive issue if we're paying for big compute resources to run the Docker images, and they continue to run forever because they get spammed with requests.
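One mitigation is to acknowledge the push request immediately and run training in the background; the stdlib-only sketch below shows the shape of that (`handle_push` and `run_training` are hypothetical). Two caveats: Cloud Run may throttle CPU for background threads once the response is returned, so handing the job to a queue such as Cloud Tasks may be a better fit, and simply raising the subscription's ack deadline is another (bounded) option.

```python
import threading


def run_training(job):
    # Placeholder for the long-running training work.
    job["done"] = True


def handle_push(job):
    """Ack the Pub/Sub push immediately; train in the background (sketch)."""
    worker = threading.Thread(target=run_training, args=(job,), daemon=True)
    worker.start()
    # Returning a 2xx promptly acknowledges the push delivery,
    # so Pub/Sub stops retrying while training continues.
    return 204, worker


job = {}
status, worker = handle_push(job)
worker.join(timeout=1)
```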

Visible user uploaded files at /files should not be limited to wav and elan.

  • Currently the files viewer at /files only shows uploaded Elan and wav files, which can be confusing if you've also uploaded txt files such as pronunciation dictionaries.
  • We should make the uploaded files section look similar to the dataset file selector, showing every file in a list view instead of segmenting it into multiple sections.

Refactoring resource names in Terraform scripts

Instead of using hyphens in resource names in Terraform, underscores should be used. For example, in architecture/modules/frontend_bucket/main.tf, google_storage_bucket is named static-site. This will be changed to static_site.

Dataset cloud function preprocessing should handle improper datasets.

Currently the preprocessing workflow blindly attempts to create processing jobs from the files supplied in a dataset.
This would fail if the user selected improper files, not enough files etc.

Instead, after forming a dataset object from the incoming event, the process_dataset function should check dataset validity before batching and pushing jobs to the pubsub queue.

If the dataset is found to be invalid, we should indicate this on the dataset model in Firestore (e.g. by setting its status to an error value).
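A sketch of what such a validity check might look like, assuming (purely for illustration) that a valid dataset pairs each .eaf transcription with a same-named .wav recording; the real rules would live in process_dataset:

```python
from pathlib import Path


def validate_dataset(file_names):
    """Return a list of validity errors; an empty list means valid.

    The pairing rule (each .eaf needs a same-named .wav and vice versa)
    is an illustrative assumption, not elpiscloud's actual rule set.
    """
    errors = []
    if not file_names:
        errors.append("dataset contains no files")
    stems = {}
    for name in file_names:
        path = Path(name)
        stems.setdefault(path.stem, set()).add(path.suffix.lower())
    for stem, suffixes in sorted(stems.items()):
        if ".eaf" in suffixes and ".wav" not in suffixes:
            errors.append(f"{stem}.eaf has no matching .wav")
        if ".wav" in suffixes and ".eaf" not in suffixes:
            errors.append(f"{stem}.wav has no matching .eaf")
    return errors
```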

On the files page, users should have the option to add tags to their files.

Tags should be used to organise the user uploaded files in firestore.

Each user file in firestore currently has an empty list of tags that a user should be able to update from the frontend. These tags should appear when viewing the files a user has uploaded, e.g. from the files page or from the create-new-dataset page. They allow users to filter files not only by name but also by other criteria, should they so choose.
