Comments (7)
Yeah, got it. Now I have a good idea of what should be done.
from protes.
Note that it is easy to set up another database collection via the FOCA-based app config used in proTES (it becomes basically a declarative issue).
Hi @uniqueg, I would like to work on this issue, but I have some doubts I would like to clear up first.
As I understood, after the first call we need to store the geolocations to the database and afterwards simply fetch them from the db.
I am confused about what is meant by "first call" here. Is it the first call of a session initiated by a new user, the first call of the day, or the first call within a particular timeframe? Knowing what a first call is will help simplify the process.
Also, for "fetching the missing geolocations from the remote service and adding them to the database collection": when should one check whether there are missing geolocations?
Great, @limarkdcunha, please go ahead and assign yourself :)
As for your question: Just check for every domain if it's in the database collection. If it is, then use the stored geolocations. If it isn't, follow the current procedure and then store them in the database collection. So for now let's assume all domain-coordinate pairs are stored in the database forever and are never updated once created (unless the database is wiped). This is of course not suitable for a production setting, but it's pretty much the same configuration we have for tasks, and we can worry about life cycle management for these documents later on.
So you can just follow the way we are dealing with the tasks collection, including its registration via the app configuration in config.yaml, which takes care of the boilerplate of setting up and configuring the collection.
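The lookup described above (check the collection for each domain, otherwise fetch and store) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proTES implementation: a plain dict stands in for the database collection, and `get_geolocations`, `fetch_remote` and the example domains are hypothetical names.

```python
from typing import Callable, Dict, Iterable, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


def get_geolocations(
    domains: Iterable[str],
    store: Dict[str, Coordinates],  # hypothetical stand-in for the db collection
    fetch_remote: Callable[[str], Coordinates],  # hypothetical remote lookup
) -> Dict[str, Coordinates]:
    """Return coordinates for each domain, fetching only the missing ones."""
    result: Dict[str, Coordinates] = {}
    for domain in domains:
        if domain not in store:
            # Not stored yet: follow the current remote procedure, then persist.
            store[domain] = fetch_remote(domain)
        # Stored entries are used as-is and never refreshed once created.
        result[domain] = store[domain]
    return result


# Usage with a fake remote service that records which domains it was asked for.
calls = []


def fake_fetch(domain: str) -> Coordinates:
    calls.append(domain)
    return (0.0, 0.0)


store = {"tes.example.org": (47.56, 7.59)}
locations = get_geolocations(
    ["tes.example.org", "data.example.org"], store, fake_fetch
)
```

In this sketch only `data.example.org` triggers a remote call; the already stored TES domain is served from the store.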
Hi @uniqueg, thanks for the clarification.
From what I understand from the codebase, there are two task_distribution functions (random and distance). If input URLs are present in the request, distance-based task_distribution is called; otherwise the random one is called.
Random TD does nothing but reshuffle the 5 domains present in the app context from config.yaml. Here I am not able to figure out where the geolocations are fetched for a particular domain. As far as I can infer from the create_task function in task_runs.py, the URL itself is used to create a task using tes.HTTPClient.
Also, in your first comment you said "In this way, for the vast majority of calls, only input URI geolocations will need to be fetched from the remote service". I find this statement a little confusing because, as I stated above, distance-based task_distribution is not called without input_urls. Can you please shed some light on this as well?
One final thing: I am not able to assign the issue to myself.
I have assigned you @limarkdcunha.
Geolocations are only used for the distance-based distribution logic. The point is to choose a TES instance that is closest to the inputs. So if there are no inputs, it doesn't make sense to use it, and we just distribute tasks randomly. In that case, geolocations don't play a role.
So geolocations are ONLY used in the distance-based task distribution logic, meaning that for addressing this issue, you will not need to look at the random distribution logic at all.
Does that make it clear?
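As a rough illustration of the split described above: geolocations only matter when there are inputs, and otherwise the instances are simply shuffled. The sketch below is an assumption-laden example, not the actual proTES code; `rank_tes_instances`, `haversine_km`, the instance names, and the choice to rank by summed great-circle distance are all hypothetical.

```python
import math
import random
from typing import Dict, List, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Coordinates, b: Coordinates) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def rank_tes_instances(
    tes: Dict[str, Coordinates], inputs: List[Coordinates]
) -> List[str]:
    """Order TES instances: nearest to the inputs first, or random if none."""
    if not inputs:
        # No inputs: geolocations play no role, so distribute randomly.
        ranked = list(tes)
        random.shuffle(ranked)
        return ranked

    def total_distance(name: str) -> float:
        # Assumption: "closest" means smallest summed distance to all inputs.
        return sum(haversine_km(tes[name], loc) for loc in inputs)

    return sorted(tes, key=total_distance)


tes_instances = {"zurich": (47.37, 8.54), "toronto": (43.65, -79.38)}
with_inputs = rank_tes_instances(tes_instances, [(46.95, 7.45)])
without_inputs = rank_tes_instances(tes_instances, [])
```

With an input located near Bern, the Zurich instance ranks first; with no inputs, the order is arbitrary.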
Mental note: Now that we provide the distance-based task distribution plugin (inside proTES, yet still apart), we need to think carefully about how to set up the database so that it is not too tightly coupled with the main app db. Ideally, a collection for storing IP geolocations should be created automatically (and only) when the plugin is used.