Domain Discovery on Any Domain
nasa-jpl-memex / sce-domain-discovery
Domain Discovery for the Sparkler Crawl Environment
License: Apache License 2.0
We should bring back the alert message on successful model updates
The search box should accept advanced search capabilities, e.g. adding a - before a word should blacklist it from the search.
This should be common functionality for our search engine; we just need to be able to pass it on. Then we can include advanced search instructions as well.
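If the search engine already understands the `-word` syntax, passing the raw query string through is enough. If it does not, a small parser along these lines could split the query first. This is only a sketch: the function name `parse_query` and the return shape are assumptions, not existing code in this project.

```python
from typing import List, Tuple


def parse_query(query: str) -> Tuple[List[str], List[str]]:
    """Split a raw search query into (included, excluded) terms.

    Hypothetical helper: a token starting with '-' is treated as a
    blacklisted word; everything else is a normal search term.
    """
    included: List[str] = []
    excluded: List[str] = []
    for token in query.split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])  # strip the leading '-'
        elif token != "-":              # ignore a lone '-'
            included.append(token)
    return included, excluded
```

For example, `parse_query("mars rover -toy")` separates `mars` and `rover` from the blacklisted `toy`.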
When the user performs a search query, show a progress bar or something similar rather than the infinity GIF. The GIF does not indicate whether Firefox has crashed or the parsing is just taking longer.
We need to have a /status endpoint that the UI can query to get the status of the system, e.g. working, exception, idle.
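One possible shape for the payload behind /status, sketched with the standard library only. The names `SystemState` and `StatusTracker`, and the `status` and `detail` fields, are assumptions for illustration; wiring this into the actual web framework is left out.

```python
import json
from enum import Enum


class SystemState(Enum):
    """States named in the issue; the enum itself is hypothetical."""
    IDLE = "idle"
    WORKING = "working"
    EXCEPTION = "exception"


class StatusTracker:
    """Holds the current state so a /status handler can report it."""

    def __init__(self) -> None:
        self.state = SystemState.IDLE
        self.detail = ""

    def update(self, state: SystemState, detail: str = "") -> None:
        self.state = state
        self.detail = detail

    def to_json(self) -> str:
        # Body a /status handler could return to the UI.
        return json.dumps({"status": self.state.value, "detail": self.detail})
```

The search code would call `update(SystemState.WORKING, ...)` before it starts and `update(SystemState.IDLE)` when it finishes, so the UI can poll instead of relying on the GIF.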
The following items in the interface should have tooltips added to them:
If there is a seed file and at least 10 of each type of relevancy marked, the crawl button should be usable. It will just launch the crawl.
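The enabling rule could be expressed as a small check like the one below. It is a sketch under assumptions: the label strings and the function name `crawl_enabled` are made up for illustration, and only the threshold of 10 comes from the issue.

```python
from collections import Counter
from typing import Iterable

# Assumed label names; the real model may use different identifiers.
LABELS = ("highly relevant", "relevant", "not relevant")
MIN_PER_LABEL = 10  # threshold stated in the issue


def crawl_enabled(has_seed_file: bool, marked_labels: Iterable[str]) -> bool:
    """Return True when the crawl button should be usable."""
    counts = Counter(marked_labels)
    enough_labels = all(counts[label] >= MIN_PER_LABEL for label in LABELS)
    return bool(has_seed_file) and enough_labels
```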
Just to list some of the things that need to be resolved in the future:
Either make Accuracy mean something or replace it with something else useful, e.g. the number of Highly Relevant, Relevant, and Not Relevant pages that have been marked.
If we're going to launch a crawl that will run forever, we need a stop crawl button also.
When a seed file is uploaded, its name should be displayed in the interface.
Clicking on this name should bring up a list of the URLs in the file.
There may be some additional capabilities that should go along with this. Let's discuss before working on it.
It is default internet usage. Please don't make me click the magnifying glass!!
The seeds are the SME's best proposal of exactly what they are looking for. They should be added to the domain discovery model as "highly relevant".
The users should be able to save and load a model on the server. For instance, the user can save a model on the server by giving a name and load it subsequently. All the saved models must be reported in a list where the users can select the model to be used.
These new functions will replace import and export (the model can no longer be exported to the local machine).
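A minimal sketch of named save, load, and list on the server, assuming models can be pickled and live in a single directory. The class `ModelStore`, the `.pkl` extension, and the name rule are all assumptions, not the project's existing storage scheme.

```python
import pickle
import re
from pathlib import Path
from typing import Any, List


class ModelStore:
    """Hypothetical server-side store: one pickle file per named model."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, name: str) -> Path:
        # Restrict names so a user cannot escape the model directory.
        if not re.fullmatch(r"[A-Za-z0-9_-]+", name):
            raise ValueError("model name may only contain letters, digits, _ and -")
        return self.root / f"{name}.pkl"

    def save(self, name: str, model: Any) -> None:
        with open(self._path(name), "wb") as fh:
            pickle.dump(model, fh)

    def load(self, name: str) -> Any:
        # Only unpickle files this server wrote itself.
        with open(self._path(name), "rb") as fh:
            return pickle.load(fh)

    def list_models(self) -> List[str]:
        return sorted(p.stem for p in self.root.glob("*.pkl"))
```

`list_models()` gives the UI the names to show in the selection list.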
Once the crawl is launched, the Dashboard button should be usable as well. This is just a link to /banana.
All of these buttons should be in the left-hand sidebar.
The UI should show the user the current number of pages labelled, the number of pages in each label, and the URLs in those labels.
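The summary the UI needs could be computed roughly like this. The input shape (a dict from URL to label) and the function name `summarize_labels` are assumptions made for the sketch.

```python
from collections import defaultdict
from typing import Any, Dict, List


def summarize_labels(annotations: Dict[str, str]) -> Dict[str, Any]:
    """Group labelled URLs by label and count them.

    `annotations` is assumed to map each labelled URL to its label.
    """
    urls_by_label: Dict[str, List[str]] = defaultdict(list)
    for url, label in annotations.items():
        urls_by_label[label].append(url)
    return {
        "total": len(annotations),
        "by_label": {
            label: {"count": len(urls), "urls": urls}
            for label, urls in urls_by_label.items()
        },
    }
```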
Users are getting confused by the meaning of non-relevant. Provide a way to mark all results as non-relevant by default and ask the user to click only the ones that are interesting. This may reduce the number of labels we have to two.
For this first pass, it should accept a .txt file only and do some basic checking to make sure that the file contains useful URLs.
The file should then be saved to the appropriate location on the server, and ideally the name of the file will be shown on the interface.
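The basic checking could look something like the sketch below. What counts as a "useful" URL is not defined in the issue, so the rule here (an http or https scheme plus a host) is an assumption, as is the function name `validate_seed_file`.

```python
from typing import List
from urllib.parse import urlparse


def validate_seed_file(filename: str, text: str) -> List[str]:
    """Return the usable URLs in an uploaded seed file, one URL per line.

    Raises ValueError if the file is not a .txt file or has no usable URLs.
    """
    if not filename.lower().endswith(".txt"):
        raise ValueError("seed file must be a .txt file")

    urls: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            parsed = urlparse(line)
        except ValueError:
            continue  # malformed line, e.g. a broken IPv6 literal
        # Assumed rule: keep only http(s) URLs that name a host.
        if parsed.scheme in ("http", "https") and parsed.netloc:
            urls.append(line)

    if not urls:
        raise ValueError("seed file contains no usable URLs")
    return urls
```

The returned list could also back the view that shows the URLs when the user clicks the file name.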
Just a few things - should just be cosmetic and relatively simple
Once we have reworked the model saving system in #7, this is how we make managing models more intuitive.
Each model in the list of models should have a drop down list next to it that contains the following capabilities:
In addition, the following changes to the interface will need to take place:
The left hand side column should contain these things, in this order:
Search box
Selected Model Name
"Create a New Model"
Import a Model
List of models saved on server
Users should not be forced to mark all 12 results; they should be able to mark only a subset. This may happen, for example, when users forget to mark a result or do not want to mark a specific one. In any case, no error should be reported when one or more results have not been marked.
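On the server side, this could be as simple as dropping unmarked results before the model update instead of rejecting the submission. The input shape (pairs of URL and label, with `None` for an unmarked result) and the name `collect_marked` are assumptions.

```python
from typing import Dict, Iterable, Optional, Tuple


def collect_marked(results: Iterable[Tuple[str, Optional[str]]]) -> Dict[str, str]:
    """Keep only the results the user actually marked.

    Unmarked results (label is None) are skipped silently, so a partial
    submission never raises an error.
    """
    return {url: label for url, label in results if label is not None}
```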
This will complete all of the functionality that we created for the DD Eval.