Comments (18)
Hi @bhack!
Yes, we plan to release the pretrained Inception3 model. I can't promise any strict deadlines, but it should happen "soon".
from dataset.
The pretrained model has been released. It's decent, but not very good. There are multiple factors that contribute to that:
- Annotations in the training set are noisy. This should get better over time.
- The training procedure was pretty basic: randomly initialize Inception v3, define the losses, start training, and stop after a couple of weeks.
- While the model has learned even rare labels, the absolute outputs for them might be very low (< 0.01). At the same time, the outputs are semantically ordered, in the sense that an image with a given label yields a higher score than an image without it. Therefore, it's possible to calibrate the released model by stretching the outputs per label. This is not done at the moment.
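For context, here is one simple way such per-label "stretching" could look. This is my own sketch of the idea, not code from the release; `fit_stretch` and `calibrate` are hypothetical helpers that min-max rescale each label's scores over a held-out calibration set:

```python
def fit_stretch(scores_per_label):
    """Fit per-label min/max over a calibration set.

    scores_per_label: {label: [raw sigmoid scores]}
    """
    params = {}
    for label, scores in scores_per_label.items():
        lo, hi = min(scores), max(scores)
        params[label] = (lo, max(hi - lo, 1e-12))  # avoid division by zero
    return params

def calibrate(raw_score, label, params):
    """Stretch a raw score so the label's observed range maps onto [0, 1]."""
    lo, span = params[label]
    return min(1.0, max(0.0, (raw_score - lo) / span))
```

With this, a rare label whose raw scores never exceed 0.01 would still be stretched so that its strongest detections land near 1.0, making scores comparable across labels.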
Also, we're now open to pull requests. See CONTRIBUTING.md for more details.
The document mentions that only 6000 of the labels have been used for the model. Can the list of those 6000 labels be shared? (As in, which 6000 labels were used).
This list will be shared at the same time as the model.
Also, is training such a model even feasible? (9 million images and over 6,000 categories)
I would not speculate on what is feasible. As for the quality of the model we trained specifically for this release, let's wait for the model to be available. Generally, the quality is not very high, as the annotations are somewhat noisy at the moment. There's a long road ahead in cleaning them up.
@gkrasin Thank you for releasing this! I've used the pre-trained model of inception V3 as it ships with Tensorflow, and retrained it (Transfer Learning) to include labels that are more commonly seen in my data. All through this process, I was under the impression that the final layer calculates a softmax of the data coming in from the fully connected layer, thus resulting in a ranked output of class predictions on a 0-1 scale, all adding up to 1.
In the cat example that you provide with the trained Tensorflow model, however, there seem to be multiple synonymous/contextually related classes predicted with a high confidence, adding up to values greater than 1. I've noticed similar results on some native media as well:
Could you explain how this works, or point me to a resource that explains it? Thank you!
Hi Aditya,
as OpenImages is a multi-label dataset (i.e., each image can have multiple labels associated with it), we don't use softmax. Instead, the last layer has a sigmoid non-linearity (and, while training, we used the sigmoid cross-entropy loss):
predictions = end_points['multi_predictions'] = tf.nn.sigmoid(
    logits, name='multi_predictions')
In other words, instead of predicting a class of an image, the net predicts labels / tags, and each value is a probability that the given label is set. If an image has a cat and a mouse, the net (in the ideal case) is expected to have both labels set.
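The difference can be illustrated in plain Python with toy logits of my own (not actual model outputs):

```python
import math

def sigmoid(x):
    """Independent per-label probability."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    """Mutually exclusive class distribution (sums to 1)."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy logits for an image that contains both a cat and a mouse.
logits = {"cat": 3.0, "mouse": 2.5, "car": -4.0}

# Multi-label head: each label gets its own probability, so several
# labels can be close to 1 at the same time.
multi = {k: sigmoid(v) for k, v in logits.items()}

# A softmax head would instead force the labels to compete for a
# single unit of probability mass.
single = dict(zip(logits, softmax(list(logits.values()))))
```

With sigmoid outputs, both "cat" and "mouse" score above 0.9 simultaneously, which is exactly why the per-image scores you observed can sum to more than 1.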
Yes. For example, consider this image:
Ideally, the net would give at least the following labels:
animal(1.0),
prairie(1.0),
grass(1.0),
mammal(1.0),
llama(1.0),
grazing(1.0),
fauna(1.0),
vicuña(1.0),
guanaco(1.0),
meadow(1.0),
pasture(1.0),
grassland(1.0),
wildlife(1.0)
In reality, since the net is not perfect (and not even calibrated; see my comment above), the outputs are:
5723: /m/0jbk - animal (score = 0.90)
2537: /m/035qhg - fauna (score = 0.85)
3473: /m/04rky - mammal (score = 0.82)
45: /m/01280g - wildlife (score = 0.79)
4605: /m/09686 - vertebrate (score = 0.74)
4558: /m/08t9c_ - grass (score = 0.32)
664: /m/01gd91 - pasture (score = 0.31)
5648: /m/0hkvx - prairie (score = 0.29)
522: /m/01c7cq - grassland (score = 0.27)
1494: /m/025st_8 - meadow (score = 0.18)
3981: /m/068hy - pet (score = 0.13)
2811: /m/03hh2k - grazing (score = 0.13)
3745: /m/05h0n - nature (score = 0.11)
...
Anyway, as you can see, it has detected quite a few labels, some of which are not directly correlated (grass and mammal, for example).
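A ranked listing like the one above can be produced by sorting the per-label sigmoid outputs and cutting off at a score threshold. The `predictions` and `labelmap` dictionaries below are illustrative stand-ins (the MIDs are copied from the listing above, but the real label map ships with the model):

```python
# Illustrative stand-ins for the model's outputs and the label map.
predictions = {
    "/m/0jbk": 0.90,    # animal
    "/m/035qhg": 0.85,  # fauna
    "/m/08t9c_": 0.32,  # grass
    "/m/05h0n": 0.11,   # nature
}
labelmap = {"/m/0jbk": "animal", "/m/035qhg": "fauna",
            "/m/08t9c_": "grass", "/m/05h0n": "nature"}

def top_labels(preds, threshold=0.1, k=5):
    """Rank per-label sigmoid scores, keeping those above a threshold."""
    ranked = sorted(preds.items(), key=lambda kv: kv[1], reverse=True)
    return [(mid, labelmap[mid], score)
            for mid, score in ranked[:k] if score >= threshold]
```

Because the model is uncalibrated, a single global threshold like 0.1 is a crude choice; per-label thresholds would work better once calibration is done.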
It could be nice to have that for a run in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/deepdream/deepdream.ipynb
Yes, indeed. :)
The document mentions that only 6000 of the labels have been used for the model. Can the list of those 6000 labels be shared? (As in, which 6000 labels were used).
Also, is training such a model even feasible? (9 million images and over 6,000 categories) Wouldn't fine-tuning a model trained on, say, ImageNet give similar results to training one from scratch?
@gkrasin thanks! :)
How could noisily labeled samples be detected with an open-set approach?
@bhack sorry, I didn't get your question. Can you please rephrase or elaborate it a bit?
I mean that we could use an open-set approach to neural networks, like the OpenMax method linked in the previous message, to try to detect noisily labeled samples as "unknown".
/cc @abhijitbendale
@bhack yes, using algorithmic ways to detect noise (such as OpenMax) is a good idea.
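As a much simpler stand-in for the OpenMax idea (the real method fits Weibull models to activation statistics; this sketch only captures the thresholding intuition, and the function names are my own), one could flag images where no label clears a confidence threshold as candidates for noisy annotations:

```python
def flag_unknown(per_label_scores, threshold=0.5):
    """Flag an image as 'unknown' when no label clears the threshold."""
    return max(per_label_scores.values()) < threshold

def suspect_samples(batch_scores, threshold=0.5):
    """Return indices of images that no label explains confidently."""
    return [i for i, scores in enumerate(batch_scores)
            if flag_unknown(scores, threshold)]
```

Such flagged images could then be routed to human raters, which fits the annotation-cleanup plan mentioned earlier in the thread.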
Got it. The probability assigned to each label would be independent of the other classification outputs. This net should then (in the ideal case) be able to pick out multiple objects/situations (for lack of a better word) in an image with high confidence. Would you say that's accurate?
That's pretty clear now. You just saved me a whole lot of work with this. Thanks, @gkrasin!
@AdityaChaganti you're welcome. :)