Comments (5)
Hi @louisgv,
images.csv has an OriginalMD5 column. If you are downloading the images at their original resolution, you can check them against it.
If you're downloading thumbnails, no checksum is possible, because (at least) Flickr generates these thumbnails on the fly, so they come out slightly different every time. Moreover, depending on the load, they might contain more or less detail.
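For the original-resolution case, a check against OriginalMD5 could look roughly like the sketch below. It assumes the column stores the MD5 digest either as hex or as base64 text (I have not confirmed which here, so it accepts both); `md5_matches` is a hypothetical helper name, not something shipped with the dataset.

```python
import base64
import hashlib


def md5_matches(data: bytes, expected: str) -> bool:
    """Return True if the MD5 of `data` equals `expected`.

    Assumption: `expected` (the OriginalMD5 value) is the digest written
    either as a hex string or as base64 text, so both forms are accepted.
    """
    digest = hashlib.md5(data).digest()
    expected = expected.strip()
    as_hex = digest.hex()
    as_b64 = base64.b64encode(digest).decode("ascii")
    return expected.lower() == as_hex or expected == as_b64
```

You would read the downloaded file's bytes and pass them in together with the OriginalMD5 value from that image's row in images.csv.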
Oh, I have misunderstood. You mean the hashes for archives with the metadata.
For the current set of archives:
$ sha1sum images_2017_07.tar.gz
e12e364e5aa44bd40dcdb78c54a7d8ed5c45f0ee images_2017_07.tar.gz
$ sha1sum annotations_human_bbox_2017_07.tar.gz
31c7c72c5ac3c4c2f4becab0b4cfd5d5d5de7a3f annotations_human_bbox_2017_07.tar.gz
$ sha1sum annotations_human_2017_07.tar.gz
5827193489d42635e5a78b467c4dede3ff066df3 annotations_human_2017_07.tar.gz
$ sha1sum annotations_machine_2017_07.tar.gz
511f006422c84984beb57578594e4b57c75a0b1b annotations_machine_2017_07.tar.gz
$ sha1sum classes_2017_07.tar.gz
71ba8dd692ec3538543f45fdb2b88f6be59c096a classes_2017_07.tar.gz
I expect these hashes to go stale when minor updates are posted (but I hope that when that happens, the archive names will change too).
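If sha1sum is not available (e.g. on Windows), the same check can be sketched in Python. The hashes are the ones listed above; `sha1_of_file` and `check_archives` are hypothetical helper names, and the archives are assumed to sit in a single local directory.

```python
import hashlib
from pathlib import Path

# SHA-1 values quoted in this thread for the 2017_07 archives.
EXPECTED_SHA1 = {
    "images_2017_07.tar.gz": "e12e364e5aa44bd40dcdb78c54a7d8ed5c45f0ee",
    "annotations_human_bbox_2017_07.tar.gz": "31c7c72c5ac3c4c2f4becab0b4cfd5d5d5de7a3f",
    "annotations_human_2017_07.tar.gz": "5827193489d42635e5a78b467c4dede3ff066df3",
    "annotations_machine_2017_07.tar.gz": "511f006422c84984beb57578594e4b57c75a0b1b",
    "classes_2017_07.tar.gz": "71ba8dd692ec3538543f45fdb2b88f6be59c096a",
}


def sha1_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_archives(directory="."):
    """Map each archive name to True if it exists and its SHA-1 matches."""
    results = {}
    for name, expected in EXPECTED_SHA1.items():
        path = Path(directory) / name
        results[name] = path.is_file() and sha1_of_file(path) == expected
    return results
```

A missing archive is reported as False, the same as a mismatching one, so you may want to check for the file separately if the distinction matters.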
I was about to make a pull request for rkrasin@1e3cede but then realized that gzip already has a checksum (CRC32). It's relatively weak (just 4 bytes), but it's a good guard against a bad internet connection.
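To exercise that built-in CRC32 without unpacking anything to disk, one option is to simply decompress the whole stream and see whether it errors out. This is a minimal sketch using Python's standard gzip module, which checks the trailer CRC once the stream has been read to the end; `gzip_is_intact` is a hypothetical helper name.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Decompress the whole file and report whether gzip's own checks pass.

    The gzip module compares the CRC32 and length stored in the trailer
    against the decompressed data when the end of the stream is reached.
    """
    try:
        with gzip.open(path, "rb") as f:
            while f.read(chunk_size):
                pass  # discard the data; only the integrity check matters
    except (OSError, EOFError, zlib.error):
        # OSError covers a bad header or CRC mismatch, EOFError a truncated
        # file, and zlib.error a corrupted compressed stream.
        return False
    return True
```

On the command line, `gzip -t archive.tar.gz` performs the same kind of test.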
How did you learn that the archive was corrupted?
Closing since this seems resolved (a checksum is already part of gzip).