
Comments (5)

rkrasin commented on July 22, 2024

Hi @louisgv,

images.csv has an OriginalMD5 column. If you are downloading the images in their original resolution, you can verify each file against it.
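A minimal sketch of such a check, in Python. The thread does not say how the OriginalMD5 values are encoded, so this helper (`md5_matches` is a hypothetical name) accepts either a hex digest or a base64-encoded digest; adjust it once you know which one images.csv uses.

```python
import base64
import hashlib


def md5_matches(data: bytes, expected: str) -> bool:
    """Return True if the MD5 of `data` equals `expected`.

    Assumption: `expected` is either a hex digest or a base64-encoded
    raw digest. The encoding used in images.csv is not stated in this
    thread, so both forms are accepted here.
    """
    digest = hashlib.md5(data).digest()
    hex_form = digest.hex()
    b64_form = base64.b64encode(digest).decode("ascii")
    expected = expected.strip()
    return expected.lower() == hex_form or expected == b64_form
```

To use it, read the downloaded image in binary mode and pass its bytes together with the OriginalMD5 value from the matching row.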

If you're downloading thumbnails, no checksum is possible, because Flickr (at least) generates them on the fly, so they come out slightly different every time. Moreover, depending on the load, they may contain more or less detail.

from dataset.

rkrasin commented on July 22, 2024

Oh, I have misunderstood. You mean the hashes for archives with the metadata.


rkrasin commented on July 22, 2024

For the current set of archives:

$ sha1sum images_2017_07.tar.gz
e12e364e5aa44bd40dcdb78c54a7d8ed5c45f0ee  images_2017_07.tar.gz
$ sha1sum annotations_human_bbox_2017_07.tar.gz
31c7c72c5ac3c4c2f4becab0b4cfd5d5d5de7a3f  annotations_human_bbox_2017_07.tar.gz
$ sha1sum annotations_human_2017_07.tar.gz
5827193489d42635e5a78b467c4dede3ff066df3  annotations_human_2017_07.tar.gz
$ sha1sum annotations_machine_2017_07.tar.gz
511f006422c84984beb57578594e4b57c75a0b1b  annotations_machine_2017_07.tar.gz
$ sha1sum classes_2017_07.tar.gz
71ba8dd692ec3538543f45fdb2b88f6be59c096a  classes_2017_07.tar.gz

I expect these hashes to go stale when minor updates are posted (though I hope the archive names will change when that happens).
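For anyone without sha1sum at hand, the same check can be sketched in Python. The expected values are the ones listed above; `sha1_of_file` and `verify_archives` are hypothetical helper names, and the archives are assumed to sit in the given directory.

```python
import hashlib
import os

# SHA-1 values as posted above for the 2017_07 archives.
EXPECTED_SHA1 = {
    "images_2017_07.tar.gz": "e12e364e5aa44bd40dcdb78c54a7d8ed5c45f0ee",
    "annotations_human_bbox_2017_07.tar.gz": "31c7c72c5ac3c4c2f4becab0b4cfd5d5d5de7a3f",
    "annotations_human_2017_07.tar.gz": "5827193489d42635e5a78b467c4dede3ff066df3",
    "annotations_machine_2017_07.tar.gz": "511f006422c84984beb57578594e4b57c75a0b1b",
    "classes_2017_07.tar.gz": "71ba8dd692ec3538543f45fdb2b88f6be59c096a",
}


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are never loaded whole."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_archives(directory: str = ".") -> dict:
    """Map each archive found in `directory` to True (match) or False."""
    results = {}
    for name, expected in EXPECTED_SHA1.items():
        path = os.path.join(directory, name)
        if os.path.exists(path):
            results[name] = sha1_of_file(path) == expected
    return results
```

Archives that are not present are skipped rather than reported as failures, so an empty result means nothing was found to check.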


rkrasin commented on July 22, 2024

I was about to make a pull request for rkrasin@1e3cede but then realized that gzip already has a checksum (CRC32). It's relatively weak (just 4 bytes), but it's a good guard against a bad internet connection.
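As a rough illustration of relying on that built-in checksum, the sketch below decompresses an archive to the end and discards the output; Python's gzip module compares the stored CRC32 when it reaches the end of the stream and raises on a mismatch. `gzip_is_intact` is a hypothetical helper name, similar in spirit to running `gzip -t`.

```python
import gzip
import zlib


def gzip_is_intact(path: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the gzip file decompresses fully and its CRC32 matches.

    The data is read and thrown away; only the integrity check matters.
    """
    try:
        with gzip.open(path, "rb") as f:
            while f.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers gzip.BadGzipFile (CRC or length mismatch, bad header),
        # EOFError covers a truncated download,
        # zlib.error covers a damaged compressed stream.
        return False
    return True
```

This only detects accidental corruption; unlike the SHA-1 values above, it cannot tell you that the file is the one the maintainers published.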

How did you learn that the archive was corrupted?


nalldrin commented on July 22, 2024

Closing since this seems resolved (a checksum is already part of gzip).

