COVID-TweetIDs

A dataset of tweets gathered during the 2020 COVID-19 pandemic.

As of 4/19/2020, there are ~41M tweet IDs ranging from March 17th to April 15th 2020.

Data Sharing Format

Per Twitter's terms of service for content redistribution, we provide lists of tweet IDs, which can be found here. Each month of tweets gets its own folder containing one text file per hour, named with the date and UTC hour.
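As a minimal sketch of how the ID files might be read, the helper below walks a local copy of the folders and yields every tweet ID it finds. It assumes each file has a `.txt` extension and holds one tweet ID per line; neither detail is confirmed above, and `iter_tweet_ids` is a hypothetical name used only for illustration.

```python
from pathlib import Path
from typing import Iterator


def iter_tweet_ids(root: Path) -> Iterator[str]:
    """Yield tweet IDs from every .txt file under root, in sorted path order.

    Assumption: one tweet ID per line. IDs are kept as strings so that
    no precision is lost if they are later serialized to JSON.
    """
    for path in sorted(root.rglob("*.txt")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                tweet_id = line.strip()
                if tweet_id:  # skip blank lines
                    yield tweet_id
```

Because the function is a generator, it can stream through the files without loading every ID into memory at once. The IDs still need to be rehydrated through the Twitter API to recover the full tweets.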

Collection strategy (as of 3/21/2020):

We use the Twitter streaming API to collect tweets that match keywords related to the ongoing coronavirus pandemic. Filter keywords were selected using the following methods:

  • A term frequency analysis was done on a random sample of 30,000 tweets collected with keywords "coronavirus" and "covid". Of the top 200 most frequent non-stopword terms, 50 were manually selected.
  • Some keywords were borrowed from a similar effort at USC.
  • Some keywords were added based on knowledge of emerging discussion topics related to the pandemic.
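The term frequency step in the first bullet could look roughly like the sketch below. This is not the script used to build the keyword list: the tokenizer, the tiny stopword set, and the name `top_terms` are all assumptions made for illustration. In practice, a fuller stopword list would be substituted.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Tiny illustrative stopword set (an assumption, not the list used for this dataset).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with", "at"}

# Lowercase alphanumeric tokens; apostrophes are kept so contractions stay whole.
TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def top_terms(tweets: Iterable[str], n: int = 200) -> List[Tuple[str, int]]:
    """Return the n most frequent non-stopword terms with their counts."""
    counts: Counter = Counter()
    for text in tweets:
        for token in TOKEN_PATTERN.findall(text.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    return counts.most_common(n)
```

Calling `top_terms(sample, 200)` on a sample of tweet texts returns the candidate terms from which keywords can then be selected by hand.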

Note on keyword selection:

While many of the keywords in our list are specific to COVID-19, some are more general, such as "school", "work", "sick", "testing", and "closed". By including these keywords, we cast a wider net to pick up discourse on topics that are heavily influenced by the COVID-19 pandemic but may not be directly related to the virus. We believe this broader view is necessary to capture the impact that the pandemic is having on society at large.

Keyword list (as of 5/24/2020):

See keywords.txt
