princepeak Goto Github PK
Type: User
Company: Drexel University
Bio: Graduate Student of Master of Data Science
Location: Philadelphia
Type: User
Company: Drexel University
Bio: Graduate Student of Master of Data Science
Location: Philadelphia
Systematic dataset of Covid-19 policy, from Oxford University
COVID-19 Tweets Dataset
Covid-19 Twitter dataset for non-commercial research use and pre-processing scripts - under active development
Workspace for DSCI591 captone project I
The ongoing pandemic of coronavirus disease 2019-2020 (COVID-19) is caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). This pathogenic virus is able to spread asymptotically during its incubation stage through a vulnerable population. Given the state of healthcare, policymakers were urged to contain the spread of infection, minimize stress on the health systems and ensure public safety. Most effective tool that was at their disposal was to close non-essential business and issue a stay home order. In this paper we consider techniques to measure the effectiveness of stringency measures adopted by governments across the world. Analyzing effectiveness of control measures like lock-down allows us to understand whether the decisions made were optimal and resulted in a reduction of burden on the healthcare system. In specific we consider using a synthetic control to construct alternative scenarios and understand what would have been the effect on health if less stringent measures were adopted. We present analysis for The State of New York, United States, Italy and The Indian capital city Delhi and show how lock-down measures has helped and what the counterfactual scenarios would have been in comparison to the current state of affairs. We show that in The State of New York the number of deaths could have been 6 times higher, and in Italy, the number of deaths could have been 3 times higher by 26th of June, 2020.
Problem Statement The objective of this task is to detect hate speech in tweets. For the sake of simplicity, we say a tweet contains hate speech if it has a racist or sexist sentiment associated with it. So, the task is to classify racist or sexist tweets from other tweets. Formally, given a training sample of tweets and labels, where label '1' denotes the tweet is racist/sexist and label '0' denotes the tweet is not racist/sexist, your objective is to predict the labels on the test dataset. Motivation Hate speech is an unfortunately common occurrence on the Internet. Often social media sites like Facebook and Twitter face the problem of identifying and censoring problematic posts while weighing the right to freedom of speech. The importance of detecting and moderating hate speech is evident from the strong connection between hate speech and actual hate crimes. Early identification of users promoting hate speech could enable outreach programs that attempt to prevent an escalation from speech to action. Sites such as Twitter and Facebook have been seeking to actively combat hate speech. In spite of these reasons, NLP research on hate speech has been very limited, primarily due to the lack of a general definition of hate speech, an analysis of its demographic influences, and an investigation of the most effective features. Data Our overall collection of tweets was split in the ratio of 65:35 into training and testing data. Out of the testing data, 30% is public and the rest is private. Data Files train.csv - For training the models, we provide a labelled dataset of 31,962 tweets. The dataset is provided in the form of a csv file with each line storing a tweet id, its label and the tweet. There is 1 test file (public) test_tweets.csv - The test data file contains only tweet ids and the tweet text with each tweet in a new line.
Retrieving data via YouTube APIs
This is a python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and it does not require a headless browser, like other selenium based solutions do!
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.