sohamohajeri,Soha Mohajeri,github

breast-cancer-classification-by-deep-learning-with-tensorflow

buzzfeed-news-analysis-and-classification-by-natural-language-processing

FakeNewsNet is a repository for an ongoing data collection project for fake news research at ASU. It contains comprehensive BuzzFeed News and PolitiFact datasets, each split into separate real-news and fake-news sets. FakeNewsNet provides multi-dimensional information that not only offers signals for detecting fake news but can also support research on topics such as fake news propagation and fake news intervention. Because the repository is broad and multi-dimensional, this project focuses on a detailed analysis of the BuzzFeed News dataset. That dataset comprises a complete sample of news published on Facebook by 9 news agencies over a week close to the 2016 U.S. election: September 19 to 23 and September 26 and 27. Every post and its linked article were fact-checked claim by claim by 5 BuzzFeed journalists. The data comes as two CSV files, one of fake news and one of real news, each with 91 observations and 12 features/variables. The main features are: id: the ID assigned to the news article webpage, labeled Real if the article is real or Fake if it was reported as fake. title: the headline, which aims to catch readers' attention and relates to the main topic of the news. text: the body of the article, which elaborates the details of the news story; usually there is a major claim that shaped the publisher's angle and is specifically highlighted and elaborated upon. source: the author or publisher of the news article. images: an important part of the body content of a news article, providing visual cues to frame the story. movies: a link to a video or movie clip included in an article, which also provides visual cues to frame the story.
In this analysis, we do not consider features such as url, top_img, authors, publish_date, canonical link, and metadata, because they usually provide redundant information that can be obtained from the other main variables and add little value to our analysis. The two main features we care about are the source of the fake news and the language used in it. In particular, we are interested in finding the sources that published fake news and the words that are more associated with one category than the other. The main purpose of this analysis is to develop methods for analyzing fake news versus real news. This project is divided into two parts: (1) Exploratory Data Analysis and (2) Classification. The goal of the second part is to build a classifier that can detect fake news. We use three different classifiers to classify documents into real and fake news categories.
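As a rough illustration of finding words that are more associated with one category than the other, the sketch below scores each word by a smoothed log-ratio of its relative frequency in the two sets. This is not necessarily the method the project uses; the function names (`tokenize`, `word_counts`, `log_odds`), the smoothing value, and the toy documents are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(documents):
    """Count how often each token occurs across a list of documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


def log_odds(fake_docs, real_docs, smoothing=1.0):
    """Return {word: smoothed log-ratio of relative frequency, fake vs. real}.

    Positive scores lean towards the fake set, negative towards the real set.
    """
    fake, real = word_counts(fake_docs), word_counts(real_docs)
    vocab = set(fake) | set(real)
    fake_total = sum(fake.values()) + smoothing * len(vocab)
    real_total = sum(real.values()) + smoothing * len(vocab)
    return {
        w: math.log((fake[w] + smoothing) / fake_total)
        - math.log((real[w] + smoothing) / real_total)
        for w in vocab
    }


# Toy documents, invented for illustration only (not taken from the dataset).
fake_docs = ["shocking secret exposed", "shocking claim about the vote"]
real_docs = ["senate passes the budget", "the vote is scheduled for monday"]

scores = log_odds(fake_docs, real_docs)
top_fake = sorted(scores, key=scores.get, reverse=True)[:3]
print(top_fake)
```

With the real data, the two lists would hold the text column of the fake and real CSV files. The additive smoothing keeps words that appear in only one set from producing infinite scores.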

calculatedynamicrange

MATLAB function to calculate Dynamic Range of 10-bit YUV video sequences

calibre-webserver

A simple books website: a simple online personal book library.

carbon-nanotube-sensors-multiphase-project

covid-19-analysis-visualization-and-forecasting

Introduction to the COVID-19 analysis: The dataset used in this notebook (Covid-19_dataset.csv) is the same as the COVID19_line_list_data.csv dataset from https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset. The only difference is that in our dataset the death and recovered features are encoded as 0 or 1 rather than as dates, as they are in the latter dataset. My report has three parts: cleaning, visualization, and prediction. The first purpose of this work is to find out which factors matter most in the death and recovery of patients. The second purpose is to implement several machine learning algorithms that predict the death and recovery of patients, and to compare their results to discover which algorithm works best for this specific dataset.
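The 0/1 encoding of the death and recovered features can be sketched as follows. The raw cell format is an assumption here: the example supposes each cell is "0", "1", empty, or a date string, and the helper name `encode_outcome` and the sample rows are hypothetical.

```python
def encode_outcome(value):
    """Map a raw outcome cell to 1 (event recorded) or 0 (no event).

    Assumption: a cell is "0", "1", empty, or a date string such as
    "2/21/2020"; any non-empty value other than "0" counts as the event.
    """
    text = str(value).strip() if value is not None else ""
    return 0 if text in ("", "0") else 1


# Hypothetical rows, invented for illustration only.
rows = [
    {"death": "0", "recovered": "2/12/2020"},
    {"death": "2/21/2020", "recovered": "0"},
    {"death": "1", "recovered": "0"},
]
encoded = [{k: encode_outcome(v) for k, v in row.items()} for row in rows]
print(encoded)
```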

covid19-global-forecasting-kaggle-competition

cryptocurrency-price-prediction-using-deep-learning-lstm

datascience

Uniswap v3 Data Science Models

fake-news-detection-by-natural-language-processing

fetch-rewards

flask-framework

Basic template for using Flask on Heroku

forecasting-criminal-activity-in-san-francisco

house-price-predictions-by-deep-learning-with-tensorflow

house-price-predictions-with-xgboost-regression-and-linear-regression

lending-club-risk-analysis-and-prediction-by-deep-learning

linkedin-job-description-keywords-extraction-with-nlp

mall-customer-segmentation-by-k-means-clustering

malware-classification

Towards Building an Intelligent Anti-Malware System: A Deep Learning Approach using Support Vector Machine for Malware Classification

movie-recommendation-systems

movielensrecommender

national-health-dataset-dimensionality-reduction-and-clustering

The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The NHANES interview includes demographic, socioeconomic, dietary, and health-related questions. Here, we use the Demographics dataset and reduce its dimensionality by Principal Component Analysis (PCA). Afterwards, we find the main clusters by KMeans Clustering.
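The PCA-then-KMeans pipeline described above can be sketched in plain Python. This is a simplified stand-in, not the project's code: it keeps only the first principal component (estimated by power iteration) and runs Lloyd's k-means on the resulting scores, whereas the project may keep more components and use a library implementation. The data and all function names are invented for illustration.

```python
import random


def center(rows):
    """Subtract each column's mean so the data is centered at the origin."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def first_component(rows, iterations=200):
    """Estimate the leading principal axis by power iteration on X^T X."""
    d = len(rows[0])
    v = [1.0] * d
    for _ in range(iterations):
        proj = [sum(r[j] * v[j] for j in range(d)) for r in rows]  # X v
        w = [sum(p * r[j] for p, r in zip(proj, rows)) for j in range(d)]  # X^T X v
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def kmeans_1d(values, k, iterations=50, seed=0):
    """Lloyd's algorithm on scalar values; returns one cluster label per value."""
    rng = random.Random(seed)
    # Start from k distinct observed values so no two centers coincide.
    centers = rng.sample(sorted(set(values)), k)
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in values]
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


# Toy data with two obvious groups (not NHANES values).
data = [
    [1.0, 2.0, 0.5], [1.2, 1.9, 0.4], [0.7, 2.1, 0.6],
    [8.0, 9.0, 0.5], [8.2, 9.1, 0.6], [7.9, 8.8, 0.4],
]
centered = center(data)
axis = first_component(centered)
scores = [sum(x * a for x, a in zip(row, axis)) for row in centered]
labels = kmeans_1d(scores, k=2)
print(labels)
```

Reducing the dimensionality before clustering means k-means works on the directions that carry most of the variance, which is the motivation for running PCA first.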

neural-networks-and-deep-learning-coursera-assignment-week-2

neural-networks-and-deep-learning-coursera-assignment-week-3

neural-networks-and-deep-learning-coursera-assignment-week-4

pima-indians-diabetes-predictions-with-xgboost-and-knn-classifiers

predicting-loan-eligibility-by-machine-learning-algorithms

Predicting Loan Repayment: The dataset for this project is retrieved from Kaggle. The main aim of the project is to predict whether customers will repay their loans, which makes this a supervised classification problem.


Soha Mohajeri's Projects
