erfaniaa / fake-job-posting-detection


License: GNU General Public License v3.0


Fake Job Posting Detection

Detecting fake job postings with deep learning

Introduction

I used deep learning to solve a binary classification problem: is a given job posting real or fake?

The dataset used can be found here.

Method

After the data is read from the CSV file, the text fields have to be vectorized, so I applied the tf-idf algorithm to the strings. I then implemented a fully connected neural network in PyTorch to process those vectors:

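import torch.nn as nn
import torch.nn.functional as F

# NETWORK_INPUT_SIZE and NETWORK_OUTPUT_SIZE must be defined before this class.
class Network(nn.Module):
	def __init__(self, input_size=NETWORK_INPUT_SIZE, output_size=NETWORK_OUTPUT_SIZE):
		super(Network, self).__init__()
		# Eight fully connected layers, halving the width from 256 down to 4.
		self.fc1 = nn.Linear(input_size, 256)
		self.fc2 = nn.Linear(256, 128)
		self.fc3 = nn.Linear(128, 64)
		self.fc4 = nn.Linear(64, 32)
		self.fc5 = nn.Linear(32, 16)
		self.fc6 = nn.Linear(16, 8)
		self.fc7 = nn.Linear(8, 4)
		self.fc8 = nn.Linear(4, output_size)

	def forward(self, x):
		# ReLU follows every layer except the last one.
		x = self.fc1(x)
		x = F.relu(x)
		x = self.fc2(x)
		x = F.relu(x)
		x = self.fc3(x)
		x = F.relu(x)
		x = self.fc4(x)
		x = F.relu(x)
		x = self.fc5(x)
		x = F.relu(x)
		x = self.fc6(x)
		x = F.relu(x)
		x = self.fc7(x)
		x = F.relu(x)
		# The final layer returns raw logits (no sigmoid), as BCEWithLogitsLoss expects.
		x = self.fc8(x)
		return x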
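
To make the tf-idf step more concrete, here is a minimal pure-Python sketch of how strings can be turned into fixed-length vectors. It is not the project's vectorizer: the whitespace tokenizer, the function name tfidf_vectors, and the example postings are all assumptions made for illustration. The idf term uses the smoothed form ln((1 + n) / (1 + df)) + 1 followed by L2 normalization, which matches the defaults of scikit-learn's TfidfVectorizer. The settings actually used in this repository are not stated here.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) for a list of strings (hypothetical helper)."""
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))

    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocabulary}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        row = [counts[t] * idf[t] for t in vocabulary]
        # L2-normalize so every document vector has unit length.
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        vectors.append([v / norm for v in row])
    return vocabulary, vectors


# Made-up example postings, not taken from the dataset.
postings = [
    "senior python developer remote",
    "earn money fast no experience",
    "python developer no experience required",
]
vocabulary, vectors = tfidf_vectors(postings)
```

Each row of vectors has one entry per vocabulary term, so the vocabulary size plays the role of the network's input size. Terms that appear in fewer documents receive a larger weight than terms shared across many documents.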

The dataset for this binary classification problem is imbalanced: most postings are real. Because of that, I used torch.nn.BCEWithLogitsLoss as the loss function. Cross-validation is handled by the skorch library.
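
As a sketch of how this loss relates to class imbalance, the plain-Python function below reproduces the per-sample formula documented for torch.nn.BCEWithLogitsLoss, including its optional pos_weight factor. Whether this project sets pos_weight is not stated, so the weighting shown is an assumption, as is treating "fake" as the positive class (target 1). The function name bce_with_logits is hypothetical.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def bce_with_logits(logit, target, pos_weight=1.0):
    """Binary cross-entropy computed on a raw logit (illustrative helper).

    loss = -[pos_weight * y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]
    using -log(sigmoid(x)) = softplus(-x) and -log(1 - sigmoid(x)) = softplus(x).
    """
    return pos_weight * target * softplus(-logit) + (1.0 - target) * softplus(logit)


# Assumed heuristic: weight positives by the negative-to-positive ratio.
# The counts come from the confusion matrix in the Result section.
n_real, n_fake = 16864 + 150, 384 + 482
pos_weight = n_real / n_fake

# A fake posting (target 1) that the model scores as "probably real" (logit -2).
unweighted = bce_with_logits(-2.0, 1.0)
weighted = bce_with_logits(-2.0, 1.0, pos_weight=pos_weight)
```

With a weight above 1, a missed fake posting costs proportionally more than a misclassified real one, while the loss for real postings (target 0) is unchanged.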

Result

Running the code prints a confusion matrix and related statistics:

Predict     real        fake           
Actual
real        16864       150         
fake        384         482         


Overall Statistics: 

95% CI                                                            (0.96764,0.97263)
Kappa                                                             0.62834
NIR                                                               0.95157
Overall ACC                                                       0.97013

Class Statistics:

Classes                                                           real          fake             
ACC(Accuracy)                                                     0.97013       0.97013 
ERR(Error rate)                                                   0.02987       0.02987 
F0.5(F0.5 score)                                                  0.9804        0.71008 
F1(F1 score - harmonic mean of precision and sensitivity)         0.98441       0.64352 
F2(F2 score)                                                      0.98846       0.58838 
FN(False negative/miss/type 2 error)                              150           384     
FNR(Miss rate or false negative rate)                             0.00882       0.44342 
FP(False positive/type 1 error/false alarm)                       384           150     
FPR(Fall-out or false positive rate)                              0.44342       0.00882 
PPV(Precision or positive predictive value)                       0.97774       0.76266 
TN(True negative/correct rejection)                               482           16864   
TNR(Specificity or true negative rate)                            0.55658       0.99118 
TP(True positive/hit)                                             16864         482     
TPR(Sensitivity, recall, hit rate, or true positive rate)         0.99118       0.55658 
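
The headline figures above follow directly from the four confusion-matrix counts. The short sketch below recomputes a few of them for the "fake" class. The function name class_metrics is a hypothetical helper, not part of the project, which produces the full report itself.

```python
def class_metrics(tp, fp, fn, tn):
    """Derive basic per-class metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)   # PPV
    recall = tp / (tp + fn)      # TPR / sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Counts for the "fake" class, read off the confusion matrix above.
fake = class_metrics(tp=482, fp=150, fn=384, tn=16864)

# No-information rate: accuracy of always predicting the majority class ("real").
total = 16864 + 150 + 384 + 482
nir = (16864 + 150) / total

for name, value in fake.items():
    print(f"{name}: {value:.5f}")
print(f"NIR: {nir:.5f}")
```

The gap between overall accuracy (about 0.970) and the no-information rate (about 0.952) is small because real postings dominate the data. The recall of about 0.557 for the fake class is the more telling number: roughly 44% of fake postings are still classified as real.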

Run

First of all, install the dependencies:

pip3 install -r requirements.txt

Then, run the project using Python version 3:

python3 main.py
