
mahboobalam39 / insta-hyre


This project aims to scrape job data from Instahyre, apply clustering algorithms to group companies based on job postings, and create a web application for users to explore company classes, job counts, seniority levels, and industries based on the skill searched.

Jupyter Notebook 99.66% Python 0.13% HTML 0.21%
data-visualization excel k-means-clustering machine-learning python scrapping-python sql job-analytics

insta-hyre's Introduction

Project Name: Instahyre Job Analytics


Introduction

The project gathers job data from Instahyre using Python's Selenium library and stores it in a structured format. The collected data is then split into three tables (jobs, company, and details) using the Pandas library. A search bar built with the Flask web framework lets users look up skills. The search results show the most common experience level, industry, and company class in which the skill is in demand, along with the number of available job openings. The FuzzyWuzzy library corrects typos in the text users enter in the search bar.
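The typo-correction step described above can be sketched as follows. This is a minimal illustration, not the project's code: it uses `difflib` from the Python standard library as a stand-in for FuzzyWuzzy (whose `process.extractOne` serves a similar purpose), and the skill list, the `correct_skill` helper, and the `cutoff` value are all assumptions made for the example.

```python
from difflib import get_close_matches

# Hypothetical skill vocabulary; a real app would likely build this
# from the Skills column of the scraped data.
KNOWN_SKILLS = ["python", "java", "javascript", "sql", "machine learning", "react"]


def correct_skill(query, choices=KNOWN_SKILLS, cutoff=0.6):
    """Return the known skill closest to `query`, or None if nothing is similar enough."""
    normalized = query.strip().lower()
    matches = get_close_matches(normalized, choices, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(correct_skill("pyhton"))           # python
print(correct_skill("Machne Learning"))  # machine learning
print(correct_skill("xyz"))              # None
```

Returning `None` for inputs that match nothing lets the web layer show a "no results" message instead of guessing at an unrelated skill.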

Problem Aimed to Solve:

The project aims to automate job data collection from Instahyre using Selenium, structure the data into tables with Pandas, provide a user-friendly search interface using Flask, and improve search accuracy with FuzzyWuzzy. This saves time, provides detailed job information, improves search precision, supports analysis of job trends, and helps match candidates with suitable positions.

User's Manual

Files/Folder Description
Phase - 1 includes the following folders:
  • Table creation: Creating database tables.
  • Data Analysis: Analyzing data sets.
  • Web Scraping: Extracting data from websites.
Phase - 2 includes the following folders:
  • App Logics: Implementation of application logic.
  • Data Preprocessing and Model Creation: Data preparation and development of machine learning models.
  • App: Final application code.

Data Description

  • Jobs Table:

| Column Name | Description |
| --- | --- |
| JobID | Primary key for the Jobs table |
| Designation | The designation of the job |
| Industry | Industry of the company offering the job |
| Location | Location of the job |
| Skills | Skills required for the job |
| DetailID | Key that maps to the Details table, as every job has a description |
| CompanyID | Key that maps to the Company table, as one company can have multiple jobs |

  • Company Table:

| Column Name | Description |
| --- | --- |
| CompanyID | Primary key for the Company table |
| Name | Name of the company posting the job listings |
| Founded | Year the company was founded |
| Employees | Total number of employees in the company |

  • Details Table:

| Column Name | Description |
| --- | --- |
| DetailID | Unique identifier for each set of additional details |
| Skills | Skills or qualifications required for the job |
| Involvement | The nature of involvement in the job |
| Exp | Years of experience needed for the job |
| HR | Name of the HR contact who posted the job |
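
The relationships between the three tables can be illustrated with a small schema sketch. It uses Python's built-in `sqlite3` module with an in-memory database. The choice of SQLite, the column types, and the sample rows are assumptions made for illustration only. The table and column names come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.executescript("""
CREATE TABLE Company (
    CompanyID INTEGER PRIMARY KEY,
    Name      TEXT,
    Founded   INTEGER,
    Employees INTEGER
);
CREATE TABLE Details (
    DetailID    INTEGER PRIMARY KEY,
    Skills      TEXT,
    Involvement TEXT,
    Exp         TEXT,   -- assumed to be stored as text, e.g. a range of years
    HR          TEXT
);
CREATE TABLE Jobs (
    JobID       INTEGER PRIMARY KEY,
    Designation TEXT,
    Industry    TEXT,
    Location    TEXT,
    Skills      TEXT,
    DetailID    INTEGER REFERENCES Details(DetailID),
    CompanyID   INTEGER REFERENCES Company(CompanyID)
);
""")

# Made-up rows, not taken from the scraped data.
conn.execute("INSERT INTO Company VALUES (1, 'ExampleCorp', 2010, 250)")
conn.execute("INSERT INTO Details VALUES (10, 'Python, SQL', 'Full-time', '2-4', 'A. Recruiter')")
conn.execute(
    "INSERT INTO Jobs VALUES (100, 'Data Analyst', 'IT Services', 'Bengaluru', 'Python, SQL', 10, 1)"
)

# Each job row resolves to exactly one company and one details row.
rows = conn.execute("""
    SELECT j.Designation, c.Name, d.Exp
    FROM Jobs AS j
    JOIN Company AS c ON c.CompanyID = j.CompanyID
    JOIN Details AS d ON d.DetailID = j.DetailID
""").fetchall()
print(rows)
```

Keeping `CompanyID` and `DetailID` on the Jobs table means company attributes are stored once even when a company posts many jobs.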

Methodology

The following methodology was used to accomplish the project objectives:

  1. Data Scraping: Job data was obtained from Instahyre using Python's Selenium library, considering specific criteria like job titles, locations, and company names.

  2. Data Conversion: Utilizing Pandas, the scraped data underwent transformation into three tables: jobs, company, and details.

  3. Data Cleaning and Preparation: The data cleaning phase involved eliminating irrelevant data, handling missing values, standardizing formats, removing duplicates, cleaning text, managing outliers, type conversion, consistency checks, categorical data normalization, and ensuring data integrity.

  4. Company Classification: Companies were classified into five classes (Class0 to Class4) based on employee count and company age using K-Means clustering. The optimal number of clusters was determined using the Elbow Method.

Figure: Elbow Method

Figure: Scatter Plot of Clusters

  5. User-Friendly Interface: A search bar built with the Flask web framework let users look up skills, and the FuzzyWuzzy library corrected input errors. Search results displayed the most common experience level, industry, and company class related to the skill, along with the number of available job opportunities.
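
The company classification step (K-Means on employee count and company age, with the Elbow Method to choose the number of clusters) can be sketched in plain Python. This is a minimal, standard-library illustration of the technique, not the project's implementation: the company figures are made up, standardizing the features is an assumption, and in practice a library implementation such as scikit-learn's `KMeans` (which reports the same within-cluster sum of squares as `inertia_`) would normally be used.

```python
import random
from statistics import mean, pstdev

# Made-up (employees, age in years) pairs; not taken from the scraped data.
companies = [(20, 2), (35, 3), (50, 4), (400, 10), (550, 12),
             (600, 15), (5000, 30), (8000, 40), (12000, 55)]


def standardize(rows):
    """Z-score each column so employee count does not dominate company age."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return [tuple((v - m) / s for v, (m, s) in zip(row, stats)) for row in rows]


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm; returns (labels, centroids, inertia)."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(max_iter):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Keep the previous centroid if a cluster ends up empty.
            updated.append(tuple(mean(dim) for dim in zip(*members)) if members else centroids[i])
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


points = standardize(companies)

# Elbow Method: compute inertia for a range of k and look for the bend in the curve.
inertias = {k: kmeans(points, k)[2] for k in range(1, 7)}
for k, value in inertias.items():
    print(k, round(value, 3))

# Five clusters, named Class0 to Class4 as in the methodology above.
labels, _, _ = kmeans(points, 5)
classes = [f"Class{lab}" for lab in labels]
print(list(zip(companies, classes)))
```

The elbow is the value of k after which adding another cluster reduces the inertia only slightly. Cluster numbers are arbitrary, so mapping Class0 to Class4 onto a meaningful ordering requires inspecting the centroids.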

Challenges and Learnings

1. Webpage with HTML/CSS:

  • Challenge: Design a webpage using HTML/CSS.

    Learning: Learned HTML structure and CSS styling.

2. User Text Processing with FuzzyWuzzy:

  • Challenge: Process user text using FuzzyWuzzy.

    Learning: Understood text manipulation and fuzzy matching.

3. Backend with Flask, Webpage Interaction:

  • Challenge: Create a Flask backend and connect it to the webpage.

    Learning: Grasped Flask basics and dynamic content.

4. Model Deployment Exploration:

  • Challenge: Explore deployment options.

Results and Conclusion

1. This webpage is designed to accept user input.


2. The webpage generates output based on the skills searched by the users.


3. This webpage showcases a comprehensive list of jobs related to specific skills entered by users, along with supplementary information.
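
The per-skill output listed above (job count plus the most common experience level, industry, and company class) could be computed along these lines. This is a sketch under assumptions: the records are made up, and the `summarize` helper and its field names are hypothetical rather than taken from the app's code.

```python
from collections import Counter

# Made-up records standing in for the joined jobs, company, and details tables.
JOBS = [
    {"skills": ["python", "sql"], "exp": "2-4", "industry": "IT Services", "company_class": "Class1"},
    {"skills": ["python"], "exp": "2-4", "industry": "Fintech", "company_class": "Class1"},
    {"skills": ["python", "react"], "exp": "5-8", "industry": "IT Services", "company_class": "Class3"},
    {"skills": ["java"], "exp": "0-2", "industry": "E-commerce", "company_class": "Class0"},
]


def summarize(skill, jobs=JOBS):
    """Summarize the jobs that require `skill`; return None if there are none."""
    matches = [job for job in jobs if skill in job["skills"]]
    if not matches:
        return None

    def most_common(field):
        # The most frequent value of `field` among the matching jobs.
        return Counter(job[field] for job in matches).most_common(1)[0][0]

    return {
        "skill": skill,
        "job_count": len(matches),
        "experience": most_common("exp"),
        "industry": most_common("industry"),
        "company_class": most_common("company_class"),
    }


print(summarize("python"))
print(summarize("cobol"))
```

A Flask view would typically pass a dictionary like this to an HTML template for rendering.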


A short demo video of our app (running on a local host server):

App_video.mp4
