leo-holanda / vagometro


115.0 115.0 5.0 20.49 MB

IT jobs tracker in Brazil

Home Page: https://vagometro.vercel.app

License: GNU General Public License v3.0

HTML 32.44% TypeScript 63.91% SCSS 0.16% JavaScript 0.04% Python 3.44% Procfile 0.01%

angular daisyui emprego gupy linkedin mongodb tailwind typescript vagas

vagometro's Introduction

Hi, I'm Leonardo!

I'm a software developer who graduated in Information Systems from the Federal University of Alagoas (UFAL).

As a web development intern at NINC, I was part of a team that develops business management software for engineering companies using the MEAN stack and Socket.IO.

In my graduation thesis, I proposed and assessed software that helps researchers conduct educational experiments in online learning environments. I was also part of a team at NEES that developed this software using Vue.js and PHP/Laravel.

In my spare time, I develop open-source projects to learn and consolidate concepts and technologies while also aiming to solve real problems or just code something fun. Take a look at them in the pinned repositories below!

EDIT 2024: Hi everyone! I want to let everyone who uses my projects know that I won't be able to dedicate time to maintaining them in the near future. I'm completely focused on another career path that requires a lot of time and effort. When I achieve my goal, I want to dedicate time again to addressing issues and improving the projects, because this is something I do because I like it. I want y'all to know that I really appreciate the kind feedback and donations that I got from the projects. These are things that I remember fondly from some tough years that are now in the past. Thanks again and may God bless y'all!

🛠️ Familiar with

TypeScript
Python
Angular
Node.js, Express and Flask
MongoDB and PostgreSQL
AWS and GCP


vagometro's People

Contributors

Stargazers

Watchers

Forkers

danielschmitz karolinagusmao intelmib iamrosada

vagometro's Issues

Find a solution to database size increasing

The database grows every time the serverless functions insert new jobs. Once it hits the free tier limit, it should stop working.

It makes no significant difference whether new jobs are saved in a database or just appended to a compressed JSON file that is uploaded to R2 and requested by Cloudflare Workers, because it is very unlikely that the job data will change after the job posting is closed.

To avoid exceeding the database free tier limit, there should be a solution where the jobs are automatically appended to a JSON file that is then saved in R2. There should be specific JSON files for each collection, i.e., webdev, qa, ai, etc. They should be compressed using zip and then decompressed at the client in an effort to increase download speed and, consequently, reduce page load time.
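A minimal sketch of the append step, assuming one JSON array per collection. The `Job` shape, the `appendJobs` name and the deduplication by `id` are assumptions for illustration, not the project's actual code. Compression and the upload to R2 are left out on purpose.

```typescript
// Hypothetical job shape; the real job object in the project may differ.
interface Job {
  id: string;
  collection: string; // e.g. "webdev", "qa", "ai"
  title: string;
}

// Maps a collection name to the JSON text of its file (a JSON array of jobs).
type CollectionFiles = Map<string, string>;

// Returns a new map in which every new job was appended to the file of its
// collection. Jobs whose id is already stored are skipped (an assumption),
// so running the function twice with the same input does not duplicate data.
function appendJobs(files: CollectionFiles, newJobs: Job[]): CollectionFiles {
  const updated: CollectionFiles = new Map(files);
  const parsed = new Map<string, Job[]>();

  for (const job of newJobs) {
    let jobs = parsed.get(job.collection);
    if (jobs === undefined) {
      // Parse each collection file at most once; a missing file starts empty.
      jobs = JSON.parse(updated.get(job.collection) ?? "[]") as Job[];
      parsed.set(job.collection, jobs);
    }
    if (!jobs.some((saved) => saved.id === job.id)) {
      jobs.push(job);
    }
  }

  // Serialize only the collections that received new jobs.
  parsed.forEach((jobs, collection) => {
    updated.set(collection, JSON.stringify(jobs));
  });
  return updated;
}
```

Each string in the returned map would then be compressed and uploaded as its own file, one per collection.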

Implement multi-word keyword matching

Currently, the keyword matching logic only works for one-word keywords like Figma, JavaScript, etc. A keyword like "Clean Code" isn't matched because the job description is split on whitespace, so the logic tries to match ["Clean", "Code"] against "Clean Code", which doesn't work.

The idea is to implement logic that matches multi-word keywords first, then splits the content and matches the one-word keywords.
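The two steps above could be sketched as follows. The `tokenize` and `matchKeywords` names are hypothetical, and the punctuation handling is a simplifying assumption rather than the project's current logic.

```typescript
// Lowercases the text, splits it on whitespace and strips punctuation that
// surrounds each word. A leading "." is kept so ".NET" survives, and "#" and
// "+" are never stripped so "C#" and "C++" survive.
function tokenize(content: string): string[] {
  return content
    .toLowerCase()
    .split(/\s+/)
    .map((word) => word.replace(/^[(\[{"',;:!?]+|[)\]}"',.;:!?]+$/g, ""))
    .filter((word) => word.length > 0);
}

function matchKeywords(content: string, keywords: string[]): string[] {
  const words = tokenize(content);
  // Normalized text padded with spaces, so phrases match only on word boundaries.
  const text = ` ${words.join(" ")} `;
  const matched: string[] = [];

  // Step 1: match multi-word keywords against the normalized text.
  for (const keyword of keywords) {
    const phraseWords = tokenize(keyword);
    if (phraseWords.length > 1 && text.includes(` ${phraseWords.join(" ")} `)) {
      matched.push(keyword);
    }
  }

  // Step 2: match one-word keywords against the set of words.
  const wordSet = new Set(words);
  for (const keyword of keywords) {
    const phraseWords = tokenize(keyword);
    if (phraseWords.length === 1 && wordSet.has(phraseWords[0])) {
      matched.push(keyword);
    }
  }
  return matched;
}
```

For example, `matchKeywords("We value Clean Code, Figma and JavaScript.", ["Clean Code", "Figma", "JavaScript", "Python"])` returns `["Clean Code", "Figma", "JavaScript"]`.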

Implement job status (open or closed)

Today, it is not possible to tell if the job is open to applications or not. The serverless functions only get the new jobs and save them in a database. They don't come back later to check if the jobs were closed or not.

Since this can be very useful for job hunting, it's a feature that the project must have. There should be separate serverless functions that check the job status after they have been saved to the database. It is yet to be defined how these checks will work.

Edit:
Jobs from GitHub are posted as issues. So to know if the job is no longer accepting applications, it would be necessary to check the issue status which is a boolean field in the job object obtained from the API.

Jobs from Gupy are obtained through their portal API. I believe that once a job no longer accepts applications, it no longer appears in the API query results. So requesting all pages and checking whether the job can be found should be enough to determine its status. That many requests can be annoying for Gupy, which is a problem.

Jobs from LinkedIn are obtained through web scraping. There is no job status indication in the HTML pages that are scraped today. The logic seems to be the same as Gupy's. Going to a job page reveals its status by the presence of an HTML tag that contains a message warning that the job no longer accepts applications. It would be necessary to check every job page to obtain this data.

Some sort of community input where people report the job status should also be considered. Otherwise, consider how the checks should occur. Should they be daily? From which date should the function start doing the check? Consider the cost of resources for the GCP Cloud Run free tier. AWS Lambda functions shouldn't be a problem.
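For the Gupy case described above, the check could look like the sketch below. It assumes the ids from all pages of the API query results were already collected into a set. The `SavedJob` shape and the `updateGupyStatus` name are hypothetical, and how the pages are fetched is not shown.

```typescript
// Hypothetical shape of a job already saved in the database.
interface SavedJob {
  id: string;
  source: "gupy" | "github" | "linkedin";
  isOpen: boolean;
}

// Marks a saved Gupy job as open only if its id is still listed in the API
// query results. Jobs from other sources are returned unchanged.
function updateGupyStatus(saved: SavedJob[], listedIds: Set<string>): SavedJob[] {
  return saved.map((job) =>
    job.source === "gupy" ? { ...job, isOpen: listedIds.has(job.id) } : job
  );
}
```

This only gives correct results when every page was fetched successfully. If a request fails midway, the jobs from the missing pages would be wrongly marked as closed, so the update should be skipped in that case.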

Implement optimisations to prevent the notebook fan from turning on so much

I think that one of the reasons this app makes the notebook fan turn on is that there are lots of things that should run just once but are running multiple times.

In the process of mapping a job posting, the title and description are parsed multiple times when once is enough. When this process is multiplied by the number of jobs, and considering the size of the title and description strings, the iteration over the related-terms maps and so on, it surely takes a toll on the computer's resources.

So instead of parsing it for every piece of data that is meant to be extracted, just parse it once and then extract all data in one pass.

Also, the ranks in the all-overview are being rebuilt every time the user returns to the stats route, when they should be rebuilt only when the selected job collections change.

Think more about possible optimisations.
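A rough sketch of the "parse once, extract everything" idea, assuming each related-terms map goes from a lowercased term to a normalized label. The `TermMap` type, the `extractAll` name and the field names in the usage note are hypothetical. Punctuation and multi-word terms are ignored here to keep the example short.

```typescript
// Hypothetical related-terms map: lowercased term -> normalized label.
type TermMap = Map<string, string>;

// Splits the title and description once, then looks each word up in every
// related-terms map during a single pass over the words.
function extractAll(
  title: string,
  description: string,
  termMaps: Record<string, TermMap>
): Record<string, string[]> {
  const fields = Object.keys(termMaps);
  const found: Record<string, Set<string>> = {};
  for (const field of fields) {
    found[field] = new Set<string>();
  }

  // The only parsing step: one lowercase and one split for all fields.
  const words = `${title} ${description}`.toLowerCase().split(/\s+/);

  for (const word of words) {
    for (const field of fields) {
      const label = termMaps[field].get(word);
      if (label !== undefined) {
        found[field].add(label);
      }
    }
  }

  const result: Record<string, string[]> = {};
  for (const field of fields) {
    result[field] = Array.from(found[field]);
  }
  return result;
}
```

With a `workplace` map containing `"remoto" -> "remote"` and a `contract` map containing `"clt" -> "CLT"` and `"pj" -> "PJ"`, calling `extractAll("Dev Angular remoto", "Contrato CLT ou PJ", maps)` fills both fields after a single split of the text.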

North and Northeast Region Project

Hey Léo, how are you?
Is there any way I can get in touch with you?
I'm working on a job portal project for the North and Northeast regions and I need to build some integrations with LinkedIn and Gupy... Let me know if you're willing to talk.

Adriano Fante - Skills IT (63) 99287-8781

Add more colors than the default ones that DaisyUI provides to use in keyword badges

The keyword badges in the job list cards are separated by category (frontend, backend, cloud, etc.), and each category should have its own color to make them easier to tell apart. There is a limited number of colors available through DaisyUI, which is not enough to cover all these categories. Adding more colors is necessary to do so.

Centralize all related term files

There are multiple files across the folders that store related terms (for example, for job contract types, workplace types, etc.) that are used in keyword matching. To improve organization, find all of these files and place them inside a folder named related-terms in the shared folder.

Implement linked keyword

Sometimes, a job posting mentions .NET but doesn't mention C#. Since you need to use C# when developing things with .NET, a good idea would be to make .NET a keyword that is linked to C#. So, every time .NET is matched but C# isn't, C# is matched too. The same goes for other use cases.
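A small sketch of how linked keywords could be applied after the regular matching step. The `keywordLinks` map and the `addLinkedKeywords` name are hypothetical, and following links transitively is an assumption that goes slightly beyond what is described above.

```typescript
// Hypothetical link table: matching the key implies the listed keywords.
const keywordLinks = new Map<string, string[]>([[".NET", ["C#"]]]);

// Returns the matched keywords plus every keyword linked to them.
// Keywords that were already matched are not added twice.
function addLinkedKeywords(matched: string[], links: Map<string, string[]>): string[] {
  const result = new Set<string>(matched);
  const queue = [...matched];

  while (queue.length > 0) {
    const keyword = queue.shift() as string;
    for (const linked of links.get(keyword) ?? []) {
      if (!result.has(linked)) {
        result.add(linked);
        // Queue the new keyword so its own links are followed too.
        queue.push(linked);
      }
    }
  }
  return Array.from(result);
}
```

For example, `addLinkedKeywords([".NET", "Azure"], keywordLinks)` returns `[".NET", "Azure", "C#"]`, while `addLinkedKeywords([".NET", "C#"], keywordLinks)` returns the input unchanged.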