
Hi there 👋

LinkedIn Profile: https://www.linkedin.com/in/rohan-raj-8ba608b8/

Featured projects:

  • A Visual Narrative of Ramayana using Extractive Summarisation, Topic Modeling and NER tagging.
  • A pipeline for processing real-time data using Spark and MongoDB/PostgreSQL.
  • ETL workflow and data analysis: an ETL workflow built with Prefect and pygrametl (SCD, slowly changing dimension), plus product classification based on product name.
  • Data-Driven-Storytelling-Old-Car-Price-Prediction

Open source Contribution:

Master Thesis:

Thesis Title: End to End UX Analytics Framework.
The research focuses on building an infrastructure that enables new levels of analytical insight by drawing on all relevant data. The platform will cover various forms of data and analytics: transactional data, order data, app usage data, user data, and so forth. It will also establish an adaptable, scalable IT infrastructure, tuned for a complex data environment and designed to benefit from cloud technologies. The end product will support data analytics for making informed business and design decisions.

Key terms: Data Infrastructure, Data Warehouse, Business Intelligence, ETL, Continuous Integration and Continuous Deployment (CI/CD) Pipeline, Docker, Kubernetes (OKD), GitLab, PostgreSQL, Tableau, Python, AWS services, GCP services, Azure services, Cloud computing

Bachelor Thesis:

Thesis Title: New avenues in opinion mining: Considering dual sentiment analysis.
To address this problem for sentiment classification, dual sentiment analysis (DSA) has been expanded from a 2-facet classification to a 3-facet classification, which also considers neutral reviews from the data set for better accuracy and understanding. For each training and test review, a novel data expansion technique is proposed that uses opposite class labels of positive and negative sentiments in one-to-one correspondence for a dual training and dual prediction algorithm. A corpus-based pseudo-antonym dictionary is also proposed to remove the single-language (English) restriction and to maintain domain consistency, as it pairs words on the basis of sentiment strength.
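The data expansion step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the tiny `ANTONYMS` dictionary, the label names, and the choice to leave neutral labels unchanged are all assumptions made for illustration, whereas the thesis derives its pseudo-antonym dictionary from a corpus.

```python
# Hypothetical pseudo-antonym dictionary (assumption: hand-written here,
# whereas the thesis proposes building one from a corpus).
ANTONYMS = {"good": "bad", "bad": "good", "love": "hate", "hate": "love"}

# Assumption: positive and negative swap, neutral stays neutral.
FLIPPED_LABEL = {
    "positive": "negative",
    "negative": "positive",
    "neutral": "neutral",
}


def reverse_review(text, label):
    """Return the sentiment-reversed copy of a review and its flipped label."""
    tokens = text.lower().split()
    reversed_tokens = [ANTONYMS.get(token, token) for token in tokens]
    return " ".join(reversed_tokens), FLIPPED_LABEL[label]


def expand(dataset):
    """Pair every (text, label) review with its reversed counterpart."""
    expanded = []
    for text, label in dataset:
        expanded.append((text, label))
        expanded.append(reverse_review(text, label))
    return expanded
```

A classifier trained on `expand(training_reviews)` sees each original review next to its reversed copy, which is the one-to-one correspondence that dual training relies on.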

Rohan Raj's Projects

data-management

Data profiling and data cleaning, covering all the relevant data quality dimensions

etl-workflow

ETL workflow and data analysis: an ETL workflow built with Prefect and pygrametl (SCD, slowly changing dimension), plus product classification based on product name.
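As a rough illustration of what a slowly changing dimension involves, here is a minimal Type 2 sketch in plain Python. It does not use Prefect or pygrametl and is not taken from the repository; the function name `apply_scd2` and the row fields (`key`, `attributes`, `version`, `valid_from`, `valid_to`) are hypothetical.

```python
from datetime import date


def apply_scd2(dimension, key, attributes, today):
    """Apply one incoming record to a Type 2 dimension (a list of dict rows).

    A changed record closes the current row and appends a new version,
    so the full history of the key is preserved.
    """
    current = next(
        (row for row in dimension
         if row["key"] == key and row["valid_to"] is None),
        None,
    )
    if current is not None and current["attributes"] == attributes:
        return dimension  # nothing changed, keep the current version open

    if current is not None:
        current["valid_to"] = today  # close the outdated version

    dimension.append({
        "key": key,
        "attributes": attributes,
        "version": current["version"] + 1 if current else 1,
        "valid_from": today,
        "valid_to": None,  # None marks the currently valid row
    })
    return dimension


# Usage: a product changes category, so a second version is created.
products = []
apply_scd2(products, "P1", {"category": "toys"}, date(2021, 1, 1))
apply_scd2(products, "P1", {"category": "games"}, date(2021, 3, 1))
```

Keeping closed rows instead of overwriting them is what lets later analysis join facts against the attribute values that were valid at the time.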

ner-lstm

Named Entity Recognition using multilayered bidirectional LSTM

ocr-comparison

Comparing OCR models: Tesseract and Transkribus for Devanagari script.

ocropy

Python-based tools for document analysis and OCR

ramayanaocr

A Visual Narrative of Ramayana using Extractive Summarisation, Topic Modeling and NER tagging

roadtaxtracker

CRUD operations for a fleet of vehicles in Singapore to ease road tax renewal

titanic-dataset

This dataset contains information about the passengers who boarded the Titanic, including survival status, class, fare, and other variables. On 15 April 1912, the Titanic sank after colliding with an iceberg, with 2224 people aboard. The Titanic passenger data analysis consists of: data exploration and preparation, data representation and transformation, and data visualization and presentation.

winutils

Windows binaries for Hadoop versions (built from the git commit ID used for the ASF release)
