Hello! I'm Fei Wang. I hold a Ph.D. in Chemical Engineering from Rice University and am now moving into Data Science. Before that, I spent 5 years in the oil and gas industry as a Seismic Image Analyst, working on imaging, data interpretation, and analytical problem solving.
My transition to Data Science has been deliberate and structured. I completed a rigorous 600+ hour Data Science training program through Springboard, developing core competencies in Statistics, Machine Learning, Python, and SQL. My capstone projects, on image classification and natural language processing, applied recent machine learning techniques to real-world problems.
- News Analysis for Potential Investment Strategies on Tesla Stock
  - Used Large Language Models (LLMs) for news classification and sentiment extraction
  - Improved task-specific LLM accuracy using prompt engineering and Parameter-Efficient Fine-Tuning (PEFT) with the LoRA technique
  - Deployed dashboards for dynamic visualization and analysis of the results, to support potential investment strategies
- Plant Leaf Disease Diagnosis with Deep Learning
  - Built and optimized a deep CNN model for image classification using TensorFlow, reaching an accuracy of 0.9 on unseen data
  - Mitigated class imbalance using bootstrapping and bagging techniques
  - Addressed limited training data with data augmentation
- How to outbid the next SpaceX Falcon 9 launch?
  - Collected data using the SpaceX API and web-scraping techniques
  - Extensive EDA including geographic maps and dashboards
  - Data wrangling for missing values and categorical variables
  - Four machine learning classification models built with Scikit-learn, with hyper-parameter tuning
- How much will we sell next month?
  - Extensive EDA for time-series data
    - To discover time-related patterns
    - To find cycles
  - Hybrid model combining a simple linear regression with an XGB model for the residuals
- Who survived the Titanic tragedy?
  - Feature engineering to achieve more accurate predictions
    - Mutual Information Analysis
    - Principal Component Analysis (PCA)
  - Advanced Random Forest Model from TensorFlow with automated hyper-parameter tuning
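As an illustration of the hybrid idea in the sales-forecasting project (a linear regression for the trend, plus a second model fit on the residuals), here is a minimal, dependency-free sketch. It is not the project's code: the toy sales series, the function names, and the month-of-year feature are all assumptions made for this example, and hand-rolled boosted decision stumps stand in for the XGB model.

```python
# Hybrid forecast sketch: linear trend + boosted stumps on the residuals.
# All data and names below are hypothetical, for illustration only.

def fit_linear_trend(y):
    """Ordinary least squares fit of y against the time index 0..n-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    cov = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return slope, y_mean - slope * t_mean


def fit_stump(x, r):
    """Find the single split on x that minimizes squared error on r."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [ri for xi, ri in zip(x, r) if xi <= threshold]
        right = [ri for xi, ri in zip(x, r) if xi > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    return best[1:]


def fit_boosted_stumps(x, r, rounds=50, lr=0.3):
    """Gradient boosting for squared error: each stump fits what is left over."""
    stumps, remaining = [], list(r)
    for _ in range(rounds):
        threshold, lm, rm = fit_stump(x, remaining)
        stumps.append((threshold, lr * lm, lr * rm))
        remaining = [
            ri - (lr * lm if xi <= threshold else lr * rm)
            for xi, ri in zip(x, remaining)
        ]
    return stumps


PERIOD = 12
# Toy monthly sales: upward trend plus a year-end bump (made-up numbers).
bump = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 20, 30]
sales = [100 + 2 * t + bump[t % PERIOD] for t in range(36)]
months = [t % PERIOD for t in range(len(sales))]

# Step 1: the linear model captures the trend.
slope, intercept = fit_linear_trend(sales)
residuals = [v - (slope * t + intercept) for t, v in enumerate(sales)]

# Step 2: the boosted model learns the seasonal pattern left in the residuals.
stumps = fit_boosted_stumps(months, residuals)


def predict(t):
    """Trend prediction plus the boosted correction for that month of year."""
    season = t % PERIOD
    boost = sum(left if season <= th else right for th, left, right in stumps)
    return slope * t + intercept + boost


mse_trend = sum(r ** 2 for r in residuals) / len(residuals)
mse_hybrid = sum((v - predict(t)) ** 2 for t, v in enumerate(sales)) / len(sales)
forecast = predict(len(sales))  # next month

print(f"trend-only MSE: {mse_trend:.2f}, hybrid MSE: {mse_hybrid:.2f}")
print(f"next-month forecast: {forecast:.1f}")
```

The split of responsibilities is the point of the design: a linear model can extrapolate a trend beyond the training range, which tree-based models cannot, while the tree-based residual model picks up the non-linear seasonal structure the line misses.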
- Python: Pandas, NumPy, Scikit-learn, TensorFlow, Transformers
- Database: SQL, MySQL
- Data Visualization
- Machine Learning: Linear Regression, Logistic Regression, Decision Trees, XGB, Random Forest, SVM, KNN, K-means Clustering, Convolutional Neural Networks (CNNs), Natural Language Processing with LLMs