Hi, I am Imtiaz Khan. Nice to meet you 👋
Professional Summary
Staff Data Scientist with 11+ years of experience executing data-driven solutions to increase the efficiency, accuracy, and utility of data processing. Proficient in building Natural Language Understanding and Natural Language Generation models (NLP = NLU + NLG). I am looking for open source collaborations and work on productive remote teams.
Technical Skills
Languages: Python, C++, SQL
Developer Tools: Jupyter, VS Code, Git, Confluence, Jira, Azure/AWS/GCP
ML Tech: NLP, NLU, NLG, RNN, LSTM, BERT, GPT, CNN, Transformers, Hugging Face, fastai
ML Tools: MLflow, Grafana, Prometheus, Gradio, WandB
Experience
Palo Alto Networks April 2022 – Present
PII Information Classification Senior Staff Machine Learning
- Enabled a PII/PHI/PCI detection strategy across multiple communication channels and file types using named entity recognition (NER) techniques.
- Reduced the false positive rate from 15 percent to 7 percent by using advanced semi-supervised learning techniques and architectures such as Longformer.
- Devised a strategy for identifying structured PII with character-level CNN models and integrated it with the existing regex-based system.
Novartis Nov 2020 – March 2022
DocZ Document AI Assistant Product Engineering AI/NLP Expert
- Enhanced the in-house DocZ product to condense clinical study report information with NLP actions, using techniques such as named entity recognition (NER with scispaCy and Microsoft Text Analytics for health).
- Condensed clinical study report documents by 75 percent using one-shot summarization with Universal Sentence Encoder embeddings.
- Improved extraction of measurement tables from irregular RTF files to Excel by 95 percent using the tabula module in Python.
Deloitte Dec 2018 – Nov 2020
Fraud Detection Machine Learning/NLP Consultant
- Implemented machine learning with gradient-boosted trees to reduce fraud by 8 percent.
- Brought the client's false positive to true positive ratio down from 6.5 to under 4.
Complaint Categorization
- Automated the previously manual complaint categorization using TF-IDF, text analytics, and logistic regression, achieving a 0.8 F1 score at each level.
- Reduced the time to categorize 1000 complaints from 20 business hours to 2 minutes.
Accenture May 2012 – Dec 2018
Question Generation Wizard Software NLP Engineer
- Automated generation of FAQ questions from a given answer and context using LSTM/RNN encoder-decoder deep learning models.
- Reduced FAQ question creation time by 80 percent compared with a subject matter expert.
Ticket Classification
- Leveraged Azure Machine Learning to classify incoming software/hardware tickets into issue categories from their email descriptions, using an ensemble of logistic regression, boosted decision trees, and naive Bayes models.
- Reduced the time to classify a ticket into the correct category to 10 ms.
Forecasting Consumer Goods
- Converted Alteryx forecasting workflows for the top 8 products to R.
- Reduced forecast time by 63 percent and increased the client's revenue by 29 percent.
Education
Indian School of Business
Business Analytics Graduate Certificate
V.R. Siddhartha Engineering College
Bachelor's in Electronics and Communication Engineering