Farshid PirahanSiah

I am currently seeking opportunities in computer vision.

About Me

I am an accomplished Research Engineer with over 7 years of experience, including a PhD in Computer Science and a Bachelor's in Software Engineering. My career has been dedicated to the fields of Computer Vision, Machine Learning, and ML Operations. I have a proven track record of transforming complex algorithms into production-ready applications, utilizing cloud technologies and containerization, and optimizing ML pipelines and infrastructure in fast-paced environments.

With expertise in Machine Learning, IoT, Medical Imaging, and Robotics, I am proficient in designing image analysis algorithms and have made significant contributions to patents, books, and research papers. As a Technical Lead and Research Engineer, I have collaborated with stakeholders to define and achieve KPIs, managed cross-functional team projects, and deployed scalable AI solutions across cities and cloud platforms.

Experience

10+ years: PhD, R&D in Computer Vision, C++
7+ years: Machine Learning/Deep Learning, Python
5+ years: IoT, Model Optimization on Edge, Robotics, Medical Image Processing, Cloud Solutions (AWS)
3+ years: Technical Lead, Global Collaboration, Development Leadership
1+ years: LLM, Multimodal LLM, Vision-Based LLM, RAG, Langchain, OpenAI API

Core Skills

Computer Vision and AI: Advanced expertise in image processing, deep learning model development, and application deployment.
Project Management: Experienced in leading complex, cross-functional projects internationally.
Technological Proficiency: Skilled in OpenCV, NumPy, Pandas, Matplotlib, PyTorch, TensorFlow, Docker, and AWS.

Technical Skills

Languages: Python, C++, MATLAB
Frameworks: TensorFlow, PyTorch, Keras
Tools: Docker, Kubernetes, AWS, Git, Datadog, MLflow
Operating Systems: Windows, MacOS, Linux
Containerization: Docker
Cloud Computing: Amazon Web Services (AWS)
Others: CI/CD, GitHub Actions

Leadership and Collaboration

Strong teamwork and leadership capabilities have enabled me to lead international projects that exceed business expectations, driving growth and technological innovation in computer vision and machine learning.

Patents and Publications

Patents: A METHOD FOR AUGMENTING A PLURALITY OF FACE IMAGES - 2021
- The present invention relates to a method for increasing data for face analysis in video surveillance.
- WO2021060971A1
Patents: A METHOD FOR DETECTING A MOVING VEHICLE - 2021
- The present invention relates to a method for detecting a moving vehicle.
- WO2021107761
Patents: System and method for providing advertisement contents based on facial analysis - 2020
- Invented an algorithm, methods, and system for advanced facial attribute detection, leading to improvements in advertising systems.
- WO2020141969A2 WIPO (PCT)
Book Chapter: Camera Calibration and Video Stabilization for Robot Localization, Springer, 2021.
Authored over 16 publications in books, journals, and conferences globally.
My Google Scholar citation metrics are: Total Citations 141, h-index 7, i10-index 6

Feel free to connect with me through LinkedIn or follow my updates on GitHub. I am always interested in discussing potential opportunities or collaborations in computer vision and AI-related projects.

List of My Top GitHub Repositories:

1. Generative AI Business Intelligence Computer Vision (BI4CV)

Repository Link: BI4CV
Description: Welcome to the BI4CV repository! This project is dedicated to revolutionizing how businesses utilize image and video data for insightful Business Intelligence (BI). Our tools are designed to enhance data storytelling through advanced visualizations, interactive dashboards, and comprehensive reports. Our system smartly selects the optimal visual representations based on the complexity of your dataset, making analytics accessible to all users.
Key Features:
- Advanced Visualization Tools
- Smart Dashboard Creation
- Anomaly Detection
- Local LLM Integration
- User-Friendly Interface

2. CV_metaverse/3D Multi-Camera Calibration

Repository Link: 3D Multi-Camera Calibration
Overview: This project focuses on geometric camera calibration techniques to estimate the parameters of a lens and image sensor. Such calibration is crucial for applications in machine vision, robotics, navigation systems, and 3-D scene reconstruction.
Research Links:

3. Advanced Programming with Modern C++ 23 for Image Processing

Repository Link: C++ Image Processing
Functionality: This repository contains advanced C++ code examples for image processing tasks. The main function, int func_image_info(cv::Mat src, cv::Mat &dst), provides detailed information about an image including size, histogram, and more.
Additional Resources: YouTube tutorial on OpenCV

4. The collection of Python scripts provides a range of functionalities: one script automates logging into KaggleHub and setting up a pretrained Gemma model for chat simulations, another builds a GUI for real-time OpenCV function testing using PyQt5, while a third manages an asynchronous chat application with aiohttp. Additionally, there's a script integrating machine learning models for data analysis using advanced libraries like langchain, another launching AI-powered chat applications, and one demonstrating interactions with natural language understanding models on HuggingFace using LLMware. There's also a script using MLflow to manage the machine learning lifecycle and another detailing the local setup of Kubernetes via Terraform, showcasing infrastructure management and resource cleanup. These scripts employ a variety of technologies including Python, Gemma, PyQt5, OpenCV, aiohttp, asyncio, langchain, transformers, MLflow, Kubernetes, Terraform, and Docker, useful for tasks ranging from machine learning to software deployment. #MachineLearning #SoftwareDevelopment

Source Code
- This script logs into KaggleHub, downloads a pretrained Gemma model and tokenizer, sets up the model, and enables interactive chat simulation. ## Libraries: os, sys, torch, kagglehub, gemma_pytorch
Source Code
- This Python script uses PyQt5 to create a GUI application for real-time testing of OpenCV functions on images. ## Libraries: sys, PyQt5, cv2, numpy, screeninfo
Source Code
- his Python script defines an asynchronous chat application that uses the aiohttp library to interact with a chat model API, handling concurrency with semaphores and maintaining conversation history. ## Libraries: asyncio, aiohttp, collections, json, re
Source Code
- LLMOps & RAG: This Python script integrates various machine learning models and APIs to process financial data, interactively analyze text, images, and tables, and generate structured outputs. It employs libraries like langchain, transformers, and torch, alongside environmental variables for secure API key handling.
Source Code
- Launch your chat app with OpenRouter's AI! 🚀 Utilize asyncio and aiohttp for seamless conversations and manage interactions with a smart queue. Dive into the future of chat applications now!"
Source Code
- This Python script demonstrates how to use the LLMware library to interact with various models hosted on HuggingFace for natural language understanding tasks. It performs a specific query about an invoice using a provided context and compares the response from the model with a pre-defined answer. The script uses time measurement to track model loading and processing times.
Source Code
- This Python script uses MLflow, a platform for managing the machine learning lifecycle, including experimentation, reproducibility, and deployment. It demonstrates how to log parameters, metrics, and artifacts within an MLflow experiment. Specifically, it logs a parameter named "param1" with a value of 5, logs multiple values for a metric called "foo," and records a markdown file as an artifact.
Source Code -This script outlines the setup and use of Kubernetes on a local machine using Terraform, along with tools like Docker and Kubernetes command-line interface (CLI) utilities, all managed through Homebrew on macOS. It demonstrates the installation of required software, setting up Kubernetes with Terraform, querying the Kubernetes cluster, and visualizing Terraform plans. Finally, it guides through cleaning up resources with Terraform. This sequence ensures a practical approach to infrastructure as code (IaC) development and testing in a controlled, local environment.

Development of Generative AI Pipelines:

Engineered end-to-end, cloud-based generative AI solutions, overseeing the entire pipeline from data ingestion and model training to deployment and scaling.
Expertise in Multimodal Large Language Models (LLMs):
- Specialized in integrating multimodal LLMs for image processing applications, enhancing system capabilities to interpret and analyze visual and textual data simultaneously.
Innovations with Large Vision Models (LVMs):
- Developed and optimized Large Vision Models for edge computing, ensuring efficient processing and responsiveness in IoT devices.

Technical Expertise in Large Language Models (LLMs) and AI Development:

Multimodal RAG Systems: Led the development of Retriever-Augmented Generation (RAG) applications integrating text, image, and structured data, enhancing multimodal interaction capabilities.
Advanced AI Pipelines: Engineered end-to-end solutions for generative AI, leveraging cloud-based architectures to deploy scalable and efficient AI systems.
Deep Learning Implementation: Proficient in implementing complex deep learning models, with extensive use of libraries such as PyTorch, OpenAI's GPT models, and langchain for sophisticated text and image processing tasks.
Data Handling and Processing: Experienced in manipulating large-scale datasets, implementing custom extraction and partition techniques for PDF data integration, utilizing Python's robust libraries like PyPDF2 and pytesseract for OCR functionalities.
Optimization Techniques: Applied advanced machine learning techniques including hyper-parameter tuning, quantization, and model compression to enhance performance and efficiency on target hardware platforms, particularly in edge computing scenarios.
AI Model Deployment: Skilled in deploying AI models using Docker, managing environments with dependencies including langchain, unstructured, PyPDF2, and various OpenAI services, ensuring smooth transition from development to production.
Research and Development: Authored comprehensive documentation and guides, effectively summarizing research findings and technical processes, demonstrated through detailed GitHub repositories and Jupyter notebooks.
AI-Powered Summarization: Developed capabilities for summarizing diverse data elements (text, tables, images) using AI-driven approaches, significantly improving information accessibility and user engagement.
Community Contribution and Collaboration: Actively engaged in community forums and collaborative projects, contributing to open-source projects and providing innovative solutions to complex problems in the AI space.

Patents and Publications

Patents:
- A METHOD FOR AUGMENTING A PLURALITY OF FACE IMAGES - 2021 (WO2021060971A1)
- A METHOD FOR DETECTING A MOVING VEHICLE - 2021 (WO2021107761)
- System and method for providing advertisement contents based on facial analysis - 2020 (WO2020141969A2 WIPO (PCT))
Publications:
- Camera Calibration and Video Stabilization for Robot Localization, Springer, 2021.
- Authored over 16 publications in books, journals, and conferences globally.

My Works on LLMs:

Image Processing GPT:
- Computer Vision Developer GPT
- Expert in Python, OpenCV for image processing and computer vision applications.
MindMap about LLMs & LLMOps:
- My Note LLMs
Code for chat app with OpenRouter's AI:
- Source Code
- Utilize asyncio and aiohttp for seamless conversations and manage interactions with a smart queue.
Fine-tune LLMs:
- Source Code
Microsoft AI Lab: RAG Workflow with Azure AI:
- Lab Focus: Hands-on RAG workflow development using Azure AI Studio and Prompt Flow.
- Skills Acquired: Mastery in LLMOps, Azure AI Studio usage, and Prompt Flow integration.
- Tools Used: GitHub Codespaces, Visual Studio Code, Azure AI & ML Studio

pirahansiah Goto Github PK

Farshid PirahanSiah

About Me

Experience

Core Skills

Technical Skills

Leadership and Collaboration

Patents and Publications

List of My Top GitHub Repositories:

1. Generative AI Business Intelligence Computer Vision (BI4CV)

2. CV_metaverse/3D Multi-Camera Calibration

3. Advanced Programming with Modern C++ 23 for Image Processing

Development of Generative AI Pipelines:

Technical Expertise in Large Language Models (LLMs) and AI Development:

Patents and Publications

My Works on LLMs:

Farshid PirahanSiah, PhD.'s Projects

Recommend Projects

Recommend Topics

Recommend Org