Name: Sigma Jahan
Type: User
Company: Dalhousie University
Bio: Comp Sci PhD student at Dalhousie University. My research interests lie in automated software debugging to ensure reliable Artificial Intelligence systems!
Twitter: JahanSigma
Location: Halifax, Canada
Sigma Jahan's Projects
Dataset can be found here: https://www.kaggle.com/datasets/brsdincer/australia-and-investigative-special-wildfires-data
A vision system is being built to automate mail distribution by employing drones to carry each box and deliver it to the correct destination.
With rapid technological progress in the Internet of Things (IoT), it has become imperative to concentrate on its security. This paper presents a model that detects botnets using machine learning algorithms. The model examines anomalies, commonly referred to as botnets, in a cluster of IoT devices attempting to connect to a network. Specifically, the paper uses transport layer data (User Datagram Protocol, UDP) generated by IoT devices. A novel model combining a Random Forest Classifier with Independent Component Analysis (ICA) is proposed for botnet detection in IoT devices. Various other machine learning algorithms were also applied to the processed data for comparative analysis. In the experiments, the proposed model produced state-of-the-art results on three different datasets, achieving up to 99.99% accuracy with the lowest prediction time of 0.12 seconds and without overfitting. The significance of this study lies in detecting botnets in IoT devices effectively and efficiently under all circumstances by combining ICA with a Random Forest Classifier, which is a simple machine learning algorithm.
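The ICA plus Random Forest combination described above could be sketched roughly as follows. This is a minimal illustration that assumes scikit-learn is installed and uses synthetic data as a stand-in for the UDP traffic features; the component count, tree count, and the `build_model` helper are hypothetical choices, not the settings reported in the paper.

```python
import time

from sklearn.datasets import make_classification
from sklearn.decomposition import FastICA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def build_model(n_components=5, n_trees=100, seed=0):
    """Chain ICA feature extraction with a Random Forest classifier.

    All default values here are illustrative assumptions.
    """
    return make_pipeline(
        FastICA(n_components=n_components, random_state=seed, max_iter=1000),
        RandomForestClassifier(n_estimators=n_trees, random_state=seed),
    )


# Synthetic binary data standing in for benign vs. botnet traffic records.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = build_model()
model.fit(X_train, y_train)

# Time only the prediction step, mirroring the metric quoted in the abstract.
start = time.perf_counter()
predictions = model.predict(X_test)
elapsed = time.perf_counter() - start

accuracy = accuracy_score(y_test, predictions)
print(f"accuracy={accuracy:.3f} prediction_time={elapsed:.4f}s")
```

Fitting ICA inside the pipeline means the unmixing matrix is learned from the training split only, so the held-out accuracy is not inflated by information from the test records.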
Anticipating whether a GitHub issue is a bug, a feature, or a question might be advantageous for better resource use. The GitHub Bugs Prediction dataset from Kaggle is used for forecasting, and Random Forest Classification with a Term Document Matrix is employed to predict bugs, features, and questions based on GitHub titles and text content. This report compares Random Forest performance for different tree counts. With such a massive text dataset, there is a lot to consider during analysis, primarily because of the preprocessing required to represent raw text in a useful form.
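To make the text representation step concrete, here is a small standard-library sketch of building a term-count matrix from issue titles. The titles, the `tokenize` rule, and the function names are hypothetical; the report's actual preprocessing is not described here. Rows are documents and columns are terms, which is the transpose of the strict term-by-document orientation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[doc_counts[term] for term in vocabulary] for doc_counts in counts]
    return vocabulary, matrix


# Hypothetical issue titles, one per label (bug, feature, question).
titles = [
    "App crashes on login",
    "Add dark mode",
    "How to reset password on login page",
]

vocabulary, matrix = term_document_matrix(titles)
for title, row in zip(titles, matrix):
    print(title, row)
```

Each row can then serve as the feature vector for a classifier such as a Random Forest, and the same matrix can be reused while varying the number of trees.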
Config files for my GitHub profile.
My personal academic website!
We conduct a large-scale empirical study to better understand the impact of textual dissimilarity on the detection of duplicate bug reports.