omarmohamed2011 / data-preprocessing-techniques
When building a Machine Learning pipeline, data preprocessing is the first step. Real-world data is typically incomplete, inconsistent, and inaccurate (it contains errors or outliers), and it often lacks specific attribute values or trends. This is where data preprocessing comes in: it cleans, formats, and organizes the raw data so that it is ready for Machine Learning models. Let’s explore the steps of data preprocessing in machine learning, but first we need to understand the concept of noisy data.