Common Issues in Training ML Models

Machine learning developers run into a set of well-known issues in their day-to-day work.

Data Quality Issues

Getting data that fits a specific need is not an easy task for data scientists.
Even once data is obtained, it usually has many problems, such as missing values, mismatched values, and outliers.
We need to preprocess it into a form that fits the problem, typically with libraries such as pandas and sklearn.
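
As a rough illustration of these preprocessing steps, here is a minimal sketch using pandas and sklearn. The toy table, its column names, the median imputation, and the 1.5 * IQR outlier rule are all assumptions chosen for the example, not a prescribed recipe.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy data: one missing age, inconsistent city labels,
# and one income value far away from the rest.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 41, 29],
    "city": ["Delhi", "delhi ", "Mumbai", "MUMBAI", "Delhi"],
    "income": [40000, 42000, 39000, 41000, 1000000],
})

# Missing values: fill numeric gaps with the column median.
imputer = SimpleImputer(strategy="median")
df[["age"]] = imputer.fit_transform(df[["age"]])

# Mismatched values: normalize whitespace and letter case.
df["city"] = df["city"].str.strip().str.title()

# Outliers: flag rows that fall outside 1.5 * IQR of the income column.
q1, q3 = df["income"].quantile([0.25, 0.75])
iqr = q3 - q1
df["income_outlier"] = (df["income"] < q1 - 1.5 * iqr) | (df["income"] > q3 + 1.5 * iqr)

print(df)
```

Whether to drop, cap, or keep the flagged rows depends on the problem, so the sketch only marks them.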

Feature Selection and Engineering

A feature is simply a column in the data.
Having more features does not always help a machine learning model, because not all features are equally important for prediction.
Feature selection and engineering address this by keeping only the features that matter for the specific problem.
Some of the methods for feature selection:
Correlation coefficient
Fisher’s Test
Information Gain
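
Two of the methods listed above can be sketched as follows. The synthetic data and the column names are made up for illustration, and sklearn's mutual_info_classif is used here as a stand-in estimate of information gain. Fisher's test is not shown.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

# Synthetic data (an assumption for this sketch): one column tracks the
# label, the other two are pure noise.
rng = np.random.default_rng(0)
n = 500
signal = rng.normal(size=n)
X = pd.DataFrame({
    "informative": signal + 0.1 * rng.normal(size=n),
    "noise_a": rng.normal(size=n),
    "noise_b": rng.normal(size=n),
})
y = (signal > 0).astype(int)

# Correlation coefficient: absolute Pearson correlation with the target.
corr = X.corrwith(pd.Series(y)).abs()

# Information gain: estimated mutual information between feature and target.
mi = pd.Series(mutual_info_classif(X, y, random_state=0), index=X.columns)

ranking = pd.DataFrame({"abs_corr": corr, "mutual_info": mi})
ranking = ranking.sort_values("mutual_info", ascending=False)
print(ranking)
```

Features near the bottom of such a ranking are candidates for removal, though a low score on one measure alone is not proof that a feature is useless.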

Overfitting and Underfitting

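Overfitting and underfitting are well-known issues in machine learning.
Overfitting means the model performs very well on the training data but poorly on the test data.
Underfitting means the model does not perform well even on the training data.
Overfitting is often caused by too little data or a model that is too complex for it, while underfitting usually points to a model that is too simple or features that are not informative enough.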
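
One way to see both problems is to compare training and test accuracy for models of different capacity. The sketch below assumes a synthetic dataset and uses decision trees of increasing depth purely as an example. A very shallow tree tends to underfit, while an unrestricted tree memorizes the training set.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, noisy classification data (an assumption for this sketch).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, flip_y=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

scores = {}
for depth in (1, 4, None):  # None lets the tree grow without a depth limit
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    scores[depth] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"max_depth={depth}: train={scores[depth][0]:.2f}, test={scores[depth][1]:.2f}")
```

A large gap between the two numbers suggests overfitting. Low accuracy on both suggests underfitting.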

Model Complexity

Selecting an appropriate model architecture
Controlling model complexity to avoid overfitting
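
A common way to pick a suitable level of complexity is to compare candidates with cross-validation. The following is a minimal sketch under assumed conditions: noisy samples of a sine curve, polynomial regression as the model family, and three arbitrary candidate degrees.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Noisy samples of a sine curve (an assumption for this sketch).
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(80, 1))
y = np.sin(x).ravel() + 0.2 * rng.normal(size=80)

# Cross-validated mean squared error for each candidate polynomial degree.
cv_mse = {}
for degree in (1, 3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    neg_mse = cross_val_score(model, x, y, cv=5, scoring="neg_mean_squared_error")
    cv_mse[degree] = -neg_mse.mean()
    print(f"degree={degree}: cv_mse={cv_mse[degree]:.3f}")

best_degree = min(cv_mse, key=cv_mse.get)
print("lowest cross-validated error at degree", best_degree)
```

The degree is chosen on held-out folds rather than on training error, because training error alone keeps improving as the model becomes more complex.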

Exploding and Vanishing Gradients

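During backpropagation, gradients are multiplied layer by layer, so they can grow very large (explode) or shrink toward zero (vanish).
This obstacle was a major barrier to training large networks.
It is more prevalent in networks with many layers, such as deep neural networks (DNNs), and in recurrent neural networks (RNNs) unrolled over long sequences.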
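
The effect can be illustrated with a deliberately simplified model: a chain of identical scalar layers, where the backpropagated gradient is scaled by the product of weight and activation slope at every layer. The helper names below are hypothetical, and gradient clipping is shown only as one commonly used mitigation for exploding gradients, not as a complete fix.

```python
import numpy as np

def gradient_scale(weight, activation_slope, depth):
    """Factor by which a gradient is scaled after passing back through
    `depth` identical scalar layers (a simplifying assumption)."""
    return abs(weight * activation_slope) ** depth

# The slope of the sigmoid is at most 0.25, so even with weight 1.0 the
# gradient shrinks geometrically with depth.
vanishing = gradient_scale(weight=1.0, activation_slope=0.25, depth=30)

# With slope 1.0 (e.g. an active ReLU unit) and weight 1.5, it grows instead.
exploding = gradient_scale(weight=1.5, activation_slope=1.0, depth=30)

def clip_by_norm(grad, max_norm):
    """Rescale `grad` so that its L2 norm does not exceed `max_norm`."""
    norm = np.linalg.norm(grad)
    return grad if norm <= max_norm else grad * (max_norm / norm)

huge_grad = np.array([3.0, 4.0]) * exploding
clipped = clip_by_norm(huge_grad, max_norm=1.0)

print(f"vanishing factor: {vanishing:.3e}")
print(f"exploding factor: {exploding:.3e}")
print("clipped gradient:", clipped)
```

Clipping keeps the direction of the gradient and only limits its length, which is why it helps with exploding gradients but does nothing for vanishing ones.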

Transfer Learning Challenges

Choosing the right pre-trained model for our use case is not straightforward.
Fine-tuning the pre-trained model is also hard.

Data Leakage

Data leakage is the unintentional inclusion of information from the test set in the training process.
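
A frequent source of leakage is fitting a preprocessing step on the full dataset before splitting it. The sketch below contrasts that with a pipeline that learns its scaling statistics from the training rows only. The synthetic data and the choice of StandardScaler with LogisticRegression are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for this sketch).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
y = (X[:, 0] > 5.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Leaky: the scaler sees every row, including the ones held out for testing.
leaky_scaler = StandardScaler().fit(X)

# Safer: the pipeline fits the scaler on the training rows only, then
# reuses those statistics when scoring the test rows.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
safe_scaler = model.named_steps["standardscaler"]

print("mean seen by leaky scaler:", leaky_scaler.mean_)
print("mean seen by safe scaler: ", safe_scaler.mean_)
print("test accuracy:", model.score(X_test, y_test))
```

Wrapping preprocessing and model in one pipeline also keeps cross-validation honest, because each fold refits the scaler on its own training portion.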

Deployment Challenges

Many large models are not easy to handle in production, and they need more storage and memory.
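
To give a feel for the space cost, the sketch below measures the memory taken by a single weight matrix and by a half-precision copy of it. The matrix shape is arbitrary, and casting to float16 is only one possible way to reduce size, shown here as an assumption rather than a recommendation. Its effect on accuracy has to be checked for each model.

```python
import numpy as np

# A stand-in for one layer's weights (shape chosen arbitrarily).
rng = np.random.default_rng(0)
weights_fp32 = rng.normal(size=(1024, 1024)).astype(np.float32)

# Half-precision copy of the same weights.
weights_fp16 = weights_fp32.astype(np.float16)

size_fp32_mb = weights_fp32.nbytes / 1024**2
size_fp16_mb = weights_fp16.nbytes / 1024**2

# Largest rounding error introduced by the lower precision.
max_error = float(np.abs(weights_fp32 - weights_fp16.astype(np.float32)).max())

print(f"float32: {size_fp32_mb:.1f} MB, float16: {size_fp16_mb:.1f} MB")
print(f"largest absolute rounding error: {max_error:.5f}")
```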
