The Objective Nature of Supervised Learning and the Subjective Art of Unsupervised Learning

Anthony Mipawa
Mar 22, 2023

What do supervised & unsupervised machine learning mean?

Supervised & Unsupervised Machine Learning:

These are two of the main types of machine learning 😎

In supervised machine learning, the algorithm is supervised while it learns: we feed it information that helps it learn. It is trained on a labeled dataset, where the desired output is known, and it then makes predictions on new data based on what it learned from those labels.

Consider this scenario: if you wanted to learn about the connections between loan defaults and borrower information, you might provide the machine with 500 examples of clients who defaulted on their loans and another 500 examples of clients who didn’t. The labeled data “supervises” the machine, letting it find the relationships you’re looking for.

Typical cases: Loan Default Prediction, Medical Image Classification, etc.
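To make the loan scenario concrete, here is a minimal sketch of learning from labeled examples. The borrower features, their values, and the choice of a k-nearest-neighbors vote are all assumptions made for illustration; a real project would use far more data and usually a library such as scikit-learn.

```python
from collections import Counter
from math import dist

# Hypothetical labeled borrowers:
# (debt-to-income ratio, late payments last year) -> known outcome
labeled_borrowers = [
    ((0.10, 0), "repaid"),
    ((0.15, 1), "repaid"),
    ((0.20, 0), "repaid"),
    ((0.25, 1), "repaid"),
    ((0.60, 4), "defaulted"),
    ((0.70, 5), "defaulted"),
    ((0.65, 3), "defaulted"),
    ((0.80, 6), "defaulted"),
]


def predict(features, training_data, k=3):
    """Predict a label by majority vote among the k nearest labeled examples."""
    nearest = sorted(training_data, key=lambda row: dist(features, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# New clients whose outcome is unknown
print(predict((0.18, 0), labeled_borrowers))
print(predict((0.75, 5), labeled_borrowers))
```

The labels are what "supervise" the prediction: without the "repaid" and "defaulted" tags, there would be nothing to vote on.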

In unsupervised machine learning, the algorithm learns without supervision. It is trained on an unlabeled dataset, where the desired output is not known, and it tries to identify patterns and relationships within the data without any predefined target.

Consider the scenario where you don’t know which clients defaulted on their debts. Instead, the machine receives the borrower data, analyzes it to identify patterns among the borrowers, and clusters them into different groups.

Typical cases: Market Basket Analysis, Customer Segmentation
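Market basket analysis, one of the typical cases listed above, shows the unsupervised idea in a few lines: there are no labels, only raw transactions in which we look for patterns. The sketch below simply counts which pairs of items are bought together. The baskets and the helper name `support` are hypothetical, and real tools use more refined algorithms than plain pair counting.

```python
from collections import Counter
from itertools import combinations

# Hypothetical unlabeled transactions: nobody told us what "goes together"
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "jam"},
    {"milk", "cereal", "bread"},
]

# Count how often each pair of items appears in the same basket
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1


def support(pair):
    """Share of all baskets that contain both items of the pair."""
    return pair_counts[tuple(sorted(pair))] / len(baskets)


most_common_pair, count = pair_counts.most_common(1)[0]
print(most_common_pair, count, support(most_common_pair))
```

The pattern comes out of the data itself; what to do with it is left to whoever reads the result.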

If these are types of machine learning, where do the subjectivity and objectivity of this topic originate?

Subjectivity And Objectivity:

On careful consideration, the subjectivity and objectivity of the two types of machine learning originate at the decision point: why you choose model x instead of models a, b, or c.

To say something is subjective means that it is based on or influenced by personal feelings or opinions, while something objective is not influenced by personal feelings and is based on factual information.

In supervised machine learning, we choose the best model based on performance metrics while considering the risk and the business case. All of this is factual information, and that is the point where supervised machine learning is objective. Keep in mind that choosing the best model is not always a straightforward process and may require several iterations of training and evaluating different models to find the optimal solution.
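A small sketch of what metric-based selection can look like. The labels, the two candidate models, and their predictions are invented for illustration; the point is only that once a metric is fixed, the winner follows from the numbers.

```python
# Hypothetical held-out labels: 1 = defaulted, 0 = repaid
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

# Hypothetical predictions from two candidate models on the same clients
predictions = {
    "model_a": [1, 0, 0, 1, 0, 0, 0, 0, 1, 0],
    "model_b": [1, 0, 1, 1, 0, 1, 1, 0, 1, 0],
}


def accuracy(y_true, y_pred):
    """Share of clients whose outcome was predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Share of actual defaulters that the model caught."""
    true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    return true_positives / sum(y_true)


best_by_accuracy = max(predictions, key=lambda name: accuracy(y_true, predictions[name]))
best_by_recall = max(predictions, key=lambda name: recall(y_true, predictions[name]))

for name, y_pred in predictions.items():
    print(name, accuracy(y_true, y_pred), recall(y_true, y_pred))
print("best by accuracy:", best_by_accuracy)
print("best by recall:", best_by_recall)
```

Here the two metrics disagree, which is where risk and the business case come in: if a missed default is expensive, recall is the metric to rank by. Either way, the ranking is computed from known labels rather than argued over.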

In unsupervised machine learning the dance is different. Consider a case of customer segmentation: choosing the optimal number of clusters is very tricky, even with evaluation metrics like inertia and the Silhouette score and the common Elbow method. As we increase the value of K (the number of clusters), inertia gets smaller, which is better on paper, but in a real sense we can’t afford to have 60 or even more clusters in production.

Consider the plot of inertia versus K. The trick is to look for the point on the line where inertia starts to level off after bending at the elbow. Determining that sweet spot is always a challenge, and it can often fall between two candidate numbers of clusters.
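The inertia-versus-K curve can be reproduced with a few lines of plain Python. This is a minimal sketch that assumes k-means as the clustering algorithm, uses a simplistic start (the first K points as centroids), and runs on made-up customer data with three visible groups; in practice a library such as scikit-learn would normally do this work.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain k-means with a simplistic, deterministic start: the first k points."""
    centroids = list(points[:k])
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point
        labels = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
        # Update step: move each centroid to the mean of its members
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids


def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(dist(p, c) for c in centroids) ** 2 for p in points)


# Hypothetical customers described by two made-up features
customers = [
    (1.0, 1.0), (8.0, 8.0), (1.0, 9.0),
    (1.5, 1.2), (8.5, 8.3), (1.2, 9.4),
    (0.8, 1.4), (7.8, 8.6), (0.7, 8.8),
]

inertia_by_k = {k: inertia(customers, kmeans(customers, k)) for k in range(1, 6)}
for k, value in inertia_by_k.items():
    print(f"K={k}: inertia={value:.2f}")
```

On this toy data inertia keeps shrinking as K grows, but the large drops stop after K=3 and the curve flattens. Real data rarely bends this cleanly, which is why the elbow is often a matter of judgment.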

There is no direct answer to how many clusters should be created, which means the debate begins here. We choose the best model or number of clusters based on several factors rather than evaluation metrics alone. Some of those factors have nothing to do with performance metrics, and that is the point where unsupervised machine learning is subjective.

Here is an example: let’s say that you are working for a bank, and the bank wants to create targeted email campaigns for different groups of customers based on how they interact with its products. During the discussion with the marketing team, somebody might say, “Hey, we only have enough budget for three campaigns.” Well, guess what: regardless of the performance metrics, K is now going to be 3.
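The budget conversation can be written down as a constraint on K. The sketch below is one possible reading, in which the budget acts as an upper limit and the metric only ranks the options that remain. The function name `choose_k` and the silhouette scores are hypothetical.

```python
def choose_k(scores_by_k, max_clusters):
    """Pick the best-scoring K among those the business can afford."""
    affordable = {k: score for k, score in scores_by_k.items() if k <= max_clusters}
    return max(affordable, key=affordable.get)


# Hypothetical silhouette scores per K (higher is better)
silhouette_by_k = {2: 0.41, 3: 0.48, 4: 0.55, 5: 0.61, 6: 0.58}

k_metrics_only = choose_k(silhouette_by_k, max_clusters=6)
k_with_budget = choose_k(silhouette_by_k, max_clusters=3)

print("metrics alone:", k_metrics_only)
print("with a budget for three campaigns:", k_with_budget)
```

The scores did not change between the two calls; only the business constraint did, and it decided the answer.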

Note: In unsupervised machine learning, model choices tend to be subjective, whereas in supervised machine learning they are largely objective.

Final Thoughts:

As a data professional, it is crucial to effectively communicate with relevant stakeholders, including teams that will be impacted by your solution. This collaboration allows you to leverage domain knowledge and ensures that your solution is optimized for the specific needs and requirements of the production environment. By considering these factors, you can effectively deliver impactful results that meet the needs of your organization.

I hope you found our discourse stimulating. Please do not hesitate to share your reflections in the comments section. Your contributions would be highly valued, especially as a fellow Machine Learning enthusiast.
