Machine Learning can seem like a daunting field. But the core concepts are, with a little help, quite accessible.
To make the field of Machine Learning easier to understand, we are providing quick overviews of its fundamental concepts as part of our ML 101 series. In this first post, we review common applications of the field and the differences between its two main subtypes: Supervised and Unsupervised Machine Learning.
What is Machine Learning?
The basic objective of Machine Learning is to have computers learn from data without being explicitly programmed for each task. Most often, this involves using a set of historical outcomes to make predictions about future outcomes.
This becomes useful when you want to automate insights from very large datasets that would be too difficult for a human being to analyze on a recurring basis. Such an approach shows up in many everyday applications, including:
- Mining large datasets of web analytics, payment, and behavior data (e.g. from Heap, Stripe, Mailchimp)
- Building applications that would be too difficult to program by hand, via the fields of Natural Language Processing and Computer Vision (e.g. Apple Siri, Google Photos)
- Designing self-customizing programs that provide recommendations and adapt a product based on past user behaviors and preferences (e.g. Amazon, Netflix)
- Understanding human learning itself, via deep learning and real artificial intelligence (e.g. Google Brain, IBM Watson)
Each of these cases can be addressed with a multitude of different Machine Learning approaches. Some can be solved with simpler regression models, while others require more complex neural networks. Each of these approaches, however, falls into one of two general subtypes: Supervised and Unsupervised Learning.
Supervised Learning refers to the subset of Machine Learning where you generate models to predict an output variable based on historical examples of that output variable.
For example, say you wanted to predict the price (output) of a house. If you had a set of historical data points on a set of houses, each with an input (e.g. its size) and the related price, you could use Supervised Learning to find a model that predicts price based on house size.
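To make the house price example concrete, here is a minimal sketch of fitting a straight line to historical (size, price) pairs using ordinary least squares. The data points, the `fit_line` helper, and the 1,800 sq ft query are all made up for illustration; real housing data would be noisier and would usually involve more inputs than size alone.

```python
def fit_line(sizes, prices):
    """Fit price = slope * size + intercept by ordinary least squares."""
    n = len(sizes)
    mean_size = sum(sizes) / n
    mean_price = sum(prices) / n
    # Covariance of size and price, and variance of size (both unnormalized).
    covariance = sum((s - mean_size) * (p - mean_price) for s, p in zip(sizes, prices))
    variance = sum((s - mean_size) ** 2 for s in sizes)
    slope = covariance / variance
    intercept = mean_price - slope * mean_size
    return slope, intercept


# Hypothetical historical examples: house size in square feet and sale price.
sizes = [1000, 1500, 2000, 2500]
prices = [200000, 300000, 400000, 500000]

slope, intercept = fit_line(sizes, prices)

# Use the learned model to predict the price of a house we have not seen.
predicted_price = slope * 1800 + intercept
print(f"Predicted price for 1800 sq ft: {predicted_price:.0f}")
```

The key point is that the model is learned from examples where the price is already known, and is then applied to a house whose price is not.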
Supervised Learning lends itself to a variety of algorithms that help define the relationship between the input and output variables. The two common types of problems you will find within Supervised Learning are:
- Regression: where you predict a real-valued output or amount of something, based on past inputs; e.g. predicting house prices, predicting purchase amounts (an example of a linear regression is shown in the diagram above)
- Classification: where you predict a discrete-valued output of something (often notated as 0 or 1), based on past inputs; e.g. predicting whether a customer will churn vs. not churn, predicting whether a student will pass vs. fail a class
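As a rough illustration of classification, the sketch below predicts whether a student passes (1) or fails (0) from hours studied. It uses k-nearest neighbors, which is simply one easy-to-read classifier picked for this example and not a method this post prescribes. The data and the `knn_predict` helper are hypothetical.

```python
from collections import Counter


def knn_predict(train_inputs, train_labels, query, k=3):
    """Predict a label by majority vote among the k closest training examples."""
    # Sort labeled examples by their distance to the query and keep the k nearest.
    nearest = sorted(
        zip(train_inputs, train_labels),
        key=lambda pair: abs(pair[0] - query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical historical data: hours studied and whether the student passed (1) or failed (0).
hours = [1, 2, 3, 4, 6, 7, 8, 9]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

print(knn_predict(hours, passed, 2.5))  # student who studied 2.5 hours
print(knn_predict(hours, passed, 7.5))  # student who studied 7.5 hours
```

Unlike the regression case, the output here is one of a fixed set of labels and not an amount.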
An important note to keep in mind is that Supervised Learning only works if your historical dataset contains known values for the output you are trying to predict. For instance, in the house price example above, if you didn’t have real examples of house prices relative to size, you wouldn’t be able to use a regression to predict house prices, even if you had their sizes.
But say you have a dataset where you don’t know the output value. How do you generate predictions on such a dataset?
Unsupervised Learning is the subset of Machine Learning that helps with such a case. Given a dataset without a defined output variable, Unsupervised Learning algorithms will help find structure or patterns in the underlying data.
For example, let’s return to the housing data. Say we have historical data on house size and age, and want to find out whether these houses fall into natural groups. We could use Unsupervised Learning to group sets of houses together based on the inputs of size and age, and see if patterns emerge in the resulting groups.
A common technique for such a scenario is Clustering, wherein you group together sets of objects that seem to share similar attributes. The algorithm won’t define the actual label (as Supervised Learning would) or the context for the clustered groups, but it will help you find the clusters themselves (an example of clustering is modeled in the diagram above).
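One way to picture clustering is with k-means, a widely used clustering algorithm that we use here only as an example, since the post does not name a specific one. The sketch below groups hypothetical houses by (size, age). The data, the `kmeans` helper, and the choice of starting centroids are all assumptions made for illustration. Note that nothing in the data says which group a house belongs to.

```python
def kmeans(points, centroids, iterations=10):
    """Group points around centroids by alternating assignment and update steps."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Update step: move each centroid to the mean of its assigned points.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical houses as (size in square feet, age in years), with no labels.
houses = [(900, 60), (1000, 55), (1100, 65), (2400, 5), (2500, 10), (2600, 3)]

# Start from two of the houses as initial guesses for the cluster centers.
initial_centroids = [houses[0], houses[-1]]
centroids, clusters = kmeans(houses, initial_centroids)

for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

The algorithm returns two groups, but it is up to you to interpret them (here, roughly "smaller, older houses" and "larger, newer houses"). In practice you would typically rescale the inputs first, since size in square feet would otherwise dominate the distance calculation.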
There are many real-world examples where clustering is used, including:
- Social network analysis to define groups of friends
- Market segmentation of companies by location, industry, and vertical
- Organizing computing clusters based on similar event patterns and processes
In summary, Machine Learning is the methodology of using computers to make predictions about future outcomes based on past outcomes. Supervised Machine Learning helps you predict real-valued or discrete outputs, assuming you have historical data on such outcomes. Unsupervised Learning helps when you don’t know the actual outcomes but want to find structure or patterns in the underlying historical data.
This blog post is based on concepts taught in Stanford’s Machine Learning course notes by Andrew Ng on Coursera.