What can’t machine learning do?

 
 

As the market for big data and advanced analytics services continues to grow rapidly, artificial intelligence (AI), machine learning (ML), and deep learning (DL) remain hot topics of discussion. Those who are new to the subject often misunderstand the terms or use them interchangeably. In 2016, Google’s AlphaGo challenged and decisively beat South Korean master Lee Sedol at the ancient board game “Go,” sparking worldwide conversation about the presumed limitless power of artificial intelligence and its sub-components. Given the media’s sometimes overzealous portrayal of this technology, it’s important to separate fact from fiction when it comes to the extent of its capabilities. While the technical aspects behind AI and its various sub-components are complex, its core functions can be explained more simply.

AI, simply put, is the capacity of computers to mimic intelligent human behaviour, enabling machines to make logical, rational decisions. ML is a subset of this technology; it involves designing specific algorithms that can learn from a set of provided examples (i.e. data). What’s important to understand is that ML is not a one-size-fits-all tool. Like humans, machines are not “born smart”; they must be taught. Each algorithm needs to be programmed uniquely, just as each student learns best when taught according to their specific needs.

Successful ML involves the optimal combination of the right data with the right algorithm. That being said, your model can only be as good as the quality of your data. If you know only simple algorithms but have the right dataset, you are in a much better position than someone who knows complicated algorithms but lacks the right dataset. DL models are driving most of today’s ML technology. They allow machines to learn new information in a way loosely analogous to how the human brain does: a DL model continuously compares newly acquired information against past inputs, allowing for a “deeper” level of understanding. DL models therefore only work if you have a dataset with the right volume and variety; otherwise, inaccuracies are continuously compounded.

The next logical question is: How do you obtain the “right” dataset? The answer lies in both data variety and proper data representation. While the amount of data is important, its quality is even more important. Your model will only know how to deal with the kinds of data it was meaningfully exposed to in its training set, so you need to manipulate your data to fit your model’s needs. For example, raw data is rarely useful for training a model; you need to describe it accurately before it can be used effectively.


Describing raw data is a multi-step process involving feature extraction, data representation, and data transformation. Firstly, you need to understand the relevant and irrelevant characteristics of your unprocessed data in order to define the model’s scope and purpose; a model works most effectively when only the pertinent features are extracted, reducing noise as much as possible. Secondly, strong data labels, alongside well-defined attributes (predictors, or features), are necessary for establishing usable input-output pairs. It’s essential that data is linked to specific outcomes, allowing the model to learn which patterns in the inputs correspond to which results. Lastly, although deep learning models can automate some parts of the conventional feature engineering described above, all models still need a significant investment in data cleaning and data transformation. Translating data into the format a model expects is unique to each model and can be a very extensive process; data transformation alone can account for up to 80% of the hard work that goes into training a model, and it requires significant insight and intuition to do properly.
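
To make these steps concrete, here is a minimal sketch of feature extraction, representation, and transformation in Python using pandas and scikit-learn. The records, column names, and the churn-prediction question are invented purely for illustration:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical raw records: the columns and the churn question are made up for this example.
raw = pd.DataFrame({
    "age":     [34, 58, 23, 41],
    "country": ["CA", "US", "CA", "DE"],
    "churned": [0, 1, 0, 1],   # the outcome we would like to predict
})

# Feature extraction: keep only the characteristics relevant to the question.
features = raw[["age", "country"]]
labels = raw["churned"]

# Data representation: models work on numbers, so encode the categorical column.
features = pd.get_dummies(features, columns=["country"])

# Data transformation: put the numeric feature on a comparable scale.
features["age"] = StandardScaler().fit_transform(features[["age"]]).ravel()

print(features)
print(labels)
```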

A common misconception surrounding AI, and more specifically ML, is that computer algorithms can learn independently, without human interaction. In fact, machines need to be told what to do in either a supervised or unsupervised manner. The learning phase of creating an intelligent model will differ depending on the researcher’s approach, but it will always require processed data sets.

Supervised learning exposes models to a large number of input-output pairs. Exposure to many examples teaches a machine to accurately predict a certain result or outcome based on a defined set of parameters. It begins to recognize which specific combinations of inputs generate a specific output, becoming increasingly intelligent in its decisions.
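
As a rough illustration (a toy sketch, not any particular production system), here is what that looks like with scikit-learn; the inputs, outputs, and pass/fail scenario are invented for the example:

```python
from sklearn.ensemble import RandomForestClassifier

# Toy input-output pairs: [hours_studied, classes_attended] -> passed (1) or failed (0).
X_train = [[2, 3], [8, 9], [1, 2], [9, 10], [4, 6], [7, 8]]
y_train = [0, 1, 0, 1, 0, 1]

# The model learns which combinations of inputs tend to produce which output.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Given new, unseen inputs, it predicts the most likely outcome for each.
print(model.predict([[3, 4], [8, 10]]))   # e.g. [0, 1]
```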

Unsupervised learning is slightly more complex: you might not have access to, or even know, the exact definition of the decisions you want your model to make. However, you know what type of information is hidden in your data. Based on this knowledge, you define some metrics that can be measured from your data. This is typically done either by clustering data (grouping based on similarity) or by association (identifying relationships between data points). These metrics are either related to specific decisions or can be interpreted by the user in a meaningful way. Any unsupervised model makes some assumptions about your data, measures some metrics, and comes to certain conclusions based on the values of those metrics.
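
A minimal sketch of the clustering variant, again with made-up numbers: k-means groups unlabeled observations purely by similarity, without ever being told what the groups mean:

```python
from sklearn.cluster import KMeans

# Unlabeled observations: two measurements each, no outcomes provided.
X = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
     [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]]

# Group the observations into two clusters based purely on similarity.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# The model was never given labels; it only discovered structure in the data.
print(cluster_ids)              # e.g. [0 0 0 1 1 1]
print(kmeans.cluster_centers_)
```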

While this may seem complicated, its actual application is relatively straightforward. For example, imagine you have a large stack of photos, and for each photo you have provided an extensive list of information about what it shows. You train your model with these examples until it is able to recognize exactly what is being displayed in new photos. This is supervised learning. Now imagine you have another stack of photos, each showing one of ten people. You provide your model with a set of metrics to help it recognize similarities and differences (clustering and association), then program it to divide the photos into ten piles, each containing photos of one individual. This is unsupervised learning.

When it comes to machine learning, there are ultimately three main concepts that beginners need to understand: you need the right data, you need to process it properly, and you need to teach it to your model effectively. While you don’t need a PhD to understand the benefits this technology can provide at a broad level, the computing behind these models is extremely complex. It’s called “artificial” intelligence because it’s man-made: your model will only become as intelligent as the quality of the algorithms, data sets, and learning methods that were specifically created for it.

Simon Hicks