Machine Learning Everywhere

Statistics has long had a bad reputation. It's seen as difficult, boring and even a bit useless. Many of my friends had to take statistics courses in graduate school so that they could analyze and report on their research. To many of them, the classes were a form of nerdy, boring torture.

Maybe it's just me, but after I took those courses, I felt like I was seeing the world through new eyes. Suddenly, I could better understand the world around me. Newspaper articles about the government, as well as scientific and corporate reports, made more sense. I could also identify the flaws in such reports more easily and criticize them from a position of understanding.

Much of the power of statistics lies in the creation of a "model", or a mathematical description of reality. A model is a caricature of sorts, in that it doesn't represent all of reality, but rather just those factors that you think will affect the thing you're trying to understand. A model lets you say that given inputs A, B, C and D, you can know, more or less, what the output will be.

Sometimes, the goal of a statistical model is to predict a value. For example, given a house's size and neighborhood, you can predict its price. Or, given people's age, weight and where they live, you can predict their likelihood of getting a certain disease.
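To make the idea of predicting a value a bit more concrete, here is a minimal sketch in Python that fits a straight line through a handful of house sizes and prices using ordinary least squares, then uses that line to estimate the price of a house it has not seen. The numbers are invented purely for illustration, the names `fit_line` and `predict` are hypothetical helpers rather than part of any library, and a real model would use many more examples and more inputs than size alone.

```python
# Toy data, invented for illustration only:
# house sizes in square meters and prices in thousands.
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 210.0, 270.0, 330.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


model = fit_line(sizes, prices)
estimate = predict(model, 100.0)  # estimated price for a 100 m^2 house
print(f"Estimated price: {estimate:.1f} thousand")
```

The "model" here is nothing more than two numbers, a slope and an intercept. That is the caricature of reality: it ignores everything about a house except its size, yet it still produces a usable estimate.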

Other times, the goal is to predict a category. For example, in an upcoming election, for whom are people likely to vote? Taking into account where they live, what level of education they've received, their ethnic background and a few other factors, you can often predict for whom people will vote before they know it themselves.
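Predicting a category can be sketched in a similar way. The example below uses a simple nearest-neighbors vote, which is just one of many possible techniques: to guess how a new person will vote, it finds the few most similar people in the training data and picks the most common answer among them. The data, the two features (age and years of education), the candidate labels "A" and "B" and the function name `predict_category` are all assumptions made up for this illustration.

```python
import math
from collections import Counter

# Toy data, invented for illustration only:
# ((age, years of education), candidate the person voted for).
training = [
    ((25, 12), "A"),
    ((30, 16), "B"),
    ((45, 12), "A"),
    ((50, 18), "B"),
    ((35, 20), "B"),
    ((60, 10), "A"),
]


def predict_category(examples, point, k=3):
    """Return the most common label among the k examples nearest to point."""
    # Sort known examples by straight-line distance to the new point.
    by_distance = sorted(examples, key=lambda ex: math.dist(ex[0], point))
    # Count the labels of the k closest examples and take the winner.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(predict_category(training, (40, 17)))
print(predict_category(training, (55, 11)))
```

In practice you would want far more data, and you would typically rescale the inputs so that a feature with large numbers, such as age, doesn't dominate the distance calculation. The overall shape stays the same, though: known examples go in, and a predicted category comes out.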

During the past few years, there has been a huge amount of buzz around the terms "big data", "data science" and "machine learning". As these buzzwords continue to gain acceptance, many statisticians are wondering what the big deal is. And to be honest, their complaint makes some sense, given that "machine learning" is, more or less, a computerized version of the predictive models that statisticians have been creating for decades.

Now, why am I telling you this? Because I actually do believe that machine learning is a game-changer for huge parts of our lives. Just as my perspective was changed when I learned statistics, giving me tools to understand the world better, many businesses are having their perspectives changed, as they use machine learning to understand themselves better. Everything from online shopping, to the items you see in your social-network feeds, to the voice-recognition algorithms in your phone, to the fraud detection used by your credit-card company is being affected, boosted and (hopefully) improved via machine learning.

This means that no matter what sort of software development you do, you would be wise to gain as much experience as you can with machine learning. Its benefits might not be obvious to you at once or even be applicable to your work right away. But machine learning is becoming ubiquitous, and there is no shortage of ways in which to use it.

So with this article, I'm starting a series on machine learning and some of the ways your organization can take advantage of it. I'll look at a number of problems, many of which are common in web applications, that can benefit from machine learning. Along the way, I hope you'll get lots of ideas for the sorts of analysis and uses that machine learning can bring to your applications.