
This appeared in the June 1996 issue of
This is an example of a classification problem, in which we want to predict a categorical variable (in this case the type of beer) using other information (in this case maltiness and bitterness). It's a pretty easy problem, at least for some of the classes: as you can see, there is good separation between the classes (for example, the craft ales, lagers, and regular beers are well separated). Using other information (another variable in the article is the percent alcohol, which is a pretty good predictor for the nonalcoholic beers), we can get even better classification.
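To make the idea of classification concrete, here is a minimal sketch of one very simple classifier, the nearest-centroid rule: average the two measurements within each class, then assign a new beer to the class whose average is closest. The numbers below are invented for illustration and are not the values from the article; the names `training`, `centroids`, and `classify` are likewise hypothetical.

```python
import math

# Hypothetical (maltiness, bitterness) scores with known beer types.
# These values are made up for illustration; they are NOT the article's data.
training = [
    ((20.0, 60.0), "craft ale"),
    ((25.0, 70.0), "craft ale"),
    ((60.0, 20.0), "lager"),
    ((65.0, 25.0), "lager"),
    ((40.0, 40.0), "regular"),
    ((45.0, 35.0), "regular"),
]


def centroids(samples):
    """Return the mean (maltiness, bitterness) point for each class."""
    sums = {}
    for (x, y), label in samples:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def classify(point, cents):
    """Predict the class whose centroid is nearest (Euclidean distance)."""
    return min(cents, key=lambda label: math.dist(point, cents[label]))


CENTROIDS = centroids(training)
prediction = classify((22.0, 65.0), CENTROIDS)  # a new, unlabeled beer
```

When the classes are as well separated as they appear in the plot, even a rule this crude should do reasonably well; adding a third coordinate such as percent alcohol would only require longer tuples, since `math.dist` works in any dimension.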
This could also be an example of a clustering problem, if we didn't know the classes of beers in advance. That is, if we just had a lot of beers, and there weren't any classes, we might notice in the plot that there seem to be three "clumps" or clusters.
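As a rough sketch of what finding such clumps might look like in code, the following uses k-means, one common clustering method (chosen here only as an example). It is given points with no class labels and a guess at three starting centers. The points are invented for illustration and are not the article's data; the function name `kmeans` and its parameters are hypothetical.

```python
import math


def kmeans(points, centers, iterations=10):
    """Plain k-means: alternate between assigning points and moving centers."""
    for _ in range(iterations):
        # Assignment step: put each point in the group of its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its old center).
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g else c
            for g, c in zip(groups, centers)
        ]
    labels = [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
              for p in points]
    return centers, labels


# Hypothetical unlabeled (maltiness, bitterness) points; made up for illustration.
points = [(20, 60), (25, 70), (22, 66),
          (60, 20), (65, 25), (62, 21),
          (40, 40), (45, 35), (43, 38)]

# Assumption: we start from three of the points themselves as initial centers.
initial = [points[0], points[3], points[6]]
centers, labels = kmeans(points, initial)
```

Note that the number of clusters (three) is supplied by us, just as we would judge it by eye from the plot; the algorithm only decides which points belong together.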
This is data mining in the sense that we want a flexible way to predict beer type. One characteristic of data mining problems not exhibited in this example is a large volume of data: there are only 69 beers, and around a dozen variables recorded on each beer. In the course we will look at larger problems as well, and explore how algorithms scale to these larger problems.