What Grape Is a Wine Made From?
Using R, I wrote my own implementation of the classic k-means clustering algorithm. The goal was to group wines by their chemical characteristics in order to determine the grape cultivar each wine was produced from. The data, a well-known machine learning dataset, comes from UC Irvine's Machine Learning Repository.
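For readers unfamiliar with the algorithm, here is a minimal sketch of k-means (Lloyd's algorithm) in base R. It is not the code from this project: the function name `simple_kmeans`, its parameters, and the synthetic `toy` data are all assumptions made for illustration. The columns are standardized with `scale()` first, on the assumption that attributes measured in different units should contribute equally to the distance.

```r
# Illustrative k-means sketch (Lloyd's algorithm); hypothetical helper, not the project's code.
simple_kmeans <- function(x, k, max_iter = 100) {
  x <- as.matrix(x)
  # Use k randomly chosen rows as the initial centroids.
  centers <- x[sample(nrow(x), k), , drop = FALSE]
  cluster <- integer(nrow(x))
  for (iter in seq_len(max_iter)) {
    # Squared Euclidean distance from every point to every centroid (n x k matrix).
    d <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    # Assign each point to its nearest centroid.
    new_cluster <- max.col(-d, ties.method = "first")
    # Stop once the assignments no longer change.
    if (all(new_cluster == cluster)) break
    cluster <- new_cluster
    # Move each centroid to the mean of its assigned points.
    for (j in seq_len(k)) {
      members <- x[cluster == j, , drop = FALSE]
      if (nrow(members) > 0) centers[j, ] <- colMeans(members)
    }
  }
  list(cluster = cluster, centers = centers)
}

# Synthetic stand-in data: three groups of 30 points in two dimensions.
set.seed(42)
toy <- rbind(
  matrix(rnorm(60, mean = 0,  sd = 0.3), ncol = 2),
  matrix(rnorm(60, mean = 5,  sd = 0.3), ncol = 2),
  matrix(rnorm(60, mean = 10, sd = 0.3), ncol = 2)
)
truth <- rep(1:3, each = 30)

fit <- simple_kmeans(scale(toy), k = 3)
# Cross-tabulate the known groups against the assigned clusters.
print(table(truth, fit$cluster))
```

Because the starting centroids are random, a single run can settle into a poor local optimum. A common remedy is to run the algorithm several times and keep the result with the smallest total within-cluster distance.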
The clustering was over 95% accurate at predicting a wine's grape from a chemical analysis of the wine. Note the few points below that are not grouped with points of like color:
To visualize the 13 different chemical attributes in two dimensions, I used principal component analysis:
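A projection of this kind could be produced with base R's `prcomp`. The sketch below is an assumption about the approach, not the project's actual code: `measurements` is random placeholder data with 13 columns standing in for the chemical attributes, and `groups` is a hypothetical set of cluster labels used only for coloring.

```r
# Placeholder data: 90 rows, 13 numeric columns standing in for the chemical attributes.
set.seed(1)
measurements <- matrix(rnorm(90 * 13), ncol = 13)
groups <- rep(1:3, each = 30)  # hypothetical cluster labels

# Shift some columns per group so the placeholder groups are separable.
measurements[groups == 2, 1:4] <- measurements[groups == 2, 1:4] + 3
measurements[groups == 3, 5:8] <- measurements[groups == 3, 5:8] + 3

# Center and scale each column, then compute the principal components.
pca <- prcomp(measurements, center = TRUE, scale. = TRUE)

# Keep the first two components as 2-D plotting coordinates.
scores <- pca$x[, 1:2]
plot(scores, col = groups, pch = 19, xlab = "PC1", ylab = "PC2")

# Share of the total variance captured by each of the two plotted components.
print(summary(pca)$importance["Proportion of Variance", 1:2])
```

The first two components only approximate the full 13-dimensional picture, so points that look close in the plot may be farther apart in the original space.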
The R code for this project can be found in my GitHub repository.
The full report includes a deeper analysis and evaluates how well k-means works as a predictive model for this dataset. The report is available here.