Machine learning gets a lot of buzz. The two most talked about classes of algorithms are classification and clustering. Classification is assigning things a label. Clustering is grouping things that look like they go together.

Yet people are often confused about what these are and what the difference is. That confusion is partly because many explanations quickly go into a bunch of formulas. Instead, here is an explanation of clustering and classifying things the old-fashioned way: in an Excel spreadsheet.

How classification works

Let’s say that you want to predict which students will likely graduate and which students will likely drop out. Perhaps you want to flag them so you can assign a counselor. So, you have two labels: at-risk and low-risk.

To do this using classification, you need a training set of students whose outcomes are already known: who graduated and who did not. (Please note that I acquired this data the same way a stable genius does: I made it up. Don’t use it for anything but understanding what classification means.)

Forget the algorithm for now. Let’s use this spreadsheet:

[Figure: Classification example in Excel. The source data for the classification example.]

In the sheet’s data are some patterns among GPA, number of suspensions, and whether the student has been expelled. Mentally, you can make some correlations and note some exceptions.

So, based on this data, can you decide who is likely to graduate? If so, congratulations! You’re a classification algorithm.

How clustering works

Now let’s look at clustering. I have no labels for this data set. I just want the computer to find the rows that are like one another and group them.

This data also has some patterns in it that you can see: The first and last columns are probably meaningless for grouping purposes.
However, there are several rows that have 1 1 1 in the first field. In fact, there are some that have 1 1 1 and then 0 0 0 and then 1 1 1. Now group those rows into a cluster. You can probably find the opposite pattern as well. That is another cluster. You may also find some smaller matches, like 1 1 1 0 0 0 1 1 (it’s not in the sample data here, so you’re not missing anything). Group that one; it can also be a cluster.

[Figure: Clustering example in Excel. The source data for the clustering example.]

There are various algorithms that do this computationally. Some even do different forms of classification and clustering. However, the basic idea is that this is something you can do in Excel.
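If you would rather see the two ideas in code than in a spreadsheet, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the rows are invented (just like the spreadsheet data), the names `TRAINING`, `ROWS`, `classify`, and `cluster` are hypothetical, and the methods are deliberately naive. The classifier copies the label of the most similar known student, and the "clustering" only groups rows whose middle columns match exactly, whereas real clustering algorithms such as k-means group rows that are merely similar.

```python
from collections import defaultdict

# Hypothetical, made-up training rows: (GPA, suspensions, expelled, label).
TRAINING = [
    (3.8, 0, False, "low-risk"),
    (3.2, 1, False, "low-risk"),
    (2.9, 0, False, "low-risk"),
    (1.9, 3, False, "at-risk"),
    (2.4, 2, True, "at-risk"),
    (1.5, 4, True, "at-risk"),
]


def distance(a, b):
    """Squared distance over GPA, suspensions, and the expelled flag."""
    return (
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + (int(a[2]) - int(b[2])) ** 2
    )


def classify(student):
    """Classification: give a new student the label of the most
    similar student whose label is already known."""
    nearest = min(TRAINING, key=lambda row: distance(row[:3], student))
    return nearest[3]


# Hypothetical, made-up unlabeled rows. The first column is an ID and the
# last is an unrelated number; both are ignored when grouping.
ROWS = [
    (101, 1, 1, 1, 0, 0, 0, 1, 1, 1, 7),
    (102, 0, 0, 0, 1, 1, 1, 0, 0, 0, 3),
    (103, 1, 1, 1, 0, 0, 0, 1, 1, 1, 9),
    (104, 0, 0, 0, 1, 1, 1, 0, 0, 0, 5),
    (105, 1, 1, 1, 0, 0, 0, 1, 1, 1, 2),
]


def cluster(rows):
    """Clustering (naive): group row IDs by the pattern in the middle
    columns. No labels are supplied; the groups come from the data."""
    groups = defaultdict(list)
    for row in rows:
        pattern = row[1:-1]  # drop the first and last columns
        groups[pattern].append(row[0])
    return dict(groups)


if __name__ == "__main__":
    print(classify((3.5, 0, False)))
    print(classify((2.0, 3, True)))
    for pattern, ids in cluster(ROWS).items():
        print(pattern, ids)
```

The difference shows up in the inputs: `classify` needs rows that already carry a label, while `cluster` is handed rows with no labels at all and simply reports which ones go together.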