# Week 7: Hierarchical Clustering

Time Estimates:

- Videos: 10 min
- Activities: 40 min
- Check-ins: 1

## Hierarchical Clustering

The other type of clustering we will implement this week is hierarchical clustering — specifically the agglomerative (bottom-up) variety, which starts with every observation in its own cluster and repeatedly merges the two closest clusters.

Required Video: Intro to Hierarchical Clustering

Note that there are three common ways of comparing two clusters to determine whether they should be merged:

1. Complete Linkage - Uses the largest distance between any point in cluster A and any point in cluster B. This is the default behavior in hclust().

2. Single Linkage - Uses the smallest distance between any point in cluster A and any point in cluster B.

3. Average Linkage - Uses the average of all pairwise distances between points in cluster A and points in cluster B. (Using the distance between the clusters' centroids is a different option, centroid linkage, which hclust() also supports.)
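In hclust(), the linkage is chosen with the `method` argument. A minimal sketch comparing the three linkages on the same data (the data frame `df` here is just a placeholder for your own numeric data):

```r
# Pairwise Euclidean distances; hclust() takes a dist object, not raw data
d <- dist(df)

hc_complete <- hclust(d)                     # method = "complete" is the default
hc_single   <- hclust(d, method = "single")  # smallest pairwise distance
hc_average  <- hclust(d, method = "average") # mean of all pairwise distances
```

Plotting each result (e.g. `plot(hc_single)`) is a quick way to see how the choice of linkage changes the shape of the tree.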

Check-In 1: Case Study: Federalist Papers

In the k-means coursework, you identified the authorship of the disputed Federalist Papers.

Try this out with hierarchical clustering instead.

1. Convert your fed data into a matrix. (Do not reduce dimension with PCA.)

2. Use hclust() on your data.

3. Create a dendrogram of the results, with the observations (“nodes” or “leaves”) labelled by author.
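The steps above can be sketched as follows. This assumes your Federalist data is in a data frame called `fed` with an `author` column, matching the k-means coursework — adjust the names to fit your own objects:

```r
# 1. Convert the numeric columns of `fed` into a matrix (no PCA)
fed_matrix <- as.matrix(fed[, sapply(fed, is.numeric)])

# 2. Cluster on the pairwise distances
hc <- hclust(dist(fed_matrix))

# 3. Dendrogram with leaves labelled by author
plot(hc, labels = fed$author,
     main = "Federalist Papers", xlab = "", sub = "")
```

Look at where the disputed papers land in the tree: do they merge into the Hamilton branch or the Madison branch, and does that agree with your k-means result?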

For extra fun, try out these ways to make prettier or more informative dendrograms: