Good to see you here!

Today's quiz is based on Cluster Analysis, so let's get started.

**Question 1: What is Cluster Analysis?**

Cluster analysis is a statistical procedure that groups data objects based on the information in the data set that describes the objects and their attributes.

**Question 2: What is the Goal of Cluster Analysis?**

The objective of cluster analysis is to group objects with similar characteristics into the same cluster.

**Question 3: What are the two types of Clustering?**

The two types of clustering are:

- Hierarchical Clustering: clusters are arranged in a hierarchical tree
- Partitioning Clustering: data are grouped into distinct subsets that do not overlap

**Question 4: Describe the k-Means Clustering**

k-Means clustering is a partitioning clustering approach in which each cluster is associated with a centroid (center point) and each data point is assigned to the centroid closest to it. The number of clusters, K, is specified in advance.

**Question 5: Write the k-Means Clustering Algorithm**

i. Choose the initial value of K

ii. Choose the initial centroids

**repeat**

iii. Form K clusters by assigning each point to the closest centroid

iv. Recalculate the centroid of each cluster

v. Move the centroid to the new computed position

vi.

**until** the centroids' positions no longer change

**Question 6: How do you Choose the Initial Value of K for k-Means Clustering?**

- Use another clustering method to estimate it
- Run the algorithm with different values of K and then choose the one that is optimal
- Use the prior knowledge about the characteristics of the data
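To make Questions 5 and 6 concrete, here is a minimal sketch of the k-Means loop, run for a few values of K. It is only an illustration under stated assumptions: the function names (`k_means`, `within_cluster_ss`), the sample points, and the choice of the first K points as initial centroids are mine, and the within-cluster sum of squares is just one possible way to compare different values of K.

```python
import math

def closest(point, centroids):
    """Index of the centroid nearest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def k_means(points, k, max_iter=100):
    # Assumption: k <= len(points), and the first k points serve as the
    # initial centroids so that the example is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(max_iter):
        # Form k clusters by assigning each point to the closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[closest(p, centroids)].append(p)
        # Recalculate each centroid; an empty cluster keeps its old centroid.
        new_centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

def within_cluster_ss(centroids, clusters):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(math.dist(p, c) ** 2 for c, pts in zip(centroids, clusters) for p in pts)

points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
for k in (1, 2, 3):
    centroids, clusters = k_means(points, k)
    print(k, round(within_cluster_ss(centroids, clusters), 3))
```

On this toy data the sum of squares drops sharply from K = 1 to K = 2 and only slightly after that, which is the kind of signal you might look for when trying several values of K.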

**Question 7: How do you choose the centroid for the cluster?**

- Random selection from the feature space
- Random selection from the data set
- Look for dense regions of space
- Space them uniformly around the feature space
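Three of these strategies are easy to sketch in a few lines. The helper names below are hypothetical, "feature space" is assumed to mean the bounding box of the data, and spacing centroids along the bounding-box diagonal is only one simple reading of "uniformly". Looking for dense regions needs a density estimate, so it is left out of this sketch.

```python
import random

def bounding_box(points):
    """Per-dimension minimum and maximum of the data."""
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    return lows, highs

def from_data_set(points, k, rng):
    """Pick k distinct data points as initial centroids."""
    return rng.sample(points, k)

def from_feature_space(points, k, rng):
    """Pick k random positions inside the bounding box of the data."""
    lows, highs = bounding_box(points)
    return [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs)) for _ in range(k)]

def uniformly_spaced(points, k):
    """Place k centroids at even intervals along the bounding-box diagonal."""
    lows, highs = bounding_box(points)
    return [tuple(lo + (hi - lo) * (i + 0.5) / k for lo, hi in zip(lows, highs))
            for i in range(k)]

rng = random.Random(0)  # fixed seed so the example is repeatable
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]
print(from_data_set(points, 2, rng))
print(from_feature_space(points, 2, rng))
print(uniformly_spaced(points, 2))
```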

**Question 8: How is the quality of a cluster measured?**

- The size of the cluster versus the distance between the clusters
- The distance between members of the cluster
- The diameter of the smallest sphere that contains the cluster
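As a rough illustration, these measures can be computed as below. The function names and the two sample clusters are made up for the example, and the largest pairwise distance is used as a simple stand-in for the diameter of the smallest enclosing sphere, which takes more work to compute exactly. Each cluster is assumed to have at least two points.

```python
import math
from itertools import combinations

def centroid(cluster):
    """Coordinate-wise mean of the cluster members."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))

def mean_pairwise_distance(cluster):
    """Average distance between members of one cluster (lower means tighter)."""
    pairs = list(combinations(cluster, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

def diameter(cluster):
    """Largest distance between two members (a proxy for the sphere diameter)."""
    return max(math.dist(a, b) for a, b in combinations(cluster, 2))

def size_vs_separation(a, b):
    """Larger cluster diameter divided by the distance between the centroids."""
    return max(diameter(a), diameter(b)) / math.dist(centroid(a), centroid(b))

a = [(0.0, 0.0), (0.0, 3.0), (4.0, 0.0)]
b = [(20.0, 0.0), (20.0, 3.0), (24.0, 0.0)]
print(mean_pairwise_distance(a))
print(diameter(a))
print(size_vs_separation(a, b))
```

A small ratio from `size_vs_separation` suggests clusters that are compact relative to how far apart they sit.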

**Question 9: What are some limitations of k-Means Clustering?**

- Not effective when the data contains outliers, because extreme points pull the centroids toward them
- Fails for non-convex (non-round) clusters
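The outlier problem is easy to see with a tiny, made-up example: a centroid is just the mean of its cluster, so one extreme point can drag it far from the rest of the data.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

cluster = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
with_outlier = cluster + [(100.0, 100.0)]  # one hypothetical outlier

print(centroid(cluster))       # (2.0, 2.0)
print(centroid(with_outlier))  # (26.5, 26.5)
```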

**Question 9: What is MacQueen's Algorithm used for?**

MacQueen's algorithm is used for measuring the goodness of a clustering and for minimizing the compactness function in a finite number of steps.

**Question 10: Outline and explain the two types of Hierarchical Clustering**

The two types of hierarchical clustering are:

- Top-Down (Divisive) Clustering
- Bottom-Up (Agglomerative) Clustering

*How Bottom-Up or Agglomerative Clustering works*

- Start with each data point in its own cluster
- Merge the two clusters that are most similar
- Repeat the merging until there is a single cluster containing all the data points
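The three steps above can be sketched as follows. This is a deliberately simple version that recomputes every cluster-to-cluster distance on each pass; the function names and sample points are mine, and "most similar" is assumed here to mean the smallest single-link distance (the closest pair of points across two clusters).

```python
import math

def single_link(a, b):
    """Distance between the closest pair of points across two clusters."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerate(points):
    # Start with each data point in its own cluster.
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-link distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j]))
        # Replace the two clusters with their union.
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return merges

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0), (20.0, 0.0)]
for left, right in agglomerate(points):
    print(left, "+", right)
```

The recorded merge order is exactly the information a dendrogram draws as a tree.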

*How Top-Down or Divisive Clustering works*

- Start with all examples in one big cluster
- Remove the data point that seems too far away from the other points
- Repeat the process until each point is in its own cluster
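Here is a small sketch of the divisive steps as described above. It assumes "too far away" means the largest total distance to the other points in the cluster; the function names and sample points are made up for the illustration, and practical divisive methods usually split off whole groups rather than single points.

```python
import math

def farthest_point(cluster):
    """Member with the largest total distance to the other members."""
    return max(cluster, key=lambda p: sum(math.dist(p, q) for q in cluster))

def divide(points):
    # Start with all examples in one big cluster.
    remaining = list(points)
    split_order = []
    while len(remaining) > 1:
        # Remove the point that lies farthest from the rest...
        outlier = farthest_point(remaining)
        remaining.remove(outlier)
        # ...and record it as its own single-point cluster.
        split_order.append(outlier)
    # The last point left is also a cluster of its own.
    split_order.append(remaining[0])
    return split_order

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0), (20.0, 0.0)]
print(divide(points))
```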

**Question 11: Mention three ways to compute dissimilarity between clusters**

- Single Link
- Complete Link
- Group Average
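A quick sketch of the three measures, assuming Euclidean distance between points: single link takes the closest pair across the two clusters, complete link takes the farthest pair, and group average takes the mean over all pairs. The sample clusters are made up for the example.

```python
import math

def single_link(a, b):
    """Smallest distance between a point in a and a point in b."""
    return min(math.dist(p, q) for p in a for q in b)

def complete_link(a, b):
    """Largest distance between a point in a and a point in b."""
    return max(math.dist(p, q) for p in a for q in b)

def group_average(a, b):
    """Mean distance over all cross-cluster pairs."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

a = [(0.0, 0.0), (1.0, 0.0)]
b = [(4.0, 0.0), (6.0, 0.0)]
print(single_link(a, b))    # 3.0
print(complete_link(a, b))  # 6.0
print(group_average(a, b))  # 4.5
```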

**Question 12: Compare k-Means and Hierarchical Clustering**

- k-Means produces a single partition, while hierarchical clustering produces different partitions
- k-Means needs the number of clusters specified in advance, while hierarchical clustering does not
- k-Means has a more efficient run time than hierarchical clustering

**Question 13: What is a Dendrogram?**

A dendrogram is a tree diagram used to illustrate the arrangement of clusters in hierarchical clustering.

I'll stop here to give you some time to get your head around these concepts.

Thank you for reading!

Feel free to check out the quizzes on other Statistics topics.