Understanding the Concept of the Hierarchical Clustering Technique

This is a less popular technique in the real world.

Ward’s Method: This approach to calculating the similarity between two clusters is exactly the same as Group Average, except that Ward’s method sums the squares of the distances between the points Pi and Pj. Mathematically, this can be written as:

sim(C1, C2) = ∑ (dist(Pi, Pj))² / (|C1| * |C2|), where Pi ∈ C1 and Pj ∈ C2

Pros of Ward’s method:
Ward’s method also does well at separating clusters when there is noise between them.

Cons of Ward’s method:
Ward’s method is also biased towards globular clusters.

Space and Time Complexity of Hierarchical clustering Technique:

Space complexity: The space required for the Hierarchical clustering Technique is very high when the number of data points is large, because the similarity matrix has to be held in RAM. The space complexity is of the order of the square of n.
Space complexity = O(n²), where n is the number of data points.

Time complexity: Since we have to perform n iterations, and in each iteration we need to update and restore the similarity matrix, the time complexity is also very high. The time complexity is of the order of the cube of n.
Time complexity = O(n³), where n is the number of data points.

Limitations of Hierarchical clustering Technique:
There is no mathematical objective for Hierarchical clustering.
Each approach to calculating the similarity between clusters has its own disadvantages.
The space and time complexity of Hierarchical clustering are high, so this clustering algorithm is impractical when we have huge data.
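
To make the Ward’s method formula and the O(n²) space estimate more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names (sq_dist, ward_similarity, similarity_matrix_bytes) are hypothetical and chosen for this example, points are plain tuples of numbers, dist is taken to be Euclidean distance, and each stored similarity value is assumed to take 8 bytes. Note that under this formula a smaller value means the two clusters are closer together.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def ward_similarity(c1, c2):
    """Formula from the text: sum of squared distances over every
    cross-cluster pair (Pi in C1, Pj in C2), divided by |C1| * |C2|."""
    total = sum(sq_dist(p, q) for p in c1 for q in c2)
    return total / (len(c1) * len(c2))


def similarity_matrix_bytes(n, bytes_per_value=8):
    """Rough memory needed for a full n x n similarity matrix.
    Assumes 8 bytes per entry (a 64-bit float); this is an assumption."""
    return n * n * bytes_per_value


# Two small, made-up clusters in 2D.
c1 = [(0, 0), (0, 2)]
c2 = [(3, 0), (3, 2)]

print(ward_similarity(c1, c2))           # (9 + 13 + 13 + 9) / 4 = 11.0
print(similarity_matrix_bytes(100_000))  # 80000000000 bytes
```

For 100,000 data points the full matrix would already need about 80 GB under these assumptions, which shows why the quadratic space requirement becomes a problem on large datasets.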
