Types of Clustering [1]
k-means clustering
The algorithm steps are:
• Choose the number of clusters, k.
• Randomly generate k clusters and determine the cluster centers, or directly generate k random points as cluster centers.
• Assign each point to the nearest cluster center, where "nearest" is defined with respect to a chosen distance measure (typically Euclidean distance).
• Recompute each cluster center as the mean of the points assigned to it.
• Repeat the two previous steps until some convergence criterion is met (usually that the assignments no longer change).
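The steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a reference implementation: the function names (`k_means`, `squared_distance`, `mean_point`), the `max_iter` and `seed` parameters, and the choice of squared Euclidean distance are all assumptions made for the example. It also initializes the centers by picking k of the input points, which is one common variant of "generate k random points".

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    """Return (centers, assignment) for a list of points and a cluster count k."""
    rng = random.Random(seed)
    # Assumption: use k distinct input points as the initial centers.
    centers = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Assign each point to the index of its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Convergence criterion: the assignment did not change.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Recompute each center as the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = mean_point(members)
    return centers, assignment
```

Calling `k_means(points, 2)` on a list of coordinate tuples returns the final centers and one cluster index per point. The result depends on the random initialization, so different seeds can give different clusterings on less clearly separated data.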
Hierarchical Algorithm
• Produces a set of nested clusters organized as a hierarchical tree
• The two main types of hierarchical algorithm are agglomerative ("bottom-up") and divisive ("top-down")
• Agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters
• Divisive algorithms begin with the whole set and proceed to divide it into successively smaller clusters
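As a rough illustration of the agglomerative ("bottom-up") direction, the sketch below starts with one cluster per point and repeatedly merges the two closest clusters. The names `agglomerative`, `single_link`, and `target_clusters` are hypothetical, and measuring closeness by the closest pair of points is just one possible choice. The recorded merge order is what a hierarchical tree would be built from.

```python
import math


def single_link(a, b):
    """Distance between clusters a and b: their closest pair of points."""
    return min(math.dist(p, q) for p in a for q in b)


def agglomerative(points, target_clusters=1, linkage=single_link):
    """Merge clusters bottom-up until only target_clusters remain."""
    # Start with every point as its own cluster.
    clusters = [[p] for p in points]
    merges = []  # sequence of (cluster, cluster) pairs, in merge order
    while len(clusters) > target_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters, merges
```

This brute-force version rescans every pair of clusters on each merge, so it is only suitable for small inputs. A divisive algorithm would run in the opposite direction, starting from a single cluster and splitting it.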
Distance measures between groups
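• Min: uses the distance between the two closest points, one from group A and one from group B, as the measure of similarity
• Max: uses the distance between the two farthest points, one from group A and one from group B, as the measure of similarity
• Group average: the average of the distances from every point in group A to every point in group B
• Distance Between Centroids: uses the distance from the center of group A to the center of group B as the measure of similarity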
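The four measures above can be written down directly. The following is a small sketch assuming each group is a non-empty list of coordinate tuples and that point-to-point distance is Euclidean. The function names are chosen for this example only.

```python
import math
from itertools import product


def min_distance(a, b):
    """Min: distance between the closest pair of points across the groups."""
    return min(math.dist(p, q) for p, q in product(a, b))


def max_distance(a, b):
    """Max: distance between the farthest pair of points across the groups."""
    return max(math.dist(p, q) for p, q in product(a, b))


def group_average(a, b):
    """Group average: mean of all pairwise distances between the groups."""
    total = sum(math.dist(p, q) for p, q in product(a, b))
    return total / (len(a) * len(b))


def centroid(points):
    """Coordinate-wise mean of a group of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def centroid_distance(a, b):
    """Distance between the centroids of the two groups."""
    return math.dist(centroid(a), centroid(b))
```

For any two groups, the Min value is never larger than the group average, which in turn is never larger than the Max value, so the choice of measure changes which groups look "closest" and therefore which ones get merged first.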
-------------------------------------------------------------------------------
[1] http://en.wikipedia.org/wiki/Cluster_analysis