Cluster Analysis

Types of Clustering:[1]

k-means clustering

The algorithm steps are:
• Choose the number of clusters, k.
• Randomly generate k clusters and determine the cluster centers, or directly generate k random points as cluster centers.
• Assign each point to the nearest cluster center, where "nearest" is defined with respect to a chosen distance measure (for example, Euclidean distance).
• Recompute each cluster center as the mean of the points assigned to it.
• Repeat the two previous steps until some convergence criterion is met (usually that the assignments no longer change).
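The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function name `kmeans`, the `max_iter` and `seed` parameters, the use of Euclidean distance, and the choice to initialize from randomly sampled input points are all assumptions made for the example.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Minimal k-means sketch on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Generate k starting centers by sampling k distinct input points.
    centers = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Assign each point to its nearest center (Euclidean distance assumed).
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Convergence criterion: the assignment did not change.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Recompute each center as the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, assignment

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, assignment = kmeans(points, k=2)
print(centers, assignment)
```

With two well-separated groups like these, the first three points should end up in one cluster and the last three in the other; on less cleanly separated data the result can depend on the random starting centers.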

Hierarchical Algorithm
• Produces a set of nested clusters organized as a hierarchical tree

• The two main types of hierarchical algorithm are agglomerative ("bottom-up") and divisive ("top-down")
• Agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters
• Divisive algorithms begin with the whole set and proceed to divide it into successively smaller clusters
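As a rough illustration of the agglomerative ("bottom-up") approach, the sketch below starts with one cluster per point and repeatedly merges the two closest clusters. The names `agglomerative`, `closest_pair_distance`, and `target_clusters` are made up for this example, and measuring cluster distance by the closest pair of points is just one possible choice.

```python
import math

def closest_pair_distance(a, b):
    """Distance between clusters a and b, taken as their closest pair of points."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerative(points, target_clusters, linkage=closest_pair_distance):
    """Merge clusters bottom-up until only target_clusters remain."""
    # Begin with each element as a separate cluster.
    clusters = [[p] for p in points]
    merges = []  # record of merged pairs, in order (the nested tree structure)
    while len(clusters) > target_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        # Replace the two clusters with their union.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters, merges

points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
clusters, merges = agglomerative(points, target_clusters=3)
print(clusters)
```

This brute-force search re-evaluates every pair of clusters on each merge, so it is only suitable for small inputs; a divisive algorithm would run in the opposite direction, starting from one cluster and splitting it.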

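Distance measures between clusters

• Min: the distance between the two closest points, one from group A and one from group B, is used as the similarity criterion
• Max: the distance between the two farthest points, one from group A and one from group B, is used as the similarity criterion
• Group average: the average of the distances from every point in group A to every point in group B
• Distance Between Centroids: the distance from the center point of group A to the center point of group B is used as the similarity criterion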
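The four measures can be written out as small Python functions. This is a sketch under a few assumptions: each group is a list of equal-length numeric tuples, point-to-point distance is Euclidean, and the function names are chosen only for this example.

```python
import math
from itertools import product

def min_distance(a, b):
    """Min: distance between the closest pair of points across the two groups."""
    return min(math.dist(p, q) for p, q in product(a, b))

def max_distance(a, b):
    """Max: distance between the farthest pair of points across the two groups."""
    return max(math.dist(p, q) for p, q in product(a, b))

def group_average(a, b):
    """Group average: mean of all pairwise distances between the two groups."""
    return sum(math.dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

def centroid(group):
    """Center point of a group: the coordinate-wise mean."""
    return tuple(sum(x) / len(group) for x in zip(*group))

def centroid_distance(a, b):
    """Distance between centroids of the two groups."""
    return math.dist(centroid(a), centroid(b))

A = [(0, 0), (0, 2)]
B = [(3, 0), (3, 2)]
print(min_distance(A, B))       # closest pair, e.g. (0, 0) and (3, 0)
print(max_distance(A, B))       # farthest pair, e.g. (0, 0) and (3, 2)
print(group_average(A, B))      # mean over all 4 cross-group pairs
print(centroid_distance(A, B))  # (0, 1) to (3, 1)
```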

-------------------------------------------------------------------------------
[1] http://en.wikipedia.org/wiki/Cluster_analysis
