What is association rule mining?
A: Given a set of transactions, find rules that predict the occurrence of an item based on the occurrences of other items in the transaction.

Definition
• Support count (\sigma): the number of times an itemset occurs
  ∘ Frequency of occurrence of an itemset
  ∘ E.g. \sigma({Milk, Bread, Diaper}) = 2
• Support (s): the support count of an itemset divided by the total number of transactions
  ∘ Fraction of transactions that contain an itemset
  ∘ E.g. s({Milk, Bread, Diaper}) = 2/5
• Frequent Itemset
  ∘ An itemset whose support is greater than or equal to a minsup threshold
• Rule Evaluation Metrics (for a rule X → Y)
  ∘ Support (s): Fraction of transactions that contain both X and Y, i.e. \sigma({X, Y}) / |T|, where |T| is the total number of transactions
  ∘ Confidence (c): Measures how often items in Y appear in transactions that contain X, i.e. \sigma({X, Y}) / \sigma(X)

Maximal Frequent Itemset:
• A frequent itemset is maximal if none of its immediate supersets is frequent (that is, none of its nearest supersets is a frequent itemset)
• Example:
  ∘ Items: a, b, c, d, e
  ∘ Frequent Itemset: {a, b