Saturday, June 5, 2010

Association Analysis / Rule Learning

"Association Analysis", also called "Association Rule Learning", is the discovery of interesting relationships between variables in large datasets. These relationships are usually expressed as association rules, each consisting of an antecedent and a consequent.

For example, {Onions, Oil} => {Tomatoes}. Here the antecedent is {Onions, Oil} and the consequent is {Tomatoes}. The rule says that customers who buy both onions and oil are likely to buy tomatoes as well. How likely is it that a customer who buys onions and oil also buys tomatoes? This is measured by the confidence of the rule: of the transactions that contain onions and oil, the fraction that also contain tomatoes. Such rules are generated from databases holding many transactions. To be really sure of a rule, we would ideally want many transactions in which a customer bought all three items together. This is measured by the support of the rule: the number (or fraction) of all transactions that contain onions, oil and tomatoes together.
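To make the two measures concrete, here is a minimal Python sketch that computes support and confidence for the rule above. The transaction list is made up purely for illustration, and the helper names (count_containing, support, confidence) are my own, not taken from any library.

```python
def count_containing(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def support(transactions, itemset):
    """Fraction of all transactions that contain the whole itemset."""
    return count_containing(transactions, itemset) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction
    that also contain the consequent."""
    with_antecedent = count_containing(transactions, antecedent)
    if with_antecedent == 0:
        return 0.0  # assumption: treat a never-seen antecedent as confidence 0
    with_both = count_containing(transactions, set(antecedent) | set(consequent))
    return with_both / with_antecedent


# Hypothetical shopping baskets, invented for this example.
transactions = [
    {"onions", "oil", "tomatoes"},
    {"onions", "oil", "tomatoes", "bread"},
    {"onions", "oil"},
    {"oil", "bread"},
    {"tomatoes", "milk"},
]

rule_support = support(transactions, {"onions", "oil", "tomatoes"})
rule_confidence = confidence(transactions, {"onions", "oil"}, {"tomatoes"})

# prints: support = 0.40, confidence = 0.67
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

In this toy data, two of the five baskets contain all three items (support 2/5), and two of the three baskets with onions and oil also contain tomatoes (confidence 2/3).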

The two most popular algorithms for learning association rules are "Apriori" and "FP-growth".