Ans :
In machine learning, association rules are an important concept that is widely applied to
problems such as market basket analysis. Consider a supermarket, where related items such as
groceries, dairy products, cosmetics and stationery are kept together in the same aisle. This
helps customers find the items they need quickly. It also reminds them of items they might have
forgotten to purchase, or that they may like to purchase if prompted.
Association rules thus enable one to correlate various products from a huge set of available
items. Analysing the items customers buy together also helps retailers identify the items they
can offer at a discount. For example, a retailer may sell baby lotion and baby shampoo at MRP but
offer a discount on their combination. A customer who wished to buy only the shampoo or only the
lotion may now think of buying the combination. Other factors can also contribute to the purchase
of a combination of products. Another strategy is to keep related products at opposite ends of the
shelf, prompting the customer to scan through the entire shelf in the hope that they add a few more
items to their cart.
It is important to note that association rules do not extract the customer's preferences about the
items; they find relations among items that are generally bought together. The rules
only identify frequent associations between items. A rule has an antecedent (if)
and a consequent (then), each of which is a set of items. For example, if a person buys pizza,
then they may buy a cold drink too, because there is a strong relation between pizza and cold
drinks. Association rules help to find the dependency of one item on another by considering the
history of customers' transaction patterns.
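The idea of reading a relation off the transaction history can be illustrated with a minimal Python sketch. The baskets below and the helper name co_purchased are made up for illustration and are not part of any library; the sketch only counts which items appear in baskets that contain the antecedent, which is the raw information a rule such as "pizza => cold drink" is built from.

```python
from collections import Counter

# Hypothetical transaction history; each basket is a set of items.
transactions = [
    {"pizza", "cold drink"},
    {"pizza", "cold drink", "garlic bread"},
    {"pizza", "garlic bread"},
    {"bread", "butter"},
    {"pizza", "cold drink"},
]


def co_purchased(antecedent, history):
    """Count the other items in baskets that contain every antecedent item."""
    counts = Counter()
    for basket in history:
        if antecedent <= basket:  # the basket contains the whole antecedent
            counts.update(basket - antecedent)
    return counts


counts = co_purchased({"pizza"}, transactions)
print(counts.most_common())  # [('cold drink', 3), ('garlic bread', 2)]
```

In this toy history, a cold drink appears in three of the four baskets that contain pizza, so it is the strongest candidate for the consequent.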
Basic Concepts
There are a few terms that one should understand before studying the algorithm.
a. k-Itemset: A set of k items. For example, a 2-itemset can be {pencil, eraser} or {bread, butter},
and a 3-itemset can be {bread, butter, milk}.
b. Support: The fraction of all the considered transactions in which an item appears is called
the support of that item. Mathematically, the support of an item x is defined as:
$$\text{support}(x) = \frac{\text{number of transactions containing } x}{\text{total number of considered transactions}}$$
c. Confidence: Confidence is defined as the likelihood of obtaining item y along with an item x.
Mathematically, it is defined as the ratio of the number of transactions containing both items x
and y to the number of transactions containing item x.
$$\text{confidence}(x \Rightarrow y) = \frac{\text{number of transactions containing } x \text{ and } y}{\text{number of transactions containing } x}$$
Confidence can also be defined as the conditional probability of occurrence of y, given the
occurrence of x.
$$\text{confidence}(x \Rightarrow y) = P(y \mid x)$$
where x is the antecedent and y is the consequent. In terms of support, confidence can be described
as:
$$\text{confidence}(x \Rightarrow y) = \frac{\text{support}(x \cup y)}{\text{support}(x)}$$
Here support(x ∪ y) is the support of the itemset that contains both x and y.
d. Frequent Itemset: An itemset whose support is at least the minimum support threshold is known
as a frequent itemset. For example, if the minimum support threshold is 10, then an itemset with a
support score of 11 is a frequent itemset, but an itemset with a support score of 9 is not.
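The definitions above can be tried out with a small Python sketch. The transactions, the function names (support_count, support, confidence, is_frequent) and the threshold of 0.5 are all assumptions chosen for illustration. Confidence is computed from raw counts, which gives the same value as the ratio of supports because the total number of transactions cancels out. The threshold here is a fraction, to match the fractional definition of support; a count-based threshold would work the same way with support_count.

```python
# Hypothetical transactions; each one is a set of purchased items.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"pencil", "eraser"},
    {"bread", "butter", "eraser"},
]


def support_count(itemset, history):
    """Number of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(1 for basket in history if itemset <= basket)


def support(itemset, history):
    """Fraction of all considered transactions that contain the itemset."""
    return support_count(itemset, history) / len(history)


def confidence(antecedent, consequent, history):
    """confidence(x => y) = count(x and y together) / count(x)."""
    both = set(antecedent) | set(consequent)
    return support_count(both, history) / support_count(antecedent, history)


def is_frequent(itemset, history, min_support):
    """True if the support is at least the minimum support threshold."""
    return support(itemset, history) >= min_support


print(support({"bread"}, transactions))                # 0.8
print(support({"bread", "butter"}, transactions))      # 0.6
print(confidence({"bread"}, {"butter"}, transactions)) # 0.75
print(is_frequent({"bread", "butter"}, transactions, 0.5))   # True
print(is_frequent({"pencil", "eraser"}, transactions, 0.5))  # False
```

In this toy data, bread appears in 4 of the 5 transactions and bread with butter in 3, so the rule bread => butter has confidence 3/4.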