Web Mining: Short/Long Term User Profile

February 21, 2010

For nearly a year now, I have been managing a targeted advertising project for a telco company in Switzerland. What distinguishes our approach is the fusion of offline (CRM) and online (web) customer profiles. We build these extended customer profiles (ECPs) over a sliding time window. The ECPs are then mined to predict events such as a customer clicking on a given ad.

Several factors can influence the results: the quality of the CRM data, the granularity of the web log aggregation, the data mining technique used, the size of the time window used to build the ECPs, and so on. In this post, I will focus on the time window size. Without going into too much detail, there are two choices: a short-term or a long-term time window. For example, short term can mean 1 to 7 days, while long term can mean anything beyond 7 days. The choice between the two is an important decision that will affect the results.

The choice depends on what we want to capture. If we want the recent interests of customers, then a short time window should be used. In this case, only the very recent web activities of customers are taken into account, so the resulting profile varies more over time. With a long time window, the profile reflects the customer's overall interests instead. In our project, we identify the customer at the household level (rather than at the person level). It is thus not possible to differentiate between the father and the son who are both using the same internet connection. Given this, trying to capture the recent and volatile interests of "the customer" makes little sense, since they may belong to different people, and we have decided to use the long time window.
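To make the difference concrete, here is a minimal sketch in Python of how the window size changes a household profile. It is only an illustration, not the code used in our project: the log records, the household identifier `hh1`, the page categories, and the function `build_profile` are all made up for the example, and a real ECP would of course also include the CRM attributes.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical web-log records: (household_id, visit_date, page_category).
# Two people behind the same connection: one reads sports, one plays games.
WEB_LOG = [
    ("hh1", date(2010, 2, 1), "sports"),
    ("hh1", date(2010, 2, 3), "sports"),
    ("hh1", date(2010, 2, 10), "sports"),
    ("hh1", date(2010, 2, 19), "games"),
    ("hh1", date(2010, 2, 20), "games"),
]


def build_profile(log, household_id, end_date, window_days):
    """Count the page categories visited by one household during the
    `window_days` days ending at `end_date` (inclusive)."""
    start_date = end_date - timedelta(days=window_days - 1)
    return Counter(
        category
        for hh, day, category in log
        if hh == household_id and start_date <= day <= end_date
    )


today = date(2010, 2, 21)

# Short window (7 days): only the latest activity is visible.
short_profile = build_profile(WEB_LOG, "hh1", today, window_days=7)

# Long window (30 days): the overall interests of the household.
long_profile = build_profile(WEB_LOG, "hh1", today, window_days=30)

print("short:", dict(short_profile))
print("long: ", dict(long_profile))
```

With this toy data, the 7-day profile contains only "games", while the 30-day profile contains both "sports" and "games". Sliding the short window back by two weeks would flip it to "sports" only, which is the kind of volatility that is hard to interpret when several people share one connection.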
