Nowadays, retail managers have access to tremendous amounts of data that allow them to assess business performance and understand customers' behavior. Thanks to these data and the help of bright business analysts, they no longer need to rely only on intuition and gut feeling but can make decisions based on facts. However, historical data offer insight only into past actions and behavior. What about assessing novel concepts: a new type of promotion, a store redesign, or price changes of a size not experienced in the past?
A fashion chain manager might feel that it would be beneficial to re-price the whole base assortment, increasing all prices by 10%. A supermarket manager might suspect that opening stores one hour later would reduce costs while hardly affecting sales. It would be a pity to follow such feelings and introduce the idea in all stores, only to discover that it reduces profit and makes customers unhappy. What is more, if a change is implemented in all stores at once, there is no reference point, so it is difficult to judge whether the change caused the observed effect. What if sales would have gone up anyway and the increase merely coincided with a new promotion? Instead, one could conduct a business experiment: take action in only a few stores, select the right control stores, compare the results, and assess your initial feeling.
A good experiment starts with a good research hypothesis. You need to specify what your outcome variable and treatment variable are. This choice determines what data and what statistical tests will be required to verify your intuition and estimate the treatment effect. To ensure that the results of the experiment are reliable, treatment and control groups need to be selected wisely. You cannot use only your best-performing stores as a test group and compare their outcomes with stores that have always performed significantly worse. In an ideal experiment we talk about random assignment. In practice, this means that for each experimental store you need to be able to find a sufficiently similar reference store. What exactly "similar" means depends on the analysis that follows. If a difference-in-differences estimator is used, the common trend assumption should hold: experimental and reference stores should show comparable behavior of the outcome variable in the period preceding the experiment. If a regression analysis involving nearest neighbors is applied, the most important factors that influence the outcome variable should be comparable across experimental and reference stores. For example, the stores should be of similar size, face a similar number of competitors in the area, and have a similar number of people living within a certain distance.
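To make the two ideas above more concrete, here is a minimal sketch in Python. It is only an illustration under simplifying assumptions, not a full analysis: the store names, attributes, and sales figures are made up, and the helper functions `standardize`, `match_controls`, and `diff_in_diff` are hypothetical names introduced for this example. The sketch first picks the most similar reference store for each experimental store, then computes a simple difference-in-differences estimate from group averages.

```python
from math import dist
from statistics import mean, pstdev

# Hypothetical store attributes (all values made up for illustration):
# (sales area in m^2, competitors nearby, people living within 5 km)
stores = {
    "T1": (1200, 3, 40000),
    "T2": (800, 1, 15000),
    "C1": (1150, 3, 42000),
    "C2": (2500, 6, 90000),
    "C3": (820, 1, 14000),
}

def standardize(features):
    """Rescale each attribute to mean 0 and standard deviation 1,
    so that no single attribute dominates the distance."""
    columns = list(zip(*features.values()))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) or 1.0 for col in columns]  # guard against zero spread
    return {
        name: tuple((v - m) / s for v, m, s in zip(values, mus, sigmas))
        for name, values in features.items()
    }

def match_controls(features, treated, candidates):
    """For each experimental store, pick the closest candidate store.
    Assumption: matching with replacement, Euclidean distance."""
    z = standardize(features)
    return {t: min(candidates, key=lambda c: dist(z[t], z[c])) for t in treated}

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the experimental group minus change in the reference group.
    Only meaningful if the common trend assumption is plausible."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

matches = match_controls(stores, treated=["T1", "T2"], candidates=["C1", "C2", "C3"])

# Made-up average weekly sales per group, before and during the experiment.
effect = diff_in_diff(
    treated_pre=[100, 102, 98, 100],
    treated_post=[110, 112, 108, 110],
    control_pre=[90, 92, 88, 90],
    control_post=[95, 97, 93, 95],
)
print(matches, effect)
```

In this toy data the experimental stores gain 10 units on average while the reference stores gain 5, so the estimated treatment effect is 5. A real analysis would also need a significance test and a check of the pre-period trends, which this sketch leaves out.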
By selecting treatment and control groups incorrectly, you can unknowingly introduce bias into your experiment. This is the most common mistake in business experiments, but not the only one. Another is terminating the experiment too soon. The immediate effect of a new promotion or a permanent price decrease may be highly positive but fade after a few weeks, once customers have internalized the change. Assessing the treatment effect based only on the first few weeks may lead to the wrong conclusion. You should also control for external factors, making sure that they affect both the experimental and the control group. Over the experimental period, the only factor that should clearly differ between the groups of stores is the presence or absence of the treatment.
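The risk of stopping too early can be illustrated with a small Python sketch. The weekly sales figures are made up, and `weekly_effects` is a hypothetical helper introduced for this example. It tracks the gap between experimental and reference stores week by week, net of the gap that already existed before the experiment.

```python
from statistics import mean

def weekly_effects(treated, control, n_pre):
    """Per-week effect: treated-minus-control gap in each experimental week,
    net of the average gap over the n_pre weeks before the experiment."""
    baseline_gap = mean(t - c for t, c in zip(treated[:n_pre], control[:n_pre]))
    return [(t - c) - baseline_gap for t, c in zip(treated[n_pre:], control[n_pre:])]

# Made-up weekly sales: 4 weeks before the change, then 8 weeks with it.
treated = [100, 101, 99, 100, 120, 115, 108, 104, 101, 100, 100, 100]
control = [90, 91, 89, 90, 90, 90, 90, 90, 90, 90, 90, 90]

effects = weekly_effects(treated, control, n_pre=4)
early_estimate = mean(effects[:2])  # stopping after two weeks
full_estimate = mean(effects)       # using all eight weeks

print(effects, early_estimate, full_estimate)
```

With these illustrative numbers the estimate based on the first two weeks is roughly three times the eight-week average, and the weekly effect has vanished by the end of the period. How long an experiment should actually run depends on the business context and cannot be read off a toy example like this one.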