Computer vision for retail is no longer just a buzzword but a real investment that pays off: it shortens outlet audits and raises the planogram compliance rate. At the same time, such projects have not yet become widespread practice.
In this article, we look at why companies need computer vision, how it works in merchandising, how it is introduced, and whether a project can be implemented in a few weeks.
Why Companies are Adopting Computer Vision
The end customer of FMCG companies is a person who stops by for groceries after work or spends free time shopping. One way or another, the buyers want to quickly find all the goods and go about their business. Therefore, the product should always be displayed on the shelf in the store and it should be easy to find.
This is where problems arise for most FMCG companies: the product is missing from the shelves, the layout does not match the planogram, POS materials are absent, shelf labels are mixed up, and so on. All of this needs to be monitored.
Most companies do this manually. A merchandiser visits the outlet and audits it: he or she checks the layout against the planogram and rearranges the SKUs, POS materials and shelf labels, then takes a photo and sends it to the supervisor. After that, the merchandiser verifies the stock balance, draws up a report on the outlet and goes to the next one. A merchandiser covers several such outlets per day. Several aspects of this process impair its effectiveness:
- manual verification of the layout against the planogram,
- manual verification of POS materials and shelf labels,
- manual processing of hundreds of photos by a supervisor.
As a result, multiple issues arise: human-factor errors and hundreds of unreviewed photos from merchandisers, since supervisors simply do not have time to review all of them. Executives therefore receive incomplete and outdated analytics, which are then used to make managerial decisions. This may result in lower sales and lost profit.
To avoid these problems and see the real situation on the shelves, companies are automating their merchandising processes. Computer vision is one of the solutions that streamlines this process and minimizes manual operations.
How Computer Vision Is Introduced and Whether It Can Be Faster
Yes, it can. A computer vision-powered solution can be deployed quickly: depending on complexity, this takes two to four weeks. But first, let's look at how computer vision is implemented.
Any deployment of computer vision-powered solutions takes place in several stages:
- Data preparation
- Selection of architecture and model training
- Evaluation and additional training of model
- Putting into commercial operation
Stages of a Computer Vision Project
Data preparation. The neural network is trained on a prepared dataset. During preparation, the data should be augmented: distortions such as dark, underexposed photos and glare are added to increase data diversity and, subsequently, recognition accuracy. All data is divided into three sets: training, validation and test.
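The augmentation and splitting described above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the project: the function names (darken, add_glare, split_dataset), the 70/15/15 split ratio and the tiny grayscale "photo" are all assumptions made for the example, and a real pipeline would work on full-size color images with an image library.

```python
import random

def darken(image, factor=0.5):
    """Simulate an underexposed photo by scaling every pixel toward black."""
    return [[int(p * factor) for p in row] for row in image]

def add_glare(image, top, left, size, value=255):
    """Simulate a glare spot by overwriting a square patch with a bright value."""
    out = [row[:] for row in image]  # copy so the original photo is untouched
    for r in range(top, min(top + size, len(out))):
        for c in range(left, min(left + size, len(out[r]))):
            out[r][c] = value
    return out

def split_dataset(items, train=0.7, val=0.15, seed=0):
    """Shuffle items and split them into training, validation and test sets."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

# A tiny 4x4 grayscale "photo" (pixel values 0-255), purely illustrative.
photo = [[100, 120, 140, 160] for _ in range(4)]
augmented = [photo, darken(photo), add_glare(photo, 0, 0, 2)]

# Hypothetical dataset of 100 photo IDs split 70/15/15.
train_set, val_set, test_set = split_dataset(list(range(100)))
```

Keeping the test set separate from the start matters: it is only used once the model is trained, so the final accuracy figure reflects photos the model has never seen.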
Selection of architecture and model training. First, the model architecture to be used for training is chosen. Models can be of different types, for example, Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Vision Transformers (ViT), etc.
Next, we run the neural network on the training data. The model is trained in several iterations. During each iteration, the model identifies patterns and features that help it recognize objects such as SKUs, shelf labels and POSMs, and classify products, for example water of a specific brand, volume and flavor. At the end of each iteration, the neural network is tested on the validation dataset. In this way, the model gradually improves its parameters.
Next, we launch the neural network on unlabeled photos. The process of using a trained model to make predictions on new data is referred to as inference. This can be done in real time, with the model generating predictions shortly after receiving the input data. If the neural network labels a photo incorrectly, the labeling is corrected and the photos are fed to the neural network again.
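The train-then-validate cycle can be illustrated with a deliberately tiny stand-in. The sketch below assumes a single-neuron logistic classifier on two synthetic features instead of a real image network, and the names (train, make_data), the learning rate and the number of iterations are all hypothetical. What it shows is only the loop structure: update parameters on the training set, then measure accuracy on the validation set at the end of every iteration.

```python
import math
import random

def predict(weights, bias, x):
    """Return the probability that feature vector x belongs to class 1."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def accuracy(weights, bias, data):
    """Share of samples whose thresholded prediction matches the label."""
    correct = sum((predict(weights, bias, x) >= 0.5) == bool(y) for x, y in data)
    return correct / len(data)

def train(train_data, val_data, epochs=20, lr=0.5):
    """Train with stochastic gradient descent, validating after each epoch."""
    weights, bias = [0.0, 0.0], 0.0
    history = []
    for _ in range(epochs):
        for x, y in train_data:
            error = predict(weights, bias, x) - y  # gradient of the log loss
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
        history.append(accuracy(weights, bias, val_data))
    return weights, bias, history

def make_data(n, rng):
    """Two synthetic clusters: class 0 near (-1, -1), class 1 near (1, 1)."""
    data = []
    for i in range(n):
        label = i % 2
        center = 1.0 if label else -1.0
        x = [center + rng.uniform(-0.2, 0.2), center + rng.uniform(-0.2, 0.2)]
        data.append((x, label))
    return data

rng = random.Random(0)
train_data = make_data(40, rng)
val_data = make_data(10, rng)
weights, bias, history = train(train_data, val_data)
```

The history list holds one validation accuracy per iteration, which is the signal used to decide whether further iterations are still improving the model.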
Evaluation and additional training of model. After the model has been trained, it must be evaluated on a test dataset to determine how well it performs on completely new data. If the model demonstrates poor performance, it is adjusted and the training process is repeated.
Error analysis helps to understand which objects the model fails to recognize well and which attributes matter most for classifying these objects correctly. This makes it possible to improve the model and make it more efficient on real-world photos from merchandisers.
Putting into commercial operation. When the model consistently labels photos with 95% accuracy or better, it is ready for commercial operation. At this stage, the application with the model is integrated into the business processes of the FMCG company. Goods Checker can be used as a standalone application or be integrated into existing IT applications using the API.
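A simple form of error analysis is to compute accuracy per class and count which pairs of classes get confused most often. The sketch below assumes the test results are available as two parallel lists of labels; the label names and the helper functions are hypothetical and chosen only for illustration.

```python
from collections import Counter

def per_class_accuracy(y_true, y_pred):
    """Return accuracy for each true class separately."""
    total, correct = Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {label: correct[label] / total[label] for label in total}

def top_confusions(y_true, y_pred, n=3):
    """Return the most frequent (true, predicted) pairs among the errors."""
    confusions = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return confusions.most_common(n)

# Hypothetical test-set results: true labels versus model predictions.
y_true = ["water_still_0.5l", "water_still_0.5l",
          "water_sparkling_0.5l", "water_sparkling_0.5l",
          "shelf_label", "shelf_label"]
y_pred = ["water_still_0.5l", "water_sparkling_0.5l",
          "water_sparkling_0.5l", "water_sparkling_0.5l",
          "shelf_label", "shelf_label"]

class_scores = per_class_accuracy(y_true, y_pred)
worst_pairs = top_confusions(y_true, y_pred)
```

In this made-up example the only weak class is still water, which is sometimes mistaken for sparkling water of the same volume. A finding like that points to where more training photos or more varied augmentation are needed.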
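The word "consistently" can be made concrete with a release check that requires every evaluation batch, not just the average, to meet the threshold. This is a minimal sketch: the function name and the sample accuracy figures are assumptions, and only the 95% threshold comes from the text.

```python
def ready_for_production(batch_accuracies, threshold=0.95):
    """Return True only if every evaluation batch meets the accuracy threshold."""
    return bool(batch_accuracies) and all(a >= threshold for a in batch_accuracies)

# Hypothetical accuracy per batch of real merchandiser photos.
stable_run = [0.96, 0.97, 0.95]
unstable_run = [0.99, 0.98, 0.91]  # high average, but one batch falls short

stable_ok = ready_for_production(stable_run)
unstable_ok = ready_for_production(unstable_run)
```

Checking each batch separately prevents one strong store format or product category from hiding weak results elsewhere.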
How to Reduce Project Time
The data preparation stage is the bottleneck in introducing computer vision solutions. The entire dataset is most often labeled manually, so the process is time-consuming and labor-intensive.
To make this stage faster, we do not label data manually. To create a dataset, we use renders of each SKU to generate an artificial set of photos, and we use it to train the neural network to recognize goods. Back in 2019, AI experts from Google Cloud AI suggested this method for data labeling in a paper published on arXiv.org.
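The reason synthetic data removes manual labeling is that the generator already knows what it placed and where. The sketch below is a simplified illustration under stated assumptions, not the project's actual generator: images are small nested lists of pixel values, and the names compose and generate_samples, as well as the SKU identifiers, are hypothetical. A real pipeline would add perspective, lighting and shelf context.

```python
import random

def compose(background, render, top, left):
    """Paste a render onto a copy of the background; return image and bounding box."""
    image = [row[:] for row in background]
    for r, render_row in enumerate(render):
        for c, pixel in enumerate(render_row):
            image[top + r][left + c] = pixel
    height, width = len(render), len(render[0])
    bbox = (left, top, left + width, top + height)  # x_min, y_min, x_max, y_max
    return image, bbox

def generate_samples(background, renders, count, seed=0):
    """Create labeled samples by placing random SKU renders at random positions."""
    rng = random.Random(seed)
    bg_h, bg_w = len(background), len(background[0])
    samples = []
    for _ in range(count):
        sku, render = rng.choice(sorted(renders.items()))
        top = rng.randint(0, bg_h - len(render))
        left = rng.randint(0, bg_w - len(render[0]))
        image, bbox = compose(background, render, top, left)
        # The label and box come for free: no human annotation is needed.
        samples.append({"image": image, "label": sku, "bbox": bbox})
    return samples

# Hypothetical 8x8 empty shelf and two tiny SKU renders.
background = [[0] * 8 for _ in range(8)]
renders = {"sku_a": [[9, 9], [9, 9]], "sku_b": [[5, 5, 5]]}
samples = generate_samples(background, renders, count=10)
```

Each generated sample carries its class label and bounding box, so the dataset can grow to any size without a person drawing boxes by hand.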
As soon as we reach the required level of recognition accuracy, we proceed to the next stage. We run the neural network on real photos, additionally train the model, if required, and evaluate its efficiency.
The artificial data generation approach significantly reduces implementation time: manual image labeling takes from several months to a year, whereas we complete a project in 2-3 weeks, depending on the number of SKUs, their shape and variety.
For example, for a merchandising agency, we completed a pilot project in just two weeks. For the pilot, 6 cities, 45 retail chains and 694 retail outlets were selected. Goods Checker was used by 4 supervisors and 8 merchandisers. After two weeks, the recognition accuracy was 95% and higher.
Goods Checker helped speed up reporting by 70%: merchandisers now spend less than 20 minutes on reports instead of one hour. Moreover, outlet audits became 10-50% faster, depending on the size of the store.
Using Machine Vision for FMCG is an investment that pays off.
Adopting Computer Vision is Now Much Easier
Everyone is used to the deployment of complex technologies being long and expensive, since it requires buying equipment and heavily modifying applications. Today everything is different: every year, development and implementation are further optimized and simplified, and technologies become more accessible.
For example, companies offer their solutions under the SaaS model, and implement quick pilot projects. This approach allows customers to test the product in real-world contexts and understand whether it is suitable for their company and addresses their tasks.