Written by Adrian Beck
New insights on the use of video and video analytics in retail
This report, published in 2020, focuses upon providing a detailed and objective assessment of how and why video technologies are currently being used within the retail sector. In addition, it provides a critical review of both the potential and the challenges of utilising video analytics and the key strategic trends that can be seen in the retail industry.
It draws upon in-depth interviews with representatives from 22 retailers based in the US and Europe, with collective sales of nearly €1 trillion, equivalent to approximately 12% of the total US and European retail market, and operating in over 57,000 retail outlets. In addition, interviews were carried out with representatives from five major video technology providers.
The research summarises the way in which retailers currently use video technologies, grouping them into 8 use cases: Delivering Deterrence; Providing Reassurance; Ensuring Compliance; Undertaking Reviews; Informing and Enabling Business Decisions; Carrying out Monitoring; Generating a Response; and Identifying, Detecting and Alerting. It also describes some of the current ways in which video analytics are being used, focussed upon security and safety, and business intelligence applications. It offers a comprehensive review of some of the challenges of using video analytics in a retail environment, including issues around data accuracy, managing scalability, addressing bandwidth and processing concerns, difficulties of measuring and proving the ROI, and recognising the impact of contextual complexity and degree of clarity of purpose on video analytic performance.
The research concludes by bringing together some key strategic trends, including the importance of developing organisational leadership on video technologies in retailing, and of ensuring that businesses focus upon delivering cross-functional use, build greater capacity through data integration, and build video systems that meet organisational requirements.
Retailing is a constantly evolving and dynamic industry, hard wired into the economies of most countries. It is also an increasingly competitive and complex environment, requiring businesses not only to be agile and responsive to the needs of their customers, but also to make use of a growing array of technologies. While video systems have been used by retailers for some time, primarily to deliver security and safety, this report provides a stimulus to think more creatively, now and in the future, about how their role can be further developed to enable retailers to meet their core goals of satisfying customers and returning a profit.
The ECR Retail Loss Group is delighted to have had the opportunity to support this important and ground-breaking study. As a technology, CCTV has been a part of the retail environment for very many years – we are all very familiar with seeing it in retail stores and increasingly in many other settings as well.
However, longevity and ubiquity do not always translate into rationality – as an industry, we have not always been good at developing a clear understanding of what this technology is actually supposed to deliver – why do we invest in it and what do we want it to do? We are also hearing that AI and Machine Learning technologies will ‘transform’ retailing, bringing significant new opportunities to further satisfy customers and protect retail profits, although what this will look like in practice is still to be determined.
This new report on the use of video technologies in the retail industry by Professor Beck is therefore very timely indeed. As with all his work, he focusses a critical lens on the subject, offering a detailed analysis of not only how and why the retail industry utilises a range of video technologies, but also what the prospects, problems and practicalities are of employing video analytics in a retail setting.
We very much hope that this work will help you to develop your own CCTV strategy, enabling you to maximise your investment in this often used and constantly evolving technology.
We would very much like to thank both Professor Beck for undertaking this study and all the retailers and technology providers who so willingly agreed to help him with his research. As always, the ECR Group very much appreciates your commitment to contributing to the development of new thinking and ideas.
Chair of the ECR Retail Loss Group
This research focuses upon providing a detailed and objective assessment of how and why video technologies are currently being used within the retail sector. In addition, it provides a critical review of both the potential and the challenges of utilising video analytics.
It draws upon in-depth interviews with representatives from 22 retailers based in the US and Europe, with collective sales of nearly €1 trillion, equivalent to approximately 12% of the total US and European retail market, and operating in over 57,000 retail outlets. In addition, interviews were carried out with representatives from five major video technology providers.
For the most part, retailers invest in video technologies to address issues of security and safety, although there is a growing realisation that future business cases will need to embrace a more cross-functional model, ensuring greater value is realised from the investment.
Retail use of video technologies can be grouped into four modes with eight areas of focus:
There is growing interest in the potential of video analytics to contribute to the profitability of retail businesses. However, some respondents had concerns about the extent to which the claims being made about its potential could be translated into reality.
Motion triggers were by far the most common, particularly relating to building perimeters and areas with limited access authorisation. In addition, some were developing analytics to identify the movement of objects, such as high-risk products. Analytics were also being used to address losses associated with self-scan checkouts, and some respondents were reviewing the use of facial recognition. Pre-emptively recognising violent incidents is an area of experimental development but is thus far regarded as fraught with operational difficulties.
Five areas were identified as being the focus of use cases: improving customer service through better staff response times and product availability; generating heat maps and customer dwell times; people counting and queue monitoring; delivery alerts; and improving pick accuracy. Concerns were raised about developing a viable ROI for many of these analytics.
Users of video analytics pointed to the challenges of managing outputs from these systems, in particular the number of False and Overload Positives – the former being where the system has incorrectly identified a predetermined event, and the latter where the system correctly identifies events that are poorly defined and/or are overly common.
The key is developing sufficiently nuanced systems that are capable of accurately and consistently filtering the vital few events from the trivial many. If this does not happen, then there is a danger of them becoming the next deliverer of the ‘Crying Wolf Syndrome’, creating a presumption of error and ultimately alarm fatigue in those tasked with responding.
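The distinction between False and Overload Positives can be made concrete with a small tally. The following is only an illustrative sketch in Python: the `Alert` fields, the category labels and the sample counts are hypothetical assumptions, not data or methods from this research. It shows how reviewed alerts might be sorted into the two categories described above, and how small the share of genuinely useful alerts can become.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """One reviewed alert from a video analytic (hypothetical structure)."""
    event_occurred: bool  # did the predetermined event actually happen?
    actionable: bool      # did the alert merit a response from staff?


def classify(alert: Alert) -> str:
    """Sort a reviewed alert into one of three illustrative categories."""
    if not alert.event_occurred:
        # The system flagged an event that did not take place.
        return "false_positive"
    if not alert.actionable:
        # The event was real, but too poorly defined or too common to act on.
        return "overload_positive"
    return "useful"


def summarise(alerts):
    """Return category counts and the share of alerts worth responding to."""
    counts = Counter(classify(a) for a in alerts)
    useful_share = counts["useful"] / len(alerts) if alerts else 0.0
    return counts, useful_share


# Hypothetical review log: 6 false alarms, 12 real-but-trivial events, 2 useful alerts.
alerts = (
    [Alert(False, False)] * 6
    + [Alert(True, False)] * 12
    + [Alert(True, True)] * 2
)
counts, useful_share = summarise(alerts)
print(counts["false_positive"], counts["overload_positive"], counts["useful"])
print(f"Useful share: {useful_share:.0%}")
```

In this made-up log only one alert in ten warrants a response, which is the kind of ratio that could feed the presumption of error and alarm fatigue described above.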
The second major challenge relates to difficulties in applying video analytics at scale, not least because of the significant impact of context on their efficacy. Even subtle differences in the environment within which they operate can drastically affect their capacity to work consistently, requiring detailed contextual tuning. For some this extended process of fine tuning ended up becoming a major distraction to resolving the original underlying problem.
While the cost of computer processing decreases in real terms and many countries are rolling out improved Communications Networks, this research identified concerns about how bandwidth and computing power can impact upon the use of video analytics. This needs to be considered when deciding upon which analytics to employ and the extent to which they are going to be scaled.
As with many other interventions designed to positively address retail losses and profitability, getting to grips with measuring their impact is a recurring concern. Indeed, this has been a perennial problem for many video-based interventions, particularly where outcomes are often intangible and yet desirable, such as staff safety. But it is important to develop a co-ordinated and considered approach to measuring how and why video analytics impact upon a retail business, especially when their use cuts across a range of retail operations.
The efficacy of video analytics is undermined by two intertwined factors – clarity of defined objective and degree of operational retail complexity. As the retail environment becomes more complex, both in terms of process and shopper environment, and the link between stated objective and outcome trigger becomes less clear, then the likelihood of False and Overload Positives increases considerably.
While video technologies have been utilised in some form or other in retailing for over 40 years, the research found few examples of retailers where their role, purpose and capability to contribute to business success were clearly articulated. As detailed already, it is a technology with a broad ranging and rapidly evolving capability, but what seems clear from this research is the need for explicit leadership, greater application across retail functions, improved integration of video technologies with existing systems, and better alignment of video system design with organisational objectives.
As the potential for video technologies to impact upon retail businesses increases, it becomes ever more important that an overarching and cross-organisational approach to developing and managing their use is established. Retail organisations should appoint a ‘video technology tsar’ to positively and proactively lead on the current and future use of these systems. Their role should be to ensure:
While the Loss Prevention function has traditionally played a dominant role in the use of video systems, it does not necessarily follow that they should take on this leadership role. The move towards greater use of IP-based video systems and the value and importance of system integration may mean that the IT function increasingly takes on this responsibility. Either way, what is key is that the role of Video Tsar is established, recognised and empowered.
Making the case for any investment in retailing is increasingly subject to its applicability across an organisation – how might multiple business functions benefit? The application and use of video systems is no different – business cases need to show how they can provide value beyond the traditional confines of the Loss Prevention department.
Video systems are increasingly but one of several data sources that can be used to improve operations and business profitability. Technology providers need to work with their retail partners to ensure that, wherever possible, video data is fully integrated into the broader organisational information web to maximise impact and value.
Unless retailers understand how they want to benefit from investments in video technologies, system designs will continue to be piecemeal, partial and poorly configured. While future-proofing is hard, developing a clear strategy for how it will be used to improve profitability will be a key first step in ensuring that any proposed system design is fit for purpose.
Retailing is a constantly evolving and dynamic industry, hard wired into the economies of most countries. It is also an increasingly competitive and complex environment, requiring businesses not only to be agile and responsive to the needs of their customers, but also to make use of a growing array of technologies. While video systems have been used by retailers for some time, primarily to deliver security and safety, it is hoped that this report will provide a stimulus to think more creatively, now and in the future, about how their role can be further developed to enable retailers to meet their core goals of satisfying customers and returning a profit.
It is hard to overestimate the extent to which video technologies are now becoming embedded in societies around the world, although estimates on the scale of the market are often difficult to ascertain, particularly for the retail sector. One technology provider has suggested that the global market for video hardware and software used in retailing in 2020 could be as much as $1.7 billion, rising to perhaps as much as $2.2 billion by 2023 (2).
This is perhaps not surprising – the use of video technologies is not only longstanding, but also increasingly ubiquitous in this sector – it is hard these days to find a segment of retail space that is not under the gaze of some form of video system (3). Indeed, retailing has been at the forefront of the use of what are often called Closed Circuit Television (CCTV) systems since the 1970s and 1980s (4). Primarily deployed as a crime prevention and detection tool, and initially based upon analogue technologies, it is now going through a period of considerable and remarkable change (5).
This is being driven by a rapidly changing technological and social context, including: major developments in digitisation and associated analytical capabilities; significant reductions in capital costs; advancements in miniaturisation and networking capabilities; and growing societal acceptance of the routine surveillance of public spaces (6). As a result, the potential role and capability of video technologies in retailing are beginning to expand beyond their traditional role as merely a facilitator of safety and security – an insurance safety net should anything untoward happen.
Indeed, what can now be seen is the growing capability of video technologies to be used for a wide range of retail activities, including managing the retail environment, such as controlling access and store functionality, and playing a role in enhancing business profitability, such as enabling customer counting, monitoring queuing, shopper dwell times and stock levels (7).
However, capability does not always translate into actual use – just because a technology can do something does not always mean that it will be used or operationalised as originally intended, or indeed deliver on initial expectations. For instance, in the early days of the use of CCTV systems in public spaces in the UK, grandiose claims were frequently made about its capacity to significantly reduce crime, which were seldom found to have much veracity (8).
Similarly, within retail spaces, early studies found little evidence that these systems had much of a lasting impact on levels of retail loss, and it was almost invariably impossible to achieve a favourable Return on Investment (ROI) on retail loss reductions alone (9). Equally, the extent to which those tasked with using these systems have the capacity and working practices to fully utilise the functionality available is not clear – the difference between initial specification and practical application can sometimes be profound (10).
Moreover, the recent rapid growth in the development of video systems that can potentially provide an ‘analytic’ capability – in effect building some form of automation into the viewing, reviewing and responding process – has further heightened interest in utilising these systems more broadly across retail environments, although whether this brings a genuine ‘benefit’ is also the subject of much speculation at the moment.
Talk of video-based Artificial Intelligence (AI) and Machine Learning (ML) technologies being the next ‘big’ transformative change in retailing is currently dominating many trade shows and future gazing deliberations. But whether their introduction and use currently make sense, or whether they are indeed the most appropriate interventions to address the pressing and varied concerns of retailing, is certainly open to debate and critical review.
Given this, the ECR Retail Loss Group considered it important to commission research to look carefully at: how current and future video systems can/might contribute to the retail environment; the benefits they can bring; the lessons that can be learnt from those currently using them; and the ways in which this investment can be maximised to have the greatest effect.
As many retailers around the world begin to consider, and in many respects embrace, the transition from analogue to digital video systems, key questions are now arising about how this technology can be utilised to the greatest effect in the retail environment – how can this investment be maximised to benefit retail organisations the most, to create a much more cross-functional and multi-dimensional engagement with the technology?
To date, it has been a technology closely associated with, and focused upon, the delivery of safety and security in retail spaces. However, as this research will show, new use cases are emerging that are beginning to not only significantly enhance its capacity to deliver these objectives, but also enable it to contribute more broadly across retail organisations. This research, therefore, focuses upon providing a detailed and objective review of how video technologies are currently being used, particularly focusing upon the growing area of video analytics and the opportunities and challenges that they present.
It should be noted that this research was undertaken prior to the COVID-19 Pandemic that swept around the world in early 2020. At the time of publication, it continues to have a profound impact upon the world of retailing, although how long-lasting its effect will be is impossible to predict. It is also hard to know at this stage how it may influence the use of video technologies by retailers, although early reactions suggest it could play some role in helping to manage customer counting and social distancing (11).
This study is based upon in-depth interviews with representatives from 22 retailers based in the US and Europe (12). They represent some of the largest retail businesses in the world with collective sales in 2019 of nearly €1 trillion, equivalent to approximately 12% of the total US and European retail market (13). In 2018/19, these companies operated in over 57,000 retail outlets, and the majority also had extensive online operations. In addition, interviews were carried out with representatives from five video technology providers. In total, the research is based upon nearly 30 hours of interviews. Where possible, visits were made to some of the retailers to get first-hand experience of their video systems in operation.
Those retailers that took part in the research were self-selecting – the researcher conducted an online survey, distributed widely through existing contacts, representative trade associations and social media platforms, asking respondents to describe their use of video technologies. Those completing this survey were then asked whether they would be prepared to take part in more detailed research interviews. In addition, through contacts made via the ECR Retail Loss Group and the Retail Industry Leaders Association’s Asset Protection Leaders Council in the US, additional retail companies were approached for interview.
While this study is primarily interested in the views and experiences of retail users of video systems, some technology providers were also approached for their thoughts on aspects of the research. These were selected through existing contacts, size of operation and types of technology being developed. As with the retailers selected for inclusion, this research does not claim to be based upon a detailed and representative sample of the retail industry nor the companies offering video technologies. As such, the results from this research need to be treated with caution – they are only based upon what some companies are using and developing and it is recognised that, given the dynamic nature of this sector, there is likely to be a wide range of other video technologies and use cases in existence that are not included.
The companies that agreed to contribute to this research and have their name disclosed are: Abercrombie and Fitch; Adidas; Axis; Big Y; Boots UK; Carrefour; Co-op; Five Below; Genetec; JD Sports; John Lewis & Partners; Lowes; Marks & Spencer; Meijer; Morrisons; Next; Primark; Sainsbury’s; Tesco; Travis Perkins Group; Walmart; and Zebra.
Throughout this report the terms ‘video technologies’, ‘CCTV’, and ‘video systems’ will be used interchangeably to refer to any system designed to observe, collect, collate and analyse both analogue and digital images derived from various types of video cameras. For the sake of brevity, the term ‘Loss Prevention’ will be used to describe the retail function tasked with the management of retail losses.
The topic of video technologies potentially covers an extremely wide range of issues: from technical aspects such as system design and capability right through to why they are used and how. Inevitably, therefore, this study had to focus upon just some of these issues. The report is broken down into three substantive sections. The first is focussed upon developing a more detailed understanding of how retail companies are using video systems. The second section then goes on to look in some detail at the topic of ‘video analytics’, firstly understanding for what purposes they are being used by retailers and secondly, what their experiences of using them have been. The third part of the report then goes on to capture some of the strategic trends in the use of video systems in retailing identified by this research.
Because retailers have been utilising various forms of video technologies since the mid 1970s – 45 years or more – it may seem odd to begin this research by asking the deceptively simple questions of why do retailers invest in them and for what purposes are they used? However, evidence of ubiquity does not always translate into uniformity of purpose nor understanding of rationale. This section of the report, therefore, focusses upon providing a detailed review of not only why respondents to this research made the decision to invest in video technologies, but also the main ways in which they were currently using them.
Respondents had very mixed views on what the overall purpose of their video systems was although most erred towards what might be regarded as a more established ‘security’ orientation: ‘The main purpose is about the prevention and detection of crime – provide evidence to see what is happening to try and stop it’[RR17]; ‘It [video] protects assets and staff; it’s to deter internal and external theft and identify offenders’[RR9]; ‘Historically it’s been used primarily to catch/identify bad guys…deterring and detecting thieves and investigating crime’[RR13].
But for others, the purpose had become much more blurred and unfocussed: ‘What’s the primary objective of CCTV in our business, to be honest, I’m not really sure what it’s for – from a theft perspective it’s pretty limited’[RR6]; ‘We need to figure out our strategy to understand what do we want video to do; if we are not careful we will not have a joined up approach’[RR17]. There was certainly a growing realisation from some respondents that for the investment in video technologies to make sense, then its purpose had to be increasingly more than just delivering a form of securitised comfort blanket:
As will be seen below, this lack of an overarching common view of the purpose of video technologies in retailing is well reflected in the breadth of use cases to which it is now being put, and while security and crime prevention undoubtedly remain a dominant driver of purpose, other functionality is increasingly being drawn into the broader rationale for using video systems.
Since their early introduction into the retail environment, video systems have presented a challenge in terms of advocates showing a clearly identifiable Return on Investment (ROI), particularly relating to the tangible components of retail loss. For instance, installing CCTV has rarely translated into a measurable and sustainable reduction in shrinkage (14). But, in many cases, their role has been seen as delivering much more intangible benefits that are less easy to capture in a business’s P&L, such as staff and customer safety and protecting organisational reputation: ‘Our ROI is much more subjective and intangible but we know that store management are more likely to leave if violent incidents are not resolved’[RR1].
Indeed, the view that video systems are often a form of ‘insurance policy’ – retailers showing due diligence when things go wrong in their spheres of influence and responsibility – was a common response to questions relating to how a business case was developed. This was particularly the case when it came to law enforcement: ‘There is a growing expectation from the police that every store will be connected and so we need to make that happen’[RR5].
As will be discussed below in more detail, there appears to be a broad range of use cases for utilising video technologies in retail, but the challenge has often been collating evidence to either make or confirm a persuasive business case – the data can be hard or practically impossible to gather unless systems have been actively put in place to collect it.
For example, multiple respondents to this research highlighted the positive role they believed their video systems play in managing the issue of false Health and Safety claims made by customers and staff, reducing their insurance premiums by showing due diligence, and meeting requirements set by governmental licencing authorities for the sale of alcohol. All of these clearly make a difference to the profitability of a retail business and yet too often, the ‘value’ of video systems in these types of examples is not actively measured, understood and collated into an overarching business case for their use.
No doubt the ‘intangible’ benefits of ensuring safety and security will prevail at a macro level for justifying the continuing use of some forms of video technologies, but if retail businesses want to expand its use and begin to take advantage of developments in its capability, then they will need to adopt more cross-functional and inclusive business models that more accurately capture the full value they are capable of offering.
Respondents to this research provided a rich tapestry of use cases and applications for video technologies in their businesses and so it was important to try and begin to categorise them into discrete areas of use. As can be seen in Figure 1, respondents’ uses can be broadly grouped into eight areas:
Since its introduction specifically into retail environments and more broadly in public spaces, a key tenet of video technologies has been their supposed ability to create a deterrent effect – to put off would-be offenders because they are concerned about an elevated risk of being caught. This typically works in two ways: first, creating an increased sense of risk that a security response will be triggered which will lead to a higher chance of being caught in the act; and secondly, an elevated sense of risk that offenders will be identified and subsequently caught and prosecuted (15).
In both scenarios the would-be miscreant needs to not only be aware of the presence of the video system but also believe it delivers a credible risk – either somebody is watching in real time and can respond, and/or the recorded images will be capable of being reviewed and used at a later date for identification and possible punishment (16).
Both in retail spaces and across a wider range of environments within which video systems are now employed, proving this deterrent effect has been difficult – the complexity of measuring deterrence is well known – how can you prove that something didn’t happen that might have happened had the video system not been present? Certainly, studies have been undertaken to try and measure changes in key indicators such as the number and types of crimes and levels of loss before and after the introduction of video systems, but there is often a plethora of confounding factors that undermine the reliability of the results (17).
Of particular concern is the extent to which the growing ubiquity of video systems throughout many societies can undermine, and is undermining, any form of deterrent impact they might be able to deliver – are the general public and offenders alike simply becoming oblivious to the technological ‘wallpaper’ that adorns many modern urban environments? This is certainly a sentiment that was expressed by many of the respondents to this research.
A number of respondents were resigned to the view that their video systems no longer deliver a generalised deterrent effect upon would-be offenders: ‘I don’t think people care anymore if there is a camera system in a store’[RR9]; ‘So we know we don’t see the deter value in having CCTV anymore – that’s gone now’[RR15].
Perhaps one of the most overt ways in which this deterrent potential has been utilised in retail stores is through the use of Public Video Monitors (PVMs), which typically show consumers, via a large display, an image of themselves (and others) entering or visiting particular parts of the store. While some limited research (18) has suggested they do have some impact on rates of loss, one respondent to this research was summarily dismissive of their value: ‘We have stopped fitting PVMs – because they are as effective as a bell box [external burglar alarm box]; everybody has them and nobody needs to be reminded that you have a camera system because everybody’s got one’[RR9].
While the generalised deterrent capacity of video systems was regarded by some as waning, there was evidence that the focus of generating video-based deterrence was shifting away from the general towards the specific – making it much more oriented towards the individual miscreant rather than consumers in general. This trend of ‘personalising’ video deterrence was evident in several different technologies being utilised, not least Body Worn Cameras (BWCs), Face Boxing technologies, Selfscan Personal Display Monitors (SCO PDMs) and to some extent patrolling CCTV Vans.
While the use of BWCs is now relatively common within the law enforcement community in many countries, their use in the retail environment is still a recent development, with several companies in the UK championing their use. Of the 22 companies taking part in this research, 13 (around 60%) were either using or trialling some form of BWC. The technology, while varying in design and functionality depending upon the supplier and the retailer requirement, was found to be employed for five reasons:
Of the 13 companies that were using or had tried using BWCs, the majority (70%) employed them only on in-store security guards, while just two retailers had taken the decision to offer them to retail staff, although a third company was contemplating doing this. A further four companies were using them for different types of staff and settings: one was employing them on employees responsible for home delivery; another had provided them to security guards operating in distribution centres and at the company’s headquarters; a third had given them to security staff patrolling vacant properties; and a fourth utilised them on security staff issuing banning notices to known shop thieves.
Several respondents expressed their reluctance to deploy BWCs on retail staff, concerned about the type of image it might portray:
… there is that ‘military’ effect on how staff look – do we want our staff to look like the military? There is a nervousness about whether we want this in all our stores. Is it brutalising the retail space? For some it reassures, for others it’s a barrier [RR18].
Others were also concerned about the way in which they may be perceived when used by retail staff: ‘There are concerns from some pockets of the business about its impact on customer perceptions of the retail space – is it not safe?’ [RR16].
Whilst it was not possible to get extensive and definitive data on the extent to which the use of BWCs had impacted upon levels of violent incidents in retail stores, seven of the 13 companies had carried out some form of evaluation/review, often based upon small trials in a few stores looking at their impact on the number of recorded incidents. In addition, some had surveyed the wearers of BWCs to ascertain their views on using the technology.
All those that had carried out a trial found that the use of BWCs was associated with a reduction in the number of incidents of violence and verbal abuse: the lowest reduction recorded was 30% while the highest was 80%. Overall, when averaging across this data, the use of BWCs was associated with a reduction in the number of violent and verbal abuse incidents of about 45%.
For the most part, staff attitudes to the use of this technology were very positive: ‘It’s been a very positive story for us; feedback is really good’ [RR2]; ‘Staff absolutely love them’ [RR6]. None had received any negative feedback from customers and indeed some argued that customers viewed them as an indication of the extent to which the business was taking safety and security seriously.
A small minority of respondents did identify some issues from users – one suggested that their feedback had been more mixed with some staff concerned about being filmed and having their actions scrutinised. Another staff survey found that while one-third of users had felt safer wearing the BWCs, the same proportion did not feel any safer. For one company their trial use was brought to an end because of concerns about how they were being used in retail stores: ‘the problem was with compliance – being used when they shouldn’t have been; until we can get this right, nobody is going to be using them in retail’[RR15]. Notwithstanding these views, most of the feedback from the 13 companies employing this technology was very positive indeed.
In terms of next steps, respondents suggested several ways that they might move forward with this technology:
While certainly less prevalent than BWCs, the use of ‘Face Boxing’ on Public View Monitors (PVMs) is another example of how retailers are beginning to try and ‘personalise’ the deterrent component of their video systems. This is a relatively simple technology that digitally ‘draws a box’ (some companies used a green box, others a red box) around the head/face of all those entering a retail store.
The general idea is that it gives the impression that the company may be using some form of facial recognition (even though this technology was not actually in use) or it simply catches the attention of the consumer: ‘We use Face Boxing on PVMs – nothing being stored; it is a relatively crude technology to give the impression we have FR [Facial Recognition] in operation’[RR20]. For another it was felt that this technology better drew the consumer’s attention to the presence of video systems in the store:
We have put in 40” monitors in stores that puts a green box around customer faces as they walk in – you can’t but notice that you are on the monitor – hung very low down at the entrances. We are trying to make it more obvious that the customer is on CCTV [RR2].
As with many other video systems, it is very difficult to know the extent to which this type of technology has an impact on levels of loss or incidents of violence and verbal abuse, although one respondent claimed that it had had an effect: ‘in the stores where it has been used we have seen a drop in losses and a drop in incidents of violence’[RR5].
A third technology which highlighted this drive towards generating a more personalised video-based deterrent was the increasingly widespread use of video screens attached to Fixed Self-checkout (SCO) terminals showing an image of the person using it and, in some cases, the products being scanned. As other ECR research has shown, the risk of errant behaviour at SCO terminals is a growing concern, particularly from users either not scanning items at all or misrepresenting items to secure a lower price (21).
Because of the relatively un-supervised nature of this type of checkout and the elevated risks associated with it, retailers have been introducing a range of approaches (such as video-based technologies, revised store processes, changes to store design, and improvements in staffing levels and training) to try and address their losses.
Emerging research suggests that much of this loss may well be due to a group of shoppers who would not normally engage in overt shop theft behaviour now taking advantage of the perceived low-risk opportunities these systems offer – previously ‘honest’ shoppers now engaging in relatively small-scale, hard to prove theft.
While this is a disturbing outcome from the introduction of SCO technologies, it is also the case that this new group of miscreant opportunistic shoppers is highly likely to be very risk averse – they are relatively easily deterred or ‘nudged’ back to honesty. For the most part, they would rarely view themselves as shop thieves or have entered the retail store with an explicit agenda to steal at scale.
Therefore, any ‘amplification’ of the risk of getting caught is likely to drive them back to honesty. In this respect, the use of SCO PDMs is an example of retailers further personalising the risk amplification process: ‘look, we are recording what YOU are doing at this SCO machine’. Respondents to this research echoed this: ‘SCO is a big risk for us, we need to show the shopper that they will get caught and we think PDMs help to do this’[RR20].
It is still early days in terms of understanding the effectiveness of any of the various ways in which retailers are trying to moderate SCO-related losses, but one major UK Grocer has shared some of their findings relating to the use of SCO PDMs. In a trial based upon measuring the change in store shrinkage in 10 stores before and after the introduction of SCO PDMs, they found that losses were significantly lower in nine of these stores (22). While encouraging, more research is required to understand the extent to which these systems impact upon store losses and whether their ‘potency’ is eroded over time as has been found with other forms of video-based deterrent (23).
While not as strictly ‘personalised’ as the examples above, one of the respondents to this research shared their experience of trying to generate a more credible and focussed deterrence to counter criminality around their stores. This was done using highly overt vehicles equipped with video and audio technologies that could be targeted on high risk areas/stores:
They patrol in car parks and have a guard inside; they are a good deterrent; the guard will respond if something is happening. It looks like a police van, has a mast that goes up and also has a PA system [RR14].
Other companies had also explored the use of overt semi-mobile video systems deployed in car parks focussed on both deterring crime and, in some instances, helping to collect evidence. As with the other interventions described above, it was not possible to collect any verifiable data on their impact over time beyond anecdotal evidence provided through these research interviews.
The final area relating to deterrence that was discussed by respondents was the impact the introduction of video systems can have upon levels of internal theft and process-related losses. Very often the focus of attention is upon the threat from external actors, namely shop thieves, but numerous research studies have highlighted the threat that can come from within, from the people retailers employ in their businesses (24).
Video systems can be installed to try and manage this issue in various ways, such as locating cameras above Points of Sale (PoS) to deter staff from stealing cash and/or giving stock away to family and friends (sweethearting), positioning cameras in backroom areas to deter the pilfering/eating of stock, and within secure spaces, such as cash offices, to again deter the theft of cash. For one respondent to this research, the introduction of video systems into their stores was primarily focussed upon driving down internal losses:
I believe that the reason this is happening [lower losses] is because of the reduction in the risk of internal theft. People act differently when they believe they are being monitored. Also has helped to improve store execution – stores begin to tidy up the backrooms, hallways, store floor areas. Appears to have significant deterrent value for staff in particular [RR7].
This company had measured the impact of this investment and argued that every store installation thus far had received a positive ROI within one year based upon a reduction in recorded store shrinkage: ‘When we invest in CCTV, the shrink reduction pays for the cost of the installation in the first year’[RR7]. Further data from this company showed how the shrink savings continued for a further 2 years after installation before levelling out – this was compared against a sample of control stores where they had yet to install CCTV.
These are certainly impressive findings, indicating a good result in stores where a video system is installed where nothing was previously present. However, few retailers will be in a position where they have not already introduced some form of CCTV into their stores, and most are therefore unlikely to be able to observe similar levels of success.
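The comparison against control stores described by RR7 can be sketched as a simple difference-in-differences calculation followed by a payback estimate. This is a minimal illustration only: the function names and every figure below are hypothetical and are not taken from the respondent's data.

```python
def shrink_saving(treated_before, treated_after, control_before, control_after):
    """Annual shrink saving attributable to the installation.

    Difference-in-differences: the fall in shrink in stores that received
    CCTV, minus the fall seen over the same period in control stores.
    """
    return (treated_before - treated_after) - (control_before - control_after)


def payback_years(install_cost, annual_saving):
    """Years needed for the saving to cover the installation cost."""
    if annual_saving <= 0:
        return None  # the installation never pays for itself
    return install_cost / annual_saving


# Hypothetical annual shrink figures for one store and its control group.
saving = shrink_saving(100_000, 70_000, 100_000, 95_000)
print(saving, payback_years(20_000, saving))  # 25000 0.8
```

A payback figure below 1.0 corresponds to the respondent's claim that the shrink reduction covers the installation cost within the first year.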
While deterrence was a widely articulated passive use case for video systems in retailing, using them more actively to undertake reviews of recorded images was almost as prevalent in the companies taking part in this research. Globally, retailers are probably one of the largest non-governmental routine collectors of video images: ‘A several hundred store chain with 10 lanes of checkout will probably have 1,000 years of video stored at any point in time!’ [RR15]. This enormous library of images represents both a challenge and an opportunity to retailers – a challenge in terms of how to retain and access this information, and an opportunity to review activities that may be of value to the business.
It is important to recognise that this form of video review is not generally regarded as video analytics – the selection of images to review is rarely automated or driven by algorithms but is instead prompted by other data triggers or manual interventions. At the moment, most of this reviewing activity is focussed upon events that pose, or may pose, a risk to a retail organisation, and is therefore typically undertaken by loss prevention-focussed staff.
Most respondents to this research identified three main areas where post hoc review of video images was taking place on a regular basis: collecting evidence on persistent offenders; scrutinising exception reports from PoS systems; and reviewing incidents relating to health and safety. In addition, several respondents also described some of the other ways in which they were using their video systems to review events and activities, but these were much less common and are therefore described together at the end of this section.
For numerous retailers their post incident video reviewing process was focussed upon trying to collate sufficient evidence on persistent offenders to facilitate either their prosecution by the criminal justice system or the imposition of company-specific banning orders: ‘We build dossiers on repeat offenders and then give them to the police’[RR5]. Others echoed this use and the value video images brought to this work: ‘We create a folder with all the available evidence including CCTV and then give this to the police – without video they won’t do anything’[RR10].
For those retailers that had developed a centralised security hub, this was one of their core activities although for businesses with many physical stores, delivering this at scale was at best challenging. In addition to building evidence files on persistent offenders, some respondents used the review of shop theft incidents to alert store staff of potential criminal gangs that may be heading in their direction: ‘We send reports out to stores warning them of potential gangs in the area – what they look like and what they might target’[RR10].
Only a very small minority of the respondents to this research who were undertaking this type of activity were able to put a monetary value on the ROI – not least because it is very hard to maintain an accurate flow of data once a persistent offender case file is shared with the police or store staff are given advance warning of a criminal gang operating in their area. Given all the myriad other factors that impact upon shrinkage in a retail store, it is almost impossible to show any causation between changes in levels of loss and the post hoc review and collation of images of persistent shop thieves. That is not to say that this type of work has no value, but more a reflection on the challenges of justifying the business case for investing in video for this purpose.
A second area of video review related to assisting in the assessment of retail transactions deemed to be ‘unusual’ or exceptional. This is an example of where video played a confirmatory role when connected with other systems, such as data mining software utilised to analyse PoS and Refund data: ‘The [PoS] data is always the instigator and the video the verifier’[RR17].
Numerous systems exist that ‘mine’ PoS data for unusual or exceptional activity, such as staff members who perform well above the average number of refunds or voids or carry them out at unusual times such as when the store is officially closed to customers. It is then the task of loss prevention investigators to determine the veracity (or not) of these ‘exceptional’ transactions and, where appropriate, build a case to act against a member of staff deemed to be behaving inappropriately. Without the capability to review video images of these transactions and events, it would inevitably be a time-consuming and challenging process.
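The kind of exception mining described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any respondent's system: the `Transaction` record, the opening hours and the z-score threshold are all hypothetical choices.

```python
from dataclasses import dataclass
from datetime import datetime, time
from statistics import mean, pstdev


@dataclass
class Transaction:
    staff_id: str
    kind: str            # "sale", "refund" or "void"
    timestamp: datetime


# Assumed trading hours; a real system would hold these per store.
OPEN, CLOSE = time(8, 0), time(22, 0)


def flag_exceptions(transactions, z_threshold=2.0):
    """Return (staff with unusually many refunds/voids, out-of-hours refunds/voids)."""
    counts = {}
    for t in transactions:
        counts.setdefault(t.staff_id, 0)
        if t.kind in ("refund", "void"):
            counts[t.staff_id] += 1

    avg = mean(counts.values())
    sd = pstdev(counts.values())
    # Staff whose count sits well above the average for all staff.
    high = sorted(s for s, c in counts.items()
                  if sd > 0 and (c - avg) / sd > z_threshold)

    # Refunds or voids processed while the store is closed to customers.
    out_of_hours = [t for t in transactions
                    if t.kind in ("refund", "void")
                    and not (OPEN <= t.timestamp.time() < CLOSE)]
    return high, out_of_hours
```

Each flagged item is only a prompt for investigation; as the respondents stressed, the video is then used to verify what actually happened.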
Respondents using their video systems for this purpose highlighted the value and importance of having them synchronised with their PoS systems/Data Analytic software: ‘We have the CCTV system connected to PoS – we can review video at the same time as the transaction – the key is having time synchronisation between cameras and PoS system otherwise searching can be time consuming’[RR7]. Another illustrated the value of doing this:
We use an exception reporting system called [name of provider], connected to CCTV and they are synchronised in terms of time – this probably saves 15 minutes per event, 50 events per week adds up to a considerable amount of time – allows us to review a lot more exceptions and so increases the chance of finding incidents [RR1].
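The value of time synchronisation, and the saving quoted by RR1, can be illustrated with a short sketch. The function names, the clock-offset parameter and the 30-second padding are assumptions made for illustration; only the figures of 15 minutes per event and 50 events per week come from the respondent.

```python
from datetime import datetime, timedelta


def video_window(pos_time, clock_offset=timedelta(0), padding=timedelta(seconds=30)):
    """Map a PoS transaction time to the stretch of recording to review.

    clock_offset is (camera clock - PoS clock); it is zero when the two
    systems are synchronised, which is what makes the lookup immediate.
    """
    centre = pos_time + clock_offset
    return centre - padding, centre + padding


def weekly_hours_saved(minutes_per_event, events_per_week):
    """Convert a per-event saving in minutes into hours saved per week."""
    return minutes_per_event * events_per_week / 60


start, end = video_window(datetime(2020, 1, 6, 14, 5, 0))
print(start.time(), end.time())      # 14:04:30 14:05:30
print(weekly_hours_saved(15, 50))    # 12.5
```

On the respondent's own figures, synchronisation frees up roughly 12.5 investigator hours per week.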
While some had begun to look at ways in which the video data could offer more of an analytic capability, especially when trying to identify fraudulent behaviour amongst the very large volume of customer refunds most retailers have to deal with, none felt they had adequately addressed the high volume of false positives that are generated at the moment, something which will be discussed in more detail later in this report.
Where the post hoc review of video images was deemed to be of particular value was in relation to ‘accidents’ occurring in retail premises, such as customers and employees tripping and falling: ‘If somebody has an accident, there will be a review and then if the injured person files a claim then the video would be reviewed’ [RR17]. Depending upon the part of the world in which a business operates, this is probably one of the most expensive areas of liability a retailer faces on a regular basis. In the UK for example, over 11,000 retail employees a year are involved in slips and trips resulting in a serious incident and it has been estimated that health and safety related incidents were costing in the region of £5.2 billion a year (25).
While most health and safety incidents are probably genuine, respondents to this research highlighted the value video data can play in checking the veracity of any given claim: ‘In a depot there have been 4 or 5 claims which have been proven to be false – historically we would have just paid out to protect the brand but now there has been a cultural change as a consequence of putting in the cameras’[RR17].
This was echoed by other respondents who pointed towards a pre-video corporate culture of presumption of liability and payment of claims due to a lack of easily accessible counter evidence. Several respondents were able to offer, albeit rather anecdotally, evidence of the positive impact of now using video systems to review the degree to which their businesses were genuinely liable for health and safety-related incidents.
For example: ‘After the first year [of installing the video system] we had the same number of claims but we paid out 15 times more on the claims prior to putting those cameras in – they [the cameras] literally paid for themselves within the 2nd quarter’[RR4]; ‘I reckon we have saved over half a million in false claims in the DC – such as showing staff messing about on forklift trucks before they got injured’[RR5].
For another respondent who was in the process of updating their video system, part of the rationale was based upon reducing the corporate liability bill:
The Risk Department is exceptionally excited by the prospect [of having access to the video system] and Customer Service are now looking at the video to see whether the company should pay out, and how much – I think half of the claims we currently get are false [RR18].
Two respondents also highlighted a related financial benefit of having a video system capable of effectively monitoring and limiting a business’s exposure to false claims, namely a reduction in the cost of insurance premiums: ‘At the moment we pay out on nearly 90% of claims against us due to no CCTV … we believe we can reduce our insurance premium by 50% by showing them the processes we [now] have in place’ [RR18]. Another respondent was prepared to put a value on now having an IP networked video system and its impact on reducing their insurance premiums:
… I would probably say between 2-5%. Our premiums are experience rated, meaning that the insurer looks at our claims experience and provides premium rates from that. CCTV has probably reduced the value of many claims and entirely defended others, therefore I would be confident that our claims experience is much better for having the CCTV [RR23].
Other respondents also pointed to the way in which having incidents recorded reduced their liabilities through having the capacity to show due diligence when a case went to court: ‘We have seen certain types of claim – mainly acts of violence – reduced because we can now show due diligence on the part of the company in their efforts to try and prevent the incident from occurring’ [RR11].
Finally, two companies described how they used their video systems to proactively manage their exposure to the risks of health and safety claims. The first tasked SOC staff in quiet periods to carry out random spot checks on stores looking for breaches, such as blocked fire escapes. The second used their networked video system to prioritise training and maximise the use of limited resources:
Our safety manager will, rather than travel to 80 stores and spend all his time behind a steering wheel, he will do a video audit to identify which stores he should go to and do training – the video helps to decide where to go to first [RR4].
As with the other areas where video was being used to review incidents and events, little attention was generally paid to accurately measuring the financial value it was bringing to the business. However, in many respects, this is more a case of a lack of prioritisation than a challenge of data collection – the methodology to understand the value video systems can bring to the management of health and safety-related incidents is relatively straightforward to envisage (26).
As will be discussed later in this report, if retailers wish to reap the benefits from the use of their video system in this area, then they will need to design it accordingly – as one respondent put it: ‘Prior to its installation it was always the case that a slip and fall happened where we didn’t have a camera’[RR4].
In addition to the three main areas described above, several respondents also described some of the other ways in which they were reviewing their recorded images although these were far less common.
Some respondents shared how they used their video systems to remotely undertake routine store audits based upon recorded data. This was done primarily to give constructive feedback to a sample of stores, such as those that had recently opened or were experiencing unusual levels of loss. One respondent described their process:
We do CCTV observations or snapshots – take a period of time when a member of the team will take one hour to look at previously recorded information – such as review engagement with customers at PoS, then share the results with the rest of the organisation [RR7].
Another described how they utilised this type of audit process to: ‘… create an illusion of centralised watching and checking’ [RR19], in effect trying to generate a Foucauldian-style response in store managers (they are not sure whether they are being watched or not and so temper their behaviour accordingly) (27). This is an interesting idea but, again, its impact is almost impossible to measure.
One respondent described how they used their recordings of in-store violent incidents to review how staff had responded to them, with a view to improving their employee training. In this respect, it was an interesting example of video review enabling the business to understand mistakes being made by their staff: ‘we found that in 24% of incidents staff were “overly” involved, in other words not doing what we had told them to do’[RR5]. No doubt including these images in future training sessions would be a very useful way of embedding better staff practices around the management of these types of incidents.
An innovative use case offered by another respondent related to using the review of recorded images to investigate discrepancies in deliveries from third-party suppliers. In this example, a retailer received direct-to-store deliveries from suppliers and where a store manager subsequently queried the accuracy of any given delivery, video footage was reviewed by the SOC (28). It was recognised that this was only really possible where camera coverage was good enough and the discrepancy was of sufficient size to make its identification possible, such as missing pallets or roll cages, but it was viewed as another use of the review capability of both the video system and the SOC.
Finally, one respondent described how they used video review to ascertain the veracity or not of a certain type of customer claim via their online business. In this example, the retailer had positioned cameras over the area where customer orders were collated and packaged ready for despatch. When a customer made a claim that their delivered order was missing several items, these images could be reviewed to understand whether the claimed missing items had been included when the order was originally packed.
As with most video review activities, the retailer was not able to put a value on this capability, but they did suggest it was in use ‘extensively’. As with some of the other examples described above, it would not be difficult to begin to develop ways to measure this value, which could then be used to help build a more cogent and persuasive business case for the use of video in retailing. It would seem the challenge lies in businesses developing a more co-ordinated and cross organisational management of the utilisation of their video systems, something which will be discussed later in this report.
The third way in which some respondents were utilising their video systems was to use them to enable real-time viewing. When CCTV was first introduced in a range of settings, this was considered to be the primary way in which it would be used – operators ‘watching’ over surveilled areas ready to react to any given incident or event (29). This is best characterised by the ‘wall of monitors’ often seen in some SOCs and other CCTV control centres (30).
However, the reality in most cases is that using humans to try and identify miscreant behaviours via video feeds from what can sometimes be tens of thousands of cameras, often covering large, complex, crowded and rapidly changing environments, is at best ‘challenging’ (31). Many thefts in retail stores can be quick, discreet and covert events that are hard to identify at the best of times. When they happen in large crowded stores, the odds of most video operators watching the ‘right’ screen at the ‘right’ time to witness the event are vanishingly small (32). This realisation has led many retailers to significantly curtail the extent to which they have staff actively monitoring real time video feeds – achieving a ROI is extremely difficult if not impossible in most cases.
However, several respondents to this research did identify some of the limited ways in which real time monitoring did take place within their organisations.
One company made use of a HQ-based security function to undertake real time monitoring of selected stores at night:
The HQ gatehouse will monitor in real time selected stores – every night they get a list of stores to review. If they see anything they will report it and every morning they then complete a report which is shared with the [reviewed] store and the LP team. Stores are chosen mainly on risk but also physical changes, a tip off, new stores etc [RR14].
Similarly, another company utilised video review at night to monitor the activities of replenishment staff: ‘We know the staff working night crew are aware that Big Brother is watching – the phone is ringing seconds later saying “hey why did you in the red coat walk out the door just now, you didn’t call me”’[RR4]. In both these cases, the much less dynamic and complex nature of the night-time store environment probably made the identification of ‘exceptional’ activities much more likely to be witnessed by those watching.
Another company provided real time monitoring when a store reported a particular type of incident, which would then be managed centrally with the SOC communicating with local loss prevention teams to deploy resources where necessary and collect evidence. Interestingly, one company described how their SOC used real time monitoring to provide reassurance to staff feeling concerned about their safety: ‘[Store] Staff can phone the SOC if they feel vulnerable entering and leaving a store so that they can keep an eye on them’[RR2].
Finally, another respondent described how they used real time centralised monitoring to undertake ‘virtual’ store audits in some of their stores to check on process compliance. It was stressed that this was done in a ‘positive’ way – providing constructive advice and feedback – rather than as a draconian ‘big brother’ intervention, because of concerns about staff morale and accusations of HQ ‘spying’ on stores.
In the not too distant past real time in-store monitoring was probably one of the biggest uses of video systems – a security guard either in a darkened room watching multiple screens, or at a CCTV podium situated near the entrance of the store. While the latter remains a relatively common sight in larger retail stores, the reality for some respondents was that this was as much about creating a deterrent effect as it was about actually enabling guards to identify miscreant behaviour in real time – part of the development of an illusionary ‘theatre of security’ to amplify a sense of risk (33).
In an effort to make video feeds more accessible as the role of the guard changes, one company was trying out new ways to do this: ‘We are going to try out “mobile podiums” – the guard will carry around a mobile device that will give them access to the CCTV store system’[RR14].
No doubt in certain locations and types of retail environment there may well continue to be a case for real time monitoring of multiple video feeds, but the reality of ever growing staffing costs, increasingly complex retail environments, and the rapidly developing capabilities of video analytics, may well mean that this use case becomes less evident in the retail environment of the future.
As the application of video systems in a wide range of environments expands around the world, some retailers have increasingly found that their use is no longer a business choice but rather a statutory requirement for them to be able to either operate in particular locations or trade certain types of goods (34). In addition, and as detailed earlier, some retailers have also pointed towards the way in which the use of video systems can encourage greater compliance on the part of employees. The issue of Compliance, therefore, is the fourth use case for video systems.
Like many advancements in technologies, the early use of video systems largely developed within a regulatory vacuum; users had little guidance or control over how and where they should or should not be used. Since then, many countries have developed laws and regulations covering the use of video systems in public and indeed private spaces, with some setting very strict limitations on who and what can be surveilled and for how long images can be retained (35).
However, in some countries, there are also statutory requirements for the use of video systems by retailers when operating in what are regarded as high crime/risk areas and when selling certain age-restricted products such as alcohol and cigarettes. While some respondents were not fully clear exactly why some local authorities tied the issuance of alcohol licences to the installation of CCTV (‘we don’t know why they want us to have it…’ [RR17]), it was evident that this consideration was part of the rationale for its introduction. As with other use cases, it was not possible to identify the monetary value that this brought to any given retailer, although it can be imagined that for some grocers in particular, not having the opportunity to sell alcohol would create a major dent in sales and profits.
One other issue relating to the way in which video systems were used to ensure compliance related to their impact upon staff behaviour. As was discussed earlier, one retailer who was introducing video into their stores felt that it had a positive effect upon staff compliance with company processes. Another respondent thought it had an impact not only in this respect (process compliance) but also in terms of the way in which video moderated staff responses to incidents of customer theft: ‘… staff won’t go out on a limb and ignore company policy, such as tackling a shoplifter outside the store if they are going to be caught on cameras as well’ [RR17].
This is a similar moderating effect to that described by some respondents in relation to the use of Body Worn Cameras, where the recorded evidence could be damning for both the offender and the wearer of the device. Of course, this type of effect (staff compliance) overlaps with the issue of deterrence as discussed earlier and requires staff to be not only aware of the presence of video systems but also to believe they are effective and will invoke a reaction.
Four of the participating companies had developed a particular use case for their video systems that utilised them to create a centralised response to incidents of violence against store staff.
In all cases, some form of panic alarm device had been installed in retail stores which, when triggered, would raise an alert at a Security Operations Centre (SOC). Staff in these centres would then begin viewing the live video feed from the store activating the alarm and take some type of action to try and encourage the offender to desist. This could take the form of a recorded audio message or a real time announcement addressing the offender directly. One respondent graphically described how they thought this worked:
Some call it the ‘voice of God’ effect – shocks offenders into changing their behaviour; it is also a real voice and not a recording – can make the message context specific: ‘you in the red jacket…’ [RR5].
Another respondent stressed the importance of having the capability to talk directly to the offender: ‘operators have the licence to decide what they broadcast given what they can see and what they think might put them off’[RR19]. They also stressed the importance of ensuring that those who do respond are given good training and support to deal with what could be stressful situations.
Another retailer using this type of system also highlighted the way in which it fostered a better relationship and response from the police: ‘they are more likely to respond if you have a visual verification of what is happening’[RR17].
Retailers offering this ‘service’ to their stores could only provide limited evidence on the extent to which it was impacting on levels of store violence although all thought it had been very effective in not only reducing the number of incidents but also acting as a form of reassurance to store staff: ‘they feel better that somebody is there; we get staff to point out the cameras to offenders’[RR19].
Undoubtedly, this is not a ‘cheap’ utilisation of a video system – it requires a centralised, adequately staffed and networked facility to enable it to work. But, as will be discussed below, if a retail business has taken the decision to invest in a SOC for other primary reasons, this could be a further application to bolster the business case, especially for those retailers that have stores in high risk areas and/or sell products that make them more prone to violent incidents occurring, such as pharmacy, alcohol and cigarettes.
As discussed earlier, one of the challenges of measuring the ‘value’ of video systems is that they can be tasked with delivering outcomes that are often hard or impossible to quantify financially – the intangible benefits that are much discussed and appreciated, but rarely capable of being included on a business’s P&L. A good example of this is the sixth use case identified by respondents, which is Providing Reassurance.
As one industry sage has often put it: ‘a video camera won’t leap off the wall and stop a member of staff being attacked’, which is true, and yet frequent security surveys have shown that staff and members of the public are often reassured simply by the presence of CCTV (37). So how does this work? It could be that the video system provides a visual reminder of four things: security is being taken seriously by the business; somebody is looking out for me; somebody can send help if anything were to happen; and if something does happen, whoever did it is more likely to get caught and won’t be able to do it again. One of the respondents summarised how they viewed this:
From a colleague perspective, I feel reassured that if anything happens then there will be evidence available. Staff feel reassured because they think the cameras will generate a response. The reassurance comes from the fact that evidence will be available after the fact and that will increase the likelihood of the offender being arrested and prosecuted. As long as staff perceive it like that, then that’s good [RR17].
Of course, the reality for most staff who are operating in retail stores not offering the panic button-based system described above, is that their store video system is essentially just a symbol of organisational commitment and potentially a means to collect post-hoc evidence should violent incidents occur (38).
For customers, the presence of video systems in retail stores is now very much a given, perceived as an integral part of the fabric of modern retail environments and one part of a societal infrastructure of security technologies increasingly visible across many urban spaces.
As such, retailers that choose not to participate in this ‘theatre of security’ may be open to accusations of not taking the safety of their customers ‘seriously’, a charge that is likely to be alarmingly amplified through social media when incidents take place away from the gaze of a video camera. Consequently, while enabling the delivery of reassurance is another one of the intangible benefits of video systems, it may be one that retailers will struggle to ignore in an increasingly connected and surveilled world.
While the previous six use cases for video have had a predominantly security and safety focus, the seventh is related to how it can be used to make retail organisations more operationally effective and profitable. This broadening of the use palette beyond loss prevention is certainly something which has been much trumpeted by the video industry although to date few of the respondents to this research could point towards concrete examples within their businesses.
However, as will be detailed below, the growth in the use of video analytics may herald a brighter future for this particular use case. In this respect, this section only looks at examples of enabling and informing businesses through video systems that do not rely on some form of automated analytic capability.
Only two of the companies responding to this research provided examples of how their businesses were using their existing video systems to better understand customer behaviours (39). The first was driven by their Customer Experience team and was based upon them taking a video feed from a selection of stores to review aspects of how stores are operating. This is an example of using video data to understand and research how business decisions affect the customer experience. A respondent described how this worked:
The Customer Experience team is driving the initiative, so they get a feed to their desks from the stores. It’s pretty expensive, so it is not scalable in all stores but may be useful when using a sample of stores from different parts of the country. It’s a research activity rather than BAU [Business as Usual] [RR7]
Another respondent described how they had used a similar process but, in this case, used it to analyse the behaviour of customers when they arrived to pick up their orders and whether this had driven any upselling behaviour. Both examples identify how a non-security function had begun to utilise the potential of video to inform their business decisions, albeit in a very manual and limited fashion. In neither example were companies able to offer what the financial value of these activities had been.
One other area where video was used to inform and enable business decisions was in the review of store designs. As with the previous examples, this was very much a post hoc manual review process looking at how the flow of customers was occurring and how changes in product location had an effect. Through a detailed research process analysing video and sales data, one company had been able to make changes to their store design to reduce bottlenecks in certain parts of their stores at peak times. This was very much an example of one-off highly focussed uses of video by non-security functions rather than the systematic and routinised use that will be considered in the next section.
The final use case proposed by respondents to this research was focused upon the use of video systems to undertake some form of analytic enabling the business to automatically identify and detect events of interest. This is regarded by many as the ‘new frontier’ of video systems although some of this capability has been in operation for many years, such as the detection of unauthorised movements in the proximity of buildings (40). In other areas, the technology is much more cutting edge, such as developments in facial recognition and the automatic identification of shop thieves. Given its importance, the next chapter of this report will focus exclusively on this topic, looking at how it is being used and some of the challenges it presents to those looking to deploy it within their businesses.
When summarising how and why retailers have traditionally used video systems within their businesses, they can be grouped into eight distinct areas although some are far more prevalent than others (Figure 2). What can also be seen is that they can be further categorised into four distinct modes of use: Passive, Reactive, Active, and Proactive.
Passive uses of video rely upon its presence to generate some form of response or enable an activity to happen – deter thieves from committing a crime, encourage staff to comply with business procedures and/or ensure businesses meet legal requirements. Reactive uses focus upon the post hoc review of video data to enable undesirable events and players to be identified and/or business decisions to be informed and improved. Active use is very much concerned with utilising video systems in real time to instigate a response and/or identify miscreant behaviour. Finally, Proactive use is concerned with how video can be used to automate, speed up and improve many of those activities currently undertaken through the Reactive and Active uses thus far. It is this use case that the next section of the report will now focus upon.
Taken together, this model goes some way towards expressing the sheer breadth of use cases and potential value video systems can bring to a retail environment – it is indeed a multi-faceted technology.
This next section of the report looks in detail at the issue of video analytics and how they are being used in retailing. As a term, ‘video analytics’ has become the next ‘big’ thing to supposedly transform the retail landscape. Promises that Artificial Intelligence (AI) and Machine Learning (ML) will revolutionise the industry are not hard to find, although pinning down how this will materialise remains more challenging (41). All the respondents to this research had had some form of experience of utilising various types of video analytic, most with varying degrees of success.
Depending upon who you talk to, video analytics can be defined in many ways. Indeed, some prefer to call it something different, such as Computer Vision or Video Insights. One retailer respondent offered their perspective on how the term video analytics might be understood:
Taking video and interpreting it into something that you can derive an action out of that would prompt some type of either associate engagement and/or backend system to operate differently in order to satisfy whatever the focus is [RR12].
This certainly seems to capture the key elements of what video analytics are designed to achieve although what is missing is the sense of computerised automation: taking out the supposed fallibilities of the human tasked to watch and review. In this respect, a more succinct definition of video analytics might be: ‘The capacity of computers to automatically interpret digital images in order to provide insights of value to the user’ (42). The key of course is whether they generate value – video technologies are not necessarily cheap, and as will be discussed in detail below, ensuring that they generate a ROI can be a considerable challenge.
For those with long memories of the world of retail loss prevention, the current promises offered by some providers of video systems address a never-ending itch to finally discover the technological panacea that will solve the shrinkage and loss problem. In many respects, video analytics fits this bill perfectly, offering a technology that will, amongst other things, transform the offending landscape – identify shop thieves before they even steal – finally delivering the full Minority Report ‘dream’!
The retail industry has of course been here before, not least with RFID and promises in the early 2000s that it would bring an end to shop theft through the creation of a completely transparent retail supply chain where every product could be uniquely tracked in real time. We all know how that ‘Peak of Expectation’ ended – in the ‘Trough of Disillusionment’ before the technology gradually matured and much more feasible, limited and achievable goals were eventually realised (43).
This concern about overselling the potential of video analytics was made very clear by a number of respondents to this research: ‘The thing with video analytics is that there have been claims, promises, suggestions for years and years that it can do all kinds of things and inevitably with lots of it, it ends up being Emperor’s new clothes’[RR3]. Another felt that the companies offering video analytics were simply out of step with what the retail industry now needed: ‘It’s providing solutions to problems we had 10 years ago that we don’t want to fix anymore!’[RR7]. For others there was a gap between what was being promised and what could be delivered in the real world: ‘… always works great in the lab but different story when you put it in a store’[RR10]. Even representatives from the video industry itself were concerned about some of the claims being made about what video analytics could deliver:
There has been an explosion of claims about what these systems can do, but the key is what problems are retailers trying to resolve and how might these systems be leveraged to make an impact upon them? And how will you know if you have made an impact? [RR4].
Given these types of comments, it would seem sensible for video analytic providers to temper their claims and look more closely at exactly what retailers are concerned about the most, how their technologies will address these needs and operate in the ‘real’ world, before developing their systems further.
More positively, for some of those working in the video industry, there was a growing realisation that the field of video analytics was now beginning to mature and was getting to the point where it might begin to be scalable: ‘We are right at the cusp of discovering just what can we do with these systems; there are a lot of technologies that are emerging to the point where they can begin to scale’[RR4]. For another provider, there were still real issues about how you might make this technology pay for itself: ‘… people seem to be struggling to make the business case’[RR9]. The views of retail respondents varied considerably on the extent to which they thought video analytics could make a difference at this time, although the following summarised neatly what many said:
It’s nowhere near as mature as people think it is; the general populace believes that magic can happen with video analytics, that with cameras and computers and enough processing you can do all kinds of things. The reality is that it’s still a camera looking at something and trying to make interpretations of pixels [RR21].
What was clear from both the retail and video industry respondents was a growing realisation that, like a number of other technologies, video analytics on its own is simply a means to an end rather than an end in itself – it will not work unless it is interpreted, integrated and acted upon:
The video needs to be translated into data points to make sense of what it’s saying and that can be hard when you try to automate it. People seem to jump to analytics very quickly thinking it will provide answers/insights just by installing cameras and software and it just doesn’t work like that [RR9].
However, for some retailers it offered a way to begin to understand and respond to events happening in their stores that would be difficult to achieve in any other way: ‘We work on the principle that all movement can be tracked and rules built to identify when that movement journey is potentially illicit – movement of burglars in the back of the store, movement of cash from the till to a member of staff’s pocket, movement of products from the shelf but not going through the till’[RR17].
For another the key to utilising this technology was capturing its value as quickly as possible: ‘I’ve always found that the biggest advantage is when you can grab that video and derive some immediate action out of it that changes how you are conducting business real time’[RR12]. For another retail respondent, their concern was about ensuring others in the business did not begin to expect too much too soon from the technology: ‘Part of the challenge we have is controlling what the expectations are – people begin to say that it could do something and it can, but at what level of accuracy?’[RR21].
What follows is not meant to be a comprehensive review of all of the video analytics that are available in the market at the moment – it is merely a snapshot of systems either currently in use or under trial in the companies that were contacted as part of this research (albeit some of the largest retailers in the world). The analytics have been grouped into those that have a security and safety focus and those that are designed to provide business intelligence. It would be fair to say that compared with the extensive use of more traditional forms of video systems described in the previous section of this report, there were relatively few examples of video analytics currently being used by respondents to this research although there was a considerable amount of planned testing and trialling under consideration.
Using video analytics to automatically trigger an alert when movement occurs was the most used system to address security and safety issues. These triggers were focussed either on specific locations or products.
Respondents to this research highlighted several locations that they monitored with motion alert analytics: store perimeters; specialist store counters and display cabinets stocking high risk products; store backroom areas; fire exits; store cash offices; and store entrances/exits.
The protection of store perimeters against the threat of burglaries utilised a range of video analytic technologies, but the primary purpose was to provide an alert when unauthorised people entered a particular space, giving the company the opportunity to contact the police and/or communicate with the would-be burglars to try and deter them.
Depending upon the technology used, the efficacy of these intruder alerts varied, mainly due to the challenges of the context in which they were used – weather, lighting, movement of animals could all generate false positives, which had led some respondents to invest in Thermal cameras to try and improve the reliability of the analytic (44). The issue of using video systems by SOCs to address the threat of burglaries will be considered in more detail in the next section of this report.
Another use of motion triggered analytical alerts related to out of hours stock deliveries to retail stores. For several respondents to this research, this was a growing area of interest as road congestion and store optimisation strategies were increasing the frequency of these types of activities. One retailer explained how this worked for them:
We now use it [video analytic] for out of hours deliveries – drivers have a fob that deactivates part of the store alarm system and motion detectors then trigger the CCTV if they go beyond designated areas – watched by staff in the SOC [RR11]
As with this example and others, retailers using motion triggered analytics generally limited their use to low/non-use spaces and times to try and deal with the issue of monitoring staff receiving too many alerts/false positives, something which will be discussed in more detail below. For instance, alerts relating to movements across a particular retail counter were limited to a certain time of day: ‘95% of our motion alarms are set up for overnight use – low customer volume times’[RR14]; ‘We use it mainly to monitor excluded spaces such as the back of store when nobody should be there’[RR21].
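The practice described above, of restricting motion alerts to particular spaces and times, can be illustrated with a minimal sketch. The zone names, time windows and function names below are hypothetical and are not drawn from any respondent's system; the sketch only shows how an overnight window that wraps past midnight might be checked before an alert is passed to monitoring staff.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class AlertWindow:
    """A zone and the period during which motion there should raise an alert."""
    zone: str
    start: time
    end: time  # may be earlier than start, meaning the window runs overnight


def in_window(window: AlertWindow, t: time) -> bool:
    """Return True if t falls inside the window (end time is exclusive)."""
    if window.start <= window.end:
        return window.start <= t < window.end
    # Overnight window, e.g. 22:00 to 06:00, wraps past midnight.
    return t >= window.start or t < window.end


def motion_should_alert(zone: str, t: time, windows: list[AlertWindow]) -> bool:
    """Only forward motion events that occur in a monitored zone and window."""
    return any(w.zone == zone and in_window(w, t) for w in windows)


# Hypothetical configuration: both zones are only monitored overnight.
WINDOWS = [
    AlertWindow("pharmacy_counter", time(22, 0), time(6, 0)),
    AlertWindow("back_of_store", time(20, 0), time(7, 0)),
]
```

Motion at the hypothetical pharmacy counter at 23:30 would be forwarded, while the same motion at 14:00 would be ignored, which is one simple way of keeping alert volumes manageable during trading hours.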
Several respondents were actively trialling a range of motion-based trigger analytics that sought to identify when a certain number of high-risk products were removed from a shelf by the same customer, for instance, three or more bottles of high value alcohol. Compared with location-based triggers, this is a much more complex and demanding environment within which to utilise a video analytic and this was very much reflected by the ongoing challenges they faced in achieving a tolerable level of alerts that could be regarded as ‘useful’.
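A product-based trigger of this kind can be sketched as a simple counting rule. The example below assumes that some upstream video component (not shown, and hypothetical) already emits one event each time a tracked customer removes a high-risk item from a shelf; the threshold of three follows the example above, while the 60-second window is purely an illustrative assumption.

```python
from collections import defaultdict, deque


class ShelfRemovalRule:
    """Alert when one tracked customer removes several items in a short period."""

    def __init__(self, threshold: int = 3, window_seconds: float = 60.0):
        self.threshold = threshold            # items needed to trigger an alert
        self.window_seconds = window_seconds  # assumed look-back period
        self._recent = defaultdict(deque)     # track_id -> removal timestamps

    def observe(self, track_id: str, timestamp: float) -> bool:
        """Record one removal event; return True if an alert should be raised."""
        recent = self._recent[track_id]
        recent.append(timestamp)
        # Discard removals that are older than the look-back period.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) >= self.threshold
```

In practice the difficulty reported by respondents lies less in the rule itself than in the reliability of the events feeding it: if the upstream component miscounts removals or loses track of a customer, the rule will either miss incidents or generate unhelpful alerts.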
In two cases, the analytic was designed to send an alert to a designated guardian, typically a security guard, who was then tasked to monitor the customer that had triggered the alert to ensure that they eventually purchased the goods they had removed from the shelf.
In another example, the video images were simply recorded and potentially reviewed later. With regard to the former example, this is certainly an interesting analytic in terms of trying to have a degree of forewarning of potentially deviant behaviour; the challenge of course is having a response resource available, communicating the alert to that resource in a timely fashion, and ensuring that the resource is not overwhelmed by the number of alerts they receive, leading to response fatigue. At the time of writing, this remained a critical issue and will be discussed in more detail below.
In terms of the latter example, it is questionable what value collecting this type of information might bring unless each alert is researched to ascertain whether selected products were eventually stolen or not. This information could then potentially be used to help with the identification of persistent offenders, and/or create a better understanding of the root causes of losses that are initially coded as shrinkage/unknown loss. If done in a timely fashion, it might also be useful in better managing stock files to reduce the problem of out of stocks – more quickly replacing items that have been genuinely stolen. However, unless the number of alerts is kept to a manageable number, in retail chains with many outlets, it would inevitably require a very substantial human resource to undertake this type of process.
The use of video analytics to try and identify incidents of violence or the threat of violence was much more in the experimental stage of development and use. For some US-based retailers, the threat of lone shooters and other forms of incidents involving guns had driven their interest in trying to develop video analytics that might be able to provide some form of advanced warning of these events.
For instance, some were looking at whether analytics could ‘identify’ when a person had taken up a stance that might indicate they had a gun, or indeed identify the gun itself. Other examples included: the sudden movement of groups of people (possibly fleeing due to an elevated risk); people falling over; erratic driving behaviour in the car park; when two human objects became one (possible evidence of an attack); and elevated voices (indicating agitation).
However, ensuring that these alerts were ‘accurate’ was a primary concern, as was the capacity to actually respond to what are often very rapidly developing incidents: ‘this is not easy at all and the price of getting it wrong could be very costly – you can’t have false positives with this type of scenario – this is really hard stuff’[RR12].
As will be discussed below, there seems to be an inverse relationship between the accuracy of video analytics and the complexity of the environment within which it is used – the more complex the inputs, the more likely the outputs will be questionable and, indeed, not actionable.
When it comes to creating video analytics to automatically identify the initial indicators of a violent incident, this is certainly the case and is further compounded by the consequential effects of getting it wrong, such as customer injuries through panic and business disruption. As such, few respondents developing these types of analytics felt they would be rolled out anytime soon.
At the start of this research, a much cited, albeit anecdotal utilisation of video analytics, was the identification of refund frauds by staff where a customer was not present. The idea is that when a member of staff begins to process a refund, some form of video-based analytic would analyse the environment in which it is occurring and generate an alert when it was deemed that no customer was in the vicinity, indicating that a fraudulent event was underway (45).
Several respondents had tried this analytic, but none had been able to make it work sufficiently well to warrant moving beyond an initial trial. The main difficulty was the dynamic nature of the environments within which refunding activity often took place, which led to an unacceptable number of alerts and false positives: ‘Difficult to get customer not present to work in most store formats – people walking by all the time in small stores’[RR19]; ‘tricky to do customer not present because of the design of the till areas – hard to get CCTV coverage that would work’[RR2]. For another company, the complexity of their business model undermined the analytic: ‘… it’s not working well at the moment – processing online returns mean that a customer will not be present, but it will trigger the alert’[RR16].
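The logic of the ‘customer not present’ analytic, and the online returns problem described by one respondent, can be illustrated with a minimal sketch. The field names are hypothetical: it is assumed that a video component reports whether a customer was detected near the till and that the PoS record indicates whether the refund relates to an online order. The exemption for online returns is an illustrative assumption, not something any respondent reported implementing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Refund:
    refund_id: str
    customer_detected: bool  # assumed output of a video presence detector
    online_return: bool      # assumed field from the PoS transaction record


def flag_refund(refund: Refund, exempt_online: bool = True) -> bool:
    """Return True if the refund should be flagged for review."""
    if exempt_online and refund.online_return:
        # A customer is legitimately absent when an online return is processed.
        return False
    return not refund.customer_detected
```

With the exemption switched off, every online return is flagged, mirroring the experience reported above. The sketch does nothing to address the other difficulties raised, such as passers-by in small stores being mistaken for a customer at the till.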
No doubt in the ‘right’ retail environment where complexity can be minimised, then this analytic may offer some potential, but for those taking part in this research, it was, in many respects, an analytic that was not currently fit for purpose.
The rapid growth in recent years in the use of a range of self-scan technologies (SCO), particularly in grocery retailing, has seen a significant increase in losses associated with this type of checkout option (46). Of particular concern are losses which are generated at Fixed SCO devices where customers maliciously or non-maliciously do not scan some of their items, engage in mis-scanning behaviours where product substitution takes place (grapes for onions scams), or scan all their items but simply walk away without paying for them (47).
These are challenging problems to address, not least because of the sheer number of transactions now being processed through SCO, the range of relatively risk-free opportunities that have been created, and the difficulty in identifying these types of errant behaviour (48).
It is perhaps, therefore, not surprising that technology providers and retailers are actively exploring ways in which video-based analytics might be developed to address these issues. Three main types of video analytic are evident: identifying when a SCO user has not scanned an item; identifying when a SCO user is misrepresenting what an item is, particularly when using the produce weigh scale; and identifying when a SCO user has walked away from a SCO machine without paying for the items they have scanned. At the moment, it is not possible to provide any definitive assessment of the effectiveness of any of these analytic interventions – future research is planned to focus specifically on this area.
This type of video analytic has been in development for a considerable period, initially focussed upon trying to identify errant behaviour at staffed checkouts, such as sweethearting (49), before being adapted for use on SCO. The idea is that the movement of objects around a point of focus can be tracked and linked to triggers/non triggers of the Point of Sale (PoS) software. For instance, if a customer moves a product across a checkout station but does not trigger a PoS transaction, then this will generate an alert. Other iterations include analytics which seek to identify objects that are left on the ‘non-purchased’ side of the checkout station when a consumer begins the payment process, or when objects are left in a trolley or basket.
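The idea of linking tracked object movements to PoS triggers can be sketched as a simple matching exercise. The example below assumes that a video component supplies the times at which items were seen crossing the checkout station and that the PoS supplies scan times; both inputs, the two-second tolerance and the function name are hypothetical. Any movement that cannot be paired with a scan is returned as a candidate alert.

```python
def unscanned_movements(movement_times, scan_times, tolerance=2.0):
    """Return movement timestamps that have no PoS scan within the tolerance.

    Each scan can account for at most one movement, so two items passed
    across the station with only one scan will leave one movement unmatched.
    """
    scans = sorted(scan_times)
    used = [False] * len(scans)
    alerts = []
    for moved_at in sorted(movement_times):
        match = None
        for i, scanned_at in enumerate(scans):
            if not used[i] and abs(scanned_at - moved_at) <= tolerance:
                match = i
                break
        if match is None:
            alerts.append(moved_at)  # nothing at the PoS explains this movement
        else:
            used[match] = True
    return alerts
```

This greedy pairing is deliberately simple. It takes the movement events as given, whereas much of the real difficulty lies in producing those events reliably in a cluttered checkout environment.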
These video-based alerts can then suspend the checkout process until a member of staff has verified the activity. In early iterations of this analytic, the relative complexity of the environment (such as differentiating between a consumer’s hand and a product, multiple items stacked on top of each other, items owned by the consumer, such as handbags etc.) meant that ambiguous alerts were often sent for review by humans, making an immediate response largely unworkable. However, recent developments seem to have improved their capability to be more accurate and timelier, although further research is required in this area.
Attempting to identify when a SCO user is misrepresenting products is arguably a much more challenging analytic to deliver and this is reflected in the small number of technology providers currently able to offer this capability at scale. Part of the problem is that the analytic has to be able to identify what the object is that is being presented for purchase – it potentially has to ‘know’ the difference between say a banana and an apple, or a bottle of wine and a bottle of cooking oil. In addition, it must be able to do this without creating an unacceptable delay in the checkout process. Given that some of the larger retail grocers may stock in excess of 50,000 stock keeping units (SKUs), this is a potentially daunting task for any video analytic provider to deliver at scale, with tolerable levels of accuracy, and ensure customers are not overly inconvenienced (50).
Until this capability is fully deliverable, some analytics providers have developed a more limited approach – aiming to identify when an object is not something that is known. In this approach, the analytic is trained to recognise only a small number of items that are often selected to represent more expensive and desirable products. For instance, mis-scanners will typically choose from a relatively small range of low-cost fresh produce such as onions, carrots and potatoes to ‘represent’ more expensive items such as avocados and soft fruits. In this scenario, the video analytic ‘knows’ what an onion or a carrot looks like and so it therefore ‘knows’ that when another object is placed on the scale (say an avocado), it is not one of these identifiable objects. If the mis-scanner presses the button representing onions, the analytic will not know that it is in fact an avocado, but it will know that it is not an onion and therefore generate some form of alert.
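This more limited approach can be expressed as a short decision rule. In the sketch below, the set of known items and the function name are hypothetical, and the recogniser that produces `recognised_item` is assumed rather than shown: it is taken to return the name of a known item, or `None` when the object on the scale matches none of them.

```python
# Hypothetical set of low-cost produce the analytic has been trained to recognise.
KNOWN_ITEMS = frozenset({"onion", "carrot", "potato"})


def mis_scan_alert(selected_item, recognised_item):
    """Return True if the item on the scale contradicts the button pressed.

    selected_item: the product chosen by the customer on the SCO screen.
    recognised_item: a member of KNOWN_ITEMS, or None if the object on the
    scale is not recognised as any of them.
    """
    if selected_item not in KNOWN_ITEMS:
        # The analytic has no model of this product, so it cannot verify it.
        return False
    # The analytic does not need to know what the object is, only that it
    # is not the known item the customer claims it to be.
    return recognised_item != selected_item
```

An avocado weighed as onions would produce `mis_scan_alert("onion", None)`, which returns `True`. The first branch also makes the limitation visible: selections outside the known set are never checked, which is one route by which scammers might adapt.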
In many respects this is a clever way to deal with a challenging issue although there are limitations, not least that scammers may begin to adapt their product choices to defeat the analytic. But it is also a good example of the challenges of delivering analytics in the real world – having the capacity to identify products would undoubtedly be a major step forward in resolving the mis-scanning issue at SCOs, but the reality is that the computing capacity to do this in real time, reliably, consistently and at scale, is simply not available at the moment for the vast majority of retailers.
The third area where there was evidence of using video analytics to address a SCO related loss problem was around the issue of ‘walkaways’; consumers scanning their items but then leaving without paying. Again, this is a challenging issue to address – how do you differentiate between the consumer that simply takes a long time to complete the payment process and those that deliberately walk off without paying? What if a consumer starts their transaction on one Fixed SCO machine and then decides to complete it on another? What if a consumer simply decides to abandon their shop halfway through the checkout process, leaving all the products behind? These are complex scenarios for a video analytic to address accurately.
One company had been using a video analytic to address this issue to some degree. Users entering the Fixed SCO area are ‘tracked’ to a particular machine and then must be linked with a payment transaction before the exit gates will open to allow them to leave.
While the trial had not been without problems, not least when the area gets very busy and therefore the exit gates are open almost all the time, and there is of course no guarantee that a user has scanned all the items they have brought to the SCO area (in theory they only need to purchase one item for the gates to open), the participating retailer was encouraged by the ‘sense of control’ that it portrayed.
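The gate logic described for this trial can be illustrated with a short sketch. It is a simplification under stated assumptions: the `ScoArea` class, its method names and the use of simple person and machine identifiers are hypothetical, and the real system's tracking is far more involved.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ScoArea:
    # Maps a tracked person ID to the machine they were last tracked to.
    tracked: Dict[str, str] = field(default_factory=dict)
    # Machines on which a payment transaction has been completed.
    paid_machines: Set[str] = field(default_factory=set)

    def track(self, person_id: str, machine_id: str) -> None:
        """Record that a person has been tracked to a particular machine."""
        self.tracked[person_id] = machine_id

    def record_payment(self, machine_id: str) -> None:
        """Record a completed payment at a machine."""
        self.paid_machines.add(machine_id)

    def gate_opens_for(self, person_id: str) -> bool:
        """The gate opens only if the person is linked to a paid transaction."""
        machine = self.tracked.get(person_id)
        return machine is not None and machine in self.paid_machines
```

The sketch also makes the limitation noted above visible: `gate_opens_for` checks only that a payment exists, not that every item brought to the machine was scanned.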
As with all these new and rapidly developing video analytics, much work is required to understand how well they will address the issue of SCO-related retail losses, but these early iterations certainly seem to offer promise.
Perhaps one of the most contentious video analytics now is Facial Recognition (FR). Across the globe it has sparked considerable debate about whether it is a force for good or one of the more sinister surveillant technologies developed thus far (51). Respondents to this research fell into three overlapping camps when it came to the use of FR technologies in retailing. There were those that considered it to be one of the most important crime prevention developments in recent years, those that were deeply concerned about the negative impact its use could have on their company’s brand, and those that were struggling to understand how it could be made to work cost effectively in a retail environment.
A majority of respondents to this research were extremely interested in what FR could potentially do to help them manage their problem of external theft by persistent shop thieves: ‘I am mad interested in what this could do’[RR6]; ‘…from a security and safety perspective, I would love to be able to use this technology’[RR18]; ‘It would be great to identify fraudsters when they enter a store …’[RR16]; ‘It would be amazing and enable us to track people around the country – open up a whole new method of doing this which is currently very manual’[RR11].
Indeed, one respondent suggested that it was the first real step change in loss prevention technologies in many years: ‘… it gets me excited again about what we might be able to do in loss prevention’[RR10].
In many respects it is easy to see why respondents would be very interested in this technology. Trying to identify when a persistent offender has entered a retail store is not an easy task – existing strategies typically rely upon staff present at the time remembering what they look like and then taking appropriate action. This is even more difficult when offenders travel to different locations – retail staff will often be reliant upon grainy CCTV-derived photos that have been previously circulated to try and spot them.
In addition, building case files against these types of offenders is currently a highly manual and time-consuming activity – piecing together video images as and when shop thefts occur based upon local reporting. Therefore, the allure of FR is readily apparent – the ‘automatic’ identification of known offenders as they enter any property covered by the technology – as the saying goes: ‘what’s not to like about that?’ Of course, the reality is much more nuanced, and of considerable concern is the extent to which FR will be viewed by customers entering stores where it is in operation as an overly intrusive surveillant technology.
This concern was the biggest factor that was currently holding back the use of FR by many of the respondents to this research: ‘There is a diktat that we are not going anywhere near facial recognition because of reputational issues’[RR3]; ‘the reality is that we are not going to be the first to do it because the brand reputation aspect is so damaging’[RR18]; ‘… the business is terrified of it’[RR6]; ‘the Lawyers have said no …’ [RR16].
These widespread concerns spring from a considerable amount of negative publicity that FR has garnered in several countries in recent times (52). It has also undoubtedly been influenced by the way it is currently used by the Chinese Government: ‘I think people get creeped out by it – China is poisoning the well of acceptability of Facial Recognition’[RR22].
In many respects, FR is like several other surveillant technologies that have been introduced before, which in their time equally challenged existing norms. FR is currently at the ‘frontier’ of public acceptability, like when the use of CCTV in public spaces was rolled out in some countries in the early 1990s and the initial use of RFID began in the early 2000s. Both technologies provoked considerable public debate and at times outrage – prompting calls that they were a serious infringement upon privacy and an unacceptable expansion of a government/business surveillant web (53). As we now know, over time these concerns have subsided, and they are no longer viewed as at the frontier of acceptability but simply just another part of the ‘modern’ world.
Part of the current ‘acceptability’ problem for FR is the extent to which it is viewed as a step change in video surveillance, moving from just monitoring and recording anonymous individuals to actively identifying people and logging their movements and activities without their consent. It is also seen to do this for everybody caught in its gaze rather than just those designated to be of interest, fostering concerns about the widening and deepening of the surveillant web (54). Consequently, this has raised numerous concerns about the legality of its use by public and private bodies and the purposes for which it can be used (55).
It may well be that over time the ‘frontier’ of acceptability will shift, as it has done with other technologies, and these debates about FR will wane. But for the moment, its contentiousness is making most retailers sufficiently anxious to step away from considering its use.
While these concerns had led most to not even begin using FR, some respondents had undertaken trials in retail stores and reported mixed views on its applicability and use. Perhaps the most pressing issue related to the challenges of developing appropriate processes and practices to not only collect and collate the data but also respond to the alerts generated by FR. One respondent highlighted these issues:
We are running it in x number of stores in y areas – the challenge is that you have got to have an input to make it work – need to define the rules that will decide who goes on the database and then how the limited amount of labour in the store will use it [RR14].
For others interested in its use, similar concerns were raised: ‘The real question is what do you do with the information – what do you want staff to do with the information?[RR6]’. For another who had tried using FR, getting the alerts to staff in a timely fashion had proved challenging in their retail environment:
We couldn’t get a reliable process in place for getting alerts to the store team – had to set it up to email which just caused an unacceptable delay – if they are not looking at email then nobody will get the alert in time [RR10].
For another company that had tried using it, concerns about the veracity of shared ‘watch lists’ were a real problem: ‘when accessing networked images, I am taking the liability of whoever has inputted that data – what criteria did they use? [RR14]’. For another there were issues about deciding how long images should be retained and what liabilities there may be associated with this process. Finally, two respondents who had tried using FR highlighted the challenges they had faced trying to generate an ROI: ‘We think it is super expensive – hard to get the ROI at the moment’[RR14]; ‘We don’t think that the capital expenditure is worth it in our business for the problem it reckons to solve’[RR15].
Certainly the issue of how to operationalise the data coming from FR systems is something that future users will need to address – which store staff should receive the alerts, how should they respond, and how do the answers to these questions fit with a company’s stated policies on staff safety and security? As more and more retail companies review the extent to which they want their store staff to engage in incidents that may generate a violent outcome, the potential role of FR technologies will need to be reviewed and determined.
More positively, one respondent to this research shared their experiences of using FR in a non-store environment – controlling staff access to a distribution centre – and was extremely enthusiastic about the contribution it was making: ‘really speeded up the process, eliminated clocking in frauds, and all staff bar one fully accepted it!’ [RR15]. It could well be that in more controlled environments, particularly where overt consent for its use can be ascertained, it may be able to bring significant value and largely avoid the currently swirling claims of ‘big brother is watching you’ that are undoubtedly halting its use in most public spaces at the moment.
While much of the historical focus of the use of video systems in retailing has been upon safety and security issues, some companies have also tried to utilise it to inform and enable their business decisions, albeit to a much more limited extent. This section of the report reflects further upon this issue and considers how some respondents have been using video analytics to deliver various types of business intelligence to improve service levels and profitability. The list of potential analytics is long – one respondent shared over 35 different metrics one provider could generate from their system, but the reality for most respondents to this research was a very limited palette of use cases thus far.
Two respondents offered examples of how their companies had begun to use a video analytic to improve the way in which customers were served in their stores. Both worked on a similar premise which is that when a customer has to wait an unacceptably long time to receive help from a member of staff, this can negatively impact upon sales, either through the customer simply walking away without purchasing anything, or deciding to choose a cheaper option with a lower margin because they cannot access what they want.
In the first example, a camera analytic generated an alert when a customer was present at a food counter but there was not a member of staff present to serve them. The alert was sent directly to a member of staff who was responsible for the counter, via a personal pager (they may be away from the counter carrying out some other activity). While not prepared to share precise details of the financial impact of this intervention, the respondent was clear that it had driven a significant uptick in over-the-counter sales: ‘it was good for sales but also it was good for customer service which is a key part of what we want to be known for’[RR5]. They are now looking to see how they can roll this out to all their stores and utilise it in other store settings.
The second example of using an analytic to improve customer service was the monitoring of products to ensure they did not go out of stock. This is certainly a growth area in terms of technology companies developing systems that will monitor store shelves and create alerts should stock levels drop below agreed levels. In the examples shared by respondents to this research the scale of the analytic was relatively limited to just a small number of fast-moving SKUs, such as the monitoring of cooked chickens in a rotisserie. However, initial feedback on this utilisation was promising, and given the damaging impact out of stocks can have on retail profits, the use of video analytics in this area would seem to be an interesting and potentially valuable proposition (56).
An area where the non-security use of analytics has been vigorously marketed for some time by the video industry is the monitoring of customer dwell times and the associated store heat maps to show this and other data points. In and of itself this type of data is potentially interesting, but the challenge faced by many of the respondents that had tried this analytic was making it generate a measurable ROI. One respondent described their experience:
Difficult to see the financial return – we have tried to use it to drive sales but really hard to measure its impact – can you clearly attribute the changes based upon heat mapping to the benefit? Hard to know so it’s dropped off the agenda – we’ve got the data but we don’t know what to do with it [RR7].
Some of the video technology respondents were equally perplexed in how to justify expenditure on this type of analytic:
The ROI is tough to put your thumb on, but a lot of goodness can come from it – reorganising stores based upon customer flows and shopping behaviour. We couldn’t come up with an ROI for Heat Maps – it provides a lot of nice to know but you can’t sell a solution based upon that [RR22].
Even those respondents who could be described as at the forefront of trying to utilise a range of video analytics were rather sceptical about how this type of data might make a difference: ‘In terms of the dwell time component [of their video analytic system], no, we’ve not been able to put a ROI model together for that’[RR12].
It may well be that on their own, video analytics purely focussed upon the monitoring of customer dwell times may not at this time make financial sense for a retailer, but it could be that if they are part of a broader package of analytics, then the ‘goodness’ it might provide could be viewed as a reasonable ‘value added’.
A relatively well-established use of video analytics for business intelligence purposes has been the counting of customers entering and leaving retail stores and the management of queuing, although the latter is perhaps less common. But, as with Heat Mapping and Dwell Time analysis, respondents were unsure as to the extent to which they were able to measure the related ROI:
People counting, hard to show the pay back. I guess it is good information to have but doesn’t necessarily generate a direct benefit out of knowing this information. It’s hard to peg some of this down to a number [RR12].
Other respondents were equally sceptical about measuring the value, but could see some benefits: ‘good way to check the amount of people in one area to avoid overcrowding’[RR4]; ‘we have used it to adjust the air conditioning in stores – does it need to be on given how many customers have entered?’[RR6]. It was also seen as a good way of generating information on rates of conversion, which can be a useful metric for some retailers to review.
In terms of Queue Monitoring, relatively few respondents had experience of using this type of video analytic but those that had regarded it as ‘useful’ in the right environment, so long as it was carefully tuned to the context within which it was used: ‘works OK until somebody changes the store layout and doesn’t shift the cameras at the same time!’ [RR19]. Only one respondent was using it in real time to monitor queues and alert store staff accordingly – for the others it was more a retrospective analytic to review staffing rotas and store layout designs. As will be discussed below, there are other technologies that can be used for people counting and therefore the issue of ROI when utilising video technologies remains a pertinent issue.
More recently, as a consequence of the COVID-19 Pandemic in early 2020, and the need for enforcing social distancing in retail stores, especially limiting the number of customers who can enter, video analytics have been deployed to manage this process. These systems track the number of customers entering and leaving a store, often providing prompts to customers waiting to gain access. At the time of publication, it is not clear for how long these systems will be required and whether they are more cost effective than employing a member of staff to carry out the same function.
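The counting logic behind such occupancy systems can be sketched as follows. This is a minimal illustration rather than a description of any deployed product: the `OccupancyCounter` class, its method names and the ‘ENTER’/‘WAIT’ prompts are assumptions, and a real system would also need to handle miscounts from the video feed.

```python
class OccupancyCounter:
    """Tracks store occupancy from entry and exit counts (hypothetical sketch)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be a positive number of customers")
        self.capacity = capacity
        self.inside = 0

    def enter(self) -> None:
        """Called when the analytic counts a customer entering."""
        self.inside += 1

    def leave(self) -> None:
        """Called when the analytic counts a customer leaving."""
        # Never drop below zero, even if an entry was missed by the camera.
        self.inside = max(0, self.inside - 1)

    def prompt(self) -> str:
        """Message shown to the next customer waiting at the door."""
        return "ENTER" if self.inside < self.capacity else "WAIT"
```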
Finally, there were two examples of trying to use video analytics in the supply chain. The first was a respondent that was exploring the use of video analytics to automate the identification of deliveries to retail stores. While this was described as still very much in the ‘experimental’ phase, the idea is that cameras could be used to ‘identify’ QR codes on arriving pallets in order to speed up the receiving process. The other example related to distribution pick accuracy: ‘We are going to be using the [name of provider] video system so that when a picker goes to location where they haven’t been sent there will be an alert [RR15]’.
In addition to sharing some of the ways in which they have been utilising video analytics, respondents were also able to elaborate on many of the challenges they have faced getting them to work (or not) in their retail environment.
One of the key attributes of video analytics is their capacity to automatically generate alerts and triggers to help retailers filter the vital few events of interest from what can be a veritable tsunami of images generated by their video systems. However, ensuring the veracity of these alerts is vital and at the same time challenging. There are four main outcomes that are possible from video analytic-based alerts/triggers, summarised in Figure 3.
Table 1 provides some examples of how these outcomes may come about from various types of video analytic trigger objectives.
Ideally, a video analytic works as designed and only generates an alert when a predefined and prescribed activity or event takes place – a True Positive. The opposite is where the system does not trigger when such an event or activity has taken place – a False Negative. There are then triggers which generate False Positives – the system thinks it has identified an event or activity that meets the prescribed requirements, but it is in fact incorrect. Finally, there are what can be described as Overload Positives where, due to ill-defined and overly inclusive parameters being set, the system generates an unacceptably high number of trigger events that are technically correct.
While there are certainly overlaps between False Positives and Overload Positives, there is an important distinction to be made between the two. The former can be considered as a largely inevitable and understandable outcome of the relative complexity of the prescribed trigger, while the latter can be viewed as mainly due to insufficient rigour in defining the purpose of the trigger and the environment within which it will operate. Another way to potentially differentiate between the two is to think of False Positives as events that can be better managed through system refinement (teaching) while Overload Positives probably require a more substantive review of the originally intended purpose of the video analytic (reconceptualisation).
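One way to make the distinction concrete is the sketch below. It is an interpretation for illustration, not a formal definition from this research: the function names are hypothetical, and treating an Overload Positive as a volume of technically correct alerts that exceeds the team's daily review capacity is an assumption.

```python
def classify(alerted: bool, event_occurred: bool) -> str:
    """Classify a single trigger decision against what actually happened."""
    if alerted and event_occurred:
        return "true positive"
    if alerted and not event_occurred:
        return "false positive"
    if not alerted and event_occurred:
        return "false negative"
    # No alert and no event: not one of the alert outcomes discussed here.
    return "no event"


def is_overload(correct_alerts_per_day: int, review_capacity_per_day: int) -> bool:
    """Assumed test for Overload Positives: the alerts are technically
    correct, but there are more of them than the team can review in a day."""
    return correct_alerts_per_day > review_capacity_per_day
```

The design point is that a False Positive is a property of an individual alert, whereas an Overload Positive only becomes visible when correct alerts are counted against the capacity to act on them.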
Several respondents to this research had detailed experience of trying to get a range of video analytics to work ‘correctly’, that is generate only True Positives, but their reality was often much more about managing the volume and consequences of False and Overload Positives. Indeed, as will be described below, to date, the Achilles Heel of most video analytic systems would seem to be the unacceptably high levels of these types of alerts, especially when utilised to try and capture hard to identify behaviours in challenging retail environments.
By far and away the biggest issue raised by respondents who had tried various types of video analytic was getting the number of alerts and data points that they generated down to an acceptable number that could then be actioned. One respondent provided a graphic example of how this can pan out:
We had an analytic set up that if you dwell for a period of time then it would send an alert to the SOC and then the video would be reviewed retrospectively – look for an association with sales or no sales; perhaps follow them around the store and see if they steal etc. Sounded like a good idea but this ended up generating between 1,000-1,400 events per store per day. Now we have over 670 stores where this is installed, so over 900,000 alerts a day...! [RR18].
This is clearly an example of an Overload Positive – short of employing an army of staff to review all these events, the analytic as prescribed is simply impractical for any retailer to utilise. The analytic is technically alerting correctly – recording whenever a customer dwells in front of certain products for a given period. The problem is the lack of a sufficiently nuanced link between a desired objective (identify shop thieves) and a verifiable indicator trigger (dwelling in front of high-risk items). The latter, while undoubtedly a potential sign of forthcoming errant activity, is also a very common behaviour of non-errant shoppers. Unless the link between objective and trigger can be more finely tuned, perhaps through using a series of indicative triggers, then the result will be a surfeit of Overload Positives.
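The scale of the problem in the respondent's example can be checked with some simple arithmetic. The store count and per-store event range come from the quotation above; the 30 seconds of review time per alert is purely an illustrative assumption, not a figure from this research.

```python
STORES = 670                    # 'over 670 stores' in the respondent's account
EVENTS_PER_STORE_LOW = 1_000    # lower bound of events per store per day
EVENTS_PER_STORE_HIGH = 1_400   # upper bound of events per store per day


def daily_alerts(stores: int, events_per_store: int) -> int:
    """Total alerts per day across the estate."""
    return stores * events_per_store


def review_hours(alerts: int, seconds_per_alert: float) -> float:
    """Staff hours needed to review every alert once."""
    return alerts * seconds_per_alert / 3600


low = daily_alerts(STORES, EVENTS_PER_STORE_LOW)    # 670,000
high = daily_alerts(STORES, EVENTS_PER_STORE_HIGH)  # 938,000

# Assumption for illustration only: 30 seconds to review one alert.
hours_at_high = review_hours(high, seconds_per_alert=30)
```

The respondent's figure of ‘over 900,000 alerts a day’ corresponds to the upper end of this range, and even a brief review of each alert would consume thousands of staff hours every day.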
Other respondents had similar issues about not only the sheer volume of data that their video analytic systems can produce but more importantly, the quality of the data: ‘Big part of this is how do we filter the alerts down to get say 10 a day rather than 100?; at the moment, the false positives are way too high – only 5% of alerts are genuine’[RR17]. In recognition of this, one retailer had put in place a review process to try and deal with the analytics that became overly burdensome: ‘We do regular reviews to identify high volume alert areas – we are pretty proactive about making changes to ensure we don’t burden the team with too many false alarms’[RR14].
A good example of an analytic that simply generated too many alerts was where it was connected to an existing Electronic Article Surveillance (EAS) system, namely activations of the alarm at a store exit. Several respondents had tried to do this but had given up due to the sheer volume of alerts it had generated: ‘We did have EAS activations connected to the SOC but we have turned it off because there were so many false positives – we don’t have the time to deal with them’[RR19]; ‘would be overwhelming to look at those in real time’[RR2]; ‘the alarms went off all the time – people just ended up ignoring them’[RR11].
As an industry, retailing has traditionally struggled to deal with the credibility issue associated with EAS alarms – the rate of false positives has led to most activations simply being ignored by staff, to the point where the sound of EAS alarms has almost become shoppers’ standard sonic soundtrack as they navigate through the retail landscape. Evidence from this research would suggest that introducing a video analytic to monitor these activations is not a practical option at this stage – it would seem to be incapable of filtering the vital few from the trivial many.
In addition to issues about the veracity of the data generated by video analytics, another concern was the significant challenges users faced when trying to scale up their use, not least because of the significant impact of context on the efficacy of the system:
Moving from proof of concept to full deployment can be much more difficult because of the contextual variations. Older systems were not terribly sophisticated to deploy – running some cables and linking up some cameras and flicking a switch – easy to replicate from store to store [RR4].
A number of retail respondents had found particular issues with the vagaries of store context and how it undermined their capability to utilise video analytics: ‘All these analytics have flaws – if the environment changes then the parameters for the edge device [camera] have to be adjusted – queue management – what happens if the design of the space changes?’[RR10]. Others agreed, pointing to the difficulties of monitoring and managing these changes across an entire estate: ‘OK for one store but making these adjustments constantly across an entire estate is much more difficult to deliver; can’t rely upon the store manager to make these adjustments’[RR10].
For another retailer, the variation in layouts between retail stores also caused considerable challenges in scaling up the use of video analytics: ‘You have got to do your homework before putting it [analytics] in; you need to know what you will have to do differently in each install’[RR12]. The same retailer also flagged up the need for greater cross functional awareness and support if video analytics were to work successfully in a store environment:
Every group needs to understand what you are doing so that they are not interfering with it and/or make adjustments based upon the changes they are making. Everybody needs to be aligned around it and supporting it to get the end result [RR12].
For another retailer, knowing about their ever-changing retail environment was a considerable challenge to getting any form of in-store analytic to work consistently well: ‘when things get moved around, we need to know so that we can adjust them [the cameras], but how will we know – who will tell us that a store layout has changed in say [store name]? [RR11]’.
For some, this extended process of trying to fine tune a video analytic to make it of value had ended up being a major distraction from resolving the underlying issue it was originally intended to address.
For example, one respondent offered a detailed description of their experience:
It [the video analytic] is down as one of the biggest distractions we ever had. It was a distraction from what we were trying to do. When the supplier showed it in trial mode it was all great, but when you are responsible for setting it up away from the trial environment, it’s a different story. We found the analytics were too unreliable – they were giving too many false alarms. The configuration was too complicated and context specific [RR20].
Others described their experiences in similar terms: ‘All the analytics trials added up to an awful lot of distractions from the core activities of managing loss’[RR18]; ‘Exception reporting can take you away from what you need to do because of the amount of time it takes to set it up’[RR9].
The experience of trying to utilise several different video analytics had led several respondents to reflect more broadly about their applicability and whether other, less ‘distracting’ and easier to manage interventions may be more appropriate to resolve the problems they were trying to address. In addition, some respondents had found that existing (non-video) data sources could be more easily accessed and analysed to understand some of the issues to which video analytics were being applied: ‘by the time we had got it [analytic] working, we found that we could pretty much get the same insights from analysing the PoS – much easier and cheaper’[RR18].
It is of course easy to be beguiled by the ‘new’, to resolve that the appliance of science is the answer to all modern problems. Certainly, this research has identified examples of where the use of video technologies in general, and video analytics in particular, have offered by far and away the best way to address some of the problems retailers face. But it is important that prospective users always undertake a careful review to ensure that just because a new technology exists does not automatically mean it is naturally the most applicable intervention to employ.
This challenge of ensuring that any installation of a video analytic was astutely tuned to the environment within which it was being used raised issues about the selection and competence of those contracted to install and maintain the equipment. It was also seen as impacting upon the time and cost of these installations as well: these are not simple plug and play systems – they require detailed tuning to ensure that they work optimally in their given environment.
One respondent reflected upon the changing nature of the skill set that is now required: ‘finding competent guys to pull cables, pretty easy, but finding somebody who can accurately configure an integrated and complex system is more tricky – a much greater skill set is required – tolerances become more precise with these new systems – takes more time’[RR4]. For another, they had had to develop a much closer relationship with their installers to ensure the analytic would work as required: ‘There is now a closer relationship with the installers/technical team to ensure that false positives are eliminated wherever possible – need to have a learning phase when it is first installed’[RR19].
While a key attraction of video analytics is their capacity to automate the often tedious and time-consuming process of watching and filtering video images to identify events of interest, for the most part, at some point a human response will still be required. For instance, an alert identifying suspicious behaviour in the vicinity of a store will require a human intervention to first check the veracity of the alert, and then secondly engage in actions to mitigate any identified risk.
As detailed above, the challenge comes when the video analytic generates an unacceptably high level of False or Overload Positives, which can dramatically drive down the confidence levels of those tasked with responding. The retail industry has seen this happen before with the use of EAS alarms, where the general presumption is that the alert is false and therefore little or no action is routinely taken. The danger for many emerging video analytics is that unless they can get to grips with the thorny issue of unacceptably high levels of False and Overload Positives, they could become the new ‘EAS’ – generating alarm fatigue in those tasked with responding to the alerts – seriously undermining their capacity to deliver value to the retail industry. Indeed, one of the industry respondents interviewed as part of this research reflected upon this issue:
It’s OK not to be 100% accurate but when you drop below 50/60% or even worse, which a lot of these systems do, then you start getting alarm fatigue. My worry is that they may never be reliable enough to be beneficial and so people just ignore them [RR9].
For some retail respondents, getting the labour in place to manage these alerts was a big concern: ‘Most retailers do not have the labour in place to respond to the alerts’[RR22], while others felt the additional work could prove problematic: ‘The challenge is the more you give the guard to do the more difficult the role becomes’[RR3]. Either way, as retailers and their technology suppliers reflect upon how they may begin to utilise various types of video analytic, understanding how and by whom the ensuing data will be actioned is a key consideration to be addressed.
While the cost of computer processing continues to decrease in real terms and many countries are now rolling out much more capable and wide-ranging communications networks, several respondents to this research flagged up concerns about these issues relating to their use of video analytics. Currently, there are four main points at which the processing of video analytical data can take place: a centralised business processing centre; in the ‘cloud’ via a third-party provider; at the store/location; or on the ‘edge’ device, such as a video camera. Each of these has its own limitations and advantages, and it is not the purpose of this report to offer a detailed critique of each, but respondents did raise issues about how their bandwidth and processing capabilities impacted upon their use of video analytics. For one technology provider, where the processing occurred was a key element in trying to deliver one form of video analytic:
If you want to continually track a person and make sure they retain the same PID [Personal Identification] then that takes a lot of computational power and resource. You can’t push this to the cloud – it has to be done at the store/on the edge [RR22].
Another provider also raised concerns about the amount of processing power that some analytics required to work effectively: ‘The processing power needed to do this [the analytic] in real time in a dynamic retail space is still quite high and restrictive; in 5-10 years the processing capability will be in the camera, but not yet’[RR4]. For a third provider, the cost of this type of processing was a real issue: ‘At the moment, adding video doubles the cost of the solution – it is the server costs and processing that add a lot of cost’[RR4].
No doubt as technologies continue to develop these issues will become less pressing and limiting for retailers, but for the moment, managing the computational and communication demands of many video analytic systems will remain a considerable concern.
As with many other interventions designed to positively address retail losses and profitability, getting to grips with measuring their impact is a recurring concern. As discussed previously, this has been a perennial problem for a wide range of video-based interventions, particularly where the outcomes are often intangible and yet highly desirable, such as staff safety. Respondents were acutely aware of the challenges of measuring the difference various types of analytics might make, which in turn affected their ability to prove a ROI. For instance, an analytic that flagged up opportunities for improving customer service, such as recognising overly long dwell times in front of designated products, would be very hard to directly measure in terms of say improved sales or customer satisfaction.
But it will be important to begin to identify what the appropriate measures/KPIs of various video analytics might be and put in place well designed processes to ensure the data is collected. However, for one respondent, this in itself presented a challenge to making the case for using video analytics: ‘The challenge is prioritising IT to get the reporting elements working to prove ROI; problem is you have to put an ROI together to make the case for calculating the ROI on the video analytic!’ [RR14]. While slightly tongue in cheek, this response does highlight the need for a co-ordinated and considered approach to measuring how and why any given video analytic impacts upon a retail business.
What seems clear from the respondents to this research is that the efficacy of most video analytical systems is compromised by two intertwined factors – the clarity with which the objective of the analytic can be defined and the degree of retail complexity within which it will be asked to operate. As the retail environment becomes more complex and demanding, and the link between a stated objective and an outcome trigger becomes blurrier and more ill-defined, then the likelihood of False and Overload Positives becomes more of a reality (Figure 4).
For example, the objective of identifying shop thieves in certain retail environments from their behaviour is complex and difficult – many of the ways in which they act increasingly mirror the activities of normal shoppers. In addition, the growing use of different types of self-scan systems and the encouragement of shoppers to bring their own shopping bags makes the identification of errant behaviour extremely challenging. Designing a reliable video analytic that will generate a high proportion of True Positives in this type of scenario seems at best optimistic. More positively, where the link between objective and trigger can be more clearly identified and the context simplified, then the prospects are more promising. For instance, identifying when a person enters an unauthorised space that is currently not in use is a relatively easy situation for an analytic to be successfully deployed.
It would seem sensible, therefore, that both those designing video analytics and those tasked with utilising them in a retail environment look carefully at where they will be used and for what purpose. The adage of keep it simple would seem, in this context, to be highly appropriate (57).
For those looking to utilise a video analytic, detailed in Table 2 below are 20 questions that may be useful to inform decision-making, planning, selection, operationalisation and review. A worked example is also available in Appendix 2.
This penultimate section of the report focusses upon some key trends and insights that emerged from the review of the way in which respondents to this research were utilising video technologies. The first looks at the issue of how retail organisations manage the use of video systems and the increasing need for the development of a more overarching strategic approach. The second looks at the way in which some companies are trying to develop a more cross-functional approach to the use of their video systems. The third reviews the growing trend amongst some retailers to develop a centralised command centre to utilise their video systems. The fourth considers the need for developing a clear strategy to integrate video data with other sources of information, and the final trend relates to the growing importance of ensuring that video system designs are driven by a clear and coherent organisation-wide understanding of their overall purpose.
As the potential for video technologies to make a broader impact upon retail businesses increases, particularly through the greater use of networking, increased accessibility to video data and the growth in video analytic capabilities, it became clear from this research that a more overarching and cross-organisational approach to developing and managing its use is required.
Given the historical focus of video systems on issues of safety and security, it is not surprising that the Loss Prevention function within retail businesses has traditionally been tasked with its design, procurement, installation, management and control. But as the reach of video has expanded, particularly into areas such as business intelligence, some respondents questioned whether this function was the appropriate guardian for ensuring that there was a co-ordinated business-wide strategy in place.
Indeed, some loss prevention respondents were rather reluctant to take on this role: ‘technically, we own video for security purposes, but I don’t want to have responsibility for all the other use cases that are being developed’[RR18]. For another, at best they wanted to be kept ‘informed’ about what was going on: ‘our [video] supplier can go off and talk to any other part of the business they want to without coming through Loss Prevention; we would like to be made aware but we are not involved in the conception of the analytics and their use’[RR5]. Of course, the danger with this type of approach is that it can foster a fragmented, disconnected, overlapping, overly expensive and piecemeal approach to the use of video systems.
However, others had begun to recognise the need for a much more joined up approach and for some the Loss Prevention function was capable of taking on this strategic role: ‘Because LP are driving and pushing most of the developments, it made sense for LP to be the hub/lead; there isn’t really anybody else in the company as knowledgeable about it as we are’[RR1]. This respondent went on to further make the case:
We have a lot of people trying to spin up lots of different efforts but we have all of them funnelling through us because what we don’t want is a bunch of sporadic types of solutions out there. We can have multiple solutions but we want to integrate it all together so we don’t have disparate things reporting on one type of analytic here and another over there [RR1].
Other respondents agreed, particularly as systems became more centralised and networked: ‘Loss Prevention now own the video strategy although IT are heavily involved; now that it’s all coming back to one place there is a different demand upon the investment and the need for a co-ordinated strategy’[RR8].
Across the 22 retail companies taking part in this research, it was striking how few clear-cut answers were generated by the question ‘who has overall responsibility for the strategic oversight of video systems across your business’. As described above, this is undoubtedly in part a historic legacy based upon the primary focus and use of video, but as the capacity and capability of video systems continue to grow and mature, retail businesses do need to begin to think seriously about who should take responsibility for ‘owning’ it. If not, then not only will the technology providers struggle to identify the ‘correct’ point of contact within retail businesses, but perhaps more importantly, it could lead to a fragmented and disjointed approach to its use, particularly at a time when the benefits of greater system integration are becoming more and more apparent.
Making the case for any investment in retailing is increasingly subject to its applicability across an organisation – how might multiple business functions benefit? The application and use of video systems is no different – how might it provide value beyond the traditional confines of Loss Prevention? A number of respondents increasingly recognised this issue as they looked to make their case for investment in video technologies: ‘We know we can’t just put cameras in anymore for security and loss prevention; marketing has been quite a saving grace for us by opening up a revenue stream to support its installation’[RR5]. Others agreed on the need for a more cross-organisational case to be made: ‘Investments [in video] now need to impact on several areas – be multi-functional; we are all fighting for IT resources and so if you don’t have multiple stakeholders involved it can be hard to make the case’[RR3].
While it is hard to argue against a more inclusive strategy for the utilisation of video systems, there was evidence from some respondents of a degree of reluctance to enable greater access and use of ‘their’ video system. Partly this reflected justifiable concerns about inappropriate use of the technology that may bring the business into breach of agreed codes of practice, but there was also evidence of a degree of ‘empire’ defending.
In terms of the former, some respondents were concerned about ensuring data security when more functions were given access to the video system: ‘we can secure access to a laptop, but we can’t secure where that laptop might be used and who might be able to see it’[RR19]. Another concern related to video feeds being used inappropriately: ‘we have got to be careful that different parts of the business don’t begin to use the system inappropriately, especially for performance management/ discipline’[RR21].
In terms of the latter, a small number of loss prevention-related respondents put forward arguments for limiting access due to concerns about how it may undermine the business case for their function:
We built the SOC – if we wanted everybody to have remote access, why did we build this room? There is a certain amount of remote access we would want but we don’t want every manager to have access to all this functionality on their laptops. If Sales want to monitor promotions, they could come here and view the stores [RR13].
Perhaps not surprisingly, the ‘offer’ to come to a central location to view video streams from stores was not taken up too often: ‘Not been a really popular service that we offer …’ [RR13]. For a technology provider, this was not an uncommon situation that they had also experienced: ‘Loss prevention are often seen as the “secret squirrels” – you don’t know what is going on back there and they don’t want to share anything and so you draw your own conclusions!’ [RR15].
While there are certainly legitimate concerns about security and the appropriate use of video images that need to be addressed, it would seem that in order to enable the maximum value to be achieved from investments in video systems, parochialism should not be allowed to prevail when it comes to accessibility and use. Moreover, as stated above, unless a much more cross-functional business case is developed, then it will become increasingly difficult to secure the necessary investment: ‘It’s about getting buy in from other groups to leverage the value’[RR3].
A key development that became apparent as this research progressed was not only the growing use of centralised video monitoring stations, but also the increasing breadth of the activities being undertaken by these facilities. In and of themselves, centralised monitoring stations are not necessarily a new development – distributed fire and burglar alarm systems in retail buildings have been brought into Alarm Response Centres (ARCs), often provided by third party companies, for many years. In addition, some larger retailers have operated full time security centres where they monitor and respond to incidents occurring across their estates. However, the growing availability and use of networked video systems has encouraged more retailers to begin to establish video-based centralised command centres within their own businesses.
While they are most frequently referred to as Security Operations Centres (SOCs), not least because this typically describes the relatively limited scope of their responsibilities, a growing number of retail companies are using a more broad-ranging nomenclature to more accurately portray the increasingly cross-functional nature of the activities they undertake. For instance, other names include: CCTV Monitoring Centre; Central Control Hub; Incident Management Centre; Retail Command Centre; Retail Operations Centre; Retail Integration Hub; Fusion Hub; and Retail Service Intelligence Facility. For the sake of brevity, this report will simply refer to them as SOCs.
Respondents varied considerably in the range and scope of activities undertaken by their version of a SOC, but the central premise for all was the capacity to view and review video data from a range of sites across their estate. In addition, many retail users were also bringing in other data feeds to augment their capability, such as PoS data, incident reporting and social media feeds. As detailed earlier in this report, eight types of use of video technologies have been identified and SOCs could be found to be involved in the delivery of four of these: Generating a Response, Undertaking Reviews, Informing and Enabling, and Anticipating, Detecting and Alerting. While the use of SOCs to deliver much of this functionality has already been discussed earlier in this report, their role in responding to alarms requires particular attention.
Where SOCs were seen to be capable of playing a very important role was in the response to various types of alarms, in particular burglar alarms. This type of alarm has at best a chequered history of accuracy and is one of the main reasons for the development of ARCs to try and put in place mechanisms and processes to limit the impact of false alarms on law enforcement and other security responders. The role of an ARC is primarily to provide a degree of verification of the likely veracity of an alarm activation and then activate a suitable response if deemed necessary.
Of course, proving the veracity of an alarm activation remotely is not easy and so various protocols have been developed to try and achieve this, such as the requirement for multiple alarm activations to be recorded in the same site within a given period of time before a response is generated. Many respondents to this research highlighted the significant problem they had with false alarms, despite the use of ARCs: ‘We were getting upwards of 70,000 false alarms a week from our burglar alarm systems in our 1,800 sites; we found that air conditioning is a particular trigger of false alarms, as is light reflection and internal doors flapping’[RR9].
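The multiple-activation protocol described above can be illustrated with a minimal sketch. The threshold of two activations, the ten-minute window and the function name `should_dispatch` are illustrative assumptions only; they are not values or practices reported by any respondent or ARC provider.

```python
from datetime import datetime, timedelta


def should_dispatch(activations, min_count=2, window=timedelta(minutes=10)):
    """Return True if at least `min_count` activations at one site fall
    within any span of length `window`.

    `activations` is a list of datetime objects for a single site.
    The default min_count and window are assumptions for illustration.
    """
    times = sorted(activations)
    start = 0
    for end in range(len(times)):
        # Move the left edge forward until the span fits inside `window`.
        while times[end] - times[start] > window:
            start += 1
        # Count the activations currently inside the window.
        if end - start + 1 >= min_count:
            return True
    return False
```

Under these assumptions, a single activation (a flapping door, for example) would not generate a response on its own, whereas two activations a few minutes apart at the same site would.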
Of course, a preferable way to verify the veracity of these types of alarm would be the capacity to be able to ‘see’ whether there is in fact a burglary underway – can intruders be seen attempting to enter or are they already inside a property, or is it simply a case of a flapping door triggering the alarm? Networked video technologies offer this capacity – when an alarm is triggered, a video operator can dial into the local video system and review what is happening in real time. Consequently, a much more informed decision can then be made about raising a response.
What this research has found is that there is a growing trend amongst retailers to begin to undertake this verification process through their SOC, often prompted by a call from an ARC: ‘our SOC gets a call from them [contracted ARC] and they have a look at the video feed and then tell them whether to call the police or not’[RR9]. For some this has produced a profound reduction in the number of false call outs: ‘Since November [2 months] we have had 500 calls and only 2 were an actual burglary – mainly in-store decorations setting off the alarm – it has shown how bad our system is and the number of false alarms we are having’[RR13].
For some, it has also significantly improved the response times of the police through increased confidence in the quality of the information: ‘this has increased the police response time significantly, once they have got a true positive – they are so used to 80% of calls being false – but once you have that visual confirmation then the response time improves significantly’[RR1].
Given this growing capacity to remotely access video feeds from retail sites and use them to review the veracity of burglar alarms, it would seem logical to make this feed available to ARCs so that they could do this without calling upon the SOC. However, only one retailer had taken this step; for the rest, significant concerns about giving third-party access to this data feed meant that they were simply not prepared to do so: ‘We don’t want to give a third party our video feed – we don’t trust anybody with our data – it’s too risky’[RR18].
Others aired similar concerns: ‘We need to look after the data and ensure it is secure – everyone is scared of GDPR and therefore giving out access makes people feel a little twitchy at the moment’[RR8]; ‘Data compliance issues mean that we have locked down our data and so will not share the feeds with [name of ARC provider]’[RR5].
So, rather than providing ARCs with this capability, many respondents were strongly considering, or were currently in the process of, doing away with their ARC services and doing it themselves:
Compared with a third-party ARC, we are quicker and faster and more dedicated. We are now in a place where we are starting to look at bringing all of our intruder alarms into here [SOC] so that we can monitor everything [RR18].
Others echoed this view, regarding the cost savings of not using an ARC and the additional functionality a SOC can provide as a winning combination: ‘ARCs feel very unidimensional – respond to alarms and that’s it – what is the value added?’ [RR8]; ‘We believe we have the skills and expertise in our own team to deliver this’[RR5]. An example of this added functionality was the capacity for some SOCs to actively respond when a burglary was underway, either by triggering pre-recorded messages or alarms, or issuing context-specific warnings in real time: ‘… has had a massive effect in terms of the amount of attempted burglaries we have had; we think we are disrupting about 70% of break-ins; the SOC staff can talk to the burglars and tell them they are being observed’[RR18].
Another was confident about the ROI for their SOC taking on alarm monitoring: ‘The ROI on ours [SOC] is 18 months, that’s just based upon binning alarm monitoring centres, and then you add on Health and Safety and Marketing opportunities and suddenly the room pays for itself in 6 months’[RR12].
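The arithmetic implied by this claim can be made explicit with a simple payback calculation. The sketch below assumes a basic payback model (set-up cost divided by monthly benefit, with no discounting); the set-up cost is a purely hypothetical figure, and only the 18-month and 6-month payback periods come from the respondent.

```python
def payback_months(setup_cost, monthly_benefit):
    """Simple payback period in months (no discounting)."""
    if monthly_benefit <= 0:
        raise ValueError("monthly_benefit must be positive")
    return setup_cost / monthly_benefit


# Hypothetical set-up cost, chosen only to make the arithmetic easy to follow.
setup_cost = 180_000.0

# An 18-month payback from alarm monitoring savings alone implies
# a monthly saving of setup_cost / 18.
alarm_saving = setup_cost / 18

# A 6-month payback implies a total monthly benefit of setup_cost / 6.
total_benefit = setup_cost / 6

# The remaining benefit (e.g. Health and Safety, Marketing) is the difference.
other_benefit = total_benefit - alarm_saving
```

Under this model, moving from an 18-month to a 6-month payback requires the additional benefits to be worth roughly twice the alarm monitoring saving, whatever the set-up cost, which gives a sense of the scale of cross-functional value the respondent was anticipating.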
In several countries, a wide range of standards and best practices have been developed which cover the design and use of ARCs, not least relatively rigorous building designs and security requirements (58). In addition, agreed protocols and working practices have been established between third-party ARC providers and police forces covering calls for a response and penalties for persistent false alarms. These have typically made the business case for retailers to use an ARC persuasive and discouraged them from establishing an in-house variant.
However, for several respondents to this research, there were concerns that this had become a ‘closed shop’, operating restrictive practices and limiting the capacity for retailers to innovate in this area. The capacity for many of the larger retailers to now monitor their stores remotely via video was seen by some as a game changer, requiring the alarm monitoring business to be re-evaluated:
I think retailers are beginning to disrupt the ARC monopoly – the big difference is that video provides the visual confirmation that an ARC cannot do. They [ARC providers] are very nervous about the possibility that they might lose business [RR8].
Others were equally strident in how they viewed the current situation:
The current ARC standards are a challenge for us, but do we really need these standards? We have written a code of practice and what the building requirements should be. Because I’m not selling this as a service to others – we are taking the risks ourselves, does it need to be bomb proof, flood proof etc. Actually, no it doesn’t, but the commercial ARCs have now seen what our capability is and are doing everything they can to try and stop this evolution [RR18].
Certainly in the UK, a number of retailers are now actively liaising with the police authorities to persuade them of their capacity to deliver ARC-like monitoring: ‘So the ARC industry hate this but we are saying to the police, we are going to cut calls to you for alarms by 98.6% just by visually verifying our alarms’[RR8]. While in a trial phase now, a considerable number of respondents to this research were monitoring these developments closely as they continue to develop their SOCs and associated business cases.
Those retailers that were utilising some form of SOC facility were generally of the view that it should be operated by staff directly employed by the business rather than third parties. Partly this was driven by security issues and partly by the nature of the role they were undertaking: ‘Our people know our stores and they have visibility to our technologies’[RR19]; ‘We want to have the talent in-house; be able to move quickly, change and adapt to things which we don’t think we could do with a third party … also have a lot of internal systems that it would be difficult to teach to outsiders’[RR13]. This view was echoed by some of the technology providers who were interviewed as part of this research:
I think SOCs should have internal staff – the difficulty with third parties is that you get what you pay for – often turnover is high, they are not very well trained, low pay role; they are thrown into a uniform and when you give them technology they don’t tend to perform very well. When you hire your own staff, you can train them yourselves, give them career pathways and things like that. You tend to get a higher level of engagement [RR15].
Of course, the debate over whether to employ in-house or third-party staff is not new and has been endlessly discussed throughout the security industry, with each side presenting benefits and challenges. While not completely unanimous, respondents to this research were largely of the view that because of the relatively broad-ranging and, in some cases, highly company-specific nature of the work being undertaken in their SOCs, internal staff were regarded as a better option.
As illustrated at various points throughout this report, video systems are increasingly being viewed as one of several data sources that retailers can access and analyse to improve their operations and business profitability. As such, the value of integrating video data with other data feeds to improve value was a readily apparent trend in those taking part in this research.
For some this was part of the growth of the Internet of Things (IoT) (59) and the value that can be had from enabling various objects to communicate, including video technologies. For others it was the way in which better decisions could be made when multiple data sources could be accessed, combined and analysed. For instance, one technology provider was looking at how RFID data could be combined with video data to offer more valuable insights into the movement of products and people. In this respect, by combining these data sources, inherent foibles present in each data source could be ameliorated to improve the efficacy of the eventual outcome.
Improving data integration can also reduce the risk of sensory overload – too many separate data points can easily overwhelm the reviewer leading to errors and disregard: ‘We need all the data to come together through a single pane of glass otherwise it won’t get used’[RR3].
As detailed earlier, several of the review-based utilisations of video technologies were premised upon its use in a confirmatory rather than initiator role, providing a mechanism to enable exceptions from other data sources to be investigated further. Indeed, without this data linking, the investigatory process would be much more time consuming and limited.
Certainly when it comes to the growing utilisation of video analytics, especially in complex environments where the likelihood of Overload Positives is high, or the risk of a False Positive is unacceptable, then combining video data with other data sources would seem a useful strategy: ‘When you get data integration with video – PoS, RFID, EAS, real time location data, various sensors in the environment – when combined with other data it can make the video analytic much more reliable’[RR15]. This was particularly apparent to those retailers that were trying to develop analytics to pre-emptively identify the signals of violent behaviour: ‘If you get this wrong it can be difficult, that’s why you need multiple inputs together to give you a real picture – you can’t have false positives with this type of scenario’[RR12].
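One way to picture this kind of integration is a rule that only raises an alert when a video analytic trigger is corroborated by at least one other data source within a short period of time. The sketch below is an illustration under stated assumptions: the source labels, the 30-second window and the function name `corroborated` are hypothetical and are not drawn from any respondent's system.

```python
from datetime import datetime, timedelta


def corroborated(video_time, other_events,
                 window=timedelta(seconds=30), min_sources=1):
    """Return True if at least `min_sources` distinct non-video sources
    reported an event within `window` of the video trigger.

    `other_events` is a list of (source_name, datetime) tuples, for
    example ("EAS", ...) or ("RFID", ...). All defaults are assumptions.
    """
    sources = {
        source
        for source, event_time in other_events
        if abs(event_time - video_time) <= window
    }
    return len(sources) >= min_sources
```

Raising `min_sources` trades fewer False Positives for a greater chance of missing genuine events, which is the kind of judgement a retailer would need to make for each use case.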
A key trend, therefore, will be technology providers working with their retail partners to ensure that, wherever possible, video data is fully integrated into the broader organisational information web to maximise impact and value.
A final key trend that emerged from this research was a growing realisation that unless the retail business fully understands how it wants to benefit from an investment in video technologies, then system designs will continue to be piecemeal, partial and poorly configured.
In many respects, this is linked to the need for somebody to take overall cross-organizational ownership of the video strategy – be a Video Tsar – to ensure that desired outcomes are matched against system requirements. This was evident from some of the responses to this research: ‘This use case is now driving the design of systems – why have moveable cameras when nobody is watching – just have lots of static cameras that enable maximum coverage; also easier to maintain than PTZs [Pan, Tilt, Zoom cameras]’ [RR7]. A video technology provider echoed this view: ‘If you don’t know what you want the system to deliver, how can you design a system in the first place?’ [RR20].
Others lamented the outcome of not adopting a systematic approach to the procurement and use of video systems: ‘… our store ceilings look like a showroom for video cameras – so many different unconnected systems; just looks a mess’[RR19]. Another respondent gave a good example of the gulf between system design and desired use case, driven by a lack of strategic oversight: ‘business now wants to use it [video system] to check on all slips and falls, but we haven’t got full [video] coverage in the stores and you can bet that a slip will happen where there isn’t a camera!’ [RR11].
Part of this process is not only understanding what the technology can deliver now and to a degree in the future, but also engaging in an educational/listening/training exercise with the rest of the business. Does the rest of the organisation understand the potential of what video technologies might deliver, and what are the future priorities of various retail functions? Some respondents had begun this journey: ‘We are inviting lots of different business functions to the SOC to see what it can do – look at the possibilities, open their eyes to the possibility’[RR8].
While future proofing is never an easy task, developing a clear and considered organisational strategy for how video technologies may be used to benefit the profitability and productivity of a business will be a key first step in ensuring that any proposed video system design is fit for purpose.
As a technology, video has long been part of the retail loss prevention ‘armoury’ although, as this research has shown, developing a clear and consistent picture of the role it is playing and how its value can be calculated, is neither obvious nor straightforward. In addition, as developments in the technology accelerate, claims of what it can deliver are becoming ever more inflated – a trend not unfamiliar to those who have worked in the industry for any period.
This study set out to develop a more overarching understanding of not only why and how retailers are utilising various types of video technology, but also explore the opportunities and pitfalls associated with video analytics. It also sought to summarise some of the key trends that can be found in the way in which this technology is operating in the retail sector. This final section brings together some of the key points unearthed by this research.
The deceptively simple question: ‘what is the role of video technologies in retailing?’ generated a plethora of use cases that enabled this research to develop a more overarching and systematic categorisation of its role. This utilisation model summarised the use cases into four modes and eight distinct functions.
While traditional interpretations of the role of video systems have largely focussed upon their capacity to detect, deter and reassure, this research has shown that within the retail environment, its role is much more expansive, incorporating a far greater range of opportunities. In addition, as the technology evolves further, its use could also grow to not only encompass more tasks, but also assist a broader range of retail functions. However, its application and use within many parts of retailing can at best be described as: piecemeal – lacking an overarching strategic objective; myopic – focussed mainly on issues of security and safety; and presumptive – largely premised on assumption rather than fact.
For the most part, this research found a general lack of a co-ordinated and cross-organisational approach to the strategic use of video technologies in retail companies. While the Loss Prevention function was typically the titular head, this was often more a consequence of historical legacy rather than a considered corporate mandate. As such, establishing why and how video technologies should be employed across a retail business to facilitate the meeting of key company goals is not easy. Too often, video investments are unnecessarily uni-dimensional, their potential poorly communicated, and insufficient attention given to maximising the value that could be derived from a more coherent integration strategy. This inevitably leads to technological overlap, redundancy and under performance.
It seems clear, therefore, that retail organisations should actively anoint a ‘video technology tsar’ to positively and proactively lead on the current and future use of these systems. Their role would involve at least five interrelated activities. First, develop a clear and co-ordinated strategy for a pan-organisational utilisation of video technologies. Secondly, ensure that the business speaks with ‘one voice’ to avoid duplication of effort and investment. Thirdly, ensure that all parts of the business not only understand the potential of what video systems may offer, but also actively facilitate their access to them. Fourthly, establish clear parameters and methodologies for how the value of investing in video technologies can be measured and understood. Finally, take full accountability for maximising the ROI for any and all video technologies procured by the business.
While the Loss Prevention function will no doubt remain the dominant user of video systems within a retail business for the foreseeable future, it does not necessarily follow that they must take on this role. The move towards greater use of IP-based video systems and the value and importance of system integration, may mean that the IT function increasingly takes a much more involved role and could therefore take on this leadership responsibility. They may well adopt a more dispassionate approach to what it could be used for and who owns it, as well as ensure greater connectivity. No doubt organisational culture, localised specialisms and historical responsibilities will all need to be considered when making this decision, but the key is that the role of Video Tsar is established, recognised and empowered.
As detailed earlier in this report, video technologies have been, and largely continue to be, focussed primarily on issues relating to safety and security. These topics probably account for over 90% of the utilisation of current video systems installed in retail environments. While this is unlikely to change much in the foreseeable future, what is likely is that the overall use of video technologies will grow.
This will be influenced by three factors. First, the retail context is likely to encourage more utilisation and not less. Pressures brought about by growing competition, rising costs (particularly labour) and shrinking margins will see retailers looking to a range of technologies to meet these challenges. Secondly, the applicability of video technologies will further grow. This can be seen in the way in which retail developments such as self-scan technologies have created a new set of risk challenges that may be ameliorated by the application of video technologies and analytics. Finally, the growing capability of video technologies will mean that they can begin to be utilised in ways that were not previously considered possible or appropriate. For instance, the growing networkification and centralisation of retail video systems is enabling innovation in the way in which the problems of burglary and violence may be addressed.
While the presence of video systems in retailing is now almost ubiquitous, it is a technology that can be difficult to justify purely in terms of a definitively identifiable ROI. Part of the challenge is the intangibility of some of the desired outcomes of using these systems, such as customer and staff reassurance and the deterrence of crime. It can be hard to put a concrete monetary ‘value’ on what these are worth, and in some respects, it may not be desirable to try. However, this research has singularly struggled to identify many retail companies that have developed a systematic approach to capturing the various ways in which their investments in video systems have secured value. Too often, the approach is piecemeal, partial and incomplete, driven in part by a lack of strategic oversight as detailed earlier, but also by the disparate ownership of various systems within a retail business.
In addition, it is driven by a lack of a coherent understanding of the overall purpose of any given investment in video technologies. When the use case is wrapped in blurry and imprecise expectations, such as reducing crime and detecting persistent offenders, then it should come as no surprise that the key performance indicators (KPIs) are equally fuzzy and unclear. More encouragingly, it is possible to identify a range of KPIs that can be measured to begin to assess the overall contribution of various video systems – the report earlier highlighted this in relation to incidents of fraudulent Health and Safety claims and reductions in corporate insurance premiums.
Without a clear cross-functional plan to consolidate the various ways in which video generates value to an organisation, the route to making a persuasive business case for future investment will continue to be challenging, undermining opportunities for further utilisation, innovation and integration.
As with numerous technologies swirling around the retail world, it is often easy to be beguiled by the purveyors of promises – offering enticing pathways in the never-ending search for the elusive silver bullet that will deliver the ultimate solution. But as we have seen time and time again, these pathways can all too often lead the traveller up the Peak of Inflated Expectations, before quickly sending them into the Trough of Disillusionment (60). It is of course easy to understand why those offering any new technology will invariably be enthusiastic cheerleaders – championing the positives and downplaying the negatives – their job is ultimately to sell it.
However, respondents to this research were certainly familiar with much of the hyperbole that is often associated with the use of video analytics, with one eloquently describing it as the ‘Emperor’s New Clothes’. While this is probably unfair, there was certainly a high degree of scepticism about the applicability and scalability of many of the analytics currently being offered to retailers.
What seems clear thus far is that the capacity of many video analytics to deliver value is significantly affected by three inter-connected factors: contextual complexity, clarity of purpose and operational management. As the environment within which the analytic is required to operate becomes more complex, it becomes far more difficult to ensure that the system does not become overwhelmed by False Positives. Similarly, unless the purpose of the analytic can be clearly defined, the system can suffer from a surfeit of Overload Positives.
Moreover, operationalising the scaling of video analytics is challenging – if a traditional video system could be regarded as an axe, then a video analytic is much more equivalent to a scalpel. It requires precise calibration and control in the way it is installed and used, heavily affected by its environmental context, and easily undermined unless carefully managed and maintained.
Unless these three factors of scalability, complexity and clarity are carefully managed and monitored, there is a real danger that many video analytics could become mired in the Crying Wolf Syndrome – subject to retail ridicule, labelled a costly distraction and generating mountains of unusable data. It is therefore imperative that developers of these technologies move cautiously, responsibly and realistically.
As we have seen with other emerging technologies, there is an eventual pathway out of the Trough of Disillusionment to a Plateau of Productivity, but it might be better for all concerned if a more direct route could be navigated, avoiding this diversion, enabling developers and users to work together to develop a realistic and considered approach to the successful application of a range of video analytics in the retail environment.
Interviews were carried out with representatives from the 22 retail companies taking part in this study, primarily those with responsibility for the management of various types of video systems. Inevitably, given the historic use of video technologies to address issues of security and safety, respondents were mainly based within the Loss Prevention function.
The retail companies taking part represent some of the largest retail businesses in the world, with collective sales in 2019 of nearly €1 trillion, equivalent to approximately 12% of the total US and European retail market (61). In 2018/19, these companies operated in over 57,000 retail outlets, and the majority also had extensive online operations. In addition, interviews were carried out with representatives from five video technology providers. In total, the research is based upon nearly 30 hours of interviews. Where possible, visits were made to some of the retailers to get first-hand experience of their video systems in operation.
Those retailers that took part in the research were self-selecting – the researcher conducted an online survey, distributed widely through existing contacts, representative trade associations and social media platforms, asking respondents to describe their use of video technologies. Those completing this survey were then asked whether they would be prepared to take part in more detailed research interviews. In addition, through contacts made via the ECR Retail Loss Group and the Retail Industry Leaders Association’s Asset Protection Leaders Council in the US, additional retail companies were approached for interview.
While this study is primarily interested in the views and experiences of retail users of video systems, some technology providers were also approached for their thoughts on aspects of the research. These were selected through existing contacts, size of operation and types of technology being developed.
Throughout this report, direct quotations are provided from the transcripts of interviews carried out with a range of representatives from the companies taking part in this research. Each quotation has been given an identifying case-study number, but due to the relatively small number of companies taking part and to avoid any respondent being identified across multiple quotations, this identifier has been changed for each section of the report. So, for instance, code RR1 refers to a different company in each of the sections in the report (where interviews were carried out with multiple people present, all subsequent quotes are referred to as the same company). Where respondents have provided quantitative data, this has also been anonymised and checked with the companies that agreed to take part.
This research does not claim to be based upon a detailed and representative sample of the retail industry or of the companies offering video technologies. As such, the results from this research need to be treated with caution – they are only based upon what some companies are using and developing, and it is recognised that, given the dynamic nature of this sector, there is likely to be a wide range of other video technologies and use cases in existence that are not included.