Imagine walking down the snack aisle at your grocery store. With so many options to choose from, you find yourself instinctively reaching out for a particular bag of chips. What is it about that bag of Cheetos that you like? Does the packaging look more appealing, or do the chips taste relatively better than the others in the aisle? Do you look at the calorie count? Perhaps you’ve been hooked on Lizzo’s “Good As Hell” music video, and watching her open a bag of Cheetos got you craving one?

Brands realize that a purchase is sometimes driven not by rationality but by a visual or emotional bias, which is why billions of dollars are spent each year on marketing. The growing field of neuromarketing studies how instincts play into our subconscious decision-making by capturing biometric signals such as brain activity, skin response, eye movements (attention), and blood flow.

This data could potentially empower brands to make informed decisions that increase their ROI on advertisements, lift conversion rates, and optimize engagement strategy. It comes as no surprise that giants like PepsiCo, Microsoft, Paramount, and Unilever started investing in these technologies.

AI-Informed Marketing

Artificial Intelligence (AI) is another field that has proved its mettle in the marketing space. AI models based on deep learning (built on neural networks, which loosely mimic how the human brain processes information) are designed to analyze large amounts of data, derive insights into consumer behavior, and optimize campaigns to support better decision-making. In particular, Video AI based on attention mechanisms aims to build white-box frameworks that make models more interpretable, which helps stakeholders build trust in the system.

This blog post looks at how one such neuromarketing tool—eye tracking—works, how it can be integrated into Video AI to build interpretable models, and how this can be a game changer for marketers and brands.

What is Eye Tracking?

Eye tracking is a technique that identifies and monitors a person’s visual attention in terms of location, objects, and duration. This is made possible by measuring eye movements, eye positions, and points of gaze. Here’s an example (without sound).

Eye Tracking Example


This is a recording of an eye-tracking experiment conducted on a YouTube video. The visualization used to represent the eye movements, called a ‘gaze plot’, reveals the time sequence of the subject’s gaze: where they were looking and for how long. The numbers in the red circles denote the order in which regions of the video were looked at. Time spent looking, commonly expressed as fixation duration, is shown by the diameter of the fixation circles: the longer the look, the larger the circle.
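To make the gaze plot idea concrete, here is a minimal Python sketch of how fixation data could be turned into numbered circles. It is an illustration only, not the tool used in the recording: the `Fixation` record, the `gaze_plot_circles` function, and the pixel limits for the diameter are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    """One fixation from an eye tracker (hypothetical record layout)."""
    x: float            # horizontal gaze position in pixels
    y: float            # vertical gaze position in pixels
    start_ms: float     # time at which the fixation began
    duration_ms: float  # how long the gaze stayed at this point


def gaze_plot_circles(fixations, min_diameter=10.0, max_diameter=60.0):
    """Return one (order, x, y, diameter) tuple per fixation, in viewing order.

    The diameter limits are arbitrary assumptions for this sketch.
    """
    if not fixations:
        return []
    # The number inside each circle is the order in which regions were viewed.
    ordered = sorted(fixations, key=lambda f: f.start_ms)
    longest = max(f.duration_ms for f in ordered)
    circles = []
    for order, fixation in enumerate(ordered, start=1):
        # The longer the look, the larger the circle (linear scaling).
        scale = fixation.duration_ms / longest if longest > 0 else 0.0
        diameter = min_diameter + scale * (max_diameter - min_diameter)
        circles.append((order, fixation.x, fixation.y, diameter))
    return circles


if __name__ == "__main__":
    sample = [
        Fixation(x=320, y=180, start_ms=500, duration_ms=200),
        Fixation(x=100, y=90, start_ms=0, duration_ms=400),
    ]
    for circle in gaze_plot_circles(sample):
        print(circle)
```

A plotting library could then draw each tuple as a labeled circle on top of the matching video frame.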

Such visualizations can be used to provide unique insights into what really catches our attention and what information our brain processes, which is then used to understand what drives behavior, decision-making, and emotions.

The Video AI + Eye Tracking Combo

Eye tracking can be used as a tool to describe how the human brain processes input information. In the example above, the eye movements being tracked would visually represent what we understood of the video. If we were to train an AI model to summarize what the video was about, historically we wouldn’t be able to clearly explain why the model came up with a particular description.

Deep learning has often faced flak for being uninterpretable. As we begin to rely more on AI to assist in everyday tasks, from simple ones like autocompleting text in messages and getting driving directions to more complicated ones like self-driving vehicles, there is a need to develop trust in the system that calls the shots. This trust is built when users feel empowered by knowing how the AI system arrived at its decision.

Video AI attempts to address this concern by using frameworks that make AI models more interpretable from the human point of view, with attention mechanisms providing a way to visualize what the model focuses on. Empowering the model with interpretability and validating the model using eye tracking can help develop the next generation of intelligent systems that mimic human behavior more accurately.

For example, an attention mechanism can highlight the video regions (as heatmaps) that the AI model uses to make predictions on a video sequence. The same video can then be shown to a group of subjects while an eye tracker records their eye movements, which can also be visualized as heatmaps. Both heatmap sets (model and human) can then be compared, and the comparison can inform decisions about things like a marketing campaign as well as improvements to the Video AI model. The great benefit of AI with attention mechanisms is that it reduces the need to recruit subjects for behavioral experiments, which can be very time-consuming.
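As a rough illustration of the comparison step, the Python sketch below builds a human gaze heatmap from fixations and scores it against a model attention map. It makes several assumptions that are not from any specific system: fixations are `(x, y, duration)` tuples in grid coordinates, each fixation is spread with a Gaussian of width `sigma`, and similarity is measured with the Pearson correlation coefficient, which is one common choice among several. The function names and the tiny attention grid are hypothetical.

```python
import math


def gaze_heatmap(fixations, width, height, sigma=1.5):
    """Build a height x width grid from (x, y, duration) fixations.

    Each fixation adds a Gaussian blob weighted by its duration.
    The sigma value is an arbitrary assumption for this sketch.
    """
    grid = [[0.0] * width for _ in range(height)]
    for fx, fy, duration in fixations:
        for row in range(height):
            for col in range(width):
                dist_sq = (col - fx) ** 2 + (row - fy) ** 2
                grid[row][col] += duration * math.exp(-dist_sq / (2 * sigma ** 2))
    return grid


def pearson_similarity(map_a, map_b):
    """Pearson correlation between two equally sized 2D heatmaps."""
    a = [value for row in map_a for value in row]
    b = [value for row in map_b for value in row]
    if len(a) != len(b):
        raise ValueError("heatmaps must have the same number of cells")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat map carries no spatial information
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    # Hypothetical 5x5 model attention map; a real one would come from the model.
    model_attention = [
        [0.0, 0.0, 0.1, 0.0, 0.0],
        [0.0, 0.2, 0.5, 0.2, 0.0],
        [0.1, 0.5, 1.0, 0.5, 0.1],
        [0.0, 0.2, 0.5, 0.2, 0.0],
        [0.0, 0.0, 0.1, 0.0, 0.0],
    ]
    human = gaze_heatmap([(2, 2, 300), (3, 2, 120)], width=5, height=5)
    print(pearson_similarity(model_attention, human))
```

A score near 1 would suggest the model attends to the same regions as the viewers, while a low score flags frames where the model and the audience are looking at different things.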

Eye Tracking in Entertainment and Beyond

The adoption of eye-tracking technologies in the entertainment industry results in more interactive and customizable products. In gaming, eye tracking aids in creating a more immersive experience in which natural eye movements can be used in gameplay as a passive or active input in conjunction with keyboards, mice, and gamepads.

Another field beginning to experiment with this technology is filmmaking, where eye tracking enables directors and producers to explore cinematic and aesthetic choices, giving them a unique look into what an audience is watching in a film.

Other fields that may benefit from the use of eye-tracking technology are automation and biomedical devices. As the demand for fully autonomous automobiles increases, eye-tracking devices are being incorporated into driver-assist systems to detect drowsy and distracted drivers. Eye tracking is also being used as an assistive technology that enables people with conditions such as ALS, cerebral palsy, and muscular dystrophy to interact with the world. Users who are unable to control their body movement or speak can now communicate, write, operate a computer, play video games, and control appliances simply by using their eyes.

Brands and marketers can make use of this technology to understand what grabs audience attention, what influences purchase behavior, and how consumers engage with their product by measuring visual attention to key messages in product advertisements, placement and branding, package design, and more.

What is BEN Doing with It?

BEN serves clients by predicting and continuously improving the success of brand integration campaigns across influencer marketing, streaming, TV, and film. Using human-driven data from eye-tracking experiments, we are conducting R&D on new interpretable Video AI algorithms and models with attention techniques to advance our proprietary AI technology toward the next generation of explainable Video AI systems.

With that data, we can predict and measure core results and understand why content performs the way it does. This new technology will play an important role in answering cornerstone questions such as what makes a brand integration successful, what is the best way to engage an audience for increased ROI, and which integrations create emotional connections with audiences that you can’t get from traditional marketing channels.

Discovering behavioral data-driven answers to these questions will usher in the next era of brand marketing through more authentic and seamless experiences.