By Stefan Bildea, PhD

Big data in the new market of video-first data

Big Data is one of the fastest-growing areas in IT: 90% of the world’s data has been created in just the last two years, and 80% of it is unstructured data such as photos or video.

In this new market, unstructured data from video and images demands much more attention than the popular Big Data projects centered on structured data (e.g. fraud detection or stock prediction based on historical transactions) and semi-structured data (including text data from services like Twitter and social graph data from the “Likes” of Facebook and LinkedIn).

On Facebook’s Q2 2016 earnings call, CEO Mark Zuckerberg said: “Ten years ago, most of what we shared and consumed online was text. Now it’s photos, and soon most of it will be video. We see a world that is video first with video at the heart of all our apps and services.” One year later, “video-first” has become a reality for both consumers and marketers.

Video data mining

Video data mining falls under the generic umbrella of data mining. Its challenge is to find value in video data by extracting implicit knowledge, relationships within the data, and other patterns that are not explicitly stored in video databases. More specifically, the goal is to automatically find and extract the content and structure of video, the features of moving objects, and the spatial or temporal correlations of those features, and then to discover patterns of video structure, object activities, and video events in vast amounts of video data, with no prior context about their contents.

One of the groundbreaking video analytics projects is the Deep Learning ‘Cat-face on YouTube’ project led by Google and Stanford a couple of years ago. Data scientists at Google built a neural network with one billion connections, running on 16,000 computer processors, and let it browse YouTube for the perennially popular video subject of ‘cats’. As a result, it taught itself to recognize cats by defining, without supervision, its own notion of ‘cat’. Today it is clear that the project was just a pointer to what can be done with big unstructured video and image data.

Mining Plotto’s video survey data

Plotto’s main mission is to gather actionable data that gives you deep insights about your products and your customers’ preferences and shopping experiences. With Plotto you can enrich your understanding of the science and nuance behind these insights by creating a good video-enhanced survey.

Your journey begins by building a video survey with or without associated questions in the usual text-led survey format. Plotto makes it easy to receive video feedback from large pools of contributors, potentially resulting in hours of video and thousands of answered survey questions.

Once you get good at creating surveys, you can replicate the process on the platform so that you have a constant flow of formatted data consistent throughout your research.

To get the desired value from this vast data, Plotto relies on voice-to-text applications to automatically generate transcripts, which are then analyzed using sentiment analysis to understand the overall feeling expressed in answers to the survey questions. Facial emotion analysis uses participants’ expressions to give deeper insight into their emotions. Deeper analysis of the video content using the latest approaches in video data mining will add new insights to the sentiment analysis, as well as capture reactions to watched content. At Plotto we are currently investigating applications of multimodal sentiment analysis and emotion recognition from text and visual modalities ([1][2][3][4]) in participant responses to marketing research surveys generated using the Plotto platform.
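To make the transcript step more concrete, here is a minimal sketch of how sentiment could be scored once voice-to-text has produced a transcript for each answer. It is an illustration only, not Plotto’s actual pipeline: the word lists, the function names `sentiment_score` and `question_sentiment`, and the scoring rule are all assumptions, and a production system would more likely use a trained model than a hand-made lexicon.

```python
import re
from statistics import mean

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"love", "great", "easy", "good", "like"}
NEGATIVE = {"hate", "bad", "confusing", "slow", "expensive"}


def sentiment_score(transcript: str) -> float:
    """Score a transcript in [-1, 1] as (positives - negatives) / matched words."""
    words = re.findall(r"[a-z']+", transcript.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    matched = pos + neg
    # No lexicon words found: treat the answer as neutral.
    return 0.0 if matched == 0 else (pos - neg) / matched


def question_sentiment(transcripts_by_question: dict) -> dict:
    """Average the per-answer scores for each survey question.

    Questions with no transcripts are skipped.
    """
    return {
        question: mean(sentiment_score(t) for t in transcripts)
        for question, transcripts in transcripts_by_question.items()
        if transcripts
    }


if __name__ == "__main__":
    answers = {
        "Q1": ["I love it, great and easy", "It was slow and confusing but I like the price"],
    }
    print(question_sentiment(answers))
```

Averaging per question gives one number per survey item, which is the "overall feeling" described above. A multimodal approach would combine a text score like this with signals from facial expressions rather than rely on the words alone.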

Segmenting your data

Segmentation is one of the most useful tools when analyzing data from online surveys. Going beyond the overarching goal of the survey, segmentation analysis helps your business identify opportunities for growth, target communication toward specific audiences, and reduce the cost of running multiple survey campaigns. Data inherently varies across observable metrics, and segmentation helps to explain that variation.

For example, when the HR department looks at churn risk, segmentation helps to group the employees so that the resulting split explains the variation of churn risk across different groups, thus allowing for a proactive approach to key employees at risk of leaving the company. At Plotto we envision different types of segmentation options, ranging from Custom Variables to machine-learning-based algorithms.
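The churn example can be sketched in a few lines of code. The snippet below groups records by a custom variable and measures how much of the variation in a metric the split explains, using the share of between-group variance (often called eta squared). The field names (`dept`, `tenure`, `churn_risk`), the sample records, and both function names are hypothetical, chosen only to illustrate the idea. It is not a description of how Plotto implements segmentation.

```python
from collections import defaultdict
from statistics import mean


def _group(records, key, metric):
    """Collect metric values per segment, where a segment is one value of `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[metric])
    return groups


def segment_means(records, key, metric):
    """Return the average of `metric` within each segment."""
    return {seg: mean(vals) for seg, vals in _group(records, key, metric).items()}


def variance_explained(records, key, metric):
    """Share of total variation in `metric` explained by the split on `key`.

    Returns a value between 0 (the split explains nothing) and 1
    (all variation lies between segments).
    """
    values = [r[metric] for r in records]
    grand_mean = mean(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0
    between_ss = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2
        for vals in _group(records, key, metric).values()
    )
    return between_ss / total_ss


if __name__ == "__main__":
    # Hypothetical sample data.
    employees = [
        {"dept": "A", "tenure": "new", "churn_risk": 0.2},
        {"dept": "A", "tenure": "old", "churn_risk": 0.2},
        {"dept": "B", "tenure": "new", "churn_risk": 0.8},
        {"dept": "B", "tenure": "old", "churn_risk": 0.8},
    ]
    print(segment_means(employees, "dept", "churn_risk"))
    print(variance_explained(employees, "dept", "churn_risk"))
    print(variance_explained(employees, "tenure", "churn_risk"))
```

Comparing the score across candidate variables shows which split is worth acting on: in this toy data, department separates high-risk from low-risk employees while tenure does not. Machine-learning-based segmentation would search for such groupings automatically instead of testing one custom variable at a time.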

It is this unprecedented power to bridge unstructured video data to structured insights provided via segmentation of survey responses that makes data mining with Plotto one of the most exciting approaches to the analytics of market research.

Stefan Bildea, PhD, is Co-founder and Data Scientist at Plotto. He has a background in computer science and mathematics and currently works both in academia and as a commercial Data Scientist.