In this paper we describe how to extract structured consumer data from Twitter, such as hobbies, favorite brands and venues, and combine it with consumers' musical tastes using semantic search and machine learning. The goal is to build a Target Audience API for music platforms, to be purchased by online agencies that want to maximize their CTR by targeting the right audiences, parameterized by the musical tastes related to their target.
We believe that more and more media distribution platforms will foster an ecosystem of third-party products related to ads and data. We are reaching out to SoundCloud at an early stage to present our idea of an independent Target Audience API, to be deployed later this year or in 2016. Our goal is to secure access to future releases of a possible Ads API from SoundCloud and thus be able to sell premium advertising services to our clients worldwide.
We have the software infrastructure in place to mine, classify and generate music-related target audiences. Furthermore, we work with 40 online agencies in LATAM and Spain that already use our social media tools and are the ideal users for running a pilot with SoundCloud.
What follows are the nuts and bolts of how such an API will work for media distribution platforms, and in particular for SoundCloud.
Emulating Facebook's successful customer data stratification
The huge success of Facebook ads is largely attributed to its 1 billion users alone. However, a commonly overlooked fact is how advertisers are able to slice and dice Facebook user data, picking the most fitting audience for each brand or product. The wealth of consumer information the platform provides is only possible because of the structured data Facebook gathers from each person over time. Besides the classic “like” button, a typical user post may also contain various fields such as location/venue, the people she or he is with and even emotional disposition (the “I'm feeling …” option).
In this paper we briefly sketch an approach to emulate Facebook's data mining strategy on Twitter, a far more open platform for mining data. Furthermore, we explain the possibilities of using this data for optimal ad placement on music distribution platforms such as Berlin-based SoundCloud.
The fundamental idea: Deep Profile on Twitter
When we ask who the relevant consumers for a specific brand are, we want to know each individual at the deepest possible level. By looking at every tweet she has made public, our goal is to continuously produce a deep profile of each individual. Ideally this detailed profile of each consumer evolves every time she posts new information. Furthermore, we will give as much structured meaning as possible to otherwise loose and unstructured public information about her. Let us look at an example from a fictional Twitter user, @MarJules79:
Figure 1 – two fictional tweets from a Twitter user.
What can we tell from the two tweets above? From the first, that she carries insulin, and from the second, that she wants to visit Greece. For all her relevant tweets, we want to store these indicators in a deep profile so that we can retrieve them whenever we need them. In this case we would record that @MarJules79 mentioned #Insulin and #Mykonos.
In order to produce deep profiles, we will store the whole timeline of a group of people. It is very important to grasp this distinction: years of work have conditioned us to monitor brand mentions, and this is fundamentally different, because we will be looking at the complete timelines of our audience.
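A minimal sketch of how such indicators could be collected, assuming a hand-made lexicon that maps keywords to indicators. The tweet texts, the `LEXICON` table and the `update_profile` function are hypothetical illustrations; they are neither the content of Figure 1 nor our production code, where semantic search would take the place of plain keyword matching.

```python
import re
from collections import defaultdict

# Hypothetical lexicon: keyword found in a tweet -> indicator stored in the profile.
LEXICON = {
    "insulin": "#Insulin",
    "mykonos": "#Mykonos",
}

# Deep profiles: user handle -> set of indicators seen so far.
profiles = defaultdict(set)

def update_profile(user, tweet_text):
    """Add every indicator whose keyword appears in the tweet to the user's profile."""
    words = set(re.findall(r"[a-z0-9']+", tweet_text.lower()))
    for keyword, indicator in LEXICON.items():
        if keyword in words:
            profiles[user].add(indicator)
    return profiles[user]

# Invented tweets, used only to exercise the sketch.
update_profile("@MarJules79", "Never leave home without my insulin pen")
update_profile("@MarJules79", "Dreaming of a summer in Mykonos!")
```

Because the profile is a set keyed by user, it simply grows as new tweets arrive, which matches the idea of a profile that evolves with every post.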
How do we link consumers' timelines and music?
Hundreds of thousands of music fans share their favorite songs on Twitter. In January 2015 alone, the hashtag #soundCloud received 2.4 million mentions (source: Buzzmonitor via Twitter GNIP). Below we see a typical sharing window from the SoundCloud platform.
Figure 2 – Sharing a song from SoundCloud on Twitter.
By tracking the hashtag #soundCloud we are able to generate a database of Twitter users and classify them according to songs, artists and music genres. Things start to get interesting when we select a group of people from this database (say, people who shared independent bands from California) and gather part of their Twitter timelines, extracting attributes such as brands they mentioned, venues they checked in at, TV shows, movies, sports, hobbies and whatever else you might be interested in. We end up with a big table of users listing their musical tastes according to SoundCloud plus their profile attributes: preferences, places and more.
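The resulting table can be pictured with a small sketch. It assumes the two inputs have already been reduced to (user, value) pairs; the sample users, the pair format and the `build_user_table` function are hypothetical and only illustrate the join between music shares and timeline attributes.

```python
# Hypothetical input 1: (user, genre) pairs derived from #soundCloud shares.
music_shares = [
    ("@ana", "jazz"),
    ("@ana", "indie"),
    ("@bruno", "rock"),
]

# Hypothetical input 2: (user, attribute) pairs extracted from timelines.
timeline_attributes = [
    ("@ana", "yoga"),
    ("@ana", "organic food"),
    ("@bruno", "football"),
    ("@carla", "cinema"),  # never shared a song, so she is left out of the table
]

def build_user_table(shares, attributes):
    """Return {user: {"music": set, "attributes": set}} for users who shared music."""
    table = {}
    for user, genre in shares:
        row = table.setdefault(user, {"music": set(), "attributes": set()})
        row["music"].add(genre)
    for user, attribute in attributes:
        # Only users already known from their music shares get attributes attached.
        if user in table:
            table[user]["attributes"].add(attribute)
    return table

user_table = build_user_table(music_shares, timeline_attributes)
```

Keeping only users who shared music reflects the selection step described above: the timeline is gathered for people already present in the music database.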
A machine learning API for targeting the right audience: the magic finally happens
Once we have a big table of users filled with their musical tastes and other general preferences and attributes, we are ready to create a mapping of which song or musical genre goes together with a given set of attributes. A variety of algorithms, such as basket analysis, are available to calculate affinity between data points. Using one of these off-the-shelf algorithms we can find out, for example, that women listening to jazz and independent Californian bands are likely to mention yoga and organic food. This is a valuable piece of knowledge for the likes of Whole Foods or Trader Joe's, for instance, if SoundCloud allows them to target those specific users.
Our team is currently working on an API that wraps up all the concepts above to work together with media distribution platforms such as SoundCloud. This API is intended to be as loosely coupled as possible to any given platform in order to minimize the cost of adoption. The basic idea is to send ad placement requests parameterized by music genre, artist or song. The music platform would then place the ads accordingly. We have access to the entire Twitter Firehose since 2006, as well as the infrastructure to mine, classify and generate music-related target audiences.
The battle to optimize CTR in online advertising has become more and more sophisticated. We believe media distribution platforms will have a competitive advantage if they are able to offer better deals (namely better CTR) to advertisers. Using machine learning and external social data may be one of the answers to this challenge.
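As an illustration of what such an affinity calculation looks like, the sketch below computes the three standard basket analysis measures (support, confidence and lift) for a rule such as "jazz -> yoga". The toy rows and the `affinity` function are invented for this example and are an assumption about one possible choice of algorithm, not a description of the one we deploy.

```python
def affinity(rows, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Each row is the set of items (genres and attributes) observed for one user.
    """
    n = len(rows)
    both = sum(1 for r in rows if antecedent <= r and consequent <= r)
    ante = sum(1 for r in rows if antecedent <= r)
    cons = sum(1 for r in rows if consequent <= r)
    if n == 0 or ante == 0 or cons == 0:
        return 0.0, 0.0, 0.0
    support = both / n            # share of users showing both sides of the rule
    confidence = both / ante      # share of antecedent users who also show the consequent
    lift = confidence / (cons / n)  # above 1 means the two co-occur more than by chance
    return support, confidence, lift

# Invented toy data: one set of items per user.
rows = [
    {"jazz", "yoga", "organic food"},
    {"jazz", "yoga"},
    {"rock", "football"},
    {"jazz", "football"},
]
support, confidence, lift = affinity(rows, {"jazz"}, {"yoga"})
```

A lift above 1 for a genre and an attribute is the kind of signal an advertiser would use to pick the genre as a targeting parameter.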
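To make the idea of a parameterized request concrete, here is a sketch of what a placement payload might look like. Everything in it is an assumption for illustration: the `AdPlacementRequest` class, its field names and the JSON layout are hypothetical, since no Ads API has been published by SoundCloud and our own interface is still being designed.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class AdPlacementRequest:
    """Hypothetical payload sent to the music platform; all field names are assumptions."""
    campaign_id: str
    genre: Optional[str] = None
    artist: Optional[str] = None
    song: Optional[str] = None

    def to_json(self) -> str:
        # At least one music parameter is needed to define the audience.
        if not (self.genre or self.artist or self.song):
            raise ValueError("specify a genre, an artist or a song")
        # Drop unset fields so the payload only carries what was requested.
        payload = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(payload, sort_keys=True)

# Example: ask the platform to place a campaign next to jazz content.
request_body = AdPlacementRequest(campaign_id="demo-001", genre="jazz").to_json()
```

Restricting the payload to a campaign identifier and music parameters is one way to keep the coupling loose: the platform needs no knowledge of the Twitter-side profiles that motivated the request.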
 Basket Analysis
Jairson Vitorino, PhD. @jvitorino CTO at E.life Group