A US-based media agency commissioned a platform to monitor keyword trends using text analytics (NLP)
The US-based media agency aimed to bring the concept of high-frequency share trading to the keyword bidding process. It wanted to identify trending keywords from user posts on social platforms, breaking news on news websites, products with high search volumes on popular e-commerce platforms, and web search trends on search engines. It would then identify and bid for the keywords most likely to attract more views for its clients’ digital ads.
Real-time holds the key
The media agency wanted a (near) real-time platform that could process the enormous amount of data, run keyword association and topic modeling, and present the resulting trends.
- Predicted views for the next three hours using historical volumes
- Predicted, on average, eight of the twenty trending topics on Twitter
- Developed a web-based platform combining the prediction engine and reporting
The solution predicted trends at T+1, T+2 and T+3 with a high degree of confidence, based on the historical chatter volume up to time T.
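To make the idea of forecasting T+1, T+2 and T+3 from volumes up to time T concrete, here is a minimal sketch. The case study does not name the model that was used, so the least-squares line below is only an illustrative stand-in, and `forecast_next_hours` is a hypothetical helper name.

```python
from typing import List, Sequence


def forecast_next_hours(volumes: Sequence[float], horizon: int = 3) -> List[float]:
    """Extrapolate a least-squares line fitted to hourly chatter volumes.

    Assumption: a linear trend is used purely for illustration; the
    case study does not say which forecasting model was deployed.
    """
    n = len(volumes)
    if n < 2:
        raise ValueError("need at least two historical points")

    # Time indices 0..n-1, where index n-1 is the current time T.
    mean_x = (n - 1) / 2
    mean_y = sum(volumes) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(volumes))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Index n-1+h corresponds to T+h; clamp at zero because a volume
    # cannot be negative.
    return [max(0.0, intercept + slope * (n - 1 + h)) for h in range(1, horizon + 1)]


if __name__ == "__main__":
    # Made-up hourly mention counts for one keyword, ending at time T.
    history = [100, 120, 140, 160]
    print(forecast_next_hours(history))
```

A production system would more likely use a model that handles seasonality and sudden bursts, but the interface (history up to T in, three future points out) stays the same.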
High volume, high velocity challenge
The media client had conceptualised an innovative solution for serving its clients more effectively. The idea was to ride the wave of popular keywords and place relevant ads so that the number of views is maximized. The data sources (social platforms, news websites, e-commerce platforms and web search) were chosen to cover almost every channel where trends surface. The solution was conceived as a web-based platform for ease of access.
There were multiple challenges in turning the idea into reality. The first was the huge amount of data that had to be processed in real time. The second was to run text-analytics algorithms over this volume of data to extract keywords, perform topic modeling and group similar keywords together, all in (near) real time.
Springboard for greater business outcomes
We designed a solution that could process more than 6 GB of data per minute and extract trend and topic insights with a lag of less than 5 minutes.
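To put those figures in perspective, the short calculation below converts the quoted lower bound of 6 GB per minute into other units. It assumes decimal units (1 GB = 1000 MB) and a sustained rate, neither of which the case study states; `throughput_summary` is a hypothetical helper.

```python
def throughput_summary(gb_per_minute: float = 6, max_lag_minutes: float = 5) -> dict:
    """Convert a per-minute data rate into other units.

    Assumptions: decimal units (1 GB = 1000 MB, 1 TB = 1000 GB) and a
    constant rate around the clock.
    """
    return {
        "mb_per_second": gb_per_minute * 1000 / 60,
        "gb_per_hour": gb_per_minute * 60,
        "tb_per_day": gb_per_minute * 60 * 24 / 1000,
        # Data that arrives within one lag window at this rate.
        "gb_within_lag_window": gb_per_minute * max_lag_minutes,
    }


if __name__ == "__main__":
    for name, value in throughput_summary().items():
        print(f"{name}: {value}")
```

At the quoted rate this works out to roughly 100 MB per second, and about 30 GB arriving within a single 5-minute lag window.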
The solution was integrated into a platform and delivered the following results:
a. Predicted Twitter trends at the state level, capturing eight out of twenty keywords correctly on average
b. Captured Twitter trends at least 90 minutes before they appeared in ‘Twitter Trends’
c. Achieved an accuracy of ~60% across ten defined categories of keywords
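The "eight out of twenty" figure can be read as a hit rate: the share of the actual trending topics that also appeared among the predictions. The sketch below shows one way to compute it. The exact scoring method is not described in the case study, so the case-insensitive exact match used here is an assumption, and the keywords are placeholders rather than real data.

```python
from typing import Iterable


def trend_hit_rate(predicted: Iterable[str], actual: Iterable[str]) -> float:
    """Fraction of actual trending topics that were also predicted.

    Assumption: topics match on case-insensitive exact equality; the
    case study does not describe its matching rule.
    """
    predicted_set = {k.strip().lower() for k in predicted}
    actual_set = {k.strip().lower() for k in actual}
    if not actual_set:
        raise ValueError("actual trend list is empty")
    return len(predicted_set & actual_set) / len(actual_set)


if __name__ == "__main__":
    # Placeholder keywords, not real data: 20 predictions and 20 actual
    # trends that overlap on kw12..kw19 (8 topics).
    predicted = [f"kw{i}" for i in range(20)]
    actual = [f"kw{i}" for i in range(12, 32)]
    print(trend_hit_rate(predicted, actual))
```

With eight of twenty actual trends matched, the hit rate is 0.4, which is the average level the platform reported.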
The media agency is now working on integrating the volume and bid price for digital ads.