Case Overview

A media client annotated and clustered digital ads using RNN-based natural language descriptions.

The media client wanted a generic solution to categorize all creatives used by different businesses. This would help them understand the brand positioning, target groups and opportunities of their clients and competitors. They sought an OCR-like capability to extract text from the creatives. With the explosion of digital marketing, processing digital ads manually had become impossible, so they were looking for a technology-driven solution.

Right tags for better results

A creative tagging system is a powerful, scalable and effective way to explain the overall messaging of a creative. We used an attention representation to reflect how the human visual system focuses on parts of an image.

  • Reduced human effort of tagging creatives by 80%
  • Improved the number of creatives processed/day by 4x (further scalable)
  • Improved tagging with better descriptions, as confirmed by qualitative validation of the output

The tagging algorithm needs only a small number of parameters, which keeps its computational complexity low.

Lack of scalability & standardization

Display ads must be categorized for brand safety and sensitivity. Advertisers are required to assign their ads to categories such as suggestive, violent or deceptive. Because advertisers do not always have a clear understanding of these categories, creatives are labelled manually, which can reduce the effectiveness and reach of the ad. Manual labelling is also limited in terms of taxonomy.

A major issue with digital ads is that they come in multiple resolutions, with high variation in fonts, colours and layouts. Text is integrated into the creative, so any automated tagging exercise requires extracting both the textual and the visual attributes from an image. We proposed a recurrent neural network (RNN) based approach with attention representation to generate natural text tags.
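To make the attention idea concrete, here is a minimal sketch of one soft-attention step, in which the RNN's current hidden state weights a set of image-region feature vectors before the next word is generated. The function names, the dot-product scoring and the tiny hand-written vectors are assumptions chosen for brevity; the case study does not specify how the deployed model scored regions.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(region_features, hidden_state):
    """One soft-attention step (hypothetical helper).

    region_features: list of feature vectors, one per image region.
    hidden_state: the RNN hidden state, same length as each feature vector.
    Returns the attention weights and the weighted context vector.
    """
    # Assumption: dot-product relevance score between each region and the state.
    scores = [sum(f * h for f, h in zip(region, hidden_state))
              for region in region_features]
    weights = softmax(scores)
    dim = len(hidden_state)
    # The context vector is the attention-weighted average of the regions.
    context = [sum(w * region[d] for w, region in zip(weights, region_features))
               for d in range(dim)]
    return weights, context

# Illustrative values only: three regions with two-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
hidden = [2.0, 0.0]
weights, context = attend(regions, hidden)
print(weights, context)
```

In a captioning model, the context vector would be fed to the RNN together with the previous word, so each generated word can draw on a different part of the creative.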

Improved results with minimal human involvement

The solution we deployed produced richer textual descriptions of creatives, improving the tagging accuracy.

When stemming is applied to the captions generated by the algorithm, words that differ only in tense or form are mapped to the same stem, so the distinctive terms in each caption are better expressed. The solution provided the media agency with the following benefits:

a. Improve the efficiency of manual categorization by suggesting a list of possible tags from which editors can choose the best categories

b. Use the solution to backfill the categories of ads whose classification was not completed

The features extracted from the creatives are used as a channel of attributes in matching algorithms for ad selection.
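The stemming step and the tag suggestions described above can be sketched as follows. This is an illustration under stated assumptions, not the deployed pipeline: `toy_stem` is a hypothetical suffix stripper standing in for an established stemmer such as Porter's, and the stop-word list, the captions and the `suggest_tags` helper are invented for the example.

```python
from collections import Counter

# Assumption: a toy suffix list; a real system would use an established stemmer.
SUFFIXES = ("ing", "ed", "es", "s")
STOPWORDS = {"a", "an", "the", "is", "are", "on", "in", "of", "with", "and"}

def toy_stem(word):
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def suggest_tags(captions, top_k=3):
    """Rank stemmed caption words by frequency and return the top_k as tags."""
    counts = Counter()
    for caption in captions:
        for token in caption.lower().split():
            token = token.strip(".,!?")
            if token and token not in STOPWORDS:
                counts[toy_stem(token)] += 1
    return [stem for stem, _ in counts.most_common(top_k)]

# Hypothetical captions, as a captioning model might generate them.
captions = [
    "a woman holding a phone",
    "woman holds phone and smiles",
    "a man walking with a phone",
]
print(suggest_tags(captions))
```

Because "holding" and "holds" reduce to the same stem, they count as one candidate tag rather than two, which is what lets an editor choose from a short, de-duplicated list.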