Media Analysis
Collection Processing Manager
The Collection Processing Manager (CPM) manages the entire workflow for the analysis modules. Its main tasks are to receive new essence messages from the Essence Management Store, organise them into jobs, and choose the proper workflow for each job depending on the type of the essence. The CPM provides SOAP web services for communication with the analysis modules and the Essence Management Store.
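The routing step described above can be illustrated with a minimal sketch. It is not the CPM implementation: the class names, the `create_job` function and the essence type strings are hypothetical, and the mapping from essence type to workflow is an assumption based only on the statement that videos go through the TV workflow and single images through the press workflow.

```python
from dataclasses import dataclass

# Assumed mapping from essence type to workflow name (illustrative only).
WORKFLOW_BY_ESSENCE_TYPE = {
    "image": "press",
    "video": "tv",
}


@dataclass
class EssenceMessage:
    """Hypothetical shape of a message from the Essence Management Store."""
    essence_id: str
    essence_type: str  # assumed values: "image" or "video"


@dataclass
class Job:
    """Hypothetical job record: one essence plus its selected workflow."""
    essence_id: str
    workflow: str


def create_job(message: EssenceMessage) -> Job:
    """Wrap an essence message in a job and select its workflow by type."""
    workflow = WORKFLOW_BY_ESSENCE_TYPE.get(message.essence_type)
    if workflow is None:
        raise ValueError(f"no workflow for essence type {message.essence_type!r}")
    return Job(essence_id=message.essence_id, workflow=workflow)
```

In the real system this selection sits behind the SOAP web services; the sketch only shows the decision itself.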
Press Workflow
Optical Character Recognition (OCR) is used to recognise text in advertisements. Text appears both as artificial overlays in the images or videos and on real-world objects. There are two different OCR modules, one for videos in the TV workflow and one for single images in the press and internet workflow. The two modules differ mainly in the media type they process; the same technology and algorithms are used for both tasks.

The Text Analysis module analyses text transcripts coming from the OCR and ASR modules and identifies instances of entities in the MEPCO ontology, in particular Advertiser and Product. The initial prototype works for English documents only.

The Image Fingerprinting component tries to find existing creatives that are identical or similar to the investigated spot. This analysis relies on visual features, such as colour similarity, and is divided into two steps: similarity search and exact matching. The similarity search is a fast comparison of the input spot with the whole creative database and returns an ordered list of the most similar images. The exact matching step uses the results of the similarity search to determine which images from the creative database are identical to the input spot.

The Creative Detector combines the results of the different analysis modules in the press workflow. To do so, it extracts the relevant information from all available analysis modules belonging to an investigated spot. This information is then weighted according to the quality (precision/recall) of each module before it is combined into a merged analysis result.

The AdComparer is used for manual ad comparison in the press workflow, taking the results of the Creative Detector as input. The user can clarify ambiguous results using an auto-suggestion mode, and can verify or overrule automatic classifications. The output of this component is a revised version of the output of the Creative Detector.
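The weighting step of the Creative Detector can be sketched as follows. The text states only that module results are weighted by module quality (precision/recall) before merging; using the F1 score as the weight and summing weighted confidences per candidate are assumptions made for illustration, and all function and module names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def f1_weight(precision: float, recall: float) -> float:
    """Assumed quality weight: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def merge_results(
    module_results: Dict[str, List[Tuple[str, float]]],
    module_quality: Dict[str, Tuple[float, float]],
) -> List[Tuple[str, float]]:
    """Combine per-module (candidate, confidence) lists into one ranking.

    module_quality maps a module name to its (precision, recall).
    """
    scores: Dict[str, float] = defaultdict(float)
    for module, candidates in module_results.items():
        precision, recall = module_quality[module]
        weight = f1_weight(precision, recall)
        for candidate, confidence in candidates:
            # Assumed merge rule: add up the weighted confidences.
            scores[candidate] += weight * confidence
    # Highest merged score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A candidate reported by several modules accumulates score from each of them, so agreement between modules raises its rank, while a low-quality module contributes little even at high confidence.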
TV Workflow
The Audio Segmentation module creates a temporal segmentation of an audio file into the classes “speech”, “music”, “other” (sounds), “silence”, and all possible mixtures of “speech”, “music” and “other”. The result is a valid MPEG-7 document containing timecode and classification terms for each segment.

The Jingle Recognition module tries to recognise previously learnt reference jingles in the audio data of a TV spot. If a jingle is found, a segment containing the timecode and the associated brand metadata is added to the MPEG-7 description from the Audio Segmentation input. The reference jingles are learnt by extracting a fingerprint from jingle audio files. These fingerprints are stored in a jingle database. The metadata is supplied by MPEG-7 descriptions of the jingles.

The Word Spotting modules report when a word or concept occurs within the speech parts of television commercials (speech in isolation and in music), together with a confidence measure. The Speech to Text modules produce full text transcripts, with word timing information, of those parts of television commercials that contain speech (speech in isolation and in music). Speech-to-text is currently available for Dutch and English.

Logo Recognition is performed to find out to which product or company an advertisement belongs. The logo recognition module works with a number of previously learnt logos (one or several images per logo) and tries to recognise these logos in a video. Learnt logos can be recognised anywhere in an image, at different sizes, rotated, with minor colour changes, and under different illumination conditions and perspective distortions.

Video Fingerprinting tries to find existing creatives that are identical or similar to the investigated spot. For this task the input spot is first fragmented into shots using a simple hard-cut shot detection. From each shot the first frame is used as the keyframe, unless it is a black frame. Each keyframe of the input spot is then matched against all the keyframes of the known creatives using the similarity search of the Image Fingerprinting module. The results for all input keyframes are then merged, and the shot structure of the input spot is matched against the shot structure of the best-matching creatives from the database to find out whether they are identical or similar to the input spot.

The OCR, Text Analysis and Creative Detector components are similar to the ones used in the press workflow, only adapted for video input.
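The first two steps of Video Fingerprinting, hard-cut shot detection and keyframe selection, can be sketched as below. This is an illustration under stated assumptions, not the module's actual algorithm: frames are modelled as flat lists of 8-bit grayscale values, a cut is assumed wherever the mean absolute pixel difference between consecutive frames exceeds a threshold, and when the first frame of a shot is black the sketch falls back to the next non-black frame in that shot (the text does not say what happens in that case). All names and threshold values are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[int]  # flat list of 8-bit grayscale pixel values (assumed)


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def is_black(frame: Frame, max_mean: float = 10.0) -> bool:
    """Treat a frame as black if its mean brightness is very low."""
    return sum(frame) / len(frame) <= max_mean


def detect_shots(frames: List[Frame], cut_threshold: float = 50.0) -> List[List[int]]:
    """Split frame indices into shots at assumed hard cuts."""
    if not frames:
        return []
    shots = [[0]]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > cut_threshold:
            shots.append([i])  # large jump: start a new shot
        else:
            shots[-1].append(i)
    return shots


def select_keyframes(frames: List[Frame], shots: List[List[int]]) -> List[int]:
    """Pick the first non-black frame of each shot (assumed fallback rule)."""
    keyframes = []
    for shot in shots:
        for index in shot:
            if not is_black(frames[index]):
                keyframes.append(index)
                break  # shots that are entirely black yield no keyframe
    return keyframes
```

The selected keyframes would then be passed to the similarity search of the Image Fingerprinting module, which is outside the scope of this sketch.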


