“Audio without video is music; video without audio is surveillance”.

Audiola 4.

The Audiola SDK, an audio analysis and processing library, is now in its fourth version. It can be used on its own or in combination with smartience.


  • Audio Scene Analysis:
    • dialog spotting - automatically adjust settings for announcer, background music and dialog for clarity and consistency.
    • speaker recognition - speaker indexing by recognition. APIs for database training and query. See video below.
    • music analysis - pitch estimation, onset detection, beat detection, MFCCs and chromagrams.
  • Timewarp and Varispeed:
    • transparent, glitch-free adjustment to the pace of delivery.
    • independent control of speed, pitch and sample rate.
    • algorithms optimised for speech or music.
    • dynamic time warping (DTW) for alignment.
  • Multi-band dynamics:
    • limiter, compressor, expander, noise gate, de-esser.
    • building blocks for compliance with the Commercial Advertisement Loudness Mitigation (CALM) Act, H.R. 1084. Recent legislation makes such technology essential.
  • Surround sound tools:
    • bass management for optimum surround sound listening.
    • downmix to multiple formats.
    • virtual surround encoder.
  • A standard range of filters:
    • parametric EQ, shelving EQ, low/mid/high pass, brickwall, Hilbert, etc.
    • delay, echo, reverberation, chorus, etc.
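
As an illustration of the dynamic time warping (DTW) alignment listed above, here is a minimal Python sketch of the textbook algorithm. It is not Audiola's API or implementation: the function name `dtw_distance` is hypothetical, and the absolute-difference cost on plain numeric sequences is an assumption chosen for brevity. A real system would typically compare feature vectors such as MFCC frames.

```python
def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences (illustrative only).

    Returns (total_cost, path), where path is a list of (index_in_a, index_in_b)
    pairs describing which samples are aligned with each other.
    """
    n, m = len(a), len(b)
    inf = float("inf")

    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local distance measure
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b holds
                cost[i][j - 1],      # b advances, a holds
                cost[i - 1][j - 1],  # both advance
            )

    # Backtrack from the end to recover one optimal alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Example: the second sequence repeats one value, as a slower delivery might.
distance, alignment = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

The quadratic table keeps the sketch short. For long recordings, a band constraint around the diagonal is a common way to reduce the work.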


This video shows a system trained on 12 different speakers, each of whom is then identified from a single-word utterance, “zero”. The recording quality is deliberately varied, including both mono and stereo, to show how robust the technique is.

Audiola is under continuous development, with new features and capabilities added regularly. Please contact Uncut to discuss your requirements.