OuluVS2: a multi-view audiovisual database


4/25/2016 - Database available for downloading

5/17/2016 - Workshop "Multi-view lip-reading/audio-visual challenges" accepted by ACCV'16

8/11/2016 - Workshop paper submission opened

The OuluVS2 audiovisual database was collected at the Center of Machine Vision Research, Department of Computer Science and Engineering, University of Oulu, Finland. It was designed to facilitate research on visual speech recognition, sometimes also referred to as automatic lip-reading.

The database contains video recordings of 52 subjects speaking three types of utterances: continuous digit strings, short phrases and TIMIT sentences. Because speech motion is produced in three-dimensional space, we placed six cameras around the speaker to film from five different views simultaneously, resulting in more than 20k video recordings.

Download Instructions

This database is FREE for public research. To download it, you need to sign a license agreement (click here to download the agreement) and send a scanned copy of the signed agreement to Dr. Ziheng Zhou, whose email can be found below. An account will then be created for you, and an HTTP link for downloading the database will be sent to you.