Chibelushi, C.C., Mason, J.S.D. and Deravi, F. (1997) Feature-level data fusion for bimodal person recognition. In: Sixth IEE Int. Conf. on Image Processing and its Applications, 14-17 July 1997, Dublin, Ireland.
Full text not available from this repository.

Abstract or description
Consistently high person recognition accuracy is difficult to attain using a single recognition modality. This paper assesses the fusion of voice and outer lip-margin features for person identification. Feature fusion is investigated in the form of audio-visual feature vector concatenation, principal component analysis (PCA), and linear discriminant analysis (LDA). The paper shows that, under mismatched test and training conditions, audio-visual feature fusion is equivalent to an effective increase in the signal-to-noise ratio of the audio signal. Audio-visual feature vector concatenation is shown to be an effective method for feature combination, and linear discriminant analysis is shown to possess the capability of packing discriminating audio-visual information into fewer coefficients than principal component analysis. The paper reveals a high sensitivity of bimodal person identification to a mismatch between the training noise conditions of the LDA or PCA feature-fusion module and those of the speaker models. Such a mismatch leads to worse identification accuracy than unimodal identification.
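The three fusion schemes named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature dimensions, the synthetic Gaussian data, the number of retained coefficients, and all function names (`concatenate_features`, `pca_fit`, `lda_fit`, `project`) are assumptions chosen purely for illustration, and the LDA shown is the textbook Fisher formulation.

```python
import numpy as np

def concatenate_features(audio, visual):
    """Feature-level fusion by stacking per-sample audio and visual vectors."""
    return np.hstack([audio, visual])

def pca_fit(X, k):
    """Return the mean and the top-k principal directions of X (rows = samples)."""
    mean = X.mean(axis=0)
    # Right singular vectors of the centred data are the principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k].T

def lda_fit(X, y, k):
    """Return the mean and top-k Fisher discriminant directions (k <= classes - 1)."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Eigenvectors of pinv(Sw) @ Sb; pinv guards against a singular Sw.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(evals.real)[::-1][:k]
    return mean, evecs[:, order].real

def project(X, mean, W):
    """Project centred samples onto the columns of W."""
    return (X - mean) @ W

# Synthetic stand-in data (assumed sizes): 3 speakers, 50 samples each.
rng = np.random.default_rng(0)
n_speakers, n_per, d_audio, d_visual = 3, 50, 12, 6
y = np.repeat(np.arange(n_speakers), n_per)
audio = rng.normal(size=(len(y), d_audio)) + y[:, None]
visual = rng.normal(size=(len(y), d_visual)) + 0.5 * y[:, None]

fused = concatenate_features(audio, visual)       # concatenation
pca_mean, pca_W = pca_fit(fused, k=4)             # PCA on the fused vectors
lda_mean, lda_W = lda_fit(fused, y, k=2)          # LDA on the fused vectors
fused_pca = project(fused, pca_mean, pca_W)
fused_lda = project(fused, lda_mean, lda_W)
```

In this sketch LDA can retain at most one fewer coefficient than the number of speakers, whereas PCA can retain up to the full fused dimension. The training-condition sensitivity reported in the abstract would correspond to fitting `pca_fit` or `lda_fit` on data recorded under different noise conditions from those used to train the speaker models.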
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Faculty: | Previous Faculty of Computing, Engineering and Sciences > Computing |
Event Title: | Sixth IEE Int. Conf. on Image Processing and its Applications |
Event Location: | Dublin, Ireland |
Event Dates: | 14-17 July 1997 |
Depositing User: | Claude CHIBELUSHI |
Date Deposited: | 12 May 2013 20:36 |
Last Modified: | 24 Feb 2023 13:38 |
URI: | https://eprints.staffs.ac.uk/id/eprint/1116 |