Tandoğan, Sinan Erkam

İnsan sesinin ayırt edici kapasitesinin irdelenmesi / Sinan Erkam Tandoğan. - xiii, 39 pages ; 29 cm

Thesis (Master's)--TOBB ETÜ Fen Bilimleri Enstitüsü, August 2018

Biyometrik tabanlı kimlik doğrulama sistemleri yaygın olarak parolalar yerine
kullanılmaya başlamıştır. Bir mikrofon kullanılarak kolayca elde edilebileceği için
ses biyometrisi tüm biyometriler arasında daha popülerdir. Ses biyometrisinin
kullanımı gün geçtikçe artmasına rağmen konuşmacı doğrulama sistemlerinin
kapasitesi ile ilgili çalışmalar sınırlıdır. Hatta bu alandaki çalışma sonuçları birbirleri
ile çelişerek bu konudaki problemleri çözmek yerine konuşmacı sistemlerine olan
güvenin azalmasına sebep olmaktadır.
Bu nedenlerden ötürü, bu tezde, ses tabanlı kimlik doğrulama sistemlerinin diğer bir
deyişle konuşmacı doğrulama sistemlerinin kapasiteleri entropi açısından
araştırılmıştır. Bu konu üç temel başlık altında incelenmiştir. İlk olarak biyometrik
tabanlı sistemler için şimdiye kadar önerilen yöntemler detaylı bir şekilde incelenmiş
ve bu yöntemlerin ses tabanlı kimlik doğrulama sistemlerine uygun olup olmadığı da
araştırılmıştır. İkinci olarak konuşmacı doğrulama sistemlerinde kullanılan en
gelişmiş yöntemlerden bahsedilmiştir. Konuşmalardan çıkartılan özellikler, bu
özellikleri temsil etmek için kullanılan modeller ve bu modellerde kullanılan ses
tabanlı kimlik doğrulama yöntemleri ayrı ayrı incelenmiştir. Son olarak kullanılan
veri kümelerinin kişi ve süre gibi kısıtlarından dolayı açık kaynaklar kullanılarak
20000’den fazla kişiden oluşan veri kümesi oluşturulmuştur.
Kapasiteyi ölçmek için en gelişmiş konuşmacı doğrulama sistemi ile uyumlu yeni bir
yaklaşım önerilmiş ve bu yaklaşımın matematiksel alt yapısı detaylı bir şekilde
açıklanmıştır. Bu yaklaşım farklı durumlarda farklı veri kümeleri kullanılarak
incelenmiştir. Son olarak kapasite tahmini ile ilgili yeni araştırma konularından
bahsedilmiştir.

Biometric authentication systems have begun to be widely used in place of
passwords. Because voice can be captured easily with a microphone, it is more
popular than the other biometric modalities. Although the use of voice biometrics
is increasing day by day, studies on the capacity of speaker verification systems
are limited. Moreover, the results of these studies conflict with one another, which
raises doubts about the reliability of speaker verification systems instead of
resolving the open questions.
For these reasons, this thesis investigates the capacity of voice-based
authentication systems, in other words speaker verification systems, in terms of
entropy. The subject is examined under three main headings. First, the approaches
proposed so far for measuring the capacity of biometric systems are examined in
detail, and their suitability for voice-based authentication systems is assessed.
Second, the state-of-the-art methods used in speaker verification systems are
reviewed: the features extracted from speech, the models used to represent those
features, and the voice-based authentication methods built on these models are
examined separately. Third, because the datasets used in speaker verification
contain a limited number of speakers and utterances, a new dataset containing more
than 20000 speakers is created from open sources.
A new approach for measuring capacity, compatible with the state-of-the-art speaker
verification system, is proposed, and its mathematical background is explained in
detail. The approach is evaluated in different scenarios using different datasets.
Finally, new research topics in capacity estimation are discussed.

Subjects--Topical Terms:
Dissertations, Academic

Subjects--Index Terms: Konuşmacı doğrulama ; İ-vektör ; Entropi ; Karşılıklı bilgi ölçütü ; Biyometrik bilgi ; Speaker verification ; I-vector ; Entropy ; Mutual entropy ; Biometric information