Patolojik seslerin tanısı için derin öğrenme tabanlı tıbbi karar destek sisteminin geliştirilmesi / (Record no. 200450005)

MARC details
000 -LEADER
fixed length control field	07656nam a2200433 i 4500
003 - CONTROL NUMBER IDENTIFIER
control field	TR-AnTOB
005 - DATE AND TIME OF LATEST TRANSACTION
control field	20230908001003.0
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION
fixed length control field	ta
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field	171111s2022 xxu e mmmm 00\| 0 eng d
035 ## - SYSTEM CONTROL NUMBER
System control number	(TR-AnTOB)200450005
040 ## - CATALOGING SOURCE
Original cataloging agency	TR-AnTOB
Language of cataloging	eng
Description conventions	rda
Transcribing agency	TR-AnTOB
041 0# - LANGUAGE CODE
Language code of text/sound track or separate title	Turkish
099 ## - LOCAL FREE-TEXT CALL NUMBER (OCLC)
Classification number	TEZ TOBB FBE BMM YL’22 BİG
100 1# - MAIN ENTRY--PERSONAL NAME
Personal name	Bigat, İrem
Relator term	author
9 (RLIN)	129050
245 10 - TITLE STATEMENT
Title	Patolojik seslerin tanısı için derin öğrenme tabanlı tıbbi karar destek sisteminin geliştirilmesi /
Statement of responsibility, etc.	İrem Bigat; thesis advisor Osman Eroğul.
246 13 - VARYING FORM OF TITLE
Title proper/short title	Development of a deep learning-based medical decision support system for the diagnosis of pathological voices
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
Place of production, publication, distribution, manufacture	Ankara :
Name of producer, publisher, distributor, manufacturer	TOBB ETÜ Fen Bilimleri Enstitüsü,
Date of production, publication, distribution, manufacture, or copyright notice	2022.
300 ## - PHYSICAL DESCRIPTION
Extent	xv, 95 pages :
Other physical details	illustrations ;
Dimensions	29 cm
336 ## - CONTENT TYPE
Content type term	text
Content type code	txt
Source	rdacontent
337 ## - MEDIA TYPE
Media type term	unmediated
Media type code	n
Source	rdamedia
338 ## - CARRIER TYPE
Carrier type term	volume
Carrier type code	nc
Source	rdacarrier
502 ## - DISSERTATION NOTE
Dissertation note	Thesis (Master's thesis)--TOBB ETÜ Fen Bilimleri Enstitüsü, August 2022
520 ## - SUMMARY, ETC.
Summary, etc.	Patolojik duruma bağlı olarak normal konuşma akışının bozulması, ses bozukluğu olarak bilinir. Bu nedenle, mevcut herhangi bir bozukluk, konuşma üretim sisteminin işleyişini bozar ve dolayısıyla bozuk bir ses üretir. Bazı laringeal patolojiler hayatı tehdit eder, bu nedenle ses bozukluğunun erken tespiti önemlidir. Patolojik seslerin tespitinde bir karar destek sisteminin geliştirilmesi hayati önem taşımaktadır. Patolojik seslerin belirlenmesi amacıyla seslerden çıkarılan özniteliklerin değerlendirilmesinde istatistiksel yöntemlerin grup bazında bir sonuç vermesi nedeniyle bireysel düzeyde bir cevap elde edilebilmesi amacıyla son yıllarda makine öğrenme yöntemleri araştırmacılar tarafından ilgi çekici bir konu olmuştur. Bununla birlikte makine öğrenmesinin özniteliklerin manuel çıkarılmasına ihtiyaç duyması nedeniyle optimal özniteliklerin otomatik olarak çıkarılabildiği derin öğrenme teknikleri araştırmacıların güncel araştırma konuları arasına girmiştir. Ancak henüz patolojik ses bozukluklarının tespiti alanında derin öğrenme tekniklerinin kullanımı ile ilgili az sayıda araştırma çalışması bulunmaktadır. Bu tez çalışmasında, patolojik seslerin belirlenmesi amacıyla derin öğrenme yöntemleri kullanılmıştır. Çalışmada Saarbruecken Ses Veritabanından vokal kordlardaki yapısal değişikliklerin neden olduğu organik disfoniye sebep olan patolojilere sahip hastaların ses kayıtları seçilmiştir. Bu patolojiler arasında larenjit, lökoplazi, Reinke ödemi, rekürren laringeal sinir felci, vokal kord karsinomu ve vokal kord polibi bulunmaktadır. Her bir bireyin nötr perdesinde sürekli sesli /a/ sesi kayıtları seçilmiştir. 380'i sağlıklı ve 380'i patolojik olmak üzere 760 ses kaydı kullanılmıştır. Veriler, sırasıyla %75 ve %25 örnek içeren eğitim seti ve test seti olarak ayrılmıştır. Ses sinyallerine öncelikle dalgacık gürültü giderme işlemi uygulanmıştır. 
Daha sonrasında ses sinyallerinin spektrogram görüntüleri alınarak dört farklı Evrişimsel Sinir Ağı (ESA) mimarisine girdi olarak verilmiştir. Tez kapsamında ESA mimarisi olarak GoogleNet, ResNet-50, AlexNet ve SqueezeNet çalışılmıştır. İlk aşamada patolojik seslerin belirlenmesi amacıyla seçilen Evrişimsel Sinir Ağı mimarileri kendi sınıflandırıcılarıyla birlikte kullanılmıştır. Daha sonra aynı Evrişimsel Sinir Ağı mimarileri bu kez sadece öznitelik çıkarımında kullanılmıştır ve Komşuluk Bileşen Analizi ile öznitelik seçimi yapıldıktan sonra farklı sınıflandırma algoritmalarıyla sınıflandırılarak oluşturulan modellerin performans analizleri yapılmıştır. Kullanılan sınıflandırma algoritmaları Karar ağaçları, Destek Vektör Makineleri, k-En yakın komşuluk, Ensemble ve bu çalışma için tasarlanmış bir karar ağacı yöntemidir. En başarılı performans SqueezeNet mimarisinden çıkarılan özniteliklerin Ensemble algoritması ile sınıflandırılması sonucu elde edilmiştir. Gözlemlenen bulgular, önerilen bu modelin patolojik seslerin belirlenmesinde umut verici olduğunu göstermektedir.

Summary, etc.	A voice disorder is the disruption of normal speech flow caused by a pathological condition: any such condition impairs the functioning of the speech production system and therefore produces a distorted voice. Because some laryngeal pathologies are life-threatening, early detection of voice disorders is important, and a decision support system for detecting pathological voices is needed. Statistical methods applied to features extracted from voices yield only group-level results, so in recent years machine learning methods have attracted interest as a way to obtain an individual-level answer. Machine learning, however, requires manual feature extraction, which has made deep learning techniques, in which optimal features can be extracted automatically, a current research topic. Still, only a few studies have examined deep learning for the detection of pathological voice disorders. In this thesis, deep learning methods were used to identify pathological voices. Voice recordings of patients with pathologies causing organic dysphonia due to structural changes in the vocal cords were selected from the Saarbruecken Voice Database. These pathologies were laryngitis, leukoplakia, Reinke's edema, recurrent laryngeal nerve paralysis, vocal cord carcinoma, and vocal cord polyps. For each individual, the recording of the sustained vowel /a/ at neutral pitch was selected. In total, 760 recordings were used, 380 from healthy voices and 380 from pathological voices. The data were divided into a training set and a test set containing 75% and 25% of the samples, respectively. First, wavelet denoising was applied to the voice signals.
Then, spectrogram images of the voice signals were generated and used as inputs to four different Convolutional Neural Network (CNN) architectures: GoogleNet, ResNet-50, AlexNet, and SqueezeNet. In the first stage, the selected CNN architectures were used with their own classifiers to identify the pathological voices. Subsequently, the same architectures were used only for feature extraction, and Neighborhood Component Analysis was applied for feature selection. The selected features were then classified with the following algorithms, and the performance of the resulting models was analyzed: decision trees, Support Vector Machines, k-Nearest Neighbors, Ensemble, and a decision tree method designed for this study. The best performance was obtained when the features extracted by the SqueezeNet architecture were classified with the Ensemble algorithm. The findings indicate that the proposed model is promising for the identification of pathological voices.
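The 75%/25% division of the 760 recordings described in the summary can be illustrated with a minimal sketch. The record does not say how the split was drawn, so the per-class (stratified) split, the fixed seed, and the names `stratified_split`, `labels`, `train_idx`, and `test_idx` are assumptions made for illustration only, not the thesis author's procedure.

```python
import random


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split indices into train/test sets, handling each class separately.

    Assumption: the split is stratified by class; the record only states
    the 75%/25% proportions.
    """
    rng = random.Random(seed)

    # Group the index of every recording under its class label.
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# 380 healthy and 380 pathological recordings, as stated in the summary.
labels = ["healthy"] * 380 + ["pathological"] * 380
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 570 190
```

Under these assumptions each class contributes 285 recordings to the training set and 95 to the test set.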
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term	Patolojik ses belirleme

Uncontrolled term	Derin öğrenme

Uncontrolled term	Evrişimsel sinir ağları

Uncontrolled term	GoogleNet

Uncontrolled term	ResNet-50

Uncontrolled term	AlexNet

Uncontrolled term	SqueezeNet

Uncontrolled term	Pathological voice detection

Uncontrolled term	Deep learning

Uncontrolled term	Convolutional neural networks
700 1# - ADDED ENTRY--PERSONAL NAME
Personal name	Eroğul, Osman
9 (RLIN)	126315
Relator term	advisor
710 ## - ADDED ENTRY--CORPORATE NAME
Corporate name or jurisdiction name as entry element	TOBB Ekonomi ve Teknoloji Üniversitesi.
Subordinate unit	Fen Bilimleri Enstitüsü
9 (RLIN)	77078
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Koha item type	Thesis
Source of classification or shelving scheme	Other/Generic Classification Scheme

Holdings
Withdrawn status	Lost status	Source of classification or shelving scheme	Not for loan	Collection code	Home library	Current library	Shelving location	Date acquired	Source of acquisition	Total Checkouts	Full call number	Barcode	Date last seen	Copy number	Date shelved	Koha item type
		Other/Generic Classification Scheme	Ödünç Verilemez-Tez / Not For Loan-Thesis	Tezler	Merkez Kütüphane	Merkez Kütüphane	Tez Koleksiyonu / Thesis Collection	16/09/2022	Bağış / Donation		TEZ TOBB FBE BMM YL’22 BİG	TZ01435	16/09/2022	1	16/09/2022	Thesis