Derin öğrenme ile görüntülerde kararlı öznitelik eşleme tekniklerinin geliştirilmesi / (Record no. 200464073)

MARC details
000 -LEADER
fixed length control field	09950nam a2200445 i 4500
001 - CONTROL NUMBER
control field	200464073
003 - CONTROL NUMBER IDENTIFIER
control field	TR-AnTOB
005 - DATE AND TIME OF LATEST TRANSACTION
control field	20250310104729.0
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION
fixed length control field	ta
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field	171111s2024 xxu e mmmm 00| 0 eng d
035 ## - SYSTEM CONTROL NUMBER
System control number	(TR-AnTOB)200464073
040 ## - CATALOGING SOURCE
Original cataloging agency	TR-AnTOB
Language of cataloging	eng
Description conventions	rda
Transcribing agency	TR-AnTOB
041 0# - LANGUAGE CODE
Language code of text/sound track or separate title	Turkish
099 ## - LOCAL FREE-TEXT CALL NUMBER (OCLC)
Classification number	TEZ TOBB FBE BİL Ph.D’24 AYD
100 1# - MAIN ENTRY--PERSONAL NAME
Personal name	Aydoğdu, Muhammet Fatih
Relator term	author
9 (RLIN)	71116
245 10 - TITLE STATEMENT
Title	Derin öğrenme ile görüntülerde kararlı öznitelik eşleme tekniklerinin geliştirilmesi /
Statement of responsibility, etc.	Muhammet Fatih Aydoğdu; thesis advisor Fatih Demirci.
246 13 - VARYING FORM OF TITLE
Title proper/short title	Development of robust feature matching techniques in images using deep learning
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
Place of production, publication, distribution, manufacture	Ankara :
Name of producer, publisher, distributor, manufacturer	TOBB ETÜ Fen Bilimleri Enstitüsü,
Date of production, publication, distribution, manufacture, or copyright notice	2024.
300 ## - PHYSICAL DESCRIPTION
Extent	xxii, 88 pages :
Other physical details	illustrations ;
Dimensions	29 cm
336 ## - CONTENT TYPE
Content type term	text
Content type code	txt
Source	rdacontent
337 ## - MEDIA TYPE
Media type term	unmediated
Media type code	n
Source	rdamedia
338 ## - CARRIER TYPE
Carrier type term	volume
Carrier type code	nc
Source	rdacarrier
502 ## - DISSERTATION NOTE
Dissertation note	Thesis (Doctoral)--TOBB ETÜ Fen Bilimleri Enstitüsü, November 2024
520 ## - SUMMARY, ETC.
Summary, etc.	Bilgisayarlı görü teknikleri görüntüler üzerinde tespit edilen öznitelik noktalarından yaygın bir şekilde yararlanmaktadırlar. Bu öznitelik noktaları kullanılarak görüntü çiftleri arasında yararlı tutarlılıklar tespit edilebilir. Benzerlikler kullanılarak görüntü eşleştirme, nesne tanıma, görüntü dikişleme, görüntü mozaiği oluşturma ve nesne takibi gibi birçok uygulama için başarılı ürünler elde edilebilir. Görüntü çiftleri üzerinde tespit edilen öznitelik noktalarının eşleştirilmesi sırasında noktaların öznitelik uzayında birbirlerine göre uzaklıkları temel alınır. Öznitelik uzayında birbirlerine en yakın olan özniteliklerin yakınlıkları eğer yeterince özgünse bu öznitelikler varsayılan eşleşmeler olarak kabul edilir. Ancak bu varsayılan eşleşmeler çoğu zaman hatalı eşleşmeleri tümüyle saf dışı bırakamaz. Bunun için literatürde tekrarlamalı algoritmalardan faydalanarak en çok sayıda varsayılan eşleşmeyi içerecek şekilde bir geometrik tutarlılık elde edilmeye çalışılır. Elde edilen geometrik tutarlılık sayesinde varsayılan eşleşmelerde bulunan hatalı eşleşmeler elenir. Bu yöntemdeki temel sorun tüm görüntü çiftlerinde başarıya götürecek bir tekrarlama sayısının elde edilmesinin pratikteki imkansızlığıdır. Derin öğrenme yöntemlerinin literatürde birçok problemde alternatiflerine göre daha etkin sonuçlar elde etmesinden sonra çoğu bilgisayarlı görü ve görüntü işleme probleminde olduğu gibi öznitelik eşleştirme problemi için de derin öğrenme ile eğitilmiş yapay sinir ağları kullanan çözümler literatüre yerleşmişlerdir. Bu tezde, kararlı öznitelik eşleme yapan derin öğrenme ağları ile bağıl kamera pozisyonu tahmini problemine çözümler geliştirilmiştir. Öncelikli olarak literatürdeki çalışmaların hepsine temel oluşturan n-n'lik çatı incelenmiştir. n-n'lik bu çatının varsayılan eşlemelerdeki özniteliklerin koordinatlarından oluşan bir küme tipi girdi üzerinde çalıştığı gözlemlenmiştir. 
n-n'lik çatıya ait girdiyi işleyen literatürdeki çalışmaların varsayılan öznitelik eşleşmelerinde genel bağlam ve yerel bağlam çıkarırken karşılaştığı zorluklar incelenmiştir. n-n'lik çatıya alternatif olarak 1-1'lik alternatif bir çatı oluşturulmuştur. n-n'lik çatıda her bir yığın örneğinde tek bir görüntü çiftine ait veri bulunurken öne sürülen 1-1'lik çatıda her bir yığında sadece tek bir görüntü çiftine ait veri bulunmaktadır. 1-1'lik çatıdaki görüntü çiftine ait her bir varsayılan eşleşme için özel bir bağlam kanalının kullanılmasına imkan sağlanmaktadır. Dahası bu bağlam kanalındaki girdi satırların her bir varsayılan eşleşme için özel olarak sıralanabilmesi mümkündür. Tez kapsamında 1-1'lik çatı kullanan çok sayıda ve farklı tipte ağ katmanları içeren yapay sinir ağları oluşturulmuştur. Oluşturulan 1-1'lik yapay sinir ağları ve literatürdeki n-n'lik başarılı sinir ağları Tensor İşlem Birimleri üzerinde eğitilmişlerdir. Eğitimlerde mimarilere ait hesapsal grafiklerdeki parametreler güncellenirken birden fazla kayıp fonksiyonunun birleşiminden oluşan bileşke kayıp fonksiyonundan faydalanılmıştır. Başarım metriği olarak minimum ortalama hassasiyet ölçütü temel alınmıştır. Elde edilen sonuçlara göre 1-1'lik çatı için oluşturulan yapay sinir ağları literatürdeki n-n'lik yapay sinir ağlarının biri hariç tümünün başarımlarını %30'a varan farklar ile geride bırakmıştır. Ayrıca tezin asıl konusu üzerinde çalışmaya başlamadan önce derin öğrenme kullanarak yapay sinir ağlarının eğitilmesine aşinalığı arttırmak için bir durum çalışması yapılmıştır. Bunun için portre fotoğrafları üzerinden yaş sınıfı tahmini yapan yapay sinir ağları geliştirilmiştir. Kullanılan veri setindeki portreler 6 sınıfa ayrılmıştır. Literatürdeki 6 katmanlı bir yapay sinir ağının 18 ve 34 katmanlı artık sinir ağlarına göre daha başarılı olduğu gözlemlenmiştir. 
Kullanılan artık sinir ağlarının 6 sınıflı yaş tahmini problemi için aşırı öğrenmeye sebep olacak kadar derin olduğu veya veri setinin yeterince zengin olmadığı sonucuna varılmıştır.

Summary, etc.	Computer vision techniques make wide use of feature points detected in images. Using these feature points, useful consistencies can be detected between pairs of images. By leveraging these similarities, successful results can be achieved in many applications, such as image matching, object recognition, image stitching, image mosaicking, and object tracking. Feature points detected in a pair of images are matched based on their distances to one another in feature space. If the features nearest to each other in feature space are sufficiently distinctive in their proximity, they are accepted as putative matches. However, putative matches often fail to exclude all incorrect matches. To address this, the literature uses iterative algorithms that seek a geometric consistency covering the largest number of putative matches. This geometric consistency is then used to eliminate the incorrect matches among the putative matches. The main issue with this approach is that it is practically impossible to find a number of iterations that leads to success on all image pairs. After deep learning methods achieved more effective results than their alternatives on many problems in the literature, solutions based on artificial neural networks trained with deep learning became established for the feature matching problem, as they did for most computer vision and image processing problems. In this thesis, solutions to the relative camera pose estimation problem have been developed using deep learning networks that perform robust feature matching. First, the n-to-n framework, which forms the basis of all studies in the literature, was examined. This n-to-n framework was observed to operate on a set-type input consisting of the coordinates of the features in the putative matches. 
The challenges that studies in the literature using this framework face when extracting global and local context from putative feature matches were analyzed. As an alternative to the n-to-n framework, a one-to-one framework was proposed. In the n-to-n framework, each sample in a batch contains data from a single image pair, whereas in the proposed one-to-one framework an entire batch contains data from only a single image pair. The one-to-one framework allows a dedicated context channel to be used for each putative match of the image pair. Moreover, the input rows in this context channel can be sorted specifically for each putative match. Within the scope of the thesis, numerous artificial neural networks that use the one-to-one framework and contain different types of network layers were built. The one-to-one networks, along with the successful n-to-n networks from the literature, were trained on Tensor Processing Units (TPUs). During training, a composite loss function combining multiple loss functions was used to update the parameters. The minimum average precision was used as the performance metric. According to the results, the artificial neural networks designed for the one-to-one framework outperformed all but one of the n-to-n neural networks from the literature, by margins of up to 30%. Additionally, before work on the main topic of the thesis began, a case study was conducted to build familiarity with training artificial neural networks using deep learning. For this purpose, artificial neural networks were developed to estimate age classes from portrait photographs. The portraits in the dataset were divided into six classes. A 6-layer artificial neural network from the literature was observed to perform better than 18-layer and 34-layer residual neural networks. 
It was concluded that the 18-layer and 34-layer residual networks might have been unnecessarily deep for the 6-class age estimation problem, leading to overfitting, or that the dataset lacked sufficient diversity to clearly demonstrate the advantages of deeper networks.
653 ## - INDEX TERM--UNCONTROLLED
Uncontrolled term	Derin öğrenme

Uncontrolled term	Bağıl kamera pozisyonu tahmini

Uncontrolled term	Tensor işleme üniteleri

Uncontrolled term	Öznitelik eşleştirme

Uncontrolled term	Derin öğrenme ile yaş sınıfı tahmini

Uncontrolled term	Deep learning

Uncontrolled term	Relative camera pose estimation

Uncontrolled term	Feature matching

Uncontrolled term	Tensor processing units

Uncontrolled term	Age class estimation using deep learning
700 1# - ADDED ENTRY--PERSONAL NAME
Personal name	Demirci, Fatih
9 (RLIN)	39931
Relator term	advisor
710 ## - ADDED ENTRY--CORPORATE NAME
Corporate name or jurisdiction name as entry element	TOBB Ekonomi ve Teknoloji Üniversitesi.
Subordinate unit	Fen Bilimleri Enstitüsü
9 (RLIN)	77078
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Koha item type	Thesis
Source of classification or shelving scheme	Other/Generic Classification Scheme

Holdings
Withdrawn status	Lost status	Source of classification or shelving scheme	Not for loan	Collection code	Home library	Current library	Shelving location	Date acquired	Source of acquisition	Total Checkouts	Full call number	Barcode	Date last seen	Copy number	Date shelved	Koha item type
		Other/Generic Classification Scheme	Ödünç Verilemez-Tez / Not For Loan-Thesis	Tezler	Merkez Kütüphane	Merkez Kütüphane	Tez Koleksiyonu / Thesis Collection	10/03/2025	Bağış / Donation		TEZ TOBB FBE BİL Ph.D’24 AYD	TZ01793	10/03/2025	1	10/03/2025	Thesis