Learning with Long-Tailed Noisy Labels

Dey, Sarbajit

dc.contributor.author	Dey, Sarbajit
dc.date.accessioned	2025-02-07T11:31:22Z
dc.date.available	2025-02-07T11:31:22Z
dc.date.issued	2024-06
dc.description	Dissertation under the supervision of Dr. Swagatam Das	en_US
dc.description.abstract	Deep neural networks (DNNs) have shown exceptionally good performance on a variety of tasks when trained on correctly labelled, ‘good’ datasets. These results, however, are mostly observed on datasets that are carefully controlled and precisely structured. Data obtained from real-world applications, by contrast, frequently exhibits substantial problems that are not commonly found in these ‘good’ datasets. Two biases frequently found in real-world data are: (i) long-tailed class distribution, where a small number of classes have a large number of instances while the rest have only a few, and (ii) label noise, which refers to inaccuracies and errors in the assigned data labels. When learning models are built to address only one of these biases, either the long-tailed nature of the data or the noise in the labels, their performance declines on data that has both a long-tailed distribution and noisy labels, which is very common in real-world applications. This work investigates the problem of learning from datasets with long-tailed label noise. In real-world problems such as autonomous driving, medical diagnosis, and large-scale user-generated content platforms, the data obtained frequently shows these properties. It is therefore essential to create robust learning algorithms that can address both problems at the same time. Our objective is to study and contribute to deep learning methods that handle these real-world data difficulties while remaining robust and dependable. We study the current methods for handling these learning problems, focus on their shortcomings, and try to improve on them. We propose Median of Means for centroid estimation on a clean subset of the dataset. We then use the SFA framework and semi-supervised learning for the classification task on imbalanced data with noisy labels.	en_US
dc.identifier.citation	52p.	en_US
dc.identifier.uri	http://hdl.handle.net/10263/7513
dc.language.iso	en	en_US
dc.publisher	Indian Statistical Institute, Kolkata	en_US
dc.relation.ispartofseries	MTech(CS) Dissertation;22-26
dc.subject	Classification	en_US
dc.subject	Class imbalance	en_US
dc.subject	Noisy labels	en_US
dc.subject	SFA	en_US
dc.subject	Median of Means	en_US
dc.title	Learning with Long-Tailed Noisy Labels	en_US
dc.type	Other	en_US

Files

Original bundle

Name: Sarbajit-Dey_CS2226_Mtech2024d-1.pdf
Size: 1.51 MB
Format: Adobe Portable Document Format
Description: Dissertations - M Tech (CS)

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
Description:

Collections

Dissertations - M Tech (CS)