PUTTING NEWS ARTICLES IN CONTEXT

Date

2020-07

Publisher

Indian Statistical Institute, Kolkata

Abstract

This dissertation addresses the Background Linking task of the TREC News Track. Given a news article, the task is to suggest other news articles that provide context and background for it. Context and background are highly subjective notions; here they are measured by comparing the documents a system retrieves against a set of documents already marked relevant by a panel of experts. All experiments use the Washington Post dataset, a collection of 591,537 news articles that appeared in the Washington Post from 2012 to 2017. We explore six methods for this task, based on standard Information Retrieval methods and Natural Language Processing techniques. We compare them with each other and against the best-performing methods. Java is the main programming language, used for data parsing, indexing, and searching; Python is also used for data exploration in a limited number of cases.

Description

Dissertation under the supervision of Dr. Mandar Mitra, Indian Statistical Institute, Kolkata.

Keywords

verbose queries, query expansion, word embedding

Citation

41p.
