COS 470: Introduction to Information Retrieval

TTh 12:30-1:45 PM - Payson Smith 41

Instructor: Behrooz Mansouri

Office Hours: TTh 10:00-11:00 AM

Course Topics

An introduction to the theories and implementation techniques used in modern search engines. As part of the course, students will develop their own search engines using available information retrieval (IR) toolkits. Topics in the course include user interfaces for information retrieval systems, search result evaluation, text processing including natural language processing, retrieval models (e.g., Boolean, vector space, probabilistic, and learning-based methods such as neural information retrieval), and ethical dilemmas regarding the use of IR systems in society.

Course Materials

Learning Outcomes

Lectures

Session | Topic | Notes
1 | Introduction to IR | -
2 | Python Refresher | Code
2.5 | Python Refresher (Cont.) | -
3 | Boolean Retrieval | -
4 | Tokenization & Stemming | Huffman Coding
5 | Indexing & Retrieval | -
6 | Evaluation Measures | -
7 | Course Project Review Part 1 | Invited Speaker: Daniel Lawrence (Library Specialist)
8 | Term Weighting and Vector Space | -
9 | Constructing Test Collections | Invited Speaker: Michael D. Ekstrand
10 | Probabilistic Models | -
11 | Language Models | -
12 | Project Presentations | -
13 | PyTerrier Tutorial | -
14 | Query Expansion | -
15 | Learning to Rank | -
16 | Ranking (Cont.) | Ranx
17 | Classification and Clustering | -
18 | Query Log Analysis | Invited Speaker: Jian Wu
19 | Project Part II Presentations | -
20 | Neural Information Retrieval (I) | -
21 | Neural Information Retrieval (II) | Reading Assignment
22 | Hugging Face | Project Part III
23 | Math Information Retrieval | -
24 | Telling Stories through Timelines | Invited Speaker: Ricardo Campos
25 | A Few Remaining Topics | -
26 | Review Class | Optional attendance

Assignments

Deadline | Topic | Notes
Sep 15 | Assignment 1 | -
Sep 29 | Assignment 2 | -
Oct 25 | Assignment 3 | -
Nov 17 | Assignment 4 | Reading Assignment
Dec 6 | Assignment 5 | -

Projects

Deadline | Topic | Notes
Oct 6 | Project - Part 1 | -
Nov 10 | Project - Part 2 | -
Dec 13 | Project - Part 3 | Sample code provided!