COS 470: Natural Language Processing

MW 14:00-15:30

Instructor: Behrooz Mansouri

Student Hours:
M 15:30-16:30
T 11:00-12:00

Course Topics

This course introduces computational linguistics, also known as natural language processing (NLP). It provides a theoretical foundation and hands-on, lab-style practice in computational approaches to processing natural language text. We will discuss problems involving different components of the language system, such as meaning in context and linguistic structure. Students will collaborate in teams to model and implement natural language processing and digital text solutions using Python and a variety of relevant tools. We will begin with machine learning methods for NLP and core NLP tasks such as language modeling, part-of-speech tagging, and parsing. We will also discuss applications such as information extraction, machine translation, text generation, and automatic summarization.

Course Materials

Learning Outcomes


Session Topic Notes
1 Introduction -
2 Python Refresher -
3 Regular Expressions -
4 Tokenization -
5 Language Models -


1 Tokenization -


This course is project-based, with no final exam. Students can find the project description here.


1 Python Refresher -
2 Regular Expressions -