This course introduces some of the major issues and solutions in natural language processing. Both traditional rule-based context-free models and modern corpus-based quantitative techniques will be discussed. Selected natural language applications will also be introduced. Concepts taught in class will be reinforced by hands-on practical exercises.
Prerequisite: LT2231 Introduction to Language Technology / CS2311 Computer Programming / MS3111 Quantitative Business Analysis with Visual Basic for Applications / CS2360 Java Programming / IS2240 Python Programming for Business.
Semesters Taught: Fall 2022, Spring 2023.
Course Information: Syllabus
# | Lecture | Slides | Notebook | Reading | Further resources |
---|---|---|---|---|---|
1 | Tokenisation | slides | colab | slp2, nltk1, nltk2 | |
2 | Part-of-speech tagging | slides | colab | slp8, nltk5 | |
3 | N-gram models | slides | colab | slp3.1-3.2, nltk2:2.4 | |
4 | Context-free grammars | slides | colab | slp12.1-12.4, nltk8:1-3, stabler_ch1 | |
5 | Parsing | slides | colab | slp13.2, nltk8:4, stabler_ch2,4,5 | |
6 | Naive Bayes | slides | colab | slp4, nltk6:1,3,5 | |
7 | Logistic regression | slides | colab | slp5 | |
8 | Feedforward neural networks | slides | colab | slp7.1-7.3 | |
9 | Computational graph and backpropagation | slides | colab | slp7.6 | Stanford course material on backprop, Khan Academy: understanding derivatives |
10 | Word embeddings | slides | colab | slp6 | |
11 | Feedforward neural networks with embeddings, PyTorch | slides | colab | slp7.4-7.5 | PyTorch tutorial |
12 | Recurrent neural networks | slides | colab | slp9.1-9.4 | blog about RNN |
13 | Attention and transformers | slides | colab | slp9.7-9.7.1 | illustrated transformer |
This course introduces the basic properties of natural language and how they relate to other aspects of human cognition. Upon completing the course, students will be able to identify and analyse the properties of human language in general, a defining characteristic of human beings.
Prerequisite: N/A
Semesters Taught: Fall 2022, 2023, 2024
Course Information: Syllabus
# | Lecture | Slides | Reading | Further resources |
---|---|---|---|---|
1 | Introduction | slides | | |
2 | Phonetics | slides | Diel et al. (2004) | Praat, Praat tutorial, type IPA, interactive sagittal section |
3 | Phonology | slides | | sound features |
4 | Morphology | slides | | tree generator |
5 | Syntax | slides | | |
6 | Semantics | slides | Bemis & Pylkkänen (2011) | |
7 | Pragmatics | slides | | |
8 | Language Acquisition | slides | Saffran et al. (1996), Yang et al. (2004), Pinker & Prince (1988), Rumelhart & McClelland (1986) | corpus: CHILDES |
9 | Psycho/Neurolinguistics | slides | textbook: Kemmerer 2015; Hickok and Poeppel (2007), Kutas and Hillyard (1980), Ding et al. (2005), Mesgarani et al. (2014), Huth et al. (2016) | |
10 | Computational Linguistics | slides | textbooks: Jurafsky and Martin, Bird et al. | colab, Intro to Python |
11 | Sociolinguistics | slides | | |
Linguistic concepts often take ranges of discrete values that can be coded as numerical variables and then examined with statistical tests. This course enables students to recast linguistic problems as numerical problems, to calculate statistical measures, and to interpret these measures in linguistic terms so that they answer the original linguistic problem.
Prerequisite: N/A
Semesters Taught: Spring 2023.
Course Information: Syllabus
# | Lecture | Slides |
---|---|---|
1 | Statistical problems, R tutorial | slides |
2 | Sampling | slides |
3 | Descriptive statistics | slides |
4 | Hypothesis testing | slides |
5 | t-tests | slides |
6 | ANOVA I | slides |
7 | ANOVA II | slides |
8 | Simple linear regression | slides |
9 | Correlation | slides |
10 | Multiple regression | slides |
11 | Logistic regression | slides |
12 | Linear mixed-effects models | slides |