Course Title: 自然語言處理實作 (Natural Language Processing Lab)
Course number: CS 563200 (3 credits)
Instructor: Jason S. Chang 張俊盛

The course consists of a set of small exercises in natural language processing based on a statistical
approach. The purpose is to give students an opportunity to work with real problems and data in natural
language processing. Each session will start with an explanation of the background, the experimental
data, and snippets of code. Students are required to do the assignment in class. The instructor and
teaching assistants will be on hand to help. The topics planned for Fall 2006 are as follows.

Topics
1. Introduction
2. Using FoxPro for NLP programming
3. Keyword Extraction and Statistically Improbable Phrases (SIP)
4. Corpus and text processing: sentence splitting and word counting
5. Take-home exercise
6. Parameter estimation for N-gram models
7. Bag Translation using an N-gram model
8. Collocation Extraction and Log Likelihood Ratio
9. Edit Distance and Dynamic Programming
10. Sentence Alignment Experiment (English-Chinese parallel text)
11. Word Alignment (WA) Via IBM Model 1
12. Class-based WA using WordNet Lexicographer Files
13. Word Translation Disambiguation (WTD)
14. Class-based WTD using WordNet Lexicographer Files
15. Machine transliteration or Text Categorization
16. Bilingual Concordance