Level: Senior undergraduates and graduate students (taught in English, with Chinese support)
Credits: 3
Course objective: Become familiar with the programming platforms, toolkits, and algorithms used for cloud
computing and data processing
Prerequisites: None

Course outline:
1. Introduction to Cloud Computing
2. File System: HDFS, GFS (Google File System)
3. MapReduce: Hadoop
4. Text Retrieval Algorithms
5. Graph Algorithms
6. MapReduce & Databases: Bigtable, Hive, Pig
7. Google App Engine & Microsoft Azure

Required textbook: Jimmy Lin and Chris Dyer. Data-Intensive Text Processing with MapReduce. Morgan & Claypool Publishers, 2010.
Reference books:
A. 雲端程式設計 入門與應用實務 (Cloud Programming: Introduction and Practical Applications)
B. Tom White. Hadoop: The Definitive Guide. O'Reilly, 2009.
C. Paper readings:
1. Jeffrey Dean and Sanjay Ghemawat. (2004) MapReduce: Simplified Data Processing on Large Clusters. Proceedings of the 6th Symposium on Operating System Design and Implementation (OSDI 2004), pages 137-150.
2. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. (2003) The Google File System. Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP 2003), pages 29-43.
3. Michael Stonebraker, Daniel Abadi, David J. DeWitt, Sam Madden, Erik Paulson, Andrew Pavlo, and Alexander Rasin. (2010) MapReduce and Parallel DBMSs: Friends or Foes? Communications of the ACM, 53(1):64-71.
4. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Michael Burrows, Tushar Chandra, Andrew Fikes, and Robert Gruber. (2006) Bigtable: A Distributed Storage System for Structured Data. Proceedings of the 7th Symposium on Operating System Design and Implementation (OSDI 2006), pages 205-218.
5. Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, and Andrew Tomkins. (2008) Pig Latin: A Not-So-Foreign Language for Data Processing. Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pages 1099-1110.

Teaching method:
Lectures with handouts, plus hands-on computer labs

Schedule:
One lecture or lab per week

Grading:
Labs & Programming Homework: 70%
LAB1: Hadoop: HDFS & HBase
LAB2: MapReduce: Text Processing
LAB3: Google App Engine: Database & Web API
LAB4: Azure: SQL Integration
Final Project: 20%
Course Participation & Quizzes: 10%