Web Information Retrieval and Text Mining
Instructor: Hung-Yu Kao (高宏宇)
Email: hykao (at) mail.ncku.edu.tw
Office: 4281A Tel: 2757575 ext 62546
Office Hours: Tue. 2PM~4PM
- Basic WWW Technologies
- Mathematical Background
- Web Information Retrieval
- Review of IR
- When IR meets Web
- Advanced Crawling Techniques
- Web Characteristics Analysis
- Web Graphs
- Text Analysis
- Link Analysis
- Projects: 50% (15%, 15%, and 20%, respectively)
- This course has three coding projects. For most of them, you should propose a design, implement a working system to demonstrate, and prepare a report.
- The first two projects are technical projects; the last one leans toward research.
- The topic of the last project is open. This project must include experiments, and it also includes a presentation of results (5%).
- Paper debate: 30% (20% for the paper presentation, 10% for three rounds of challenges)
- Each team selects one topic (not a single paper) to present. While one team presents, two selected teams take the opposing side, asking questions and pointing out the weaknesses of the presented papers.
- A Final Quiz: 20%
- Participation: (extra)
Textbook & References
- "Modeling the Internet and the Web: Probabilistic Methods and Algorithms," P. Baldi, P. Frasconi, P. Smyth, Wiley, 2003
- "Search Engines: Information Retrieval in Practice," W. Bruce Croft, Donald Metzler, Trevor Strohman, Addison Wesley, 2009
- "Introduction to Information Retrieval," Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze, Cambridge University Press, 2008
- "Mining the Web: Discovering Knowledge from Hypertext Data," Soumen Chakrabarti, Morgan-Kaufmann Publishers
- "The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data," Ronen Feldman, James Sanger, Cambridge
- "Text Mining Application Programming," Manu Konchady, Thomson
- "Text Mining for Biology and Biomedicine," Sophia Ananiadou
TA: Room 4244, email@example.com