COURSE AIMS AND OBJECTIVES: This is a mathematical modelling course which serves as an introduction to its second part, "Mathematics of Search Engines". Using real-world examples (Internet search engines, protein structure, text and image analysis, etc.), the students will learn how to express different relations and problems in the language of mathematics. The main goal is to practise the process of modelling: how elementary reasoning and experimentation, combined with the relations and laws of real-world systems, lead to sophisticated mathematical models. The mathematics involved combines statistics, numerical mathematics, graph theory, and topology. The objective is reached once the students are able to use mathematics to describe a problem that is not given in the language of mathematics.
COURSE DESCRIPTION AND SYLLABUS:
1. Motivation. Examples of transformations of raw data - noise removal, dimension reduction, grouping, pattern recognition etc. Soft computing and data mining.
2. Classification. Decision trees. Entropy. An overview of classification methods.
3. Grouping. Segmentation, grouping - an introduction. Spectral graph theory and graph partitioning. K-means. Segmentation methods.
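As a rough illustration of how the entropy topic in item 2 connects to decision trees, the following Python sketch computes the Shannon entropy of a set of class labels and the entropy reduction (information gain) achieved by a candidate split. The function names and the toy spam/ham labels are assumptions made for this example only; they are not taken from the course material.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(labels, groups):
    """Entropy of `labels` minus the size-weighted entropy of `groups`.

    `groups` is a hypothetical partition of `labels`, as a decision-tree
    split on some attribute would produce.
    """
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted


# Toy data (illustrative only): two classes, evenly mixed.
labels = ["spam", "spam", "ham", "ham"]
perfect_split = [["spam", "spam"], ["ham", "ham"]]
useless_split = [["spam", "ham"], ["spam", "ham"]]

print(entropy(labels))
print(information_gain(labels, perfect_split))
print(information_gain(labels, useless_split))
```

A decision-tree learner of this kind would prefer the split with the larger information gain, here the one that separates the two classes completely.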
LITERATURE:
- S. Mitra, T. Acharya: Data Mining
- R. Baeza-Yates, B. Ribeiro-Neto: Modern Information Retrieval
- T. Hastie, R. Tibshirani, J. Friedman: The Elements of Statistical Learning: Data Mining, Inference, and Prediction
- R. Duda, P. Hart, D. Stork: Pattern Classification