COURSE OBJECTIVES:
1. modelling the evolution of biological sequences: multiple sequence alignment; hidden Markov models for multiple sequence alignment and associated topics,
2. protein structure analysis: protein folds, secondary structure elements, distance matrices; structure alignment,
3. phylogenetic analysis: the UPGMA and neighbour-joining algorithms,
4. clustering and classification: k-means clustering and choosing the number of clusters k; elements of machine learning for biological sequence classification, in particular support vector machines.
COURSE CONTENT:
1. modelling the evolution of biological sequences: multiple sequence alignment; hidden Markov models for multiple sequence alignment and associated topics,
2. protein structure analysis: protein folds, secondary structure elements, distance matrices; structure alignment,
3. phylogenetic analysis: the UPGMA and neighbour-joining algorithms,
4. clustering and classification: k-means clustering and choosing the number of clusters k; elements of machine learning for biological sequence classification, in particular support vector machines.
- Biological Sequence Analysis, Durbin, R. et al., Cambridge University Press.
- Statistical Methods in Bioinformatics: An Introduction, Ewens, W.J., Grant, G.R., Springer, 2001.
- Cross-Validatory Choice and Assessment of Statistical Predictions, Stone, M., Journal of the Royal Statistical Society, Series B (Methodological), Vol. 36, No. 2 (1974).
- Weighing the Odds: A Course in Probability and Statistics, Williams, D., Cambridge University Press, 2001.