0440949 Andreas van Cranenburgh
Thu Sep 11 15:27:20 CEST 2008
Machine Learning, assignment 1

1.2
The ability to correctly turn spoken language into text.
Task: given sound waves of utterances, produce a textual representation of them.
Performance measure: the percentage of words correctly recognized, given a collection of utterances of which the correct transcription is known.
Training experience: a collection of utterances of which the correct transcription is given (but which is not part of the performance measure).
Target function: returns the text for a given sound.

1.4
- Generating random legal board positions: the downside is that the computer will possibly train on (mostly) unrealistic board positions, which might not even be considered worth attention by human novices. The advantage is that the computer will train on a random sample of positions and not on any particular playing style, such as that of a human player present in the training data.
- Generating a position by picking a board state from a previous game and then applying one of the moves that was not executed: the downside is that experience will depend on previous games, which might not be representative of other games. The advantage is that certain board positions will be studied exhaustively, which can, depending on the game, be advantageous.
- A strategy of my own: take a large collection of games by human players, both good and bad. Take a sample of board positions from these games and have humans annotate them with which player has the advantage and which factors contribute to it. Finally, store these judgements as exemplars in such a way that for any given board position, the exemplar that most closely resembles it can be matched against it. The downside is that this is a lot of work, and it will not work for games where the number of legal board positions is too large to sample.
The advantage is that it makes use of human insight without having to figure out how to acquire that insight algorithmically.

2.2
The training examples are presented in reverse order (4 down to 1):
start: S: <0, 0, 0, 0, 0, 0>
       G: {<?, ?, ?, ?, ?, ?>}
4. S: <Sunny, Warm, High, Strong, Cool, Change>
   G: {<?, ?, ?, ?, ?, ?>}
3. S: <Sunny, Warm, High, Strong, Cool, Change>
   G: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, Cool, ?>}
2. S: <Sunny, Warm, High, Strong, ?, ?>
   G: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}
1. S: <Sunny, Warm, ?, Strong, ?, ?>
   G: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}
Perhaps the optimal order is to present all the negative instances first, because the set G can become large, whereas S always contains exactly one member.

2.3
The training examples are presented in the original order (1 to 4):
start: S: <0, 0, 0, 0, 0, 0>
       G: {<?, ?, ?, ?, ?, ?>}
1. S: <Sunny, Warm, Normal, Strong, Warm, Same>
   G: {<?, ?, ?, ?, ?, ?>}
2. S: <Sunny, Warm, ?, Strong, Warm, Same>
   G: {<?, ?, ?, ?, ?, ?>}
3. S: <Sunny, Warm, ?, Strong, Warm, Same>
   G: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>}
4. S: <Sunny, Warm, ?, Strong, ?, ?>
   G: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}

2.4
a. S: 4 <= x <= 6, 3 <= y <= 5
b. G: x <= 2, x >= 9, y <= 1, y >= 8 (i.e., a maximally general rectangle may extend up to, but not into, these regions around the negative examples)
c. Repeating an instance that has already been classified will certainly not reduce the version space. An instance between S and G, e.g. x=3, y=4, will certainly reduce the version space.
d. The smallest number of instances required to learn the target concept is 12. There should be 4 positive instances, marking the corners of the square, and then 8 negative instances next to each of these corners (two for each corner).
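The traces in 2.2 and 2.3 can be checked mechanically. Below is a minimal Python sketch of the candidate-elimination algorithm for conjunctive hypotheses over discrete attributes; the function names are mine, and the EnjoySport attribute domains and four training examples it assumes are the standard ones from the textbook, not quoted from the answers above. For simplicity it omits pruning G members that are less general than other G members, which this data does not require.

```python
# Minimal candidate-elimination sketch for conjunctive hypotheses
# over discrete attributes. '?' matches any value; None plays the
# role of the empty hypothesis <0, 0, 0, 0, 0, 0>.

def matches(h, x):
    # h covers x (also works as a more-general-or-equal test when x
    # is itself a hypothesis without '?' entries that h lacks).
    return all(a == '?' or a == v for a, v in zip(h, x))

def generalize(s, x):
    # Minimal generalization of the specific boundary s to cover x.
    if s is None:
        return tuple(x)
    return tuple(a if a == v else '?' for a, v in zip(s, x))

def specialize(h, x, domains):
    # All minimal specializations of h that exclude the negative x.
    return [h[:i] + (v,) + h[i + 1:]
            for i, a in enumerate(h) if a == '?'
            for v in domains[i] if v != x[i]]

def candidate_elimination(examples, domains):
    s = None                             # S: <0, 0, 0, 0, 0, 0>
    g = [tuple('?' for _ in domains)]    # G: <?, ?, ?, ?, ?, ?>
    for x, positive in examples:
        if positive:
            s = generalize(s, x)
            # Drop G members that fail to cover the positive example.
            g = [h for h in g if matches(h, x)]
        else:
            # Specialize G members that wrongly cover the negative
            # example; keep only specializations still covering S.
            g = [h2
                 for h in g
                 for h2 in (specialize(h, x, domains)
                            if matches(h, x) else [h])
                 if s is None or matches(h2, s)]
    return s, g
```

On the four EnjoySport examples, either presentation order ends with S = <Sunny, Warm, ?, Strong, ?, ?> and G = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}, the final rows of both traces.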
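For 2.4, S and G can be computed directly on an integer grid: S is the bounding box of the positive points, and G grows each side of S outward until a negative point would be included. The sketch below uses hypothetical example coordinates chosen to be consistent with the answers to (a) and (b) (the assignment's actual figure is not reproduced here), and its second check illustrates the counting argument from (d): with two negatives just outside each corner, G collapses onto S.

```python
# Version-space boundaries for axis-aligned rectangle concepts on
# an integer grid. Example coordinates are invented for
# illustration; the assignment's figure is not reproduced here.

def s_boundary(positives):
    # Most specific hypothesis: the bounding box of the positives.
    xs = [x for x, _ in positives]
    ys = [y for _, y in positives]
    return (min(xs), max(xs), min(ys), max(ys))

def g_boundary(positives, negatives, lo=1, hi=10):
    # Most general hypothesis: grow each side of S outward until a
    # negative point would be included (or the grid edge is hit).
    # Growing sides one at a time yields the single maximally
    # general rectangle when each negative pins down one side, as
    # in this exercise.
    x1, x2, y1, y2 = s_boundary(positives)

    def excludes_all(a, b, c, d):
        return not any(a <= x <= b and c <= y <= d
                       for x, y in negatives)

    while x1 > lo and excludes_all(x1 - 1, x2, y1, y2):
        x1 -= 1
    while x2 < hi and excludes_all(x1, x2 + 1, y1, y2):
        x2 += 1
    while y1 > lo and excludes_all(x1, x2, y1 - 1, y2):
        y1 -= 1
    while y2 < hi and excludes_all(x1, x2, y1, y2 + 1):
        y2 += 1
    return (x1, x2, y1, y2)
```

With positives at the corners of 4 <= x <= 6, 3 <= y <= 5 and one negative in each excluded region, this reproduces (a) and (b); replacing the negatives with two just outside each corner makes G equal to S, i.e. the version space has converged after 4 + 8 = 12 examples.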