0440949 Andreas van Cranenburgh, Leren

Mitchell 3.1, 3.2, 3.3, 10.1, 10.2, 10.3.

3.1 [see .png drawing].
3.2
a) Entropy(S) = - 0.5*log2(0.5) - 0.5*log2(0.5) = 1
b) Gain(S, a2) = Entropy(S)-(4/6)*Entropy(Strue)-(2/6)*Entropy(Sfalse)
Entropy(S) = 1
Entropy(Strue) = - (2/4)*log2(2/4) - (2/4)*log2(2/4) = 1
Entropy(Sfalse) = - (1/2)*log2(1/2) - (1/2)*log2(1/2) = 1

Gain(S, a2) = 1 - (4/6) * 1 - (2/6) * 1 = 0
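
The arithmetic for 3.2 can be checked with a short script. This is only a sketch:
the function names entropy and gain are illustrative, and the counts
(S = [3+, 3-], Strue = [2+, 2-], Sfalse = [1+, 1-]) are read off the fractions
used in the calculation above.

```python
from math import log2


def entropy(pos, neg):
    """Entropy in bits of a collection with pos positive and neg negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:  # 0 * log2(0) is taken to be 0
            p = count / total
            result -= p * log2(p)
    return result


def gain(parent, subsets):
    """Information gain of a split; parent and each subset are (pos, neg) counts."""
    total = sum(parent)
    remainder = sum((p + n) / total * entropy(p, n) for p, n in subsets)
    return entropy(*parent) - remainder


# Counts assumed from the fractions in the answer above.
print(entropy(3, 3))                   # Entropy(S)
print(gain((3, 3), [(2, 2), (1, 1)]))  # Gain(S, a2), zero up to rounding
```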

3.3. False. Counterexample:
Suppose the concept to be learned is: A v B
Suppose D1 tests only 1 attribute, so its outcome is YES or NO depending on that
attribute alone (the ID3 algorithm has to guess here: whichever classification
occurs most often determines whether YES or NO is predicted).

Suppose D2 tests 2 attributes; then, regardless of the first attribute, the
outcome can be YES or NO, and so D2 can give YES where D1 would give NO.
D1 is therefore not more general than D2.
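
One concrete instance of this counterexample can be checked by brute force. The
sketch below makes assumptions that are not in the answer itself: D1 tests only
A and its A = False leaf is guessed to be NO, and D2 expands that leaf with a
test on B. The function names are hypothetical.

```python
from itertools import product


def d1(a, b):
    """Assumed D1: tests only A; the A = False leaf is guessed to be NO."""
    return a


def d2(a, b):
    """Assumed D2: D1 with the A = False leaf expanded into a test on B."""
    return True if a else b


def more_general_or_equal(h1, h2, instances):
    """h1 >= h2 iff every instance h2 classifies as positive is positive under h1."""
    return all(h1(*x) for x in instances if h2(*x))


instances = list(product([False, True], repeat=2))
# Instances where D2 says YES but D1 says NO.
witnesses = [x for x in instances if d2(*x) and not d1(*x)]

print(more_general_or_equal(d1, d2, instances))  # False
print(witnesses)                                 # [(False, True)]
```

The instance A = False, B = True is classified YES by D2 and NO by D1, so under
these assumptions D1 is not more general than D2.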


10.1
Number of rules: 2 ^ (d - 1)
Number of preconditions per rule: d
Number of choices: d * 2 ^ (d - 1)
Which system overfits sooner depends on the amount of data. With little data,
simultaneous covering will work better; with a lot of data, sequential covering
will work better (less sensitive to noise).
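
To get a feel for how the two counts grow, the sketch below tabulates them for
small d. It takes the formulas above at face value; the figure 2^d - 1 for ID3
is the number of decision nodes in a balanced tree of depth d, which is assumed
here from the exercise statement and does not appear in the answer.

```python
def id3_choices(d):
    """Decision nodes in a balanced binary tree of depth d (assumed: 2^d - 1)."""
    return 2 ** d - 1


def sequential_choices(d):
    """Rules times preconditions per rule, using the formulas in the answer."""
    rules = 2 ** (d - 1)
    preconditions = d
    return rules * preconditions


for d in range(1, 6):
    print(d, id3_choices(d), sequential_choices(d))
```

For d = 3 this gives 7 choices for ID3 against 12 for sequential covering, and
the gap widens as d grows.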

10.2
At step 1, in the loop over All_Constraints, replace the code with this:

if c is discrete,
	create a specialization of h by adding the constraint c
else if c is continuous,
	for each example e in Examples,
		calculate information gain if c is split at value of e
	add to h the constraint which produced the biggest information gain.
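
The continuous branch of this pseudocode could look roughly as follows in
Python. This is a sketch under assumptions: examples are dicts, the constraint
has the form attribute > t, and the names best_threshold, temperature and play
as well as the toy data are invented for illustration.

```python
from math import log2


def entropy(labels):
    """Entropy in bits of a list of class labels; an empty list counts as 0."""
    n = len(labels)
    if n == 0:
        return 0.0
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        result -= p * log2(p)
    return result


def best_threshold(examples, attribute, target):
    """Try each example's value as threshold t; return (t, gain) with maximal gain."""
    base = entropy([e[target] for e in examples])
    best = (None, -1.0)
    for t in sorted({e[attribute] for e in examples}):
        above = [e[target] for e in examples if e[attribute] > t]
        below = [e[target] for e in examples if e[attribute] <= t]
        remainder = (len(above) * entropy(above)
                     + len(below) * entropy(below)) / len(examples)
        g = base - remainder
        if g > best[1]:
            best = (t, g)
    return best


# Made-up toy data, for illustration only.
examples = [
    {"temperature": 40, "play": False},
    {"temperature": 48, "play": False},
    {"temperature": 60, "play": True},
    {"temperature": 72, "play": True},
    {"temperature": 80, "play": True},
    {"temperature": 90, "play": False},
]
print(best_threshold(examples, "temperature", "play"))
```

On this toy data the sketch selects the constraint temperature > 48, with a
gain of roughly 0.46.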

10.3
At step 1, in the loop over All_Constraints, replace the code with this:

values <- possible values for constraint c in Examples
for each subset s of values,
	calculate information gain if c is specified as subset s
add to h the constraint which produced the biggest information gain.
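
A minimal Python sketch of this subset search is given below. It assumes
examples are dicts and that the empty set and the full set of values can be
skipped, since neither splits the examples. The names best_subset, nationality
and label and the toy data are hypothetical.

```python
from itertools import combinations
from math import log2


def entropy(labels):
    """Entropy in bits of a list of class labels; an empty list counts as 0."""
    n = len(labels)
    if n == 0:
        return 0.0
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        result -= p * log2(p)
    return result


def best_subset(examples, attribute, target):
    """Try every proper, non-empty subset s of the observed values;
    return (s, gain) for the constraint 'attribute in s' with maximal gain."""
    values = sorted({e[attribute] for e in examples})
    base = entropy([e[target] for e in examples])
    best = (None, -1.0)
    for size in range(1, len(values)):
        for subset in combinations(values, size):
            inside = [e[target] for e in examples if e[attribute] in subset]
            outside = [e[target] for e in examples if e[attribute] not in subset]
            remainder = (len(inside) * entropy(inside)
                         + len(outside) * entropy(outside)) / len(examples)
            g = base - remainder
            if g > best[1]:
                best = (set(subset), g)
    return best


# Made-up toy data, for illustration only.
examples = [
    {"nationality": "Canadian", "label": True},
    {"nationality": "Brazilian", "label": True},
    {"nationality": "Dutch", "label": False},
    {"nationality": "Japanese", "label": False},
    {"nationality": "Canadian", "label": True},
    {"nationality": "Dutch", "label": False},
]
print(best_subset(examples, "nationality", "label"))
```

The number of subsets grows as 2^k for k distinct values, so this exhaustive
search is only practical for attributes with few values.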