bogofilter -s < spam_corpus
bogofilter -n < nonspam_corpus
I run bogofilter with the Robinson-Fisher method of calculation (described in Appendix A below), in which an a priori guess (x) and a weight for that guess (s) are used in calculating individual token-probability estimates, and Fisher's method of combining probabilities is applied in calculating the overall message "spamicity." In another experiment, I investigated three things about training for this approach:
It seemed as though full training gave fewer, but not far fewer, classification errors than on-error training; however, not many rounds of training could be performed in that experiment. It seemed worthwhile to conduct a second test with more messages and more rounds of training. Anecdotal reports were in circulation that suggested full training should be used until the training database was of a respectable size (whatever that might mean), and thereafter, training-on-error could be effective. I myself had gained the impression that this might be true. I therefore wanted, in this second test, not only to compare pure full training with pure training on error, but also to see if switching from full to error after several rounds might be beneficial.
A run consisted of ten rounds, numbered 0 to 9. In round 0 only training was performed, and in round 9 only testing. In each remaining round, the spam and nonspam files for that round were classified with the training databases from the preceding round. After classification, the same files were used to train the databases further.
Three training databases were built. Two of them, called "full" and "error", were started empty. In each round, the "full" database was fully trained, as the name implies, and the "error" database was trained on error only.
After round 2's training was complete, the "full" database, which had by that point been trained with 6,351 spams and 9,621 nonspams, was copied to create a new database called "half." In each of rounds 3 through 8, the "half" database was trained on error.
The training methods, for full training and for training on error, were briefly described in the foregoing section. For training on error, I wrote a script called randomtrain that produces a list of messages, in random order, with flags to indicate whether each message is spam or nonspam, and then uses the list to feed messages to bogofilter for classification and, when needed, for training.
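As a rough sketch of the train-on-error idea (not the randomtrain script itself), the loop below classifies each message in an already-shuffled list and registers it only when the verdict disagrees with its flag. The helper names classify and register, the list format, and the reliance on bogofilter exiting with status 0 for spam are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; redefine them to plug in a different filter.

# classify DBDIR MSGFILE -> prints "s" (spam) or "n" (nonspam)
# Assumption: bogofilter exits with status 0 when it judges a message spam.
# Any other status (nonspam, unsure, error) is treated as "n" here.
classify() {
    if bogofilter -d "$1" < "$2" >/dev/null 2>&1; then echo s; else echo n; fi
}

# register DBDIR FLAG MSGFILE -> train the database with one message
register() {
    bogofilter -d "$1" "-$2" < "$3"
}

# train_on_error DBDIR LISTFILE
# LISTFILE holds one "FLAG PATH" pair per line (FLAG is s or n),
# already in random order. Prints the number of messages registered.
train_on_error() {
    db=$1 list=$2 registered=0
    while read -r flag msg; do
        verdict=$(classify "$db" "$msg")
        if [ "$verdict" != "$flag" ]; then
            # Misclassified: this is the only case in which we train.
            register "$db" "$flag" "$msg" >&2
            registered=$((registered + 1))
        fi
    done < "$list"
    echo "$registered"
}
```

The count that train_on_error prints corresponds to the "registered" figures reported later for the error and half databases; a call might look like `train_on_error error shuffled.list`, where shuffled.list is a hypothetical file name.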
Comparison of training methods: classification errors
The first table shows the percentages of misclassifications in each round from 1 to 9, for full training and for error training. 95% confidence limits are shown for each mean percentage error.
  round meanfullpc flcl95 fucl95 meanerrorpc elcl95 eucl95
1     1       8.55   7.61   9.48       12.89  11.95  13.82
2     2       7.26   6.33   8.20       11.60  10.66  12.53
3     3       6.65   5.71   7.59       10.69   9.75  11.62
4     4       6.32   5.38   7.25        9.07   8.14  10.01
5     5       5.90   4.96   6.83        8.53   7.60   9.47
6     6       5.87   4.94   6.81        8.06   7.12   8.99
7     7       5.75   4.82   6.69        7.88   6.94   8.81
8     8       5.50   4.57   6.44        7.34   6.41   8.28
9     9       5.49   4.55   6.43        7.26   6.33   8.20
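Each mean percentage is the average, over the three runs, of the misclassified messages (false positives plus false negatives) as a percentage of the 3207 + 2117 = 5324 messages tested per round. A minimal sketch of that arithmetic follows; the function name mean_percent_error is made up for illustration, and the counts 488, 430 and 447 are the round 1 full-training error totals listed in the data-reduction appendix.

```shell
#!/bin/sh
# mean_percent_error TOTAL_MSGS ERR1 ERR2 ...
# Averages the per-run error counts and expresses the mean as a
# percentage of the messages tested in one round.
mean_percent_error() {
    total=$1
    shift
    printf '%s\n' "$@" | awk -v total="$total" '
        { sum += $1; n++ }
        END { printf "%.2f\n", (sum / n) * 100 / total }'
}

# Round 1, full training: errors in runs 0, 1 and 2.
mean_percent_error 5324 488 430 447
```

The result can be compared with the meanfullpc entry for round 1.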
Full training does seem to be superior to training-on-error; however, as the number of messages used in training increases, the error rates for the two training methods appear to be converging. As might be expected, the effect on the error rate seems to diminish from round to round; full training beyond round 5 (10,585 spams and 16,035 nonspams) had relatively little effect on the error rate.
The next table covers rounds 4 to 9 and adds data for the mixed training method (three rounds of full training followed by a switch to training-on-error):
  round meanfull flcl95 fucl95 meanerror elcl95 eucl95 meanhalf hlcl95 hucl95
1     4     6.32   5.80   6.83      9.07   8.56   9.59     6.34   5.83   6.86
2     5     5.90   5.38   6.41      8.53   8.02   9.05     6.04   5.53   6.56
3     6     5.87   5.36   6.39      8.06   7.54   8.57     5.99   5.47   6.50
4     7     5.75   5.24   6.27      7.88   7.36   8.39     6.13   5.61   6.65
5     8     5.50   4.99   6.02      7.34   6.83   7.86     5.75   5.23   6.26
6     9     5.49   4.97   6.01      7.26   6.75   7.78     5.74   5.23   6.26
These results are plotted on the left-hand graph below. It doesn't seem that switching to training-on-error after a period of full training leads to better results than are obtained by continuing to train fully. (The error and full data points are slightly offset to the left and right respectively so that all of the confidence limits, indicated by the vertical bars, can be seen.) It does appear, however, that switching to training-on-error is almost as effective as continuing to train fully; the difference between the full-training results (black) and the results obtained after switching (blue) is small, and lies within the limits of experimental error.

The classification error rate is also reflected in the number of messages used in training-on-error at each round. The right-hand graph above, and the following table, show that the advantage in error rate gained by beginning with full training is preserved during further rounds of training-on-error, though the difference diminishes as more rounds of training are performed. (To permit displaying the data with greater resolution, the graph of differences (the black line) has been offset by +4.5 on the y axis.)
  round errorreg elcl95 eucl95 halfreg hlcl95 hucl95
1     3     16.0   15.1   16.8    12.6  11.67   13.4
2     4     14.9   14.0   15.8    12.2  11.35   13.1
3     5     13.3   12.5   14.2    11.2  10.33   12.1
4     6     12.7   11.9   13.6    10.9   9.97   11.7
5     7     11.9   11.0   12.8    10.5   9.57   11.3
6     8     11.4   10.5   12.3    10.0   9.15   10.9
p(w) = (badcount / badlist_messagecount) / (badcount / badlist_messagecount + goodcount / goodlist_messagecount)
n = badcount + goodcount
f(w) = (s * x + n * p(w)) / (s + n)
scalefactor = badlist_messagecount / goodlist_messagecount
f(w) = (s * x + badcount) / (s + badcount + goodcount * scalefactor)
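To make the scaled form of f(w) concrete, here is a small sketch that evaluates it for one token. The function name token_prob and the values s = 1 and x = 0.5 are illustrative assumptions, not the settings used in the experiment; the message counts 2117 and 3207 are the sizes of one spam file and one nonspam file.

```shell
#!/bin/sh
# token_prob BADCOUNT GOODCOUNT BADLIST_MSGS GOODLIST_MSGS S X
# Evaluates f(w) = (s*x + badcount) / (s + badcount + goodcount*scalefactor)
token_prob() {
    awk -v b="$1" -v g="$2" -v B="$3" -v G="$4" -v s="$5" -v x="$6" 'BEGIN {
        scalefactor = B / G      # badlist_messagecount / goodlist_messagecount
        printf "%.4f\n", (s * x + b) / (s + b + g * scalefactor)
    }'
}

# A token seen in 10 spams and 2 nonspams.
token_prob 10 2 2117 3207 1 0.5
```

For a token that has never been seen (both counts zero), the formula reduces to x, so the guess alone determines the estimate; as the counts grow, the influence of s and x fades.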
Fisher's method uses an inverse chi-squared function, prbx, to get the probability associated with -2 times the sum of the logs of f(w) with 2N degrees of freedom:
P = prbx(-2 * sum(ln(1-f(w))), 2*N)
Q = prbx(-2 * sum(ln(f(w))), 2*N)
S = (1 + Q - P) / 2
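The combining step can be sketched as follows. This assumes that prbx returns the upper-tail chi-squared probability, which for an even number of degrees of freedom 2N has the closed form exp(-x/2) * sum over i from 0 to N-1 of (x/2)^i / i!. The function name spamicity is made up for illustration, and each f(w) passed in must lie strictly between 0 and 1 so that both logarithms are defined.

```shell
#!/bin/sh
# spamicity F1 F2 ...
# Combines token probabilities f(w) into S = (1 + Q - P) / 2.
spamicity() {
    printf '%s\n' "$@" | awk '
        # Upper-tail chi-squared probability for even df (assumed prbx).
        function prbx(x, df,    m, term, sum, i) {
            m = x / 2
            term = exp(-m)
            sum = term
            for (i = 1; i < df / 2; i++) {
                term *= m / i          # builds (x/2)^i / i! incrementally
                sum += term
            }
            return (sum < 1) ? sum : 1
        }
        { lnf += log($1); ln1mf += log(1 - $1); n++ }
        END {
            P = prbx(-2 * ln1mf, 2 * n)
            Q = prbx(-2 * lnf, 2 * n)
            printf "%.4f\n", (1 + Q - P) / 2
        }'
}

spamicity 0.99 0.99 0.99   # three strongly spammy tokens
spamicity 0.5 0.5          # two uninformative tokens
```

With every f(w) equal to 0.5 the two sums of logs coincide, so P equals Q and S is exactly 0.5; tokens near 1 push P toward 0 and Q toward 1, and hence S toward 1.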
Corpus consists of 21170 spams and 32070 nonspams:
mutt
32070 kept, 0 deleted.
mutt
21170 kept, 0 deleted.
grep -c '^From ' agg*
aggregate.bad:21170
aggregate.good:32070
The spams and nonspams were "dealt" out into ten files each:
cat ~/bin/tenths
#! /bin/sh
let n=${FILENO}%10
fname=cgx-$n
cat >>$fname
FILENO=0 formail -s ~/bin/tenths < aggregate.good
rename gx ns cgx*
FILENO=0 formail -s ~/bin/tenths < aggregate.bad
rename gx sp cgx*
The files were moved into a separate directory and subdirectories were
created for the bogofilter training databases:
mkdir train10-2
mv csp* cns* train10-2
cd train10-2
mkdir full half error
Random sequences were created by shuffling 0-9 with this little perl
script:
cat /usr/local/bin/shuffle
#! /usr/bin/perl
# shuffle -- echo stdin lines in a random order
srand ( time() ^ ($$ + ($$ << 15)) );
foreach $key (<>) {
$shuf{$key} = rand;
}
foreach $key (sort { $shuf{$b} <=> $shuf{$a} } keys %shuf ) {
print $key;
}
We do three runs of ten rounds. In round 0 we do training only, and in
round 9 testing only; otherwise, we first test the spam and nonspam
files for the current round against the training dbs from the preceding
round, and then train with the files just tested. After round 2 we
train the "half" db on error, and after round 3 we test against that db
in addition to the other two:
cat runex
#! /bin/bash
seq 0 9 >sequence
fmbf="formail -s /usr/bin/bogofilter"
for run in 0 1 2; do
echo "run $run"
file=( `shuffle sequence` )
/bin/rm full/* error/* half/*
for round in 0 1 2 3 4 5 6 7 8 9; do
fnam=${file[$round]}
echo "round $round, files $fnam"
if [ $round -gt 0 ]; then
for method in full error; do
$fmbf -d $method -v < csp-$fnam &> sp-$method-$run-$round
$fmbf -d $method -v < cns-$fnam &> ns-$method-$run-$round
done
fi
if [ $round -gt 3 ]; then
$fmbf -d half -v < csp-$fnam &> sp-half-$run-$round
$fmbf -d half -v < cns-$fnam &> ns-half-$run-$round
fi
if [ $round -lt 9 ]; then
/usr/bin/bogofilter -d full -v -n < cns-$fnam
/usr/bin/bogofilter -d full -v -s < csp-$fnam
randomtrain error -n cns-$fnam -s csp-$fnam
if [ $round -eq 2 ]; then
cp full/* half
fi
if [ $round -gt 2 ]; then
randomtrain half -n cns-$fnam -s csp-$fnam
fi
fi
done
done
Output went to nohup.out, which was edited to change each \r to $; then:
sed 's/.*\$//' nohup.out >runex.log
cat runex.log
run 0
round 0, files 0
# 1368656 words, 3207 messages
# 1289609 words, 2117 messages
error
spam reg good reg
2117 889 3207 882
round 1, files 2
# 1310658 words, 3207 messages
# 1305731 words, 2117 messages
error
spam reg good reg
2117 674 3207 495
round 2, files 4
# 1548414 words, 3207 messages
# 1299363 words, 2117 messages
error
spam reg good reg
2117 571 3207 400
round 3, files 1
# 1435450 words, 3207 messages
# 1444462 words, 2117 messages
error
spam reg good reg
2117 490 3207 300
half
spam reg good reg
2117 249 3207 377
round 4, files 8
# 1521052 words, 3207 messages
# 1326994 words, 2117 messages
error
spam reg good reg
2117 430 3207 380
half
spam reg good reg
2117 257 3207 395
round 5, files 9
# 1604435 words, 3207 messages
# 1475321 words, 2117 messages
error
spam reg good reg
2117 390 3207 288
half
spam reg good reg
2117 244 3207 337
round 6, files 6
# 1824156 words, 3207 messages
# 1271805 words, 2117 messages
error
spam reg good reg
2117 422 3207 297
half
spam reg good reg
2117 288 3207 322
round 7, files 7
# 1580733 words, 3207 messages
# 1218496 words, 2117 messages
error
spam reg good reg
2117 387 3207 242
half
spam reg good reg
2117 261 3207 274
round 8, files 5
# 1717412 words, 3207 messages
# 1293816 words, 2117 messages
error
spam reg good reg
2117 364 3207 229
half
spam reg good reg
2117 274 3207 249
round 9, files 3
run 1
round 0, files 1
# 1435450 words, 3207 messages
# 1444462 words, 2117 messages
error
spam reg good reg
2117 900 3207 888
round 1, files 8
# 1521052 words, 3207 messages
# 1326994 words, 2117 messages
error
spam reg good reg
2117 581 3207 511
round 2, files 9
# 1604435 words, 3207 messages
# 1475321 words, 2117 messages
error
spam reg good reg
2117 504 3207 379
round 3, files 4
# 1548414 words, 3207 messages
# 1299363 words, 2117 messages
error
spam reg good reg
2117 544 3207 367
half
spam reg good reg
2117 281 3207 430
round 4, files 6
# 1824156 words, 3207 messages
# 1271805 words, 2117 messages
error
spam reg good reg
2117 485 3207 331
half
spam reg good reg
2117 280 3207 389
round 5, files 3
# 1340454 words, 3207 messages
# 1269049 words, 2117 messages
error
spam reg good reg
2117 466 3207 310
half
spam reg good reg
2117 304 3207 338
round 6, files 2
# 1310658 words, 3207 messages
# 1305731 words, 2117 messages
error
spam reg good reg
2117 417 3207 272
half
spam reg good reg
2117 294 3207 301
round 7, files 5
# 1717412 words, 3207 messages
# 1293816 words, 2117 messages
error
spam reg good reg
2117 396 3207 249
half
spam reg good reg
2117 287 3207 290
round 8, files 0
# 1368656 words, 3207 messages
# 1289609 words, 2117 messages
error
spam reg good reg
2117 363 3207 243
half
spam reg good reg
2117 243 3207 265
round 9, files 7
run 2
round 0, files 9
# 1604435 words, 3207 messages
# 1475321 words, 2117 messages
error
spam reg good reg
2117 927 3207 920
round 1, files 7
# 1580733 words, 3207 messages
# 1218496 words, 2117 messages
error
spam reg good reg
2117 603 3207 509
round 2, files 2
# 1310658 words, 3207 messages
# 1305731 words, 2117 messages
error
spam reg good reg
2117 586 3207 402
round 3, files 4
# 1548414 words, 3207 messages
# 1299363 words, 2117 messages
error
spam reg good reg
2117 520 3207 328
half
spam reg good reg
2117 272 3207 396
round 4, files 3
# 1340454 words, 3207 messages
# 1269049 words, 2117 messages
error
spam reg good reg
2117 446 3207 308
half
spam reg good reg
2117 286 3207 346
round 5, files 1
# 1435450 words, 3207 messages
# 1444462 words, 2117 messages
error
spam reg good reg
2117 426 3207 252
half
spam reg good reg
2117 278 3207 290
round 6, files 8
# 1521052 words, 3207 messages
# 1326994 words, 2117 messages
error
spam reg good reg
2117 367 3207 259
half
spam reg good reg
2117 242 3207 287
round 7, files 5
# 1717412 words, 3207 messages
# 1293816 words, 2117 messages
error
spam reg good reg
2117 382 3207 248
half
spam reg good reg
2117 281 3207 277
round 8, files 6
# 1824156 words, 3207 messages
# 1271805 words, 2117 messages
error
spam reg good reg
2117 367 3207 251
half
spam reg good reg
2117 289 3207 282
round 9, files 0
grep '^ 2117' runex.log >errortrain
Collect the false-positive (fp) and false-negative (fn) counts:
for method in full error half; do
for round in 1 2 3 4 5 6 7 8 9; do
for run in 0 1 2; do
test -f ns-$method-$run-$round \
&& grep -c '^1' ns-$method-$run-$round >>fp-$method
test -f sp-$method-$run-$round \
&& grep -c -v '^1' sp-$method-$run-$round >>fn-$method
done
done
done
Remaining data reduction performed in R:
errortrain <- read.table("F/errortrain")
errortrain$round <- c(rep(c(0,1,2,3,3,4,4,5,5,6,6,7,7,8,8),3))
errortrain$method <- c(rep(c("error","error","error",
rep(c("error","half"),6)),3))
errortrain
V1 V2 V3 V4 round method
1 2117 889 3207 882 0 error
2 2117 674 3207 495 1 error
3 2117 571 3207 400 2 error
4 2117 490 3207 300 3 error
5 2117 249 3207 377 3 half
6 2117 430 3207 380 4 error
7 2117 257 3207 395 4 half
8 2117 390 3207 288 5 error
9 2117 244 3207 337 5 half
10 2117 422 3207 297 6 error
11 2117 288 3207 322 6 half
12 2117 387 3207 242 7 error
13 2117 261 3207 274 7 half
14 2117 364 3207 229 8 error
15 2117 274 3207 249 8 half
16 2117 900 3207 888 0 error
17 2117 581 3207 511 1 error
18 2117 504 3207 379 2 error
19 2117 544 3207 367 3 error
20 2117 281 3207 430 3 half
21 2117 485 3207 331 4 error
22 2117 280 3207 389 4 half
23 2117 466 3207 310 5 error
24 2117 304 3207 338 5 half
25 2117 417 3207 272 6 error
26 2117 294 3207 301 6 half
27 2117 396 3207 249 7 error
28 2117 287 3207 290 7 half
29 2117 363 3207 243 8 error
30 2117 243 3207 265 8 half
31 2117 927 3207 920 0 error
32 2117 603 3207 509 1 error
33 2117 586 3207 402 2 error
34 2117 520 3207 328 3 error
35 2117 272 3207 396 3 half
36 2117 446 3207 308 4 error
37 2117 286 3207 346 4 half
38 2117 426 3207 252 5 error
39 2117 278 3207 290 5 half
40 2117 367 3207 259 6 error
41 2117 242 3207 287 6 half
42 2117 382 3207 248 7 error
43 2117 281 3207 277 7 half
44 2117 367 3207 251 8 error
45 2117 289 3207 282 8 half
roundmethod <- function(x) {
x[5] == r && x[6] == m
}
rerr <- function(x,rnd) {
y <- 0
for (i in rnd) {
r <<- i
y <- c(y,x[apply(errortrain,1,roundmethod)])
}
y[2:length(y)]
}
m <- "error"
errorspamreg <- rerr(errortrain$V2, 0:8)
errornsreg <- rerr(errortrain$V4, 0:8)
errorreg <- data.frame(
round=c(0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8),
run=c(rep(c(0,1,2),9)), spam=errorspamreg,nonspam=errornsreg)
errorreg
round run spam nonspam
1 0 0 889 882
2 0 1 900 888
3 0 2 927 920
4 1 0 674 495
5 1 1 581 511
6 1 2 603 509
7 2 0 571 400
8 2 1 504 379
9 2 2 586 402
10 3 0 490 300
11 3 1 544 367
12 3 2 520 328
13 4 0 430 380
14 4 1 485 331
15 4 2 446 308
16 5 0 390 288
17 5 1 466 310
18 5 2 426 252
19 6 0 422 297
20 6 1 417 272
21 6 2 367 259
22 7 0 387 242
23 7 1 396 249
24 7 2 382 248
25 8 0 364 229
26 8 1 363 243
27 8 2 367 251
m <- "half"
halfspamreg <- rerr(errortrain$V2, 3:8)
halfnsreg <- rerr(errortrain$V4, 3:8)
halfreg <- data.frame(round=c(3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8),
run=c(rep(c(0,1,2),6)), spam=halfspamreg,nonspam=halfnsreg)
ro <- rep(halfreg$round, 2)
me <- c(rep("error", 18), rep("half", 18))
sp <- c(errorreg$spam[10:27], halfreg$spam)
ns <- c(errorreg$nonspam[10:27], halfreg$nonspam)
data.frame(method=me, round=ro, run=rep(halfreg$run, 2),
spam=sp, nonspam=ns, reg=sp+ns, percent=(sp+ns)*100/(3207+2117)) ->
trainreg
print(trainreg,digits=3)
method round run spam nonspam reg percent
1 error 3 0 490 300 790 14.84
2 error 3 1 544 367 911 17.11
3 error 3 2 520 328 848 15.93
4 error 4 0 430 380 810 15.21
5 error 4 1 485 331 816 15.33
6 error 4 2 446 308 754 14.16
7 error 5 0 390 288 678 12.73
8 error 5 1 466 310 776 14.58
9 error 5 2 426 252 678 12.73
10 error 6 0 422 297 719 13.50
11 error 6 1 417 272 689 12.94
12 error 6 2 367 259 626 11.76
13 error 7 0 387 242 629 11.81
14 error 7 1 396 249 645 12.11
15 error 7 2 382 248 630 11.83
16 error 8 0 364 229 593 11.14
17 error 8 1 363 243 606 11.38
18 error 8 2 367 251 618 11.61
19 half 3 0 249 377 626 11.76
20 half 3 1 281 430 711 13.35
21 half 3 2 272 396 668 12.55
22 half 4 0 257 395 652 12.25
23 half 4 1 280 389 669 12.57
24 half 4 2 286 346 632 11.87
25 half 5 0 244 337 581 10.91
26 half 5 1 304 338 642 12.06
27 half 5 2 278 290 568 10.67
28 half 6 0 288 322 610 11.46
29 half 6 1 294 301 595 11.18
30 half 6 2 242 287 529 9.94
31 half 7 0 261 274 535 10.05
32 half 7 1 287 290 577 10.84
33 half 7 2 281 277 558 10.48
34 half 8 0 274 249 523 9.82
35 half 8 1 243 265 508 9.54
36 half 8 2 289 282 571 10.73
regaov <- aov(percent ~ method + round, data=trainreg)
summary(regaov)
Df Sum Sq Mean Sq F value Pr(>F)
method 1 41.627 41.627 73.938 6.151e-10 ***
round 1 55.207 55.207 98.059 2.068e-11 ***
Residuals 33 18.579 0.563
d <- c(1.95996, 0.412, 0.423)
rdf <- 33
rms <- deviance(regaov)/rdf
z <- (d[1] + 1 / (rdf * d[2] - d[3])) * sqrt(rms/3)
meanreg <- apply(array(trainreg$percent,dim=c(3,12)), 2, mean)
lcl95 <- meanreg - z
ucl95 <- meanreg + z
data.frame(round=c(3,4,5,6,7,8), errorreg=meanreg[1:6],
elcl95=lcl95[1:6],
eucl95=ucl95[1:6], halfreg=meanreg[7:12], hlcl95=lcl95[7:12],
hucl95=ucl95[7:12]) -> regresults
print(regresults, digits=3)
round errorreg elcl95 eucl95 halfreg hlcl95 hucl95
1 3 16.0 15.1 16.8 12.6 11.67 13.4
2 4 14.9 14.0 15.8 12.2 11.35 13.1
3 5 13.3 12.5 14.2 11.2 10.33 12.1
4 6 12.7 11.9 13.6 10.9 9.97 11.7
5 7 11.9 11.0 12.8 10.5 9.57 11.3
6 8 11.4 10.5 12.3 10.0 9.15 10.9
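The constants in d appear to implement an approximation to the two-sided 95% Student's t multiplier: the normal quantile 1.95996 plus a correction of 1 / (0.412 * rdf - 0.423) that shrinks as the residual degrees of freedom grow. That reading is an inference from the code, not something the text states. The sketch below repeats the arithmetic outside R so that the half-width z can be checked by hand; the function name halfwidth is hypothetical, and 0.563 is the residual mean square from the ANOVA table above.

```shell
#!/bin/sh
# halfwidth RDF RESIDUAL_MEAN_SQUARE REPLICATES
# Prints the approximate t multiplier and the confidence half-width
# z = multiplier * sqrt(rms / replicates).
halfwidth() {
    awk -v rdf="$1" -v rms="$2" -v k="$3" 'BEGIN {
        t = 1.95996 + 1 / (rdf * 0.412 - 0.423)
        printf "%.3f %.3f\n", t, t * sqrt(rms / k)
    }'
}

# 33 residual degrees of freedom, three runs per mean.
halfwidth 33 0.563 3
```

The second number printed is the amount subtracted from and added to each mean to obtain the lcl95 and ucl95 columns; the same function applies to the later analyses with rdf set to 42 or 44 and the corresponding residual mean square.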
X11(width=3.5, height=3.5)
plot(regresults$round - 0.02, regresults$errorreg,
main="Error training vs mixed training", ylim=c(6,17),
xlab="Number of training cycles", ylab="Percent registered",
col="red")
lines(regresults$round - 0.02, regresults$eucl95, type="h", col="red")
lines(regresults$round - 0.02, regresults$elcl95, type="h",
col="white")
lines(regresults$round - 0.02, regresults$errorreg, col="red")
points(regresults$round + 0.02, regresults$halfreg, col="blue")
lines(regresults$round + 0.02, regresults$hucl95, type="h", col="blue")
lines(regresults$round + 0.02, regresults$hlcl95, type="h",
col="white")
lines(regresults$round + 0.02, regresults$halfreg, col="blue")
points(regresults$round, regresults$errorreg - regresults$halfreg + 4.5)
lines(regresults$round, regresults$errorreg - regresults$halfreg + 4.5)
text(6, 16.7, labels="error", col="red", pos=4)
text(6, 16.1, labels="half", col="blue", pos=4)
text(6, 15.5, labels="difference+4.5", pos=4)
read.table("F/fp-full") -> fpfull
read.table("F/fn-full") -> fnfull
read.table("F/fp-error") -> fperror
read.table("F/fn-error") -> fnerror
read.table("F/fp-half") -> fphalf
read.table("F/fn-half") -> fnhalf
tenround <- data.frame(method=c(rep("full", 27), rep("error", 27)),
round=c(rep(c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8,9,9,9),2)),
run=c(rep(0:2, 18)), fpos=c(fpfull$V1, fperror$V1),
fneg=c(fnfull$V1, fnerror$V1))
tenround$err <- tenround$fpos + tenround$fneg
tenround$percent <- tenround$err * 100 / (3207 + 2117)
print(tenround, digits=3)
method round run fpos fneg err percent
1 full 1 0 89 399 488 9.17
2 full 1 1 98 332 430 8.08
3 full 1 2 85 362 447 8.40
4 full 2 0 96 312 408 7.66
5 full 2 1 100 251 351 6.59
6 full 2 2 104 297 401 7.53
7 full 3 0 72 244 316 5.94
8 full 3 1 101 270 371 6.97
9 full 3 2 102 273 375 7.04
10 full 4 0 83 235 318 5.97
11 full 4 1 112 246 358 6.72
12 full 4 2 87 246 333 6.25
13 full 5 0 82 211 293 5.50
14 full 5 1 101 239 340 6.39
15 full 5 2 78 231 309 5.80
16 full 6 0 99 235 334 6.27
17 full 6 1 87 249 336 6.31
18 full 6 2 66 202 268 5.03
19 full 7 0 90 210 300 5.63
20 full 7 1 95 215 310 5.82
21 full 7 2 92 217 309 5.80
22 full 8 0 83 203 286 5.37
23 full 8 1 80 193 273 5.13
24 full 8 2 96 224 320 6.01
25 full 9 0 92 213 305 5.73
26 full 9 1 87 207 294 5.52
27 full 9 2 82 196 278 5.22
28 error 1 0 31 709 740 13.90
29 error 1 1 24 633 657 12.34
30 error 1 2 26 635 661 12.42
31 error 2 0 27 621 648 12.17
32 error 2 1 25 514 539 10.12
33 error 2 2 27 638 665 12.49
34 error 3 0 21 512 533 10.01
35 error 3 1 25 587 612 11.50
36 error 3 2 24 538 562 10.56
37 error 4 0 22 443 465 8.73
38 error 4 1 26 470 496 9.32
39 error 4 2 26 462 488 9.17
40 error 5 0 25 379 404 7.59
41 error 5 1 28 479 507 9.52
42 error 5 2 20 432 452 8.49
43 error 6 0 20 419 439 8.25
44 error 6 1 25 427 452 8.49
45 error 6 2 18 378 396 7.44
46 error 7 0 25 394 419 7.87
47 error 7 1 23 415 438 8.23
48 error 7 2 23 378 401 7.53
49 error 8 0 22 388 410 7.70
50 error 8 1 20 366 386 7.25
51 error 8 2 16 361 377 7.08
52 error 9 0 24 372 396 7.44
53 error 9 1 21 382 403 7.57
54 error 9 2 25 336 361 6.78
sixround <- data.frame(
method=c(rep("full",18), rep("error", 18), rep("half", 18)),
round=c(rep(c(4,4,4,5,5,5,6,6,6,7,7,7,8,8,8,9,9,9),3)),
run=c(rep(0:2, 18)),
fpos=c(fpfull$V1[10:27], fperror$V1[10:27], fphalf$V1),
fneg=c(fnfull$V1[10:27], fnerror$V1[10:27], fnhalf$V1))
sixround$err <- sixround$fpos + sixround$fneg
sixround$percent <- sixround$err * 100 / (3207 + 2117)
print(sixround, digits=3)
method round run fpos fneg err percent
1 full 4 0 83 235 318 5.97
2 full 4 1 112 246 358 6.72
3 full 4 2 87 246 333 6.25
4 full 5 0 82 211 293 5.50
5 full 5 1 101 239 340 6.39
6 full 5 2 78 231 309 5.80
7 full 6 0 99 235 334 6.27
8 full 6 1 87 249 336 6.31
9 full 6 2 66 202 268 5.03
10 full 7 0 90 210 300 5.63
11 full 7 1 95 215 310 5.82
12 full 7 2 92 217 309 5.80
13 full 8 0 83 203 286 5.37
14 full 8 1 80 193 273 5.13
15 full 8 2 96 224 320 6.01
16 full 9 0 92 213 305 5.73
17 full 9 1 87 207 294 5.52
18 full 9 2 82 196 278 5.22
19 error 4 0 22 443 465 8.73
20 error 4 1 26 470 496 9.32
21 error 4 2 26 462 488 9.17
22 error 5 0 25 379 404 7.59
23 error 5 1 28 479 507 9.52
24 error 5 2 20 432 452 8.49
25 error 6 0 20 419 439 8.25
26 error 6 1 25 427 452 8.49
27 error 6 2 18 378 396 7.44
28 error 7 0 25 394 419 7.87
29 error 7 1 23 415 438 8.23
30 error 7 2 23 378 401 7.53
31 error 8 0 22 388 410 7.70
32 error 8 1 20 366 386 7.25
33 error 8 2 16 361 377 7.08
34 error 9 0 24 372 396 7.44
35 error 9 1 21 382 403 7.57
36 error 9 2 25 336 361 6.78
37 half 4 0 64 252 316 5.94
38 half 4 1 81 273 354 6.65
39 half 4 2 64 279 343 6.44
40 half 5 0 56 237 293 5.50
41 half 5 1 64 294 358 6.72
42 half 5 2 44 270 314 5.90
43 half 6 0 54 272 326 6.12
44 half 6 1 52 298 350 6.57
45 half 6 2 39 241 280 5.26
46 half 7 0 50 269 319 5.99
47 half 7 1 44 288 332 6.24
48 half 7 2 51 277 328 6.16
49 half 8 0 45 272 317 5.95
50 half 8 1 38 241 279 5.24
51 half 8 2 36 286 322 6.05
52 half 9 0 37 279 316 5.94
53 half 9 1 44 277 321 6.03
54 half 9 2 40 240 280 5.26
tenround$method <- factor(tenround$method)
tenround$round <- factor(tenround$round)
tenround$run <- factor(tenround$run)
tenaov <- aov(percent ~ method + round + run, data=tenround)
summary(tenaov)
Df Sum Sq Mean Sq F value Pr(>F)
method 1 112.845 112.845 174.6996 < 2.2e-16 ***
round 8 107.077 13.385 20.7213 2.793e-12 ***
run 2 0.228 0.114 0.1765 0.8388
Residuals 42 27.129 0.646
d <- c(1.95996, 0.412, 0.423)
rdf <- 42
rms <- deviance(tenaov)/rdf
z <- (d[1] + 1 / (rdf * d[2] - d[3])) * sqrt(rms/3)
meanerr <- apply(array(tenround$percent,dim=c(3,18)), 2, mean)
lcl95 <- meanerr - z
ucl95 <- meanerr + z
tenres <- data.frame(round=c(1:9),
meanfullpc=meanerr[1:9], flcl95=lcl95[1:9], fucl95=ucl95[1:9],
meanerrorpc=meanerr[10:18], elcl95=lcl95[10:18], eucl95=ucl95[10:18])
print(tenres,digits=3)
round meanfullpc flcl95 fucl95 meanerrorpc elcl95 eucl95
1 1 8.55 7.61 9.48 12.89 11.95 13.82
2 2 7.26 6.33 8.20 11.60 10.66 12.53
3 3 6.65 5.71 7.59 10.69 9.75 11.62
4 4 6.32 5.38 7.25 9.07 8.14 10.01
5 5 5.90 4.96 6.83 8.53 7.60 9.47
6 6 5.87 4.94 6.81 8.06 7.12 8.99
7 7 5.75 4.82 6.69 7.88 6.94 8.81
8 8 5.50 4.57 6.44 7.34 6.41 8.28
9 9 5.49 4.55 6.43 7.26 6.33 8.20
sixround$method <- factor(sixround$method)
sixround$round <- factor(sixround$round)
sixround$run <- factor(sixround$run)
sixaov <- aov(percent ~ method + round + run, data=sixround)
summary(sixaov)
Df Sum Sq Mean Sq F value Pr(>F)
method 2 54.390 27.195 138.4376 < 2.2e-16 ***
round 5 7.351 1.470 7.4838 3.735e-05 ***
run 2 1.974 0.987 5.0245 0.01083 *
Residuals 44 8.643 0.196
d <- c(1.95996, 0.412, 0.423)
rdf <- 44
rms <- deviance(sixaov)/rdf
z <- (d[1] + 1 / (rdf * d[2] - d[3])) * sqrt(rms/3)
meanerr <- apply(array(sixround$percent,dim=c(3,18)), 2, mean)
lcl95 <- meanerr - z
ucl95 <- meanerr + z
sixres <- data.frame(round=c(4:9),
meanfull=meanerr[1:6], flcl95=lcl95[1:6], fucl95=ucl95[1:6],
meanerror=meanerr[7:12], elcl95=lcl95[7:12], eucl95=ucl95[7:12],
meanhalf=meanerr[13:18], hlcl95=lcl95[13:18], hucl95=ucl95[13:18])
print(sixres,digits=3)
round meanfull flcl95 fucl95 meanerror elcl95 eucl95 meanhalf hlcl95 hucl95
1 4 6.32 5.80 6.83 9.07 8.56 9.59 6.34 5.83 6.86
2 5 5.90 5.38 6.41 8.53 8.02 9.05 6.04 5.53 6.56
3 6 5.87 5.36 6.39 8.06 7.54 8.57 5.99 5.47 6.50
4 7 5.75 5.24 6.27 7.88 7.36 8.39 6.13 5.61 6.65
5 8 5.50 4.99 6.02 7.34 6.83 7.86 5.75 5.23 6.26
6 9 5.49 4.97 6.01 7.26 6.75 7.78 5.74 5.23 6.26
X11(width=3.5, height=3.5)
plot(tenres$round - 0.06, tenres$meanerrorpc,
main="Full training vs training on error", ylim=c(4,14),
xlab="Number of training cycles", ylab="Percent error", col="red")
lines(tenres$round - 0.06, tenres$eucl95, type="h", col="red")
lines(tenres$round - 0.06, tenres$elcl95, type="h", col="white")
lines(tenres$round - 0.06, tenres$meanerrorpc, col="red")
points(tenres$round + 0.06, tenres$meanfullpc)
lines(tenres$round + 0.06, tenres$fucl95, type="h")
lines(tenres$round + 0.06, tenres$flcl95, type="h", col="white")
lines(tenres$round + 0.06, tenres$meanfullpc)
points(sixres$round, sixres$meanhalf, col="blue")
lines(sixres$round, sixres$hucl95, type="h", col="blue")
lines(sixres$round, sixres$hlcl95, type="h", col="white")
lines(sixres$round, sixres$meanhalf, col="blue")
lines(c(3,4), c(tenres$meanfullpc[3], sixres$meanhalf[1]), col="blue")
text(6, 13.75, labels="error", col="red", pos=4)
text(6, 13.2, labels="full", pos=4)
text(6, 12.65, labels="half", col="blue", pos=4)
axis(1,at=c(1,2,3,4,5,6,7,8,9))
[© Greg Louis, 2002, 2003; last modified 2003-04-12]