Bogofilter Parameters: Effect of varying s and mindev, continued:
Comparison of results from different email sources

Abstract:

It is demonstrated that a relatively high value of the minimum deviation, together with a value of Robinson's s around 0.1, allows bogofilter to discriminate fairly effectively between spam and nonspam messages with three different message corpora: one from a multiuser environment and two individuals' email collections. A warning is given that using very low values of s, between 0.001 and 1.0e-8, can distort the calculation of the spam score for certain messages, with effects that are not easily predictable.

Practical instructions for tuning bogofilter are presented in the accompanying HOWTO document.

Introduction and general description:

This report describes the third in a series of experiments intended to characterize the effects on bogofilter's discrimination capability of varying two of its fundamental parameters.  The report of the first experiment of this series has a longish introduction that explains the background and purpose of the series.  Readers unfamiliar with bogofilter's calculation methods, or with the testing methods used here, might find it helpful to read that introduction.

That experiment found that "The best discrimination was achieved with a mindev of 0.35 and s of 0.0032; the cutoff point for those values was 0.992.  There are, however, other local minima; it's not clear at this point how generally applicable this minimum might prove."  It seemed desirable to repeat the experiment with larger training and test message corpora, and that was the object of the second in the series.  It used final sizes of:

$ grep -c '^From ' *.ns *.sp
r0.ns:3502
r1.ns:3502
r2.ns:3501
t.ns:21116
r0.sp:5667
r1.sp:5667
r2.sp:5666
t.sp:20935

The t.* files were used to build a bogofilter training set, and the r0, r1 and r2 files were used for classification.  These experiments employed a special version of bogofilter with an option  -m mindev[,cutoff[,s]]  so that mindev, s and the spam-cutoff value (the value of the spam index that determines the classification) could be set from the command line.

(Note that starting with version 0.11.1.9, bogofilter's -o and -m options together allow specification of these parameters; a special version is no longer needed. The listing of the "runex" script in the Appendix has been modified to work with the new command-line options.)

In that second experiment, an s value between 0.032 and 0.32 appeared to be optimal for the data used, and a relatively high mindev (between 0.3 and 0.4) was to be preferred.

In the conclusions I stated that this experiment should be repeated with mail corpora derived from other sources, in order to determine how generally applicable its conclusions might be.  That was the purpose of the present work, the third in the series.

Procedure and results:

The second experiment indicated the desirability of scanning the range of Robinson's s parameter between 1 and 1e-8, and of testing a full range of minimum deviations as well.  Three data sets (message corpora) were used: GL-w, the message set from the previous experiment; GL-h, a body of the author's personal email; and DR-h, a body of David Relson's personal email; David kindly ran the tests and supplied me with the results for this writeup.  The sizes of these corpora are shown in the following table, along with the ranges of values over which s and the minimum deviation (md) were tested (the cutoff target is explained below):
                GL-w    GL-h    DR-h
ns (training)  21116   26732   14182
sp (training)  20935    9462    5238
ns (run 1)      3502   13365    4728
ns (run 2)      3502   13366    4727
ns (run 3)      3501     n/a    4727
sp (run 1)      5667    4732    1746
sp (run 2)      5667    4730    1746
sp (run 3)      5666     n/a    1746
smallest md     0.02    0.02    0.02
largest md      0.48    0.48    0.48
smallest s      1e-8    1e-8    1e-8
largest s       4.64    4.64    4.64
cutoff target     21       8       6
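
The grid of test values can be sketched in R as follows.  This is an illustration only: the spacing (three s values per decade, rounded to three significant figures, and md in steps of 0.02) is inferred from the values that appear in the result tables rather than taken from the scripts actually used, and the names s_values, md_values and grid are made up for the example.

```r
# Parameter grid inferred from the reported values (an assumption):
# s takes three values per decade between 1e-8 and 4.64.
s_values  <- signif(10^((-24:2) / 3), 3)   # 1e-08, 2.15e-08, ..., 2.15, 4.64
md_values <- (1:24) * 0.02                 # 0.02, 0.04, ..., 0.48

# Every combination of s and md that would be evaluated
grid <- expand.grid(md = md_values, s = s_values)

cat("s values:     ", length(s_values), "\n")
cat("md values:    ", length(md_values), "\n")
cat("combinations: ", nrow(grid), "\n")
```

Under these assumptions there are 27 values of s and 24 of md, which agrees with the 26 and 23 degrees of freedom reported for s and md in the analysis-of-variance summaries.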

Each message set was individually evaluated.  First, the training files were used to create new training databases; then a combination of bogolexer and bogoutil was used to build, for each test file, a set of message digests that could be fed into a specially written classifier (apclass).  This greatly accelerated the processing, in comparison with the formerly-used method of reclassifying the original messages repeatedly with bogofilter (the Appendix describes that method, since apclass is not generally available).  Then, for each combination of s and md, the following procedure was applied.  The ns (nonspam) digest files were pooled and the pool was classified with apclass.  A spam cutoff was chosen, such that the number of false positives (messages with spam scores greater than or equal to the cutoff) was equal to the "cutoff target" value.  The current s, md and spam cutoff values were then used by apclass to classify each message in the spam (sp) digest files; the number of false negatives (messages with spam scores lower than the cutoff) in each run was determined.
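
The cutoff selection just described might be sketched like this.  It is a minimal illustration and not the apclass code: the function names and all of the scores are hypothetical, and it assumes that no two nonspam scores tie at the boundary.

```r
# Choose the spam cutoff so that exactly `target` pooled nonspam scores
# are greater than or equal to it (assumes no tie at the boundary).
choose_cutoff <- function(ns_scores, target) {
  sorted <- sort(ns_scores, decreasing = TRUE)
  stopifnot(target >= 1, target <= length(sorted))
  sorted[target]                      # the target-th highest nonspam score
}

# A false negative is a spam message scoring below the cutoff.
count_false_negatives <- function(sp_scores, cutoff) {
  sum(sp_scores < cutoff)
}

# Hypothetical scores, for illustration only
ns_scores <- c(0.01, 0.02, 0.05, 0.10, 0.30, 0.62, 0.91)
sp_runs   <- list(run1 = c(0.99, 0.97, 0.55, 1.00),
                  run2 = c(0.40, 0.98, 0.60, 0.70))

cutoff <- choose_cutoff(ns_scores, target = 2)
fp <- sum(ns_scores >= cutoff)        # equals the target by construction
fn <- sapply(sp_runs, count_false_negatives, cutoff = cutoff)

print(cutoff)
print(fp)
print(fn)
```

Fixing the false-positive count in this way means that the combinations of s and md are compared on false negatives alone.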

As these evaluations produced many data, the results appear in separate tables: GL-w, GL-h and DR-h.  The first two lines of each table repeat the numbers of spam and nonspam messages in the run files; the remaining lines contain columns of s, md, cutoff, run ordinal, false-positive count and false-negative count in that order.

Script smindev.R (listed in the Appendix) was then run on each of the three tables named in the preceding paragraph.  The output of this script consists of four parts: (1) the summary of an analysis of variance of the percent error with factors s and md and the interaction between them; (2) a list of the 15 combinations of s and md that gave the lowest percent error for the message set; (3) a perspective plot of percent correct (not percent error this time) vs. s and md; and (4) a perspective plot of the cutoff value vs. s and md.
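
As a small worked example of the percent-error figure, the following sketch applies the same formula as smindev.R to the GL-w message counts.  The false-positive and false-negative counts are hypothetical values chosen for illustration; only the message counts come from the table of corpus sizes.

```r
# Message counts for the GL-w run files (from the table of corpus sizes)
ns_counts <- c(3502, 3502, 3501)
sp_counts <- c(5667, 5667, 5666)

# As in smindev.R: mean nonspam count plus mean spam count per run
msgcount <- mean(ns_counts) + mean(sp_counts)

# Hypothetical results for one s / md combination
fp <- 21                 # false positives (hypothetical)
fn <- c(84, 85, 86)      # false negatives in each of three runs (hypothetical)

percent      <- (fp + fn) * 100 / msgcount   # percent error for each run
mean_percent <- mean(percent)                # mean over the replicate runs

print(round(percent, 3))
print(round(mean_percent, 3))
```

With these made-up counts the mean error comes out at about 1.16 percent, the same order of magnitude as the best GL-w figures listed in this report.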

Here are the results for dataset GL-w.  In the table, the leftmost column is just the original ordinal of the record, and may be ignored; column rs has the s value, and md the minimum deviation.  Both parameters and their interaction are highly significant:

              Df Sum Sq Mean Sq  F value    Pr(>F)    
s             26 1964.8    75.6  3231.97 < 2.2e-16 ***
md            23 6348.9   276.0 11805.69 < 2.2e-16 ***
s:md         598 1601.9     2.7   114.57 < 2.2e-16 ***
Residuals   1296   30.3 0.02338                       
---

         rs   md cutoff  percent
212 0.01000 0.40  0.504 1.156153
188 0.02150 0.40  0.508 1.159789
141 0.10000 0.42  0.519 1.199782
165 0.04640 0.42  0.514 1.207053
260 0.00215 0.40  0.502 1.217960
189 0.02150 0.42  0.507 1.221596
115 0.21500 0.38  0.565 1.247046
210 0.01000 0.36  0.533 1.254317
140 0.10000 0.40  0.554 1.265225
213 0.01000 0.42  0.504 1.268860
164 0.04640 0.40  0.536 1.276132
138 0.10000 0.36  0.603 1.283403
187 0.02150 0.38  0.530 1.283403
284 0.00100 0.40  0.502 1.283403
211 0.01000 0.38  0.523 1.294310

Here are the corresponding results for dataset GL-h:

             Df  Sum Sq  Mean Sq  F value    Pr(>F)    
s            26 1067.41    41.05 16984.15 < 2.2e-16 ***
md           23  453.68    19.73  8160.33 < 2.2e-16 ***
s:md        598  336.14     0.56   232.55 < 2.2e-16 ***
Residuals   648    1.57 0.002417                       
---

          rs   md cutoff   percent
166 4.64e-02 0.44  0.608 0.8206007
329 2.15e-04 0.34  0.503 0.8288896
352 1.00e-04 0.32  0.503 0.8399414
305 4.64e-04 0.34  0.507 0.8454674
353 1.00e-04 0.34  0.503 0.8703340
304 4.64e-04 0.32  0.514 0.8730970
328 2.15e-04 0.32  0.509 0.8786229
190 2.15e-02 0.44  0.610 0.8841489
281 1.00e-03 0.34  0.517 0.8841489
142 1.00e-01 0.44  0.709 0.8924378
165 4.64e-02 0.42  0.655 0.9145415
375 4.64e-05 0.30  0.506 0.9173045
377 4.64e-05 0.34  0.503 0.9173045
327 2.15e-04 0.30  0.518 0.9200674
280 1.00e-03 0.32  0.537 0.9228304

And here are the results for dataset DR-h:

              Df Sum Sq Mean Sq F value    Pr(>F)    
s             26 493.23   18.97 1720.22 < 2.2e-16 ***
md            23  68.09    2.96  268.45 < 2.2e-16 ***
s:md         598 549.74    0.92   83.36 < 2.2e-16 ***
Residuals   1296  14.29    0.01                      
---

          rs   md cutoff   percent
406 2.15e-05 0.44    0.5 0.8599382
407 2.15e-05 0.46    0.5 0.9320288
383 4.64e-05 0.46    0.5 0.9474768
312 4.64e-04 0.48    0.5 0.9577755
360 1.00e-04 0.48    0.5 0.9629248
336 2.15e-04 0.48    0.5 0.9783728
431 1.00e-05 0.46    0.5 0.9783728
384 4.64e-05 0.48    0.5 0.9835221
502 1.00e-06 0.44    0.5 0.9835221
526 4.64e-07 0.44    0.5 0.9938208
408 2.15e-05 0.48    0.5 1.0041195
432 1.00e-05 0.48    0.5 1.0195675
479 2.15e-06 0.46    0.5 1.0453141
574 1.00e-07 0.44    0.5 1.0504634
456 4.64e-06 0.48    0.5 1.0556128

The next graphic shows the percentage-correct plots again, grouped to make comparison easier.  The personal (single-user) email dataset results appear on the left and the work-environment (multi-user) result on the right:

Relatively low values of s (1e-3 and below), and relatively high values of minimum deviation (0.25 and above) seem consistently to produce good results, but low s values are dangerous (see the first item in the Conclusions).  It would be helpful to know if there were a region within the s / md surface that gave near-optimal discrimination for all three data sets, as this might then be taken as a worthwhile starting point for new bogofilter installations.  If we take the results for the three datasets and rank the s / md pairs according to the percentage correct that was obtained in each case, each s / md combination will have three ranks.  If we then take the worst rank for each point as a measure of the overall merit of that s / md combination, we can list and/or plot the "best" combinations.
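
The worst-rank idea can be sketched on a tiny scale as follows.  The three s / md pairs and their percent-error figures are taken from the merged results in this report, but the ranks computed here are ranks among these three rows only, not among all the combinations tested, and rank() is used as a shorthand in place of the sorting steps in mergeparm.R.

```r
# Percent error for three s / md pairs on each data set
p <- data.frame(
  rs    = c(1.00e-01, 4.64e-02, 1.00e-01),
  md    = c(0.44, 0.44, 0.42),
  glwpc = c(1.35, 1.41, 1.20),
  glhpc = c(0.892, 0.821, 0.970),
  drhpc = c(1.71, 1.72, 2.04)
)

# Rank within each data set: 1 is the lowest percent error
p$glwr <- rank(p$glwpc, ties.method = "first")
p$glhr <- rank(p$glhpc, ties.method = "first")
p$drhr <- rank(p$drhpc, ties.method = "first")

# Worst rank across the data sets, and the largest disagreement in rank
p$maxr <- pmax(p$glwr, p$glhr, p$drhr)
p$difr <- pmax(abs(p$glwr - p$glhr),
               abs(p$glwr - p$drhr),
               abs(p$glhr - p$drhr))

# Sort by worst rank, and within equal worst rank by difference
best <- p[order(p$maxr, p$difr), ]
print(best)
```

A pair that is excellent for one data set but poor for another is pushed down the list by its worst rank, which is the behaviour wanted of a compromise setting.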

The following table shows the best fifteen combinations of s (rs) and minimum deviation (md), with the percent error (pc) and cutoff (co) for each data set, the rank within each data set (a smaller number means a higher rank), the maximum (worst) of the three ranks, and the greatest difference between ranks; the rows of the table are sorted by maximum rank, and within rank by difference.  The graph on the left is a perspective plot of the best 150 points; on the right, to facilitate picking the optimum parameter values, the same data are plotted in a "thermal image" form:

          rs   md glwpc glwco glhpc glhco drhpc drhco glwr glhr drhr maxr difr
142 1.00e-01 0.44  1.35 0.526 0.892 0.709  1.71 0.747   28   10   34   34   24
166 4.64e-02 0.44  1.41 0.525 0.821 0.608  1.72 0.691   40    1   36   40   39
190 2.15e-02 0.44  1.46 0.517 0.884 0.610  1.92 0.689   56    8   57   57   49
214 1.00e-02 0.44  1.52 0.512 0.970 0.606  1.86 0.611   78   69   49   78   29
141 1.00e-01 0.42  1.20 0.519 0.970 0.768  2.04 0.943    3   68  111  111  108
165 4.64e-02 0.42  1.21 0.514 0.915 0.655  2.04 0.912    4   11  112  112  108
400 2.15e-05 0.32  1.59 0.539 0.975 0.507  2.05 0.669  101   88  126  126   38
399 2.15e-05 0.30  1.63 0.582 0.997 0.510  2.03 0.696  123  135  107  135   28
375 4.64e-05 0.30  1.60 0.587 0.917 0.506  2.07 0.766  107   12  140  140  128
189 2.15e-02 0.42  1.22 0.507 0.926 0.610  2.09 0.873    6   16  159  159  153
376 4.64e-05 0.32  1.55 0.540 0.931 0.506  2.09 0.722   91   19  161  161  142
423 1.00e-05 0.30  1.65 0.573 1.014 0.507  2.02 0.650  128  166   95  166   71
117 2.15e-01 0.42  1.54 0.544 1.025 0.759  1.93 0.895   83  178   58  178  120
237 4.64e-03 0.42  1.42 0.509 1.025 0.604  2.05 0.744   45  180  120  180  135
352 1.00e-04 0.32  1.55 0.557 0.840 0.503  2.11 0.779   90    3  180  180  177

In the following table, these 15 "best-compromise" values are compared against the individual optima of the work and personal data sets.  The glwd, glhd and drhd columns show the amount by which the percent error figures exceed the optima (1.16, 0.82 and 0.86% respectively):

         rs   md    glwd    glhd    drhd
1  1.00e-01 0.44 0.19270 0.07184 0.84964
2  4.64e-02 0.44 0.25087 0.00000 0.85994
3  2.15e-02 0.44 0.30540 0.06355 1.06076
4  1.00e-02 0.44 0.36721 0.14920 0.99897
5  1.00e-01 0.42 0.04363 0.14920 1.17919
6  4.64e-02 0.42 0.05090 0.09394 1.17919
7  2.15e-05 0.32 0.42902 0.15473 1.19464
8  2.15e-05 0.30 0.47628 0.17683 1.17405
9  4.64e-05 0.30 0.44719 0.09670 1.21009
10 2.15e-02 0.42 0.06545 0.10499 1.22554
11 4.64e-05 0.32 0.39630 0.11052 1.22554
12 1.00e-05 0.30 0.49082 0.19341 1.15860
13 2.15e-01 0.42 0.38175 0.20446 1.07106
14 4.64e-03 0.42 0.26541 0.20446 1.18949
15 1.00e-04 0.32 0.39630 0.01934 1.24614
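
The first row of this table can be checked by hand.  The sketch below subtracts the lowest percent error found for each data set (the first row of each individual results table) from the rounded percent-error figures of the first compromise setting; because the inputs are rounded, the results agree with the table only to about two decimal places.  The variable names are made up for the example.

```r
# Lowest percent error found for each data set
optimum    <- c(glw = 1.156153, glh = 0.8206007, drh = 0.8599382)

# Percent error of the first compromise setting (s = 0.1, md = 0.44),
# as printed (rounded) in the merged table
compromise <- c(glw = 1.35, glh = 0.892, drh = 1.71)

# Amount by which the compromise setting exceeds each optimum
excess <- compromise - optimum
print(round(excess, 2))
```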

Conclusions:

  1. Variation of the s parameter within a range so close to zero that one would think there should be no effect -- particularly at high minimum-deviation values -- did in fact influence the distribution of percent-correct scores, as is obvious from the spam-cutoff graphs.  The reason is that tiny values of s disproportionately increase the influence of tokens that have hitherto been seen only in spam or only in nonspam: for such tokens p(w) is exactly 0 or 1, so that s alone determines how far f(w) deviates from 0 or 1.  It follows that, to avoid giving excessive weight to tokens that appear just a few times in one wordlist and not at all in the other, the value of the s parameter should not be set lower than about 0.01.  Fortunately, many of the best "compromise" settings listed above do in fact include s values of 0.01 and above.  (David Relson contributed significantly to the understanding of this problem, both with discussion and by supplying the message that ultimately showed what it was -- Thanks, David!)
  2. Ignoring those tokens with f(w) values not far from 0.5 seems consistently to give better results.  Although the difference is small when a well-trained and uniform message corpus is examined, a minimum deviation in the range of 0.3 to 0.46 may be preferable.
  3. The first line in the two preceding tables shows a good "compromise" setting that gives near-optimum discrimination for both the GL-w and the GL-h data sets, but when the DR-h data set is included, the best compromise is still significantly worse than optimal for that set.  (It's not entirely clear what causes the odd jump in discrimination at very high minimum deviation and low s values; if that is dismissed as an artefact, the compromise setting is quite good for all three data sets.)  It seems that although discrimination deemed adequate may be obtained with stock parameter values, some tuning of s and mindev, and consequently of the spam cutoff, can yield a significant reduction in the number of spam messages that bogofilter fails to recognize.
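
The effect described in the first two conclusions can be illustrated with a short sketch based on Robinson's formula f(w) = (s*x + n*p(w)) / (s + n), where n is the number of times the token has been seen and x is the value assumed for a token never seen before.  The value x = 0.5, the token counts and the treatment of a deviation exactly equal to mindev are assumptions made for the example; they are not taken from bogofilter's code.

```r
# Robinson's estimate of a token's spamminess:
#   p: observed spam probability of the token
#   n: number of times the token has been seen
#   s: strength given to the prior, x: prior for unseen tokens (0.5 assumed)
f_w <- function(p, n, s, x = 0.5) (s * x + n * p) / (s + n)

# A token seen exactly once, and only in spam (p = 1)
s_values     <- c(1, 0.1, 0.01, 1e-3, 1e-8)
once_in_spam <- f_w(p = 1, n = 1, s = s_values)
print(data.frame(s = s_values, f = once_in_spam,
                 one_minus_f = 1 - once_in_spam))

# Minimum-deviation filter: tokens with f(w) too close to 0.5 are ignored
mindev <- 0.44
tokens <- data.frame(p = c(1, 0.55, 0.9, 0),   # hypothetical tokens
                     n = c(1, 40, 12, 2))
tokens$f    <- f_w(tokens$p, tokens$n, s = 0.1)
tokens$used <- abs(tokens$f - 0.5) >= mindev
print(tokens)
```

As s shrinks, a token seen once in spam moves from f(w) = 0.75 towards 1, so 1 - f(w) becomes vanishingly small.  In the second part, with s = 0.1 and mindev = 0.44, the two tokens seen only once or twice in a single wordlist pass the filter, while the token seen twelve times with p(w) = 0.9 does not.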

Appendix: Details of the experimental procedure

This appendix is intended to contain sufficient information to permit the experiment to be repeated.  Other experimental reports, to which my general bogofilter page contains links, may help to clarify some of what is presented here.

The following R scripts were used in data reduction in this experiment.  Other (non-R) scripts needed are included as comments in the smindev.R listing (the runex script shown is one that uses bogofilter for all its classification work, since apclass is not generally available).  The setR script is used to render smindev.R and mergeparm.R executable, with parameters, from the command line.


#! /bin/sh /usr/bin/setR
# This file is smindev.R, an R script to perform the data reduction for
# experiments in which min_dev and s are varied over a wide range of
# values; for each combination, a spam cutoff is first determined such
# that there are no more than some target number of false positives when
# nonspam files are pooled and evaluated, and with that cutoff, the number
# of false negatives is determined for each spam file.

# The following script distributes the messages into four files, t for
# training and r0, r1 and r2 for the experiment.
# It's run from formail as follows:
#   cat [list of spam mbox files] | formail -s ./distrib sp
#   cat [list of nonspam mboxen]  | formail -s ./distrib ns

#   #! /bin/sh
#   #  distrib - deal messages from an mbox into files
#   #  usage: FILENO=0 formail -s ./distrib extension < mbox
# 
#   # put names of files to be produced into this array
#   FILE=(t.$1 r0.$1 t.$1 r1.$1 t.$1 r2.$1)
#
#   # no user serviceable parts beyond this point
#   let n=${FILENO}%${#FILE[*]}
#   fname=${FILE[$n]}
#   cat >>$fname

# In the experiment for which this script was written, extra files
# were added to the training mailboxes after the distrib script had
# been run:
# $ ./sizes
# ns 3502 3502 3501
# sp 5667 5667 5666

# The training database is built with the command
#   bogofilter -d db -s &1 | \
#     perl -e ' $target = 10; while (<>) { ' \
#	 -e ' ($i, $d) = split; push @diffs, $d unless $i != 1; }' \
#	 -e ' die "dainbramage" unless scalar @diffs > 15;' \
#	 -e ' @s = sort { $a <=> $b } (@diffs); $co = $s[$target];' \
#	 -e ' while($co < 0.000001) { ++$target; $co = $s[$target]; }' \
#	 -e ' printf("%8.6f %d",1.0-$s[$target],$target-1);'`
# }
#
# function wrapper () {
#   mopt=$1; oopt=$2; shift 2; v=-v
#   res=`cat $* | formail -s bogofilter -d db -m $mopt -o $oopt -v -t 2>&1 | \
#     grep -c $v '^1'`
# }
#
# sizes >parms.tbl
# for s in 1e-2 3.2e-3 1e-3 3.2e-4 1e-4 3.2e-5 1e-5 3.2e-6 1e-6 3.2e-7 \
#   1e-7 3.2e-8 1e-8; do
#   for md in `seq 0.025 0.025 0.47501`; do
#     echo -n "$s $md fpos... "
#     getco $md,$s 0.1 r0.ns r1.ns r2.ns
#     fpos=${res##* }; co=${res%% *}; let fpos=$fpos/3
#     echo -n "$fpos at cutoff $co, run0... "
#     run=0; wrapper $md,$s $co r0.sp; fneg=$res
#     echo "$s $md $co $run $fpos $fneg" >>parms.tbl
#     echo -n "$fneg, run1... "
#     run=1; wrapper $md,$s $co r1.sp; fneg=$res
#     echo "$s $md $co $run $fpos $fneg" >>parms.tbl
#     echo -n "$fneg, run2... "
#     run=2; wrapper $md,$s $co r2.sp; fneg=$res
#     echo "$s $md $co $run $fpos $fneg" >>parms.tbl
#     echo $fneg
#   done
# done

#                                                              /**/
# For use in R, parms.tbl from runex is expected in bogolog/smindev.tbl
# by default; otherwise run ./smindev.R filename

graphics.off(); setwd("/proj/Rwork")
if(length(argv) > 0) fn <- argv[1] else fn <- "bogolog/smindev.tbl"
if(file.exists(fn) == FALSE) stop(paste("file", fn, "not found"))
if(length(argv) > 1) sub <- argv[2] else sub <- "--"

### First read the message counts:
read.table(fn, nrows=2) -> meta
msgcount <- sum(apply(meta[,2:length(meta)],1,mean))

### Now read the data 
parms <- read.table(fn, col.names=c("s", "md", "cutoff", "run", "fp", "fn"),
  skip=2)

### Get axis values and number of replicates
sval <- sort(unique(parms$s), decreasing=TRUE)
x <- -log10(sval)
y <- sort(unique(parms$md))
n <- length(unique(parms$run))

### Express error in percentage and perform an anova
parms$percent = (parms$fp + parms$fn) * 100 / msgcount
parms$s = factor(parms$s)
parms$md = factor(parms$md)
paov <- aov(percent ~ s + md + s*md, data=parms)
print(summary(paov))

### Now express results as mean percent error over the replicate runs
pcs <- array(parms$percent, dim=c(n, length(parms$percent)/n))
meanerr <- apply(pcs, 2, mean)
cutoffs <- array(parms$cutoff, dim=c(n, length(parms$cutoff)/n))[1,]

### Create a data frame with these results
parmres <- data.frame(rs=rep(sval, each=length(parms$s)/(n*length(sval))),
  md=rep(y,length(sval)), cutoff=cutoffs, percent=meanerr)

### calculate the z-axis values for percent correct and for cutoffs
z <- t(array(100 - parmres$percent, dim=c(length(parms$md)/(n * length(sval)),
  length(sval))))
co <- t(array(parmres$cutoff, dim=c(length(parms$md)/(n * length(sval)),
  length(sval))))

X11(width=4.5, height=4.5)

### produce a trial plot, making it easy to try other rotation values
pplot <- function(th,ph) {
  persp(x, y, z, ticktype="detailed", theta=th, phi=ph,
  main="Percent correct vs s and mindev", sub=sub,
  xlab="-log(10) s", ylab="mindev", zlim=c(90,100),
  zlab="percent correct", shade=0.6, border=4, r=sqrt(2), d=2.5)
}

pplot(70,15)

### another trial plot, this time for cutoffs
X11(width=4.5, height=4.5)

qplot <- function(th,ph) {
  persp(x, y, co, ticktype="detailed", theta=th, phi=ph,
  main="Cutoff vs s and mindev", sub=sub, xlab="-log(10) s",
  ylab="mindev", zlab="cutoff", shade=0.6, border=4, r=sqrt(2), d=2.5)
}

qplot(70,15)

### get the 15 best combinations of s and mindev
sortlist <- sort(parmres$percent, index.return=TRUE)
system("echo")
print(parmres[sortlist$ix,][1:15,])


#! /bin/sh /usr/bin/setR
#  mergeparm.R -- find common optima
graphics.off(); setwd("/proj/Rwork")
if(length(argv) < 3) {
    read.table("bogolog/parm21glw") -> glw
    read.table("bogolog/parm08glh") -> glh
    read.table("bogolog/parm06drh") -> drh
    sub <- "GL-w / GL-h / DR-h, best 150"
} else {
    glw <- read.table(argv[1])
    glh <- read.table(argv[2])
    drh <- read.table(argv[3])
    if(length(argv) > 3) sub <- argv[4] else sub <- "--"
}
# Get the individual md and rs values and make rs and md columns for
# the data frame to be written to parmres.merge
md <- sort(unique(drh$md))
rs <- sort(unique(drh$rs), decreasing=TRUE)
prs <- rep(rs, each=length(md))
pmd <- rep(md,length(rs))

# Create vectors of percentages and cutoffs for data points in common
drhpc <- seq(1,length(drh$rs))
drhco <- drhpc
glwpc <- drhpc
glwco <- drhpc
glhpc <- drhpc
glhco <- drhpc
for(i in 1:length(drh$rs)) {
  for(j in 1:length(prs)) {
    if(drh$rs[i] == prs[j] && drh$md[i] == pmd[j]) {
      drhpc[j] <- drh$percent[i]
      drhco[j] <- drh$cutoff[i]
    }
  }
}
for(i in 1:length(glw$rs)) {
  for(j in 1:length(prs)) {
    if(glw$rs[i] == prs[j] && glw$md[i] == pmd[j]) {
      glwpc[j] <- glw$percent[i]
      glwco[j] <- glw$cutoff[i]
    } 
  }
}
for(i in 1:length(glh$rs)) {
  for(j in 1:length(prs)) {
    if(glh$rs[i] == prs[j] && glh$md[i] == pmd[j]) {
      glhpc[j] <- glh$percent[i]
      glhco[j] <- glh$cutoff[i]
    } 
  }
}

# Create and write the merged table
p <- data.frame(rs=prs, md=pmd, glwpc=glwpc, glwco=glwco,
  glhpc=glhpc, glhco=glhco, drhpc=drhpc, drhco=drhco)
sink("bogolog/parmres.merge"); print(p); sink()

# Rank the percentage figures and add columns of ranks
sglw <- sort(p$glwpc, index.return=TRUE)
sglh <- sort(p$glhpc, index.return=TRUE)
sdrh <- sort(p$drhpc, index.return=TRUE)
rsglw <- sort(sglw$ix, index.return=TRUE)
rsglh <- sort(sglh$ix, index.return=TRUE)
rsdrh <- sort(sdrh$ix, index.return=TRUE)
p$glwr <- rsglw$ix              
p$glhr <- rsglh$ix
p$drhr <- rsdrh$ix

# Get the maximum (worst) rank and the greatest difference in rank
# for each rs/md pair
p$maxr <- pmax(p$glwr, p$glhr, p$drhr)
dif1 <- abs(p$glwr - p$glhr)
dif2 <- abs(p$glwr - p$drhr)
dif3 <- abs(p$glhr - p$drhr)
p$difr <- pmax(dif1, dif2, dif3)

# Sort by rank, and within rank, by difference
sortrank <- sort(p$maxr + p$difr / 10000, index.return=TRUE)

# Create a table in sortrank order, copy and print the top 150 records
ranka <- p[sortrank$ix,]
rank150 <- ranka[1:150,]
print(rank150,digits=3)

# Make x, y and z and do thermal and perspective plots
x <- sort(unique(p$md))
sval <- sort(unique(p$rs))
y <- log10(sval)
z <- array(dim=c(length(y),length(x)))
for(i in seq(along=x)) {
  for(j in seq(along=y)) {
    for(k in 1:length(rank150$rs)) {
      if((rank150$rs[k] == sval[j]) && (rank150$md[k] == x[i])) z[j,i] <-
        rank150$maxr[k]
    }
  }
}
X11(width=3.5,height=3.5)
image(x,y,t(z),col=heat.colors(15),
  main="Discrimination (darker is better) vs s and mindev",
  sub=sub,ylab="log(10) s", xlab="mindev")
X11(width=3.5,height=3.5)
persp(x,y,300-t(z),ticktype="detailed",shade=0.6, phi=15, theta=-20,
  expand=0.7, border=4,r=sqrt(2),d=2.5,xlab="mindev",ylab="log(10) s",
  zlab="rank",main="Discrimination vs. s and mindev", sub=sub)

Greg Louis, 2003; last modified 2003-04-19