Biplots

The following code for R produces biplots. I have tried to follow Biplots (1996) by Gower and Hand, though there may be errors in my linear maths.

I have not wrapped this up into a package. When I have time, and when I have debugged some more, I shall. In the meantime, you can download the source code: biplot.R

Typical use
xyaplot (y~x, groups=sex, data=observations,
         axes=y~x|type, data.axes=variables,
         axes.spec=list(origin="o",min="min",max="max",ticks=TRUE),
         label="label", label.axes="label")
Typical output

Here is a full script to produce the above output. Make sure you have biplot.R in your working directory. My function princompfull is simply an extended version of princomp, returning the results in a pair of data frames, and including information about the mean, minimum and maximum values in each column.

Full invoking script
library(lattice)
trellis.device()
bg <- trellis.par.get("background")
bg$col <- "white"
trellis.par.set("background",bg)
gr.plot.symbol <- trellis.par.get("plot.symbol")
gr.plot.symbol$col <- rgb(0,0,0)
trellis.par.set("plot.symbol",gr.plot.symbol)
gr.line <- trellis.par.get("plot.line")
gr.line$col <- rgb(0,0,0)
trellis.par.set("plot.line",gr.line)
source("biplot.R")

m <- data.frame(row.names=c("Me","Law","Pitt","Li","Deneuve","Clooney","Zwilliger"),
                looks=c(8,9,4,6,10,5,4),
                charm=c(6,7,1,5,9,7,8),
                acting=c(0,8,3,9,9,6,7),
                success=c(2,6,9,5,8,9,6),
                sex=c("male","male","male","female","female","male","female"))
res <- princompfull(m[,c("looks","charm","acting","success")])
res$observations$sex <- m$sex
res$variables$type <- c("personal","personal","worldly","worldly")

xyaplot (y~x, groups=sex, data=res$observations,
         axes=y~x|type, data.axes=res$variables,
         axes.spec=list(origin="o",min="min",max="max",ticks=TRUE),
         label="label", label.axes="label")

Some brief explanation of how it works. In the command

Typical use
xyaplot (y~x, groups=sex, data=observations,
         axes=y~x|type, data.axes=variables,
         axes.spec=list(origin="o",min="min",max="max",ticks=TRUE,interval=TRUE),
         label="label", label.axes="label")

the first three arguments are just like those for xyplot: a formula (with optional conditioning terms), a groups term (so that groups of points can be coloured differently), and a data frame containing the variables in the formula.

The second arguments are new. The data frame data.axes contains a row for each axis. The axis= formula specifies the y and x coordinates of a unit step in the direction of that axis. The formula may have conditioning terms. The optional groups.axis term can be used to show groups of axes in different colours.

In the axes.spec= and label= arguments, I diverge from standard lattice usage. axes.spec is a list of character strings, each referring to a column in data.axes (except that ticks and interval may be booleans). Also label and label.axes are character strings, referring to columns in data and data.axes.

The columns referred to by axes.spec can specify: the origin, i.e. the location of the display-origin (0,0) on each axis, and the minimum and maximum range to show for that axis. If all three are specified, and interval=TRUE, the range of the axis is highlighted. Note that all three columns are necessarily numeric vectors (possibly containing NA values).

If you want to control the formatting, you should override the style functions, by specifying style= and style.axis= in the call to xyaplot. These functions take a single argument, the current group, and return the style for that group. See the source code. For example, the default is style=pointstyle.default:

pointstyle.default, the default style function for observations
pointstyle.default <- function(g) {
  n <- length(levels(g))
  i <- which(levels(g)==g)
  sym <- if (n>1) trellis.par.get("superpose.symbol") else trellis.par.get("plot.symbol")
  cex <- rep(sym$cex,length.out=i)[i]
  col <- rep(sym$col,length.out=i)[i]
  font <- rep(sym$font, length.out=i)[i]
  pch <- rep(sym$pch, length.out=i)[i]
  list(cex=cex,col=col,font=font,pch=pch)
}

If you want tighter control over the formatting, you should override the panel functions. There are two default panel functions:

default panel functions
panel=function(...) panel.superpose.ex(...,panel.groups=panel.xyaplot),
panel.axes=function(...) panel.superpose.ex(...,panel.groups=panel.xyaplot.axes),

for plotting points and axes. There is also the default panel function panel2=panel2.default which for each panel calls panel.axes and then panel. My panel functions are unlike the standard lattice panel functions. For example,

default panel function for plotting points
panel.xyaplot <- function(data, x, y, group, label=NULL, style=pointstyle.default,...) {
  sty <- style(group)
  if (is.null(label))
    panel.xyplot(data[[x]],data[[y]], cex=sty$cex, col=sty$col, pch=sty$pch)
  else
    ltext(data[[x]],data[[y]],data[[label]], cex=sty$cex, col=sty$col, font=sty$font)
}

The first argument is a subset of the original data frame, containing those points we wish to plot. x and y are characters, indicating which columns of the data frame contain x and y coordinates. group is single factor variable, indicating which group we are currently plotting.