The following code for R produces biplots.
I have tried to follow
I have not wrapped this up into a package. When I have time, and when I have debugged some more, I shall. In the meantime, you can download the source code: biplot.R
xyaplot (y~x, groups=sex, data=observations,
axes=y~x|type, data.axes=variables,
axes.spec=list(origin="o",min="min",max="max",ticks=TRUE),
label="label", label.axes="label")
Here is a full script to produce the above output. Make sure you have biplot.R in your working directory. My function princompfull is simply an extended version of princomp, returning the results in a pair of data frames, and including information about the mean, minimum and maximum values in each column.
library(lattice)
trellis.device()
bg <- trellis.par.get("background")
bg$col <- "white"
trellis.par.set("background",bg)
gr.plot.symbol <- trellis.par.get("plot.symbol")
gr.plot.symbol$col <- rgb(0,0,0)
trellis.par.set("plot.symbol",gr.plot.symbol)
gr.line <- trellis.par.get("plot.line")
gr.line$col <- rgb(0,0,0)
trellis.par.set("plot.line",gr.line)
source("biplot.R")
m <- data.frame(row.names=c("Me","Law","Pitt","Li","Deneuve","Clooney","Zwilliger"),
looks=c(8,9,4,6,10,5,4),
charm=c(6,7,1,5,9,7,8),
acting=c(0,8,3,9,9,6,7),
success=c(2,6,9,5,8,9,6),
sex=c("male","male","male","female","female","male","female"))
res <- princompfull(m[,c("looks","charm","acting","success")])
res$observations$sex <- m$sex
res$variables$type <- c("personal","personal","worldly","worldly")
xyaplot (y~x, groups=sex, data=res$observations,
axes=y~x|type, data.axes=res$variables,
axes.spec=list(origin="o",min="min",max="max",ticks=TRUE),
label="label", label.axes="label")
Some brief explanation of how it works. In the command
xyaplot (y~x, groups=sex, data=observations,
axes=y~x|type, data.axes=variables,
axes.spec=list(origin="o",min="min",max="max",ticks=TRUE,interval=TRUE),
label="label", label.axes="label")
the first three arguments are just like those for xyplot: a formula (with optional conditioning terms), a groups term (so that groups of points can be coloured differently), and a data frame containing the variables in the formula.
The second arguments are new. The data frame data.axes contains a row for each axis. The axis= formula specifies the y and x coordinates of a unit step in the direction of that axis. The formula may have conditioning terms. The optional groups.axis term can be used to show groups of axes in different colours.
In the axes.spec= and label= arguments, I diverge from standard lattice usage. axes.spec is a list of character strings, each referring to a column in data.axes (except that ticks and interval may be booleans). Also label and label.axes are character strings, referring to columns in data and data.axes.
The columns referred to by axes.spec can specify: the origin, i.e. the location of the display-origin (0,0) on each axis, and the minimum and maximum range to show for that axis. If all three are specified, and interval=TRUE, the range of the axis is highlighted. Note that all three columns are necessarily numeric vectors (possibly containing NA values).
If you want to control the formatting, you should override the style functions, by specifying style= and style.axis= in the call to xyaplot. These functions take a single argument, the current group, and return the style for that group. See the source code. For example, the default is style=pointstyle.default:
pointstyle.default <- function(g) {
n <- length(levels(g))
i <- which(levels(g)==g)
sym <- if (n>1) trellis.par.get("superpose.symbol") else trellis.par.get("plot.symbol")
cex <- rep(sym$cex,length.out=i)[i]
col <- rep(sym$col,length.out=i)[i]
font <- rep(sym$font, length.out=i)[i]
pch <- rep(sym$pch, length.out=i)[i]
list(cex=cex,col=col,font=font,pch=pch)
}
If you want tighter control over the formatting, you should override the panel functions. There are two default panel functions:
panel=function(...) panel.superpose.ex(...,panel.groups=panel.xyaplot), panel.axes=function(...) panel.superpose.ex(...,panel.groups=panel.xyaplot.axes),
for plotting points and axes. There is also the default panel function panel2=panel2.default which for each panel calls panel.axes and then panel. My panel functions are unlike the standard lattice panel functions. For example,
panel.xyaplot <- function(data, x, y, group, label=NULL, style=pointstyle.default,...) {
sty <- style(group)
if (is.null(label))
panel.xyplot(data[[x]],data[[y]], cex=sty$cex, col=sty$col, pch=sty$pch)
else
ltext(data[[x]],data[[y]],data[[label]], cex=sty$cex, col=sty$col, font=sty$font)
}
The first argument is a subset of the original data frame, containing those points we wish to plot. x and y are characters, indicating which columns of the data frame contain x and y coordinates. group is single factor variable, indicating which group we are currently plotting.