A cheatsheet for R.
See on CiteULike.
> install.packages("PKGNAME", dependencies=TRUE)
Reduce a data frame to at most a given number of rows, taken at evenly spaced positions throughout the original frame.
filter.subsample <- function(dataframe, maxelt) dataframe[seq(1,(nrow(dataframe)),max(1,nrow(dataframe)/maxelt)),]
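Or more easily, by drawing maxelt rows at random (a random sample rather than an evenly spaced one; sort() keeps the original row order):
dataframe[sort(sample(nrow(dataframe), min(maxelt, nrow(dataframe)))), ]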
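A minimal sketch of the evenly spaced variant, assuming that rounding the positions to whole row numbers is acceptable. The helper name subsample_even and the example data frame are hypothetical and not part of the original notes.

```r
# Hypothetical helper: keep at most `maxelt` rows at evenly spaced positions.
subsample_even <- function(dataframe, maxelt) {
  n <- nrow(dataframe)
  if (n <= maxelt) return(dataframe)           # nothing to reduce
  # Evenly spaced positions from first to last row, rounded to whole indices
  idx <- unique(round(seq(1, n, length.out = maxelt)))
  dataframe[idx, , drop = FALSE]
}

# Illustrative data only
df <- data.frame(x = 1:1000, y = sqrt(1:1000))
small <- subsample_even(df, 10)
nrow(small)
```

Using length.out instead of a fractional step makes the number of returned rows explicit and always includes the first and last rows.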
for loops are comparatively slow in R. It is usually more efficient to operate on whole vectors or columns at once.
Let's assume a dataframe as follows.
DF <- data.frame(time=jitter(1:400), gauge=randu[,1])
The idea is to build temporary vectors of the time and gauge values shifted by one index, subtract and divide element-wise, and pad the last element with NA.
size <- nrow(DF)
time1 <- c(DF$time[2:size], NA)
gauge1 <- c(DF$gauge[2:size], NA)
DF$rate <- (gauge1 - DF$gauge) / (time1 - DF$time)
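The same rate can probably be written more compactly with diff(), which returns successive differences. The sketch below assumes the DF layout from above and checks the result against an explicit loop. The column name rate_diff and the variable rate_loop are illustrative only.

```r
set.seed(1)  # only to make the jittered times reproducible
DF <- data.frame(time = jitter(1:400), gauge = randu[, 1])

# diff() is one element shorter than its input, so pad the end with NA
DF$rate_diff <- c(diff(DF$gauge) / diff(DF$time), NA)

# Equivalent explicit loop, shown only for comparison
rate_loop <- rep(NA_real_, nrow(DF))
for (i in seq_len(nrow(DF) - 1)) {
  rate_loop[i] <- (DF$gauge[i + 1] - DF$gauge[i]) /
                  (DF$time[i + 1] - DF$time[i])
}

all.equal(DF$rate_diff, rate_loop)
```

Wrapping either version in system.time() on a larger data frame is one way to check how much the vectorised form actually saves.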
Three graphs on one page: two on top and a large one at the bottom.
layout(matrix(c(1,2,3,3), 2, 2, byrow = TRUE))
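As a usage sketch for the layout() call above: each cell of the matrix names the plot that will occupy it, so repeating 3 across the bottom row lets the third plot span the full width. The plots drawn from the built-in randu data set are placeholders chosen for illustration.

```r
# Cells 1 and 2 form the top row; plot 3 fills both cells of the bottom row
cells <- matrix(c(1, 2, 3, 3), nrow = 2, ncol = 2, byrow = TRUE)
nfig <- layout(cells)  # layout() returns the number of figures defined

hist(randu$x, main = "x")                # top left
hist(randu$y, main = "y")                # top right
plot(randu$x, randu$y, main = "x vs y")  # bottom, full width

layout(1)  # back to one plot per page
```

Calling layout.show(nfig) right after layout(cells) draws the outline of the regions, which helps when designing more complex arrangements.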
Seen here.
## add `drop.unused.levels' argument to boxplot.formula
boxplot2 <-
function (formula, data = NULL, ...,
subset, na.action = NULL,
drop.unused.levels = TRUE)
{
if (missing(formula) || (length(formula) != 3))
stop("'formula' missing or incorrect")
m <- match.call(expand.dots = FALSE)
if (is.matrix(eval(m$data, parent.frame())))
m$data <- as.data.frame(data)
m$... <- NULL
m$na.action <- na.action
m$drop.unused.levels <- drop.unused.levels
m[[1]] <- as.name("model.frame")
mf <- eval(m, parent.frame())
response <- attr(attr(mf, "terms"), "response")
boxplot(split(mf[[response]], mf[-response]), ...)
}
library(RSQLite)
# Open the SQLite file through the driver object exported by RSQLite
con <- dbConnect(SQLite(), dbname="db.sq3")
d <- dbGetQuery(con, "select ...")
dbDisconnect(con)