A cheatsheet for R.
See on CiteULike.
> install.packages("PKGNAME", dependencies=TRUE)
Reduce the number of rows in a data frame to a given length, with a uniform distribution throughout the initial frame.
filter.subsample <- function(dataframe, maxelt) dataframe[seq(1,(nrow(dataframe)),max(1,nrow(dataframe)/maxelt)),]
Or more simply, with a random sample of rows instead of evenly spaced ones,
dataframe[sample(nrow(dataframe), maxelt), ]
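A minimal sketch contrasting the two approaches, evenly spaced rows and randomly sampled rows. The helper names `subsample_even` and `subsample_random` and the test data frame are hypothetical, introduced only for illustration.

```r
# Hypothetical helpers, not part of the original cheatsheet.
subsample_even <- function(df, maxelt) {
  # Evenly spaced row indices from the first to the last row.
  idx <- unique(round(seq(1, nrow(df), length.out = min(maxelt, nrow(df)))))
  df[idx, , drop = FALSE]
}

subsample_random <- function(df, maxelt) {
  # Random rows without replacement, kept in their original order.
  idx <- sort(sample(nrow(df), min(maxelt, nrow(df))))
  df[idx, , drop = FALSE]
}

# Made-up data: 1000 rows reduced to 50.
df <- data.frame(x = 1:1000, y = sqrt(1:1000))
even <- subsample_even(df, 50)
rand <- subsample_random(df, 50)
```

The evenly spaced version always keeps the first and last rows; the random version gives a different subset on each call unless `set.seed()` is used first.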
for loops seem to be very costly and not optimised. It appears to be more efficient to manipulate full datasets at once.
Let's assume a dataframe as follows.
DF <- data.frame(time=jitter(1:400), gauge=randu[,1])
The idea is to have temporary vectors of time and absolute data with their index offset by one, then subtract and divide appropriately, and fill in with NAs.
size = nrow(DF)
time1 = c(DF$time[2:size], NA)
gauge1 = c(DF$gauge[2:size], NA)
DF$rate <- (gauge1 - DF$gauge) / (time1 - DF$time)
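The same offset-by-one calculation can also be sketched with `diff()`, which returns successive differences of a vector. The version below rebuilds the example data frame and checks the vectorised result against an explicit loop; the loop and the name `rate_loop` are there only as a cross-check.

```r
# Same layout as the example data frame (randu ships with base R).
DF <- data.frame(time = jitter(1:400), gauge = randu[, 1])

# diff() returns x[i + 1] - x[i]; pad with NA so the length matches nrow(DF).
DF$rate <- c(diff(DF$gauge) / diff(DF$time), NA)

# Explicit loop, for comparison only.
rate_loop <- rep(NA_real_, nrow(DF))
for (i in seq_len(nrow(DF) - 1)) {
  rate_loop[i] <- (DF$gauge[i + 1] - DF$gauge[i]) / (DF$time[i + 1] - DF$time[i])
}

# Both approaches should agree, including the trailing NA.
stopifnot(isTRUE(all.equal(DF$rate, rate_loop)))
```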
Three graphs per page: two on top and a large one at the bottom.
layout(matrix(c(1,2,3,3), 2, 2, byrow = TRUE))
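A short sketch of how that layout matrix fills up: each plotting call takes the next panel number, and the repeated 3 makes the third plot span the bottom row. The plots of the built-in `randu` data and the temporary PDF file are assumptions chosen for illustration.

```r
# Draw to a throwaway PDF so the sketch also runs without a display.
out <- tempfile(fileext = ".pdf")
pdf(out)

# Panels 1 and 2 share the top row; panel 3 spans the whole bottom row.
# layout() returns the number of panels it defined.
n_panels <- layout(matrix(c(1, 2, 3, 3), 2, 2, byrow = TRUE))

hist(randu$x, main = "x")                     # panel 1, top left
hist(randu$y, main = "y")                     # panel 2, top right
plot(randu$x, randu$y, main = "x against y")  # panel 3, full bottom row

layout(1)  # back to a single panel
dev.off()
```

Calling `layout.show(n_panels)` right after `layout()` outlines the panels, which helps when checking a new matrix.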
Seen here.
## add `drop.unused.levels' argument to boxplot.formula
boxplot2 <- function (formula, data = NULL, ..., subset, na.action = NULL,
                      drop.unused.levels = TRUE)
{
    if (missing(formula) || (length(formula) != 3))
        stop("'formula' missing or incorrect")
    m <- match.call(expand.dots = FALSE)
    if (is.matrix(eval(m$data, parent.frame())))
        m$data <- as.data.frame(data)
    m$... <- NULL
    m$na.action <- na.action
    m$drop.unused.levels <- drop.unused.levels
    m[[1]] <- as.name("model.frame")
    mf <- eval(m, parent.frame())
    response <- attr(attr(mf, "terms"), "response")
    boxplot(split(mf[[response]], mf[-response]), ...)
}
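The wrapper relies on `model.frame()` accepting a `drop.unused.levels` argument. A small sketch of that effect on its own, using the built-in `iris` data as a stand-in (an assumption, not data from the original):

```r
# Subsetting keeps the factor's original levels, including the now-empty one.
sub <- subset(iris, Species != "setosa")
nlevels(sub$Species)

# model.frame() can drop the empty level while building the frame.
mf <- model.frame(Sepal.Length ~ Species, data = sub,
                  drop.unused.levels = TRUE)
nlevels(mf$Species)

# Splitting on the cleaned factor yields one group per remaining level,
# which is what ends up being passed to boxplot().
groups <- split(mf$Sepal.Length, mf$Species)
names(groups)
```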
library(RSQLite)
con <- dbConnect(SQLite(), dbname="db.sq3")
d <- dbGetQuery(con, "select ...")
dbDisconnect(con)
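A self-contained sketch of the same round trip, assuming the DBI and RSQLite packages are installed. It uses an in-memory database so nothing touches the disk; the table name `readings` and its contents are hypothetical.

```r
library(DBI)
library(RSQLite)

# In-memory database, so no file is created.
con <- dbConnect(RSQLite::SQLite(), dbname = ":memory:")

# Hypothetical table, created here only so the query has something to read.
dbWriteTable(con, "readings",
             data.frame(id = 1:3, gauge = c(0.1, 0.5, 0.9)))

# dbGetQuery() runs the statement and returns the result as a data frame.
d <- dbGetQuery(con, "SELECT id, gauge FROM readings WHERE gauge > 0.2")

dbDisconnect(con)
```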