Code related to HW3

Below is some code you might find useful for homework 3.

## defining a matrix:
##
## The 'matrix' command allows you to pass a list of data and format it into
## a matrix.  It will fill the data down rows by default unless you set
## byrow=TRUE.  If you tell it to make a matrix with more entries than the
## the data you provide will fill, it will repeat the data.

## Example 1:
x = matrix( c(1,2,3,4), nrow=2, ncol=2)  # Creates a 2x2 matrix
x

x = matrix(c(1,2,3,4), nrow=4,ncol=2) # Repeats the data to create a 4x2 matrix
x

## Example 2:
x = matrix( c(1,2,3,4), nrow=2, ncol=2, byrow=TRUE)
x

x = matrix(c(1,2,3,4), nrow=4,ncol=2, byrow=TRUE)
x

## Example 3:
x = matrix (c(6,9,10,6,8,3), nrow=3, ncol=2, byrow=TRUE)
x


## Matrix multiplication is performed using the %*% notation.  You can take a
## transpose of a matrix using the function t().  You can find the inverse of
## a matrix using the solve() command.
##
## All of the usual operators will work on a matrix, by on an
## element-by-element basis, NOT via matrix operations.

## Example 4:

x*x  # Note that this shouldn't be possible, since the dimensions don't
     # conform.  This performs element-wise multiplication.

x%*%x   # Here we try to do a matrix multiplication and get an appropriate
      # error message.

x%*%t(x) # By taking a transpose we can obtain a valid matrix multiplication

## Other useful functions include:  rowMeans
##                                  colMeans
##                                  rowSums
##                                  colSums
## Note that the capitalization is important.  These functions give you
## row-by-row and column-by-column means and sums.

## Example 5: center our data x using colMeans and assembling it into a
## matrix of the right size.

(means = matrix(colMeans(x), byrow=TRUE, nrow=3, ncol=2))

(Centered = x - means)

## Example 6: Create the S-matrix:

(S = (1/2)*t(Centered)%*%Centered)

## We can find S^-1 using solve:

(Sinv = solve(S))

## Now calculate the T statistic for the point (0,0)

U0 = matrix( c(0,0), nrow=1)

(U0mean = colMeans(x)-U0)

3*U0mean%*%Sinv%*%t(U0mean)

## This will be distributed as a constant times an f distribution with
## 2 and (3-1) degrees of freedom: (n-1)p / (n-p) * F_(p,n-p)

qf(.95, 2, 2)

(3-1)*2/(3-2) * qf(.95, 2, 2)

## We have that our T^2 value of 112 exceeds 76, so we reject the
## hypothesis that the data is centered.

## Let's look at a different set of data:

x = matrix(c(3,4,2,6,8,2,4,5,1,2,4,7), nrow=4, ncol=3)

## Now let's say we wish to investigate jointly x1-x2 and x1-x3.  Make the
## contrast vector a:

A = matrix( c(1, -1, 0, 1, 0, -1), nrow=2, byrow=T)

x_new = t(A%*%t(x))

## Are they both centered at zero?  Calculate a T^2:
## First get the centered x_new:

Centered = x_new - matrix(colMeans(x_new), byrow=T, nrow=4, ncol=2)

## Now our S matrix and its inverse
S = (1/3)*t(Centered)%*%Centered
Sinv = solve(S)

## Now our mean vector that we want to test
U0 = matrix(c(0,0),nrow=1)

## Finally calculate our T^2 stat
(T2 = 4*(colMeans(x_new) - U0)%*%Sinv%*%t(colMeans(x_new)-U0))

qf(0.95, 2, 2)

((4-1)*2 / (4-2)) * qf(0.95, 2, 2)

## So we accept the null hypothesis that the differences x1-x2 and x1-x3
## both seem to be centered at (0,0)

################# MANOVA code:

# Here we construct some data, 20 cases from 1 MVN distribution and
# 20 from another.  In this case we're introducing independence, but this works
# fine with dependence as well.
x1 = c(rnorm(20, 0, 1), rnorm(20, 3, 1))
x2 = c(rnorm(20, 1, 4) ,rnorm(20, -1, 2))
x3 = c(rnorm(20, 12, 1), rnorm(20, 10, 1))

# Now combine them all into 1 response.
Y = cbind(x1,x2,x3)

# We'll make a set of indicator variables -- zeros and ones -- to distinguish
# the two groups
indic = c(rep(0,20), rep(1,20))

# Now call manova in much the same way as we do e.g. lm
m1 = manova(Y~indic)

# Show a test summary, indicating that (unsurprisingly) our "indic" variable
# is highly significant, indicating that the means in the two groups are
# different.  We're using Wilk's test since that's what's in the book.  There's
# also the Pillai, Roy, and Hotelling-Lawley tests.
summary(m1, test="Wilks")

# Show an AOV breakdown for each individual response.  Note that x2 isn't as
# significant, which shouldn't be surprising since the variances were fairly
# large.
summary.aov(m1)