23 Feb 2017

Consider a repeated measures model in which each subject has two measurements, with a normally distributed random effect, a normally distributed error term, a continuous response, and covariates. This is a multilevel model because of the nested structure of the data, and it is also non-linear in one of its parameters. In this post I simulate some data under this model and use Bayesian computation to estimate the parameters with the brms package, an interface for fitting Bayesian generalized (non-)linear multilevel models using Stan.

**Read more »**
07 Feb 2017

When analyzing large amounts of genetic and genomic data, the first line of analysis is usually some sort of univariate test. That is, conduct a statistical test for each SNP, CpG site, or gene and then correct for multiple testing. The limma package on Bioconductor is a popular tool for computing *moderated* t-statistics using a combination of the `limma::lmFit` and `limma::eBayes` functions. In this post, I show how to calculate the *ordinary* t-statistics from `limma` output.
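
As a minimal sketch of the idea (not necessarily the code used in the full post), the ordinary t-statistic can be recovered from a fitted limma object by dividing each coefficient by its unscaled standard deviation times the residual standard deviation. The helper name `ordinary_t` and the simulated data are assumptions made for illustration; the usage part only runs if limma is installed.

```r
# Ordinary (unmoderated) t-statistics from a limma-style fit object.
# `ordinary_t` is a hypothetical helper name, not part of limma.
ordinary_t <- function(fit) {
  # coefficient / (unscaled std. dev. * per-feature residual std. dev.)
  fit$coefficients / (fit$stdev.unscaled * fit$sigma)
}

# Usage sketch on simulated data (assumed, not from the post).
if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(1)
  expr   <- matrix(rnorm(100 * 6), nrow = 100)       # 100 features x 6 samples
  group  <- factor(rep(c("ctrl", "trt"), each = 3))
  design <- model.matrix(~ group)

  fit <- limma::eBayes(limma::lmFit(expr, design))
  print(head(ordinary_t(fit)))  # ordinary t-statistics
  print(head(fit$t))            # moderated t-statistics, for comparison
}
```

The moderated statistics in `fit$t` replace the per-feature `fit$sigma` with a shrunken estimate, which is why the two sets of values differ.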

**Read more »**
25 Feb 2016

When performing Student's t-test to compare the difference in means between two groups, it is a useful exercise to determine how unequal sample sizes in the comparison groups affect power. Designs with large imbalances generally lack adequate statistical power to detect even large effect sizes associated with a factor, leading to a high Type II error rate, as shown in the figure below:

**Read more »**
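
For the t-test excerpt above, a rough way to see the effect of imbalance is to compute power analytically from the noncentral t distribution. This is a sketch assuming a two-sided, equal-variance two-sample t-test; the helper name `power_unequal_n`, the total sample size of 100, and the effect size of 0.8 are illustrative assumptions, not values from the post.

```r
# Power of a two-sided, equal-variance two-sample t-test with group sizes n1 and n2.
# `d` is the standardized mean difference (difference in means / common sd).
# `power_unequal_n` is a hypothetical helper name.
power_unequal_n <- function(n1, n2, d, alpha = 0.05) {
  df   <- n1 + n2 - 2
  ncp  <- d / sqrt(1 / n1 + 1 / n2)   # noncentrality parameter
  crit <- qt(1 - alpha / 2, df)
  # probability of rejecting in either tail under the alternative
  pt(crit, df, ncp = ncp, lower.tail = FALSE) + pt(-crit, df, ncp = ncp)
}

# Fixed total sample size, increasingly unbalanced allocation
total <- 100
n1 <- c(50, 30, 10, 5)
print(data.frame(n1 = n1, n2 = total - n1,
                 power = power_unequal_n(n1, total - n1, d = 0.8)))
```

With the total held fixed, power falls as the allocation moves away from 50/50 because the term `1 / n1 + 1 / n2` grows.
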
08 Feb 2016

In this post I show how we can use math expressions to label the panels in facets to produce the following plot:

**Read more »**
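
For the facet-labelling excerpt above, one common approach is sketched here under the assumption that the plot is built with ggplot2 (the excerpt does not name the package). Facet levels are written as plotmath strings and rendered with `label_parsed`; the data and the labels are made up for illustration.

```r
set.seed(1)
# Facet levels written as plotmath expressions (hypothetical example labels)
df <- data.frame(
  x = rnorm(90),
  y = rnorm(90),
  panel = factor(rep(c("alpha", "beta^2", "sigma[1]"), each = 30))
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x, y)) +
    geom_point() +
    # label_parsed turns each level string into a math expression
    facet_wrap(~ panel, labeller = label_parsed)
  print(p)
}
```
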
10 Jun 2015

In every statistical analysis, the first thing one should do is try to visualise the data before any modeling. In microarray studies, a common visualisation is a heatmap of gene expression data. In this post I simulate some gene expression data and visualise it using the `pheatmap` function from the pheatmap package in `R`. You will also need the `mvrnorm` function from the MASS library to simulate from a multivariate normal distribution, and the `brewer.pal` function from the RColorBrewer library for easier customization of colors.
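
A minimal sketch of this workflow might look as follows. The dimensions, the two-block correlation structure, and the "RdBu" palette are assumptions chosen for illustration rather than the settings used in the full post; the plotting step only runs if pheatmap and RColorBrewer are installed.

```r
library(MASS)  # provides mvrnorm

set.seed(1)
n_samples <- 20
n_genes   <- 50

# Assumed covariance: two blocks of 25 genes with within-block correlation 0.8
Sigma <- diag(n_genes)
Sigma[1:25, 1:25]   <- 0.8
Sigma[26:50, 26:50] <- 0.8
diag(Sigma) <- 1

# mvrnorm returns one sample per row; transpose to genes x samples
expr <- t(mvrnorm(n_samples, mu = rep(0, n_genes), Sigma = Sigma))
rownames(expr) <- paste0("gene", seq_len(n_genes))
colnames(expr) <- paste0("sample", seq_len(n_samples))

if (requireNamespace("pheatmap", quietly = TRUE) &&
    requireNamespace("RColorBrewer", quietly = TRUE)) {
  # Interpolate an 11-colour diverging palette to 100 shades
  cols <- colorRampPalette(rev(RColorBrewer::brewer.pal(11, "RdBu")))(100)
  pheatmap::pheatmap(expr, color = cols)
}
```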

**Read more »**