geneSetTest              package:limma              R Documentation

_G_e_n_e _S_e_t _T_e_s_t

_D_e_s_c_r_i_p_t_i_o_n:

     Test whether a given subset of genes tends to be highly ranked on
     a given statistic.

_U_s_a_g_e:

     geneSetTest(selected,statistics,alternative="two.sided",nsim=10000,ranks.only=FALSE)

_A_r_g_u_m_e_n_t_s:

selected: vector specifying the elements of 'statistics' in the test
          group.  This can be a vector of indices, or a logical vector
          of the same length as 'statistics', or any vector such that
          'statistics[selected]' contains the statistic values for the
          selected group.

statistics: numeric vector giving the values of the test statistic for
          every gene or probe in the reference set, usually every probe
          on the microarray.

alternative: character string specifying the alternative hypothesis,
          must be one of '"two.sided"' (default), '"greater"' or
          '"less"'.  You can specify just the initial letter.

    nsim: number of random samples to take in computing the p-value.
          Not used if 'ranks.only=TRUE'.

ranks.only: logical.  If 'TRUE', the values of 'statistics' are used
          only to rank the genes.  If 'FALSE' (default), the statistics
          are averaged over the selected set, which assumes that
          averaging them makes sense.

_D_e_t_a_i_l_s:

     This function computes a p-value to test the hypothesis that the
     selected genes tend to be more highly ranked on the given
     statistic. If it makes sense to average values of the statistic,
     which would be so for example if the statistic was a t-statistic,
     then a permutation test is conducted. In that case the function
     returns the proportion of 'nsim' randomly selected groups from the
     set of all statistics whose mean statistic is equal to or more
     extreme than that of the test group.
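
     As an illustration only, the permutation test described above
     might be sketched in base R as follows. The helper
     'permSetPValue' is hypothetical and not part of limma. Drawing
     random groups of the same size as the test group, and handling
     only the one-sided alternatives, are assumptions made for this
     sketch.

```r
# Hypothetical helper, not part of limma: one-sided permutation p-value
# for the mean statistic of a selected gene set.
permSetPValue <- function(selected, statistics,
                          alternative = c("greater", "less"), nsim = 10000) {
  alternative <- match.arg(alternative)
  # Flip the sign for "less" so that larger always means more extreme.
  sgn <- if (alternative == "greater") 1 else -1
  setStats <- statistics[selected]
  observed <- sgn * mean(setStats)
  # Assumption: random groups have the same size as the test group.
  simulated <- replicate(nsim,
    sgn * mean(sample(statistics, length(setStats))))
  # Proportion of random groups equal to or more extreme than observed.
  mean(simulated >= observed)
}

set.seed(1)
stat <- -9:9
pLess <- permSetPValue(c(2, 4, 5), stat, alternative = "less", nsim = 1000)
pGreater <- permSetPValue(c(2, 4, 5), stat, alternative = "greater", nsim = 1000)
```

     Here the selected statistics are all strongly negative, so
     'pLess' should come out small and 'pGreater' close to 1.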

     If it doesn't make sense to average the values of the statistic
     for any reason, then only the ranks of the statistics are used and
     a Wilcoxon two-sample test, also known as a Mann-Whitney test, is
     performed.
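
     The rank-based test can be illustrated directly with
     'wilcox.test' from the stats package. This is a sketch of the
     idea, assuming the selected statistics are compared against all
     remaining statistics; it is not necessarily how 'geneSetTest'
     calls 'wilcox.test' internally.

```r
stat <- -9:9
sel <- c(2, 4, 5)

# Split the statistics into the selected set and the remaining genes.
# Negative indexing works here because 'sel' is a vector of indices.
inSet <- stat[sel]
outSet <- stat[-sel]

# One-sided Wilcoxon (Mann-Whitney) test: are the selected genes ranked
# lower than the rest?
pRank <- wilcox.test(inSet, outSet, alternative = "less")$p.value
```

     Only the ordering of the values matters here, so any monotone
     increasing transformation of 'stat' leaves 'pRank' unchanged.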

     This is essentially a streamlined approach to Gene Set Enrichment
     Analysis introduced by Mootha et al. (2003).

     Usually, 'statistics' is intended to hold t-like statistics,
     meaning that the genewise null hypotheses would be rejected for
     large positive or large negative values. Then
     'alternative="greater"' can be used to test whether genes in the
     set tend to be up-regulated, 'alternative="less"' can be used to
     test whether the gene set is down-regulated, while
     'alternative="two.sided"' tests whether the gene set holds highly
     ranked genes without regard to direction of change. Important
     note: if 'statistics' is an F-like statistic for which only large
     values are relevant for rejecting the null hypothesis, then you
     must use 'alternative="greater"' to get meaningful results.
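
     As a hedged illustration of the note on F-like statistics, the
     call below uses made-up toy values (not real data) and assumes
     the limma package is installed. Only large values of 'Fstat'
     count as evidence, so 'alternative="greater"' is used.

```r
library(limma)

# Made-up F-like statistics for ten genes: only large values are
# evidence against the null hypothesis.
Fstat <- c(0.2, 9.1, 0.5, 7.4, 6.8, 1.1, 0.9, 0.3, 1.4, 0.7)
sel <- c(2, 4, 5)

# Rank-based test of whether the selected genes have large F-statistics.
pF <- geneSetTest(sel, Fstat, alternative = "greater", ranks.only = TRUE)
```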

_V_a_l_u_e:

     Numeric value giving the estimated p-value.

_A_u_t_h_o_r(_s):

     Gordon Smyth

_R_e_f_e_r_e_n_c_e_s:

     Mootha, V. K., Lindgren, C. M., Eriksson, K. F., Subramanian, A.,
     Sihag, S., Lehar, J., Puigserver, P., Carlsson, E., Ridderstrale,
     M., Laurila, E., Houstis, N., Daly, M. J., Patterson, N., Mesirov,
     J. P., Golub, T. R., Tamayo, P., Spiegelman, B., Lander, E. S.,
     Hirschhorn, J. N., Altshuler, D., Groop, L. C. (2003). 
     PGC-1alpha-responsive genes involved in oxidative phosphorylation
     are coordinately downregulated in human diabetes. _Nature
     Genetics_ 34, 267-273.

_S_e_e _A_l_s_o:

     'wilcox.test'

_E_x_a_m_p_l_e_s:

     sel <- c(2,4,5)
     stat <- -9:9
     geneSetTest(sel,stat,nsim=100)
     geneSetTest(sel,stat,ranks.only=TRUE)

