Decoding Gene Set Variation Analysis

This is highly positive and indicates that the genes in the gene set are positively enriched compared with genes not in the gene set.

Sample 2

Note that genes B, E and H are now all among the lowly ranked genes.

Running sum for genes in gene set

This is what the random walk looks like for genes in the gene set.

Running sum for genes not in gene set

This is what the random walk looks like for genes not in the gene set.

GSVA score for Sample 2

The GSVA score comes out to be -0.71. This is highly negative and indicates that the genes in the gene set are negatively enriched compared with genes not in the gene set.

Sample 3

Note that B is among the higher ranked genes, while E and H are among the lowly ranked genes. Can you guess what the GSVA score would be in such a case? Perhaps it will be close to 0? Let’s see.

Running sum for genes in gene set

This is what the random walk looks like for genes in the gene set.

Running sum for genes not in gene set

This is what the random walk looks like for genes not in the gene set.

GSVA score for Sample 3

The two distributions are intermingled. The GSVA score comes out to be 0.1, which is very close to 0. This means that the genes in the gene set are neither positively nor negatively enriched compared with genes not in the gene set. So, if some genes of the gene set lie in the higher ranks and some lie in the lower ranks, their effects cancel out and the GSVA score comes out close to 0.

Conclusion

In conclusion, GSVA is a key method for quantifying the enrichment of pathways and signatures on a sample-by-sample basis. It rests on a simple intuition: a gene set’s enrichment in a sample depends on where the set’s genes fall when all the genes in that sample are ranked.

References

GSVA literature. More details
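To make the running-sum idea from the samples concrete, here is a minimal Python sketch. It is a simplified illustration, not the GSVA implementation: every step has equal weight, whereas the published method (as far as I understand it) weights steps by rank and starts from kernel-estimated expression statistics, so the numbers will not match the scores quoted in this post. The ten gene names A to J, the three rankings and the function name `gsva_like_score` are all hypothetical; only the gene set B, E, H comes from the walkthrough.

```python
from itertools import accumulate


def gsva_like_score(ranked_genes, gene_set):
    """Simplified, unweighted score for one sample.

    ranked_genes: gene names ordered from highest to lowest rank.
    gene_set: the genes whose enrichment we want to measure.
    Returns (largest positive deviation) - (largest negative deviation)
    of the difference between the two running sums.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_in = sum(in_set)
    n_out = len(ranked_genes) - n_in
    if n_in == 0 or n_out == 0:
        raise ValueError("need genes both inside and outside the gene set")

    # Running sum for genes in the gene set: steps up by 1/n_in at each hit.
    walk_in = accumulate(1 / n_in if hit else 0.0 for hit in in_set)
    # Running sum for genes not in the gene set: steps up by 1/n_out otherwise.
    walk_out = accumulate(0.0 if hit else 1 / n_out for hit in in_set)

    # Difference between the two random walks at every position.
    diff = [a - b for a, b in zip(walk_in, walk_out)]

    positive_dev = max(0.0, max(diff))
    negative_dev = max(0.0, -min(diff))
    return positive_dev - negative_dev


gene_set = {"B", "E", "H"}

# Hypothetical rankings (highest rank first), loosely mirroring the samples.
ranking_1 = list("BEHACDFGIJ")  # B, E, H all near the top
ranking_2 = list("ACDFGIJBEH")  # B, E, H all near the bottom
ranking_3 = list("BACDFGIJEH")  # B near the top, E and H near the bottom

score_1 = gsva_like_score(ranking_1, gene_set)
score_2 = gsva_like_score(ranking_2, gene_set)
score_3 = gsva_like_score(ranking_3, gene_set)

for name, score in [("ranking 1", score_1), ("ranking 2", score_2), ("ranking 3", score_3)]:
    print(f"{name}: {score:+.2f}")
```

In this sketch the first ranking gives a strongly positive score, the second a strongly negative one, and the mixed ranking a score of smaller magnitude, because the positive deviation contributed by B partly offsets the negative deviation contributed by E and H.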
