Enrichr, Prerank, GSEA or ssGSEA?

A bioinformatician’s main tool for discovery has often been differential expression analysis, with enrichment analysis as the usual next step to interpret its output. But between Enrichr, Prerank, GSEA, and ssGSEA, which tool should you use? Here is the quick reminder X-plainer. 🧬

The Decision Tree

The key question is: what shape is your data?

Enrichr — you have a gene list, nothing else

Enrichr is for when all you have is a list of genes (it can be small). No values, no conditions, just names. 📋

GENEA, GENEB, GENEC

Under the hood, Enrichr tests each gene set in its databases using a Fisher’s exact test (hypergeometric). It asks: is my list enriched for genes in pathway X more than expected by chance?
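To make the "more than expected by chance" question concrete, here is a minimal sketch of the one-sided hypergeometric test for a single pathway, using only the standard library. The function name and all the numbers are hypothetical, and Enrichr itself also reports adjusted p-values and its own combined score, which this sketch does not reproduce.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_list, n_overlap):
    """P(overlap >= n_overlap) if n_list genes were drawn at random from a
    background of n_background genes, n_pathway of which belong to the pathway."""
    total = comb(n_background, n_list)
    max_overlap = min(n_pathway, n_list)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_list - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, a 100-gene pathway,
# a 50-gene hit list, 5 of which fall in the pathway.
p_value = hypergeom_enrichment_p(20_000, 100, 50, 5)
print(f"p = {p_value:.2e}")
```

With an expected overlap of only 0.25 genes by chance, seeing 5 gives a very small p-value. In a real analysis this test is repeated for every gene set in the library, so the p-values need multiple-testing correction.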

Best for: DE gene lists, hit lists from CRISPR screens, manually curated sets.

👉 Enrichr

Python: gseapy.enrichr

Prerank — you have a ranked gene list

Prerank is when you have a continuous value per gene that you can rank — a fold change, a correlation, a t-statistic, anything. 📊

Gene	Value
GENEA	12
GENEB	8
GENEC	4

Prerank runs GSEA logic (the enrichment score / walking statistic) on your pre-ranked list, without needing raw expression data or phenotype labels. Useful when you already have a score but not the underlying samples.
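The walking statistic can be sketched in a few lines: walk down the ranked list, step up at every gene in the set and down at every gene outside it, and keep the largest deviation from zero. This is a simplified illustration rather than gseapy's implementation; the function name, the genes beyond GENEA to GENEC, and both gene sets are made up for the example, and the permutation-based p-values and normalized scores (NES) are left out.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score over a list already sorted by score
    (highest first). Hits step up in proportion to |score| ** weight,
    misses step down by a constant amount."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        running += abs(s) ** weight / hit_total if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the signed maximum deviation from zero
    return best


# GENEA..GENEC come from the table above; GENED..GENEF are hypothetical.
ranked_genes = ["GENEA", "GENEB", "GENEC", "GENED", "GENEE", "GENEF"]
scores = [12, 8, 4, -3, -7, -10]

es_top = enrichment_score(ranked_genes, scores, {"GENEA", "GENEB"})
es_bottom = enrichment_score(ranked_genes, scores, {"GENEE", "GENEF"})
print(es_top, es_bottom)
```

A set concentrated at the top of the list gets a positive score and a set concentrated at the bottom gets a negative one, which is why the sign of the ranking metric matters when you build the input.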

Best for: correlation with a phenotype, output from another model, single-sample pseudo-bulk scores.

👉 Python: gseapy.prerank

GSEA — you have expression data with two conditions

GSEA works best when you have a matrix of gene expression values across multiple samples with a clear phenotype label (treated vs control, disease vs healthy, etc.). It computes its own gene ranking internally. 🔬

Gene	C1	C2	C3	D1	D2	D3
GENEA	12	7	3	1	1	0
GENEB	8	0	6	8	1	1
GENEC	4	4	3	2	3	4
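As an illustration of what "computes its own gene ranking" can mean, the sketch below ranks the three genes from the table with a signal-to-noise metric, which is the usual default ranking metric in GSEA. It assumes columns C1 to C3 and D1 to D3 are the two phenotype groups, uses the sample standard deviation, and skips the small floor GSEA puts on the standard deviation, so treat the exact values as approximate.

```python
from statistics import mean, stdev

# Expression values from the table above, split by the assumed groups C and D.
expression = {
    "GENEA": {"C": [12, 7, 3], "D": [1, 1, 0]},
    "GENEB": {"C": [8, 0, 6], "D": [8, 1, 1]},
    "GENEC": {"C": [4, 4, 3], "D": [2, 3, 4]},
}


def signal_to_noise(group_a, group_b):
    """Difference of group means divided by the sum of group standard deviations.
    Note: no floor on the standard deviation here, so constant groups would fail."""
    return (mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))


# Rank genes from most C-associated to most D-associated.
ranking = sorted(
    ((gene, signal_to_noise(v["C"], v["D"])) for gene, v in expression.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for gene, score in ranking:
    print(f"{gene}\t{score:.2f}")
```

The resulting ranked list is exactly the kind of input Prerank expects, which is why the two tools share the same walking statistic downstream.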

The key advantage over Enrichr: GSEA doesn’t require you to define a hard cutoff (“top 200 DE genes”). It uses the full ranked list and identifies pathways enriched at the top or bottom. This makes it more sensitive and less arbitrary. ✅

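Best for: bulk RNA-seq, any two-condition comparison with replicates (n ≥ 3 per group recommended; the GSEA documentation suggests phenotype permutation only with about seven or more samples per group, and gene-set permutation below that).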

👉 GSEA software

Python: gseapy.gsea

ssGSEA — you want a per-sample enrichment score

ssGSEA is for when you have many samples with no clear two-group contrast — or when you want a continuous enrichment score per sample rather than a comparison between groups. 🗂️

Gene	A	B	C	D	E	F
GENEA	12	7	3	1	1	0
GENEB	8	0	6	8	1	1
GENEC	4	4	3	2	3	4

Each sample gets its own enrichment score for each pathway, independently. The output is a sample × pathway matrix. Great for downstream analysis — clustering, survival analysis, correlating pathway activity with other variables.
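A rough sketch of how a per-sample score can be computed: rank the genes within each sample, then sum the gap between the rank-weighted cumulative distribution of in-set genes and the cumulative distribution of the other genes. This is a simplified reading of the ssGSEA idea, not gseapy's code; the gene names, samples, pathway, and the exponent of 0.25 are assumptions for the example, and ties are broken arbitrarily.

```python
def ssgsea_score(sample_expr, gene_set, alpha=0.25):
    """Enrichment score for one sample. sample_expr maps gene -> expression."""
    # Rank within the sample: 1 = lowest expression, N = highest.
    ascending = sorted(sample_expr, key=sample_expr.get)
    rank = {gene: i + 1 for i, gene in enumerate(ascending)}
    ordered = ascending[::-1]  # walk from highest to lowest expression

    in_set = [gene for gene in ordered if gene in gene_set]
    n_out = len(ordered) - len(in_set)
    if not in_set or n_out == 0:
        raise ValueError("gene set must contain some, but not all, genes")
    hit_total = sum(rank[gene] ** alpha for gene in in_set)

    p_in = p_out = score = 0.0
    for gene in ordered:
        if gene in gene_set:
            p_in += rank[gene] ** alpha / hit_total
        else:
            p_out += 1.0 / n_out
        score += p_in - p_out  # integrate the difference instead of taking the max
    return score


# Hypothetical data: two samples, six genes, one pathway.
samples = {
    "S1": {"G1": 10, "G2": 9, "G3": 1, "G4": 2, "G5": 3, "G6": 4},
    "S2": {"G1": 1, "G2": 2, "G3": 10, "G4": 9, "G5": 5, "G6": 4},
}
gene_sets = {"PATHWAY_X": {"G1", "G2"}}

# Sample x pathway matrix, stored as nested dicts.
score_matrix = {
    sample: {name: ssgsea_score(expr, genes) for name, genes in gene_sets.items()}
    for sample, expr in samples.items()
}
print(score_matrix)
```

Because each sample is scored on its own ranks, no group labels are needed, and adding or removing samples does not change the scores of the others.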

Best for: large cohorts (TCGA, GTEx), single-cell pseudo-bulk, any analysis where you want pathway activity as a continuous feature.

👉 Python: gseapy.ssgsea

Quick summary table

Tool	Input	Statistics	Best for
Enrichr	Gene list only	Fisher / hypergeometric	Small lists, no values
Prerank	Genes + score	GSEA walking statistic	Pre-computed rankings
GSEA	Expression matrix + 2 conditions	GSEA walking statistic	Bulk RNA-seq DE
ssGSEA	Expression matrix, no labels	Per-sample enrichment	Large cohorts, per-sample scores
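The table can also be read as a tiny decision function. The sketch below is just one way to encode it; the function and argument names are made up for illustration.

```python
def pick_tool(has_scores, has_expression_matrix, has_two_group_labels,
              want_per_sample_scores=False):
    """Map the shape of the data to one of the four tools in the table."""
    if has_expression_matrix:
        # A matrix without a two-group contrast, or a wish for per-sample
        # scores, points to ssGSEA; otherwise GSEA can rank the genes itself.
        if want_per_sample_scores or not has_two_group_labels:
            return "ssGSEA"
        return "GSEA"
    if has_scores:
        return "Prerank"
    return "Enrichr"


print(pick_tool(has_scores=False, has_expression_matrix=False,
                has_two_group_labels=False))
```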

For most single-cell work: compute pseudo-bulk, run GSEA or Prerank per cell type. For single-cell pathway scoring directly, decoupleR is worth a look. 🔍
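For reference, pseudo-bulk usually means summing raw counts over all cells of one cell type within one sample. Here is a minimal pure-Python sketch with made-up cells, donors, and counts; real pipelines would do the same aggregation on a sparse count matrix.

```python
from collections import defaultdict

# Hypothetical per-cell raw counts: (sample, cell type, {gene: count}).
cells = [
    ("donor1", "T cell", {"GENEA": 3, "GENEB": 0}),
    ("donor1", "T cell", {"GENEA": 1, "GENEB": 2}),
    ("donor1", "B cell", {"GENEA": 0, "GENEB": 5}),
    ("donor2", "T cell", {"GENEA": 4, "GENEB": 1}),
]


def pseudo_bulk(cells):
    """Sum counts per (sample, cell type) pair."""
    totals = defaultdict(lambda: defaultdict(int))
    for sample, cell_type, counts in cells:
        for gene, count in counts.items():
            totals[(sample, cell_type)][gene] += count
    return {key: dict(genes) for key, genes in totals.items()}


bulk = pseudo_bulk(cells)
print(bulk[("donor1", "T cell")])
```

Each (sample, cell type) profile then behaves like one bulk sample, so the per-cell-type comparison across donors can go through GSEA or Prerank as described above.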

Jérémie Kalfon
