NOTEBOOK 02

Diversity Analysis: Apheresis vs Product

TCR Repertoire Analysis — Notebook 02: Comparative diversity metrics, statistical testing, and clinical response stratification
Joshua Luthy  |  R + immunarch + edgeR  |  Synthetic Data  |  2026
Contents
  1. Analysis Overview
  2. Shannon Entropy Comparison
  3. Clonality Index
  4. Repertoire Richness & D50
  5. Diversity by Clinical Response
  6. Summary

01 Analysis Overview

This notebook quantifies the change in TCR repertoire diversity from Apheresis (starting PBMC material) to the manufactured drug Product. We compare multiple diversity indices and stratify by clinical response (CR, PR, PD).

The central hypothesis is that manufacturing selects and expands a subset of clonotypes, resulting in reduced diversity in the Product relative to the Apheresis starting material.

library(tidyverse)  # read_csv(), dplyr verbs, pivot_wider()

# Load diversity metrics (precomputed)
diversity_df <- read_csv("data/processed/diversity_metrics.csv")

# Paired Wilcoxon test: Apheresis vs Product
# Sort by patient_id so the two vectors are matched patient by patient
aph_shannon <- diversity_df %>%
  filter(sample_type == "Apheresis") %>%
  arrange(patient_id) %>%
  pull(shannon_entropy)
prod_shannon <- diversity_df %>%
  filter(sample_type == "Product") %>%
  arrange(patient_id) %>%
  pull(shannon_entropy)

wilcox.test(aph_shannon, prod_shannon, paired = TRUE)
	Wilcoxon signed rank exact test

data:  aph_shannon and prod_shannon
V = 21, p-value = 0.03125
alternative hypothesis: true location shift is not equal to 0
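A note on reading this output: with six matched pairs, V = 21 = 1 + 2 + ... + 6 is the value obtained when every patient's difference has the same sign, and 2 / 2^6 = 0.03125 is the smallest two-sided p-value an exact signed-rank test can return at this sample size. The sketch below reproduces both numbers using made-up values (`aph_toy` and `prod_toy` are hypothetical, not the study data).

```r
# Toy values (hypothetical, not study data): six matched pairs in which
# every Apheresis value exceeds its paired Product value, with no tied differences.
aph_toy  <- c(7.9, 7.6, 8.1, 7.4, 7.8, 7.7)
prod_toy <- c(4.0, 4.9, 3.6, 5.2, 4.4, 4.1)

res <- wilcox.test(aph_toy, prod_toy, paired = TRUE)

n     <- length(aph_toy)
max_v <- n * (n + 1) / 2   # sum of ranks 1..n, reached when all differences are positive
min_p <- 2 / 2^n           # smallest possible two-sided exact p-value for n pairs

res$statistic  # V = 21
res$p.value    # 0.03125
```

In other words, p = 0.031 here says that all six patients moved in the same direction; it cannot get smaller without more patients.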

02 Shannon Entropy Comparison

Figure 1. Shannon entropy (H) for Apheresis vs Product samples. Each point represents one patient. The manufacturing process significantly reduces repertoire diversity (paired Wilcoxon, p = 0.031).
Mean H (Apheresis): 7.76
Mean H (Product): 4.28
Mean Reduction: 45%
p-value (Wilcoxon): 0.031

03 Clonality Index

Clonality is defined as 1 - (H / log(N)), where H is Shannon entropy and N is the number of unique clonotypes. A clonality of 0 indicates a perfectly even repertoire; a value approaching 1 indicates dominance by a single clone.
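As a minimal sketch of this definition, the function below computes clonality from a vector of clonotype read counts. The function name `clonality` and the use of natural logarithms are assumptions made for illustration; whichever base is used, H and log(N) must share it.

```r
# Clonality = 1 - H / log(N), computed from a vector of clonotype read counts.
# Natural log is an assumption; H and log(N) must use the same base.
clonality <- function(counts) {
  counts <- counts[counts > 0]     # drop clonotypes with no reads
  n <- length(counts)              # N: number of unique clonotypes
  if (n < 2) return(NA_real_)      # log(1) = 0, so clonality is undefined
  p <- counts / sum(counts)        # clone frequencies
  h <- -sum(p * log(p))            # Shannon entropy H
  1 - h / log(n)
}

even_rep   <- clonality(rep(10, 1000))        # perfectly even repertoire: ~0
skewed_rep <- clonality(c(9990, rep(1, 10)))  # one dominant clone: close to 1
```

The two calls illustrate the boundary cases described above: equal counts give a clonality of 0 (up to floating-point error), and a single dominant clone pushes the value toward 1.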

Figure 2. Clonality index for Apheresis vs Product samples. Product samples show significantly higher clonality, reflecting oligoclonal expansion during manufacturing.
Finding

Product samples exhibit 2–3× higher clonality than matched Apheresis samples (mean 0.431 vs 0.171, a ratio of about 2.5). This supports the hypothesis that manufacturing drives substantial clonal selection and expansion.

04 Repertoire Richness & D50

Richness is the total number of unique clonotypes. D50 is the smallest number of clonotypes, ranked by abundance, whose reads together account for at least 50% of all reads. A low D50 indicates a highly skewed repertoire.
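The D50 definition can be sketched directly from clonotype read counts. The helper `d50` below is a hypothetical illustration, not necessarily how the precomputed metrics were generated.

```r
# D50: smallest number of top-ranked clonotypes whose reads cover
# at least `fraction` (default 50%) of all reads.
d50 <- function(counts, fraction = 0.5) {
  counts   <- sort(counts[counts > 0], decreasing = TRUE)  # rank by abundance
  cum_frac <- cumsum(counts) / sum(counts)                 # cumulative read share
  which(cum_frac >= fraction)[1]                           # first rank reaching the threshold
}

d50(c(60, 20, 10, 10))  # 1: the top clone alone holds 60% of reads
d50(c(40, 30, 20, 10))  # 2: the top two clones hold 70% of reads
d50(rep(1, 100))        # 50: an even repertoire needs half of its clones
```

Richness, by contrast, is simply `sum(counts > 0)`, so the two metrics can diverge: a repertoire can be rich in clonotypes yet have a very low D50 if a few clones dominate.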

# Compute fold-change in richness
richness_fc <- diversity_df %>%
  select(patient_id, sample_type, n_clonotypes) %>%
  pivot_wider(names_from = sample_type, values_from = n_clonotypes) %>%
  mutate(fold_change = Apheresis / Product)

richness_fc
Patient   Apheresis Clonotypes   Product Clonotypes   Fold Change   Apheresis D50   Product D50
PT-001    13,238                 2,029                6.5×          1,816           3
PT-002    14,132                 940                  15.0×         2,046           1
PT-003    11,658                 1,123                10.4×         1,379           2
PT-004    12,786                 2,474                5.2×          1,639           5
PT-005    9,771                  1,913                5.1×          858             5
PT-006    8,676                  2,853                3.0×          614             11
Finding

Repertoire richness drops 3–15× from Apheresis to Product, and D50 collapses from 614–2,046 in Apheresis to 1–11 in Product, indicating the Product is dominated by very few expanded clonotypes.

05 Diversity by Clinical Response

We stratify the diversity metrics by clinical response to explore whether the degree of clonal focusing in the Product is associated with treatment outcome.

Figure 3. Product Shannon entropy stratified by clinical response. CR patients trend toward lower Product diversity (more focused), while PD patients retain higher Product diversity.
Finding

Complete responders (CR) show the most focused Product repertoires (lowest Shannon H, highest clonality), suggesting that effective clonal selection during manufacturing may be associated with clinical benefit. Progressive disease patients retain relatively higher Product diversity, potentially reflecting insufficient clonal expansion of therapeutic clones.
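One way such a stratified comparison might be set up is sketched below in base R. The data frame `toy`, its `response` column, and all values in it are invented for illustration and are not the study results; the real column name in `diversity_df` may differ.

```r
# Toy data (hypothetical values and column names, not study results)
toy <- data.frame(
  patient_id      = sprintf("PT-%03d", 1:6),
  response        = c("CR", "CR", "PR", "PR", "PD", "PD"),
  shannon_entropy = c(3.1, 3.5, 4.2, 4.4, 5.0, 5.4)
)

# Order the groups from best to worst response for tables and plots
toy$response <- factor(toy$response, levels = c("CR", "PR", "PD"))

# Per-group sample size and median Product Shannon entropy
group_n      <- table(toy$response)
group_median <- tapply(toy$shannon_entropy, toy$response, median)

# Rank-based test across the three response groups
kw <- kruskal.test(shannon_entropy ~ response, data = toy)
```

With only a handful of patients per group, a test like this has very little power, so reporting group sizes and medians alongside any p-value helps keep a trend from being read as a confirmed association.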

06 Summary

Key Findings

1. Manufacturing reduces mean Shannon entropy by about 45% (7.76 to 4.28; paired Wilcoxon, p = 0.031).

2. Clonality increases 2–3× from Apheresis to Product.

3. D50 collapses to 1–11 in Product (single digits in five of six patients), indicating extreme oligoclonality.

4. CR patients show the most focused Product repertoires, suggesting a potential link between clonal selection efficiency and clinical response.