This notebook quantifies the change in TCR repertoire diversity from Apheresis (starting PBMC material) to the manufactured drug Product. We compare multiple diversity indices and stratify by clinical response (CR, PR, PD).
The central hypothesis is that manufacturing selects and expands a subset of clonotypes, resulting in reduced diversity in the Product relative to the Apheresis starting material.
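# Load the tidyverse (readr for read_csv; dplyr and tidyr for the pipelines below)
library(tidyverse)
# Load diversity metrics (precomputed)
diversity_df <- read_csv("data/processed/diversity_metrics.csv")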
# Paired Wilcoxon signed-rank test: Apheresis vs Product
# Pivot to one row per patient so the two vectors are matched by patient_id
# rather than by whatever row order the CSV happens to have
shannon_wide <- diversity_df %>%
  select(patient_id, sample_type, shannon_entropy) %>%
  pivot_wider(names_from = sample_type, values_from = shannon_entropy)
wilcox.test(shannon_wide$Apheresis, shannon_wide$Product, paired = TRUE)
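With a small cohort it helps to know what the exact signed-rank test is able to report. The sketch below uses made-up placeholder values (not study data) for six hypothetical pairs, matching the six patients tabulated later in this notebook, in which every Product value is lower than its Apheresis partner. The names `aph_demo`, `prod_demo`, `demo_test` and `min_p` are illustrative only.

```r
# Placeholder Shannon entropies for six hypothetical patients (NOT study data).
# Every Product value is lower than its paired Apheresis value.
aph_demo  <- c(9.1, 9.3, 8.9, 9.0, 8.7, 8.5)
prod_demo <- c(4.2, 3.9, 4.4, 4.8, 4.6, 4.9)

# Exact two-sided paired Wilcoxon signed-rank test
# (R uses the exact distribution for small samples without ties)
demo_test <- wilcox.test(aph_demo, prod_demo, paired = TRUE)

# With n untied, non-zero differences that all share one sign, the exact
# two-sided p-value is 2 / 2^n, the smallest value the test can return.
min_p <- 2 / 2^length(aph_demo)

demo_test$p.value  # 0.03125
min_p              # 0.03125
```

For six pairs, a unanimous decrease therefore gives p = 0.03125, so the p-value alone says little about the size of the effect; the per-patient changes carry that information.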
Clonality is defined as 1 - (H / log(N)), where H is Shannon entropy and N is the number
of unique clonotypes; H and log(N) must use the same logarithm base. The ratio H / log(N) is Pielou's evenness, so a clonality of 0 indicates a perfectly even repertoire and a value approaching 1
indicates dominance by a single clone.
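The definition above can be sketched as a small base R function. This is an illustration under stated assumptions, not the code that produced the precomputed metrics: the function name `clonality` and its input (a vector of per-clonotype read counts) are hypothetical, and natural logarithms are used for both H and log(N).

```r
# Clonality = 1 - H / log(N), computed from per-clonotype read counts.
# Hypothetical helper; the input format is an assumption.
clonality <- function(counts) {
  counts <- counts[counts > 0]          # clonotypes with zero reads do not count
  n <- length(counts)                   # N: number of unique clonotypes
  if (n < 2) return(NA_real_)           # log(1) = 0, so the ratio is undefined
  p <- counts / sum(counts)             # clonotype frequencies
  h <- -sum(p * log(p))                 # Shannon entropy (natural log)
  1 - h / log(n)                        # 1 minus Pielou's evenness
}

even_rep   <- clonality(rep(10, 100))       # perfectly even repertoire
skewed_rep <- clonality(c(9997, 1, 1, 1))   # one clone holds almost all reads
```

Here `even_rep` is 0 up to floating-point error, while `skewed_rep` is close to 1.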
Product samples exhibit 2–3× higher clonality than matched Apheresis samples (mean 0.431 vs 0.171, a ratio of about 2.5). This is consistent with manufacturing driving substantial clonal selection and expansion.
Richness is the total number of unique clonotypes. D50 is the minimum number of top-ranked clonotypes that together account for 50% of all reads; a low D50 indicates a highly skewed repertoire.
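Both metrics can be sketched from a vector of per-clonotype read counts. The helper names `richness` and `d50` and the input format are assumptions made for illustration; the precomputed metrics loaded earlier may have been derived differently.

```r
# Richness: number of clonotypes observed at least once.
richness <- function(counts) sum(counts > 0)

# D50: smallest number of top-ranked clonotypes whose reads sum to at least
# `fraction` (default 50%) of all reads.
d50 <- function(counts, fraction = 0.5) {
  sorted <- sort(counts[counts > 0], decreasing = TRUE)  # largest clones first
  which(cumsum(sorted) >= fraction * sum(sorted))[1]     # first rank reaching the target
}

d50_skewed <- d50(c(50, 30, 20))   # top clone alone holds 50% of reads: 1
d50_even   <- d50(rep(1, 10))      # ten equal clones: 5
n_clones   <- richness(c(50, 30, 20, 0))  # zero-count entries are ignored: 3
```

Comparing the running total against `fraction * sum(sorted)` rather than dividing first keeps the boundary case (exactly 50%) free of rounding surprises for integer counts.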
# Compute fold-change in richness (Apheresis / Product, so values > 1 mean clonotypes were lost)
richness_fc <- diversity_df %>%
  select(patient_id, sample_type, n_clonotypes) %>%
  pivot_wider(names_from = sample_type, values_from = n_clonotypes) %>%
  mutate(fold_change = Apheresis / Product)
richness_fc
| Patient | Apheresis Clonotypes | Product Clonotypes | Fold Reduction (Apheresis / Product) | Apheresis D50 | Product D50 |
|---|---|---|---|---|---|
| PT-001 | 13,238 | 2,029 | 6.5× | 1,816 | 3 |
| PT-002 | 14,132 | 940 | 15.0× | 2,046 | 1 |
| PT-003 | 11,658 | 1,123 | 10.4× | 1,379 | 2 |
| PT-004 | 12,786 | 2,474 | 5.2× | 1,639 | 5 |
| PT-005 | 9,771 | 1,913 | 5.1× | 858 | 5 |
| PT-006 | 8,676 | 2,853 | 3.0× | 614 | 11 |
Repertoire richness drops 3.0–15.0× from Apheresis to Product, and D50 collapses from 614–2,046 clonotypes to 11 or fewer (single digits in five of six patients), indicating the Product is dominated by very few expanded clonotypes.
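As an arithmetic check, the fold reductions and the ranges can be recomputed directly from the values in the table. The data frame `tab` and its column names are introduced here only for illustration; the numbers are copied from the table above.

```r
# Values transcribed from the table above
tab <- data.frame(
  patient_id    = sprintf("PT-%03d", 1:6),
  apheresis     = c(13238, 14132, 11658, 12786, 9771, 8676),
  product       = c(2029, 940, 1123, 2474, 1913, 2853),
  apheresis_d50 = c(1816, 2046, 1379, 1639, 858, 614),
  product_d50   = c(3, 1, 2, 5, 5, 11)
)

# Fold reduction in richness, rounded to one decimal as in the table
tab$fold_reduction <- round(tab$apheresis / tab$product, 1)

fold_range        <- range(tab$fold_reduction)  # 3.0 15.0
product_d50_range <- range(tab$product_d50)     # 1 11
tab
```

The smallest fold reduction (3.0×) and the largest Product D50 (11) both belong to PT-006.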
We stratify the diversity metrics by clinical response to explore whether the degree of clonal focusing in the Product is associated with treatment outcome.
Complete responders (CR) show the most focused Product repertoires (lowest Shannon entropy H, highest clonality), suggesting that effective clonal selection during manufacturing may be associated with clinical benefit. Patients with progressive disease (PD) retain relatively higher Product diversity, potentially reflecting insufficient expansion of therapeutic clones.
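The stratification described above can be sketched in base R as a per-group median. This is a sketch under assumptions, not the notebook's actual analysis: the column names `response` and `clonality` are hypothetical (only `patient_id`, `sample_type` and `shannon_entropy` appear in the code above), and the `toy` data frame holds placeholder values chosen purely to exercise the function.

```r
# Median Shannon entropy and clonality per response group and sample type.
# `response` and `clonality` are assumed column names.
summarise_by_response <- function(df) {
  aggregate(cbind(shannon_entropy, clonality) ~ response + sample_type,
            data = df, FUN = median)
}

# Placeholder values (NOT study data): two hypothetical patients per group
toy <- data.frame(
  patient_id      = rep(paste0("X", 1:6), each = 2),
  response        = rep(c("CR", "CR", "PR", "PR", "PD", "PD"), each = 2),
  sample_type     = rep(c("Apheresis", "Product"), times = 6),
  shannon_entropy = c(9.0, 4.0, 9.4, 4.4, 9.1, 5.0, 8.9, 5.2, 9.3, 6.0, 8.7, 6.4),
  clonality       = c(0.15, 0.50, 0.17, 0.46, 0.16, 0.42,
                      0.18, 0.40, 0.15, 0.33, 0.19, 0.31)
)

by_response <- summarise_by_response(toy)
by_response
```

Medians are used because each response group contains only a handful of patients, where a single outlier would dominate a mean.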
1. Manufacturing reduces Shannon entropy by ~45–60% in every patient (paired Wilcoxon signed-rank test, p = 0.031).
2. Clonality increases 2–3× from Apheresis to Product.
3. D50 collapses to 11 or fewer clonotypes in Product (single digits in five of six patients), indicating extreme oligoclonality.
4. CR patients show the most focused Product repertoires, suggesting a potential link between clonal selection efficiency and clinical response.