This notebook examines clonal expansion patterns in the manufactured Product, identifying the most dominant clonotypes and characterizing the degree of oligoclonality. We assess whether specific clones dominate the Product and how the top clone fraction varies with clinical response.
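```r
# Rank clonotypes within each patient's Product sample and keep the top 20
top_clones <- tcr_data %>%
  filter(sample_type == "Product") %>%
  group_by(patient_id) %>%
  # .by_group = TRUE sorts within each patient rather than across the whole table
  arrange(desc(clone_fraction), .by_group = TRUE) %>%
  mutate(rank = row_number()) %>%
  filter(rank <= 20)
```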
Below are the top 10 expanded clonotypes in the Product for patient PT-001 (CR). Note the extreme dominance of the top clone, which alone comprises about 42% of all reads, more than five times the share of the second-ranked clone.
| Rank | CDR3 Sequence | V Gene | J Gene | Count | Fraction |
|---|---|---|---|---|---|
| 1 | CDWVSYQFTKRRF | TRBV7-9 | TRBJ1-1 | 30,790 | 42.46% |
| 2 | CMGDVHRMPPGLMF | TRBV29-1 | TRBJ2-1 | 5,447 | 7.51% |
| 3 | CYSAQCWFMKLMYEF | TRBV29-1 | TRBJ2-1 | 1,988 | 2.74% |
| 4 | CLICPRQLPLMVNKLF | TRBV10-3 | TRBJ1-2 | 980 | 1.35% |
| 5 | CETHNLRSMQAPQSIQVF | TRBV12-4 | TRBJ1-6 | 565 | 0.78% |
| 6 | CVIYIPQNADGMSHIGF | TRBV6-5 | TRBJ2-5 | 368 | 0.51% |
| 7 | CINEITGPPDMNGIVF | TRBV18 | TRBJ1-5 | 239 | 0.33% |
| 8 | CWMWNDKGQWESRWEWIF | TRBV15 | TRBJ2-4 | 185 | 0.26% |
| 9 | CEWFQHPQDVWDRIAF | TRBV30 | TRBJ2-3 | 128 | 0.18% |
| 10 | CLGIVHPSSGGAHPVVF | TRBV27 | TRBJ1-1 | 107 | 0.15% |
We quantify clonal dominance by examining the fraction of total reads held by the top 1, top 5, and top 10 clonotypes in each Product sample.
```r
# Compute cumulative clone fraction for top clonotypes
cumulative <- top_clones %>%
  group_by(patient_id) %>%
  arrange(rank) %>%
  mutate(cum_fraction = cumsum(clone_fraction))

# Plot cumulative curves
ggplot(cumulative, aes(x = rank, y = cum_fraction,
                       color = clinical_response, group = patient_id)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2) +
  scale_color_manual(values = c("CR" = "#00ff9d", "PR" = "#00d4ff", "PD" = "#ff6b6b")) +
  labs(x = "Clonotype Rank", y = "Cumulative Fraction") +
  theme_minimal()
```
In CR patients, the top 10 clonotypes account for 30–45% of all Product reads, indicating highly focused expansion. PD patients show a flatter accumulation curve, where the top 10 clones capture only 15–25% of reads — consistent with less efficient clonal selection during manufacturing.
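The top-1, top-5, and top-10 fractions discussed above can also be tabulated per patient rather than read off the plot. The following is a minimal sketch on hypothetical toy data: the patient IDs, the fractions, and the `top_k_summary` name are illustrative assumptions, not values from this study. With the real data, the filtered Product rows of `tcr_data` would take the place of `toy`.

```r
library(dplyr)

# Hypothetical toy data (not study values): two patients, 12 clonotypes each
toy <- tibble(
  patient_id        = rep(c("PT-A", "PT-B"), each = 12),
  clinical_response = rep(c("CR", "PD"), each = 12),
  clone_fraction    = c(
    0.20, 0.08, 0.04, 0.03, 0.02, 0.01, 0.01, 0.01, 0.005, 0.005, 0.004, 0.003,
    0.05, 0.04, 0.03, 0.02, 0.02, 0.01, 0.01, 0.01, 0.005, 0.005, 0.004, 0.003
  )
)

# Sort clonotypes within each patient, then sum the k largest fractions
top_k_summary <- toy %>%
  group_by(patient_id, clinical_response) %>%
  arrange(desc(clone_fraction), .by_group = TRUE) %>%
  summarise(
    top1  = sum(head(clone_fraction, 1)),
    top5  = sum(head(clone_fraction, 5)),
    top10 = sum(head(clone_fraction, 10)),
    .groups = "drop"
  )

print(top_k_summary)
```

Grouping the resulting table by `clinical_response` and taking `range(top10)` would give per-response ranges of the kind quoted in the text.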
We examine the overlap between Apheresis and Product repertoires to determine what fraction of the Product's dominant clones were detectable in the starting material.
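```r
# Identify shared clonotypes between Apheresis and Product
shared <- tcr_data %>%
  # Restrict to the two sample types being compared
  filter(sample_type %in% c("Apheresis", "Product")) %>%
  select(patient_id, sample_type, cdr3_aa, clone_fraction) %>%
  # values_fn = list keeps every fraction for a CDR3 as a list element;
  # a CDR3 absent from a sample type is left as NULL
  pivot_wider(names_from = sample_type, values_from = clone_fraction,
              values_fn = list) %>%
  # Keep only CDR3s observed in both Apheresis and Product
  filter(map_lgl(Apheresis, ~ !is.null(.x)),
         map_lgl(Product, ~ !is.null(.x)))
```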
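To turn the shared-clonotype table into the quantity the section asks about (the fraction of dominant Product clones detectable in Apheresis), one option is to flag each top-ranked Product clone by whether its CDR3 also appears in that patient's Apheresis sample. This is a sketch under stated assumptions: the toy data, the `top_n_clones` cutoff of 3, and the names `dominant`, `apheresis`, and `traceable` are hypothetical, and matching on `cdr3_aa` alone is a simplification.

```r
library(dplyr)

# Hypothetical toy data with the same columns used in this notebook
toy <- tibble(
  patient_id     = "PT-A",
  sample_type    = c(rep("Product", 4), rep("Apheresis", 3)),
  cdr3_aa        = c("CASSA", "CASSB", "CASSC", "CASSD",
                     "CASSA", "CASSC", "CASSX"),
  clone_fraction = c(0.40, 0.10, 0.05, 0.01, 0.001, 0.002, 0.01)
)

top_n_clones <- 3  # assumed cutoff for calling a Product clone "dominant"

# Dominant Product clones per patient
dominant <- toy %>%
  filter(sample_type == "Product") %>%
  group_by(patient_id) %>%
  slice_max(clone_fraction, n = top_n_clones, with_ties = FALSE) %>%
  ungroup()

# CDR3s observed in each patient's Apheresis sample
apheresis <- toy %>%
  filter(sample_type == "Apheresis") %>%
  distinct(patient_id, cdr3_aa) %>%
  mutate(in_apheresis = TRUE)

# Flag dominant clones found in Apheresis and summarise per patient
traceable <- dominant %>%
  left_join(apheresis, by = c("patient_id", "cdr3_aa")) %>%
  mutate(in_apheresis = coalesce(in_apheresis, FALSE)) %>%
  group_by(patient_id) %>%
  summarise(
    n_dominant     = n(),
    n_traceable    = sum(in_apheresis),
    frac_traceable = mean(in_apheresis),
    .groups = "drop"
  )

print(traceable)
```

A clone missing from Apheresis in this kind of comparison may simply have fallen below the sequencing depth of the starting material, so the traceable fraction is best read as a lower bound.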
1. Manufacturing drives extreme oligoclonal expansion — top 10 Product clones capture 15–45% of reads depending on patient.
2. CR patients show the most concentrated Product repertoires, suggesting efficient clonal selection is therapeutically beneficial.
3. The majority of dominant Product clones are traceable back to the Apheresis starting material.
4. PD patients exhibit more diffuse Product repertoires with less dominant clonal expansion.