School: Manhasset High School
Impact Statement: From the program, I learned how to navigate and analyze publicly available biomedical datasets, apply multivariate statistical methods, and interpret biomarker patterns in relation to disease onset and progression. This experience improved my data analysis, critical thinking, and scientific writing skills, while also giving me exposure to bioinformatics tools that are widely used in biomedical research. It prepared me for future research by showing me how computational methods can be integrated with biological questions, equipping me with both technical skills and a deeper appreciation of how data-driven approaches contribute to advancements in precision medicine.
Multiple sclerosis (MS) is a chronic autoimmune disorder in which the body’s immune system attacks the central nervous system (CNS), causing inflammation, demyelination of neurons, and progressive neurodegeneration. It affects over 2.8 million people worldwide, with onset usually between ages 20 and 40, and is more common in women. Symptoms vary widely, ranging from vision loss and muscle weakness to impaired coordination and memory. Diagnosis typically relies on MRI findings, clinical symptoms, and cerebrospinal fluid (CSF) analysis, which often detect the disease only after irreversible CNS damage has occurred. MS exists in several forms: relapsing-remitting MS (RRMS), which often transitions to secondary progressive MS (SPMS); primary progressive MS (PPMS), characterized by gradual worsening; and two early presentations, radiologically isolated syndrome (RIS), a preclinical stage detectable only by imaging, and clinically isolated syndrome (CIS), a first clinical episode suggestive of MS. Recent studies reveal that immune dysregulation, genetic predispositions, and environmental factors such as Epstein-Barr virus (EBV) infection often precede clinical onset by years. However, most existing studies focus on single biomarkers, overlooking the complexity of MS pathogenesis. Identifying multivariate biomarker profiles capable of distinguishing early stages or predicting progression between subtypes offers the potential for earlier detection, risk stratification, and timely therapeutic intervention.
Although progress has been made in understanding MS, current diagnostic tools remain limited because they typically identify disease only after substantial neurological damage has occurred. MRI, CSF analysis, and clinical evaluation provide important information but lack the sensitivity to detect MS in its earliest stages or to reliably predict disease course. Biomarkers such as oligoclonal bands, cytokines, and B-cell subsets have been linked to MS onset and progression, but when studied in isolation, their predictive value is inconsistent. This creates a gap between biological insight and clinical application, especially in the earliest stages, RIS and CIS, where timely intervention could alter long-term outcomes. Because MS pathogenesis involves interactions among multiple immune pathways and environmental triggers, there is a need to move beyond single-marker studies toward integrated multivariate approaches capable of capturing disease complexity and producing stronger predictive models for onset and progression.
This study hypothesizes that combining multiple immune biomarkers into a single multivariate profile will predict MS onset and progression more accurately than any single biomarker. Specifically, integrated biomarker panels built from transcriptomic datasets will identify early immune changes in the earliest stages, RIS and CIS, and distinguish relapsing forms of MS from progressive subtypes. The null hypothesis is that multivariate profiles will show no improvement in predictive accuracy compared to individual biomarkers.
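One way to make this hypothesis testable is to compare the area under the ROC curve (AUC) of each single marker against that of a combined panel score. The sketch below is only an illustration under stated assumptions: the expression values are made up (not taken from the study's datasets), the names marker_a and marker_b are hypothetical, and summing z-scores is just one simple way to combine markers, not necessarily the method used here.

```python
from statistics import mean, pstdev

def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))

def zscores(values):
    """Standardize one marker so that markers on different scales are comparable."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def composite_score(marker_columns):
    """Sum of per-marker z-scores for each sample (an assumed, simple panel score)."""
    standardized = [zscores(col) for col in marker_columns]
    return [sum(vals) for vals in zip(*standardized)]

# Toy, made-up values (NOT study data): four cases followed by four controls.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
marker_a = [3, 1, 3, 2, 2, 0, 1, 0]  # hypothetical marker
marker_b = [1, 3, 3, 2, 0, 2, 1, 0]  # hypothetical marker

auc_a = auc(marker_a, labels)
auc_b = auc(marker_b, labels)
auc_panel = auc(composite_score([marker_a, marker_b]), labels)
print(auc_a, auc_b, auc_panel)  # 0.875 0.875 1.0
```

In this toy case each marker alone misranks some samples, while the combined score separates the groups completely. With real data, the comparison would need to be made on held-out samples (for example with cross-validation) so that the panel is not favored simply because it has more parameters to fit.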
Analysis of transcriptomic datasets identified stage-specific molecular signatures that separate early risk from active MS. In CIS/RIS compared to controls, differentially expressed genes included GBP1, CCL3L3, and IGIP, which regulate immune signaling and cytokine response. Transcriptional regulators such as NR4A3, CREM, and MAFB, along with stress response genes like CDKN1A and MRS2, also appeared consistently, suggesting that immune activity is altered before full disease onset.
For RRMS compared to controls, distinct patterns emerged. Genes such as CR2, FCER1G, and TYROBP indicated complement and Fc receptor signaling, while ITGA2B and PTPRK pointed to cell adhesion and integrin pathways. Additional genes including CSF2RA and CST3 reflected cytokine and growth factor response, and UBC suggested changes in protein turnover. These findings reflect heightened immune activation and cellular remodeling during active disease stages.
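The text does not state how differentially expressed genes were called. A common approach is to compute a log2 fold change per gene and then correct the per-gene p-values for multiple testing. The sketch below assumes that workflow; the gene names, p-values, and the pseudocount are placeholders chosen for illustration, not values from the study.

```python
from math import log2
from statistics import mean

def log2_fold_change(case_values, control_values, pseudocount=1.0):
    """log2 ratio of mean expression; the pseudocount avoids division by zero."""
    return log2((mean(case_values) + pseudocount) / (mean(control_values) + pseudocount))

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value stays monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Placeholder genes and p-values (hypothetical, NOT study results).
raw_p = {"gene_1": 0.01, "gene_2": 0.04, "gene_3": 0.03, "gene_4": 0.20}
adjusted_p = dict(zip(raw_p, benjamini_hochberg(list(raw_p.values()))))

# Genes passing an assumed 5% false discovery rate threshold.
significant = [gene for gene, q in adjusted_p.items() if q < 0.05]

# Example fold change: mean 7 in cases vs mean 3 in controls gives log2(8/4).
lfc = log2_fold_change([7, 7, 7], [3, 3, 3])
```

Here only gene_1 remains below the 5% threshold after correction, which shows why a gene with a small raw p-value may still drop out of the final list when many genes are tested at once.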
Functional enrichment analysis supported these observations. Group 1 genes (CIS/RIS versus controls) clustered into cytokine-mediated signaling and transcriptional regulation, while Group 2 genes (RRMS versus controls) mapped to immune effector pathways, adhesion, and protein regulation. Overlap between blood and brain datasets indicated systemic immune signatures that extend beyond local CNS damage.
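Functional enrichment is often assessed with a hypergeometric (one-sided Fisher) test, which asks how likely it is to see at least the observed overlap between a gene list and a pathway by chance. The sketch below assumes that test; the enrichment tool actually used is not named in the text, and the gene identifiers are placeholders.

```python
from math import comb

def enrichment_p_value(background_size, pathway_size, list_size, overlap):
    """P(overlap >= observed) when list_size genes are drawn at random from the background."""
    total = comb(background_size, list_size)
    upper = min(pathway_size, list_size)
    favorable = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / total

# Placeholder gene sets (hypothetical identifiers, NOT study data).
background = {f"g{i}" for i in range(10)}   # all genes measured
pathway = {"g0", "g1", "g2", "g3"}          # genes annotated to one pathway
gene_list = {"g0", "g1", "g9"}              # differentially expressed genes

overlap = len(gene_list & pathway)
p_value = enrichment_p_value(len(background), len(pathway), len(gene_list), overlap)
```

With these toy sets, two of the three listed genes fall in the pathway and the p-value is 1/3, so the overlap would not count as enriched. In a real analysis the background should be the set of genes actually measured in the dataset, and the p-values across pathways would again need multiple-testing correction.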
These results indicate that combining multiple biomarkers can distinguish early MS risk from active disease. The separation between gene groups suggests that multivariate panels may improve early detection and provide a framework for monitoring disease stage and progression.
This project shows that early MS (CIS/RIS) and active MS (RRMS) are marked by different gene expression profiles. Early-stage disease is defined by cytokine signaling and transcriptional regulation, while active stages show stronger involvement of immune effector pathways, complement activity, and cellular remodeling. These differences suggest that the immune system shifts in a stage-specific way, which could explain variations in disease onset and progression.
Focusing on multivariate profiles, rather than individual markers, provides clearer separation between early and active stages. This approach has the potential to improve early diagnosis and risk stratification, while also offering new tools for monitoring progression.
The study is limited by dataset scope and the absence of cerebrospinal fluid profiles. Larger and more diverse patient datasets, along with additional protein markers and epigenetic data, are needed to confirm these findings. Even with these limitations, the results support the value of multivariate biomarker panels for understanding and predicting MS progression.
By: Madison Qu. The opinions expressed here are the views of the writer and do not necessarily reflect the views and opinions of Elio Academy.