September 23, 2025

Uniting Omics and Clinical Data for Precision Multiple Sclerosis Care

The Challenge: Complexity Without Connection

Multiple Sclerosis (MS) is an autoimmune disorder that affects nearly one million people in the United States. It occurs when the body’s immune system attacks the myelin sheath—the protective covering around neurons—leading to progressive nerve damage. Despite advances in treatment, researchers still struggle to fully understand what drives MS onset and progression. Huge amounts of data are being generated from patient visits, medical records, and laboratory experiments, but these rich sources—clinical data and proteomic data—often exist in silos. Clinicians rely on fragmented clinical observations, while researchers focus on isolated protein-level findings. The result is a persistent gap between molecular discovery and real-world patient care.

Our Mission

To empower clinicians and researchers by transforming omic and clinical data into actionable insights through a reproducible machine learning pipeline—advancing precision care for Multiple Sclerosis patients.

Our Solution

IntegraMS bridges this gap by integrating proteomic and clinical data to model MS disability status and uncover protein signatures linked to disease progression. Through a unified web platform, researchers and clinicians can upload a patient’s data portfolio, run analysis pipelines, and receive model-predicted DSS (Disability Status Scale) scores alongside curated summaries of the most informative peptide-level features. The result is a single dashboard that replaces hours of manual cross-referencing with interpretable, data-driven insights grounded in both clinical context and proteomic signal.

Why It Matters

The need for such a system is urgent. The global MS therapeutics market was valued at $27.4 billion in 2024 and is projected to reach $38.6 billion by 2030, while the proteomics and transcriptomics data market is expected to grow from $32.6 billion to $51.8 billion over the same period. Both sectors depend on better integration of clinical and molecular data to speed up drug discovery and improve care outcomes. By providing a scalable, reproducible, and clinically interpretable pipeline, IntegraMS unlocks that potential, helping translate research discoveries into practical clinical tools.

Who It’s For

  • Neurologists and Clinicians: Gain a comprehensive, data-integrated view of each patient to support early intervention and personalized treatment.
  • Clinical Researchers: Explore relationships between autoantibody patterns, EDSS, and neurofilament light chain (NfL) to uncover predictive biomarkers.
  • Data Scientists: Build upon a reproducible ML framework designed for multi-omics integration and small cohort scalability.

IntegraMS System Workflow

The IntegraMS platform follows a four-stage workflow:

  1. Data Integration – Secure ingestion of de-identified proteomic and clinical datasets from UCSF Neurology into a unified schema. Phage display proteomic measurements are aligned with patient-level timelines, demographics, and DSS outcomes to create a structured dataset suitable for modeling.
  2. Exploratory Analysis – Outlier detection, normalization, cohort characterization, and dimensionality reduction are applied to understand peptide feature distributions and identify meaningful protein–disability patterns prior to model training.
  3. Machine Learning Models – Supervised regression models are trained on a combined real and synthetic cohort, augmented with patients generated by SDV (Synthetic Data Vault) and balanced across demographics with SMOTE (Synthetic Minority Over-sampling Technique). The models predict current Disability Status Scale (DSS) scores directly from curated peptide features and aligned clinical variables.
  4. Visualization & Insights – All predictions and feature summaries are delivered through an interactive dashboard built with React and a Node.js backend, supported by AWS (S3, ECS, SQS) and Firebase. The system presents DSS predictions, peptide feature importance, and clinical context through interpretable charts and tables, enabling researchers to extract clear, actionable insights from complex MS data.
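
As a rough illustration of the first two stages, the Python sketch below joins long-format peptide measurements to clinical records by patient ID and then z-scores each peptide feature. It is a minimal sketch under stated assumptions, not the IntegraMS implementation: the field names (patient_id, peptide, signal, dss), the sample values, and the helper functions integrate and zscore are all hypothetical, and it assumes every retained patient has a value for every peptide.

```python
from statistics import mean, pstdev

# Hypothetical de-identified inputs; field names and values are illustrative only.
clinical = [
    {"patient_id": "P001", "age": 42, "dss": 2.0},
    {"patient_id": "P002", "age": 55, "dss": 4.5},
    {"patient_id": "P003", "age": 37, "dss": 1.0},
]
proteomic = [
    {"patient_id": "P001", "peptide": "pep_A", "signal": 10.0},
    {"patient_id": "P001", "peptide": "pep_B", "signal": 3.0},
    {"patient_id": "P002", "peptide": "pep_A", "signal": 30.0},
    {"patient_id": "P002", "peptide": "pep_B", "signal": 5.0},
    {"patient_id": "P003", "peptide": "pep_A", "signal": 20.0},
    {"patient_id": "P003", "peptide": "pep_B", "signal": 4.0},
    {"patient_id": "P999", "peptide": "pep_A", "signal": 99.0},  # no clinical record
]


def integrate(clinical_rows, proteomic_rows):
    """Stage 1 (sketch): pivot peptide signals into one row per patient."""
    merged = {row["patient_id"]: dict(row) for row in clinical_rows}
    for measurement in proteomic_rows:
        patient = merged.get(measurement["patient_id"])
        if patient is None:
            continue  # drop measurements that lack a matching clinical record
        patient[measurement["peptide"]] = measurement["signal"]
    return list(merged.values())


def zscore(rows, features):
    """Stage 2 (sketch): standardize each peptide feature across the cohort."""
    out = [dict(r) for r in rows]  # copy so the integrated table is left intact
    for feature in features:
        values = [r[feature] for r in rows]
        mu, sigma = mean(values), pstdev(values)
        for r in out:
            # A constant feature carries no signal; map it to 0 instead of dividing by 0.
            r[feature] = 0.0 if sigma == 0 else (r[feature] - mu) / sigma
    return out


table = integrate(clinical, proteomic)
normalized = zscore(table, ["pep_A", "pep_B"])
for row in normalized:
    print(row)
```

Measurements without a matching clinical record (P999 here) are dropped rather than imputed; a real pipeline would need an explicit policy for missing or unmatched data before any model training.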

The Data Behind IntegraMS

IntegraMS draws on a private UCSF Neurology cohort of approximately 500 MS patients, containing both experimental proteomic assays and rich clinical metadata. All datasets are fully de-identified, and their use is governed by UCSF’s IRB and institutional approval processes, ensuring that model development and analysis are conducted ethically and in compliance with patient privacy standards.

What Sets IntegraMS Apart

Existing tools typically analyze omics and clinical data separately, often requiring manual correlation between findings. IntegraMS unifies these steps within one reproducible framework, enabling:

  • Faster turnaround from raw data to actionable predictions
  • Streamlined workflows for both clinicians and researchers
  • A scalable model adaptable to other autoimmune or neurodegenerative diseases

In short, IntegraMS transforms complexity into clarity—making precision medicine accessible, interpretable, and reproducible.

Looking Ahead

The UC Berkeley MIDS team envisions expanding IntegraMS beyond MS. The architecture supports adaptation to other diseases where proteomic and clinical data intersect, such as lupus, ALS, or Alzheimer’s.

Future milestones include:

  • Integrating genomic and transcriptomic datasets
  • Enhancing model interpretability for clinical explainability
  • Collaborating with UCSF Neurology for validation and real-world deployment

IntegraMS represents more than a project: it is a bridge between data and care, research and reality, science and the patient.