BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Dana-Farber Cancer Institute - ECPv6.15.20//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://ds.dfci.harvard.edu
X-WR-CALDESC:Events for Dana-Farber Cancer Institute
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20240310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20241103T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20260308T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20261101T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250918T160000
DTEND;TZID=America/New_York:20250918T170000
DTSTAMP:20260501T151357
CREATED:20250912T174007Z
LAST-MODIFIED:20250918T235342Z
UID:6513-1758211200-1758214800@ds.dfci.harvard.edu
SUMMARY:Reproducible Research - Tools and a case study with NHANES
DESCRIPTION:HSPH Biostatistics & DFCI Data Science Colloquium Series \nSeptember 18\, 2025\n4:00 PM\nHSPH FXB-301 \nRobert Gentleman\, PhD\nPrincipal Research Scientist\nHarvard T.H. Chan School of Public Health and Dana-Farber Cancer Institute \nI will discuss how new technologies and statistical methodologies can help enhance our ability to perform reproducible research. I will demonstrate how these could be used in a real-world setting by examining questions\, primarily of an epidemiological nature\, using data from the NHANES surveys. I will describe one version of an Environment-Wide Association Study (EnWAS) and show how this methodology can potentially be employed to interrogate large\, complex data resources.
URL:https://ds.dfci.harvard.edu/event/reproducible-research-tools-and-a-case-study-with-nhanes/
LOCATION:Harvard TH Chan School of Public Health\, 677 Huntington Ave\, Boston\, MA\, 02115
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/jpeg:https://ds.dfci.harvard.edu/wp-content/uploads/2025/09/Robert-Gentlemen-850x430-2-e1757698738137.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250911T160000
DTEND;TZID=America/New_York:20250911T170000
DTSTAMP:20260501T151357
CREATED:20250903T113817Z
LAST-MODIFIED:20250912T120312Z
UID:6428-1757606400-1757610000@ds.dfci.harvard.edu
SUMMARY:Preference Inference for Language Models Debiased by Fisher Random Walk Models
DESCRIPTION:HSPH Biostatistics & DFCI Data Science Colloquium Series\nSeptember 11 at 4:00PM\nHarvard TH Chan School of Public Health\, FXB-301 \nJunwei Lu\, PhD\nAssociate Professor of Biostatistics\, Harvard TH Chan School of Public Health \nHuman preference alignment has been shown to be effective in training large language models (LLMs). It allows an LLM to understand human feedback and preferences. Despite the extensive literature on algorithms for aligning the rank of human preference\, uncertainty quantification for the ranking estimation still needs to be explored and is of great practical significance. For example\, it is important to overcome the problem of hallucination for LLMs in the medical domain\, and an inferential method for the ranking of LLM answers becomes necessary. In this talk\, we will present a novel framework called “Fisher random walk” to conduct semi-parametric efficient preference inference for language models and illustrate its application to language models for medical knowledge.
URL:https://ds.dfci.harvard.edu/event/preference-inference-for-language-models-debiased-by-fisher-random-walk-models/
LOCATION:Harvard TH Chan School of Public Health\, 677 Huntington Ave\, Boston\, MA\, 02115
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/jpeg:https://ds.dfci.harvard.edu/wp-content/uploads/2025/09/junweilarger.jpeg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250403T160000
DTEND;TZID=America/New_York:20250403T170000
DTSTAMP:20260501T151357
CREATED:20250314T164130Z
LAST-MODIFIED:20250328T121330Z
UID:5991-1743696000-1743699600@ds.dfci.harvard.edu
SUMMARY:Fréchet Regression of Random Objects on Vector Covariates and Its Applications for Single Cell RNA-seq Data Analysis
DESCRIPTION:HSPH Biostatistics and DFCI Data Science Colloquium\nThursday\, April 3\, 2025\n4:00pm\nHarvard TH Chan School of Public Health\, FXB G13 \nHongzhe Li\, PhD\nPerelman Professor of Biostatistics\, Epidemiology and Informatics\nDirector\, Center for Statistics in Big Data\nVice Chair for Research Integration\, Department of Biostatistics\, Epidemiology and Informatics\, University of Pennsylvania \nPopulation-level single-cell RNA-seq data captures gene expression profiles across thousands of cells from each individual in a sizable cohort. This data facilitates the construction of cell-type- and individual-specific gene co-expression networks by estimating covariance matrices. Investigating how these co-expression networks relate to individual-level covariates provides critical insights into the interplay between molecular processes and biological or clinical traits. This talk introduces Fréchet regression\, modeling covariance matrices as outcomes and vector covariates as predictors\, using the Wasserstein distance between covariance matrices as a metric instead of the Euclidean distance. A test statistic is proposed based on the Fréchet mean and covariate-weighted Fréchet mean\, with its asymptotic null distribution derived. Analysis of large-scale single-cell RNA-seq data reveals an association between the co-expression network of genes in the nutrient-sensing pathway and age\, highlighting perturbations in gene co-expression networks with aging. Additionally\, a robust local Fréchet regression approach\, leveraging neural unbalanced optimal transport\, is briefly discussed to explore how cells are temporally organized during the differentiation of human embryonic stem cells into embryoid bodies.
URL:https://ds.dfci.harvard.edu/event/frechet-regression-of-random-objects-on-vector-covariates-and-its-applications-for-single-cell-rna-seq-data-analysis/
LOCATION:Harvard TH Chan School of Public Health\, 677 Huntington Ave\, Boston\, MA\, 02115
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/jpeg:https://ds.dfci.harvard.edu/wp-content/uploads/2025/03/li-crop.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250327T160000
DTEND;TZID=America/New_York:20250327T170000
DTSTAMP:20260501T151357
CREATED:20250314T163931Z
LAST-MODIFIED:20250328T121320Z
UID:5985-1743091200-1743094800@ds.dfci.harvard.edu
SUMMARY:Data Integration in Spatial and Single Cell Omics: What is Erased\, and Can you Recover it?
DESCRIPTION:HSPH Biostatistics and DFCI Data Science Colloquium\nThursday\, March 27\, 2025\n4:00pm\nHarvard TH Chan School of Public Health\, FXB G13 \nNancy Zhang\, PhD\nGe Li and Ning Zhao Professor\, Professor of Statistics and Data Science\, Vice Dean of Wharton Doctoral Programs\, The Wharton School\, University of Pennsylvania \nIn single-cell and spatial biology\, data integration refers to the alignment of cells across samples and modalities\, and is a ubiquitous challenge affecting all downstream analyses. The goal in cell integration is to find cells across data sets that share the same biological state that may be obscured by technical differences. \nIn this talk\, I will cast the cell integration problem on a continuum of weak to strong linkage\, depending on the strength of feature sharing between experiments. First\, I will examine integration across data modalities of weak linkage. This arises when there are few shared features between the data being integrated\, for example\, between single-cell RNA sequencing data and spatial proteomics data. For this\, I will present MaxFuse\, a method that leverages higher-order relationships between all features\, including unshared features\, to achieve accurate integration. Next\, I will consider the scenario of data alignment across the same modality in clinical-scale studies. For this setting\, I will show that existing paradigms are overly aggressive\, erasing disease and treatment effects and introducing severe data distortion. I will introduce a “pool-of-controls” experimental design concept to disentangle biological variation from unwanted variation. Based on this\, I will describe CellANOVA\, a novel statistical model and scalable algorithm that recovers biological signals lost during batch integration and corrects integration-related data distortion. Through these two contrasting paradigms\, I will share the key lessons learned and the remaining challenges in this field.
URL:https://ds.dfci.harvard.edu/event/data-integration-in-spatial-and-single-cell-omics-what-is-erased-and-can-you-recover-it/
LOCATION:Harvard TH Chan School of Public Health\, 677 Huntington Ave\, Boston\, MA\, 02115
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/jpeg:https://ds.dfci.harvard.edu/wp-content/uploads/2025/03/zhang-crop-e1741970356597.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250311T160000
DTEND;TZID=America/New_York:20250311T170000
DTSTAMP:20260501T151357
CREATED:20250226T134543Z
LAST-MODIFIED:20250312T130019Z
UID:5910-1741708800-1741712400@ds.dfci.harvard.edu
SUMMARY:Dissecting Tumor Transcriptional Heterogeneity from Single-cell RNA-seq Data by Generalized Binary Covariance Decomposition
DESCRIPTION:HSPH Biostatistics and DFCI Data Science Colloquium\nTuesday March 11th at 4:00pm\nHSPH FXB G12 \nYusha Liu\, PhD\nResearch Assistant Professor\nDepartment of Biostatistics\nThe University of North Carolina at Chapel Hill \nProfiling tumors with single-cell RNA sequencing has the potential to identify recurrent patterns of transcription variation related to cancer progression\, and to produce therapeutically relevant insights. However\, strong inter-tumor heterogeneity can obscure more subtle patterns that are shared across tumors. In this talk\, I will introduce a novel statistical method\, generalized binary covariance decomposition (GBCD)\, to address this problem. I will show that GBCD can decompose transcriptional heterogeneity into interpretable components (including patient-specific\, dataset-specific and shared components relevant to disease subtypes) and that\, in the presence of strong inter-tumor heterogeneity\, it can produce more interpretable results than existing methods. Applied to data on pancreatic ductal adenocarcinoma\, GBCD produced a refined characterization of existing tumor subtypes\, and identified a gene expression program prognostic of poor survival independent of tumor stage and subtype. The gene expression program is enriched for genes involved in stress responses\, and suggests a role for the integrated stress response in pancreatic ductal adenocarcinoma.
URL:https://ds.dfci.harvard.edu/event/dissecting-tumor-transcriptional-heterogeneity-from-single-cell-rna-seq-data-by-generalized-binary-covariance-decomposition/
LOCATION:Harvard TH Chan School of Public Health\, 677 Huntington Ave\, Boston\, MA\, 02115
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://ds.dfci.harvard.edu/wp-content/uploads/2025/02/headshot-copy-e1740577495994.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250310T160000
DTEND;TZID=America/New_York:20250310T170000
DTSTAMP:20260501T151357
CREATED:20250226T134218Z
LAST-MODIFIED:20250311T161944Z
UID:5905-1741622400-1741626000@ds.dfci.harvard.edu
SUMMARY:Decoding Aging at Spatial and Single-cell Resolution with Machine Learning
DESCRIPTION:HSPH Biostatistics and DFCI Data Science Colloquium\nMonday March 10th at 4:00pm\nHSPH Kresge G2 \nEric Sun\nPhD Candidate\, Department of Biomedical Informatics\nStanford University \nAging is a highly complex process and the greatest risk factor for many chronic diseases including cardiovascular disease\, dementia\, stroke\, diabetes\, and cancer. Recent spatial and single-cell omics technologies have enabled the high-dimensional profiling of complex biology including that underlying aging. As such\, new machine learning and computational methods are needed to unlock important insights from spatial and single-cell omics datasets. First\, I present the development of high-resolution machine learning models (‘spatial aging clocks’) that can measure the aging of individual cells in the brain. Using these spatial aging clocks\, I discovered that some cell types can dramatically influence the aging of nearby cells. Next\, I present new computational and statistical methods for overcoming the gene coverage limitations of existing spatially resolved single-cell omics technologies\, which have enabled the discovery of gene pathways underlying the spatial effects of brain aging.
URL:https://ds.dfci.harvard.edu/event/decoding-aging-at-spatial-and-single-cell-resolution-with-machine-learning/
LOCATION:Harvard TH Chan School of Public Health\, 677 Huntington Ave\, Boston\, MA\, 02115
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/jpeg:https://ds.dfci.harvard.edu/wp-content/uploads/2025/02/Eric_Sun-scaled-e1740577294700.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250303T160000
DTEND;TZID=America/New_York:20250303T170000
DTSTAMP:20260501T151357
CREATED:20250221T180945Z
LAST-MODIFIED:20250304T155554Z
UID:5888-1741017600-1741021200@ds.dfci.harvard.edu
SUMMARY:How Do Neural Networks Learn Features From Data?
DESCRIPTION:HSPH Biostatistics and DFCI Data Science Colloquium\nMonday March 3rd at 4:00pm\nHSPH Kresge G2 \nAdityanarayanan Radhakrishnan\nEric and Wendy Schmidt Center Postdoctoral Fellow\, Broad Institute of MIT and Harvard\, Harvard School of Engineering and Applied Sciences \nAbstract: Understanding how neural networks learn features\, or relevant patterns in data\, is key to accelerating scientific discovery. In this talk\, I will present a unifying mechanism that characterizes feature learning in neural network architectures. Namely\, features learned by neural networks are captured by a statistical operator known as the average gradient outer product (AGOP). More generally\, the AGOP enables feature learning in machine learning models that have no built-in feature learning mechanism (e.g.\, kernel methods). I will present two applications of this line of work. First\, I will show how AGOP can be used to steer LLMs and vision-language models\, guiding them towards specified concepts and shedding light on vulnerabilities in these models. I will then discuss how AGOP can be used to discover cellular programs (sets of genes whose expressions exhibit dependencies across cell subpopulations) from millions of sequenced cells. I will show how AGOP identified programs that reflect the heterogeneity found in various cell types\, subtypes\, and states in this data. Overall\, this line of work advances our fundamental understanding of how neural networks extract features from data\, leading to the development of novel\, interpretable\, and effective methods for use in scientific applications.
URL:https://ds.dfci.harvard.edu/event/how-do-neural-networks-learn-features-from-data/
LOCATION:Harvard TH Chan School of Public Health\, 677 Huntington Ave\, Boston\, MA\, 02115
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/jpeg:https://ds.dfci.harvard.edu/wp-content/uploads/2025/02/headshot-e1740161254828.jpeg
END:VEVENT
END:VCALENDAR