BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Dana-Farber Cancer Institute - ECPv6.15.20//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Dana-Farber Cancer Institute
X-ORIGINAL-URL:https://ds.dfci.harvard.edu
X-WR-CALDESC:Events for Dana-Farber Cancer Institute
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20210314T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20211107T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20220313T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20221106T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20230312T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20231105T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20220401T130000
DTEND;TZID=America/New_York:20220401T173000
DTSTAMP:20260428T050123
CREATED:20220227T194301Z
LAST-MODIFIED:20220404T003221Z
UID:3312-1648818000-1648834200@ds.dfci.harvard.edu
SUMMARY:2022 Marvin Zelen Symposium: Data Visualization
DESCRIPTION:The growing availability of informative datasets and software tools has led to increased reliance on data visualizations across many industries\, academia\, and government. News organizations are increasingly embracing data journalism and including effective infographics as part of their reporting\, while in research we increasingly rely on data visualization to assess data quality and describe and defend our findings. This year’s Zelen symposium brings together data visualization experts and developers of some of the most widely used data visualization software to share their thoughts and ideas with us. \nApril 1\, 2022 1:00 – 5:30 PM\nSimmons University\, Paresky Conference Center\nRegister for in-person or virtual attendance. \nThis event is currently scheduled as a hybrid event. We will follow all CDC and institutional guidelines for COVID safety. In-person participants are asked to provide proof of up-to-date vaccination or a negative COVID test taken within the 48 hours prior to the event. \n#zelen2022 \nProgram PDF. \nSpeakers: \n\nAlberto Cairo\, University of Miami\n“What You Design is Not What People See”\nAmanda Cox\, USA Facts\n“End of an Era”\nJeffrey Heer\, University of Washington\n“Authoring and Visualizing Multiverse Analyses”\nJessica Hullman\, Northwestern University\n“Visualizations as Model Checks”\nAlvitta Ottley\, Washington University in St. Louis\n“The Case for Precision Visualization”\nLace Padilla\, University of California\, Merced\n“Impacts of COVID-19 Uncertainty Visualizations”\nHadley Wickham\, RStudio\n“Recent Advances in the ggplot2 Ecosystem”\n\nSpecial thanks to Frontier Science Foundation
URL:https://ds.dfci.harvard.edu/event/2022-marvin-zelen-symposum-data-visualization/
CATEGORIES:Conference
ATTACH;FMTTYPE=image/png:https://ds.dfci.harvard.edu/wp-content/uploads/2022/02/zelen2022_wordpress2.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20220426T130000
DTEND;TZID=America/New_York:20220426T140000
DTSTAMP:20260428T050123
CREATED:20220105T135143Z
LAST-MODIFIED:20220429T134618Z
UID:3248-1650978000-1650981600@ds.dfci.harvard.edu
SUMMARY:Frontiers in Biostatistics: Tree-based Ensembling Strategies for Handling Heterogeneous Data
DESCRIPTION:Maya Ramchandran\nData Scientist\, ZephyrAI \nAbstract: Adapting machine learning algorithms to better handle clustering or other partition structure within training data sets is important across a wide variety of biological applications. We first consider the task of learning prediction models when multiple training studies are available. We present a novel weighting approach for constructing tree-based ensemble learners in this setting\, showing that incorporating multiple layers of ensembling in the training process by weighting trees increases the robustness of the resulting predictor and achieves superior performance to Random Forest. Next\, we broaden the scope of the problem to consider the effect of ensembling forest-based learners trained on clusters within a single data set with heterogeneity in the distribution of the features. We show that constructing ensembles of forests trained on estimated clusters determined by algorithms such as k-means results in significant improvements in accuracy and generalizability over the traditional Random Forest algorithm. We denote our novel approach as the Cross-Cluster Weighted Forest\, and display its robustness and accuracy across simulations and on cancer molecular profiling and gene expression data sets that are naturally divisible into clusters. Finally\, we provide theoretical support for these empirical observations by asymptotically analyzing linear least squares and random forest regressions under a linear model. In particular\, for random forest regression under fixed dimensional linear models\, our bounds imply a strict benefit of our ensembling strategy over classic Random Forest. \nYouTube Video. \nMaya Ramchandran recently completed her PhD at the Harvard Biostatistics department under the supervision of Dr. Giovanni Parmigiani\, where she developed machine learning ensembling strategies with applications to cancer prediction problems. She holds a BS in Applied Mathematics-Biology from Brown University and a Master of Music in Violin Performance from the New England Conservatory. She currently works as a data scientist at ZephyrAI\, a biotechnology startup that develops novel drug combination and repurposing treatments for oncology.
URL:https://ds.dfci.harvard.edu/event/frontiers-in-biostatistics-tree-based-ensembling-strategies-for-handling-heterogeneous-data/
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/jpeg:https://ds.dfci.harvard.edu/wp-content/uploads/2022/01/1595653826867.jpeg
END:VEVENT
END:VCALENDAR