Software and Platforms Team
The Software and Platforms Team provides scientific software services to DFCI investigators. Services include:
- Data Visualization and Exploratory Data Analysis (EDA) Tools
- AI Platform Development and Large Language Model (LLM) Integration
- Web Applications and Dashboards
- Cloud Computing and Data Processing at Scale
- UI/UX Design and Usability Testing
- Data Extraction and Transformation Pipelines
- Database Design and Implementation
- Data Modeling, Curation, Integration and Sharing
- Clinical and Pathology Workflow Systems
- Bioinformatics and Machine Learning Pipelines
For examples of our recent work, please see Example Software Products below.
Information for DFCI Investigators
If you are a DFCI investigator in need of software services for your projects, consider hiring through the Department of Data Science.
Why? We offer a unique approach to developing scientifically useful, scalable software products. Our approach is based on two core principles:
- Multi-disciplinary Teams: Research software requires expertise in a variety of fields including, but not limited to, biology, computational biology and bioinformatics, laboratory science, clinical practice and software engineering. We are a group of scientists, computational biologists, bioinformatics engineers and software developers. Typically, we form a small interdisciplinary team led by a scientist to work with you. In this way, we are able to produce thoughtful solutions that are informed by our collective experience and diverse backgrounds.
- Iterative Development: We develop iteratively, following the principles of Agile Software Development. After receiving initial feedback on a prototype, we aim to build a first product that demonstrates utility. Once that product is deployed, we continue cycles of user feedback, testing and further development, introducing improvements until the desired outcome is achieved.
Funding and Logistics
Funding for employee salary, benefits, and computing is provided by the investigator hiring into the team. We need budget information before we can honor a hiring request.
The Department of Data Science provides space, mentorship, training, and enrichment events. Assignment of team members to specific investigators and assignment of Data Science mentors will be determined by research activities and area(s) of expertise. The Department of Data Science also handles annual employee reviews, which are conducted jointly by the investigators each employee supports and the employee's Data Science mentor.
Curious to Get Started?
When we partner with a group or individual, we offer free design sprints to kick off a project. The 5-day design sprint includes mapping out goals and ideas and ends with a prototype. Our process is based on the Design Sprint Process used by Google Ventures. Most recently, we have used this process to develop products such as MatchMiner-AI and MatchMiner-dashboard with our clinical collaborators.
To get started, reach out to Ethan Cerami.
Example Software Products
The cBioPortal for Cancer Genomics is an open-access portal and open-source software platform that enables interactive, exploratory analysis and visualization of large-scale cancer genomics data sets. The platform is now developed and maintained across multiple cancer centers, including DFCI and Memorial Sloan Kettering Cancer Center.
MatchMiner is an open-source computational platform for matching patient-specific genomic profiles to cancer precision medicine trials. Via a recent grant from Meta and an ongoing collaboration with Dr. Kenneth Kehl, we are now adding AI and Large Language Model (LLM) support to MatchMiner.
Profile and ImmunoProfile: Profile is a next-generation sequencing (NGS) platform, developed jointly by Dana-Farber and Brigham and Women’s Hospital. ImmunoProfile is a histology-based multiplex tissue assay. Our group was integral to both platforms, where we focused on bioinformatics pipeline development, data modeling, software architecture and results delivery.
Data Modeling and Data Management: We have built multiple data platforms for research and clinical research operations. These platforms manage complex clinical data, single-cell sequencing data, spatial profiling data and clinical trials data.
