AMG Lab

Eran Halperin, PhD

Computer Science Department, Courant Institute, NYU

Division of Precision Medicine, Langone, NYU

Eran Halperin's CV →

AI in Medicine & Genomics Lab

Our lab develops machine learning models and statistical approaches to improve detection and treatment of human disease. Our work spans different modalities, including genomic data, medical imaging, electronic health records, and physiological waveforms.

Selected Publications

  • 2025 Unico: a unified model for cell-type resolution genomics from heterogeneous omics data. Genome Biology.
  • 2025 Memorize and Rank: Elevating Large Language Models for Clinical Diagnosis Prediction. AAAI.
  • 2024 Accurate prediction of disease-risk factors from volumetric medical scans by a deep vision model pre-trained with 2D scans. Nature Biomedical Engineering.
  • 2023 Understanding and Predicting the Effect of Environmental Factors on People with Type 2 Diabetes. CHIL.
  • 2022 Methylation risk scores are associated with a collection of phenotypes within electronic health record systems. npj Genomic Medicine.
  • 2022 Extend and Explain: Interpreting Very Long Language Models. ML4H@NeurIPS.
  • 2021 Imputation of the continuous arterial line blood pressure waveform from non-invasive measurements using deep learning. Scientific Reports.
  • 2021 Automated identification of clinical features from sparsely annotated 3-dimensional medical imaging. npj Digital Medicine.
  • 2020 Optimized design of single-cell RNA sequencing experiments for cell-type-specific eQTL analysis. Nature Communications.
  • 2020 Accurate estimation of cell composition in bulk expression through robust integration of single-cell information. Nature Communications.
  • 2020 Context-aware dimensionality reduction deconvolutes gut microbial community dynamics. Nature Biotechnology.
  • 2019 FEAST: fast expectation-maximization for microbial source tracking. Nature Methods.
  • 2019 Cell-type-specific resolution epigenetics without the need for cell sorting or single-cell biology. Nature Communications.
  • 2019 CONFINED: distinguishing biological from technical sources of variation by leveraging multiple methylation datasets. Genome Biology.
  • 2018 Detecting heritable phenotypes without a model using fast permutation testing for heritability and set-tests. Nature Communications.
  • 2018 BayesCCE: a Bayesian framework for estimating cell-type composition from DNA methylation without the need for methylation reference. Genome Biology.
  • 2017 Correcting for cell-type heterogeneity in DNA methylation: a comprehensive evaluation. Nature Methods.
  • 2016 Sparse PCA corrects for cell type heterogeneity in epigenome-wide association studies. Nature Methods.
  • 2016 Fast and accurate construction of confidence intervals for heritability. The American Journal of Human Genetics.
  • 2015 A genetic and socioeconomic study of mate choice in Latinos reveals novel assortment patterns. PNAS.
  • 2013 Identifying personal genomes by surname inference. Science.
  • 2012 A model-based approach for analysis of spatial structure in genetic data. Nature Genetics.
  • 2010 Genome-wide association study of follicular lymphoma identifies a risk locus at 6p21.32. Nature Genetics.
  • 2009 Genomic privacy and limits of individual detection in a pool. Nature Genetics.
  • 2009 Maximizing power in association studies. Nature Biotechnology.
  • 2008 Estimating local ancestry in admixed populations. The American Journal of Human Genetics.
Full publication list on Google Scholar →

Research

Computational Genomics • Machine Learning in Medicine • Deep Learning • Algorithms • Epigenomics • Single-Cell Analysis • Clinical AI

Cell-Type Specific & Clinical Genomics

We develop deconvolution and dimensionality reduction methods for analyzing methylation and RNA expression data at cell-type resolution, working on bulk tissue samples without requiring cell sorting or single-cell biology. We also develop methods for microbiome data, including microbial source tracking and community analysis. Examples of our work include TCA and ReFACTor (cell-type deconvolution of methylation data), Bisque (RNA deconvolution), and FEAST (microbial source tracking).


Data types: methylation, RNA expression, single-cell/nucleus analysis, microbiome.
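As a rough illustration of what deconvolution of bulk data means, the sketch below treats a bulk profile as a weighted mix of known cell-type signatures and recovers the weights with multiplicative updates for non-negative least squares. This is a generic textbook technique, not the method used by TCA, ReFACTor, or Bisque; the function name `estimate_proportions` and the toy numbers are hypothetical.

```python
def estimate_proportions(signatures, bulk, iterations=500):
    """Estimate cell-type proportions from one bulk profile.

    signatures[g][k]: level of feature g in cell type k (non-negative).
    bulk[g]: observed bulk level of feature g (non-negative).
    Hypothetical helper for illustration only.
    """
    n_features = len(signatures)
    n_types = len(signatures[0])

    # Precompute S^T b and S^T S, the only quantities the updates need.
    stb = [sum(signatures[g][k] * bulk[g] for g in range(n_features))
           for k in range(n_types)]
    sts = [[sum(signatures[g][i] * signatures[g][j] for g in range(n_features))
            for j in range(n_types)]
           for i in range(n_types)]

    # Start from equal weights; multiplicative updates keep them non-negative.
    w = [1.0 / n_types] * n_types
    for _ in range(iterations):
        denom = [sum(sts[k][j] * w[j] for j in range(n_types))
                 for k in range(n_types)]
        w = [w[k] * stb[k] / denom[k] if denom[k] > 0 else 0.0
             for k in range(n_types)]

    # Rescale so the weights can be read as proportions.
    total = sum(w)
    return [v / total for v in w]


# Toy example: 4 features, 2 cell types, bulk mixed at 30% / 70%.
signatures = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.1, 0.9]]
true_w = [0.3, 0.7]
bulk = [sum(s * w for s, w in zip(row, true_w)) for row in signatures]
proportions = estimate_proportions(signatures, bulk)
```

In this noise-free toy case the estimate should approach the mixing weights used to build `bulk`; real data would need noise handling and, for reference-free settings, a different model altogether.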

Clinical AI

We build computational frameworks that support clinical decision-making in ophthalmology, anesthesiology, and acute medicine. Our approaches combine statistical methods with modern deep learning architectures applied to medical imaging, electronic health records, and physiological waveforms.


Data types: OCT, MRI, ultrasound, CT, EHR, ECG, PPG, arterial blood pressure waveforms.

Members

Halperin Lab Team
Eran Halperin, PhD
Professor
Dept. of Computer Science, Courant Institute
Division of Precision Medicine, NYU Langone
Google Scholar
Michal Sadowski, PhD
Post-doctoral Researcher
Google Scholar
Zeyuan (Johnson) Chen
PhD Student
Google Scholar
Ulzee An
PhD Student
Google Scholar

Software and Repositories

FEAST

Fast expectation-maximization for microbial source tracking.

GitHub →
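To give a feel for the expectation-maximization idea behind source tracking, here is a minimal sketch that estimates how much each known source contributes to a sink sample. It is a simplified textbook EM for mixing proportions under stated assumptions (source distributions are known and fixed, no unknown source), not FEAST's actual algorithm or API; `em_source_proportions` and the toy data are hypothetical.

```python
def em_source_proportions(sink_counts, sources, iterations=200):
    """Estimate mixing proportions of known sources in a sink sample.

    sink_counts[j]: observed count of taxon j in the sink.
    sources[k][j]: relative abundance of taxon j in source k (rows sum to 1).
    Hypothetical helper for illustration only.
    """
    n_sources = len(sources)
    n_taxa = len(sink_counts)
    alpha = [1.0 / n_sources] * n_sources  # start from equal contributions

    for _ in range(iterations):
        new_alpha = [0.0] * n_sources
        for j in range(n_taxa):
            if sink_counts[j] == 0:
                continue
            # Probability of taxon j under the current mixture.
            mix = sum(alpha[k] * sources[k][j] for k in range(n_sources))
            if mix == 0:
                continue
            # E-step: split the counts of taxon j among sources by responsibility.
            for k in range(n_sources):
                new_alpha[k] += sink_counts[j] * alpha[k] * sources[k][j] / mix
        # M-step: proportions are the normalized expected counts per source.
        total = sum(new_alpha)
        alpha = [a / total for a in new_alpha]
    return alpha


# Toy example: two sources over three taxa, sink mixed at 25% / 75%.
sources = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
sink_counts = [250, 275, 475]
alpha = em_source_proportions(sink_counts, sources)
```

The toy counts are exactly proportional to a 25% / 75% mixture of the two sources, so the estimate should settle near those values.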

ReFACTor

Corrects for cell-type heterogeneity in epigenome-wide association studies (EWAS).

GitHub →

TCA

Deconvolution method for cell-type specific analysis of DNA methylation data.

GitHub →

Bisque

Inference of cell-type composition in heterogeneous tissues using RNA-seq data.

GitHub →

MTV-LMM

Analysis and prediction of temporal microbiome data using linear mixed models.

GitHub →

CONFINED

Distinguishes biological from technical sources of variation using multiple methylation datasets.

GitHub →

MRS Weights

Methylation risk scores (MRS) derived from UCLA electronic health records, predicting associations with medications, lab results, and medical conditions from DNA methylation patterns.

GitHub →

ALBI & FIESTA

ALBI estimates the distribution of heritability estimators using a bootstrap approach. FIESTA is its faster successor, constructing accurate confidence intervals for heritability using stochastic approximation.

GitHub →

Unico

Decomposes bulk genomic data into cell-type-specific components, representing samples as a 3D tensor to enable cell-type resolution across diverse genomic assays.

GitHub →

SLIViT

A data-efficient deep learning framework for measuring disease-related risk factors in volumetric biomedical imaging scans (MRI, OCT, ultrasound, CT).

GitHub →

GLINT

User-friendly command-line tool for fast genome-wide DNA methylation (EWAS) analysis. Includes ReFACTor, EPIStructure, LMM association testing, and reference-based cell-type estimation.

GitHub →

GEVALT

Selects the most predictive tag SNPs. Maintained by Ron Shamir's group.

RECYCLER

Detects plasmids from de-novo assembly graphs. Maintained by Ron Shamir's group.

SecureGenome • SEQEM • CAMP • WHAP • LOCO-LD • SPA • BARCODE • LAMP • LAMP-LD

Contact

Eran Halperin
Professor, Department of Computer Science
Courant Institute of Mathematical Sciences
Research Professor, Division of Precision Medicine, NYU Langone Health

Email: first + last @nyu.edu

Google Scholar: Profile

Funding

We thank the National Science Foundation and the National Institutes of Health for their current support. We also thank the Israel Science Foundation, the German-Israeli Science Foundation, IBM, the Blavatnik Research Foundation, the Juludan Research Foundation, the National Institutes of Health, and The Edmond J. Safra Center for Bioinformatics for their past support, which we hope will continue in the future.