Internship 8
Host name: Elham Azizi
School and Department: SEAS; Biomedical Engineering and Computer Science
Internship title: Guided Diffusion Models for Protein Design
Number of interns to be hosted: 1 (one)
Types of support offered:
- Stipend: $3,250 per month
- Access to Columbia University campus services, computational resources, and lab meetings
- Mentorship and supervision by faculty and senior lab members
- Immigration and visa assistance through Columbia’s International Students and Scholars Office (ISSO)
Internship description:
The intern will develop and test computational methods that adapt recent advances in diffusion-based generative modeling to the design of protein sequences. Specifically, the project will focus on incorporating guidance and steering mechanisms into diffusion models to optimize protein sequences for molecular properties such as binding affinity and interaction specificity. The intern will review prior literature (e.g., ProteinSGM), implement and experiment with diffusion-based models, and evaluate generated sequences against target objectives. This project sits at the intersection of applied mathematics, generative AI, and protein engineering, with opportunities to present results in group meetings and potentially contribute to a manuscript.
Skills required:
- Strong background in mathematics, including vector calculus and differential equations
- Solid foundation in statistics, particularly high-dimensional statistics
- Familiarity with stochastic differential equations (SDEs), score-based generative modeling, or diffusion models (preferred)
- Knowledge of generative AI methods, such as variational autoencoders (VAEs) or score-matching approaches
- Some background in natural language processing (NLP) methods is a plus
- Prior experience with biological datasets is not required, as the project focuses on computational methodology
Additional information:
This internship provides training in modern machine learning for scientific discovery, with mentoring tailored to strengthen both theoretical foundations and applied research skills. The project is highly interdisciplinary and will prepare the intern for future graduate-level research at the interface of AI and computational biology.