benford's comp. bio. research

The summer after my second year at Carnegie Mellon University, I was living in New Orleans and had been hired at Starbucks, but they wouldn't schedule me for training. As I had no money, I applied to a few labs that were looking for remote biology students or anyone with a little programming experience, since I'd just written a bunch of Python and JavaScript for Golan Levin's Drawing with Machines course (this is me). I heard back from only one lab, in a field I knew nothing about. I read one of their papers, interviewed, and joined the following week.

I worked for three years in the Mohimani Lab as an undergraduate research assistant, then for one year as a research associate in the same lab. I spent most of my time working on things related to seq2RiPP, a genome-mining pipeline, written in Rust, for predicting ribosomally synthesized and post-translationally modified peptides (RiPPs). The first thing I did for the lab was collect, fix, and otherwise implement chemical modifications for graph-based molecular data. These modifications simulate the activity of biosynthetic enzymes identified via HMM search. I've since done a fair amount of other large-scale data collection and validation, but the first time was probably the most fun. Over the course of the next three years I picked up pretty much everything I know about computer science and working with computers generally.

As for my biology background, I had a really good high school biology teacher, Nhan Pham, who encouraged me to pursue biology. I didn't listen: I applied to every school for environmental engineering. I ended up back in biology anyway, which is a good thing, because I would have hated environmental engineering. I took a fair number of wet lab courses while at CMU, but my favorites were definitely Organic Synthesis and Analysis and Experimental Techniques in Molecular Biology.

In the past year I was accepted into the Joint Carnegie Mellon-University of Pittsburgh Ph.D. Program in Computational Biology (CPCB). I'm just starting my second semester and non-rotation research.

I did my first rotation with the same lab I'd been working in, to wrap up some work implementing parts of the command-line interface for seq2NRP and seq2PKS (the same pipeline for two different classes of natural products). There were also a good number of other bug fixes specifically for the CLI, related to spectral networking and molecule-spectrum match p-value I/O.

My second rotation was in the Robin E. C. Lee Lab, where I worked on a project involving the binding affinity of interleukin-1 beta (IL-1B) for its receptor. It's largely a wet lab, so I spent most of the first couple of weeks getting up to speed on specific wet lab techniques. I took care of my own plate of U2OS cells and made some fluorescence microscopy movies of formation. Then, as the project picked up, I had to learn all about molecular dynamics simulations in order to predict the binding affinities of different species' IL-1B to the human receptor complex. The eventual goal is to relate these predictions to what we observe in fluorescence microscopy when stimulating these U2OS cells with the different species' cytokines. My rotation has ended, but the work is still ongoing and I'm having fun with it.

My third rotation was in the Koes Lab. I worked on a machine-learned offset for interatomic distances to make k-nearest-neighbors more attuned to important atom relationships. The task I'm training it for is protein-ligand scoring, but the method itself is somewhat general. As this rotation was during final exams and final projects, I was pretty much only able to re-implement the previous version of the model and get it training. Regardless, I learned a lot and enjoyed the work.

Now, I've been matched to both Robin E. C. Lee and David Koes, who will be my co-advisors for the next five or however many years. I'm super excited.

Publications

A Hyun Kim, Benjamin Krummenacher, Jason Yeung, David Ryan Koes, Robin E. C. Lee. NEMO recruitment at single cytokine-receptor complexes shows quantized dynamics independent of ligand affinity. bioRxiv 2025.04.18.649561; DOI: 10.1101/2025.04.18.649561.
Abhinav K. Adduri, Andrew T. McNutt, Caleb N. Ellington, Krish Suraparaju, Nan Fang, Donghui Yan, Benjamin Krummenacher, Sitong Li, Camilla Bodden, Eric P. Xing, Bahar Behsaz, David Koes, Hosein Mohimani. Interpretable adenylation domain specificity prediction using protein language models. Preprint (2025). DOI: 10.1101/2025.01.13.632878.
Yi-Yuan Lee, Mustafa Guler, Desnor N. Chigumba, Shen Wang, Neel Mittal, Cameron Miller, Benjamin Krummenacher, Haodong Liu, Liu Cao, Aditya Kannan, Keshav Narayan, Samuel T. Slocum, Bryan L. Roth, Alexey Gurevich, Bahar Behsaz, Roland D. Kersten, and Hosein Mohimani. HypoRiPPAtlas as an Atlas of hypothetical natural products for mass spectrometry database search. In: Nature Communications 14.4219 (2023). DOI: 10.1038/s41467-023-39905-4.
