Overview

Experience and Education

Senior Data Science Engineer, SimSpace

Principal Data Scientist, Geneia

Medical Science Liaison (MSL), Rheumatology, Bristol-Myers Squibb

Medical Science Liaison (MSL), Neurology, EMD Serono

University of Pennsylvania, Philadelphia, PA, Ph.D., Neuroscience


LinkedIn

GitHub

Publications

Thesis Lab


Professional Outlets

AI and Bias In Healthcare – a video discussion about social bias in artificial intelligence and how to address it

AI interpretability is especially critical in healthcare – a blog post about model interpretability

Model interpretability and healthcare – highlights from a podcast about data science, model interpretability, COVID-19, and healthcare


Personal Projects

State-Space Models: Learning the Kalman Filter – Different research fields may speak different mathematical languages. There’s nothing like rigorous software testing for accurate translation. Go here for the code.

Beyond Point Estimates – When we need to predict more than just a mean or a median, full posterior distributions from Bayesian models are often the way to go. But sometimes, that’s too computationally intensive and we need some shortcuts. Quantile regression is a handy alternative. For even more efficiency, we can use multi-task learning so that a single model produces all the quantiles we want. Go here for the code.

Weather and climate API – Using mock testing and FastAPI to query, create, and test web APIs. Go here for the code.

Pandas vs. Polars, Python vs. Rust: Who will win? – Benchmarks are nice, but how fast are our favorite data tools on realistic data workflows? Go here for the code.

Bayesian Updating with a Beta-Binomial Model: Basketball Edition – We start the season thinking our team is this good (or bad). But as the wins and losses pile up, how do we update our priors? Go here for the code.

Bayesian Updating with a Dirichlet-Multinomial Model: Visualizing More Outcomes – As we add outcomes to our model, the concepts stay the same but the dynamics grow more complex. Viewing animations of the model can help us develop intuitions about how it works. Go here for the code.

Investment Performance Metrics Dashboard – Plotly Dash app for tracking profit/loss and other investment performance per transaction or over time. Go here for the code.

Monitoring Data Pipelines with Airflow and Tcl/Tk – Airflow is terrific for scheduling and monitoring data pipeline components. But we also want to monitor, in real time, what’s happening inside those components. Go here for the code.

Add Columns to Polars Dataframes Quickly – There are straightforward, slow ways to do things, and then there are faster ways. Know how to choose. Go here for the code.

Deep Reinforcement Learning and Rainbow – How does a computer learn to play video games?

Information Theory for Toddlers – A low-entropy bedtime story

SHAP Tutorial – How do we use Shapley values to interpret machine learning models? Go here for the code.

Case Study: How to Translate a Healthcare Problem into a Predictive Modeling Problem – How do we correctly select cases for our training data?

The Peanuts Project – Charlie Brown, Snoopy, Lucy, Linus . . . who was the most important character? Which of their relationships was the strongest? Indulge some nostalgia and hum some Guaraldi!

Classifying Medicine – How do patients experience conventional and alternative medicine differently? Yelp, random forests, ROC curves, and so much more!
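
For readers curious about the Beta-Binomial updating mentioned in the basketball project above, here is a minimal sketch of the core idea. It is not the project's actual code: the class name BetaPrior, the starting prior of Beta(5, 5), and the 7-3 record are all hypothetical values chosen for illustration. It relies only on the standard conjugacy result that a Beta(alpha, beta) prior combined with binomial wins and losses gives a Beta(alpha + wins, beta + losses) posterior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPrior:
    """Beta distribution over a team's win probability (hypothetical helper)."""

    alpha: float  # pseudo-count of wins
    beta: float   # pseudo-count of losses

    def update(self, wins: int, losses: int) -> "BetaPrior":
        # Conjugate update: add observed wins and losses to the pseudo-counts.
        return BetaPrior(self.alpha + wins, self.beta + losses)

    @property
    def mean(self) -> float:
        # Expected win probability under the current belief.
        return self.alpha / (self.alpha + self.beta)


# Assumed preseason belief: a roughly average team, held with modest confidence.
prior = BetaPrior(alpha=5.0, beta=5.0)

# Assumed record after ten games: 7 wins, 3 losses.
posterior = prior.update(wins=7, losses=3)

print(f"prior mean: {prior.mean:.3f}")          # 0.500
print(f"posterior mean: {posterior.mean:.3f}")  # 0.600
```

A larger alpha + beta in the prior means the early-season record moves the estimate less, which is one way to encode how strongly we hold our preseason opinion.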


Recent posts

Mostly links to interesting articles that I have been reading:

  • What I Read: impossible languages
    https://www.quantamagazine.org/can-ai-models-show-us-how-people-learn-impossible-languages-point-a-way-20250113/ Can AI Models Show Us How People Learn? Impossible Languages Point a Way. Ben Brubaker, 1/13/25 11:00 AM. “Certain grammatical rules never appear in any known language. By constructing artificial languages…”
  • What I Read: Statistical Intuitions
    https://arxiv.org/abs/2409.18842 Classical Statistical (In-Sample) Intuitions Don’t Generalize Well: A Note on Bias-Variance Tradeoffs, Overfitting and Moving from Fixed to Random Designs. Alicia Curth, 27 Sep 2024. “…we show that classical intuitions relating…”
  • What I Read: transfer learning
    https://lunar-joke-35b.notion.site/Transfer-Learning-101-133ba4b6a3fa800e8cede11ee3f1c1cd Transfer Learning 101. Himanshu Dubey, Nov 5, 2024. “Let’s understand Transfer Learning in greater detail.”
  • What I Read: Model Merging
    https://planetbanatt.net/articles/modelmerging.html Model Merging and You. Eryk Banatt, August 2024. “Model Merging is a weird and experimental technique which lets you take two models and combine them together to get a new model.”
  • What I Read: optimizing softmax
    https://maharshi.bearblog.dev/optimizing-softmax-cuda Learning CUDA by optimizing softmax: A worklog. Maharshi Pandya, 04 Jan, 2025. “Optimizing softmax, especially in the context of GPU programming with CUDA, presents many opportunities for learning.”
  • What I Read: agents
    https://huyenchip.com/2025/01/07/agents.html Agents. Chip Huyen, Jan 7, 2025. “This section will start with an overview of agents and then continue with two aspects that determine the capabilities of an agent: tools and planning….”
  • What I Read: Age of Data
    https://amatria.in/blog/ageofdata The end of the “Age of Data”? Enter the age of superhuman data and AI. Xavier Amatriain, December 24, 2024. “This post will argue that the age of data is far…”
  • What I Read: ScyllaDB
    https://medium.com/@abdurohman/mind-blowing-postgresql-meets-scylladbs-lightning-speed-and-monstrous-scalability-7dcda1eb1cea Mind-blowing: PostgreSQL Meets ScyllaDB’s Lightning Speed and Monstrous Scalability. Abdurohman, December 24, 2024. “While PostgreSQL excels in many use cases, our experience shows that for write-heavy, high-scale operations, the distributed architecture…”
  • What I Read: Building agents
    https://www.anthropic.com/research/building-effective-agents Building effective agents. Erik Schluntz and Barry Zhang, Dec 19, 2024. “Over the past year, we’ve worked with dozens of teams building large language model (LLM) agents across industries. Consistently, the…”
  • What I Read: Shapley Interactions
    https://mindfulmodeler.substack.com/p/what-are-shapley-interactions-and What Are Shapley Interactions, and Why Should You Care? Christoph Molnar, Dec 03, 2024. “Shapley values are the go-to method for explainable AI because they are easy to interpret and theoretically…”

Browse posts