Work Experience
Senior Data Science Engineer, SimSpace, 2021 – 2023
- Trained Q-learning reinforcement learning (RL) agent for cybersecurity penetration and analyst training scenarios
- Visualized agent actions for rapid and intuitive evaluation of agent performance
- Redesigned agent architecture and persuaded team to adopt multi-agent vision
- Created decision-making framework with 4 adaptability types using multinomial-Dirichlet modeling for a low-data domain
- Rigorously tested framework production code and often practiced test-driven development
- Interviewed cybersecurity experts and mathematized their expertise to inform agent decisions
- Presented ~15 seminars on RL, active learning, cluster similarity, Plackett-Luce model, Bayesian modeling, etc.
- Conducted ~50 coding interviews for data science and engineering job candidates from intern to manager levels
Principal Data Scientist, Geneia, 2018 – 2021
- Domains: hierarchical condition categories, COVID-19, unplanned readmissions, social determinants of health, pharmaceutical adverse events, care management, workforce allocation, medical vocabulary mapping
- Methods: regression, binary and multi-label classification, Bayesian hierarchical modeling, SHAP, supervised clustering
- Rapidly devised and implemented predictive model to address a sudden, crucial shift in business needs, despite lack of data
- Researched, designed, and implemented multi-stage solution for important client revenue problem
- Created customized metrics for comparing ROI of 2 different modeling approaches
- Consulted cross-functionally to align project outputs with client needs
- Documented projects and answered client questions about modeling approaches and AI social bias
- Evaluated, mapped, or created data resources to accelerate data science team productivity and inform clients
- Initiated data science seminar series
- Successfully encouraged 7 other data scientists to present seminars
- Presented 80+ seminars on neural networks, embeddings, interpretability, statistics, study design, causal inference, etc.
- Created internal Python package of data science utilities
- Advised colleagues on clustering evaluation, sampling, data types, virtual environments, modularized coding, etc.
- Wrote company blog post "Interpretability and the promise of healthcare AI"
- Interviewed for company podcast
Insight Health Data Science Fellow, 2018
- Predicted hospital-acquired infection scores used in determining Medicare payment rates for hospitals
- Modeling included linear regression and random forests with imputation for missing data
- Deployed web application to deliver predicted scores to hospital administrators
Medical Science Liaison (MSL), Rheumatology, Bristol-Myers Squibb, 2014 – 2016
- Territory: southeast Texas, Louisiana
- Discussed company research, company pipeline, clinical practice, and basic immunology with physicians and other healthcare providers (HCPs) at one-on-one and group presentations
- Coordinated MSL support of Phase II, III, and IV clinical trials for marketed (Orencia / abatacept) and investigational compounds in rheumatoid arthritis, lupus, psoriatic arthritis, and scleroderma
- Developed slide deck and used it to train MSL team on clinical trial responsibilities, rules and regulations, and design
- Evaluated potential sites for Phase II, III, and IV clinical trials in rheumatology
- Co-mentored and trained new MSL on marketed product and effective communication with HCPs
- Trained sales team members in basic and clinical immunology
- Co-developed training materials for MSL team
Medical Science Liaison (MSL), Neurology, EMD Serono, 2012 – 2014
- Territory: Pennsylvania, upstate New York, Delaware
- Established ongoing dialogues with Key Opinion Leaders (KOLs) concerning clinical practice, the therapeutic landscape, scientific advances, and company pipeline in multiple sclerosis
- Presented information to patients about clinical trials of approved products (Rebif / interferon β-1a)
- Developed and updated company materials for presentation to KOLs and other HCPs
- Reported medical developments and competitive intelligence learned from KOLs and congresses
- Educated healthcare providers about the disease state and scientific issues using approved materials
- Disseminated information on investigator-initiated and other grant programs sponsored by the company
- Strategically selected and nominated KOLs with appropriate expertise for advisory boards
- Evaluated potential sites for Phase III and Phase IV clinical trials in neurology
- Identified and trained 4 speakers; organized 13 speaker programs
- 2013 President’s Award for overall performance
Certification
Amazon Web Services (AWS) Training and Certification, AWS Certified Cloud Practitioner, 2021 – 2027
Education
University of Pennsylvania, Philadelphia, PA, Ph.D., Neuroscience, 2011
- Characterized development of social behaviors in a rodent model relevant to autism and identified brain structures and neurotransmitters that influence these behaviors
- Taught myself and implemented analyses in robust statistics, model selection, hierarchical models, and intraclass correlation, as well as classical hypothesis testing and linear regression
- Statistically analyzed and visualized data with custom R scripts
- Collaborated with statistician to re-analyze and interpret archival data
- Wrote and published 4 peer-reviewed scientific articles
- Presented thesis data at 3 university seminars (30 – 100 attendees) and in 6 posters at scientific conferences
- Simultaneously coordinated up to 3 major projects in different phases (planning, data collection, writing)
- R packages included: ggplot2, lattice (graphing, exploratory analysis); lme4 (linear mixed effects models); irr (intraclass correlations); psych (descriptive statistics); functions from Wilcox 2005 (robust statistics)
Coursera, Johns Hopkins University Data Science Specialization (10 online courses), 2015
- Capstone project: Distinguishing reviews about conventional & alternative medicine using textual analysis
- Developed decision tree to predict whether Yelp reviews described conventional or alternative medicine
- Courses included: R Programming, Regression Models, Practical Machine Learning, Developing Data Products
- R packages/tools included: plyr, dplyr (data munging); caret (machine learning); qdap, tm (textual analysis); jsonlite (reading JSON data); RMySQL (reading SQL data); data.table (fast data processing); shiny (interactive applications); knitr (documentation, reproducible research); RStudio
Coursera, Stanford University Machine Learning (online course), 2016
- Linear and logistic regression, neural networks, support vector machines, cluster analysis, principal components analysis
Coursera, University of Washington Machine Learning Specialization (4 online courses), 2016
- Linear and logistic regression, support vector machines, cluster analysis, principal components, latent Dirichlet allocation
- Python packages/tools included: NumPy, SciPy, pandas, scikit-learn, Jupyter, Anaconda, virtualenv, GraphLab Create
Coursera, Deep Learning Specialization (5 online courses), 2017
- Neural networks, convolutional neural networks, recurrent neural networks
Coursera, Amazon Web Services (AWS) Fundamentals Specialization (4 online courses), 2021
- Security, migrating to the cloud, building serverless applications
Coursera, Data Engineering with Google Cloud Specialization (6 online courses), 2021
- Data lakes/warehouses, batch pipelines, streaming analytics, machine learning