Senior Data Scientist & Machine Learning Engineer
I transform data into actionable insights and build intelligent solutions that drive business value
I'm a Senior Data Scientist at Ally Bank specializing in quantitative modeling and credit loss prediction. With a Master's degree in Data Science from Wayne State University and a background in Statistics from the University of Michigan, I bring both theoretical knowledge and practical expertise to complex data challenges.
I develop production-grade models that score billions of dollars in originations, build automated reporting systems, and deliver actionable insights to executive leadership. My work spans machine learning, statistical modeling, data engineering, and business intelligence.
Develop the structure and methodology for new credit loss scoring models. Designed and implemented a model for a smaller lease portfolio. Lead documentation and implementation with model governance teams.
Revamped the predicted credit loss process that scores $3B+ in monthly originations. Recalibrated models for improved accuracy. Delivered monthly executive reporting on loss KPIs.
Developed and maintained an internal logistic regression model for auto loan pricing. Delivered bi-weekly executive presentations using R, SAS, and Power BI. Automated weekly reporting and built data pipelines.
Developed Tableau dashboards enabling automated client reporting. Built data visualization solutions for Fortune 100 marketing campaigns.
Ensured the accuracy of automated data pipelines using Google DCM data. Performed cluster analysis on 300M+ rows using AWS to create national high-value audiences for a major mortgage company. Awarded best marketing campaign pitch for a Fortune 100 client.
Data Science and Business Analytics
December 2023
Double Major: Statistics & PPE (Political Science, Philosophy & Economics)
May 2020
A selection of production work I cannot share publicly.
NorthStar Care Community struggles to identify which hospice patients are nearing death. If a patient isn't flagged in time, Medicare/Medicaid reimbursement for end-of-life care is lost. Built a recall-optimized model using 20 years of EHR data to catch as many at-risk patients as possible, delivering a risk scorecard that nurses can reference to prioritize care and ensure no patient falls through the cracks.
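A minimal sketch of what recall-first thresholding can look like, assuming the model outputs a risk score per patient. This is an illustration, not the project's actual code: the function name `threshold_for_recall`, the target recall, and the toy scores are all hypothetical.

```python
def threshold_for_recall(scores, labels, target_recall=0.95):
    """Return (cutoff, precision) for the strictest cutoff that meets target_recall.

    scores: predicted risk scores, higher means higher risk (hypothetical inputs).
    labels: 1 for an at-risk patient, 0 otherwise.
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("need at least one positive label")
    if not 0 < target_recall <= 1:
        raise ValueError("target_recall must be in (0, 1]")

    # Walk cutoffs from strictest to loosest; the loosest flags everyone,
    # so recall reaches 1.0 and the loop always returns.
    for cutoff in sorted(set(scores), reverse=True):
        flagged = [y for s, y in zip(scores, labels) if s >= cutoff]
        true_pos = sum(flagged)
        if true_pos / positives >= target_recall:
            return cutoff, true_pos / len(flagged)


# Toy data, invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
cutoff, precision = threshold_for_recall(scores, labels, target_recall=1.0)
```

Lowering the cutoff until the recall target is met trades precision for coverage, which matches the goal of flagging as many at-risk patients as possible.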
Educational game developers need to know if students are learning, but traditional assessments interrupt the experience. Built models to predict whether students would answer questions correctly based solely on their in-game behavior (26M+ gameplay events), enabling real-time learning assessment without ever pulling students out of the game.
Kaggle competition to predict enzyme thermostability (Tm) for 2,413 mutations of a single wildtype protein. Feature engineering on protein sequences using the R bioseq package, AlphaFold2 PDB b-factor scores, and BLOSUM substitution matrices. Built an XGBoost and Random Forest ensemble across three target formulations, finishing in the top 50% of the leaderboard.
Classified sentiment across 1.6 million tweets using Logistic Regression, SVM, a bidirectional LSTM with GloVe embeddings, and VADER.
Used Monte Carlo simulation to compare the statistical power of the t-test vs. Wilcoxon-Mann-Whitney test under both normal and t-distributions, then applied permutation testing with 2,000 iterations to test whether states with major metropolitan areas recovered differently from the 2008 recession than rural states.
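The permutation step can be sketched roughly as follows. This is a generic two-sided permutation test on a difference in means, assuming that was the statistic of interest; the function name, the seed, and the sample values are hypothetical and are not the project's data.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_iter=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_iter):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_iter + 1)


# Invented recovery figures for two groups of states, for illustration only.
metro = [10, 11, 12, 13, 14, 15, 16, 17]
rural = [0, 1, 2, 3, 4, 5, 6, 7]
observed, p_value = permutation_test(metro, rural)
```

Because the test only shuffles labels, it makes no distributional assumption, which is why it pairs naturally with a comparison of the t-test and the Wilcoxon-Mann-Whitney test.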
I point my AI agent at a problem, give it some direction, and let it run. This repo is where those experiments live. Part collaboration, part chaos, all for fun. Stop by and see what we've been up to.
Have a project in mind? Let's work together!
Senior Data Scientist @ Ally Bank