Extended Resume

Tahmeed Tureen

Data Scientist - Senior Associate at KPMG US



*If you're curious about the hidden metrics & sensitive information on this page, please contact me directly via email or LinkedIn

Executive Summary

Senior Data Scientist with 5+ years of industry experience working on fast-moving, high-impact projects related to product pricing, fraud detection, customer demand, supply chain, & healthcare analytics. Proven record of translating complex models into executive-level strategies, influencing large-scale revenue and client pursuits at KPMG. Proficient in Python, SQL, R, and modern dashboard tools like PowerBI and Streamlit.


Professional Experience

Senior Data Scientist, KPMG | Chicago, IL & Ann Arbor, MI (Hybrid)

Timeline: Sep 2022 - Present

Client(s): KPMG Go-To-Market Teams

  • Implemented a hierarchical multi-agent AI orchestration system using LangChain, Python, SQL, and Streamlit, enabling dynamic interaction between users and KPMG-owned industry & company financial data and documents

  • Built a supervisor-agent architecture, where a central interface “supervisor” agent routes queries to domain-specific supervisor agents, each managing their own specialized task agents (e.g., data retrieval, document summarization, & company screening); a minimal routing sketch follows this list

  • Deployed the AI framework via a Streamlit frontend, enabling KPMG sales engineers & analysts to explore proprietary datasets and accelerate client and project proposal development with context-aware AI agent workflows

  • Applied Latent Dirichlet Allocation (LDA) in Python to perform topic modeling across large sets of transcript headlines, uncovering key topic clusters that informed the design of filter options and improved the user experience within the AI-enabled analytics dashboard (see the topic-modeling sketch below)
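
The supervisor-agent pattern above reduces to a routing layer. Below is a framework-agnostic sketch of that pattern; the production system used LangChain, and all agent, task, and query names here are illustrative assumptions.

```python
# Framework-agnostic sketch of the supervisor-agent routing pattern.
# The production system was built with LangChain; names are illustrative.
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class Supervisor:
    """Routes a query to the registered task agent that matches a task key."""
    name: str
    agents: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def register(self, task: str, agent: Callable[[str], str]) -> None:
        self.agents[task] = agent

    def route(self, task: str, query: str) -> str:
        # In LangChain this dispatch would be an LLM-driven router;
        # a plain dictionary lookup stands in for that decision here.
        if task not in self.agents:
            return f"{self.name}: no agent registered for task '{task}'"
        return self.agents[task](query)

# Domain supervisor managing its own specialized task agents
financials = Supervisor("financial-data")
financials.register("retrieve", lambda q: f"rows matching: {q}")
financials.register("summarize", lambda q: f"summary of: {q}")

# Central interface supervisor routing queries to domain supervisors
interface = Supervisor("interface")
interface.register("financials", lambda q: financials.route("retrieve", q))

print(interface.route("financials", "FY24 revenue by segment"))
```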
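
And a minimal sketch of the topic-modeling step, using scikit-learn's LDA; the headlines, vectorizer settings, and topic count are assumptions for illustration, not the production configuration.

```python
# Hedged sketch of LDA topic modeling over transcript headlines (scikit-learn)
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

headlines = [
    "Q3 earnings call: margin pressure in consumer segment",
    "CEO discusses supply chain normalization and pricing",
    "Guidance raised on strong healthcare services demand",
]

# Bag-of-words counts; LDA expects raw term counts, not TF-IDF
vectorizer = CountVectorizer(stop_words="english", min_df=1)
counts = vectorizer.fit_transform(headlines)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # per-headline topic mixtures

# Top terms per topic -> candidate labels for dashboard filter options
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"topic {k}: {top}")
```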

Client(s): KPMG Healthcare & Life Sciences Go-To-Market Teams

  • Serve as co-lead data scientist and database manager for KPMG’s healthcare and life sciences teams, owning end-to-end data workflows using SQL and Python to support client strategy and analytics

  • Implemented multiple scalable, modular SQL & Python scripts to manipulate electronic health claims data and address specific research questions on Snowflake; results have been adopted across 30+ client projects, delivering actionable insights to healthcare professionals and influencing multi-million-dollar revenue outcomes (a query sketch follows this list)
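
A stripped-down sketch of the modular query pattern, assuming the snowflake-connector-python client with its pandas extras; the credentials, database, table, and column names are placeholders, not the real schema.

```python
# Illustrative sketch of one modular claims query against Snowflake
import snowflake.connector

QUERY = """
SELECT diagnosis_code, COUNT(DISTINCT member_id) AS members
FROM claims.public.medical_claims        -- hypothetical table
WHERE service_date >= '2023-01-01'
GROUP BY diagnosis_code
ORDER BY members DESC
LIMIT 20
"""

def run_claims_query(sql: str):
    # Real credentials would come from a secrets manager, never source code
    conn = snowflake.connector.connect(
        account="my_account", user="my_user", password="***",
        warehouse="analytics_wh", database="claims",
    )
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetch_pandas_all()  # requires the pandas/pyarrow extras
    finally:
        conn.close()

df = run_claims_query(QUERY)
print(df.head())
```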

Client(s): Small Budget-Airline & KPMG Audit Team

  • Led the development of a dynamic time series regression and quasi-experimental modeling pipeline to estimate the causal impact of route strategies on ticket sales for a budget airline, integrating client data with external U.S. flight data from the Bureau of Transportation Statistics (BTS); a modeling sketch follows this list

  • Delivered findings via an interactive PowerBI dashboard, enabling business & audit analysts to run A/B-style scenario simulations and explore counterfactual outcomes for ticket sale optimization
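
A minimal sketch of the quasi-experimental idea behind this pipeline, framed as an interrupted time-series regression; the weekly data, change date, and BTS control column are all made up for illustration.

```python
# Interrupted time-series sketch for a route-strategy change (statsmodels)
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=104, freq="W")
df = pd.DataFrame({
    "date": dates,
    "t": np.arange(104),                          # weekly time trend
    "post": (dates >= "2022-01-01").astype(int),  # route change in effect
    "bts_market_flights": rng.normal(100, 10, 104),  # external BTS control
})
# Synthetic outcome with a built-in level shift after the change
df["tickets"] = (500 + 2 * df["t"] + 60 * df["post"]
                 + 0.5 * df["bts_market_flights"] + rng.normal(0, 20, 104))

# Segmented regression: 'post' estimates the jump in weekly ticket sales,
# 't:post' the slope change, controlling for trend and market-level flights
model = smf.ols("tickets ~ t + post + t:post + bts_market_flights",
                data=df).fit()
print(model.params[["post", "t:post"]])
```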

Client(s): KPMG Supply Chain & Procurement Go-To-Market Teams

  • Developed a novel supply chain “stress” metric in collaboration with domain experts, using latent variable analysis, time series forecasting, and feature engineering in R and Python to quantify systemic disruptions across U.S. supply chains (a latent-index sketch follows this list)

  • Managed a cross-functional team within an Agile (JIRA-based) framework to deliver a scalable, auto-refreshing analytics pipeline

  • The metric was published at the ASCM Conference and subsequently reused across multiple KPMG client pursuits, demonstrating its strategic value for answering critical operational and financial questions
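
One way the latent “stress” construct can be operationalized is as the first principal component of standardized disruption indicators; the sketch below takes that approach, with invented indicator names and data, and the production metric involved more than PCA.

```python
# Latent stress index from observable supply chain indicators via PCA
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "port_dwell_days": rng.gamma(2.0, 1.5, 120),
    "freight_rate_idx": rng.normal(100, 15, 120),
    "supplier_lead_time": rng.normal(30, 5, 120),
    "inventory_to_sales": rng.normal(1.3, 0.1, 120),
}, index=pd.date_range("2015-01-01", periods=120, freq="MS"))

# Standardize, then take the first principal component as the latent factor
z = StandardScaler().fit_transform(df)
stress = pd.Series(PCA(n_components=1).fit_transform(z).ravel(),
                   index=df.index, name="stress")

# Rescale to a 0-100 index for dashboarding
stress_idx = 100 * (stress - stress.min()) / (stress.max() - stress.min())
print(stress_idx.tail())
```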

Data Scientist, KPMG | Chicago, IL (Hybrid)

Timeline: Sep 2020 - Aug 2022

Client: Mid-sized Regional Insurance Company

  • Improved KPMG’s homeowners insurance underwriting pipeline by building and validating generalized additive models (GAMs) and GLM-based regressions (Poisson, Gamma, Tweedie) to estimate loss-cost and pure premiums across various perils, integrating external environmental features to improve predictive accuracy (a GLM sketch follows this list)

  • Designed these models to mirror causal frameworks by simulating counterfactual scenarios (e.g., renovations to a home, age of roofing, and distance to the nearest fire station)

  • Automated data extraction and feature engineering workflows using R and internal APIs, enabling efficient, repeatable updates to the pricing models and delivering insights via a stakeholder-facing PowerBI dashboard
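
A minimal Tweedie GLM sketch in the spirit of the pure-premium models above, using statsmodels; the features, variance power, and synthetic data are assumptions, and the real work also used GAMs and other GLM families.

```python
# Tweedie GLM for a zero-inflated, right-skewed loss-cost response
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 1000
df = pd.DataFrame({
    "roof_age": rng.integers(0, 40, n),
    "dist_fire_station_km": rng.gamma(2.0, 1.5, n),
    "renovated": rng.integers(0, 2, n),
})
# Synthetic loss cost: mostly zeros with occasional positive claims
mu = np.exp(3 + 0.03 * df["roof_age"] + 0.05 * df["dist_fire_station_km"]
            - 0.3 * df["renovated"])
df["loss_cost"] = np.where(rng.random(n) < 0.1,
                           mu * rng.gamma(1.0, 1.0, n), 0.0)

X = sm.add_constant(df[["roof_age", "dist_fire_station_km", "renovated"]])
# var_power between 1 and 2 handles exact zeros plus skewed positive losses
glm = sm.GLM(df["loss_cost"], X,
             family=sm.families.Tweedie(var_power=1.5)).fit()

# Counterfactual scoring: the same homes, but as if renovated
x_cf = X.assign(renovated=1)
print((glm.predict(x_cf) - glm.predict(X)).mean())
```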

Client(s): Global Pharmaceutical Company & KPMG Procurement & Supply Chain team

  • Developed a machine learning framework leveraging Extended Isolation Forests to assign anomaly scores, allowing teams to prioritize investigations based on adjustable risk thresholds; integrated model outputs into a PowerBI dashboard, providing real-time visibility into transaction risk and supporting data-driven client decision-making

  • Designed and implemented a Python-based data anonymization pipeline, using random name generators and join-preserving keys to securely handle B2B transaction data from five disparate sources (a minimal sketch follows this list)

  • Engineered and merged these anonymized datasets into a unified modeling set with pandas, enabling exploratory analysis and anomaly detection across pharmaceutical vendor–retailer interactions

  • Collaborated with KPMG stakeholders and supply chain experts to define a high-impact feature space for detecting fraud and malpractice
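
A sketch of the join-preserving idea from the anonymization bullet: a salted hash stands in here for the random-name generator described above, but the property that matters is the same, namely that every source table maps the same real ID to the same surrogate key, so joins survive anonymization. All names and the salt are illustrative.

```python
# Join-preserving anonymization across disparate transaction tables
import hashlib
import pandas as pd

SALT = "rotate-me-per-engagement"  # hypothetical; kept out of source control

def surrogate_key(value: str) -> str:
    """Deterministic, irreversible surrogate so joins still line up."""
    return hashlib.sha256(f"{SALT}:{value}".encode()).hexdigest()[:12]

orders = pd.DataFrame({"vendor_id": ["V001", "V002"], "amount": [1200, 540]})
shipments = pd.DataFrame({"vendor_id": ["V001", "V003"], "units": [10, 7]})

# Replace the raw ID with the surrogate in every source table
for table in (orders, shipments):
    table["vendor_key"] = table.pop("vendor_id").map(surrogate_key)

# The same vendor hashes to the same key in both tables, so the merge holds
merged = orders.merge(shipments, on="vendor_key", how="outer")
print(merged)
```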

Client(s): KPMG Financial Services Go-To-Market Teams

Project: ESG Score Calculator

  • Built a scalable ESG (Environmental, Social, and Governance) scoring calculator using logit-transformed penalized linear regression in Python, enabling usage by 80+ engagement teams and influencing $hidden* in tagged revenue in ~3 years; work recognized as a key driver for promotion to Senior Data Scientist
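
A hedged sketch of the logit-transform-then-penalized-regression pattern, using ridge regression in scikit-learn; the features, penalty, and clipping bounds are assumptions, not the calculator's actual specification.

```python
# Penalized linear regression on a logit-transformed 0-1 bounded score
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 4))  # e.g., emissions, board diversity, ... (assumed)
score = 1 / (1 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1]
                          + rng.normal(0, 0.3, 500))))

# Map the bounded score to the real line so a linear model is well-specified
eps = 1e-4
p = np.clip(score, eps, 1 - eps)
y = np.log(p / (1 - p))

model = Ridge(alpha=1.0).fit(X, y)

# Invert the logit to return predictions to the original 0-1 score scale
pred = 1 / (1 + np.exp(-model.predict(X)))
print(pred[:5].round(3))
```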

Project: Reusable & Auto-Refreshing Data Curations

  • Built scalable data curation pipelines in Python and SQL to extract, process, and maintain public and commercial datasets in a KPMG Azure database, delivering refreshed, analysis-ready data that internal & external project teams can plug into modeling and analytics workflows with minimal engineering overhead
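
An illustrative refresh step for one curated table: extract a source, standardize it, and land it analysis-ready. The production pipelines targeted a KPMG Azure database; SQLite stands in here so the sketch is self-contained, and the sample data is made up.

```python
# Minimal extract-standardize-load step for a curated dataset
import io
import pandas as pd
from sqlalchemy import create_engine

RAW_CSV = io.StringIO(
    "Company Name,ZIP Code,Revenue\nAcme,60601,120\nAcme,60601,120\n"
)
engine = create_engine("sqlite:///curated.db")  # stand-in for Azure SQL

def refresh_curation() -> None:
    df = pd.read_csv(RAW_CSV)
    # Standardize headers and drop duplicates so downstream joins stay clean
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    df = df.drop_duplicates()
    # Replacing keeps consumers pointed at one always-refreshed table
    df.to_sql("curated_companies", engine, if_exists="replace", index=False)

refresh_curation()
print(pd.read_sql("SELECT * FROM curated_companies", engine))
```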

Project: Spatial Interpolation for Climate Risk

  • Transformed commercial climate risk data for 15,000+ global companies into multiple geographic levels (e.g., ZIP code, county) and applied spatial interpolation techniques (Kriging and Inverse Distance Weighting) in Python to impute missing scores, enabling auxiliary risk analysis offerings in KPMG client engagements across sectors
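
A minimal inverse distance weighting (IDW) sketch for imputing a missing score at one location from nearby scored locations; the coordinates, scores, and power parameter are invented, and the actual work also compared Kriging.

```python
# Inverse distance weighting: nearer scored locations dominate the estimate
import numpy as np

known_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
known_scores = np.array([40.0, 55.0, 48.0, 80.0])

def idw(target_xy: np.ndarray, power: float = 2.0) -> float:
    dists = np.linalg.norm(known_xy - target_xy, axis=1)
    if np.any(dists == 0):               # exact match: return that score
        return float(known_scores[dists == 0][0])
    weights = 1.0 / dists**power         # closer points get larger weights
    return float(np.sum(weights * known_scores) / np.sum(weights))

# Interpolated score for an unscored location between known neighbors
print(idw(np.array([0.5, 0.5])))
```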

Project: Latent Variable Analysis on U.S. Neighborhoods

  • Directed the end-to-end development of 35+ latent market indicators (e.g., “environmental friendliness”) using interpretable factor analysis in Python, enabling white space analyses and feature selection across 10+ go-to-market pursuits and client modeling pipelines
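
A sketch of the interpretable factor-analysis approach with scikit-learn; the neighborhood variables, rotation choice, and example factor label are assumptions for illustration.

```python
# Latent market indicators from neighborhood variables via factor analysis
import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
cols = ["ev_chargers", "recycling_rate", "transit_stops", "median_income"]
df = pd.DataFrame(rng.normal(size=(300, 4)), columns=cols)

z = StandardScaler().fit_transform(df)
fa = FactorAnalysis(n_components=2, rotation="varimax", random_state=0)
scores = fa.fit_transform(z)  # per-neighborhood latent indicator values

# Loadings show which variables define each factor; a factor loading on
# ev_chargers and recycling_rate might earn the label "environmental
# friendliness" during review with domain experts
loadings = pd.DataFrame(fa.components_.T, index=cols,
                        columns=["factor_1", "factor_2"])
print(loadings.round(2))
```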

Data Science Intern, Nielsen | Chicago, IL

Timeline: Jun 2019 - Jul 2019

  • Implemented and applied an Isolation Forest-based anomaly detection layer within Nielsen’s media data integration pipeline using Python to identify irregularities, such as bots, and improve the accuracy of audience data samples; this work directly contributed to earning a return offer (see the sketch at the end of this section)

  • Maintained robust version control using Git & Bitbucket, enabling reproducible, modular commits that teammates could pull and merge into the main data science pipeline for seamless integration and collaboration

  • Presented and demoed the end-to-end data science workflow to both of Nielsen’s Data Science pillars using a custom-built RShiny app, showcasing analytical insights and technical implementation at the conclusion of the internship
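
A minimal Isolation Forest sketch in the spirit of the anomaly layer above; the activity features, contamination rate, and injected bot-like rows are assumptions for illustration, not Nielsen's pipeline.

```python
# Isolation Forest anomaly layer over panel-activity features (scikit-learn)
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(5)
panel = pd.DataFrame({
    "events_per_hour": rng.normal(12, 3, 1000),
    "night_share": rng.beta(2, 8, 1000),     # fraction of overnight activity
    "unique_channels": rng.poisson(6, 1000),
})
# Inject a few bot-like rows: constant high-rate, around-the-clock activity
panel.iloc[:10] = [[500, 0.5, 1]] * 10

model = IsolationForest(contamination=0.01, random_state=0).fit(panel)
panel["flag"] = model.predict(panel) == -1   # True = likely irregular/bot
print(panel["flag"].iloc[:12])
```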