Franklin, TN — Available for opportunities
Mohamed
Shabeer

Applied data scientist & analytics engineer.
6+ years building attribution models, audience LTV frameworks,
and measurement systems that inform spend and release decisions.

6+
Years Experience
4+
DSPs Integrated
12+
Analytics Models Deployed
10B+
Rows Processed
About

Where measurement science
meets real decisions.

Mohamed Shabeer
Mohamed Shabeer Fnu
Applied Data Scientist & Analytics Engineer
Franklin, TN — Open to opportunities

Applied data science and analytics professional with 6+ years building measurement systems, predictive models, and attribution frameworks across streaming media, audience analytics, and enterprise data.

From designing multi-platform attribution dashboards and audience LTV models for music clients in Nashville, to leading cross-functional engineering at Bosch, to deploying custom data integrity systems at the Brooklyn DA's Office.

I bridge raw data and real decisions — modeling attribution, forecasting audience value, and surfacing the signal that drives better spend and release decisions.

Sources → Transform → Warehouse → Attribution Model → Decision Layer
Stack
BigQuery · Data
Snowflake · Data
SQL / ETL / ELT · Data
Tableau · Analytics
Python / Node.js · Dev
TensorFlow / PyTorch · ML
Terraform / CI/CD · DevOps
Experience

Where I've built.

May 2024 — Present
INTJ Analytics
Consulting Co.
Nashville, TN
Technical Solutions & Analytics Specialist
Designed multi-platform attribution models and audience analytics systems for music clients — integrating cross-DSP streaming data, automating BigQuery and Snowflake ingestion pipelines, and building Tableau dashboards used to benchmark release performance against promotional spend, identify high-LTV listener segments, and support release timing decisions.
BigQuery · Snowflake · Tableau · Python · ETL · Predictive Modeling
Dec 2024 — Jul 2025
Brooklyn District
Attorney's Office
New York, NY
Multimedia Technician — Special Operations
Built a custom exhibit-labeling program that reduced attorney pre-trial prep time. Conducted field evidence retrieval using forensic protocols, produced 360° scene reconstructions with OSCR360, and designed audit-trail systems maintaining full chain-of-custody integrity across active case files.
Custom Software · Forensics · OSCR360 · Hardware
Mar 2019 — Nov 2023
Bosch Systems
Braga, Portugal
Solutions Engineer — Manufacturing
Led end-to-end engineering projects across requirements, schedule, and budget — delivering ~30% ahead of schedule. Negotiated vendor contracts and translated stakeholder needs into precise technical requirements through design reviews and change control.
AutoCAD · SolidWorks · Project Mgmt · Supply Chain
Projects

Selected work.

01 — Analytics
Music Analytics Dashboard
Multi-source attribution and audience analytics platform — integrating cross-DSP stream data (Spotify, Apple Music, Amazon Music), modeling release performance attribution across platforms, and surfacing LTV-weighted audience segments to inform promotional budget allocation and release timing decisions.
BigQuery · Tableau · Python · ETL
02 — Audit & Data Integrity
Evidence Chain-of-Custody System
Automated data integrity and audit-trail system for the Brooklyn DA's Special Operations unit — replacing a manual evidence-tagging workflow with a fully logged, schema-enforced system maintaining 100% chain-of-custody integrity across hundreds of exhibits per active case.
Data Governance · Audit Trail · Workflow Automation
03 — Applied Engineering
Surveillance Retrieval Rover
End-to-end hardware and software solution for physically inaccessible field data retrieval — demonstrating full-stack systems design from problem definition through field deployment.
Systems Design · Hardware · Field Ops
04 — Open Source
Data Engineering Portfolio
Open-source collection of attribution models, audience segmentation pipelines, and predictive analytics frameworks — built on real-world streaming and media schemas, with documented methodology for LTV modeling, multi-source data integration, and automated reporting.
GitHub · ML / AI · Snowflake · scikit-learn
01 — Analytics
Music Analytics
Dashboard

Multi-platform streaming attribution system — isolating per-DSP performance contribution, modeling audience retention cohorts across releases, and generating the decision-layer data used to allocate promotional spend and time releases. Live demo below tracks "Wish You Back" by Clayton Johnson across its full 108-week run.

BigQuery · Snowflake · Tableau · Python · ETL/ELT · dbt
Production — Active
10B+
Rows Processed
~40%
Faster Reporting
12+
Dashboards Delivered
Problem It Solved

Music clients had no unified view of their performance data. Streaming numbers lived in Spotify for Artists, revenue data in distributor portals, audience data in social tools — none of it connected.

Without a unified attribution view, promotional budget was being allocated based on anecdote rather than platform-level contribution data. Labels needed a single source of truth to answer: which campaign drove streams, which DSP is growing the highest-value audience, and where to allocate spend next release cycle.
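A first-cut answer to "which DSP is growing the audience" can be sketched as a simple contribution share in pandas. This is a minimal illustration, not the production model; the column names (`dsp`, `stream_date`, `count`) are assumptions, not the actual warehouse schema.

```python
import pandas as pd

def dsp_contribution(streams: pd.DataFrame, campaign_start: str) -> pd.Series:
    """Share of streams by DSP since a campaign launched (first-cut view)."""
    # ISO-8601 date strings compare correctly as strings
    post = streams[streams["stream_date"] >= campaign_start]
    by_dsp = post.groupby("dsp")["count"].sum()
    return by_dsp / by_dsp.sum()

streams = pd.DataFrame({
    "dsp": ["spotify", "apple", "spotify"],
    "stream_date": ["2024-03-02", "2024-03-05", "2024-02-01"],
    "count": [300, 100, 50],
})
share = dsp_contribution(streams, "2024-03-01")
```

In the real system this kind of query sits downstream of the normalised warehouse layer, after per-platform schemas have been unified.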

Role & Process
01
Data Audit & Source Mapping
Identified all data sources across Spotify, Apple Music, distributors, and social APIs — mapped schema inconsistencies across 8 platforms.
02
Pipeline Architecture
Designed ELT pipelines in BigQuery with dbt transformations, normalising across divergent schemas into a unified analytics layer.
03
Dashboard Build
Built Tableau dashboards for release performance, audience segmentation, and revenue attribution with drill-down capability.
04
Automation & Alerting
Scheduled ingestion jobs, anomaly alerts, and weekly automated reports delivered directly to label stakeholders.
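The anomaly alerting in step 04 can be sketched as a trailing-baseline check in pandas. This is a minimal illustration, not the production job: the `streams` column, window size, and threshold are assumptions.

```python
import pandas as pd

def flag_stream_anomalies(
    weekly: pd.DataFrame, window: int = 8, pct_thresh: float = 0.5
) -> pd.DataFrame:
    """Flag weeks whose streams deviate sharply from the trailing baseline."""
    # Baseline excludes the current week so a spike cannot mask itself
    baseline = weekly["streams"].shift(1).rolling(window, min_periods=4).mean()
    weekly = weekly.assign(pct_dev=(weekly["streams"] - baseline) / baseline)
    return weekly[weekly["pct_dev"].abs() > pct_thresh]

weekly = pd.DataFrame({"streams": [100] * 10 + [1000]})
flagged = flag_stream_anomalies(weekly)
```

A job like this runs after each scheduled ingestion and routes any flagged weeks into the stakeholder alerts.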
Artist Tracker — Clayton Johnson
"Wish You Back" Feb 24, 2024 → Mar 2026
108 Weeks Tracked
Week 1 · Week of Feb 29, 2024
This Week: 37,771 · Cumulative: 37,771
Feb 2024 (Release) → Mar 2026 (Present)
Other Singles Released During Period
Code Snippet — BigQuery Ingestion Pipeline
import pandas as pd
from google.cloud import bigquery

# Automated ingestion — normalises streaming data across platforms
def ingest_streaming_data(platform: str, date_range: dict):
    raw = fetch_platform_data(platform, date_range)

    # Normalise column names across Spotify / Apple / TIDAL schemas
    df = raw.rename(columns=SCHEMA_MAP[platform])
    df['stream_date'] = pd.to_datetime(df['stream_date'])

    # Map each row's currency to its FX rate (indexing FX_RATES with the
    # whole column would fail — a Series is not a dict key)
    df['revenue_usd'] = df['revenue_local'] * df['currency'].map(FX_RATES)

    # Deduplicate on (track_id, stream_date, market)
    df = df.drop_duplicates(subset=['track_id', 'stream_date', 'market'])

    # Load into BigQuery partitioned table
    bq_client.load_table_from_dataframe(
        df,
        f"analytics.streaming_{platform}_clean",
        job_config=bigquery.LoadJobConfig(write_disposition="WRITE_APPEND"),
    )
    return df
DSP Distribution — Randy Savvy
Platform breakdown by single
15 Singles
Columns: Total Streams · Leading Platform · Release Date
02 — Audit & Data Integrity
Evidence Chain-of-
Custody System

Automated data integrity and audit-trail system for the Brooklyn DA's Special Operations unit — replacing a manual evidence-tagging workflow with a fully logged, schema-enforced system. Demonstrates the same data provenance and lineage design principles that underpin reliable attribution infrastructure.

Custom Software · Forensic Protocols · Workflow Automation · OSCR360
Deployed — Brooklyn DA
↓60%
Pre-Trial Prep Time
100%
Chain-of-Custody Integrity
0
Mislabeled Exhibits
Problem It Solved

Prosecutors were manually labeling hundreds of exhibits per trial using handwritten tags and spreadsheets. The process was slow, inconsistent, and legally risky — any gap in data provenance can compromise an entire case.

The unit needed a system that enforced schema integrity, automated audit logging, and maintained full data lineage from retrieval through courtroom presentation — the same class of problems that make attribution infrastructure reliable in any data-critical environment.

Role & Process
01
Workflow Audit
Shadowed prosecutors and technicians to map every step of the existing evidence handling process and identify failure points.
02
System Design
Designed a labeling schema covering exhibit ID, retrieval location, timestamp, handler, and full custody transfer log.
03
Build & Test
Built the program, integrated with OSCR360 capture workflow, and tested rigorously against live case files before deployment.
04
Training & Deployment
Trained prosecutors, detectives, and technicians. System deployed across active felony cases in Brooklyn.
System — Custody Chain Live View
System Online
Case BKN-2024-0391
Exhibits 3
Integrity 100%
RETRIEVED Mar 14 — 07:22 UTC
3x Surveillance MP4 — Business premises, 147 Atlantic Ave
SHA256: a3f9c2...4d81e7  ·  Handler: M.Shabeer
TAGGED & LABELED Mar 14 — 08:05 UTC
Exhibit IDs assigned: EXH-391-001 through EXH-391-003
Auto-labeled via system  ·  Metadata locked
TRANSFERRED Mar 14 — 09:42 UTC
M.Shabeer → ADA Rodriguez — Pre-trial package
Hash verified at transfer  ·  No tampering detected
ADMITTED — COURT Mar 19 — 10:15 UTC
All 3 exhibits admitted. Chain-of-custody verified by judge.
✓ Integrity confirmed  ·  Case: State v. [Redacted]
Code Snippet — Exhibit Labeling Engine
from datetime import datetime

# Auto-generates exhibit labels with full custody metadata
class ExhibitLabel:
    def __init__(self, case_id: str, retrieval_data: dict):
        self.exhibit_id = generate_exhibit_id(case_id)
        self.timestamp = datetime.utcnow().isoformat()
        self.handler = retrieval_data['retrieved_by']
        self.location = retrieval_data['source_address']
        self.file_hash = sha256_file(retrieval_data['file_path'])
        self.chain = [{
            'action': 'RETRIEVED',
            'by': self.handler,
            'at': self.timestamp,
            'hash': self.file_hash,
        }]

    def transfer(self, to_handler: str):
        # Append-only chain log — prior entries are never edited
        self.chain.append({
            'action': 'TRANSFERRED',
            'from': self.handler,
            'to': to_handler,
            'at': datetime.utcnow().isoformat(),
        })
        self.handler = to_handler
03 — Applied Engineering
Surveillance
Retrieval Rover

A custom-engineered rover designed and built from scratch to retrieve surveillance footage from physically inaccessible or hazardous locations during active field evidence operations.

Hardware Engineering · Field Operations · Custom Build · Electronics
Deployed — Field Operations
100%
Mission Success Rate
0
Officer Exposures
3+
Cases Supported
Problem It Solved

Evidence retrieval near active crime scenes often required officers or technicians to physically enter unsafe or inaccessible areas — narrow crawl spaces, compromised structures, or active scene perimeters.

Standard equipment couldn't fit or couldn't be safely operated. The unit needed a remote-operated solution that could navigate tight spaces, capture footage, and extract device data without human exposure.

Role & Process
01
Requirement Definition
Worked with detectives to define size constraints, payload requirements, terrain types, and camera specs needed in the field.
02
Mechanical Design
Designed chassis in AutoCAD — low-profile, all-terrain treads, modular camera mount, cable management system.
03
Electronics & Control
Integrated motor controllers, wireless transmission module, camera capture, and remote operation interface.
04
Field Testing & Deployment
Tested in simulated environments, refined based on detective feedback, and deployed to active field operations.
3D Model — Interactive
Technical Specifications
Chassis: Custom — Low Profile All-Terrain
Control Range: 50 m Wireless
Camera: 1080p + Night Vision
Battery Life: ~90 Min Field Op
Clearance Height: 6 in (Crawl Space)
Drive System: 4WD All-Terrain Treads
Weight: ~3.2 kg Field Ready
Power: LiPo 5200 mAh Pack
04 — Open Source
Data Engineering
Portfolio

Open-source collection of attribution models, audience segmentation pipelines, and predictive analytics frameworks — built on real-world streaming and media schemas, with documented methodology for LTV modeling, multi-source data integration, and automated reporting.

Python · Snowflake · BigQuery · scikit-learn · TensorFlow · dbt · Terraform
Open Source — Public
Architecture — End-to-End Data Pipeline
Sources: Spotify API · Apple Music · Distributors · Social APIs (real-time feeds)
Ingest: Python ETL · Schema validation · Deduplication · Error alerts (~2 min latency)
Warehouse: Snowflake · BigQuery · dbt models · Data Vault 2.0 (daily refresh)
ML / BI: scikit-learn · TensorFlow · Tableau · REST APIs (on-demand)
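The schema-validation and deduplication stages of the ingest layer can be sketched minimally in pandas. The required columns and the dedupe key here are illustrative assumptions, not the repo's actual ingest contract.

```python
import pandas as pd

# Illustrative ingest contract, not the repo's real schema
REQUIRED_COLUMNS = {"track_id", "stream_date", "streams"}

def validate_and_dedupe(df: pd.DataFrame) -> pd.DataFrame:
    """Enforce the ingest contract, then drop duplicate (track_id, date) rows."""
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    df = df.assign(
        stream_date=pd.to_datetime(df["stream_date"]),
        streams=df["streams"].astype("int64"),
    )
    return df.drop_duplicates(subset=["track_id", "stream_date"])

clean = validate_and_dedupe(pd.DataFrame({
    "track_id": ["t1", "t1", "t2"],
    "stream_date": ["2024-03-01", "2024-03-01", "2024-03-01"],
    "streams": [10, 10, 5],
}))
```

Failing loudly on a missing column is what turns a silent downstream reporting error into an upstream alert.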
Problem It Solved

Built to demonstrate applied analytics methodology on real-world problems — multi-platform attribution, audience LTV cohort modeling, and decision-layer dashboard design — not synthetic or classroom data.

Each project demonstrates a specific measurement discipline: attribution pipeline design, LTV model deployment, audience segmentation, or warehouse architecture built for analytical querying — not just notebooks.
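The audience LTV cohort modeling mentioned here can be illustrated with a classic cohort table: cumulative revenue per listener by signup cohort and account age. The column names (`listener_id`, `cohort_month`, `age_month`, `revenue`) are hypothetical, not the repo's schema.

```python
import pandas as pd

def cohort_ltv(df: pd.DataFrame) -> pd.DataFrame:
    """Average cumulative revenue per listener, by cohort and month of age."""
    revenue = (
        df.groupby(["cohort_month", "age_month"])["revenue"]
        .sum()
        .unstack(fill_value=0)
    )
    cohort_sizes = df.groupby("cohort_month")["listener_id"].nunique()
    # Cumulate across ages, then normalise by cohort size: one LTV curve per cohort
    return revenue.cumsum(axis=1).div(cohort_sizes, axis=0)

ltv = cohort_ltv(pd.DataFrame({
    "listener_id": ["a", "b", "a"],
    "cohort_month": ["2024-01"] * 3,
    "age_month": [0, 0, 1],
    "revenue": [10.0, 6.0, 4.0],
}))
```

Reading across a row gives the LTV curve for one cohort; comparing rows shows whether newer cohorts are monetising faster or slower than older ones.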

Stack Breakdown
Python / SQL · Core
BigQuery / Snowflake · Warehouse
scikit-learn / TensorFlow · ML
Terraform / CI/CD · DevOps
dbt / Data Vault 2.0 · Transform
Key Projects Inside
01
Streaming ETL Pipeline
Real-time ingestion using Python + BigQuery with schema validation, deduplication, and Slack error alerting. Handles 500K+ events/day.
02
Churn Prediction Model
scikit-learn classification model predicting music subscriber churn with 87% accuracy on holdout set. Deployed as REST API.
03
Snowflake Data Vault 2.0
Full DV2.0 implementation — hubs, links, satellites — for a multi-source music metadata warehouse with historical tracking.
04
TensorFlow Recommender
Collaborative filtering model for track recommendations, trained on 50M listen events. +18% engagement lift in A/B test.
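The churn model's shape can be sketched with scikit-learn on synthetic data. Everything here is illustrative: the feature names, the toy label rule, and logistic regression as a stand-in classifier. The 87% figure is the repo's reported result, not something this sketch reproduces.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.poisson(20, n),       # streams_last_30d (illustrative)
    rng.exponential(5.0, n),  # days_since_last_session (illustrative)
    rng.integers(1, 48, n),   # tenure_months (illustrative)
])
# Toy label: churn when recent listening is low and inactivity is high
y = ((X[:, 0] < 15) & (X[:, 1] > 5)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_tr, y_tr)
holdout_accuracy = model.score(X_te, y_te)
```

Deploying a fitted pipeline like this behind a REST endpoint is a matter of serialising it and calling `model.predict_proba` per request.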
Pipeline — Live Status
spotify_ingest_daily: Success · Last run 03:12 UTC · 1.2M rows
dbt_transform_core: Success · Last run 03:48 UTC · 42 models
ml_churn_predict: Success · Last run 04:00 UTC · 87.3% acc
snowflake_vault_load: Running · Started 06:00 UTC · ETA 12 min
Code Snippet — Snowflake Data Vault Hub Load
-- Hub load pattern: deduplicated, hash-keyed, source-tracked
INSERT INTO hub_track (track_hk, track_id, load_dts, rec_src)
SELECT DISTINCT
    MD5(UPPER(TRIM(track_id))) AS track_hk,
    track_id,
    CURRENT_TIMESTAMP()        AS load_dts,
    'spotify_api'              AS rec_src
FROM stg_spotify_tracks src
WHERE NOT EXISTS (
    SELECT 1 FROM hub_track h
    WHERE h.track_hk = MD5(UPPER(TRIM(src.track_id)))
);
Contact

Let's connect.

Whether you're building out marketing measurement capabilities, working on attribution modeling or LTV frameworks, or need applied analytics that connects data to real spend and release decisions — reach out.