Franklin, TN — Available for opportunities
Mohamed Shabeer

Data engineer & analytics specialist.
6+ years turning raw data into systems
that scale, predict, and persist.

6+
Years Experience
3
Industries Spanned
12+
Systems Shipped
10B+
Rows Processed
About

Where data meets engineering precision.

Mohamed Shabeer
Mohamed Shabeer Fnu
Data Engineer & Analytics Specialist
Franklin, TN — Open to opportunities

Technical solutions and analytics professional with 6+ years across data engineering, manufacturing operations, and forensic evidence handling.

From building scalable BigQuery & Snowflake pipelines for music clients in Nashville, to leading cross-functional engineering at Bosch, to deploying custom forensic tools at the Brooklyn DA's Office.

I bridge raw data and real decisions — building systems that don't just report, but predict, scale, and persist.

Ingest
ETL
Warehouse
Model
Visualize
Stack
BigQuery · Data
Snowflake · Data
SQL / ETL / ELT · Data
Tableau · Analytics
Python / Node.js · Dev
TensorFlow / PyTorch · ML
Terraform / CI/CD · DevOps
AutoCAD / SolidWorks · Eng
Experience

Where I've built.

May 2024 — Present
INTJ Analytics
Consulting Co.
Nashville, TN
Technical Solutions & Analytics Specialist
Built full-stack analytics and reporting systems for music clients — cleaning and normalizing large datasets, automating ingestion workflows in BigQuery and Snowflake, and developing Tableau dashboards tracking audience growth, streaming performance, and revenue.
BigQuery · Snowflake · Tableau · Python · ETL · Predictive Modeling
Dec 2024 — Jul 2025
Brooklyn District
Attorney's Office
New York, NY
Multimedia Technician — Special Operations
Built a custom exhibit-labeling program that reduced attorney pre-trial prep time. Conducted field evidence retrieval using forensic protocols, produced 360° scene reconstructions with OSCR360, and engineered a custom rover for hard-to-reach surveillance retrieval.
Custom Software · Forensics · OSCR360 · Hardware
Mar 2019 — Nov 2023
Bosch Systems
Braga, Portugal
Solutions Engineer — Manufacturing
Led end-to-end engineering projects across requirements, schedule, and budget — delivering ~30% ahead of schedule. Negotiated vendor contracts and translated stakeholder needs into precise technical requirements through design reviews and change control.
AutoCAD · SolidWorks · Project Mgmt · Supply Chain
Projects

Selected work.

01 — Analytics
Music Analytics Dashboard
Full-stack reporting system tracking audience growth, streaming performance, and revenue with automated BigQuery pipelines and real-time Tableau dashboards.
BigQuery · Tableau · Python · ETL
02 — Forensics
Evidence Chain-of-Custody System
Custom exhibit-labeling software for the Brooklyn DA's Office. Streamlined tagging, chain-of-custody tracking, and significantly cut attorney pre-trial preparation time.
Custom Software · Workflow · Forensics
03 — Hardware
Surveillance Retrieval Rover
Designed and built a custom rover for retrieving surveillance footage from physically inaccessible locations during active field evidence operations at crime scenes.
Hardware · Engineering · Field Ops
04 — Open Source
Data Engineering Portfolio
Public collection of production-grade data pipelines, analytics tools, and ML experiments across Snowflake, BigQuery, scikit-learn, and TensorFlow.
GitHub · ML / AI · Snowflake · scikit-learn
01 — Analytics
Music Analytics Dashboard

A full-stack data platform built for music industry clients — transforming raw streaming data into actionable decisions on audience, revenue, and campaign strategy.

BigQuery · Snowflake · Tableau · Python · ETL/ELT · dbt
Production — Active
10B+
Rows Processed
~40%
Faster Reporting
12+
Dashboards Delivered
Problem It Solved

Music clients had no unified view of their data. Streaming numbers lived in Spotify for Artists, revenue data in distributor portals, audience data in social tools — none of it connected.

Labels needed a single source of truth that could answer: which campaign drove streams, which market is growing, and where to allocate budget next release cycle.

Role & Process
01
Data Audit & Source Mapping
Identified all data sources across Spotify, Apple Music, distributors and social APIs — mapped schema inconsistencies across 8 platforms.
02
Pipeline Architecture
Designed ELT pipelines in BigQuery with dbt transformations, normalising across divergent schemas into a unified analytics layer.
03
Dashboard Build
Built Tableau dashboards for release performance, audience segmentation, and revenue attribution with drill-down capability.
04
Automation & Alerting
Scheduled ingestion jobs, anomaly alerts, and weekly automated reports delivered directly to label stakeholders.
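The anomaly-alerting step above can be sketched as a simple z-score check over a window of daily stream counts (a minimal illustration, not the production job; the threshold and sample numbers are hypothetical):

```python
from statistics import mean, stdev

def detect_anomalies(daily_streams: list[int], z_threshold: float = 2.0) -> list[int]:
    """Return indices of days whose stream count deviates more than
    z_threshold standard deviations from the window mean."""
    mu = mean(daily_streams)
    sigma = stdev(daily_streams)
    if sigma == 0:
        return []  # flat series: nothing can be anomalous
    return [i for i, v in enumerate(daily_streams)
            if abs(v - mu) / sigma > z_threshold]

# A sudden spike on the last day stands out against the stable baseline:
streams = [10_200, 10_450, 9_980, 10_300, 10_100, 10_260, 55_000]
```

In the scheduled job, any non-empty result would trigger the alert notification to stakeholders.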
Live Dashboard — Streaming & Revenue Analytics
[Interactive demo: animated counters for Total Streams, Gross Revenue, MoM Growth, and Markets, plus a monthly streams/revenue chart, Jan–Aug.]
Platform Split: Spotify 58% · Apple 24% · TIDAL 11% · Other 7%
Top Markets: United States 38% · United Kingdom 18% · UAE 12% · Canada 9% · Rest of World 23%
Fastest Segment: 18–24 Female, +22% MoM
Pipeline Runs: Daily at 03:00 (last run: 6h ago)
Data Freshness: Real-time, BigQuery live feed
Code Snippet — BigQuery Ingestion Pipeline
# Automated ingestion — normalises streaming data across platforms
def ingest_streaming_data(platform: str, date_range: dict):
    raw = fetch_platform_data(platform, date_range)

    # Normalise column names across Spotify / Apple / TIDAL schemas
    df = raw.rename(columns=SCHEMA_MAP[platform])
    df['stream_date'] = pd.to_datetime(df['stream_date'])
    # Map each row's currency to its FX rate (element-wise, not a dict lookup)
    df['revenue_usd'] = df['revenue_local'] * df['currency'].map(FX_RATES)

    # Deduplicate on (track_id, stream_date, market)
    df = df.drop_duplicates(subset=['track_id', 'stream_date', 'market'])

    # Load into BigQuery partitioned table
    bq_client.load_table_from_dataframe(
        df,
        f"analytics.streaming_{platform}_clean",
        job_config=bigquery.LoadJobConfig(write_disposition="WRITE_APPEND"),
    )
    return df
02 — Forensics
Evidence Chain-of-Custody System

Custom software built for the Brooklyn District Attorney's Office Special Operations unit — replacing a manual, error-prone evidence labeling workflow with a fast, auditable digital system.

Custom Software · Forensic Protocols · Workflow Automation · OSCR360
Deployed — Brooklyn DA
↓60%
Pre-Trial Prep Time
100%
Chain-of-Custody Integrity
0
Mislabeled Exhibits
Problem It Solved

Prosecutors were manually labeling hundreds of exhibits per trial using handwritten tags and spreadsheets. The process was slow, inconsistent, and legally risky — a mislabeled exhibit can compromise an entire case.

The Special Operations unit needed a system that could tag, track, and audit every piece of digital evidence from retrieval through courtroom presentation with a full chain-of-custody log.

Role & Process
01
Workflow Audit
Shadowed prosecutors and technicians to map every step of the existing evidence handling process and identify failure points.
02
System Design
Designed a labeling schema covering exhibit ID, retrieval location, timestamp, handler, and full custody transfer log.
03
Build & Test
Built the program, integrated with OSCR360 capture workflow, and tested rigorously against live case files before deployment.
04
Training & Deployment
Trained prosecutors, detectives, and technicians. System deployed across active felony cases in Brooklyn.
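The "hash verified at transfer" step shown in the custody view below can be sketched as a streaming SHA-256 check. This is a simplified illustration; `sha256_file` mirrors the helper name referenced in the labeling engine, and its implementation here is an assumption:

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Stream the file through SHA-256 so large video exhibits fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_integrity(path: str, recorded_hash: str) -> bool:
    """Re-hash the exhibit file and compare against the hash recorded at retrieval."""
    return sha256_file(path) == recorded_hash
```

Any mismatch at transfer time indicates the file changed after retrieval and blocks the custody handoff.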
System — Custody Chain Live View
System Online
Case BKN-2024-0391
Exhibits 3
Integrity 100%
RETRIEVED Mar 14 — 07:22 UTC
3x Surveillance MP4 — Business premises, 147 Atlantic Ave
SHA256: a3f9c2...4d81e7  ·  Handler: M.Shabeer
TAGGED & LABELED Mar 14 — 08:05 UTC
Exhibit IDs assigned: EXH-391-001 through EXH-391-003
Auto-labeled via system  ·  Metadata locked
TRANSFERRED Mar 14 — 09:42 UTC
M.Shabeer → ADA Rodriguez — Pre-trial package
Hash verified at transfer  ·  No tampering detected
ADMITTED — COURT Mar 19 — 10:15 UTC
All 3 exhibits admitted. Chain-of-custody verified by judge.
✓ Integrity confirmed  ·  Case: State v. [Redacted]
Code Snippet — Exhibit Labeling Engine
# Auto-generates exhibit labels with full custody metadata
class ExhibitLabel:
    def __init__(self, case_id: str, retrieval_data: dict):
        self.exhibit_id = generate_exhibit_id(case_id)
        self.timestamp = datetime.utcnow().isoformat()
        self.handler = retrieval_data['retrieved_by']
        self.location = retrieval_data['source_address']
        self.file_hash = sha256_file(retrieval_data['file_path'])
        self.chain = [{
            'action': 'RETRIEVED',
            'by': self.handler,
            'at': self.timestamp,
            'hash': self.file_hash
        }]

    def transfer(self, to_handler: str):
        # Appends to immutable chain log — cannot be edited
        self.chain.append({
            'action': 'TRANSFERRED',
            'from': self.handler,
            'to': to_handler,
            'at': datetime.utcnow().isoformat()
        })
        self.handler = to_handler
03 — Hardware
Surveillance Retrieval Rover

A rover designed and built from scratch to retrieve surveillance footage from physically inaccessible or hazardous locations during active field evidence operations.

Hardware Engineering · Field Operations · Custom Build · Electronics
Deployed — Field Operations
100%
Mission Success Rate
0
Officer Exposures
3+
Cases Supported
Problem It Solved

Evidence retrieval near active crime scenes often required officers or technicians to physically enter unsafe or inaccessible areas — narrow crawl spaces, compromised structures, or active scene perimeters.

Standard equipment couldn't fit or couldn't be safely operated. The unit needed a remote-operated solution that could navigate tight spaces, capture footage, and extract device data without human exposure.

Role & Process
01
Requirement Definition
Worked with detectives to define size constraints, payload requirements, terrain types, and camera specs needed in the field.
02
Mechanical Design
Designed chassis in AutoCAD — low-profile, all-terrain treads, modular camera mount, cable management system.
03
Electronics & Control
Integrated motor controllers, wireless transmission module, camera capture, and remote operation interface.
04
Field Testing & Deployment
Tested in simulated environments, refined based on detective feedback, and deployed to active field operations.
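The remote-operation interface ultimately reduces two operator inputs to left and right tread speeds. A minimal sketch of that differential ("tank") drive mixing, with no hardware API assumed and all names hypothetical:

```python
def mix_tank_drive(throttle: float, steer: float) -> tuple[float, float]:
    """Convert joystick throttle/steer in [-1, 1] into (left, right) tread speeds.

    Positive steer turns right: the left tread speeds up while the right slows.
    Output is clamped to [-1, 1] so motor controllers never see out-of-range values.
    """
    def clamp(v: float) -> float:
        return max(-1.0, min(1.0, v))
    left = clamp(throttle + steer)
    right = clamp(throttle - steer)
    return left, right
```

On the rover, values like these would be scaled to PWM duty cycles for the motor controllers; the clamping keeps a hard steer at full throttle from commanding impossible speeds.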
3D Model — Interactive (drag to rotate)
Technical Specifications
Chassis: Custom — Low-Profile All-Terrain
📡 Control Range: 50 m Wireless
📷 Camera: 1080p + Night Vision
🔋 Battery Life: ~90 min Field Op
📦 Clearance Height: 6 in (crawl space)
📶 Drive System: 4WD All-Terrain Treads
🌍 Weight: ~3.2 kg, Field Ready
Power: LiPo 5200 mAh Pack
04 — Open Source
Data Engineering Portfolio

Production-grade data pipelines, ML models, and warehouse architectures — built on real-world schemas, edge cases, and performance constraints, not toy datasets.

Python · Snowflake · BigQuery · scikit-learn · TensorFlow · dbt · Terraform
Open Source — Public
Architecture — End-to-End Data Pipeline
Sources (real-time feeds): Spotify API · Apple Music · Distributors · Social APIs
Ingest (~2 min latency): Python ETL · Schema validate · Deduplication · Error alerts
Warehouse (daily refresh): Snowflake · BigQuery · dbt models · Data Vault 2.0
ML / BI (on-demand): scikit-learn · TensorFlow · Tableau · REST APIs
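The schema-validation step in the ingest stage can be sketched as a required-column and type check per record. This is illustrative only; the unified column names here are hypothetical:

```python
# Hypothetical unified schema shared across platform feeds
REQUIRED_COLUMNS = {
    "track_id": str,
    "stream_date": str,
    "market": str,
    "streams": int,
}

def validate_schema(record: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []
    for col, expected_type in REQUIRED_COLUMNS.items():
        if col not in record:
            errors.append(f"missing column: {col}")
        elif not isinstance(record[col], expected_type):
            errors.append(f"{col}: expected {expected_type.__name__}, "
                          f"got {type(record[col]).__name__}")
    return errors
```

Records that fail validation would be routed to a dead-letter table and surfaced through the error-alert channel rather than silently dropped.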
Problem It Solved

Most data engineering portfolios are theoretical exercises or built on toy datasets. This collection is built on real-world patterns — schemas, edge cases, and performance constraints from actual production work.

Each project demonstrates a specific engineering discipline: pipeline design, model deployment, data quality, or warehouse architecture — not just notebooks.

Stack Breakdown
Python / SQL · Core
BigQuery / Snowflake · Warehouse
scikit-learn / TensorFlow · ML
Terraform / CI/CD · DevOps
dbt / Data Vault 2.0 · Transform
Key Projects Inside
01
Streaming ETL Pipeline
Real-time ingestion using Python + BigQuery with schema validation, deduplication, and Slack error alerting. Handles 500K+ events/day.
02
Churn Prediction Model
scikit-learn classification model predicting music subscriber churn with 87% accuracy on holdout set. Deployed as REST API.
03
Snowflake Data Vault 2.0
Full DV2.0 implementation — hubs, links, satellites — for a multi-source music metadata warehouse with historical tracking.
04
TensorFlow Recommender
Collaborative filtering model for track recommendations, trained on 50M listen events. +18% engagement lift in A/B test.
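The churn model's general shape follows scikit-learn's standard fit/predict pattern. The sketch below uses synthetic stand-in features since the real subscriber data is private; the 87% figure comes from the repository's holdout set, not from this toy example:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for subscriber features (listens/week, skip rate, tenure, ...)
X, y = make_classification(n_samples=2000, n_features=8, n_informative=5,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Fit on the training split, score on the untouched holdout
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
holdout_acc = accuracy_score(y_test, model.predict(X_test))
```

Serving the fitted model behind a REST endpoint then amounts to calling `model.predict_proba` on incoming feature vectors.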
Pipeline — Live Status
spotify_ingest_daily · Success
Last run: 03:12 UTC · 1.2M rows
dbt_transform_core · Success
Last run: 03:48 UTC · 42 models
ml_churn_predict · Success
Last run: 04:00 UTC · 87.3% acc
snowflake_vault_load · Running
Started: 06:00 UTC · ETA 12 min
Code Snippet — Snowflake Data Vault Hub Load
-- Hub load pattern: deduplicated, hash-keyed, source-tracked
INSERT INTO hub_track (track_hk, track_id, load_dts, rec_src)
SELECT DISTINCT
    MD5(UPPER(TRIM(src.track_id))) AS track_hk,
    src.track_id,
    CURRENT_TIMESTAMP() AS load_dts,
    'spotify_api' AS rec_src
FROM stg_spotify_tracks src
WHERE NOT EXISTS (
    SELECT 1
    FROM hub_track h
    WHERE h.track_hk = MD5(UPPER(TRIM(src.track_id)))
);
Contact

Let's connect.

Whether you have a data engineering challenge, an analytics role, or just want to talk pipelines — reach out.