Franklin, TN — Available for opportunities
Mohamed Shabeer

Data engineer & analytics specialist.
6+ years turning raw data into systems
that scale, predict, and persist.

6+
Years Experience
3
Industries Spanned
12+
Systems Shipped
10B+
Rows Processed
About

Where data meets engineering precision.

Mohamed Shabeer
Mohamed Shabeer Fnu
Data Engineer & Analytics Specialist
Franklin, TN — Open to opportunities

Technical solutions and analytics professional with 6+ years across data engineering, manufacturing operations, and forensic evidence handling.

From building scalable BigQuery & Snowflake pipelines for music clients in Nashville, to leading cross-functional engineering at Bosch, to deploying custom forensic tools at the Brooklyn DA's Office.

I bridge raw data and real decisions — building systems that don't just report, but predict, scale, and persist.

Ingest
ETL
Warehouse
Model
Visualize
Stack
BigQuery · Data
Snowflake · Data
SQL / ETL / ELT · Data
Tableau · Analytics
Python / Node.js · Dev
TensorFlow / PyTorch · ML
Terraform / CI/CD · DevOps
AutoCAD / SolidWorks · Eng
Experience

Where I've built.

May 2024 — Present
INTJ Analytics
Consulting Co.
Nashville, TN
Technical Solutions & Analytics Specialist
Built full-stack analytics and reporting systems for music clients — cleaning and normalizing large datasets, automating ingestion workflows in BigQuery and Snowflake, and developing Tableau dashboards tracking audience growth, streaming performance, and revenue.
BigQuery · Snowflake · Tableau · Python · ETL · Predictive Modeling
Dec 2024 — Jul 2025
Brooklyn District
Attorney's Office
New York, NY
Multimedia Technician — Special Operations
Built a custom exhibit-labeling program that reduced attorney pre-trial prep time. Conducted field evidence retrieval using forensic protocols, produced 360° scene reconstructions with OSCR360, and engineered a custom rover for hard-to-reach surveillance retrieval.
Custom Software · Forensics · OSCR360 · Hardware
Mar 2019 — Nov 2023
Bosch Systems
Braga, Portugal
Solutions Engineer — Manufacturing
Led end-to-end engineering projects across requirements, schedule, and budget — delivering ~30% ahead of schedule. Negotiated vendor contracts and translated stakeholder needs into precise technical requirements through design reviews and change control.
AutoCAD · SolidWorks · Project Mgmt · Supply Chain
Projects

Selected work.

01 — Analytics
Music Analytics Dashboard
Full-stack reporting system tracking audience growth, streaming performance, and revenue with automated BigQuery pipelines and real-time Tableau dashboards.
BigQuery · Tableau · Python · ETL
02 — Forensics
Evidence Chain-of-Custody System
Custom exhibit-labeling software for the Brooklyn DA's Office. Streamlined tagging, chain-of-custody tracking, and significantly cut attorney pre-trial preparation time.
Custom Software · Workflow · Forensics
03 — Hardware
Surveillance Retrieval Rover
Designed and built a custom rover for retrieving surveillance footage from physically inaccessible locations during active field evidence operations at crime scenes.
Hardware · Engineering · Field Ops
04 — Open Source
Data Engineering Portfolio
Public collection of production-grade data pipelines, analytics tools, and ML experiments across Snowflake, BigQuery, scikit-learn, and TensorFlow.
GitHub · ML / AI · Snowflake · scikit-learn
01 — Analytics
Music Analytics Dashboard

A full-stack data platform built for music industry clients — transforming raw streaming data into actionable decisions on audience, revenue, and campaign strategy.

BigQuery · Snowflake · Tableau · Python · ETL/ELT · dbt
Production — Active
10B+
Rows Processed
~40%
Faster Reporting
12+
Dashboards Delivered
Problem It Solved

Music clients had no unified view of their data. Streaming numbers lived in Spotify for Artists, revenue data in distributor portals, audience data in social tools — none of it connected.

Labels needed a single source of truth that could answer: which campaign drove streams, which market is growing, and where to allocate budget next release cycle.

Role & Process
01
Data Audit & Source Mapping
Identified all data sources across Spotify, Apple Music, distributors and social APIs — mapped schema inconsistencies across 8 platforms.
02
Pipeline Architecture
Designed ELT pipelines in BigQuery with dbt transformations, normalising across divergent schemas into a unified analytics layer.
03
Dashboard Build
Built Tableau dashboards for release performance, audience segmentation, and revenue attribution with drill-down capability.
04
Automation & Alerting
Scheduled ingestion jobs, anomaly alerts, and weekly automated reports delivered directly to label stakeholders.
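The anomaly-alerting step above can be sketched as a simple z-score check over a window of daily stream counts (a minimal illustration, not the production job; the threshold and sample numbers are hypothetical):

```python
from statistics import mean, stdev

def detect_anomalies(daily_streams: list[int], z_threshold: float = 2.0) -> list[int]:
    """Return indices of days whose stream count deviates more than
    z_threshold standard deviations from the window mean."""
    mu = mean(daily_streams)
    sigma = stdev(daily_streams)
    if sigma == 0:
        return []  # flat series: nothing can be anomalous
    return [i for i, v in enumerate(daily_streams)
            if abs(v - mu) / sigma > z_threshold]

# A sudden spike on the last day stands out against the stable baseline:
streams = [10_200, 10_450, 9_980, 10_300, 10_100, 10_260, 55_000]
```

In the scheduled job, any non-empty result would trigger the alert notification to stakeholders.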
Live Dashboard — Streaming & Revenue Analytics
[Interactive demo: animated counters for Total Streams, Gross Revenue, MoM Growth, and Markets, plus a monthly streams/revenue chart, Jan–Aug.]
Platform Split: Spotify 58% · Apple 24% · TIDAL 11% · Other 7%
Top Markets: United States 38% · United Kingdom 18% · UAE 12% · Canada 9% · Rest of World 23%
Fastest Segment: 18–24 Female, +22% MoM
Pipeline Runs: Daily at 03:00 (last run: 6h ago)
Data Freshness: Real-time, BigQuery live feed
Code Snippet — BigQuery Ingestion Pipeline
# Automated ingestion — normalises streaming data across platforms
def ingest_streaming_data(platform: str, date_range: dict):
    raw = fetch_platform_data(platform, date_range)

    # Normalise column names across Spotify / Apple / TIDAL schemas
    df = raw.rename(columns=SCHEMA_MAP[platform])
    df['stream_date'] = pd.to_datetime(df['stream_date'])
    # Map each row's currency to its FX rate (element-wise, not a dict lookup)
    df['revenue_usd'] = df['revenue_local'] * df['currency'].map(FX_RATES)

    # Deduplicate on (track_id, stream_date, market)
    df = df.drop_duplicates(subset=['track_id', 'stream_date', 'market'])

    # Load into BigQuery partitioned table
    bq_client.load_table_from_dataframe(
        df,
        f"analytics.streaming_{platform}_clean",
        job_config=bigquery.LoadJobConfig(write_disposition="WRITE_APPEND"),
    )
    return df
02 — Forensics
Evidence Chain-of-Custody System

Custom software built for the Brooklyn District Attorney's Office Special Operations unit — replacing a manual, error-prone evidence labeling workflow with a fast, auditable digital system.

Custom Software · Forensic Protocols · Workflow Automation · OSCR360
Deployed — Brooklyn DA
↓60%
Pre-Trial Prep Time
100%
Chain-of-Custody Integrity
0
Mislabeled Exhibits
Problem It Solved

Prosecutors were manually labeling hundreds of exhibits per trial using handwritten tags and spreadsheets. The process was slow, inconsistent, and legally risky — a mislabeled exhibit can compromise an entire case.

The Special Operations unit needed a system that could tag, track, and audit every piece of digital evidence from retrieval through courtroom presentation with a full chain-of-custody log.

Role & Process
01
Workflow Audit
Shadowed prosecutors and technicians to map every step of the existing evidence handling process and identify failure points.
02
System Design
Designed a labeling schema covering exhibit ID, retrieval location, timestamp, handler, and full custody transfer log.
03
Build & Test
Built the program, integrated with OSCR360 capture workflow, and tested rigorously against live case files before deployment.
04
Training & Deployment
Trained prosecutors, detectives, and technicians. System deployed across active felony cases in Brooklyn.
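The "hash verified at transfer" step shown in the custody view below can be sketched as a streaming SHA-256 check. This is a simplified illustration; `sha256_file` mirrors the helper name referenced in the labeling engine, and its implementation here is an assumption:

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Stream the file through SHA-256 so large video exhibits fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_integrity(path: str, recorded_hash: str) -> bool:
    """Re-hash the exhibit file and compare against the hash recorded at retrieval."""
    return sha256_file(path) == recorded_hash
```

Any mismatch at transfer time indicates the file changed after retrieval and blocks the custody handoff.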
System — Custody Chain Live View
System Online
Case BKN-2024-0391
Exhibits 3
Integrity 100%
RETRIEVED Mar 14 — 07:22 UTC
3x Surveillance MP4 — Business premises, 147 Atlantic Ave
SHA256: a3f9c2...4d81e7  ·  Handler: M.Shabeer
TAGGED & LABELED Mar 14 — 08:05 UTC
Exhibit IDs assigned: EXH-391-001 through EXH-391-003
Auto-labeled via system  ·  Metadata locked
TRANSFERRED Mar 14 — 09:42 UTC
M.Shabeer → ADA Rodriguez — Pre-trial package
Hash verified at transfer  ·  No tampering detected
ADMITTED — COURT Mar 19 — 10:15 UTC
All 3 exhibits admitted. Chain-of-custody verified by judge.
✓ Integrity confirmed  ·  Case: State v. [Redacted]
Code Snippet — Exhibit Labeling Engine
# Auto-generates exhibit labels with full custody metadata
class ExhibitLabel:
    def __init__(self, case_id: str, retrieval_data: dict):
        self.exhibit_id = generate_exhibit_id(case_id)
        self.timestamp = datetime.utcnow().isoformat()
        self.handler = retrieval_data['retrieved_by']
        self.location = retrieval_data['source_address']
        self.file_hash = sha256_file(retrieval_data['file_path'])
        self.chain = [{
            'action': 'RETRIEVED',
            'by': self.handler,
            'at': self.timestamp,
            'hash': self.file_hash
        }]

    def transfer(self, to_handler: str):
        # Appends to immutable chain log — cannot be edited
        self.chain.append({
            'action': 'TRANSFERRED',
            'from': self.handler,
            'to': to_handler,
            'at': datetime.utcnow().isoformat()
        })
        self.handler = to_handler
03 — Hardware
Surveillance Retrieval Rover

A rover designed and built from scratch to retrieve surveillance footage from physically inaccessible or hazardous locations during active field evidence operations.

Hardware Engineering · Field Operations · Custom Build · Electronics
Deployed — Field Operations
100%
Mission Success Rate
0
Officer Exposures
3+
Cases Supported
Problem It Solved

Evidence retrieval near active crime scenes often required officers or technicians to physically enter unsafe or inaccessible areas — narrow crawl spaces, compromised structures, or active scene perimeters.

Standard equipment couldn't fit or couldn't be safely operated. The unit needed a remote-operated solution that could navigate tight spaces, capture footage, and extract device data without human exposure.

Role & Process
01
Requirement Definition
Worked with detectives to define size constraints, payload requirements, terrain types, and camera specs needed in the field.
02
Mechanical Design
Designed chassis in AutoCAD — low-profile, all-terrain treads, modular camera mount, cable management system.
03
Electronics & Control
Integrated motor controllers, wireless transmission module, camera capture, and remote operation interface.
04
Field Testing & Deployment
Tested in simulated environments, refined based on detective feedback, and deployed to active field operations.
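The remote-operation interface ultimately reduces two operator inputs to left and right tread speeds. A minimal sketch of that differential ("tank") drive mixing, with no hardware API assumed and all names hypothetical:

```python
def mix_tank_drive(throttle: float, steer: float) -> tuple[float, float]:
    """Convert joystick throttle/steer in [-1, 1] into (left, right) tread speeds.

    Positive steer turns right: the left tread speeds up while the right slows.
    Output is clamped to [-1, 1] so motor controllers never see out-of-range values.
    """
    def clamp(v: float) -> float:
        return max(-1.0, min(1.0, v))
    left = clamp(throttle + steer)
    right = clamp(throttle - steer)
    return left, right
```

On the rover, values like these would be scaled to PWM duty cycles for the motor controllers; the clamping keeps a hard steer at full throttle from commanding impossible speeds.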
3D Model — Interactive (drag to rotate)
Technical Specifications
Chassis: Custom — Low-Profile All-Terrain
📡 Control Range: 50 m Wireless
📷 Camera: 1080p + Night Vision
🔋 Battery Life: ~90 min Field Op
📦 Clearance Height: 6 in (crawl space)
📶 Drive System: 4WD All-Terrain Treads
🌍 Weight: ~3.2 kg, Field Ready
Power: LiPo 5200 mAh Pack
04 — Open Source
Data Engineering Portfolio

Production-grade data pipelines, ML models, and warehouse architectures — built on real-world schemas, edge cases, and performance constraints, not toy datasets.

Python · Snowflake · BigQuery · scikit-learn · TensorFlow · dbt · Terraform
Open Source — Public
Architecture — End-to-End Data Pipeline
Sources (real-time feeds): Spotify API · Apple Music · Distributors · Social APIs
Ingest (~2 min latency): Python ETL · Schema validate · Deduplication · Error alerts
Warehouse (daily refresh): Snowflake · BigQuery · dbt models · Data Vault 2.0
ML / BI (on-demand): scikit-learn · TensorFlow · Tableau · REST APIs
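The schema-validation step in the ingest stage can be sketched as a required-column and type check per record. This is illustrative only; the unified column names here are hypothetical:

```python
# Hypothetical unified schema shared across platform feeds
REQUIRED_COLUMNS = {
    "track_id": str,
    "stream_date": str,
    "market": str,
    "streams": int,
}

def validate_schema(record: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []
    for col, expected_type in REQUIRED_COLUMNS.items():
        if col not in record:
            errors.append(f"missing column: {col}")
        elif not isinstance(record[col], expected_type):
            errors.append(f"{col}: expected {expected_type.__name__}, "
                          f"got {type(record[col]).__name__}")
    return errors
```

Records that fail validation would be routed to a dead-letter table and surfaced through the error-alert channel rather than silently dropped.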
Problem It Solved

Most data engineering portfolios are theoretical exercises or built on toy datasets. This collection is built on real-world patterns — schemas, edge cases, and performance constraints from actual production work.

Each project demonstrates a specific engineering discipline: pipeline design, model deployment, data quality, or warehouse architecture — not just notebooks.

Stack Breakdown
Python / SQL · Core
BigQuery / Snowflake · Warehouse
scikit-learn / TensorFlow · ML
Terraform / CI/CD · DevOps
dbt / Data Vault 2.0 · Transform
Key Projects Inside
01
Streaming ETL Pipeline
Real-time ingestion using Python + BigQuery with schema validation, deduplication, and Slack error alerting. Handles 500K+ events/day.
02
Churn Prediction Model
scikit-learn classification model predicting music subscriber churn with 87% accuracy on holdout set. Deployed as REST API.
03
Snowflake Data Vault 2.0
Full DV2.0 implementation — hubs, links, satellites — for a multi-source music metadata warehouse with historical tracking.
04
TensorFlow Recommender
Collaborative filtering model for track recommendations, trained on 50M listen events. +18% engagement lift in A/B test.
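The churn model's general shape follows scikit-learn's standard fit/predict pattern. The sketch below uses synthetic stand-in features since the real subscriber data is private; the 87% figure comes from the repository's holdout set, not from this toy example:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for subscriber features (listens/week, skip rate, tenure, ...)
X, y = make_classification(n_samples=2000, n_features=8, n_informative=5,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Fit on the training split, score on the untouched holdout
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
holdout_acc = accuracy_score(y_test, model.predict(X_test))
```

Serving the fitted model behind a REST endpoint then amounts to calling `model.predict_proba` on incoming feature vectors.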
Pipeline — Live Status
spotify_ingest_daily · Success
Last run: 03:12 UTC · 1.2M rows
dbt_transform_core · Success
Last run: 03:48 UTC · 42 models
ml_churn_predict · Success
Last run: 04:00 UTC · 87.3% acc
snowflake_vault_load · Running
Started: 06:00 UTC · ETA 12 min
Code Snippet — Snowflake Data Vault Hub Load
-- Hub load pattern: deduplicated, hash-keyed, source-tracked
INSERT INTO hub_track (track_hk, track_id, load_dts, rec_src)
SELECT DISTINCT
    MD5(UPPER(TRIM(src.track_id))) AS track_hk,
    src.track_id,
    CURRENT_TIMESTAMP() AS load_dts,
    'spotify_api' AS rec_src
FROM stg_spotify_tracks src
WHERE NOT EXISTS (
    SELECT 1
    FROM hub_track h
    WHERE h.track_hk = MD5(UPPER(TRIM(src.track_id)))
);
Contact

Let's connect.

Whether you have a data engineering challenge, an analytics role, or just want to talk pipelines — reach out.