Client: Retail logistics platform · 800+ stores across India · ~$180M GMV
Engagement: 12-week lakehouse build · November 2025–February 2026
The Challenge
Fragmented data infrastructure: 14 source systems, 12-hour data latency, no unified analytics layer. Operations teams making decisions on stale data. The 2025 holiday season exposed the cost — $2M+ in inventory mismatches.
Approach
Weeks 1–2 · Audit + use-case prioritization
Inventoried 14 source systems. Workshops with operations + finance leadership to rank the top 8 use cases by P&L impact. Real-time inventory and demand forecasting led the list.
Weeks 3–5 · Streaming pipeline
Apache Kafka with schema registry. Debezium CDC connectors against MySQL/Postgres source DBs. Delta Lake on S3 (Bronze/Silver/Gold) for unified storage. End-to-end exactly-once semantics.
Weeks 6–9 · Transformations + serving
140 dbt models for the analytics layer. Snowflake as BI-facing query engine via Iceberg external tables. Power BI dashboards for store managers; refresh via Direct Lake.
Weeks 10–12 · Productionization
Grafana + PagerDuty monitoring with SLO-based alerting. Team enablement workshops. Runbooks. Pipeline tested with simulated holiday-volume traffic.
Sample: Inventory Reconciliation dbt Model
-- models/marts/inventory_realtime.sql
{{ config(
materialized='incremental',
unique_key=['store_id', 'sku'],
on_schema_change='sync_all_columns',
incremental_strategy='merge'
) }}
with pos_movements as (
select
store_id,
sku,
sum(quantity_change) as net_movement,
max(event_ts) as last_event
from {{ ref('silver_pos_events') }}
{% if is_incremental() %}
where event_ts > (select max(last_event) from {{ this }})
{% endif %}
group by 1, 2
),
current_inventory as (
select
store_id,
sku,
on_hand_qty,
reserved_qty,
on_hand_qty - reserved_qty as available_qty,
last_count_ts
from {{ ref('silver_inventory_snapshots') }}
)
select
coalesce(p.store_id, c.store_id) as store_id,
coalesce(p.sku, c.sku) as sku,
c.available_qty,
p.net_movement,
c.available_qty + coalesce(p.net_movement, 0) as projected_qty,
case
when c.available_qty + coalesce(p.net_movement, 0) < 0 then 'OVERSOLD'
when c.available_qty + coalesce(p.net_movement, 0) < 5 then 'LOW'
else 'OK'
end as status,
greatest(p.last_event, c.last_count_ts) as last_updated
from pos_movements p
full outer join current_inventory c
on p.store_id = c.store_id and p.sku = c.sku
Results
| Metric | Before | After |
|---|---|---|
| Data freshness | 12 hours | <2 minutes |
| Events / day processed | ~250K (batch) | 2M+ (streaming) |
| Operational query p95 | 4–8s | <50ms |
| Pipeline uptime (90 days) | — | 99.99% |
“We went from 12-hour stale data to real-time operational dashboards in under 12 weeks. The platform Ohveda built is the foundation of every decision we make today.” — Rahul Sharma, Head of Engineering