How We Built a Data Pipeline for a Dutch Logistics Company
When a mid-sized logistics company in the Randstad reached out to us, their problem was not a lack of data. It was the opposite. Data was everywhere — in warehouse management systems, in spreadsheets emailed between depots, in ERP exports nobody fully trusted. The challenge was not collecting information. It was making it usable.
This is how we helped them build a custom data pipeline that turned fragmented operational data into something their team could actually act on.
The situation
The company operates three warehouses across the Netherlands, handling distribution for retail and e-commerce clients. They process thousands of shipments per day. On paper, they had good systems. A WMS for warehouse operations. An ERP for finance. A TMS for transport planning. Each system worked fine on its own.
The problem was in between. Data lived in silos. Getting a single view of daily performance — how many orders were picked, packed, shipped, and delivered on time — required someone to manually pull reports from three different tools, paste them into Excel, and cross-reference numbers. That process took half a day, every day.
By the time leadership had the numbers, they were already stale. Decisions about staffing, routing, and capacity were based on yesterday’s reality, not today’s.
What they needed
The logistics team did not need another dashboard tool. They had tried two. Both required too much manual configuration and could not connect to their legacy WMS without custom middleware.
What they needed was a data pipeline — a system that automatically pulls data from their existing tools, transforms it into a consistent format, and feeds it into a single source of truth. From there, they could build dashboards, alerts, and reports that actually reflected what was happening in real time.
The requirements were clear:
- Ingest data from WMS, ERP, and TMS (a mix of APIs, SFTP exports, and database connections)
- Normalize shipment, inventory, and delivery data into a unified schema
- Run transformations every 15 minutes during operating hours
- Expose clean data through an internal API for dashboards and alerting
- Handle errors gracefully — if one source fails, the rest should keep running
How we built it
We started with a one-week discovery sprint. We mapped every data source, documented the formats, and identified the key metrics the operations team cared about. This upfront work saved weeks of back-and-forth later.
The pipeline architecture was intentionally simple. We used Python for the ingestion and transformation layers, Apache Airflow for orchestration, and PostgreSQL as the central data store. No data lake. No Spark cluster. The scale did not require it, and overengineering would have made the system harder to maintain with their small IT team.
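To make the orchestration concrete, here is a minimal sketch of what the Airflow side can look like. The DAG name, task names, and operating hours are assumptions for illustration, not the client's actual configuration.

```python
# Illustrative Airflow DAG: one extract task per source, run every
# 15 minutes during assumed operating hours (06:00-21:45).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(source: str) -> None:
    ...  # each connector's extract-validate-load logic lives here

with DAG(
    dag_id="logistics_pipeline",          # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="*/15 6-21 * * *",  # cron: every 15 min, 06:00-21:45
    catchup=False,
) as dag:
    for source in ("wms", "erp", "tms"):
        PythonOperator(
            task_id=f"extract_{source}",
            python_callable=extract,
            op_args=[source],
        )
```

Because the three extract tasks have no dependencies between them, Airflow runs them independently: one failing source does not block the others.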
Ingestion
Each data source got its own connector. The WMS exposed a REST API, so we built a lightweight Python client that polls for new records. The ERP exported CSV files to an SFTP server every hour — we wrote a watcher that picks up new files, validates the schema, and loads them. The TMS had a PostgreSQL database we could query directly with read-only credentials.
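As an illustration of the SFTP side, the schema check at the heart of the watcher can be a few lines of standard-library Python. The column names below are invented for the example; they are not the ERP's real export format.

```python
import csv
import io

# Columns the loader expects from the hourly ERP export.
# These names are invented for the example, not the real export format.
EXPECTED_COLUMNS = {"order_id", "invoice_status", "amount", "exported_at"}

def validate_csv_schema(raw: str) -> list[dict]:
    """Parse a CSV export and fail fast if required columns are missing."""
    reader = csv.DictReader(io.StringIO(raw))
    missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"export missing columns: {sorted(missing)}")
    return list(reader)

sample = (
    "order_id,invoice_status,amount,exported_at\n"
    "1001,invoiced,250.00,2024-05-01T10:00:00\n"
)
rows = validate_csv_schema(sample)
```

Rejecting a file with a changed header up front is cheaper than loading half of it and discovering the problem in the staging table.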
Every connector follows the same pattern: extract, validate, load into a staging table. If validation fails, the record gets flagged and logged, but the pipeline keeps running. This was critical. In logistics, a partial view is still better than no view.
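The shared pattern can be sketched like this; the record shape and validator are placeholders, and the real connectors write to PostgreSQL staging tables rather than returning lists.

```python
from typing import Callable

def stage_records(
    records: list[dict],
    validate: Callable[[dict], bool],
) -> tuple[list[dict], list[dict]]:
    """Split a batch into loadable rows and flagged rows.

    Invalid records are set aside for review instead of aborting
    the batch, so one bad row never stops the pipeline.
    """
    staged, flagged = [], []
    for record in records:
        (staged if validate(record) else flagged).append(record)
    return staged, flagged

batch = [{"shipment_id": "S1", "qty": 3}, {"shipment_id": None, "qty": 1}]
staged, flagged = stage_records(batch, lambda r: r["shipment_id"] is not None)
```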
Transformation
Raw data from three systems uses different identifiers, date formats, and status codes. A shipment marked as "completed" in the WMS might be "delivered" in the TMS and "invoiced" in the ERP. We built a mapping layer that translates these into a single, consistent status model.
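The core of that mapping layer is a lookup from (source system, source status) to one canonical status. The entries below are a small invented sample, not the full vocabulary of the three systems.

```python
# Maps (source system, source status) to one canonical status.
# These entries are examples, not the complete mapping.
STATUS_MAP = {
    ("wms", "completed"): "delivered",
    ("tms", "delivered"): "delivered",
    ("erp", "invoiced"): "delivered",
    ("wms", "picked"): "in_progress",
    ("tms", "in_transit"): "in_transit",
}

def canonical_status(source: str, status: str) -> str:
    """Translate a source-specific status into the unified model."""
    # Unmapped statuses are surfaced as "unknown" and flagged
    # downstream, rather than silently guessed.
    return STATUS_MAP.get((source, status.lower()), "unknown")
```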
The transformation step also calculates derived metrics: order-to-ship time, pick accuracy, delivery success rate, and warehouse utilisation by zone. These are the numbers the operations team actually uses to make decisions.
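Two of those derived metrics, sketched in isolation (the real versions run as SQL over the staging tables; function names and figures here are illustrative):

```python
from datetime import datetime

def order_to_ship_hours(ordered_at: datetime, shipped_at: datetime) -> float:
    """Elapsed hours between order creation and shipment."""
    return (shipped_at - ordered_at).total_seconds() / 3600

def pick_accuracy(correct_picks: int, total_picks: int) -> float:
    """Share of picks without errors, as a percentage."""
    return 100.0 * correct_picks / total_picks if total_picks else 100.0

hours = order_to_ship_hours(datetime(2024, 5, 1, 8, 0), datetime(2024, 5, 1, 20, 30))
accuracy = pick_accuracy(correct_picks=984, total_picks=1000)
```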
Serving
Clean, transformed data lands in a set of purpose-built PostgreSQL views. We exposed these through a lightweight FastAPI service that the front-end team used to build their internal dashboard. We also set up automated alerts — if pick accuracy drops below 98% or a depot falls behind on shipments, the warehouse manager gets a Slack notification within 15 minutes.
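The alerting rule itself is simple enough to sketch. The 98% threshold comes from the project; the metric names, depot name, and message format below are invented for the example, and the real system posts the resulting messages to Slack.

```python
# Minimum acceptable values per metric. The 98% pick-accuracy
# threshold is real; the metric key is an invented example.
THRESHOLDS = {"pick_accuracy_pct": 98.0}

def check_alerts(depot: str, metrics: dict[str, float]) -> list[str]:
    """Return Slack-ready messages for any metric below its threshold."""
    alerts = []
    for name, minimum in THRESHOLDS.items():
        value = metrics.get(name)
        if value is not None and value < minimum:
            alerts.append(f"[{depot}] {name} at {value:.1f}% (threshold {minimum:.1f}%)")
    return alerts

messages = check_alerts("Utrecht", {"pick_accuracy_pct": 97.2})
```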
What changed
The daily reporting process went from four hours of manual work to zero. The operations team now starts every morning with a live dashboard that shows exactly where things stand across all three locations.
Within the first month, they identified a recurring bottleneck at one depot that had been invisible in the old reports. A specific loading dock was consistently causing delays during the afternoon shift. The fix was simple — a scheduling adjustment — but they would never have spotted the pattern without the data.
More importantly, the system is maintainable. Their internal IT team can add new data sources, adjust transformations, and extend the dashboard without calling us every time. We designed it that way on purpose. Custom software development should make your team more capable, not more dependent.
Lessons from the project
A few lessons this project reinforced:
- Start with the questions, not the data. We mapped the decisions the team needed to make before touching any code. That kept the scope tight and the outcome useful.
- Simple beats clever. Python, PostgreSQL, and Airflow. No exotic stack. The operations team can understand and maintain it. That matters more than architectural elegance.
- Partial data is still valuable. The pipeline handles failures per source, not globally. If one connector goes down, the rest keep feeding. Perfect is the enemy of operational.
- Build for the team you are handing it to. We documented everything, paired with their developers during the build, and made sure they could extend the system independently.
Thinking about your own data challenges?
If your team spends more time collecting and formatting data than actually using it, that is a solvable problem. Data pipeline development in the Netherlands does not have to mean a massive enterprise project. Sometimes the right solution is a focused, well-built pipeline that connects what you already have.
At Emplex, we specialise in custom software development that fits your team and your scale. No overengineering. No vendor lock-in. Just software that works.