Data Modeling 🚧
🚧 Roadmap Item - Unified data-modelling builds on our existing data contract foundation to provide a comprehensive approach to industrial data modeling.
UMH Core's unified data-modelling system provides a structured approach to defining, validating, and processing industrial data. It bridges the gap between raw sensor data and meaningful business information through a clear hierarchy of components.
Why Data Modeling Matters
Manufacturing companies typically start with implicit data modeling: using bridges to contextualize data factory by factory. This bottom-up approach works well for single sites: look at what's available in your PLC/Kepware, add basic metadata, rename cryptic tags like XYDAG324 to temperature, and publish to the UNS.
But as companies scale across multiple factories, they hit a wall:
Inconsistent schemas: Each site names the same equipment differently (motor_speed vs rpm vs rotational_velocity)
No standardization: Pump data from Factory A has different fields than identical pumps in Factory B
Analytics nightmares: Cross-site dashboards and analytics require custom mapping for every location
Knowledge silos: Each site's contextualization is trapped in local configurations
Explicit data modeling solves this by defining standardized templates that enforce consistency across the entire enterprise. Instead of each factory doing its own contextualization, you define once: "Every Pump has these exact fields: pressure, temperature, motor.current, motor.rpm" and then apply that template everywhere.
From Implicit to Explicit
| Approach | Scope | Benefits | Limitations |
| --- | --- | --- | --- |
| Implicit (Current Bridges) | Per-factory contextualization | Quick setup, site-specific optimization | Inconsistent across sites, no templates |
| Explicit (Data Modeling) | Enterprise-wide standardization | Consistent schemas, reusable templates, cross-site analytics | Requires upfront design, more rigid |
UMH's unified data-modelling bridges this gap: keep the flexibility of per-site bridges for raw data collection, but add explicit modeling on top for enterprise standardization.
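To make this layering concrete, here is a minimal Python sketch rather than UMH configuration. Each site keeps its own tag mapping, and a shared Pump template is enforced on top. The site names, local tag names, and the `apply_pump_template` helper are hypothetical; only the four template fields come from the text above.

```python
# Standard template: the Pump fields named earlier in this page.
PUMP_FIELDS = ("pressure", "temperature", "motor.current", "motor.rpm")

# Hypothetical per-site mappings from local tag names to template fields.
SITE_TAG_MAPPINGS = {
    "factory_a": {
        "press_bar": "pressure",
        "temp_c": "temperature",
        "motor_amps": "motor.current",
        "motor_speed": "motor.rpm",
    },
    "factory_b": {
        "p1": "pressure",
        "t1": "temperature",
        "i_motor": "motor.current",
        "rotational_velocity": "motor.rpm",
    },
}


def apply_pump_template(site, raw_tags):
    """Rename site-specific tags to template fields and enforce completeness."""
    mapping = SITE_TAG_MAPPINGS[site]
    # Keep only tags the site mapping knows about, under their template names.
    record = {mapping[tag]: value for tag, value in raw_tags.items() if tag in mapping}
    missing = [name for name in PUMP_FIELDS if name not in record]
    if missing:
        raise ValueError(f"{site}: missing template fields {missing}")
    return record


pump_a = apply_pump_template(
    "factory_a",
    {"press_bar": 4.2, "temp_c": 61.0, "motor_amps": 12.5, "motor_speed": 1480},
)
pump_b = apply_pump_template(
    "factory_b",
    {"p1": 3.9, "t1": 58.5, "i_motor": 11.8, "rotational_velocity": 1475},
)
```

Both records end up with the same field names, so a cross-site dashboard can query `motor.rpm` without knowing which local tag it came from.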
Object Hierarchy
The unified data-modelling system uses a four-layer hierarchy:
| Layer | Purpose | Examples |
| --- | --- | --- |
| Payload Shape | Canonical schema fragment (timeseries default) | timeseries, blob |
| Data Model | Reusable class; tree of fields, folders, sub-models | Motor, Pump, Temperature |
| Data Contract | Binds model version; decides retention & sinks | _temperature:v1, _pump:v1 |
| Stream Processor | Runtime pipeline for model instances | furnaceTemp_sp, pump41_sp |
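One way to picture how the four layers reference each other is as nested Python dataclasses. This is an illustrative sketch only, not the UMH configuration format. The class attributes, the `timestamp_ms`, `value`, and `temperature_in_c` field names, the retention period, and the sink name are all assumptions; the instance names are the examples given for each layer.

```python
from dataclasses import dataclass


@dataclass
class PayloadShape:
    name: str
    fields: dict  # field name -> type name (assumed layout)


@dataclass
class DataModel:
    name: str
    version: str
    structure: dict  # field path -> payload shape name


@dataclass
class DataContract:
    name: str
    model: DataModel  # binds one specific model version
    retention_days: int  # hypothetical retention setting
    sinks: tuple  # hypothetical sink names


@dataclass
class StreamProcessor:
    name: str
    contract: DataContract
    sources: dict  # model field -> raw source it is fed from


timeseries = PayloadShape("timeseries", {"timestamp_ms": "number", "value": "number"})
temperature = DataModel("Temperature", "v1", {"temperature_in_c": timeseries.name})
contract = DataContract("_temperature:v1", temperature, retention_days=365, sinks=("database",))
processor = StreamProcessor("furnaceTemp_sp", contract, {"temperature_in_c": "XYDAG324"})
```

Each layer points only at the layer beneath it, so a stream processor can be resolved down to the payload shape of every field it emits.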
Quick Example
Here's how the system transforms raw PLC data into structured, validated information:
1. Raw Data Input
2. Data Model Definition
3. Data Contract
4. Stream Processor
5. Structured Output
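The five steps above can be approximated in plain Python to show the flow from a raw tag to a structured message. This is a sketch under stated assumptions, not UMH's actual YAML or message format. The dictionary layouts, the `process` function, the `temperature_in_c` field name, and the sample reading are hypothetical; the tag XYDAG324 and the names `_temperature:v1` and `furnaceTemp_sp` are taken from earlier on this page.

```python
# 1. Raw data input: a hypothetical PLC reading under a cryptic tag name.
raw = {"tag": "XYDAG324", "value": 1500.0, "timestamp_ms": 1_700_000_000_000}

# 2. Data model: the fields a Temperature instance exposes (assumed field name).
temperature_model = {"name": "Temperature", "version": "v1", "fields": ("temperature_in_c",)}

# 3. Data contract: binds that model version under a contract name.
contract = {"name": "_temperature:v1", "model": temperature_model}

# 4. Stream processor: maps raw tags onto model fields for one asset.
processor = {
    "name": "furnaceTemp_sp",
    "contract": contract,
    "mapping": {"temperature_in_c": "XYDAG324"},
}


def process(processor, raw):
    """Turn one raw reading into zero or more structured messages."""
    messages = []
    for field_name, source_tag in processor["mapping"].items():
        if raw["tag"] != source_tag:
            continue  # this reading does not feed this model field
        messages.append(
            {
                "contract": processor["contract"]["name"],
                "field": field_name,
                "timestamp_ms": raw["timestamp_ms"],
                "value": raw["value"],
            }
        )
    return messages


# 5. Structured output: the reading, now named and tied to a contract.
structured = process(processor, raw)
```

A reading from a tag the processor does not map produces an empty list, so unrelated raw data never appears under the contract.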
Key Benefits
Unified YAML Dialect: Single configuration language for all transformations
Schema Registry Integration: All layers pushed to Redpanda Schema Registry
Automatic Validation: UNS output plugin rejects non-compliant messages
Sub-Model Reusability: Define once, use across multiple assets
Enterprise Reliability: Combines the simplicity of MQTT at the edge with the reliability of Kafka for enterprise messaging
Generic Hierarchical Support: Built-in hierarchical naming (level0-4+) supports ISA-95, KKS, or custom standards
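The automatic validation benefit can be illustrated with a small Python check that rejects messages not matching a model. This is a hedged sketch of the idea, not the UNS output plugin's implementation. The message layout, the `validate` function, and the `TEMPERATURE_V1` model are hypothetical.

```python
from numbers import Real

# Hypothetical model: field name -> expected payload shape.
TEMPERATURE_V1 = {"temperature_in_c": "timeseries"}


def validate(message, model):
    """Return a list of problems; an empty list means the message complies."""
    errors = []
    field_name = message.get("field")
    if field_name not in model:
        errors.append(f"unknown field: {field_name!r}")
    timestamp = message.get("timestamp_ms")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(timestamp, bool) or not isinstance(timestamp, int):
        errors.append("timestamp_ms must be an integer")
    value = message.get("value")
    if isinstance(value, bool) or not isinstance(value, Real):
        errors.append("value must be a number")
    return errors


good = {"field": "temperature_in_c", "timestamp_ms": 1_700_000_000_000, "value": 21.5}
bad = {"field": "temp", "timestamp_ms": 1_700_000_000_000, "value": "hot"}

good_errors = validate(good, TEMPERATURE_V1)
bad_errors = validate(bad, TEMPERATURE_V1)
```

Returning every problem at once, instead of failing on the first, makes it easier to fix a misconfigured source in a single pass.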
Getting Started
Architecture Context
This unified approach builds on UMH's hybrid architecture, combining:
MQTT for lightweight edge communication
Kafka for reliable enterprise messaging
Data Contracts for application-level guarantees
Schema Registry for centralized validation