Data Modeling
Data modeling in UMH Core transforms device data into business-ready information through structured schemas and validation.
The Component Chain
Payload Shapes → Data Models → Data Contracts → Data Flows
      ↓              ↓               ↓              ↓
 Value types     Structure      Enforcement     Execution
Each component builds on the previous:
Payload Shapes define what types of values are allowed
Data Models use shapes to create hierarchical structure
Data Contracts enforce models at runtime
Data Flows execute with or without contracts
What You Can Model
In any UNS topic, data modeling controls specific portions:
umh.v1.enterprise.site.area.line._contract.virtual.path.name
└───────── fixed ─────────┘ └── modeled ──┘
Fixed: Location hierarchy comes from bridge configuration
Modeled: Everything after the contract is defined by your data model
Virtual path: Organizational folders (e.g., motor.electrical)
Name: The actual data point (e.g., current)
See Topic Convention for complete structure.
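The split between the fixed and modeled portions can be sketched in a few lines of Python. This is an illustration only, not UMH Core code: the `ParsedTopic` type and `parse_topic` function are hypothetical names, and the sketch assumes the contract is the first segment that starts with an underscore.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedTopic:
    location_path: str  # fixed part, comes from bridge configuration
    contract: str       # e.g. "_pump_v1" or "_raw"
    virtual_path: str   # organizational folders, may be empty
    name: str           # the actual data point


def parse_topic(topic: str) -> ParsedTopic:
    """Split a UNS topic into its fixed and modeled portions."""
    segments = topic.split(".")
    if segments[:2] != ["umh", "v1"]:
        raise ValueError(f"not a umh.v1 topic: {topic!r}")
    # Assumption: the contract is the first segment starting with "_".
    for i, segment in enumerate(segments):
        if segment.startswith("_"):
            break
    else:
        raise ValueError(f"no contract segment in {topic!r}")
    modeled = segments[i + 1:]
    if not modeled:
        raise ValueError(f"no name after the contract in {topic!r}")
    return ParsedTopic(
        location_path=".".join(segments[2:i]),
        contract=segment,
        virtual_path=".".join(modeled[:-1]),
        name=modeled[-1],
    )
```

Applied to a topic such as `umh.v1.enterprise.chicago.line-1.pump._pump_v1.motor.electrical.current`, the sketch separates the location path `enterprise.chicago.line-1.pump` from the virtual path `motor.electrical` and the name `current`.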
Data Flow Patterns
Data can follow these patterns based on your needs:
Device Language (_raw)
Start by exploring your equipment data with original naming:
Device → Bridge → _raw → Topic Browser
Example: umh.v1.enterprise.chicago.line-1.pump._raw.DB1.DW20
Exploration, debugging, quick connectivity
Site engineers who know the PLC addressing
Original tags preserved (e.g., s7_address: "DB1.DW20")
Device Models
Apply business naming directly in bridges:
Device → Bridge + Model → _pump_v1 → Applications
(one step)
Example: umh.v1.enterprise.chicago.line-1.pump._pump_v1.inlet_temperature
Consistent naming across equipment types
Operations teams, local dashboards
Most implementations start here
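As a rough illustration of what "business naming in one step" means, the Python sketch below maps a PLC address to a modeled topic while keeping the original tag as metadata. It is not the tag_processor itself; the function name is hypothetical, and pairing DB1.DW20 with inlet_temperature is assumed purely for the example.

```python
# Hypothetical mapping from PLC addresses to business names. The pairing of
# DB1.DW20 with inlet_temperature is an assumption made for illustration.
PUMP_V1_NAMES = {"DB1.DW20": "inlet_temperature"}


def to_device_model(location_path: str, s7_address: str) -> tuple[str, dict]:
    """Return the modeled topic plus metadata preserving the original tag."""
    name = PUMP_V1_NAMES[s7_address]  # KeyError if the address is unmapped
    topic = f"umh.v1.{location_path}._pump_v1.{name}"
    metadata = {"s7_address": s7_address}
    return topic, metadata
```

Calling `to_device_model("enterprise.chicago.line-1.pump", "DB1.DW20")` yields the example topic shown above together with the preserved `s7_address`.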
Business Models
Transform device data into business KPIs:
Multiple _pump_v1 → Stream Processor → _maintenance_v1 → Enterprise Apps
Example: umh.v1.enterprise.chicago._maintenance_v1.work_orders.create
Aggregated metrics, business records
Enterprise systems, management dashboards
Required when scaling across sites
The Two-Layer Architecture
Sites and HQ both need their view of the same data. This isn't a choice between approaches - it's about enabling both layers to work together:
Layer 1: Device Models (Data Structure Within Equipment)
Define WHAT data points exist in equipment (temperature, vibration)
NOT the organizational structure (that's location_path)
Sites maintain control of their data definitions
Original tags preserved in metadata
Applied directly in bridges (one step)
Primarily time-series data
Result: Sites trust the system because they built it
Layer 2: Business Models (Enterprise Metrics)
Created in TWO ways:
Aggregation: Stream processors combine device models into KPIs
Direct: Bridges connect to ERP/MES systems for business data
Creates consistent metrics across sites
Doesn't disturb site operations
Multiple departments can create their own views
Primarily relational data
Result: Everyone gets the metrics they need
The key: Device models describe equipment internals, business models describe enterprise needs.
Why Both Layers Matter
Common failure patterns:
Only device models: Chaos across multiple sites, no standardization
Only business models: Sites lose control, engineers reject the system
The success formula: Sites own their data structure (device models), everyone creates their views (business models). This is why stream processors exist - to bridge these layers without forcing change on sites.
Key Concepts
Location Path vs Device Model
Location Path: WHERE equipment sits in your organization
Example: enterprise.chicago.packaging.line-1.pump-01
Defined in bridge configuration
This is your factory hierarchy
Device Model: WHAT data points exist within that equipment
Example: _pump_v1.temperature, _pump_v1.vibration.x-axis
Defines internal data structure of a single device
NOT the organizational structure
Name vs Tag
Name: The data point identifier in UNS topics
Tag: Industry term for a sensor/data point
We use "name" for broader applicability (not just time-series)
Virtual Path
Organizational folders within your data model:
Example: motor.electrical, diagnostics.vibration
Purpose: Groups related data points logically within a device
Data Contract vs Data Model
Data Model: Defines structure (template)
Data Contract: Enforces structure (runtime validation)
Example: Creating a pump model auto-creates the _pump_v1 contract
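The template-versus-enforcement distinction can be sketched in Python. The model below is a hypothetical nested dictionary (its fields are assumptions, not a real UMH schema), `contract_name` mirrors the pump to _pump_v1 naming described above, and `validate` stands in for the kind of runtime check a contract performs.

```python
# Hypothetical model: leaves are value types, nested dicts are folders.
PUMP_MODEL_V1 = {
    "temperature": float,
    "vibration": {"x-axis": float},
}


def contract_name(model_name: str, version: int) -> str:
    """Derive the contract name from a model name and version."""
    return f"_{model_name}_v{version}"


def validate(model: dict, dotted_name: str, value) -> bool:
    """Check that dotted_name exists in the model and value has its type."""
    node = model
    for part in dotted_name.split("."):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    # A folder (dict) is not a data point, so it cannot hold a value.
    return isinstance(node, type) and isinstance(value, node)
```

The model only describes structure; nothing is rejected until something like `validate` is applied to incoming data.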
Time-Series vs Relational
Time-Series: Single value with timestamp
Example: {"timestamp_ms": 1733904005123, "value": 42.5}
Relational: Multiple fields in one message
Example: Work order with 10 fields
See Payload Formats for details.
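A small Python sketch can make the difference concrete. It assumes a time-series payload has exactly the two keys shown above; the work order fields are invented for illustration, and `is_time_series` is a hypothetical helper.

```python
import json


def is_time_series(payload: dict) -> bool:
    """Assumption: time-series payloads carry exactly timestamp_ms and value."""
    return set(payload) == {"timestamp_ms", "value"}


time_series = json.loads('{"timestamp_ms": 1733904005123, "value": 42.5}')

# Hypothetical relational payload: several related fields in one message.
work_order = {
    "order_id": "WO-1001",
    "asset": "pump-01",
    "priority": "high",
    "created_ms": 1733904005123,
}
```

Here `is_time_series(time_series)` is true and `is_time_series(work_order)` is false.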
Processing Methods
In Bridges:
tag_processor: For time-series data from PLCs/sensors
nodered_js: For relational data from ERP/MES systems
In Stream Processors:
JavaScript expressions: For aggregating and transforming data
Example: total: "sensor1 + sensor2 + sensor3"
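To show what such an expression computes, here is a Python stand-in (stream processors themselves use JavaScript expressions). The `TotalProcessor` class is hypothetical, and emitting a total only once every source has reported is an assumption of this sketch, not documented behavior.

```python
class TotalProcessor:
    """Keeps the latest value per source and recomputes their sum."""

    def __init__(self, sources):
        self.latest = {source: None for source in sources}

    def update(self, source, value):
        if source not in self.latest:
            raise KeyError(f"unknown source: {source}")
        self.latest[source] = value
        # Assumption: emit nothing until every source has a value.
        if any(v is None for v in self.latest.values()):
            return None
        return sum(self.latest.values())
```

Each new reading replaces the previous value for its source, so the total always reflects the most recent state of all three sensors.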
Choosing Your Data Flow
Based on your data source, choose the appropriate path:
Data Source            | Target         | Processor     | Path
PLC/Sensor             | Device Model   | tag_processor | Direct to _pump_v1
PLC/Sensor             | Raw            | tag_processor | Direct to _raw (exploration)
ERP/MES                | Business Model | nodered_js    | Direct to _workorder_v1
Multiple Device Models | Business Model | -             | Via Stream Processor
Data Type Alignment
Device Models: 90% time-series data
Temperature, pressure, vibration readings
Status indicators, counters, running hours
Process with: tag_processor in bridges
Business Models: 90% relational data
Work orders, maintenance schedules
Production batches, quality reports
Process with: nodered_js in bridges OR aggregate via stream processors
Implementation Patterns
Pattern 1: Equipment Monitoring
PLC tags → Bridge + tag_processor → Device Model (_pump_v1)
Location: enterprise.site.line.pump-01
Model adds: .temperature, .pressure, .status
When: Connecting industrial equipment with time-series data
Pattern 2: ERP Integration
SAP work orders → Bridge + nodered_js → Business Model (_workorder_v1)
When: Connecting business systems with relational data
Pattern 3: Multi-Site Aggregation
site1._pump_v1 ─┐
site2._pump_v1 ─├→ Stream Processor → _maintenance_v1
site3._pump_v1 ─┘ (JavaScript expressions)
When: Creating KPIs from multiple device models
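The aggregation step in Pattern 3 can be sketched in Python, again as a stand-in for the JavaScript expressions a stream processor would use. The topics, the choice of a mean as the KPI, and the function name are all assumptions for illustration.

```python
from statistics import mean


def mean_pump_temperature(latest_values: dict):
    """Average the temperature of every _pump_v1 topic, ignoring the rest."""
    readings = [
        value
        for topic, value in latest_values.items()
        if "._pump_v1." in topic and topic.endswith(".temperature")
    ]
    return mean(readings) if readings else None


# Hypothetical latest values from three sites, plus one unmodeled _raw topic.
latest = {
    "umh.v1.enterprise.site1.line-1.pump._pump_v1.temperature": 60.0,
    "umh.v1.enterprise.site2.line-1.pump._pump_v1.temperature": 70.0,
    "umh.v1.enterprise.site3.line-1.pump._pump_v1.temperature": 80.0,
    "umh.v1.enterprise.site1.line-1.pump._raw.DB1.DW20": 999.0,
}
```

Because every site publishes under the same `_pump_v1` contract, the aggregation can select inputs by contract and name without knowing anything about each site's PLC addressing.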
Pattern 4: Exploration First
Unknown PLC → Bridge + tag_processor → _raw → Topic Browser
↓
Design device model
↓
Apply in bridge
When: Understanding new equipment before modeling
Why Models Are Immutable
The scenario: A data scientist builds a dashboard using _pump_v1 for 500 pumps across 10 sites.
Without immutability: Someone modifies _pump_v1:
Dashboard breaks - fields renamed
Historical queries fail - structure changed
Other sites stop working
Weekend ruined
With immutability:
Create _pump_v2 with changes
Test thoroughly
Migrate gradually
Deprecate v1 when safe
Dashboard keeps working throughout
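The rule "never edit a version, publish a new one" can be sketched as a small registry in Python. `ModelRegistry` and the field definitions are hypothetical; the sketch only demonstrates the idea and does not describe how UMH Core stores models.

```python
from types import MappingProxyType


class ModelRegistry:
    """Stores model versions and refuses to overwrite an existing one."""

    def __init__(self):
        self._models = {}

    def register(self, name: str, version: int, fields: dict) -> None:
        key = (name, version)
        if key in self._models:
            raise ValueError(
                f"_{name}_v{version} already exists; create a new version"
            )
        # A read-only view over a private copy prevents later mutation.
        self._models[key] = MappingProxyType(dict(fields))

    def get(self, name: str, version: int):
        return self._models[(name, version)]
```

With this shape, changes arrive as `register("pump", 2, ...)` while consumers of version 1 keep reading an unchanged definition.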
Common Questions
When should I use data models?
Start with _raw for exploration. Add models when:
You need consistent naming across sites
Multiple applications consume the data
Validation becomes critical
You're ready for production
How do I handle different equipment versions?
Create separate models:
_pump_v1 for older pumps
_pump_v2 for newer pumps with more sensors
Stream processors can aggregate both
What about equipment-specific data?
Use virtual paths to organize:
_pump_v1.motor.temperature
_pump_v1.motor.current
_pump_v1.diagnostics.vibration
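The folder structure implied by these dotted names can be made visible with a short Python sketch. `build_tree` is a hypothetical helper that groups names by their virtual path; leaves are stored as None because the sketch tracks structure only, not values.

```python
def build_tree(dotted_names):
    """Group dotted names into nested folders; leaves map to None."""
    tree = {}
    for dotted in dotted_names:
        *folders, leaf = dotted.split(".")
        node = tree
        for folder in folders:
            node = node.setdefault(folder, {})
        node[leaf] = None
    return tree


tree = build_tree([
    "motor.temperature",
    "motor.current",
    "diagnostics.vibration",
])
```

The result has two folders, `motor` (containing `temperature` and `current`) and `diagnostics` (containing `vibration`).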
Next Steps
Define value types: Payload Shapes
Create structure: Data Models
Enforce validation: Data Contracts
Transform data: Stream Processors
Learn More
Getting Started Guide - See data modeling in action
Bridges - Apply models during ingestion
Topic Convention - Understand topic structure