Version: 14.x

Farm Data Overview

Farm Data is a high-performance microservice that serves as a core building block in Mia Platform's Data Fabric ecosystem. It implements sophisticated stream processing capabilities for Fast Data aggregation scenarios, enabling real-time data consolidation across multiple streaming sources.

Purpose and Architecture

Core Functionality

Farm Data specializes in stream aggregation and joining, allowing organizations to:

Merge Multiple Data Streams: combine events from different Kafka topics into coherent Single Views
Maintain Stateful Processing: leverage persistent state management to correlate events across time and streams
Generate Single Views: produce unified data representations by aggregating related events from multiple sources
Process in Real-Time: handle high-throughput streaming data with minimal latency

Service Architecture

The service follows a modular, event-driven architecture consisting of several key components:

┌─────────────────┐    ┌───────────────┐    ┌──────────────────┐
│   Kafka Input   │───▶│  Aggregation  │───▶│   Kafka Output   │
│     Streams     │    │   Processor   │    │   (Single View)  │
└─────────────────┘    └───────────────┘    └──────────────────┘
                               │
                               ▼
                       ┌───────────────┐
                       │   MongoDB     │
                       │ (Persistent   │
                       │    State)     │
                       └───────────────┘

Key Architectural Components

Stream Management Layer: Handles Kafka consumption and production with advanced features like:
- Consumer group coordination and rebalancing
- Producer delivery guarantees
- Message compression and batching
- Comprehensive telemetry and metrics
Aggregation Processing Layer: Implements two distinct aggregation strategies:
- Incremental Updates (IU) Aggregation: optimized for scenarios with frequent updates to existing records
- Full Aggregation: complete recomputation approach for complex dependencies
Persistence Layer: MongoDB-based state management supporting:
- Atomic read-write operations for consistency
- Advanced querying with aggregation pipelines
- Legacy and modern MongoDB driver support
- Efficient parent-child relationship handling

Processing Models

Aggregation Graph

Farm Data operates on user-defined Directed Acyclic Graphs (DAGs) where:

Nodes represent individual data streams (Kafka topics)
Edges define relationships and filtering logic between streams
Filters specify how data flows between connected nodes

Example aggregation scenario:

User Events ──┐
              ├─▶ User Profile Single View
Order Events ─┘

Farm Data Overview

Purpose and Architecture

Core Functionality

Service Architecture

Key Architectural Components

Processing Models

Aggregation Graph

Processing Strategies

1. Incremental Updates (IU) Aggregation

2. Full Aggregation

Purpose and Architecture​

Core Functionality​

Service Architecture​

Key Architectural Components​

Processing Models​

Aggregation Graph​

Processing Strategies​

1. Incremental Updates (IU) Aggregation​

2. Full Aggregation​

Purpose and Architecture

Core Functionality

Service Architecture

Key Architectural Components

Processing Models

Aggregation Graph

Processing Strategies

1. Incremental Updates (IU) Aggregation

2. Full Aggregation