Stateful Server Load Balancing
All stateful services in BifroMQ are built on top of the base-kv distributed storage engine.
base-kv provides strong consistency, automatic sharding, and fault tolerance, forming the foundation for high availability and elastic scaling of stateful workloads.
Load distribution and availability are jointly managed by replicated shards and the Balancer framework, which continuously adapts the cluster topology to runtime conditions.
Core Principles
Replicated Shards and Quorum
Each stateful service cluster continuously replicates its data across multiple nodes using the Raft protocol.
- As long as a quorum of replicas for a shard (Range) remains alive (for example, 2 of 3 voters), the service continues to serve reads and writes.
- Node failures are tolerated transparently without manual intervention.
Leader-Based Access Model
- Each shard has a designated leader responsible for handling writes.
- Leaders are deliberately distributed across the cluster to avoid hotspots and ensure balanced utilization.
High availability therefore emerges from:
- Replica redundancy
- Deterministic leader placement
- Continuous topology adjustment
Stateful Services Built on base-kv
All BifroMQ stateful servers share the same architectural foundation but optimize their storage schema and access paths based on workload characteristics:
| Service | store_name (for API headers) | Role/Workload |
|---|---|---|
| DistWorker | dist.worker | subscription routing |
| InboxStore | inbox.store | persistent offline message queues |
| RetainStore | retain.store | retained message storage |
This design allows each service to:
- Use the same base-kv primitives
- Apply deep, workload-specific optimizations
- Reuse the same balancing and recovery mechanisms
Store Landscape
Stateful services run on an overlay cluster just like RPC services. Via the API you can inspect both the server topology and how Range replicas are placed across storage nodes.
Get store landscape
List the nodes of the stateful service overlay cluster.
Request
GET /store/landscape
Headers:
store_name: <STORE_NAME_HEADER>
Response
[
{
"hostId": "c3RvcmUtaWQ=", // Identity of the node in the Underlay Cluster
"id": "710dc192-4641-4b31-bde1-a36329b33273", // Identity of the stateful server instance in the Overlay Cluster
"address": "10.0.0.2", // server bind address
"port": 36801, // server bind port
"attributes": {
...
}
}
]
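For instance, the landscape can be fetched with any HTTP client. The sketch below uses Java's built-in java.net.http client; the base URL http://localhost:8091 is an assumption (substitute the address your API server actually listens on), and dist.worker is the DistWorker store_name from the table above.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StoreLandscapeExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // The /store/landscape path and the store_name header come from the
        // API description above; the base URL is deployment-specific.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8091/store/landscape"))
                .header("store_name", "dist.worker")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON array of overlay-cluster nodes
    }
}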
Get range placement in a store
List Range replicas hosted on a specific store server.
Request
GET /store/ranges
Headers:
store_name: <STORE_NAME_HEADER>
store_id: <STORE_ID_HEADER>
Response
[
{
"id": "115240914861228032_0", // the range id
"ver": 14, // the version of range
"boundary": {
// the range boundary
"startKey": null,
"endKey": null
},
"state": "Normal", // the range state
"role": "Leader", // the replica role
"clusterConfig": {
"voters": [
"710dc192-4641-4b31-bde1-a36329b33273",
"c2784a36-4509-41be-96bc-5809026bce99",
"cd360c5f-7693-40c6-af9c-541cc2467a00"
],
"learners": [],
"nextVoters": [],
"nextLearners": []
}
}
]
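Combining the two endpoints gives a quick per-store placement report. The following is a sketch under the same assumed base URL, using Jackson (an added dependency here) to parse the responses; the id field of each landscape entry is passed back as the store_id header.

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RangePlacementExample {
    static final String BASE = "http://localhost:8091"; // deployment-specific
    static final HttpClient CLIENT = HttpClient.newHttpClient();
    static final ObjectMapper MAPPER = new ObjectMapper();

    public static void main(String[] args) throws Exception {
        // Walk every node in the dist.worker overlay cluster and report
        // which Ranges each one currently leads.
        JsonNode stores = get("/store/landscape", "store_name", "dist.worker");
        for (JsonNode store : stores) {
            String storeId = store.get("id").asText();
            JsonNode ranges = get("/store/ranges",
                    "store_name", "dist.worker", "store_id", storeId);
            for (JsonNode range : ranges) {
                if ("Leader".equals(range.get("role").asText())) {
                    System.out.println(storeId + " leads " + range.get("id").asText());
                }
            }
        }
    }

    static JsonNode get(String path, String... headers) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(BASE + path))
                .headers(headers) // alternating name/value pairs
                .GET()
                .build();
        return MAPPER.readTree(CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body());
    }
}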
Balancer Framework Overview
The Balancer framework continuously shapes the base-kv cluster topology. Although a centralized coordinator is straightforward to implement, the framework is designed to enable distributed convergence, meaning there is:
- No central coordinator
- No single point of control
- No out-of-band orchestration
Achieving distributed convergence requires each balancer implementation to be deterministic:
- Every node observes a strongly eventually consistent view of the global cluster state
- Each balancer deterministically derives the same expected Range topology (the built-in balancers follow this pattern)
- Each node independently executes balance commands for the Ranges it currently leads
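The sketch below illustrates this contract in Java. Every type and method name in it is hypothetical (it is not BifroMQ's actual StoreBalancer SPI); it only shows why determinism yields convergence: the plan is a pure function of the observed state, so every node derives the same plan and executes only the commands for the Ranges it leads.

import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch only; these types do not exist in BifroMQ.
interface ClusterState {}   // the strongly eventually consistent observed state
interface RangeCommand {}   // e.g., move a leader, change voters, split a range

abstract class DeterministicBalancer {
    private final String localStoreId;

    DeterministicBalancer(String localStoreId) {
        this.localStoreId = localStoreId;
    }

    // Pure and deterministic: identical observed state yields an identical
    // plan (range id -> command) on every node.
    abstract Map<String, RangeCommand> expectedTopology(ClusterState observed);

    // The ranges whose leader replica is hosted on the given store.
    abstract Set<String> rangesLedBy(String storeId, ClusterState observed);

    // One balance cycle, run independently on every node: act only on the
    // ranges this node currently leads, so commands never conflict.
    List<RangeCommand> balanceOnce(ClusterState observed) {
        Map<String, RangeCommand> plan = expectedTopology(observed);
        Set<String> led = rangesLedBy(localStoreId, observed);
        return plan.entrySet().stream()
                .filter(e -> led.contains(e.getKey()))
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
    }
}

As the observed state converges across nodes, every node eventually derives an empty plan and balancing quiesces.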
Built-in StoreBalancers
BifroMQ ships with several built-in Balancers that cover common scenarios and can serve as references for custom implementations.
The framework lets Balancer implementations expose runtime-tunable rules and be started or paused via the API. Which balancers to enable, and the initial values of their rules, are set in configuration (for example, BalancerOptions.balancers, keyed by the balancer factory's FQN).
| Balancer | Focus | Rules |
|---|---|---|
| RangeLeaderBalancer | Evenly spread Range leaders to avoid hotspots | (none) |
| ReplicaCntBalancer | Keep replica counts aligned with goals (voters/learners) | votersPerRange: target voters per range (must be odd); learnersPerRange: target learners per range (-1 means no limit) |
| RangeSplitBalancer | Split "hot" Ranges to sustain throughput | maxCpuUsagePerRange: CPU threshold; maxIODensityPerRange: IO density cap; ioNanosLimitPerRange: IO latency cap (ns); maxRangesPerStore: per-store range cap |
BalancerOptions
BalancerOptions tells a BifroMQ process with DistWorker enabled which balancers to instantiate at startup and the initial values of their rules. BalancerOptions.balancers is a map keyed by the balancerFactory FQN, with a Struct payload for initial rules. For example, to start a ReplicaCntBalancer on DistWorker with default replica targets:
distWorkerConfig:
balanceConfig:
balancers:
org.apache.bifromq.dist.worker.balance.ReplicaCntBalancerFactory:
votersPerRange: 3
learnersPerRange: -1
Enable/Disable Balancer at Runtime
Balancers can be started or paused via the runtime API without restarting the service. This lets operators temporarily disable a balancer for observation or mitigation and re-enable it later; because balancers are deterministic, the cluster converges again once a balancer is reactivated. Note: the balancer_factory_class header must match the factory FQN of a balancer instance initialized via BalancerOptions.
Enable the balancer instances
PUT /store/balancer/enable
Headers:
store_name: <STORE_NAME_HEADER>
balancer_factory_class: <BALANCER_FACTORY_CLASS_FQN>
Disable the balancer instances
PUT /store/balancer/disable
Headers:
store_name: <STORE_NAME_HEADER>
balancer_factory_class: <BALANCER_FACTORY_CLASS_FQN>
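As a usage sketch (same assumed base URL as above), pausing and later re-enabling the ReplicaCntBalancer on the DistWorker cluster looks like this, using the factory FQN from the BalancerOptions example:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ToggleBalancerExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String factory = "org.apache.bifromq.dist.worker.balance.ReplicaCntBalancerFactory";
        // Pause the balancer instances across the dist.worker cluster.
        System.out.println(toggle(client, "disable", factory).statusCode());
        // ... observe or mitigate, then re-enable; determinism lets the
        // cluster converge again from its current topology.
        System.out.println(toggle(client, "enable", factory).statusCode());
    }

    static HttpResponse<String> toggle(HttpClient client, String action, String factory)
            throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8091/store/balancer/" + action)) // base URL is deployment-specific
                .header("store_name", "dist.worker")
                .header("balancer_factory_class", factory)
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString());
    }
}

The same calls work for InboxStore (inbox.store) and RetainStore (retain.store) by changing the store_name header.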
Update Balancer's rules at runtime
Balancer rules (the same Struct schema used in BalancerOptions.balancers) can be updated at runtime through the API. New rules take effect immediately, and subsequent balance cycles converge using the updated values.
Get rules override
Retrieve the rules override; a 404 is returned if no override is set.
Request
GET /store/balancer/rules
Headers:
store_name: <STORE_NAME_HEADER>
balancer_factory_class: <BALANCER_FACTORY_CLASS_FQN>
Response
{
"votersPerRange": 1.0
}
Merge rules override with existing rules
Merge a rules override with existing rules; the caller must ensure the payload is structurally valid.
PUT /store/balancer/rules
Headers:
store_name: <STORE_NAME_HEADER>
balancer_factory_class: <BALANCER_FACTORY_CLASS_FQN>
Body: balancer rules override (JSON)
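For example, raising the voter target of the ReplicaCntBalancer at runtime is a single PUT with the override as the request body (a sketch; base URL assumed as before):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class UpdateRulesExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // The override is merged into the effective rules; votersPerRange must be odd.
        String override = "{\"votersPerRange\": 5}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8091/store/balancer/rules")) // base URL is deployment-specific
                .header("store_name", "dist.worker")
                .header("balancer_factory_class",
                        "org.apache.bifromq.dist.worker.balance.ReplicaCntBalancerFactory")
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(override))
                .build();
        System.out.println(client.send(request, HttpResponse.BodyHandlers.ofString()).statusCode());
    }
}

Because the endpoint merges rather than replaces, fields omitted from the override (such as learnersPerRange here) keep their existing values.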
Get balancer states
Get the latest state of the balancer instances running on each stateful server.
Request
GET /store/balancer
Headers:
store_name: <STORE_NAME_HEADER>
balancer_factory_class: <BALANCER_FACTORY_CLASS_FQN>
Response
{
"org.apache.bifromq.dist.worker.balance.ReplicaCntBalancerFactory": {
"710dc192-4641-4b31-bde1-a36329b33273": { // store id running the balancer instance
"disable": false, // whether the balancer is disabled
"loadRules": { // effective load rules
"votersPerRange": 1.0,
"learnersPerRange": -1.0
},
"hlc": "115526170745044992" // hlc timestamp of last update
},
...
}
}
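To verify that a rules override has propagated, this endpoint can be polled and the effective loadRules inspected per store, for example (same assumptions as the earlier sketches):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BalancerStateExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8091/store/balancer")) // base URL is deployment-specific
                .header("store_name", "dist.worker")
                .header("balancer_factory_class",
                        "org.apache.bifromq.dist.worker.balance.ReplicaCntBalancerFactory")
                .GET()
                .build();
        // Each store id maps to that instance's disable flag, effective
        // loadRules, and the HLC timestamp of its last update.
        System.out.println(client.send(request, HttpResponse.BodyHandlers.ofString()).body());
    }
}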