📄 By Label

operators.balancers.classification.by_label

type: DeterministicBalancer
fields: 
  - reference_fields/label

Explanation about DeterministicBalancer

A class used to balance streams deterministically.

For each instance, a signature is constructed from the values the instance holds in the specified input 'fields'. By discarding instances from the input stream, DeterministicBalancer maintains an equal number of instances for every signature. When 'max_instances' is also specified, DeterministicBalancer additionally keeps the total instance count from exceeding 'max_instances'. As few instances as possible are discarded.
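The balancing rule described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the names `get_path`, `signature`, and `balance` are hypothetical, the slash-separated path lookup is assumed from the `reference_fields/label` field of this catalog entry, and a missing field is assumed to contribute `None` to the signature.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Optional, Tuple


def get_path(instance: Dict[str, Any], path: str) -> Any:
    """Resolve a slash-separated path such as 'reference_fields/label' (hypothetical helper)."""
    value: Any = instance
    for key in path.split("/"):
        if not isinstance(value, dict) or key not in value:
            return None  # assumption: a missing field contributes None
        value = value[key]
    return value


def signature(instance: Dict[str, Any], fields: List[str]) -> Tuple[str, ...]:
    """Build a hashable signature from the instance's values in the given fields."""
    return tuple(repr(get_path(instance, field)) for field in fields)


def balance(
    instances: Iterable[Dict[str, Any]],
    fields: List[str],
    max_instances: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Keep the same number of instances per signature, preserving input order."""
    instances = list(instances)  # two passes are needed: count, then filter
    counts = Counter(signature(inst, fields) for inst in instances)
    if not counts:
        return []

    # The rarest signature bounds how many instances every signature may keep.
    per_signature = min(counts.values())
    if max_instances is not None:
        # Split the total budget evenly across the distinct signatures.
        per_signature = min(per_signature, max_instances // len(counts))

    kept: Counter = Counter()
    balanced = []
    for inst in instances:
        sig = signature(inst, fields)
        if kept[sig] < per_signature:
            kept[sig] += 1
            balanced.append(inst)
    return balanced


labeled = [{"reference_fields": {"label": label}} for label in ["x", "x", "x", "y", "y"]]
print(len(balance(labeled, ["reference_fields/label"])))                   # two per label
print(len(balance(labeled, ["reference_fields/label"], max_instances=2)))  # one per label
```

Keeping the first instances seen for each signature, rather than sampling, is what makes the sketch deterministic: the same input always yields the same output.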

Attributes:

fields (List[str]): A list of field names used to produce the instance's signature.
max_instances (Optional, int): If specified, an upper bound on the total number of instances kept.

Usage:

balancer = DeterministicBalancer(fields=["field1", "field2"], max_instances=200)
balanced_stream = balancer.process(stream)

Example:

When input [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 2}, {"a": 3}, {"a": 4}] is fed into DeterministicBalancer(fields=["a"]), the resulting stream will be: [{"a": 1, "b": 1}, {"a": 2}, {"a": 3}, {"a": 4}]
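To see why exactly one instance is dropped in this example, it can help to count the signatures before and after balancing. The short check below uses only the standard library and the two lists quoted above; the variable names are illustrative.

```python
from collections import Counter

before = [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 2}, {"a": 3}, {"a": 4}]
after = [{"a": 1, "b": 1}, {"a": 2}, {"a": 3}, {"a": 4}]

# With fields=["a"], the signature depends only on the value of "a".
counts_before = Counter(inst["a"] for inst in before)
counts_after = Counter(inst["a"] for inst in after)

print(counts_before)  # the value 1 occurs twice, every other value once
print(counts_after)   # every value of "a" occurs exactly once
```

The rarest signature occurs once, so each signature may keep one instance; only the second instance with "a" equal to 1 has to be discarded.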

Read more about catalog usage here.