📄 Minimum One Example Per Class

operators.balancers.classification.minimum_one_example_per_class

type: MinimumOneExamplePerLabelRefiner
fields: 
  - reference_fields/label

Explanation about MinimumOneExamplePerLabelRefiner

A class used to return a specified number of instances while ensuring at least one example per label.

For each instance, a signature value is constructed from the values of the instance in the specified input 'fields'. MinimumOneExamplePerLabelRefiner takes the first instance that appears for each label (each unique signature), and then adds more elements up to the max_instances limit. In general, the refiner takes the first elements in the stream that meet the required conditions. MinimumOneExamplePerLabelRefiner then shuffles the results, so that the output does not start with one instance from each class followed by the rest. If max_instances is not set, the original stream is used.

Attributes:

fields (List[str]): A list of field names used to produce the instance's signature.
max_instances (Optional, int): Number of elements to select. Note that the max_instances of StreamRefiners that are passed to the recipe (e.g. 'train_refiner', 'test_refiner') is overridden by the recipe parameters (max_train_instances, max_test_instances).

Usage:

balancer = MinimumOneExamplePerLabelRefiner(fields=["field1", "field2"], max_instances=200)
balanced_stream = balancer.process(stream)

Example:

When input [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 1, "b": 3}, {"a": 1, "b": 4}, {"a": 2, "b": 5}] is fed into MinimumOneExamplePerLabelRefiner(fields=["a"], max_instances=3), the resulting stream will be: [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 2, "b": 5}] (order may be different).
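
The selection logic described in this section can be approximated in plain Python. This is an illustrative sketch only, not the library's implementation: the function name `select_min_one_per_label` and the `seed` parameter are hypothetical, signatures are built from flat dictionary keys rather than nested field paths, and truncating to max_instances when there are more labels than max_instances is an assumption.

```python
import random
from typing import Any, Dict, List, Optional


def select_min_one_per_label(
    instances: List[Dict[str, Any]],
    fields: List[str],
    max_instances: Optional[int] = None,
    seed: int = 0,  # hypothetical parameter, used only to make the shuffle repeatable
) -> List[Dict[str, Any]]:
    """Illustrative approximation of the selection described above."""
    # With no limit set, the original stream is returned as-is.
    if max_instances is None:
        return list(instances)

    def signature(instance: Dict[str, Any]) -> tuple:
        # Assumption: the signature is the tuple of stringified field values.
        return tuple(str(instance[field]) for field in fields)

    seen = set()
    first_per_label = []  # indices of the first instance of each signature
    remaining = []        # indices of all other instances, in stream order
    for index, instance in enumerate(instances):
        sig = signature(instance)
        if sig not in seen:
            seen.add(sig)
            first_per_label.append(index)
        else:
            remaining.append(index)

    # One instance per label first, then fill up from the rest of the stream.
    chosen = (first_per_label + remaining)[:max_instances]
    selected = [instances[i] for i in chosen]

    # Shuffle so the per-label representatives are not all at the front.
    random.Random(seed).shuffle(selected)
    return selected


stream = [
    {"a": 1, "b": 1},
    {"a": 1, "b": 2},
    {"a": 1, "b": 3},
    {"a": 1, "b": 4},
    {"a": 2, "b": 5},
]
result = select_min_one_per_label(stream, fields=["a"], max_instances=3)
print(sorted(result, key=lambda item: item["b"]))
```

Sorting the result by "b" removes the effect of the shuffle, which makes it easy to compare the selected instances with the example above.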

Read more about catalog usage here.