Minimum One Example Per Class
operators.balancers.classification.minimum_one_example_per_class
type: MinimumOneExamplePerLabelRefiner
fields:
- reference_fields/label
Explanation about MinimumOneExamplePerLabelRefiner
A class used to return a specified number of instances while ensuring at least one example per label.
For each instance, a signature value is constructed from the values of the instance in the specified input "fields". MinimumOneExamplePerLabelRefiner takes the first instance that appears for each label (each unique signature), and then adds more elements up to the max_instances limit. In general, the refiner takes the first elements in the stream that meet the required conditions. It then shuffles the results, so that the output does not start with one instance from each class followed by the rest. If max_instances is not set, the original stream is used.
- Attributes:
fields (List[str]): A list of field names to be used in producing the instance's signature.
max_instances (Optional, int): Number of elements to select. Note that the max_instances of StreamRefiners that are passed to the recipe (e.g. "train_refiner", "test_refiner") is overridden by the recipe parameters (max_train_instances, max_test_instances).
- Usage:
balancer = MinimumOneExamplePerLabelRefiner(fields=["field1", "field2"], max_instances=200)
balanced_stream = balancer.process(stream)
- Example:
When input [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 1, "b": 3}, {"a": 1, "b": 4}, {"a": 2, "b": 5}] is fed into MinimumOneExamplePerLabelRefiner(fields=["a"], max_instances=3), the resulting stream will be: [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 2, "b": 5}] (order may be different)
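The selection logic described above can be sketched in plain Python. This is an illustrative approximation, not the library's implementation: the function name minimum_one_example_per_label, the seed parameter, the flat dictionary lookup for fields (no nested paths such as reference_fields/label), and the ValueError raised when there are more labels than max_instances are all assumptions made for this sketch.

```python
import random
from typing import Any, Dict, List, Optional

Instance = Dict[str, Any]


def minimum_one_example_per_label(
    instances: List[Instance],
    fields: List[str],
    max_instances: Optional[int] = None,
    seed: int = 0,  # hypothetical parameter, used only to make the shuffle repeatable
) -> List[Instance]:
    # With no limit, the original stream is returned as-is.
    if max_instances is None:
        return list(instances)

    # Pass 1: remember the index of the first instance seen for each signature.
    seen = set()
    first_indices = []
    for idx, inst in enumerate(instances):
        signature = tuple(inst[f] for f in fields)  # assumes flat, hashable values
        if signature not in seen:
            seen.add(signature)
            first_indices.append(idx)

    # Assumption of this sketch: fail loudly if the limit cannot cover every label.
    if len(first_indices) > max_instances:
        raise ValueError("max_instances is smaller than the number of labels")

    # Pass 2: fill the remaining slots with the earliest instances in the stream.
    chosen = set(first_indices)
    for idx in range(len(instances)):
        if len(chosen) >= max_instances:
            break
        chosen.add(idx)

    # Shuffle so the per-label picks are not grouped together in the output.
    selected = [instances[i] for i in sorted(chosen)]
    random.Random(seed).shuffle(selected)
    return selected


data = [
    {"a": 1, "b": 1},
    {"a": 1, "b": 2},
    {"a": 1, "b": 3},
    {"a": 1, "b": 4},
    {"a": 2, "b": 5},
]
result = minimum_one_example_per_label(data, fields=["a"], max_instances=3)
# Sorting removes the effect of the shuffle so the selection is easy to inspect.
print(sorted(result, key=lambda inst: inst["b"]))
# prints [{'a': 1, 'b': 1}, {'a': 1, 'b': 2}, {'a': 2, 'b': 5}]
```

On the example input, the first pass picks the first instance with a=1 and the only instance with a=2, and the second pass fills the third slot with the next earliest instance, which reproduces the result stated in the example.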
Read more about catalog usage here.