Diverse Labels Sampler
splitters.diverse_labels_sampler
Explanation about DiverseLabelsSampler
Selects a balanced sample of instances based on an output field.
(Used for selecting demonstrations for in-context learning.)
The field must contain a list of values, e.g. ['dog'], ['cat'], ['dog','cat','cow']. The balancing is done so that each value or combination of values appears as equally as possible in the sample.
The choices param is required and determines which values should be considered.
- Example:
If choices is ['dog','cat'], then the following combinations will be considered: [''], ['cat'], ['dog'], ['dog','cat'].
If the instance contains a value that is not in the choices param, that value is ignored. For example, if choices is ['dog','cat'] and the instance field is ['dog','cat','cow'], then 'cow' is ignored and the instance is treated as ['dog','cat'].
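The filtering and balancing described above can be illustrated with a small standalone sketch. This is not the library's implementation: the helper names label_signature and balanced_sample, the "labels" field name, and the round-robin strategy are all assumptions made for illustration, and the empty combination is represented here as an empty tuple rather than [''].

```python
from collections import defaultdict


def label_signature(labels, choices):
    """Drop labels that are not in choices; return a sorted tuple as the group key."""
    return tuple(sorted(set(labels) & set(choices)))


def balanced_sample(instances, choices, sample_size, labels_field="labels"):
    """Hypothetical helper: pick instances round-robin across label combinations."""
    # Group instances by their filtered label combination.
    groups = defaultdict(list)
    for inst in instances:
        groups[label_signature(inst[labels_field], choices)].append(inst)

    # Visit the groups in a fixed (sorted) order, taking one instance per group
    # per pass, until the sample is full or every group is exhausted.
    pools = [list(group) for _, group in sorted(groups.items())]
    sample = []
    while len(sample) < sample_size and any(pools):
        for pool in pools:
            if pool and len(sample) < sample_size:
                sample.append(pool.pop(0))
    return sample


instances = [
    {"text": "a", "labels": ["dog"]},
    {"text": "b", "labels": ["dog"]},
    {"text": "c", "labels": ["dog"]},
    {"text": "d", "labels": ["cat"]},
    {"text": "e", "labels": ["dog", "cat", "cow"]},  # 'cow' is ignored
    {"text": "f", "labels": ["cow"]},                # reduces to the empty combination
]
picked = balanced_sample(instances, choices=["dog", "cat"], sample_size=4)
print([inst["text"] for inst in picked])
```

With four label combinations present and a sample size of four, each combination contributes one instance, even though 'dog' alone accounts for half of the input.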
- Args:
- sample_size (int):
number of samples to extract
- choices (str):
name of input field that contains the list of values to balance on
- labels (str):
name of output field with labels that must be balanced
Read more about catalog usage here.