📄 Diverse Labels Sampler

splitters.diverse_labels_sampler

Explanation about DiverseLabelsSampler

Selects a balanced sample of instances based on an output field.

(Used for selecting demonstrations for in-context learning.)

The field must contain a list of values, e.g. ['dog'], ['cat'], ['dog','cat','cow']. The balancing is done so that each value or combination of values appears as equally as possible in the samples.

The choices param is required and determines which values should be considered.

Example:

If choices is ['dog','cat'], then the following combinations will be considered: [''] ['cat'] ['dog'] ['dog','cat']
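The combinations listed above are every subset of the choices, including the empty one. The snippet below is only an illustrative sketch of that enumeration using the standard library; `label_combinations` is a hypothetical helper and not part of the sampler's API, and it represents the empty combination as `[]` where the text above writes `['']`.

```python
from itertools import combinations


def label_combinations(choices):
    """Return every subset of `choices`, from the empty subset to the full set.

    Hypothetical helper for illustration; not part of the library.
    """
    ordered = sorted(choices)  # fixed order so results are deterministic
    return [
        list(subset)
        for size in range(len(ordered) + 1)
        for subset in combinations(ordered, size)
    ]


combos = label_combinations(["dog", "cat"])
print(combos)
```

With n choices there are 2**n such combinations, so the number of groups to balance grows quickly as choices are added.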

If the instance contains a value that is not in the 'choices' param, that value is ignored. For example, if choices is ['dog','cat'] and the instance field is ['dog','cat','cow'], then 'cow' is ignored and the instance is considered as ['dog','cat'].
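The rule for ignoring unknown values can be sketched as a simple filter. This is a minimal illustration, assuming labels and choices are plain Python lists of strings; `restrict_to_choices` is a hypothetical name and the library's internal logic may differ.

```python
def restrict_to_choices(labels, choices):
    """Drop any label that is not among `choices` (hypothetical helper)."""
    allowed = set(choices)
    # Sort so that the same combination always has the same representation.
    return sorted({label for label in labels if label in allowed})


reduced = restrict_to_choices(["dog", "cat", "cow"], ["dog", "cat"])
only_unknown = restrict_to_choices(["cow"], ["dog", "cat"])
print(reduced, only_unknown)
```

An instance whose labels are all outside the choices ends up with an empty list, which corresponds to the empty combination in the example above.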

Args:
sample_size (int):

number of samples to extract

choices (str):

name of input field that contains the list of values to balance on

labels (str):

name of output field with labels that must be balanced
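To show how the three arguments fit together, here is a rough sketch of one way to draw a balanced sample. It is not the library's implementation: it assumes each instance is a flat dict holding both fields, it uses a round-robin over label combinations as the balancing strategy, and `balanced_sample` and the `seed` parameter are hypothetical.

```python
import random
from collections import defaultdict


def balanced_sample(instances, sample_size, choices="choices", labels="labels", seed=0):
    """Pick `sample_size` instances, spreading picks across label combinations.

    `choices` and `labels` are field names, mirroring the Args above.
    Assumption: instances are flat dicts; the real data layout may differ.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for instance in instances:
        allowed = set(instance[choices])
        # Group key: the instance's labels restricted to the allowed choices.
        key = tuple(sorted(set(instance[labels]) & allowed))
        groups[key].append(instance)
    for members in groups.values():
        rng.shuffle(members)

    picked = []
    keys = sorted(groups)
    # Round-robin: take one instance per combination until enough are picked
    # or every group is exhausted.
    while len(picked) < sample_size and any(groups[k] for k in keys):
        for k in keys:
            if groups[k] and len(picked) < sample_size:
                picked.append(groups[k].pop())
    return picked


pool = [{"choices": ["dog", "cat"], "labels": ["dog"]} for _ in range(4)]
pool.append({"choices": ["dog", "cat"], "labels": ["cat"]})
pool.append({"choices": ["dog", "cat"], "labels": ["cow"]})
sample = balanced_sample(pool, sample_size=3)
print([s["labels"] for s in sample])
```

In this pool 'dog' dominates, yet a sample of three contains one instance from each of the three combinations present (empty, 'cat', 'dog'), which is the balancing effect described above.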

Read more about catalog usage here.