unitxt.fusion module

class unitxt.fusion.BaseFusion(__tags__: Dict[str, str] = {}, caching: bool = None, origins: List[SourceOperator], include_splits: List[str] | None = None)

Bases: SourceOperator

BaseFusion operator that combines multiple streams into one.

Parameters:
  • origins – List of SourceOperator objects.

  • include_splits – List of splits to include. If None, all splits are included.
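The include_splits behavior described above can be illustrated with a small plain-Python sketch. It does not use unitxt; select_splits is a hypothetical helper, and representing splits as a dict of lists is an assumption made only for illustration.

```python
from typing import Dict, List, Optional


def select_splits(
    available: Dict[str, list],
    include_splits: Optional[List[str]] = None,
) -> Dict[str, list]:
    """Hypothetical helper: keep only the requested splits.

    Mirrors the documented rule: if include_splits is None,
    every split is kept.
    """
    if include_splits is None:
        return dict(available)
    return {name: data for name, data in available.items() if name in include_splits}


splits = {"train": [1, 2], "validation": [3], "test": [4, 5]}

all_splits = select_splits(splits)                        # None keeps everything
only_eval = select_splits(splits, ["validation", "test"])  # drops "train"
```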

class unitxt.fusion.FixedFusion(__tags__: Dict[str, str] = {}, caching: bool = None, origins: List[SourceOperator], include_splits: List[str] | None = None, max_instances_per_origin: int | None = None)

Bases: BaseFusion

FixedFusion operator that combines multiple streams into one, taking up to a fixed number of instances from each origin.

Parameters:
  • origins – List of SourceOperator objects.

  • max_instances_per_origin – Maximum number of instances to take from each origin. If None, all instances are returned.

  • include_splits – List of splits to include. If None, all splits are included.
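As a rough illustration of the per-origin cap, the sketch below fuses plain Python iterables rather than SourceOperator objects. It is not the library's implementation: fixed_fuse is a hypothetical function, and emitting the origins one after another in list order is an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional


def fixed_fuse(
    origins: List[Iterable],
    max_instances_per_origin: Optional[int] = None,
) -> Iterator:
    """Hypothetical sketch: yield at most N items from each origin in turn."""
    for origin in origins:
        # islice(origin, None) yields the whole origin, matching the
        # documented "If None, all ... are returned" behavior.
        yield from islice(origin, max_instances_per_origin)


origin_a = ["a1", "a2", "a3"]
origin_b = ["b1"]

capped = list(fixed_fuse([origin_a, origin_b], max_instances_per_origin=2))
uncapped = list(fixed_fuse([origin_a, origin_b]))
```

An origin shorter than the cap simply contributes everything it has, so the fused stream may be smaller than the cap multiplied by the number of origins.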

class unitxt.fusion.WeightedFusion(__tags__: Dict[str, str] = {}, caching: bool = None, origins: List[SourceOperator] = None, include_splits: List[str] | None = None, weights: List[float] = None, max_total_examples: int = None)

Bases: BaseFusion

WeightedFusion operator that combines multiple streams into one based on a weight assigned to each origin.

Parameters:
  • origins – List of SourceOperator objects.

  • weights – List of weights for each origin.

  • max_total_examples – Total number of examples to return. If None, all examples are returned.
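One way to picture weight-based fusion is the plain-Python sketch below. It is an illustration under stated assumptions, not the library's code: weighted_fuse and its seed parameter are hypothetical, origins are ordinary iterables, and the next origin is picked at random in proportion to its weight until every origin is exhausted or max_total_examples is reached.

```python
import random
from typing import Iterable, Iterator, List, Optional


def weighted_fuse(
    origins: List[Iterable],
    weights: List[float],
    max_total_examples: Optional[int] = None,
    seed: int = 0,  # hypothetical parameter, only to make the sketch reproducible
) -> Iterator:
    """Hypothetical sketch: sample the next origin proportionally to its weight."""
    rng = random.Random(seed)
    iterators = [iter(origin) for origin in origins]
    active = list(range(len(iterators)))  # indices of origins not yet exhausted
    emitted = 0
    while active and (max_total_examples is None or emitted < max_total_examples):
        idx = rng.choices(active, weights=[weights[i] for i in active], k=1)[0]
        try:
            item = next(iterators[idx])
        except StopIteration:
            active.remove(idx)  # stop sampling from an exhausted origin
            continue
        yield item
        emitted += 1


origin_a = [f"a{i}" for i in range(6)]
origin_b = [f"b{i}" for i in range(3)]

capped = list(weighted_fuse([origin_a, origin_b], [0.75, 0.25], max_total_examples=4))
everything = list(weighted_fuse([origin_a, origin_b], [0.75, 0.25]))
```

The exact interleaving depends on the random seed, but the order of items within each origin is preserved, and with no cap every item from every origin is eventually returned.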