unitxt.stream module
- class unitxt.stream.MultiStream(data=None)
Bases:
dict
A class for handling multiple streams of data in a dictionary-like format.
This class extends dict and its values should be instances of the Stream class.
- data
A dictionary of Stream objects.
- Type:
dict
- classmethod from_generators(generators: Dict[str, ReusableGenerator], caching=False, copying=False)
Creates a MultiStream from a dictionary of ReusableGenerators.
- Parameters:
generators (Dict[str, ReusableGenerator]) – A dictionary of ReusableGenerators.
caching (bool, optional) – Whether the data should be cached or not. Defaults to False.
copying (bool, optional) – Whether the data should be copied or not. Defaults to False.
- Returns:
A MultiStream object.
- Return type:
MultiStream
- classmethod from_iterables(iterables: Dict[str, Iterable], caching=False, copying=False)
Creates a MultiStream from a dictionary of iterables.
- Parameters:
iterables (Dict[str, Iterable]) – A dictionary of iterables.
caching (bool, optional) – Whether the data should be cached or not. Defaults to False.
copying (bool, optional) – Whether the data should be copied or not. Defaults to False.
- Returns:
A MultiStream object.
- Return type:
MultiStream
- get_generator(key)
Gets a generator for a specified key.
- Parameters:
key (str) – The key for the generator.
- Yields:
object – The next value in the stream.
- set_caching(caching: bool)
- set_copying(copying: bool)
- to_dataset(disable_cache=True, cache_dir=None) → DatasetDict
- to_iterable_dataset() → IterableDatasetDict
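The idea behind from_iterables and the copying flag can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `SketchStream` and `sketch_from_iterables` are hypothetical stand-ins that do not import unitxt, and the assumption that copying means "yield a deep copy of each instance" is ours, not something the reference above states.

```python
import copy
from typing import Any, Callable, Dict, Iterable, Iterator


class SketchStream:
    """Hypothetical stand-in for a re-iterable stream (not the unitxt class)."""

    def __init__(self, source: Callable[[], Iterable[Any]], copying: bool = False):
        self.source = source    # called anew on every iteration
        self.copying = copying  # assumed meaning: yield deep copies of instances

    def __iter__(self) -> Iterator[Any]:
        for instance in self.source():
            yield copy.deepcopy(instance) if self.copying else instance


def sketch_from_iterables(
    iterables: Dict[str, Iterable[Any]], copying: bool = False
) -> Dict[str, SketchStream]:
    # Bind each iterable through a default argument to avoid late binding.
    return {
        name: SketchStream(lambda it=it: it, copying=copying)
        for name, it in iterables.items()
    }


data = {"train": [{"x": 1}, {"x": 2}], "test": [{"x": 3}]}
streams = sketch_from_iterables(data, copying=True)

first_pass = list(streams["train"])
first_pass[0]["x"] = 99  # mutate an instance that was yielded as a copy
second_pass = list(streams["train"])  # the underlying data is unchanged
```

In this sketch, mutating an instance from one pass does not affect later passes because each instance is copied on the way out; without copying, the mutation would leak into the underlying list.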
- class unitxt.stream.Stream(gen_kwargs: Dict[str, any] = {}, caching: bool = False, copying: bool = False)
Bases:
Dataclass
A class for handling streaming data in a customizable way.
This class provides methods for generating, caching, and manipulating streaming data.
- generator
A generator function for streaming data.
- Type:
function
- gen_kwargs
A dictionary of keyword arguments for the generator function.
- Type:
dict, optional
- caching
Whether the data is cached or not.
- Type:
bool
- peek()
- take(n)
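The reference gives no description for peek() and take(n). Assuming they follow the usual convention (peek returns the first instance without exhausting the stream, take yields at most the first n instances), the behaviour could be sketched with plain Python iterables. The helper names `sketch_peek` and `sketch_take` are hypothetical and are not part of unitxt.

```python
from itertools import islice
from typing import Any, Iterable, Iterator


def sketch_peek(stream: Iterable[Any]) -> Any:
    """Return the first instance of a re-iterable stream (assumed semantics)."""
    return next(iter(stream))


def sketch_take(stream: Iterable[Any], n: int) -> Iterator[Any]:
    """Yield at most the first n instances (assumed semantics)."""
    return islice(stream, n)


numbers = range(10)  # a re-iterable source, so peeking does not consume it
first = sketch_peek(numbers)
head = list(sketch_take(numbers, 3))
```

The sketch relies on the source being re-iterable: each call starts a fresh iterator, which is why peeking first does not change what take returns afterwards.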