unitxt.utils module

class unitxt.utils.DistStub(project_name, version)[source]

Bases: object

exception unitxt.utils.DistributionNotFound(requirement)[source]

Bases: Exception

class unitxt.utils.LRUCache(max_size: int | None = 10)[source]

Bases: object

clear()[source]

Clear all items from the cache.

get(key, default=None)[source]

class unitxt.utils.LongString(value, *, repr_str=None)[source]

Bases: str

class unitxt.utils.Singleton[source]

Bases: type

exception unitxt.utils.VersionConflict(dist, req)[source]

Bases: Exception

unitxt.utils.artifacts_json_cache(artifact_path)[source]

unitxt.utils.deep_copy(obj)[source]

Creates a deep copy of the given object.

Parameters:

obj – The object to be deep copied.

Returns:

A deep copy of the original object.
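To illustrate what a deep copy guarantees, here is a minimal sketch using the standard library's copy module. It assumes that deep_copy behaves like copy.deepcopy and shallow_copy like copy.copy; the source does not state how either is implemented, so treat this as an illustration of the semantics rather than of the library's code.

```python
import copy

original = {"scores": [1, 2, 3]}

# Assumption: copy.copy / copy.deepcopy stand in for shallow_copy / deep_copy.
shallow = copy.copy(original)      # new outer dict, same inner list
deep = copy.deepcopy(original)     # new outer dict and new inner list

# Mutate the inner list of the original after copying.
original["scores"].append(4)

print(shallow["scores"])  # the shallow copy sees the change
print(deep["scores"])     # the deep copy does not
```

The deep copy is the safer choice when the copied object holds nested mutable containers that may be modified later.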

unitxt.utils.flatten_dict(d: Dict[str, Any], parent_key: str = '', sep: str = '_') → Dict[str, Any][source]

unitxt.utils.import_module_from_file(file_path)[source]

unitxt.utils.is_module_available(module_name)[source]

Check if a module is available in the current Python environment.

Parameters:

module_name (str) – The name of the module to check.

Returns:

True if the module is available, False otherwise.

Return type:

bool
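One common way to perform such a check is to ask the import system whether it can locate the module, without actually importing it. The sketch below is a stand-in under that assumption; the function name module_available is hypothetical, and the library's own implementation may differ (for example, by attempting a real import).

```python
import importlib.util


def module_available(module_name: str) -> bool:
    """Hypothetical stand-in: True if the import system can locate the module."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages of a dotted name, so a missing
        # parent raises ModuleNotFoundError (a subclass of ImportError).
        return False


print(module_available("json"))
print(module_available("no_such_module_xyz_123"))
```

This kind of check is typically used to guard optional dependencies before importing them.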

unitxt.utils.is_package_installed(package_name)[source]

Check if a package is installed.

Parameters:

package_name (str) – The name of the package to check.

Returns:

True if the package is installed, False otherwise.

Return type:

bool

unitxt.utils.json_dump(data)[source]

unitxt.utils.load_json(path)[source]

unitxt.utils.lru_cache_decorator(max_size=128)[source]

unitxt.utils.recursive_copy(obj, internal_copy=None)[source]

Recursively copies an object with a selective copy method.

For list, dict, and tuple types, it recursively copies their contents. For other types, it uses the provided internal_copy function if available. Objects without a copy method are returned as is.

Parameters:
  • obj – The object to be copied.

  • internal_copy (callable, optional) – The copy function to use for non-container objects. If None, objects without a copy method are returned as is.

Returns:

The recursively copied object.
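The behaviour described above can be sketched as follows. This is an illustration only: the name recursive_copy_sketch is hypothetical, and it assumes that container types are rebuilt element by element while every other object is either passed to internal_copy or, when internal_copy is None, returned unchanged.

```python
import copy


def recursive_copy_sketch(obj, internal_copy=None):
    """Hypothetical sketch: rebuild containers, delegate leaves to internal_copy."""
    if isinstance(obj, dict):
        return {k: recursive_copy_sketch(v, internal_copy) for k, v in obj.items()}
    if isinstance(obj, list):
        return [recursive_copy_sketch(v, internal_copy) for v in obj]
    if isinstance(obj, tuple):
        return tuple(recursive_copy_sketch(v, internal_copy) for v in obj)
    if internal_copy is None:
        # Assumption: without a copy function, leaf objects are shared as is.
        return obj
    return internal_copy(obj)


nested = {"ids": [1, 2], "tags": ({"a"}, "x")}

plain = recursive_copy_sketch(nested)                # containers rebuilt, leaves shared
deep = recursive_copy_sketch(nested, copy.deepcopy)  # leaves deep-copied as well

print(plain["ids"] is nested["ids"])          # False: the list was rebuilt
print(plain["tags"][0] is nested["tags"][0])  # True: the set is a shared leaf
print(deep["tags"][0] is nested["tags"][0])   # False: the set was deep-copied
```

Passing a different internal_copy is what distinguishes a recursive shallow copy from a recursive deep copy in this sketch.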

unitxt.utils.recursive_deep_copy(obj)[source]

Performs a recursive deep copy of the given object.

This function uses deep_copy as the internal copy method for non-container objects.

Parameters:

obj – The object to be deep copied.

Returns:

A recursively deep-copied version of the original object.

unitxt.utils.recursive_shallow_copy(obj)[source]

Performs a recursive shallow copy of the given object.

This function uses shallow_copy as the internal copy method for non-container objects.

Parameters:

obj – The object to be shallow copied.

Returns:

A recursively shallow-copied version of the original object.

unitxt.utils.remove_numerics_and_quoted_texts(input_str)[source]

unitxt.utils.require(requirements)[source]

Minimal drop-in replacement for pkg_resources.require.

Accepts a single requirement string or a list of them. Raises DistributionNotFound or VersionConflict. Returns nothing (side-effect only).

unitxt.utils.retry_connection_with_exponential_backoff(max_retries=None, retry_exceptions=(requests.exceptions.ConnectionError, requests.exceptions.Timeout, requests.exceptions.HTTPError, FileNotFoundError, urllib.error.HTTPError), backoff_factor=1)[source]

Decorator that implements retry with exponential backoff for network operations.

Also retries when an error was triggered by one of the specified retry exceptions, whether as its direct cause or as part of its exception context.

Parameters:
  • max_retries – Maximum number of retry attempts (falls back to settings if None)

  • retry_exceptions – Tuple of exceptions that should trigger a retry

  • backoff_factor – Base delay factor in seconds for backoff calculation

Returns:

The decorated function with retry logic
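A decorator of this kind can be sketched as below. This is not the library's implementation: the names retry_with_backoff and _chain_contains and the injectable sleep parameter are hypothetical, the delay formula backoff_factor * 2 ** attempt is an assumption, max_retries is treated here as the total number of attempts, and built-in exception types replace the requests ones so the example is self-contained.

```python
import functools
import time


def _chain_contains(exc, exception_types):
    """Return True if exc, or anything in its cause/context chain, matches."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        if isinstance(exc, exception_types):
            return True
        seen.add(id(exc))
        exc = exc.__cause__ or exc.__context__
    return False


def retry_with_backoff(max_retries=3,
                       retry_exceptions=(ConnectionError, TimeoutError),
                       backoff_factor=1,
                       sleep=time.sleep):
    """Hypothetical sketch of retry with exponential backoff."""
    attempts = max(1, max_retries)  # assumption: always try at least once

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    is_last = attempt == attempts - 1
                    if is_last or not _chain_contains(exc, retry_exceptions):
                        raise
                    # Assumed schedule: the delay doubles after every failure.
                    sleep(backoff_factor * (2 ** attempt))
        return wrapper
    return decorator


# Usage: record the delays instead of sleeping, so the demo runs instantly.
delays = []
calls = {"n": 0}


@retry_with_backoff(max_retries=4, backoff_factor=0.5, sleep=delays.append)
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


print(flaky(), calls["n"], delays)
```

Walking the cause/context chain lets the sketch retry when a library wraps a connection error in its own exception type.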

unitxt.utils.safe_eval(expression: str, context: dict, allowed_tokens: list) → any[source]

Evaluates a given expression in a restricted environment, allowing only specified tokens and context variables.

Parameters:
  • expression (str) – The expression to evaluate.

  • context (dict) – A dictionary mapping variable names to their values, which can be used in the expression.

  • allowed_tokens (list) – A list of strings representing allowed tokens (such as operators, function names, etc.) that can be used in the expression.

Returns:

The result of evaluating the expression.

Return type:

any

Raises:

ValueError – If the expression contains tokens not in the allowed list or context keys.

Note

This function should be used carefully, as it employs eval, which can execute arbitrary code. The function attempts to mitigate security risks by restricting the available tokens and not exposing built-in functions.
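The general technique of checking tokens against an allow-list before calling eval can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the name restricted_eval is hypothetical, the standard tokenize module is used to split the expression (the library may check tokens differently), and numeric and string literals are assumed to be always permitted.

```python
import io
import tokenize

# Token types that are skipped: literals and layout markers.
_SKIPPED = {
    tokenize.NUMBER,
    tokenize.STRING,
    tokenize.NEWLINE,
    tokenize.NL,
    tokenize.ENDMARKER,
}


def restricted_eval(expression, context, allowed_tokens):
    """Hypothetical sketch: eval only if every token is explicitly allowed."""
    allowed = set(context) | set(allowed_tokens)
    for tok in tokenize.generate_tokens(io.StringIO(expression).readline):
        if tok.type in _SKIPPED:
            continue
        if tok.string not in allowed:
            raise ValueError(
                f"Token {tok.string!r} is not allowed in {expression!r}"
            )
    # An empty __builtins__ mapping keeps built-in functions out of reach.
    return eval(expression, {"__builtins__": {}}, context)


print(restricted_eval("x + y * 2", {"x": 1, "y": 3}, ["+", "*"]))  # 7

try:
    restricted_eval("__import__('os')", {}, ["(", ")"])
except ValueError as err:
    print("rejected:", err)
```

Even with such checks, an allow-list is only as safe as the tokens placed on it, so the list should be kept as small as the use case permits.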

unitxt.utils.save_to_file(path, data)[source]

unitxt.utils.shallow_copy(obj)[source]

Creates a shallow copy of the given object.

Parameters:

obj – The object to be shallow copied.

Returns:

A shallow copy of the original object.