unitxt.type_utils module
- class unitxt.type_utils.NormalizedType(origin: None | type | TypeVar, args: tuple | frozenset = ())
Bases: tuple
Normalized type that makes it possible to compare and hash types.
- args: tuple | frozenset
Alias for field number 1
- origin: None | type | TypeVar
Alias for field number 0
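The idea of a normalized, hashable type can be illustrated with a minimal sketch. The names NormalizedTypeSketch and normalize_sketch are hypothetical stand-ins written for this example, not the unitxt implementation; the sketch assumes that union arguments are stored as a frozenset so that their order does not matter.

```python
import typing
from typing import NamedTuple, Union


class NormalizedTypeSketch(NamedTuple):
    """Hypothetical stand-in for NormalizedType: an (origin, args) pair."""

    origin: object
    args: Union[tuple, frozenset] = ()


def normalize_sketch(type_):
    """Simplified normalization (an assumption, not the unitxt algorithm)."""
    origin = typing.get_origin(type_)
    if origin is None:
        # Plain classes such as int have no origin; store them directly.
        return NormalizedTypeSketch(type_)
    args = tuple(normalize_sketch(arg) for arg in typing.get_args(type_))
    if origin is Union:
        # Unions are unordered, so keep their arguments in a frozenset.
        return NormalizedTypeSketch(origin, frozenset(args))
    return NormalizedTypeSketch(origin, args)


a = normalize_sketch(typing.List[Union[int, str]])
b = normalize_sketch(typing.List[Union[str, int]])
print(a == b, hash(a) == hash(b))
```

Because the result is a tuple of hashable parts, it can be compared with == and used as a dictionary key or set member.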
- exception unitxt.type_utils.UnsupportedTypeError(type_object)
Bases: ValueError
- unitxt.type_utils.convert_union_type(type_string: str) -> str
Converts Python 3.10 union type hints into a form compatible with Python 3.9.
- Parameters:
type_string (str) –
A string representation of a Python type hint. It can be any valid Python type that does not contain strings (e.g. ‘Literal’). Examples include ‘List[int|float]’, ‘str|float|bool’, etc.
Formally, the function relies on the input string following the rules below. Assuming that the input is a valid type hint, the function does not check that a ‘word’ is ‘str’, ‘bool’, ‘List’, etc. It relies only on the following general structure (spaces ignored):
type -> word OR type( | type)* OR word[type( , type)*]
where word is a sequence of zero or more characters, each being any character except: [ ] , |
This implies that if any of these four characters appears not as a meta character of the input type_string but inside some constant string (of a Literal, for example), the scheme will not work.
Cases like Literal, which might contain occurrences of the four characters above that are not meta characters of the type string, must be handled as special cases by this function. Because ‘format_type_string’ serves as preprocessing for ‘parse_type_string’, which has a list of allowed types that does not include Literal, Literal and similar types are not relevant at present; the case is mentioned here only as an example for future use.
- Returns:
A type string with converted union types, which is compatible with typing module.
- Return type:
str
Examples
convert_union_type('List[int|float]') -> 'List[Union[int,float]]'
convert_union_type('Optional[int|float|bool]') -> 'Optional[Union[int,float,bool]]'
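The grammar described above (word, type | type, and word[type, type]) can be handled with a short recursive routine. The following is a minimal sketch under that grammar only; split_top_level and convert_union_sketch are hypothetical names for this example and are not the unitxt implementation.

```python
def split_top_level(text, delimiter):
    """Split text at occurrences of delimiter that are outside any brackets."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == delimiter and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def convert_union_sketch(type_string):
    """Rewrite 'a|b' as 'Union[a,b]' at every nesting level."""
    type_string = type_string.replace(" ", "")
    alternatives = split_top_level(type_string, "|")
    if len(alternatives) > 1:
        converted = ",".join(convert_union_sketch(alt) for alt in alternatives)
        return "Union[" + converted + "]"
    if "[" not in type_string:
        return type_string  # a plain word such as 'int'
    head, _, rest = type_string.partition("[")
    inner = rest[:-1]  # drop the closing bracket that matches the first '['
    args = split_top_level(inner, ",")
    return head + "[" + ",".join(convert_union_sketch(arg) for arg in args) + "]"


print(convert_union_sketch("List[int|float]"))
```

As with the documented function, this sketch treats the four characters [ ] , | purely as meta characters, so it would mishandle a Literal whose string contents include any of them.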
- unitxt.type_utils.eval_forward_ref(ref, forward_refs=None)
Evaluate forward references in all CPython versions.
- unitxt.type_utils.format_type_string(type_string: str) -> str
Formats a string representing a valid Python type hint so that it is compatible with Python 3.9 notation.
- Parameters:
type_string (str) – A string representation of a Python type hint. This can be any valid type, which does not contain strings (e.g. ‘Literal’). Examples include ‘List[int]’, ‘Dict[str, Any]’, ‘Optional[List[str]]’, etc.
- Returns:
A formatted type string.
- Return type:
str
Examples
format_type_string('list[int | float]') -> 'List[Union[int,float]]'
format_type_string('dict[str, Optional[str]]') -> 'Dict[str,Optional[str]]'
The function formats a valid type string (in either pre- or post-Python 3.10 notation) into a form compatible with Python 3.9. This is done by capitalizing the first letter of a lowercase type name and converting the ‘bitwise or’ operator into ‘Union’ notation. The function also removes whitespace and the redundant module name from type names imported from the ‘typing’ module, e.g. ‘typing.Tuple’ -> ‘Tuple’.
Currently, capitalization is applied only to the types that unitxt allows, i.e. ‘list’, ‘dict’, ‘tuple’. Moreover, the function expects the input not to contain types that contain strings, for example ‘Literal’.
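The name-normalization steps (whitespace removal, dropping the ‘typing.’ prefix, and capitalizing ‘list’, ‘dict’, ‘tuple’) can be sketched with regular expressions. The function normalize_names_sketch is a hypothetical helper for illustration; it deliberately leaves out the union conversion step and is not the unitxt implementation.

```python
import re


def normalize_names_sketch(type_string):
    """Apply only the name-level formatting steps; unions are left untouched."""
    # Remove all whitespace.
    result = re.sub(r"\s+", "", type_string)
    # Drop the redundant module prefix, e.g. 'typing.Tuple' -> 'Tuple'.
    result = result.replace("typing.", "")
    # Capitalize the lowercase container names as whole words only.
    return re.sub(
        r"\b(list|dict|tuple)\b",
        lambda match: match.group(1).capitalize(),
        result,
    )


print(normalize_names_sketch("dict[str, Optional[str]]"))
```

The word boundaries in the pattern keep the substitution from touching longer identifiers that merely contain ‘list’, ‘dict’ or ‘tuple’.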
- unitxt.type_utils.get_args(type_) -> Tuple
Get type arguments with all substitutions performed.
For unions, basic simplifications used by Union constructor are performed.
Examples
Here are some code examples using get_args from the typing_utils module:
from typing import Callable, Dict, Tuple, TypeVar, Union
from typing_utils import get_args

T = TypeVar("T")

# Examples of get_args usage
get_args(Dict[str, int]) == (str, int)  # True
get_args(int) == ()  # True
get_args(Union[int, Union[T, int], str][int]) == (int, str)  # True
get_args(Union[int, Tuple[T, int]][str]) == (int, Tuple[str, int])  # True
get_args(Callable[[], T][int]) == ([], int)  # True
- unitxt.type_utils.get_origin(type_)
Get the unsubscripted version of a type.
This supports generic types, Callable, Tuple, Union, Literal, Final and ClassVar. Return None for unsupported types.
Examples
Here are some code examples using get_origin from the typing_utils module:
from typing import ClassVar, Generic, List, Literal, Tuple, TypeVar, Union
from typing_utils import get_origin

T = TypeVar("T")

# Examples of get_origin usage
get_origin(Literal[42]) is Literal  # True
get_origin(int) is None  # True
get_origin(ClassVar[int]) is ClassVar  # True
get_origin(Generic) is Generic  # True
get_origin(Generic[T]) is Generic  # True
get_origin(Union[T, int]) is Union  # True
get_origin(List[Tuple[T, T]][int]) == list  # True
- unitxt.type_utils.infer_type(obj) -> Any
- unitxt.type_utils.infer_type_string(obj: Any) -> str
Encodes the type of a given object into a string.
- Parameters:
obj – Any
- Returns:
A string representation of the type of the object, e.g. ‘str’, ‘List[int]’, ‘Dict[str, Any]’.
Formal definition of the returned string (which contains no spaces at all):
Type -> basic | List[Type] | Dict[Type, Type] | Union[Type (, Type)*] | Tuple[Type (, Type)*]
basic -> bool, str, int, float, Any
Examples
infer_type_string({"how_much": 7}) returns "Dict[str,int]"
infer_type_string([1, 2]) returns "List[int]"
infer_type_string([]) returns "List[Any]": the list has no contents to indicate any type
infer_type_string([[], [7]]) returns "List[List[int]]": the type of the parent list is indicated by the type of the non-empty child list. The empty child list is, by default, also of the type of the non-empty child.
infer_type_string([[], 7, True]) returns "List[Union[List[Any],int]]" because bool is also an int
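A much-simplified sketch of this kind of recursive inference is given below. The names _merge_sketch and infer_type_string_sketch are hypothetical, and the sketch is not the unitxt algorithm: it only collects the distinct element types in first-seen order, so it does not reproduce the merging rules described above (folding bool into int, or letting an empty child list take the type of a non-empty sibling).

```python
def _merge_sketch(type_strings):
    """Collapse a list of type strings into a single type string."""
    unique = list(dict.fromkeys(type_strings))  # drop duplicates, keep order
    if not unique:
        return "Any"  # nothing to infer from, e.g. an empty list
    if len(unique) == 1:
        return unique[0]
    return "Union[" + ",".join(unique) + "]"


def infer_type_string_sketch(obj):
    """Encode the type of obj as a string, for basic types, lists and dicts."""
    if isinstance(obj, bool):  # check bool first: bool is a subclass of int
        return "bool"
    for basic in (str, int, float):
        if isinstance(obj, basic):
            return basic.__name__
    if isinstance(obj, list):
        items = _merge_sketch([infer_type_string_sketch(x) for x in obj])
        return "List[" + items + "]"
    if isinstance(obj, dict):
        keys = _merge_sketch([infer_type_string_sketch(k) for k in obj])
        values = _merge_sketch([infer_type_string_sketch(v) for v in obj.values()])
        return "Dict[" + keys + "," + values + "]"
    return "Any"  # anything else is outside the scope of this sketch


print(infer_type_string_sketch({"how_much": 7}))
```

For the simple cases in the examples above (a dict of str to int, a list of ints, an empty list) the sketch yields the same strings as documented; for mixed or nested-empty inputs it can differ.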
- unitxt.type_utils.is_type(object)
- unitxt.type_utils.is_type_dict(object)
- unitxt.type_utils.isoftype(object, typing_type)
Checks if an object is of a certain typing type, including nested types.
This function supports simple types (like int, str), typing types (like List[int], Tuple[str, int], Dict[str, int]), and nested typing types (like List[List[int]], Tuple[List[str], int], Dict[str, List[int]]).
- Parameters:
object – The object to check.
typing_type – The typing type to check against.
- Returns:
True if the object is of the specified type, False otherwise.
- Return type:
bool
Examples
import typing
from unitxt.type_utils import isoftype

isoftype(1, int)  # True
isoftype([1, 2, 3], typing.List[int])  # True
isoftype([1, 2, 3], typing.List[str])  # False
isoftype([[1, 2], [3, 4]], typing.List[typing.List[int]])  # True
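A nested check of this kind can be sketched with the standard library helpers typing.get_origin and typing.get_args. The function isoftype_sketch below is a hypothetical, simplified illustration covering only Any, Union, List, Dict and Tuple; it is not the unitxt implementation and makes no claim about how isoftype treats other types.

```python
import typing


def isoftype_sketch(obj, typing_type):
    """Recursively check obj against a (possibly nested) typing type."""
    if typing_type is typing.Any:
        return True
    origin = typing.get_origin(typing_type)
    args = typing.get_args(typing_type)
    if origin is None:
        # A plain class such as int or str.
        return isinstance(obj, typing_type)
    if origin is typing.Union:
        return any(isoftype_sketch(obj, arg) for arg in args)
    if not isinstance(obj, origin):
        return False
    if not args:
        return True  # unparameterized container, e.g. typing.List
    if origin is list:
        return all(isoftype_sketch(item, args[0]) for item in obj)
    if origin is dict:
        return all(
            isoftype_sketch(key, args[0]) and isoftype_sketch(value, args[1])
            for key, value in obj.items()
        )
    if origin is tuple:
        if len(args) == 2 and args[1] is Ellipsis:
            # Variable-length tuple such as Tuple[int, ...].
            return all(isoftype_sketch(item, args[0]) for item in obj)
        return len(obj) == len(args) and all(
            isoftype_sketch(item, arg) for item, arg in zip(obj, args)
        )
    return True  # other generics: only the container class is checked


print(isoftype_sketch([[1, 2], [3, 4]], typing.List[typing.List[int]]))
```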
- unitxt.type_utils.issubtype(left: None | type | TypeVar, right: None | type | TypeVar, forward_refs: dict | None = None) -> bool | None
Check that the left argument is a subtype of the right.
For unions, check if the type arguments of the left is a subset of the right. Also works for nested types including ForwardRefs.
Examples
Here are some code examples using issubtype from the typing_utils module:
import typing
from typing_utils import issubtype

# Examples of issubtype checks
issubtype(typing.List, typing.Any)  # True
issubtype(list, list)  # True
issubtype(list, typing.List)  # True
issubtype(list, typing.Sequence)  # True
issubtype(typing.List[int], list)  # True
issubtype(typing.List[typing.List], list)  # True
issubtype(list, typing.List[int])  # False
issubtype(list, typing.Union[typing.Tuple, typing.Set])  # False
issubtype(typing.List[typing.List], typing.List[typing.Sequence])  # True

# Example with custom JSON type
JSON = typing.Union[
    int, float, bool, str, None, typing.Sequence["JSON"], typing.Mapping[str, "JSON"]
]
issubtype(str, JSON, forward_refs={'JSON': JSON})  # True
issubtype(typing.Dict[str, str], JSON, forward_refs={'JSON': JSON})  # True
issubtype(typing.Dict[str, bytes], JSON, forward_refs={'JSON': JSON})  # False
- unitxt.type_utils.normalize(type_: None | type | TypeVar) -> NormalizedType
Convert types to NormalizedType instances.
- unitxt.type_utils.optional_all(elements) -> bool | None
- unitxt.type_utils.optional_any(elements) -> bool | None
- unitxt.type_utils.parse_type_dict(type_dict)
- unitxt.type_utils.parse_type_string(type_string: str) -> Any
Parses a string representing a Python type hint and evaluates it to return the corresponding type object.
This function uses a safe evaluation context to mitigate the risks of executing arbitrary code.
- Parameters:
type_string (str) – A string representation of a Python type hint. Examples include ‘List[int]’, ‘Dict[str, Any]’, ‘Optional[List[str]]’, etc.
- Returns:
The Python type object corresponding to the given type string.
- Return type:
Any
- Raises:
ValueError – If the type string contains elements not allowed in the safe context or tokens list.
The function formats the string first if it represents a new Python type hint (i.e. valid since Python 3.10), which uses lowercased names for some types and ‘bitwise or operator’ instead of ‘Union’, for example: ‘list[int|float]’ instead of ‘List[Union[int,float]]’ etc.
The function uses a predefined safe context with common types from the typing module and basic Python data types. It also defines a list of safe tokens that are allowed in the type string.
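The combination of a token allow-list and a restricted evaluation context can be sketched as follows. SAFE_CONTEXT, SAFE_TOKENS and parse_type_string_sketch are hypothetical names chosen for this example, and the exact contents of the real allow-lists are not known from this page; the sketch also skips the preliminary formatting step for Python 3.10 style hints.

```python
import re
import typing

# Assumed allow-lists for illustration only.
SAFE_CONTEXT = {
    "Any": typing.Any,
    "List": typing.List,
    "Dict": typing.Dict,
    "Tuple": typing.Tuple,
    "Union": typing.Union,
    "Optional": typing.Optional,
    "int": int,
    "str": str,
    "float": float,
    "bool": bool,
}
SAFE_TOKENS = {"[", "]", ","}


def parse_type_string_sketch(type_string):
    """Validate every token, then evaluate with no builtins available."""
    # Split into identifiers and single non-space characters.
    tokens = re.findall(r"\w+|\S", type_string)
    for token in tokens:
        if token not in SAFE_CONTEXT and token not in SAFE_TOKENS:
            raise ValueError(f"Token {token!r} is not allowed in a type string")
    # Only names from SAFE_CONTEXT can be resolved during evaluation.
    return eval(type_string, {"__builtins__": {}}, SAFE_CONTEXT)


print(parse_type_string_sketch("Dict[str, Any]"))
```

Rejecting unknown tokens before calling eval means that a string such as "__import__('os')" raises ValueError without ever being evaluated.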
- unitxt.type_utils.to_float_or_default(v, failure_default=0)
- unitxt.type_utils.to_type_dict(dict_of_typing_types)
- unitxt.type_utils.to_type_string(typing_type)
- unitxt.type_utils.verify_required_schema(required_schema_dict: Dict[str, type], input_dict: Dict[str, Any]) -> None
Verifies that the passed input_dict has all required fields and that they are of the proper types according to required_schema_dict.
- Parameters:
required_schema_dict (Dict[str, type]) – Schema where a key is the name of a field and a value is the type of its value.
input_dict (Dict[str, Any]) – Dict with input fields and their respective values.
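A schema check of this shape can be sketched in a few lines. The function verify_required_schema_sketch is a hypothetical illustration: it follows the Dict[str, type] signature shown above, uses plain isinstance (so it handles only ordinary classes, not nested typing types), and the exception types it raises are assumptions rather than the documented behavior.

```python
from typing import Any, Dict


def verify_required_schema_sketch(
    required_schema_dict: Dict[str, type], input_dict: Dict[str, Any]
) -> None:
    """Raise if a required field is missing or has the wrong type."""
    for field_name, expected_type in required_schema_dict.items():
        if field_name not in input_dict:
            # Assumed error type for a missing field.
            raise KeyError(f"Missing required field: {field_name!r}")
        value = input_dict[field_name]
        if not isinstance(value, expected_type):
            # Assumed error type for a type mismatch.
            raise ValueError(
                f"Field {field_name!r} should be of type "
                f"{expected_type.__name__}, got {type(value).__name__}"
            )


schema = {"question": str, "score": float}
verify_required_schema_sketch(schema, {"question": "Why?", "score": 0.5})
```

Fields that appear in input_dict but not in the schema are ignored by this sketch.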