unitxt.type_utils module
- class unitxt.type_utils.NormalizedType(origin: None | type | TypeVar, args: tuple | frozenset = ())
Bases: tuple
A normalized representation of a type, which makes types comparable and hashable.
- args: tuple | frozenset
Alias for field number 1
- origin: None | type | TypeVar
Alias for field number 0
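The tuple base and the "Alias for field number" entries suggest a named tuple. The following is a minimal sketch of why such a structure is comparable and hashable. It assumes that union arguments are stored as a frozenset, which the args annotation hints at but this page does not confirm. SketchNormalizedType is a hypothetical stand-in, not the unitxt class.

```python
from typing import NamedTuple, Union


class SketchNormalizedType(NamedTuple):
    """Hypothetical stand-in that mirrors the two documented fields."""

    origin: object
    args: Union[tuple, frozenset] = ()


# Assumption: unordered arguments (such as union members) go into a frozenset,
# so the order in which they were written does not affect equality or hashing.
a = SketchNormalizedType(Union, frozenset({int, str}))
b = SketchNormalizedType(Union, frozenset({str, int}))

same = a == b                    # field-by-field tuple comparison
lookup = {a: "int-or-str"}[b]    # usable as a dict key because it is hashable

# Ordered arguments stay in a plain tuple, here a list of int.
list_of_int = SketchNormalizedType(list, (SketchNormalizedType(int),))
```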
- unitxt.type_utils.convert_union_type(type_string: str) → str
Converts Python 3.10 union type hints into a form compatible with Python 3.9.
- Parameters:
type_string (str) –
A string representation of a Python type hint. It can be any valid Python type that does not contain string constants (as ‘Literal’ does). Examples include ‘List[int|float]’ and ‘str|float|bool’.
Formally, the function relies on the input string adhering to the following rules. Assuming the input is a valid type hint, the function does not check that a ‘word’ is ‘str’, ‘bool’, ‘List’, etc. It relies only on the following general structure (spaces ignored):
type -> word OR type( | type)* OR word[type( , type)*]
where word is a sequence of zero or more characters, none of which is one of: [ ] , |
Consequently, if any of these four characters appears in type_string not as a meta character but inside a constant string (of a Literal, for example), the scheme does not work.
Types such as Literal, which may contain these four characters as ordinary characters rather than meta characters, must be handled by this function as special cases; Literal is shown as one example. Because ‘format_type_string’ serves as preprocessing for ‘parse_type_string’, whose list of allowed types does not include Literal, such types are currently not relevant; the case is described here only as an example for future use.
- Returns:
A type string with converted union types, which is compatible with typing module.
- Return type:
str
Examples
convert_union_type('List[int|float]') -> 'List[Union[int,float]]'
convert_union_type('Optional[int|float|bool]') -> 'Optional[Union[int,float,bool]]'
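The grammar in the parameter description can be turned into a small recursive rewriter. The following is a sketch under that grammar only, not unitxt's implementation; split_top_level and sketch_convert_union are hypothetical names, and, like the documented function, the sketch assumes the four meta characters never occur inside string constants.

```python
def split_top_level(text: str, sep: str) -> list:
    """Split text on sep, ignoring separators nested inside square brackets."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def sketch_convert_union(type_string: str) -> str:
    """Rewrite 'A|B' as 'Union[A,B]' at every nesting level (hypothetical helper)."""
    text = type_string.replace(" ", "")
    alternatives = split_top_level(text, "|")
    if len(alternatives) > 1:
        # Rule: type( | type)*
        return "Union[" + ",".join(sketch_convert_union(a) for a in alternatives) + "]"
    if "[" not in text:
        # Rule: word
        return text
    # Rule: word[type( , type)*]
    head, _, rest = text.partition("[")
    inner = rest[:-1]  # drop the closing bracket that matches the first '['
    args = [sketch_convert_union(a) for a in split_top_level(inner, ",")]
    return head + "[" + ",".join(args) + "]"
```

Splitting only at bracket depth zero is what keeps the ‘|’ inside ‘List[int|float]’ attached to its enclosing type.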
- unitxt.type_utils.eval_forward_ref(ref, forward_refs=None)
Evaluates forward references in all CPython versions.
- unitxt.type_utils.format_type_string(type_string: str) → str
Formats a string representing a valid Python type hint so that it is compatible with Python 3.9 notation.
- Parameters:
type_string (str) – A string representation of a Python type hint. This can be any valid type, which does not contain strings (e.g. ‘Literal’). Examples include ‘List[int]’, ‘Dict[str, Any]’, ‘Optional[List[str]]’, etc.
- Returns:
A formatted type string.
- Return type:
str
Examples
format_type_string('list[int | float]') -> 'List[Union[int,float]]'
format_type_string('dict[str, Optional[str]]') -> 'Dict[str,Optional[str]]'
The function converts a valid type string, written in either pre-3.10 or 3.10+ notation, into a form compatible with Python 3.9. It does so by capitalizing the first letter of lowercase type names and rewriting the bitwise-or operator (‘|’) in ‘Union’ notation. The function also removes whitespace and the redundant module prefix from type names imported from the ‘typing’ module, e.g. ‘typing.Tuple’ -> ‘Tuple’.
Currently, capitalization is applied only to the types that unitxt allows, i.e. ‘list’, ‘dict’, and ‘tuple’. Moreover, the function expects the input not to contain types that contain strings, such as ‘Literal’.
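The whitespace removal, prefix removal, and capitalization steps can be sketched with a regular expression. This is a simplified illustration, not unitxt's implementation: sketch_normalize_names is a hypothetical name, and the sketch leaves ‘|’ unions untouched.

```python
import re


def sketch_normalize_names(type_string: str) -> str:
    """Strip whitespace and 'typing.' prefixes, then capitalize list/dict/tuple."""
    text = type_string.replace(" ", "")
    text = text.replace("typing.", "")
    # \b keeps the match to whole names, so 'Tuple' or 'mylist' are not touched.
    return re.sub(
        r"\b(list|dict|tuple)\b",
        lambda match: match.group(1).capitalize(),
        text,
    )
```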
- unitxt.type_utils.get_args(type_) → Tuple
Get type arguments with all substitutions performed.
For unions, basic simplifications used by Union constructor are performed.
Examples
Here are some code examples using get_args from the typing_utils module:
from typing import Callable, Dict, Tuple, TypeVar, Union
from typing_utils import get_args
T = TypeVar("T")  # type variable used in the examples below
# Examples of get_args usage
get_args(Dict[str, int]) == (str, int)  # True
get_args(int) == ()  # True
get_args(Union[int, Union[T, int], str][int]) == (int, str)  # True
get_args(Union[int, Tuple[T, int]][str]) == (int, Tuple[str, int])  # True
get_args(Callable[[], T][int]) == ([], int)  # True
- unitxt.type_utils.get_origin(type_)
Get the unsubscripted version of a type.
This supports generic types, Callable, Tuple, Union, Literal, Final and ClassVar. Return None for unsupported types.
Examples
Here are some code examples using get_origin from the typing_utils module:
from typing import ClassVar, Generic, List, Literal, Tuple, TypeVar, Union
from typing_utils import get_origin
T = TypeVar("T")  # type variable used in the examples below
# Examples of get_origin usage
get_origin(Literal[42]) is Literal  # True
get_origin(int) is None  # True
get_origin(ClassVar[int]) is ClassVar  # True
get_origin(Generic) is Generic  # True
get_origin(Generic[T]) is Generic  # True
get_origin(Union[T, int]) is Union  # True
get_origin(List[Tuple[T, T]][int]) == list  # True
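Python's standard typing module (3.8 and later) provides functions with the same names. As a sketch that uses those standard-library functions rather than the ones documented here, the two calls can be combined to walk a nested type; leaf_types is a hypothetical helper introduced only for illustration.

```python
from typing import Dict, List, Tuple, get_args, get_origin


def leaf_types(tp) -> list:
    """Collect the innermost type arguments of a nested generic (hypothetical helper)."""
    args = get_args(tp)
    if not args:
        # No arguments: tp is a plain type such as int or str.
        return [tp]
    leaves = []
    for arg in args:
        leaves.extend(leaf_types(arg))
    return leaves


nested = Dict[str, List[Tuple[int, float]]]
outer = get_origin(nested)   # the unsubscripted container type
leaves = leaf_types(nested)  # plain types found at the bottom of the nesting
```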
- unitxt.type_utils.infer_type_string(obj: Any) → str
Encodes the type of a given object into a string.
- Parameters:
obj (Any) – The object whose type is to be encoded.
- Returns:
A string representation of the type of the object, e.g. "str", "List[int]", "Dict[str, Any]".
Formal definition of the returned string:
Type -> basic | List[Type] | Dict[Type, Type] | Union[Type(, Type)*] | Tuple[Type(, Type)*]
basic -> bool | str | int | float | Any
Examples
infer_type_string({"how_much": 7}) returns "Dict[str,int]"
infer_type_string([1, 2]) returns "List[int]"
infer_type_string([]) returns "List[Any]": the list has no contents to indicate any type.
infer_type_string([[], [7]]) returns "List[List[int]]": the type of the parent list is indicated by the type of the non-empty child list. By default, the empty child list is also taken to be of the type of the non-empty child.
infer_type_string([[], 7, True]) returns "List[Union[List[Any],int]]", because bool is also an int.
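The recursive structure of the returned string can be sketched as follows. This is a simplified illustration, not unitxt's implementation: sketch_infer_type_string and _combine are hypothetical names, tuples are not handled, and the sketch does not reproduce the rule that lets an empty child list adopt the type of a non-empty sibling, so for [[], [7]] it yields a Union rather than the documented "List[List[int]]".

```python
def _combine(names) -> str:
    """Merge element type names into one name, using Union when they differ."""
    unique = set(names)
    if "int" in unique:
        unique.discard("bool")  # bool is also an int, so int already covers it
    if not unique:
        return "Any"  # nothing to infer from, e.g. an empty list
    ordered = sorted(unique)
    if len(ordered) == 1:
        return ordered[0]
    return "Union[" + ",".join(ordered) + "]"


def sketch_infer_type_string(obj) -> str:
    """Encode the type of obj as a string (hypothetical simplified version)."""
    if isinstance(obj, bool):  # test bool before int: True is an int instance
        return "bool"
    if isinstance(obj, int):
        return "int"
    if isinstance(obj, float):
        return "float"
    if isinstance(obj, str):
        return "str"
    if isinstance(obj, list):
        return "List[" + _combine(sketch_infer_type_string(x) for x in obj) + "]"
    if isinstance(obj, dict):
        keys = _combine(sketch_infer_type_string(k) for k in obj)
        values = _combine(sketch_infer_type_string(v) for v in obj.values())
        return "Dict[" + keys + "," + values + "]"
    return "Any"
```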
- unitxt.type_utils.is_type(object)
Checks if the provided object is a type, including generics, Literal, TypedDict, and NewType.
- unitxt.type_utils.isoftype(object, typing_type)
Checks if an object is of a certain typing type, including nested types.
This function supports simple types, typing types (List[int], Tuple[str, int]), nested typing types (List[List[int]], Tuple[List[str], int]), Literal, TypedDict, and NewType.
- Parameters:
object – The object to check.
typing_type – The typing type to check against.
- Returns:
True if the object is of the specified type, False otherwise.
- Return type:
bool
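A nested check of this kind can be sketched with the standard library's typing.get_origin and typing.get_args. This is a minimal illustration, not unitxt's implementation: sketch_isoftype is a hypothetical name, and the sketch covers only plain classes, Any, Union, List, Dict, and fixed-length Tuple, leaving out Literal, TypedDict, NewType, and variable-length tuples.

```python
from typing import Any, Dict, List, Tuple, Union, get_args, get_origin


def sketch_isoftype(obj, typing_type) -> bool:
    """Recursively check obj against a (possibly nested) typing type."""
    if typing_type is Any:
        return True
    origin = get_origin(typing_type)
    args = get_args(typing_type)
    if origin is None:
        # A plain class such as int or str.
        return isinstance(obj, typing_type)
    if origin is Union:
        return any(sketch_isoftype(obj, arg) for arg in args)
    if not args:
        # An unsubscripted generic such as List: check the container only.
        return isinstance(obj, origin)
    if origin is list:
        return isinstance(obj, list) and all(
            sketch_isoftype(item, args[0]) for item in obj
        )
    if origin is dict:
        return isinstance(obj, dict) and all(
            sketch_isoftype(key, args[0]) and sketch_isoftype(value, args[1])
            for key, value in obj.items()
        )
    if origin is tuple:
        return (
            isinstance(obj, tuple)
            and len(obj) == len(args)
            and all(sketch_isoftype(item, arg) for item, arg in zip(obj, args))
        )
    return isinstance(obj, origin)
```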
- unitxt.type_utils.issubtype(left: None | type | TypeVar, right: None | type | TypeVar, forward_refs: dict | None = None) → bool | None
Check that the left argument is a subtype of the right.
For unions, check whether the type arguments of the left are a subset of those of the right. Also works for nested types, including ForwardRefs.
Examples
Here are some code examples using issubtype from the typing_utils module:
import typing
from typing_utils import issubtype
# Examples of issubtype checks
issubtype(typing.List, typing.Any)  # True
issubtype(list, list)  # True
issubtype(list, typing.List)  # True
issubtype(list, typing.Sequence)  # True
issubtype(typing.List[int], list)  # True
issubtype(typing.List[typing.List], list)  # True
issubtype(list, typing.List[int])  # False
issubtype(list, typing.Union[typing.Tuple, typing.Set])  # False
issubtype(typing.List[typing.List], typing.List[typing.Sequence])  # True
# Example with custom JSON type
JSON = typing.Union[
    int, float, bool, str, None, typing.Sequence["JSON"], typing.Mapping[str, "JSON"]
]
issubtype(str, JSON, forward_refs={'JSON': JSON})  # True
issubtype(typing.Dict[str, str], JSON, forward_refs={'JSON': JSON})  # True
issubtype(typing.Dict[str, bytes], JSON, forward_refs={'JSON': JSON})  # False
- unitxt.type_utils.normalize(type_: None | type | TypeVar) → NormalizedType
Convert types to NormalizedType instances.
- unitxt.type_utils.parse_type_string(type_string: str) → Any
Parses a string representing a Python type hint and evaluates it to return the corresponding type object.
This function uses a safe evaluation context to mitigate the risks of executing arbitrary code.
- Parameters:
type_string (str) – A string representation of a Python type hint. Examples include ‘List[int]’, ‘Dict[str, Any]’, ‘Optional[List[str]]’, etc.
- Returns:
The Python type object corresponding to the given type string.
- Return type:
Any
- Raises:
ValueError – If the type string contains elements not allowed in the safe context or tokens list.
The function formats the string first if it represents a new Python type hint (i.e. valid since Python 3.10), which uses lowercased names for some types and ‘bitwise or operator’ instead of ‘Union’, for example: ‘list[int|float]’ instead of ‘List[Union[int,float]]’ etc.
The function uses a predefined safe context with common types from the typing module and basic Python data types. It also defines a list of safe tokens that are allowed in the type string.
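The combination of a safe context and a list of safe tokens can be sketched as a whitelist check followed by a restricted eval. This is an illustration of the general technique, not unitxt's implementation: SAFE_CONTEXT, SAFE_TOKENS, and sketch_parse are hypothetical names, the allowed sets are assumptions, and the sketch expects input already in ‘Union’ notation.

```python
import re
import typing

# Assumption: a small illustrative whitelist, not unitxt's actual lists.
SAFE_CONTEXT = {
    "Any": typing.Any,
    "List": typing.List,
    "Dict": typing.Dict,
    "Tuple": typing.Tuple,
    "Union": typing.Union,
    "Optional": typing.Optional,
    "int": int,
    "str": str,
    "float": float,
    "bool": bool,
}
SAFE_TOKENS = {"[", "]", ","}


def sketch_parse(type_string: str):
    """Evaluate a type string only if every token is whitelisted."""
    # Split into identifier-like words and single non-space symbols.
    tokens = re.findall(r"\w+|\S", type_string)
    for token in tokens:
        if token not in SAFE_CONTEXT and token not in SAFE_TOKENS:
            raise ValueError(f"Unsafe token in type string: {token!r}")
    # Builtins are removed, so only names from SAFE_CONTEXT can be resolved.
    return eval(type_string, {"__builtins__": {}}, SAFE_CONTEXT)
```

Rejecting unknown tokens before calling eval means that attribute access, calls, and string literals never reach the evaluator.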
- unitxt.type_utils.strtype(typing_type) → str
Converts a typing type to its string representation.
- Parameters:
typing_type (Any) – The typing type to be converted. This can include standard types, custom types, or types from the typing module, such as Literal, Union, List, Dict, Tuple, TypedDict, and NewType.
- Returns:
The string representation of the provided typing type.
- Return type:
str
- Raises:
UnsupportedTypeError – If the provided typing_type is not a recognized type.
Notes
If typing_type is Literal, NewType, or TypedDict, the function returns the name of the type.
If typing_type is Any, it returns the string “Any”.
For other typing constructs like Union, List, Dict, and Tuple, the function recursively converts each part of the type to its string representation.
The function checks the __origin__ attribute to determine the base type and formats the type arguments accordingly.
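The recursive conversion described in the notes can be sketched as follows. This is a simplified illustration, not unitxt's implementation: sketch_strtype is a hypothetical name, it uses the standard library's typing.get_origin and typing.get_args rather than reading __origin__ directly, the ", " separator is an assumption, and Literal, NewType, and TypedDict are not handled.

```python
from typing import Any, Dict, List, Tuple, Union, get_args, get_origin

# Display names for the origins this sketch recognizes.
_ORIGIN_NAMES = {list: "List", dict: "Dict", tuple: "Tuple", Union: "Union"}


def sketch_strtype(typing_type) -> str:
    """Convert a typing type to a string, one nesting level at a time."""
    if typing_type is Any:
        return "Any"
    origin = get_origin(typing_type)
    if origin is None:
        # A plain class such as int or str.
        return typing_type.__name__
    head = _ORIGIN_NAMES.get(origin, getattr(origin, "__name__", str(origin)))
    args = get_args(typing_type)
    if not args:
        return head
    return head + "[" + ", ".join(sketch_strtype(arg) for arg in args) + "]"
```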
- unitxt.type_utils.verify_required_schema(required_schema_dict: Dict[str, type], input_dict: Dict[str, Any], class_name: str, id: str | None = '', description: str | None = '') → None
Verifies that the passed input_dict contains all required fields and that their values are of the types specified in required_schema_dict.
- Parameters:
required_schema_dict (Dict[str, type]) – Schema where a key is the name of a field and a value is the type of that field's value.
input_dict (Dict[str, Any]) – Dict with input fields and their respective values.
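The two checks, field presence and value type, can be sketched as follows. This is a simplified illustration, not unitxt's implementation: sketch_verify_schema is a hypothetical name, the exception types are assumptions, the class_name, id, and description parameters are omitted, and plain isinstance is used, so only ordinary classes are supported as schema values.

```python
from typing import Any, Dict


def sketch_verify_schema(
    required_schema_dict: Dict[str, type], input_dict: Dict[str, Any]
) -> None:
    """Raise if a required field is missing or has a value of the wrong type."""
    for field_name, expected_type in required_schema_dict.items():
        if field_name not in input_dict:
            # Assumption: KeyError is chosen here only for illustration.
            raise KeyError(f"Missing required field: {field_name!r}")
        value = input_dict[field_name]
        if not isinstance(value, expected_type):
            # Assumption: TypeError is chosen here only for illustration.
            raise TypeError(
                f"Field {field_name!r} should be {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )


schema = {"question": str, "score": float}
sketch_verify_schema(schema, {"question": "2+2?", "score": 1.0})  # passes silently
```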