unitxt.type_utils module
- class unitxt.type_utils.NormalizedType(origin: None | type | TypeVar, args: tuple | frozenset = ())
Bases: tuple
Normalized type that makes it possible to compare and hash types.
- args: tuple | frozenset
Alias for field number 1
- origin: None | type | TypeVar
Alias for field number 0
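To illustrate why a normalized form helps with comparison and hashing, here is a minimal sketch built only on the standard library. The names NormalizedSketch and normalize_sketch are hypothetical and this is not unitxt's implementation; it only assumes, as the signature above suggests, that union arguments can be held in a frozenset so their order does not matter.

```python
from typing import Dict, List, NamedTuple, Union, get_args, get_origin


class NormalizedSketch(NamedTuple):
    """Hypothetical stand-in for NormalizedType: an (origin, args) pair."""

    origin: object
    args: Union[tuple, frozenset] = ()


def normalize_sketch(tp):
    """Recursively turn a typing hint into a hashable, comparable tuple."""
    origin = get_origin(tp)
    if origin is None:
        # Plain classes such as int have no origin and no arguments.
        return NormalizedSketch(tp)
    args = get_args(tp)
    if origin is Union:
        # Order of union members is irrelevant, so store them as a frozenset.
        return NormalizedSketch(origin, frozenset(normalize_sketch(a) for a in args))
    return NormalizedSketch(origin, tuple(normalize_sketch(a) for a in args))


same = normalize_sketch(Union[int, str]) == normalize_sketch(Union[str, int])
nested = normalize_sketch(List[int])
```

Because the result is a plain tuple of hashable parts, it can be used as a dictionary key or set member.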
- unitxt.type_utils.convert_union_type(type_string: str) -> str
Converts Python 3.10 union type hints into a form compatible with Python 3.9.
- Parameters:
type_string (str) – A string representation of a Python type hint. It can be any valid Python type that does not contain strings (e.g. ‘Literal’). Examples include ‘List[int|float]’, ‘str|float|bool’, etc.
Formally, the function depends on the input string adhering to the following rules. Assuming that the input is a valid type hint, the function does not check that ‘word’ is ‘str’, ‘bool’, ‘List’, etc. It depends only on the following general structure (spaces ignored): a type is either a word or a composite of types built with the four meta characters [ ] , | and a word is a sequence of characters that contains none of those four.
This implies that if any of these four characters appears in the input type_string other than as a meta character, for instance inside a constant string (of Literal, for example), the scheme will not work.
Cases like Literal, which might contain occurrences of the four characters above not as meta characters in the type string, must be handled as special cases by this function, as shown for Literal as an example. Because ‘format_type_string’ serves as preprocessing for ‘parse_type_string’, which has a list of allowed types of which Literal is not a member, Literal and such are not relevant at all now; the case is mentioned here only as an example for future use.
- Returns:
A type string with converted union types, which is compatible with typing module.
- Return type:
str
Examples
convert_union_type('List[int|float]') -> 'List[Union[int,float]]'
convert_union_type('Optional[int|float|bool]') -> 'Optional[Union[int,float,bool]]'
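The structural rules described above (words, square brackets, commas and vertical bars) can be sketched as a small recursive rewrite. This is an illustrative sketch, not the library's code; convert_union_sketch and split_top_level are hypothetical names, and the input is assumed to contain no string constants such as those inside Literal.

```python
def split_top_level(text, sep):
    """Split text on sep, ignoring separators nested inside square brackets."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def convert_union_sketch(type_string):
    """Rewrite 'a|b' unions as 'Union[a,b]', recursing into brackets."""
    text = type_string.replace(" ", "")
    alternatives = split_top_level(text, "|")
    if len(alternatives) > 1:
        # type | type | ... at the top level becomes Union[...]
        return "Union[" + ",".join(convert_union_sketch(a) for a in alternatives) + "]"
    if text.endswith("]"):
        # word[type, type, ...]: convert each argument separately
        word, _, inner = text.partition("[")
        args = split_top_level(inner[:-1], ",")
        return word + "[" + ",".join(convert_union_sketch(a) for a in args) + "]"
    return text  # a bare word


result = convert_union_sketch("List[int|float]")  # 'List[Union[int,float]]'
```

Tracking bracket depth is what lets the sketch tell a top-level union apart from one nested inside a generic's arguments.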
- unitxt.type_utils.eval_forward_ref(ref, forward_refs=None)
Evaluates forward references in all CPython versions.
- unitxt.type_utils.format_type_string(type_string: str) -> str
Formats a string representing a valid Python type hint so that it is compatible with Python 3.9 notation.
- Parameters:
type_string (str) – A string representation of a Python type hint. This can be any valid type, which does not contain strings (e.g. ‘Literal’). Examples include ‘List[int]’, ‘Dict[str, Any]’, ‘Optional[List[str]]’, etc.
- Returns:
A formatted type string.
- Return type:
str
Examples
format_type_string('list[int | float]') -> 'List[Union[int,float]]'
format_type_string('dict[str, Optional[str]]') -> 'Dict[str,Optional[str]]'
The function formats a valid type string (in either pre-3.10 or 3.10+ notation) into a form compatible with Python 3.9. This is done by capitalizing the first letter of a lowercase type name and converting the bitwise-or operator (‘|’) into ‘Union’ notation. The function also removes whitespace and the redundant module name from type names imported from the ‘typing’ module, e.g. ‘typing.Tuple’ -> ‘Tuple’.
Currently, the capitalization is applied only to types which unitxt allows, i.e. ‘list’, ‘dict’, ‘tuple’. Moreover, the function expects the input to not contain types which contain strings, for example ‘Literal’.
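The name-level part of this formatting (whitespace removal, dropping the ‘typing.’ prefix, and capitalizing ‘list’, ‘dict’ and ‘tuple’) can be sketched with regular expressions. This is a hedged illustration rather than the library's code: format_names_sketch is a hypothetical name, and the sketch deliberately leaves out the conversion of ‘|’ into ‘Union’.

```python
import re


def format_names_sketch(type_string):
    """Normalize type names only; union conversion is out of scope here."""
    # Remove all whitespace.
    text = re.sub(r"\s+", "", type_string)
    # Drop the redundant module prefix, e.g. 'typing.Tuple' -> 'Tuple'.
    text = re.sub(r"\btyping\.", "", text)
    # Capitalize only the lowercase names mentioned above.
    return re.sub(r"\b(list|dict|tuple)\b", lambda m: m.group(1).capitalize(), text)


formatted = format_names_sketch("dict[str, Optional[str]]")  # 'Dict[str,Optional[str]]'
```

The word-boundary anchors keep the substitution from touching longer identifiers that merely contain ‘list’ or ‘dict’.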
- unitxt.type_utils.get_args(type_) -> Tuple
Get type arguments with all substitutions performed.
For unions, basic simplifications used by Union constructor are performed.
Examples
Here are some code examples using get_args from the typing_utils module:
from typing import Callable, Dict, Tuple, TypeVar, Union
from typing_utils import get_args

T = TypeVar("T")

# Examples of get_args usage
get_args(Dict[str, int]) == (str, int) # True
get_args(int) == () # True
get_args(Union[int, Union[T, int], str][int]) == (int, str) # True
get_args(Union[int, Tuple[T, int]][str]) == (int, Tuple[str, int]) # True
get_args(Callable[[], T][int]) == ([], int) # True
- unitxt.type_utils.get_origin(type_)
Get the unsubscripted version of a type.
This supports generic types, Callable, Tuple, Union, Literal, Final and ClassVar. Returns None for unsupported types.
Examples
Here are some code examples using get_origin from the typing_utils module:
from typing import ClassVar, Generic, List, Literal, Tuple, TypeVar, Union
from typing_utils import get_origin

T = TypeVar("T")

# Examples of get_origin usage
get_origin(Literal[42]) is Literal # True
get_origin(int) is None # True
get_origin(ClassVar[int]) is ClassVar # True
get_origin(Generic) is Generic # True
get_origin(Generic[T]) is Generic # True
get_origin(Union[T, int]) is Union # True
get_origin(List[Tuple[T, T]][int]) == list # True
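The standard library's typing module ships functions with the same names. The sketch below uses those standard-library versions (not the unitxt ones) to show how an origin and its arguments fit together; the helper name describe is hypothetical.

```python
from typing import Dict, List, Literal, get_args, get_origin


def describe(tp):
    """Return the (origin, args) pair for a type hint."""
    return get_origin(tp), get_args(tp)


dict_parts = describe(Dict[str, int])   # origin is dict, args are (str, int)
plain_parts = describe(int)             # a plain class has no origin and no args
literal_parts = describe(Literal[42])   # Literal keeps its values as args
list_origin = get_origin(List[int])     # the runtime class behind List
```

Splitting a hint this way is the usual first step before walking nested types recursively.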
- unitxt.type_utils.infer_type(obj) -> Any
- unitxt.type_utils.infer_type_string(obj: Any) -> str
Encodes the type of a given object into a string.
- Parameters:
obj (Any) – The object whose type is to be encoded.
- Returns:
A string representation of the type of the object, e.g. ‘str’, ‘List[int]’, ‘Dict[str,Any]’.
Formal definition of the returned string, which contains no spaces at all:
Type -> basic | List[Type] | Dict[Type,Type] | Union[Type(,Type)*] | Tuple[Type(,Type)*]
basic -> bool | str | int | float | Any
Examples
infer_type_string({"how_much": 7}) returns "Dict[str,int]"
infer_type_string([1, 2]) returns "List[int]"
infer_type_string([]) returns "List[Any]" (the list has no contents to indicate any type)
infer_type_string([[], [7]]) returns "List[List[int]]" (the type of the parent list is indicated by the type of the non-empty child list; the empty child list is, by default, also taken to be of that type)
infer_type_string([[], 7, True]) returns "List[Union[List[Any],int]]" because bool is also an int
- unitxt.type_utils.isoftype(object, type)
Checks if an object is of a certain typing type, including nested types.
This function supports simple types (like int, str), typing types (like List[int], Tuple[str, int], Dict[str, int]), and nested typing types (like List[List[int]], Tuple[List[str], int], Dict[str, List[int]]).
- Parameters:
object – The object to check.
type – The typing type to check against.
- Returns:
True if the object is of the specified type, False otherwise.
- Return type:
bool
Examples
import typing
from unitxt.type_utils import isoftype

isoftype(1, int) # True
isoftype([1, 2, 3], typing.List[int]) # True
isoftype([1, 2, 3], typing.List[str]) # False
isoftype([[1, 2], [3, 4]], typing.List[typing.List[int]]) # True
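A nested check of this kind can be sketched as a recursion over the origin and arguments of the type hint. The function below is a simplified illustration using only the standard library; is_of_type_sketch is a hypothetical name, it covers only List, Dict, Tuple, Union and plain classes, and it is not the library's implementation.

```python
from typing import Any, Dict, List, Tuple, Union, get_args, get_origin


def is_of_type_sketch(obj, tp):
    """Recursively check obj against a (possibly nested) typing hint."""
    if tp is Any:
        return True
    origin, args = get_origin(tp), get_args(tp)
    if origin is None:
        return isinstance(obj, tp)  # plain class such as int or str
    if origin is Union:
        return any(is_of_type_sketch(obj, a) for a in args)
    if not isinstance(obj, origin):
        return False
    if not args:
        return True  # unparameterized generic, e.g. bare List
    if origin is list:
        return all(is_of_type_sketch(x, args[0]) for x in obj)
    if origin is dict:
        return all(
            is_of_type_sketch(k, args[0]) and is_of_type_sketch(v, args[1])
            for k, v in obj.items()
        )
    if origin is tuple:
        if len(args) == 2 and args[1] is Ellipsis:
            return all(is_of_type_sketch(x, args[0]) for x in obj)
        return len(obj) == len(args) and all(
            is_of_type_sketch(x, a) for x, a in zip(obj, args)
        )
    raise TypeError(f"Unsupported type in this sketch: {tp!r}")


ok = is_of_type_sketch([[1, 2], [3, 4]], List[List[int]])
bad = is_of_type_sketch([1, 2, 3], List[str])
```

Each container case checks the outer class first and then recurses into the elements, which is why arbitrarily deep nesting needs no extra code.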
- unitxt.type_utils.issubtype(left: None | type | TypeVar, right: None | type | TypeVar, forward_refs: dict | None = None) -> bool | None
Check that the left argument is a subtype of the right.
For unions, checks whether the type arguments of the left are a subset of those of the right. Also works for nested types, including ForwardRefs.
Examples
Here are some code examples using issubtype from the typing_utils module:
import typing
from typing_utils import issubtype

# Examples of issubtype checks
issubtype(typing.List, typing.Any) # True
issubtype(list, list) # True
issubtype(list, typing.List) # True
issubtype(list, typing.Sequence) # True
issubtype(typing.List[int], list) # True
issubtype(typing.List[typing.List], list) # True
issubtype(list, typing.List[int]) # False
issubtype(list, typing.Union[typing.Tuple, typing.Set]) # False
issubtype(typing.List[typing.List], typing.List[typing.Sequence]) # True

# Example with custom JSON type
JSON = typing.Union[
    int, float, bool, str, None, typing.Sequence["JSON"], typing.Mapping[str, "JSON"]
]
issubtype(str, JSON, forward_refs={'JSON': JSON}) # True
issubtype(typing.Dict[str, str], JSON, forward_refs={'JSON': JSON}) # True
issubtype(typing.Dict[str, bytes], JSON, forward_refs={'JSON': JSON}) # False
- unitxt.type_utils.normalize(type_: None | type | TypeVar) -> NormalizedType
Convert types to NormalizedType instances.
- unitxt.type_utils.optional_all(elements) -> bool | None
- unitxt.type_utils.optional_any(elements) -> bool | None
- unitxt.type_utils.parse_type_string(type_string: str) -> Any
Parses a string representing a Python type hint and evaluates it to return the corresponding type object.
This function uses a safe evaluation context to mitigate the risks of executing arbitrary code.
- Parameters:
type_string (str) – A string representation of a Python type hint. Examples include ‘List[int]’, ‘Dict[str, Any]’, ‘Optional[List[str]]’, etc.
- Returns:
The Python type object corresponding to the given type string.
- Return type:
Any
- Raises:
ValueError – If the type string contains elements not allowed in the safe context or tokens list.
The function first formats the string if it uses the newer type hint notation (valid since Python 3.10), which uses lowercase names for some types and the bitwise-or operator instead of ‘Union’, for example ‘list[int|float]’ instead of ‘List[Union[int,float]]’.
The function uses a predefined safe context with common types from the typing module and basic Python data types. It also defines a list of safe tokens that are allowed in the type string.
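The idea of a safe context plus a list of allowed tokens can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: SAFE_CONTEXT, SAFE_TOKENS and parse_type_sketch are hypothetical names, and the actual sets of allowed types and tokens in unitxt may differ.

```python
import re
from typing import Any, Dict, List, Optional, Tuple, Union

# Hypothetical whitelist of names that may appear in a type string.
SAFE_CONTEXT = {
    "Any": Any, "List": List, "Dict": Dict, "Tuple": Tuple,
    "Union": Union, "Optional": Optional,
    "int": int, "str": str, "float": float, "bool": bool,
}
# Hypothetical whitelist of punctuation tokens.
SAFE_TOKENS = {"[", "]", ","}


def parse_type_sketch(type_string):
    """Evaluate a type string only if every token is whitelisted."""
    # Split into identifiers and single non-space punctuation characters.
    tokens = re.findall(r"\w+|\S", type_string)
    for token in tokens:
        if token not in SAFE_CONTEXT and token not in SAFE_TOKENS:
            raise ValueError(f"Unsafe token in type string: {token!r}")
    # Builtins are emptied so only whitelisted names can be resolved.
    return eval(type_string, {"__builtins__": {}}, SAFE_CONTEXT)


parsed = parse_type_sketch("Dict[str, List[int]]")
```

Rejecting unknown tokens before evaluation means that attribute access, calls and imports never reach eval in this sketch.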
- unitxt.type_utils.to_float_or_default(v, failure_default=0)
- unitxt.type_utils.verify_required_schema(required_schema_dict: Dict[str, str], input_dict: Dict[str, Any]) -> None
Verifies that the passed input_dict has all required fields and that their values are of the proper types according to required_schema_dict.
- Parameters:
required_schema_dict (Dict[str, str]) – Schema in which each key is the name of a field and each value is a string representing the type of that field's value.
input_dict (Dict[str, Any]) – Dict with input fields and their respective values.
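A schema check of this shape can be sketched in a few lines. The sketch below is a simplified illustration, not the library's implementation: verify_schema_sketch and BASIC_TYPES are hypothetical names, only basic type names are supported, and the exception types raised (KeyError for missing fields, ValueError for wrong types) are assumptions, since the documentation above does not state them.

```python
from typing import Any, Dict

# Hypothetical mapping from type strings to classes; basic types only.
BASIC_TYPES = {"str": str, "int": int, "float": float, "bool": bool}


def verify_schema_sketch(required_schema: Dict[str, str], input_dict: Dict[str, Any]) -> None:
    """Check that every required field exists and has the declared type."""
    missing = [name for name in required_schema if name not in input_dict]
    if missing:
        # Assumption: a missing field is reported with KeyError.
        raise KeyError(f"Missing required fields: {missing}")
    for name, type_name in required_schema.items():
        expected = BASIC_TYPES[type_name]
        if not isinstance(input_dict[name], expected):
            # Assumption: a type mismatch is reported with ValueError.
            raise ValueError(
                f"Field {name!r} should be of type {type_name}, "
                f"got {type(input_dict[name]).__name__}"
            )


schema = {"question": "str", "num_choices": "int"}
verify_schema_sketch(schema, {"question": "Why?", "num_choices": 4})  # passes silently
```

Checking for missing fields before checking types lets all absent names be reported at once instead of failing on the first one.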