unitxt.string_operators module
- class unitxt.string_operators.Join(__tags__: Dict[str, str] = {}, caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, field: str | None = None, to_field: str | None = None, field_to_field: List[List[str]] | Dict[str, str] | None = None, use_query: bool, process_every_value: bool = False, get_default: Any = None, not_exist_ok: bool = False, by: str)
Bases: FieldOperator
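The signature does not describe what `Join` does. Judging by the class name and the `by` parameter, it presumably concatenates a list of strings held in `field`, placing `by` between the items. The plain-Python sketch below illustrates that assumed per-value behaviour only; `join_value` is a hypothetical helper and is not part of unitxt.

```python
from typing import List


def join_value(value: List[str], by: str) -> str:
    """Hypothetical stand-in for the assumed per-value behaviour of Join."""
    # Assumption: `by` is used as the separator between list items.
    return by.join(value)


joined = join_value(["a", "b", "c"], by=", ")
print(joined)  # a, b, c
```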
- class unitxt.string_operators.RegexSplit(__tags__: Dict[str, str] = {}, caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, field: str | None = None, to_field: str | None = None, field_to_field: List[List[str]] | Dict[str, str] | None = None, use_query: bool, process_every_value: bool = False, get_default: Any = None, not_exist_ok: bool = False, by: str)
Bases: FieldOperator
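The name `RegexSplit` suggests that `by` is interpreted as a regular expression and that the string in `field` is split wherever the pattern matches. A minimal sketch of that assumed behaviour, using the standard-library `re.split`; `regex_split_value` is a hypothetical helper, not a unitxt function.

```python
import re
from typing import List


def regex_split_value(value: str, by: str) -> List[str]:
    """Hypothetical stand-in for the assumed per-value behaviour of RegexSplit."""
    # Assumption: `by` is a regular-expression pattern, not a literal separator.
    return re.split(by, value)


# Split on a comma or a semicolon, swallowing any whitespace that follows.
parts = regex_split_value("a, b;c", by=r"[,;]\s*")
print(parts)  # ['a', 'b', 'c']
```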
- class unitxt.string_operators.Replace(__tags__: Dict[str, str] = {}, caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, field: str | None = None, to_field: str | None = None, field_to_field: List[List[str]] | Dict[str, str] | None = None, use_query: bool, process_every_value: bool = False, get_default: Any = None, not_exist_ok: bool = False, old: str, new: str)
Bases: FieldOperator
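The `old` and `new` parameters mirror the arguments of Python's `str.replace`, so `Replace` presumably substitutes every occurrence of `old` with `new` in the string held in `field`. The sketch below shows that assumed behaviour in plain Python; `replace_value` is a hypothetical helper, not part of unitxt.

```python
def replace_value(value: str, old: str, new: str) -> str:
    """Hypothetical stand-in for the assumed per-value behaviour of Replace."""
    # Assumption: all occurrences are replaced, as with str.replace.
    return value.replace(old, new)


cleaned = replace_value("hello_world_again", old="_", new=" ")
print(cleaned)  # hello world again
```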
- class unitxt.string_operators.Split(__tags__: Dict[str, str] = {}, caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, field: str | None = None, to_field: str | None = None, field_to_field: List[List[str]] | Dict[str, str] | None = None, use_query: bool, process_every_value: bool = False, get_default: Any = None, not_exist_ok: bool = False, by: str)
Bases: FieldOperator
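`Split` takes the same `by` parameter as `RegexSplit`, which suggests that here `by` is a literal separator rather than a pattern. Under that assumption the per-value behaviour would match Python's `str.split`; `split_value` below is a hypothetical helper used only for illustration.

```python
from typing import List


def split_value(value: str, by: str) -> List[str]:
    """Hypothetical stand-in for the assumed per-value behaviour of Split."""
    # Assumption: `by` is matched literally, so "." splits on dots only.
    return value.split(by)


version_parts = split_value("1.2.3", by=".")
print(version_parts)  # ['1', '2', '3']
```

Note that with a literal separator, adjacent separators produce empty strings, for example `split_value("a,,b", by=",")` gives `['a', '', 'b']`.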
- class unitxt.string_operators.Strip(__tags__: Dict[str, str] = {}, caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, field: str | None = None, to_field: str | None = None, field_to_field: List[List[str]] | Dict[str, str] | None = None, use_query: bool, process_every_value: bool = False, get_default: Any = None, not_exist_ok: bool = False)
Bases: FieldOperator
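`Strip` adds no parameters of its own beyond those shared by the other operators, so it presumably removes leading and trailing whitespace from the string in `field`, as Python's `str.strip` does when called without arguments. A minimal sketch of that assumption; `strip_value` is a hypothetical helper, not a unitxt function.

```python
def strip_value(value: str) -> str:
    """Hypothetical stand-in for the assumed per-value behaviour of Strip."""
    # Assumption: only surrounding whitespace is removed; inner spaces are kept.
    return value.strip()


stripped = strip_value("  hello world \n")
print(repr(stripped))  # 'hello world'
```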
- class unitxt.string_operators.TokensSplit(__tags__: Dict[str, str] = {}, _requirements_list: List[str] | Dict[str, str] = ['transformers'], caching: bool = None, apply_to_streams: List[str] = None, dont_apply_to_streams: List[str] = None, field: str | None = None, to_field: str | None = None, field_to_field: List[List[str]] | Dict[str, str] | None = None, use_query: bool, process_every_value: bool = False, get_default: Any = None, not_exist_ok: bool = False, model: str)
Bases: FieldOperator