unitxt.inference module¶

class unitxt.inference.GenericInferenceEngine(__tags__: Dict[str, str] = {}, data_classification_policy: List[str] = None, default: str | None = None)¶: Bases: InferenceEngine

class unitxt.inference.HFLlavaInferenceEngine(__tags__: ~typing.Dict[str, str] = {}, data_classification_policy: ~typing.List[str] = None, lazy_load: bool = True, model_name: str, max_new_tokens: int)¶: Bases: InferenceEngine, LazyLoadMixin

class unitxt.inference.HFPipelineBasedInferenceEngine(__tags__: ~typing.Dict[str, str] = {}, data_classification_policy: ~typing.List[str] = None, lazy_load: bool = False, _requirements_list: ~typing.List[str] | ~typing.Dict[str, str] = {'transformers': "Install huggingface package using 'pip install --upgrade transformers"}, model_name: str, max_new_tokens: int, use_fp16: bool = True)¶: Bases: InferenceEngine, PackageRequirementsMixin, LazyLoadMixin

class unitxt.inference.IbmGenAiInferenceEngine(__tags__: ~typing.Dict[str, str] = {}, data_classification_policy: ~typing.List[str] = ['public', 'proprietary'], _requirements_list: ~typing.List[str] | ~typing.Dict[str, str] = {'genai': "Install ibm-genai package using 'pip install --upgrade ibm-generative-ai"}, beam_width: int | None = None, decoding_method: ~typing.Literal['greedy', 'sample'] | None = None, include_stop_sequence: bool | None = None, length_penalty: ~typing.Any = None, max_new_tokens: int | None = None, min_new_tokens: int | None = None, random_seed: int | None = None, repetition_penalty: float | None = None, return_options: ~typing.Any = None, stop_sequences: ~typing.List[str] | None = None, temperature: float | None = None, time_limit: int | None = None, top_k: int | None = None, top_p: float | None = None, truncate_input_tokens: int | None = None, typical_p: float | None = None, label: str = 'ibm_genai', model_name: str, parameters: ~unitxt.inference.IbmGenAiInferenceEngineParams | None = None)¶

Bases: InferenceEngine, IbmGenAiInferenceEngineParamsMixin, PackageRequirementsMixin, LogProbInferenceEngine

data_classification_policy: List[str] = ['public', 'proprietary']¶

class unitxt.inference.IbmGenAiInferenceEngineParamsMixin(__tags__: Dict[str, str] = {}, data_classification_policy: List[str] = None, beam_width: int | None = None, decoding_method: Literal['greedy', 'sample'] | None = None, include_stop_sequence: bool | None = None, length_penalty: Any = None, max_new_tokens: int | None = None, min_new_tokens: int | None = None, random_seed: int | None = None, repetition_penalty: float | None = None, return_options: Any = None, stop_sequences: List[str] | None = None, temperature: float | None = None, time_limit: int | None = None, top_k: int | None = None, top_p: float | None = None, truncate_input_tokens: int | None = None, typical_p: float | None = None)¶: Bases: Artifact

class unitxt.inference.InferenceEngine(__tags__: Dict[str, str] = {}, data_classification_policy: List[str] = None)¶

Bases: ABC, Artifact

Abstract base class for inference.

class unitxt.inference.LazyLoadMixin(__tags__: Dict[str, str] = {}, data_classification_policy: List[str] = None, lazy_load: bool = False)¶: Bases: Artifact

class unitxt.inference.LogProbInferenceEngine(__tags__: Dict[str, str] = {}, data_classification_policy: List[str] = None)¶

Bases: ABC, Artifact

Abstract base class for inference with log probs.

class unitxt.inference.MockInferenceEngine(__tags__: ~typing.Dict[str, str] = {}, data_classification_policy: ~typing.List[str] = None, model_name: str, default_inference_value: str = '[[10]]')¶: Bases: InferenceEngine

class unitxt.inference.MockModeMixin(__tags__: Dict[str, str] = {}, data_classification_policy: List[str] = None, mock_mode: bool = False)¶: Bases: Artifact

class unitxt.inference.OllamaInferenceEngine(__tags__: ~typing.Dict[str, str] = {}, data_classification_policy: ~typing.List[str] = ['public', 'proprietary'], _requirements_list: ~typing.List[str] | ~typing.Dict[str, str] = {'ollama': "Install ollama package using 'pip install --upgrade ollama"}, label: str = 'ollama', model_name: str)¶

Bases: InferenceEngine, PackageRequirementsMixin

data_classification_policy: List[str] = ['public', 'proprietary']¶

class unitxt.inference.OpenAiInferenceEngine(__tags__: ~typing.Dict[str, str] = {}, data_classification_policy: ~typing.List[str] = ['public'], _requirements_list: ~typing.List[str] | ~typing.Dict[str, str] = {'openai': "Install openai package using 'pip install --upgrade openai"}, frequency_penalty: float | None = None, presence_penalty: float | None = None, max_tokens: int | None = None, seed: int | None = None, stop: str | None | ~typing.List[str] = None, temperature: float | None = None, top_p: float | None = None, top_logprobs: int | None = 20, logit_bias: ~typing.Dict[str, int] | None = None, logprobs: bool | None = True, n: int | None = None, parallel_tool_calls: bool | None = None, service_tier: ~typing.Literal['auto', 'default'] | None = None, label: str = 'openai', model_name: str, parameters: ~unitxt.inference.OpenAiInferenceEngineParams | None = None)¶

Bases: InferenceEngine, LogProbInferenceEngine, OpenAiInferenceEngineParamsMixin, PackageRequirementsMixin

data_classification_policy: List[str] = ['public']¶

class unitxt.inference.OpenAiInferenceEngineParamsMixin(__tags__: Dict[str, str] = {}, data_classification_policy: List[str] = None, frequency_penalty: float | None = None, presence_penalty: float | None = None, max_tokens: int | None = None, seed: int | None = None, stop: str | None | List[str] = None, temperature: float | None = None, top_p: float | None = None, top_logprobs: int | None = 20, logit_bias: Dict[str, int] | None = None, logprobs: bool | None = True, n: int | None = None, parallel_tool_calls: bool | None = None, service_tier: Literal['auto', 'default'] | None = None)¶: Bases: Artifact

Bases: object

Contains the prediction results and metadata for the inference.

Args:

    prediction (Union[str, List[Dict[str, Any]]]): If this is the result of an _infer call, the string predicted by the model. If this is the result of an _infer_log_probs call, a list of dictionaries in which the i'th dictionary represents the i'th token in the response. The "top_tokens" entry of each dictionary holds a sorted list of the top tokens for that position and their log probabilities. For example:

        [
            {.. "top_tokens": [ {"text": "a", "logprob": }, {"text": "b", "logprob": } ....]},
            {.. "top_tokens": [ {"text": "c", "logprob": }, {"text": "d", "logprob": } ....]}
        ]

    input_tokens (int): Number of input tokens passed to the model.

    output_tokens (int): Number of output tokens produced by the model.

    model_name (str): The model_name as kept in the InferenceEngine.

    inference_type (str): The label stating the type of the InferenceEngine.
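As a minimal sketch of how the log-prob form of prediction could be consumed, the snippet below picks the highest-ranked token at each position. Only the "top_tokens", "text", and "logprob" keys come from the description above; the logprob values and the helper name most_likely_tokens are invented for illustration and are not part of unitxt.

```python
from typing import Any, Dict, List

# Hypothetical prediction in the shape described above; the logprob
# values are made up for illustration.
prediction: List[Dict[str, Any]] = [
    {"top_tokens": [{"text": "a", "logprob": -0.1}, {"text": "b", "logprob": -2.3}]},
    {"top_tokens": [{"text": "c", "logprob": -0.5}, {"text": "d", "logprob": -1.2}]},
]


def most_likely_tokens(prediction: List[Dict[str, Any]]) -> List[str]:
    """Return the text of the highest-logprob candidate at each position."""
    return [
        max(position["top_tokens"], key=lambda token: token["logprob"])["text"]
        for position in prediction
    ]


print(most_likely_tokens(prediction))  # prints ['a', 'c']
```

Using max over "logprob" rather than taking the first entry avoids relying on the sort direction of "top_tokens", which the description does not state.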

class unitxt.inference.TogetherAiInferenceEngine(__tags__: ~typing.Dict[str, str] = {}, data_classification_policy: ~typing.List[str] = ['public'], _requirements_list: ~typing.List[str] | ~typing.Dict[str, str] = {'together': "Install together package using 'pip install --upgrade together"}, max_tokens: int | None = None, stop: ~typing.List[str] | None = None, temperature: float | None = None, top_p: float | None = None, top_k: int | None = None, repetition_penalty: float | None = None, logprobs: int | None = None, echo: bool | None = None, n: int | None = None, min_p: float | None = None, presence_penalty: float | None = None, frequency_penalty: float | None = None, label: str = 'together', model_name: str, parameters: ~unitxt.inference.TogetherAiInferenceEngineParamsMixin | None = None)¶

Bases: InferenceEngine, TogetherAiInferenceEngineParamsMixin, PackageRequirementsMixin

data_classification_policy: List[str] = ['public']¶

class unitxt.inference.TogetherAiInferenceEngineParamsMixin(__tags__: Dict[str, str] = {}, data_classification_policy: List[str] = None, max_tokens: int | None = None, stop: List[str] | None = None, temperature: float | None = None, top_p: float | None = None, top_k: int | None = None, repetition_penalty: float | None = None, logprobs: int | None = None, echo: bool | None = None, n: int | None = None, min_p: float | None = None, presence_penalty: float | None = None, frequency_penalty: float | None = None)¶: Bases: Artifact

class unitxt.inference.VLLMRemoteInferenceEngine(__tags__: ~typing.Dict[str, str] = {}, data_classification_policy: ~typing.List[str] = ['public'], _requirements_list: ~typing.List[str] | ~typing.Dict[str, str] = {'openai': "Install openai package using 'pip install --upgrade openai"}, frequency_penalty: float | None = None, presence_penalty: float | None = None, max_tokens: int | None = None, seed: int | None = None, stop: str | None | ~typing.List[str] = None, temperature: float | None = None, top_p: float | None = None, top_logprobs: int | None = 20, logit_bias: ~typing.Dict[str, int] | None = None, logprobs: bool | None = True, n: int | None = None, parallel_tool_calls: bool | None = None, service_tier: ~typing.Literal['auto', 'default'] | None = None, label: str = 'vllm', model_name: str, parameters: ~unitxt.inference.OpenAiInferenceEngineParams | None = None)¶: Bases: OpenAiInferenceEngine

class unitxt.inference.WMLInferenceEngine(__tags__: Dict[str, str] = {}, data_classification_policy: List[str] = ['public', 'proprietary'], _requirements_list: List[str] | Dict[str, str] = {'ibm_watsonx_ai': "Install ibm-watsonx-ai package using 'pip install --upgrade ibm-watsonx-ai'. It is advised to have Python version >=3.10 installed, as at lower version this package may cause conflicts with other installed packages."}, decoding_method: Literal['greedy', 'sample'] | None = None, length_penalty: Dict[str, float | int] | None = None, temperature: float | None = None, top_p: float | None = None, top_k: int | None = None, random_seed: int | None = None, repetition_penalty: float | None = None, min_new_tokens: int | None = None, max_new_tokens: int | None = None, stop_sequences: List[str] | None = None, time_limit: int | None = None, truncate_input_tokens: int | None = None, prompt_variables: Dict[str, Any] | None = None, return_options: Dict[str, bool] | None = None, credentials: Dict[Literal['url', 'apikey', 'project_id'], str] | None = None, model_name: str | None = None, deployment_id: str | None = None, label: str = 'wml', parameters: WMLInferenceEngineParams | None = None, concurrency_limit: int = 10)¶

Bases: InferenceEngine, WMLInferenceEngineParamsMixin, PackageRequirementsMixin, LogProbInferenceEngine

Runs inference using ibm-watsonx-ai.

credentials¶

By default, the class instance builds the credentials itself by reading the environment variables "WML_URL", "WML_PROJECT_ID", and "WML_APIKEY". Alternatively, a dictionary with the keys "url", "apikey", and "project_id" can be provided directly.

Type: Dict[str, str], optional
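The environment-variable fallback described above can be pictured roughly as follows. This is a sketch under assumptions, not the library's implementation: the variable names and dictionary keys come from the description, while the helper credentials_from_env, the ENV_TO_KEY mapping, and the error behaviour are hypothetical.

```python
import os
from typing import Dict, Mapping

# Variable names and dictionary keys are taken from the description above;
# pairing them up this way is an assumption.
ENV_TO_KEY = {
    "WML_URL": "url",
    "WML_PROJECT_ID": "project_id",
    "WML_APIKEY": "apikey",
}


def credentials_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build a credentials dict, failing loudly if a variable is missing."""
    missing = [name for name in ENV_TO_KEY if name not in env]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return {key: env[name] for name, key in ENV_TO_KEY.items()}
```

Passing a plain dictionary as env makes the helper easy to try out without touching the real environment.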

model_name¶

ID of a model to be used for inference. Mutually exclusive with ‘deployment_id’.

Type: str, optional

deployment_id¶

Deployment ID of a tuned model to be used for inference. Mutually exclusive with ‘model_name’.

Type: str, optional
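The mutual exclusivity of model_name and deployment_id could be checked along these lines. The helper check_model_source is hypothetical, and rejecting the case where neither value is given is an assumption: the description above only states that the two cannot be combined.

```python
from typing import Optional


def check_model_source(model_name: Optional[str], deployment_id: Optional[str]) -> str:
    """Return which identifier is in use; reject both-or-neither."""
    # Both set violates the documented mutual exclusivity. Neither set is
    # rejected here as an assumption, since inference needs some target.
    if (model_name is None) == (deployment_id is None):
        raise ValueError("Provide exactly one of 'model_name' or 'deployment_id'.")
    return "model_name" if model_name is not None else "deployment_id"
```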

parameters¶

Instance of WMLInferenceEngineParams that defines inference parameters and their values. This attribute is deprecated; pass the respective parameters directly to the WMLInferenceEngine class instead.

Type: WMLInferenceEngineParams, optional

concurrency_limit¶

Number of requests that will be sent in parallel. The maximum is 10.

Type: int
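To illustrate what a bounded number of parallel requests means in practice, here is a generic sketch using a thread pool. It does not reproduce how WMLInferenceEngine dispatches requests: run_in_parallel and the send callable are hypothetical stand-ins, and raising an error for values above 10 (rather than, say, clamping) is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

MAX_CONCURRENCY = 10  # upper bound stated in the description above


def run_in_parallel(
    prompts: List[str],
    send: Callable[[str], str],
    concurrency_limit: int = 10,
) -> List[str]:
    """Call `send` on every prompt using at most `concurrency_limit` workers."""
    if not 1 <= concurrency_limit <= MAX_CONCURRENCY:
        raise ValueError(f"concurrency_limit must be between 1 and {MAX_CONCURRENCY}")
    with ThreadPoolExecutor(max_workers=concurrency_limit) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(send, prompts))
```

For example, run_in_parallel(["a", "b"], str.upper, concurrency_limit=2) would process both prompts concurrently and return the results in input order.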

Examples

    from unitxt.api import load_dataset
    from unitxt.inference import WMLInferenceEngine

    # Placeholder credentials; the keys must be "url", "project_id" and "apikey".
    wml_credentials = {
        "url": "some_url",
        "project_id": "some_id",
        "apikey": "some_key",
    }
    model_name = "google/flan-t5-xxl"

    # Inference parameters are passed directly to the engine.
    wml_inference = WMLInferenceEngine(
        credentials=wml_credentials,
        model_name=model_name,
        data_classification_policy=["public"],
        top_p=0.5,
        random_seed=123,
    )

    dataset = load_dataset(
        dataset_query="card=cards.argument_topic,template_card_index=0,loader_limit=5"
    )
    results = wml_inference.infer(dataset["test"])

data_classification_policy: List[str] = ['public', 'proprietary']¶

class unitxt.inference.WMLInferenceEngineParamsMixin(__tags__: Dict[str, str] = {}, data_classification_policy: List[str] = None, decoding_method: Literal['greedy', 'sample'] | None = None, length_penalty: Dict[str, float | int] | None = None, temperature: float | None = None, top_p: float | None = None, top_k: int | None = None, random_seed: int | None = None, repetition_penalty: float | None = None, min_new_tokens: int | None = None, max_new_tokens: int | None = None, stop_sequences: List[str] | None = None, time_limit: int | None = None, truncate_input_tokens: int | None = None, prompt_variables: Dict[str, Any] | None = None, return_options: Dict[str, bool] | None = None)¶: Bases: Artifact

unitxt.inference.get_model_and_label_id(model_name, label)¶