📄 Llama 3 1 70B Instruct Wml

engines.classification.llama_3_1_70b_instruct_wml

type: WMLInferenceEngine
model_name: meta-llama/llama-3-1-70b-instruct
max_new_tokens: 5
random_seed: 42
decoding_method: greedy

Explanation about WMLInferenceEngine

Runs inference using ibm-watsonx-ai.

Attributes:
credentials (Dict[str, str], optional): By default, the credentials are built by the class instance, which tries to read the environment variables “WML_URL”, “WML_PROJECT_ID”, and “WML_APIKEY”. Alternatively, a dictionary with the keys “url”, “apikey”, and “project_id” can be provided directly.

model_name (str, optional): ID of the model to be used for inference. Mutually exclusive with ‘deployment_id’.

deployment_id (str, optional): Deployment ID of a tuned model to be used for inference. Mutually exclusive with ‘model_name’.

parameters (WMLInferenceEngineParams, optional): Instance of WMLInferenceEngineParams which defines inference parameters and their values. This attribute is deprecated; pass the respective parameters directly to the WMLInferenceEngine class instead.

concurrency_limit (int): Number of requests that will be sent in parallel. The maximum is 10.
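The attribute rules above can be sketched in plain Python. This is only an illustration of the described behavior, not the actual WMLInferenceEngine implementation: the helper names (`resolve_credentials`, `check_model_selection`, `run_in_parallel`) and the `send_request` callable are hypothetical, and treating "neither model_name nor deployment_id" as an error is an assumption that the text does not state.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical helpers for illustration only; none of these names come from the library.

# Environment variable -> credentials key, as listed in the attribute description.
ENV_TO_KEY = {
    "WML_URL": "url",
    "WML_PROJECT_ID": "project_id",
    "WML_APIKEY": "apikey",
}
MAX_CONCURRENCY = 10  # documented upper bound for concurrency_limit


def resolve_credentials(credentials=None, env=None):
    """Return the explicit dictionary if given, else build one from the environment."""
    if credentials is not None:
        missing = sorted(set(ENV_TO_KEY.values()) - set(credentials))
        if missing:
            raise ValueError(f"credentials is missing keys: {missing}")
        return dict(credentials)
    env = os.environ if env is None else env
    missing = sorted(name for name in ENV_TO_KEY if name not in env)
    if missing:
        raise ValueError(f"missing environment variables: {missing}")
    return {key: env[name] for name, key in ENV_TO_KEY.items()}


def check_model_selection(model_name=None, deployment_id=None):
    """Enforce that model_name and deployment_id are not combined."""
    if model_name is not None and deployment_id is not None:
        raise ValueError("model_name and deployment_id are mutually exclusive")
    if model_name is None and deployment_id is None:
        # Assumption: at least one of the two identifiers is needed.
        raise ValueError("provide either model_name or deployment_id")


def run_in_parallel(send_request, payloads, concurrency_limit):
    """Call send_request on each payload with a bounded number of workers.

    Results are returned in the same order as the payloads.
    """
    if not 1 <= concurrency_limit <= MAX_CONCURRENCY:
        raise ValueError(f"concurrency_limit must be between 1 and {MAX_CONCURRENCY}")
    with ThreadPoolExecutor(max_workers=concurrency_limit) as pool:
        return list(pool.map(send_request, payloads))
```

Passing `env` explicitly keeps the credential lookup easy to exercise without touching the real environment; in normal use it would fall back to `os.environ`.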

Examples:

    # Import paths assume the code runs outside the package itself.
    from unitxt.api import load_dataset
    from unitxt.inference import WMLInferenceEngine

    # Placeholder values; the keys match those listed under "credentials".
    wml_credentials = {
        "url": "some_url",
        "project_id": "some_id",
        "apikey": "some_key",
    }
    model_name = "google/flan-t5-xxl"
    wml_inference = WMLInferenceEngine(
        credentials=wml_credentials,
        model_name=model_name,
        data_classification_policy=["public"],
        top_p=0.5,
        random_seed=123,
    )

    dataset = load_dataset(
        dataset_query="card=cards.argument_topic,template_card_index=0,loader_limit=5"
    )
    results = wml_inference.infer(dataset["test"])
