📄 Llama 3 1 70B Instruct Wml
engines.classification.llama_3_1_70b_instruct_wml
type: WMLInferenceEngine
model_name: meta-llama/llama-3-1-70b-instruct
max_new_tokens: 5
random_seed: 42
decoding_method: greedy
Explanation about WMLInferenceEngine
Runs inference using ibm-watsonx-ai.
- Attributes:
- credentials (Dict[str, str], optional): By default, the class instance builds the credentials from the environment variables "WML_URL", "WML_PROJECT_ID", and "WML_APIKEY". Alternatively, a dictionary with the keys "url", "apikey", and "project_id" can be passed directly.
- model_name (str, optional): ID of the model to use for inference. Mutually exclusive with 'deployment_id'.
- deployment_id (str, optional): Deployment ID of a tuned model to use for inference. Mutually exclusive with 'model_name'.
- parameters (WMLInferenceEngineParams, optional): Instance of WMLInferenceEngineParams that defines inference parameters and their values. Deprecated; pass the respective parameters directly to the WMLInferenceEngine class instead.
- concurrency_limit (int): Number of requests sent in parallel. The maximum is 10.
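The credential lookup and the 'model_name' / 'deployment_id' rule described above can be sketched with the standard library alone. This is only an illustration of the documented behavior, not the library's own code: the helper names `resolve_credentials` and `check_model_source` are hypothetical, and the requirement that at least one of the two IDs is set is an assumption.

```python
import os
from typing import Dict, Mapping, Optional

# Mapping from the documented environment variables to the documented dict keys.
ENV_TO_KEY = {"WML_URL": "url", "WML_PROJECT_ID": "project_id", "WML_APIKEY": "apikey"}


def resolve_credentials(
    credentials: Optional[Dict[str, str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: use explicit credentials if given, else read the environment."""
    if credentials is None:
        env = os.environ if env is None else env
        missing = [name for name in ENV_TO_KEY if not env.get(name)]
        if missing:
            raise ValueError(f"Missing environment variables: {', '.join(missing)}")
        return {key: env[name] for name, key in ENV_TO_KEY.items()}
    missing_keys = sorted(set(ENV_TO_KEY.values()) - set(credentials))
    if missing_keys:
        raise ValueError(f"Missing credential keys: {', '.join(missing_keys)}")
    return dict(credentials)


def check_model_source(
    model_name: Optional[str] = None, deployment_id: Optional[str] = None
) -> str:
    """Hypothetical helper: reject setting both 'model_name' and 'deployment_id'."""
    if model_name and deployment_id:
        raise ValueError("'model_name' and 'deployment_id' are mutually exclusive")
    if not (model_name or deployment_id):
        # Assumption: exactly one of the two must be provided.
        raise ValueError("Provide either 'model_name' or 'deployment_id'")
    return model_name or deployment_id
```

Passing `env` explicitly keeps the sketch easy to try without touching the real process environment.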
- Examples:
    from unitxt.api import load_dataset
    from unitxt.inference import WMLInferenceEngine

    # Placeholder values; the keys match the credentials attribute described above.
    wml_credentials = {
        "url": "some_url", "project_id": "some_id", "apikey": "some_key"
    }
    model_name = "google/flan-t5-xxl"
    wml_inference = WMLInferenceEngine(
        credentials=wml_credentials, model_name=model_name, data_classification_policy=["public"], top_p=0.5, random_seed=123,
    )

    # Load a small sample of the dataset and run inference on its test split.
    dataset = load_dataset(
        dataset_query="card=cards.argument_topic,template_card_index=0,loader_limit=5"
    )
    results = wml_inference.infer(dataset["test"])
Read more about catalog usage here.