Llama 3 1 70B Instruct Wml
engines.classification.llama_3_1_70b_instruct_wml
type: WMLInferenceEngineGeneration
model_name: meta-llama/llama-3-1-70b-instruct
max_new_tokens: 5
random_seed: 42
decoding_method: greedy
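To illustrate what decoding_method: greedy and max_new_tokens: 5 mean together, here is a minimal sketch of greedy decoding with a token cap. The score table and the greedy_decode helper are hypothetical stand-ins for illustration only; they are not part of the catalog entry or of the WML service.

```python
# Toy next-token scores: a hypothetical stand-in for a real language model.
TOY_SCORES = {
    "<s>": {"positive": 0.7, "negative": 0.3},
    "positive": {"</s>": 0.9, "positive": 0.1},
    "negative": {"</s>": 0.8, "negative": 0.2},
    "loop": {"loop": 1.0},  # never emits end-of-sequence, so only the cap stops it
}


def greedy_decode(start, max_new_tokens=5, eos="</s>"):
    """Pick the highest-scoring next token at each step, up to max_new_tokens."""
    tokens = []
    current = start
    for _ in range(max_new_tokens):
        candidates = TOY_SCORES.get(current, {})
        if not candidates:
            break
        # Greedy choice: no sampling, so repeated calls give the same output.
        current = max(candidates, key=candidates.get)
        if current == eos:
            break
        tokens.append(current)
    return tokens


short_label = greedy_decode("<s>")   # stops at end-of-sequence
capped = greedy_decode("loop")       # stops only because of the 5-token cap
```

A small cap such as 5 fits a classification setup, where the expected answer is a short label rather than free-form text.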
Explanation about WMLInferenceEngineGeneration
Generates text for textual inputs.
If you want to include images in your input, please use "WMLInferenceEngineChat" instead.
- Args:
- concurrency_limit (int):
Number of concurrent requests sent to a model. Default is 10, which is also the maximum value.
- Examples:
    from unitxt.api import load_dataset
    from unitxt.inference import WMLInferenceEngineGeneration

    # Placeholder credentials; replace with the values for your own project.
    wml_credentials = {
        "url": "some_url",
        "project_id": "some_id",
        "api_key": "some_key",
    }
    model_name = "google/flan-t5-xxl"

    wml_inference = WMLInferenceEngineGeneration(
        credentials=wml_credentials,
        model_name=model_name,
        data_classification_policy=["public"],
        top_p=0.5,
        random_seed=123,
    )

    # Load a small sample of the dataset and run inference on its test split.
    dataset = load_dataset(
        dataset_query="card=cards.argument_topic,template_card_index=0,loader_limit=5"
    )
    results = wml_inference.infer(dataset["test"])
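The concurrency_limit argument caps how many requests are in flight at once. The sketch below shows one common way to bound concurrency with a thread pool. It is an illustration under stated assumptions, not the engine's actual implementation: fake_request and infer_all are hypothetical names, and clamping an out-of-range limit to 10 is an assumption (the real engine may validate the value differently).

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENCY = 10  # the documented default and maximum


def fake_request(prompt):
    """Hypothetical stand-in for a single call to a hosted model."""
    return prompt.upper()


def infer_all(prompts, concurrency_limit=10):
    # Assumption: out-of-range limits are clamped to [1, MAX_CONCURRENCY].
    workers = max(1, min(concurrency_limit, MAX_CONCURRENCY))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(fake_request, prompts))


outputs = infer_all(["first prompt", "second prompt"], concurrency_limit=4)
```

Because pool.map preserves input order, the results line up with the prompts regardless of which request finishes first.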
Read more about catalog usage here.