llm_load_test run

Load test the wisdom model

Parameters

host

  • The host endpoint of the gRPC call

port

  • The gRPC port on the specified host

duration

  • The duration of the load test

plugin

  • The llm-load-test plugin to use (currently tgis_grpc_plugin or caikit_client_plugin)

  • default value: tgis_grpc_plugin

interface

  • The interface (http or grpc) to use, for llm-load-test plugins that support both

  • default value: grpc

model_id

  • The ID of the model to pass along with the gRPC call

  • default value: not-used

src_path

  • Path where llm-load-test has been cloned

  • default value: projects/llm_load_test/subprojects/llm-load-test/

streaming

  • Whether to stream the llm-load-test requests

  • default value: True

use_tls

  • Whether to set use_tls: True (for gRPC in Serverless mode)

concurrency

  • Number of concurrent simulated users sending requests

  • default value: 16

max_input_tokens

  • The maximum number of input tokens, used by llm-load-test to filter the dataset

  • default value: 1024

max_output_tokens

  • The maximum number of output tokens, used by llm-load-test to filter the dataset

  • default value: 512

max_sequence_tokens

  • The maximum number of total sequence tokens (input + output), used by llm-load-test to filter the dataset

  • default value: 1536
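How the three token limits combine into a dataset filter can be sketched as follows. This is an assumption inferred from the defaults (1024 + 512 = 1536), not the actual llm-load-test filtering code; the `keep_sample` helper is hypothetical.

```python
# Hedged sketch of how the three token-limit parameters plausibly
# interact when filtering the dataset. Assumed, not taken from the
# llm-load-test source.

MAX_INPUT_TOKENS = 1024      # default of max_input_tokens
MAX_OUTPUT_TOKENS = 512      # default of max_output_tokens
MAX_SEQUENCE_TOKENS = 1536   # default of max_sequence_tokens

def keep_sample(input_tokens: int, output_tokens: int) -> bool:
    """Return True if a dataset sample fits within all three limits."""
    return (
        input_tokens <= MAX_INPUT_TOKENS
        and output_tokens <= MAX_OUTPUT_TOKENS
        and input_tokens + output_tokens <= MAX_SEQUENCE_TOKENS
    )

print(keep_sample(800, 400))   # within every limit
print(keep_sample(1100, 200))  # rejected: input exceeds max_input_tokens
```

Note that with the defaults, a sample at both individual limits (1024 input, 512 output) exactly fills the sequence budget of 1536 tokens.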

endpoint

  • Name of the endpoint to query (for openai plugin only)

  • default value: /v1/completions
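A minimal sketch of how the parameters above could be assembled into a command line. The `./run_toolbox.py` entry point and the `--key=value` flag spelling are illustrative assumptions; only the parameter names and default values come from this page.

```python
# Hedged sketch: merge user-supplied values over the documented defaults
# and render them as command-line flags. The entry-point name is an
# assumption for illustration.

DEFAULTS = {
    "plugin": "tgis_grpc_plugin",
    "interface": "grpc",
    "model_id": "not-used",
    "src_path": "projects/llm_load_test/subprojects/llm-load-test/",
    "streaming": True,
    "concurrency": 16,
    "max_input_tokens": 1024,
    "max_output_tokens": 512,
    "max_sequence_tokens": 1536,
    "endpoint": "/v1/completions",
}

def build_command(host: str, port: int, duration: str, **overrides) -> list:
    """Return an argv list combining required args, defaults, and overrides."""
    params = {**DEFAULTS, "host": host, "port": port, "duration": duration}
    params.update(overrides)
    cmd = ["./run_toolbox.py", "llm_load_test", "run"]
    cmd += [f"--{key}={value}" for key, value in sorted(params.items())]
    return cmd

print(" ".join(build_command("model.example.com", 443, "60s", use_tls=True)))
```

Overrides such as `use_tls=True` or `concurrency=32` take precedence over the documented defaults, while `host`, `port`, and `duration` are always required.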