mac_ai remote_llama_cpp_run_model

Runs a model with llama.cpp on a remote host

Parameters

base_work_dir

  • The base directory in which to store working files

path

  • The path to the llama-server binary

port

  • The port number on which the llama-server should listen

name

  • The name of the model to run

prefix

  • The command prefix used to launch the llama-server

ngl

  • The number of model layers to offload to VRAM

  • default value: 99
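As a rough illustration of how these parameters fit together, the sketch below assembles a llama-server command line from them. This is a hypothetical helper, not the command's actual implementation; the example values (paths, model file name) are placeholders, while `--model`, `--port`, and `-ngl` are real llama-server flags.

```python
def build_llama_server_command(prefix, path, name, port, ngl=99):
    """Hypothetical sketch: combine the documented parameters into a
    llama-server invocation. `prefix` may be empty (no wrapper command)."""
    return f"{prefix} {path} --model {name} --port {port} -ngl {ngl}".strip()

# Placeholder values for illustration only.
cmd = build_llama_server_command(
    prefix="",
    path="/opt/llama.cpp/llama-server",
    name="model.gguf",
    port=8080,
)
print(cmd)
```

With an empty prefix this yields `/opt/llama.cpp/llama-server --model model.gguf --port 8080 -ngl 99`; a non-empty prefix (e.g. an environment wrapper) would be prepended to the command.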