Headless Configuration and Usage

library(rlmstudio)
lms_installed <- has_lms()

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

The rlmstudio package provides robust support for running LM Studio in completely headless environments. This is ideal for Linux servers, Docker containers, remote cloud instances, and automated CI/CD pipelines where a visual desktop application is unavailable or inconvenient.

To operate without a GUI, LM Studio relies on a background process called the llmster daemon. This vignette will walk you through managing the daemon, starting the local server, and fully automating your local LLM workflows.

Setup and Installation

If you are setting up a fresh remote server, the package can download and install the LM Studio CLI for you: run install_lmstudio(method = "headless") in your console to execute the installation script.
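A minimal bootstrap sketch, using only the two functions named above (has_lms() and install_lmstudio()), installs the CLI only when it is missing:

```r
# Install the LM Studio CLI only if it is not already on the PATH
if (!has_lms()) {
  install_lmstudio(method = "headless")
}
```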

# Verify the CLI is available before proceeding
has_lms()
#>  lms 
#> TRUE

Step-by-Step Guide

1. Start the Background Daemon

Unlike the desktop version, where opening the app initializes the backend engine, a headless environment requires you to start the engine manually: the llmster daemon must be running before you attempt to load models or start the API server.

# Start the headless engine in the background
lms_daemon_start()
#> ✔ LM Studio daemon started in the background.

2. Start the Local Server

With the daemon running, you can now spin up the REST API server to accept HTTP requests.

# Start the local server on the default port
lms_server_start()
#> ✔ LM Studio server started successfully on the default port.
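Once started, the server exposes an OpenAI-compatible REST API; LM Studio's default port is 1234. A quick reachability check in base R (no extra packages), assuming you kept the default port:

```r
# Sanity-check the REST endpoint; adjust the URL if you changed the port
models_json <- tryCatch(
  readLines("http://localhost:1234/v1/models", warn = FALSE),
  error = function(e) NULL
)
if (is.null(models_json)) {
  message("Server not reachable on port 1234.")
} else {
  cat(paste(models_json, collapse = "\n"))
}
```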

3. Finding and Managing Models

Because you do not have the GUI’s visual search tool, you will need to know the Hugging Face repository or the LM Studio catalog identifier for the model you want to use.

# Download a model using its identifier
job_id <- lms_download("qwen/qwen3-4b-2507")
#> ℹ Initiating download for model: "qwen/qwen3-4b-2507"...
#> ✔ Initiating download for model: "qwen/qwen3-4b-2507"... [1.1s]
#> 
#> ✔ Model "qwen/qwen3-4b-2507" is already downloaded.
lms_download_status(job_id)
#> 
#> ── Download Job: "N/A"
#> Status: already_downloaded
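For large models the download runs as a background job, so a script may need to block until it finishes. The polling sketch below assumes the object returned by lms_download_status() exposes a status string (as printed above); adapt the field name to what your version actually returns.

```r
# Poll the download job until it reports a terminal state.
# ASSUMPTION: the status object has a `status` element such as
# "downloading", "completed", "already_downloaded", or "error".
repeat {
  st <- lms_download_status(job_id)
  if (st$status %in% c("completed", "already_downloaded", "error")) break
  Sys.sleep(5)  # wait before polling again
}
```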
# View all downloaded models
models <- list_models()

# Filter for unloaded text models
unloaded_llms <- models |>
  subset(type == "llm" & state == "unloaded")
unloaded_llms
#>      state type  display_name                 key architecture size_gb
#> 1 unloaded  llm   Gemma 4 E2B  google/gemma-4-e2b       gemma4    4.11
#> 2 unloaded  llm   Gemma 4 E4B  google/gemma-4-e4b       gemma4    5.89
#> 3 unloaded  llm Qwen3 4B 2507  qwen/qwen3-4b-2507        qwen3    2.12
#> 4 unloaded  llm    Gemma 3 1B   google/gemma-3-1b  gemma3_text    0.72
#> 5 unloaded  llm    Gemma 3 4B   google/gemma-3-4b       gemma3    2.83
#> 6 unloaded  llm  Gemma 3n E4B google/gemma-3n-e4b      gemma3n    5.46
#> 7 unloaded  llm   Gemma 3 12B  google/gemma-3-12b       gemma3    7.51

4. Loading Models

Next, load the model into your system's memory (RAM/VRAM) so it is ready for inference.

# Load the model
lms_load("google/gemma-3-1b", flash_attention = TRUE)
#> ℹ Loading model: "google/gemma-3-1b"...
#> ✔ Model "google/gemma-3-1b" loaded and verified. [7.7s]
#> 
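You can confirm the allocation with the same list_models() table used above, this time filtering on the loaded state:

```r
# Confirm the model is resident in memory
loaded <- subset(list_models(), state == "loaded")
loaded$key
```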

5. Chatting

Interact with the model exactly as you would in a desktop environment.

response <- lms_chat(
  model = "google/gemma-3-1b",
  input = "Provide just the str_extract() pattern to match all text after the third comma.",
  system_prompt = "You are an expert R programmer familiar with the tidyverse."
)

cat(response)
#> ```r
#> str_extract(text, ".*,(.*)")
#> ```
#> 
#> This is the most concise and correct way to extract all text *after* the third comma in a string using the tidyverse's `str_extract()` function.  It directly targets the desired pattern without requiring further refinement.
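Because lms_chat() is an ordinary R function, batching prompts is just iteration. This sketch assumes it returns a character scalar, as in the example above:

```r
# Run several prompts against the loaded model in one pass
prompts <- c(
  "Summarise what a regular expression is in one sentence.",
  "Name one base R function for reading CSV files."
)
answers <- vapply(
  prompts,
  function(p) lms_chat(model = "google/gemma-3-1b", input = p),
  character(1)
)
```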

6. Teardown and Cleanup

In a headless environment, managing your system resources is critical. When your script finishes, you should explicitly tear down the entire stack to free up memory and stop background processes.

# 1. Unload the model from memory
lms_unload("google/gemma-3-1b")
#> ℹ Unloading model: "google/gemma-3-1b"...
#> ✔ Model "google/gemma-3-1b" unloaded successfully. [384ms]
#> 

# 2. Stop the API server
lms_server_stop()
#> ✔ LM Studio server stopped successfully.

# 3. Stop the background daemon
lms_daemon_stop()
#> ℹ The daemon is managed by the LM Studio GUI and will remain running.
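In an unattended script, the three teardown calls above are safest registered with on.exit(), so cleanup runs even if inference fails partway through. A sketch using only functions shown in this vignette (after = FALSE prepends each handler, so teardown runs in reverse order of setup):

```r
run_job <- function() {
  lms_daemon_start()
  on.exit(lms_daemon_stop(), add = TRUE)

  lms_server_start()
  on.exit(lms_server_stop(), add = TRUE, after = FALSE)

  lms_load("google/gemma-3-1b")
  on.exit(lms_unload("google/gemma-3-1b"), add = TRUE, after = FALSE)

  lms_chat("google/gemma-3-1b", "Hello from a headless job.")
}
```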

Bonus: Pipeline Automation

If you are writing a script that only needs to run a quick job and exit, managing the daemon state manually can be tedious. The with_lms_daemon() wrapper handles setup and guaranteed teardown of the background engine automatically.

# The daemon will start, the code will run, and the daemon will stop on exit.
results <- with_lms_daemon({
  lms_server_start()
  lms_load("google/gemma-3-1b")

  res <- lms_chat("google/gemma-3-1b", "Is the daemon running?")

  lms_server_stop()
  res
})
#> ✔ LM Studio daemon started in the background.
#> ✔ LM Studio server started successfully on the default port.
#> ℹ Loading model: "google/gemma-3-1b"...
#> ✔ Model "google/gemma-3-1b" loaded and verified. [6.9s]
#> 
#> ✔ LM Studio server stopped successfully.
#> ℹ The daemon is managed by the LM Studio GUI and will remain running.