Installing LM Studio
LM Studio is free and available for Windows, macOS, and Linux. Here's how to install it:
- Go to lmstudio.ai
- Click "Download" and choose your system
- Launch the downloaded installer
- Follow the installation wizard (accept default settings)
- On first launch, LM Studio will automatically download its components
💡 Tip: On macOS, you may need to authorize the application in System Preferences > Security & Privacy after first launch.
GPU Configuration
For optimal performance, configure GPU acceleration. This step is crucial for generation speed.
Windows with NVIDIA
- Open LM Studio and go to the Settings tab (⚙️)
- Click Hardware in the left menu
- Under "GPU Acceleration", select cuBLAS (NVIDIA)
- Enable "GPU offload" and set the slider to 100%
- Verify that your GPU appears in the list
macOS (Apple Silicon)
- Go to Settings > Hardware
- Select Metal as GPU backend
- Enable "GPU offload" โ all unified RAM will be used
⚠️ Warning: If you don't have a compatible GPU, LM Studio will use the CPU. It will still work, but generation will be roughly 5-10x slower.
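If you're on Windows or Linux with an NVIDIA card, you can also double-check that the driver actually sees your GPU before troubleshooting inside LM Studio. A minimal sketch; it only assumes the nvidia-smi tool that ships with the NVIDIA driver and is not an LM Studio feature:

```python
# Quick sanity check: does the NVIDIA driver see your GPU, and how much VRAM does it have?
# Assumes nvidia-smi (it ships with the NVIDIA driver) is on your PATH; not an LM Studio feature.
import subprocess

try:
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    for line in result.stdout.strip().splitlines():
        print("Detected GPU:", line)  # e.g. "NVIDIA GeForce RTX 4070, 12282 MiB"
except (FileNotFoundError, subprocess.CalledProcessError):
    print("No NVIDIA GPU detected; LM Studio will fall back to the CPU.")
```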
Download your first model
LM Studio has a built-in model browser. Here's how to find and download an LLM:
- Click on the Search tab (🔍) on the left
- In the search bar, type a model name (e.g. "Qwen 3", "Llama 3.3")
- Filter by the size recommended for your RAM (see the memory-estimate sketch below):
- 8GB RAM → 3-7B models
- 16GB RAM → 7-13B models
- 32GB+ RAM → 30B+ models
- Click on a model (e.g. "Qwen3-8B-Q5_K_M.gguf")
- Click Download; the model will be saved automatically
💡 Quick alternative: Use LocalClaw to get a personalized recommendation with a direct download link!
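The RAM recommendations above come down to simple arithmetic: a quantized GGUF model needs roughly parameter count × bits per weight / 8 bytes, plus headroom for the context. Here is a rough back-of-the-envelope sketch; the 20% overhead and the effective bits-per-weight values are assumptions, not exact figures:

```python
# Back-of-the-envelope RAM estimate for a quantized GGUF model.
# The 20% overhead (context, KV cache) and the bits-per-weight values are rough assumptions.
def estimate_model_ram_gb(params_billions: float, bits_per_weight: float, overhead: float = 1.2) -> float:
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

print(f"8B  @ Q5 (~5.0 bits): ~{estimate_model_ram_gb(8, 5.0):.1f} GB")   # ~6.0 GB, fits in 16GB RAM
print(f"8B  @ Q8 (~8.0 bits): ~{estimate_model_ram_gb(8, 8.0):.1f} GB")   # ~9.6 GB
print(f"13B @ Q4 (~4.5 bits): ~{estimate_model_ram_gb(13, 4.5):.1f} GB")  # ~8.8 GB
```

If the estimate lands close to your total RAM, step down one model size or one quantization level.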
Launch chat and configure the model
Once the model is downloaded, it's time to chat with your local AI:
- Click on the Chat tab (💬) on the left
- At the top right, click Select a model to load
- Choose the downloaded model from the list
- Wait for loading (progress bar)
- Once loaded, the chat window becomes active
Important parameters to know
- Temperature (0-2): Model creativity. 0.7 = balanced, 1.2+ = more creative/risky
- Context Length: Conversation memory length (default: 4096 tokens)
- Max Tokens: Maximum length of responses
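To see what the Temperature setting above actually does, here is a simplified illustration: the model's scores for candidate tokens are divided by the temperature before being turned into probabilities. Real samplers also apply top-k/top-p filtering, which this sketch ignores:

```python
# Simplified illustration of what Temperature does: candidate-token scores (logits)
# are divided by the temperature before being turned into probabilities.
# Real samplers also apply top-k / top-p filtering on top of this.
import math

def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.2, 0.7, 1.5):
    print(t, [round(p, 2) for p in softmax_with_temperature(logits, t)])
# 0.2 -> [0.99, 0.01, 0.0]  (nearly deterministic)
# 0.7 -> [0.74, 0.18, 0.09] (balanced)
# 1.5 -> [0.53, 0.27, 0.2]  (flatter, more creative/risky)
```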
Your first prompts
Test these prompts to evaluate your installation:
Conversation test:
Explain the theory of relativity to me like I'm 10 years old
Code test:
Write a Python function that calculates the Fibonacci sequence up to n
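For reference, one possible correct answer to the code test looks like this; the function name is just illustrative, and any equivalent implementation from the model is fine:

```python
# One acceptable answer: return the Fibonacci numbers up to (and including) n.
def fibonacci_up_to(n: int) -> list[int]:
    sequence = []
    a, b = 0, 1
    while a <= n:
        sequence.append(a)
        a, b = b, a + b
    return sequence

print(fibonacci_up_to(50))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```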
Reasoning test:
A train leaves Paris at 100km/h, another leaves Lyon at 120km/h...
💡 Performance indicator: Look at the "tokens/s" counter at the bottom of the screen. Above 20 tok/s feels fluid; 50+ tok/s is very responsive.
Advanced features
Once comfortable, explore these powerful features:
Server Mode (local API)
LM Studio can serve your model as an OpenAI-compatible API:
- Go to the Developer tab (👨‍💻)
- Click Start Server
- Your model is now accessible at http://localhost:1234
- Use this URL in any OpenAI-compatible application
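For example, you can call the server from Python with the official openai client. The model name below is a placeholder for whichever model you loaded, and the API key can be any non-empty string since the local server doesn't check it:

```python
# Minimal sketch: call LM Studio's local server through its OpenAI-compatible API.
# Assumes `pip install openai`, the server started in the Developer tab, and a model already loaded.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # LM Studio's default server address
    api_key="lm-studio",                  # placeholder; the local server doesn't check it
)

response = client.chat.completions.create(
    model="qwen3-8b",  # replace with the identifier of the model you actually loaded
    messages=[{"role": "user", "content": "Explain the theory of relativity to me like I'm 10 years old"}],
    temperature=0.7,
    max_tokens=512,
)
print(response.choices[0].message.content)
```

The same endpoint also works with curl or any other OpenAI-compatible SDK or application.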
RAG: Chat with your documents
LM Studio 0.3+ allows you to "converse" with your files:
- In the Chat tab, click the file attachment icon
- Select a PDF or text file
- The model will respond based on the document content
System Prompt
Customize model behavior with a "system prompt":
- In chat settings, find "System Prompt"
- Example:
You are a Python programming expert. Respond concisely and technically.
- This will influence the entire conversation
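The same idea carries over to the local API: the system prompt is simply a message with the role "system" sent before the user's messages. A quick sketch, reusing the placeholder model name from the server-mode example:

```python
# API equivalent of the System Prompt: a "system" role message sent before the user's messages.
# Reuses the placeholder model name from the server-mode sketch above.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="qwen3-8b",  # replace with the model you loaded
    messages=[
        {"role": "system", "content": "You are a Python programming expert. Respond concisely and technically."},
        {"role": "user", "content": "How do I read a JSON file?"},
    ],
)
print(response.choices[0].message.content)
```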
Troubleshooting common issues
โ "Out of memory" / Crashes
The model is too large for your RAM. Switch to a smaller version (7B instead of 13B) or a stronger quantization (Q4 instead of Q8).
🐌 Very slow generation
Check that the GPU is properly configured in Settings > Hardware. If you don't have a compatible GPU, slow generation on the CPU is expected.
📥 Model won't download
Check your internet connection and available disk space. Try another model to isolate the problem.
Conclusion
Congratulations! You now have a completely local, private, and functional generative AI. No subscription, no data sent to the cloud, just you and your machine.
To go further:
- Test different models to find the one that matches your usage
- Experiment with parameters (temperature, context length)
- Explore the local API to integrate your LLMs into other tools
- Join the community on Reddit r/LocalLLaMA for the latest news
And don't forget: LocalClaw is here to help you choose the perfect model every time you want to explore a new LLM!