Running AI on My Mac: Why I Ditched ChatGPT for LM Studio (And Saved $240/Year)
TL;DR
LM Studio lets you run powerful AI models locally on Mac with Apple Silicon, offering complete privacy and no subscription costs, though it's slower than cloud alternatives.
Key Takeaways
- LM Studio runs AI models locally with zero data leaving your machine
- M4 Pro with 48GB RAM can run 30B parameter vision models like Qwen3-VL
- Vision models understand screenshots, diagrams, and UI mockups
- Costs $0 vs $240/year for ChatGPT Plus
- Trade-offs: slower responses (5-10 sec), requires storage for models
LM Studio lets you run large language models locally on Apple Silicon Macs — completely free, offline, and private. On an M4 Pro with 48GB RAM, the best models are Qwen3-30B for coding/analysis and Qwen3-VL-30B for vision tasks. You trade cloud speed (responses take 5-10 seconds) for zero cost, full privacy, and no dependency on OpenAI’s uptime.
I was knee-deep in a coding problem when ChatGPT went dark. The little error message mocked me. “Try again later,” it said. My deadline wasn’t going to wait.
That outage was annoying. But it got me thinking. Every time I use ChatGPT, my data flies to OpenAI’s servers. They log it. They train on it. Maybe that’s fine for “what’s a good pizza recipe.” But code snippets? Research notes? That felt wrong.
I needed something different. Something local. Something mine.
The Cloud Problem No One Talks About
Here’s the thing about cloud AI. It’s convenient. Type a question, get an answer. But convenience has a cost.
Your prompts live on someone else’s computer. They say they don’t use it for training anymore. Maybe they don’t. But can you be sure? What about government requests? Data breaches? Server logs?
I’m not paranoid. I just value control. My Mac has plenty of power. Why send my data across the internet when I can process it right here?
How I Found LM Studio
A friend mentioned LM Studio in passing. “Run AI models on your Mac,” he said. “Totally free.”
I was skeptical. Local AI sounded slow. Complicated. Probably worse than the cloud versions.
I downloaded it anyway. The install was simple. No account. No credit card. Just download and go.
The interface looked clean. Like a chat app, but with a model picker. I could browse thousands of open-source models. Download them. Run them locally.
No API keys. No monthly bills. No data leaving my machine.
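Beyond the chat window, LM Studio can also expose an OpenAI-compatible HTTP server on localhost (port 1234 by default, enabled from the app's developer tab), so your own scripts can query whatever model you have loaded. Here's a minimal Python sketch using only the standard library; the model name is a placeholder, not a requirement:

```python
import json
import urllib.request

# LM Studio's local server speaks the OpenAI chat-completions format.
# Default address; turn the server on from the app's developer tab first.
BASE_URL = "http://localhost:1234/v1"

def build_chat_request(prompt: str, model: str = "qwen2.5-coder-14b") -> dict:
    """Build an OpenAI-style chat payload. The model name is just an
    example -- use whichever model you have loaded locally."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """POST the prompt to the local server and return the reply text."""
    payload = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Usage (requires LM Studio running with a model loaded):
#   print(ask("Explain Python's GIL in two sentences."))
```

Nothing in that request leaves your machine; it's a loopback call to the app.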
My M4 Pro Makes This Possible
I upgraded to an M4 Pro MacBook Pro earlier this year. 48GB of unified memory. At the time, I thought I’d gone overboard.
Turns out, that RAM is perfect for AI. Most people run smaller models. Maybe 8B parameters. Those work fine on 16GB machines.
But with 48GB? I can run Qwen3-VL-30B. That’s a big model. A smart model. And it can do something most AI tools can’t.
It can see.
Model Recommendations by RAM
Not sure which model to run? Here’s what works at different RAM levels:
| RAM | Best Models | Use Case |
|---|---|---|
| 16GB | Qwen-7B, Llama-8B, Mistral-7B | Basic chat, simple coding questions |
| 32GB | Qwen-14B, Llama-13B, Deepseek-Coder-6.7B | Complex reasoning, longer context windows |
| 48GB+ | Qwen3-VL-30B, Llama-70B (quantized), CodeLlama-34B | Vision models, advanced analysis, code generation |
More RAM means bigger models. Bigger models mean better reasoning. It’s that simple.
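As a rough sanity check before downloading, you can estimate a quantized model's memory footprint from its parameter count: bytes per weight is roughly the quantization bit width divided by 8, plus some overhead for the KV cache and runtime buffers. A sketch (the bits-per-weight values are approximate averages for common GGUF quants, and the overhead factor is my own fudge):

```python
def approx_model_ram_gb(params_billion: float, quant_bits: float = 4.5,
                        overhead: float = 1.2) -> float:
    """Rough RAM estimate for a quantized model.

    params_billion: parameter count in billions (30 for a 30B model)
    quant_bits:     effective bits per weight (Q4_K_M is roughly 4.5,
                    Q5_K_M roughly 5.5, Q8_0 roughly 8.5)
    overhead:       fudge factor for KV cache and runtime buffers
    """
    bytes_per_param = quant_bits / 8
    return params_billion * bytes_per_param * overhead

# A 30B model at ~4.5 bits/weight lands around 20 GB:
# comfortable on 48GB, hopeless on 16GB.
```

The exact numbers vary by architecture and context length, but the estimate is close enough to tell you whether a download is worth your time.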
Best Models for Apple Silicon Macs (2026)
| Mac Config | Best Model | Quant | Memory Use | Best For | Rating |
|---|---|---|---|---|---|
| M4 Pro 24GB | Qwen 2.5 Coder 14B | Q5_K_M | ~12GB | Coding, general chat | ⭐⭐⭐⭐⭐ |
| M4 Pro 24GB | Llama 3.1 8B | Q8_0 | ~10GB | Fast general purpose | ⭐⭐⭐⭐ |
| M4 Max 36GB | Qwen 2.5 32B | Q5_K_M | ~24GB | Best all-rounder | ⭐⭐⭐⭐⭐ |
| M4 Max 48GB | Llama 3.3 70B | Q4_K_M | ~42GB | Maximum capability | ⭐⭐⭐⭐⭐ |
| M4 Max 48GB | Qwen 2.5 VL 32B | Q5_K_M | ~24GB | Vision + text | ⭐⭐⭐⭐ |
| Any Mac 16GB+ | Phi-4 Mini 3.8B | Q8_0 | ~5GB | Light tasks, testing | ⭐⭐⭐ |
My daily driver for text work: Qwen 2.5 Coder 14B at Q5_K_M. It handles coding tasks, document analysis, and general questions without breaking a sweat, and everything stays on my machine.
Vision Models Are the Unexpected Part
Qwen3-VL is a vision-language model. Feed it text, it responds. Feed it an image, it understands that too.
Last week I got a weird error in my terminal. Red text everywhere. I took a screenshot. Pasted it into LM Studio. Asked “what’s wrong here?”
The model looked at my screenshot. Pointed out the exact line causing trouble. Explained why. Suggested a fix.
Here’s the kind of prompt that works:
Look at this terminal screenshot. The red error text starts with “TypeError”. What’s causing this and how do I fix it?
And the model responds with context-aware analysis:
The error “TypeError: Cannot read properties of undefined” on line 47 indicates you’re trying to access a property on a variable that wasn’t initialized. Check that `userData` is defined before accessing `userData.profile.name`. Add a null check or optional chaining: `userData?.profile?.name`.
This isn’t OCR. The model actually understands visual context. Diagrams. Charts. UI mockups. Code screenshots with syntax highlighting intact.
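If you're scripting against the local server instead of pasting into the chat UI, vision models accept images through the same OpenAI-style message format, with the screenshot embedded as a base64 data URL. A sketch under those assumptions; the model name is a placeholder, and this presumes the server is enabled with a vision-capable model loaded:

```python
import base64

def build_vision_request(image_bytes: bytes, question: str,
                         model: str = "qwen3-vl-30b") -> dict:
    """Build an OpenAI-style chat payload that attaches a screenshot as a
    base64 data URL. The model name is an example -- any vision-capable
    model loaded in LM Studio should accept this message shape."""
    data_url = "data:image/png;base64," + base64.b64encode(image_bytes).decode()
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }],
    }

# Usage (assumes the local server is running):
#   payload = build_vision_request(open("error.png", "rb").read(),
#                                  "What's causing this TypeError?")
#   then POST payload to http://localhost:1234/v1/chat/completions
```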
I also run the smaller Qwen3-VL-8B for quick questions — it’s faster and good enough for most things. The 30B model is noticeably better for anything that requires multi-step reasoning or a longer context.
Most people with 16GB machines are running 7B or 8B models. With 48GB you can run 30B, and the quality difference is real.
Speed Comparison: Cloud vs Local
Let’s be honest about performance. Here’s what I measured:
| Task | ChatGPT | LM Studio (Qwen-30B) |
|---|---|---|
| Simple question | ~1 sec | ~5 sec |
| Code explanation (100 lines) | ~2 sec | ~10 sec |
| Image analysis | ~3 sec | ~15 sec |
| Long document summary | ~4 sec | ~20 sec |
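Your numbers will differ with model, quant, and prompt length, so it's worth timing your own setup. A small helper that reports the median of several runs; `ask` here stands in for any function that sends a prompt to the model and returns the reply (for example, a wrapper around the local server):

```python
import time
from statistics import median

def time_responses(ask, prompts, runs: int = 3) -> dict:
    """Time a model-querying callable over several prompts.

    `ask` is any function taking a prompt string and returning the reply.
    Returns median seconds per prompt so one slow outlier (like the first
    request after a model load) doesn't skew the numbers.
    """
    results = {}
    for prompt in prompts:
        samples = []
        for _ in range(runs):
            start = time.perf_counter()
            ask(prompt)
            samples.append(time.perf_counter() - start)
        results[prompt] = median(samples)
    return results
```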
The slower speed is the main trade-off. But here’s the thing—I’ve found it rarely matters for actual work. While waiting 10 seconds for a response, I’m reading the question I just asked. By the time I look up, the answer is there.
What I Actually Use It For
Every day looks different. Sometimes I’m researching a new library. I paste documentation. Ask questions. The model helps me understand faster than reading alone.
Other times I’m experimenting. Testing prompts. Trying different models. Seeing what works. LM Studio makes this easy. Switch models with one click.
I’ve used it for code reviews. Explaining legacy code. Brainstorming architecture. Even writing. If you’re curious how local models compare to cloud-based options for coding, I broke that down in my AI coding tools comparison. The AI isn’t perfect. But it’s always there. Always private.
Yesterday I analyzed a competitor’s UI design. Screenshot → LM Studio → detailed breakdown of their layout choices. All offline. No one tracking what I’m researching.
That’s the real win. Privacy isn’t about hiding. It’s about control.
Alternatives I Considered
LM Studio isn’t the only option. Here’s how it compares:
- Ollama – Command-line focused, no built-in GUI. Great for devs who prefer terminal. Excellent for automation and scripting.
- GPT4All – Simpler UI, fewer models, easier for beginners. Good starting point if LM Studio feels overwhelming.
- Jan.ai – Similar to LM Studio, newer project, currently has fewer model options. Worth watching.
I picked LM Studio for the model variety and vision model support. The interface is clean. The model selection is massive. And it just works.
The Honest Downsides
LM Studio isn’t perfect. Let me be real about the problems.
First, it’s slower than ChatGPT. Cloud models have massive GPU clusters. My Mac has… one M4 Pro. Responses take longer. Maybe 5-10 seconds instead of instant.
Second, you manage everything yourself. Want a new model? Download it. That’s a few gigabytes. Models pile up fast. You’ll need storage space.
Third, there’s a learning curve. Which model for which task? What’s the difference between 7B and 70B? You have to learn this stuff.
The interface is simpler than the web version of ChatGPT. No plugins. No browsing. Just you and the model.
For some tasks, cloud AI still wins. Web search? Up-to-date info? Yeah, ChatGPT is better there. And when AI-generated code fails in production, running models locally gives you more control over the fallout — something I explored in my take on vibe coding.
Why I Keep Using It Anyway
The practical answer: my data stays on my machine, it works offline, and I’m not paying per token. ChatGPT Plus is $20/month — $240/year. LM Studio is free, and the models are free. I bought the Mac for other reasons anyway.
There’s also the ecosystem thing. I’m not locked into one model or one provider. If Qwen releases something better next week, I download it. If I don’t trust a particular model for a particular task, I swap. That flexibility matters more than I expected when I started.
The pace of open-source model releases has been relentless. What counts as “good” keeps moving upward.
Getting Started
Go to lmstudio.ai. Download the app. Open it.
Click “Discover” to browse models. Search for “Qwen3-VL-8B” if you want to try vision. Or “Qwen-7B” for text-only.
Download a model. Wait. It’s big. Go make coffee.
Once downloaded, click “Load.” Pick your model. Start chatting.
That’s it. Seriously.
If you have a Mac with Apple Silicon, you’re golden. 16GB RAM minimum. More is better. Windows and Linux work too.
The models live at Hugging Face. Thousands of them. Some good. Some bad. LM Studio shows you ratings and downloads to help pick.
My Final Take
I still use ChatGPT. It’s better for anything that needs web access or current information. But for daily coding work, document analysis, and anything involving code I’d rather not send to an external server — LM Studio is what I reach for.
The M4 Pro with 48GB was expensive. Running local models is what made it feel worth it. The vision capability was the surprise — I didn’t expect that to be genuinely useful, and it is.
If you have Apple Silicon and care about where your data goes, it’s worth downloading and trying. The barrier is lower than it sounds.
Related Posts
- AI Coding Tools Compared (2026) — How local models stack up against cloud AI assistants
- The Truth About Vibe Coding — When AI-generated code works and when it bites you
- I Built Logwell for Self-Hosted Logging — PostgreSQL-native logging with Claude Code
- Full-Stack App on Cloudflare Workers — D1, Durable Objects, Queues, and AI parsing
- My Side Project Stack in 2026 — The full toolkit alongside LM Studio
Frequently Asked Questions
How much RAM do I need to run LM Studio?
Minimum 16GB for smaller 7B-8B models. 32GB+ recommended for 13B models. 48GB+ needed for 30B+ parameter models like Qwen3-VL-30B.
Is LM Studio really free?
Yes, completely free. No account, no credit card, no subscription. The models from Hugging Face are also free and open-source.
Can LM Studio analyze images and screenshots?
Yes, with vision-language models like Qwen3-VL. You can paste screenshots of code errors, UI designs, charts, or diagrams and the model will understand the visual context.
How does LM Studio compare to ChatGPT?
LM Studio is slower (5-10 seconds vs instant), works offline, has complete privacy, costs nothing, but lacks web browsing and real-time information. For daily coding and analysis work, it's highly capable.
What Mac models support LM Studio?
Any Mac with Apple Silicon (M1, M2, M3, M4). Performance scales with RAM and chip tier. M4 Pro/Max with 48GB+ RAM offers the best experience for large models.
Divanshu Chauhan (@divkix)
Software Engineer based in Tempe, Arizona, USA. More about divkix