
Running AI on My Mac: Why I Ditched ChatGPT for LM Studio

Tags: AI, LM Studio, Privacy, Mac, Qwen, Local AI, M4 Pro, Vision Models, ChatGPT Alternative, Offline AI

TL;DR

LM Studio lets you run powerful AI models locally on Mac with Apple Silicon, offering complete privacy and no subscription costs, though it's slower than cloud alternatives.

Key Takeaways

  • LM Studio runs AI models locally with zero data leaving your machine
  • M4 Pro with 48GB RAM can run 30B parameter vision models like Qwen3-VL
  • Vision models understand screenshots, diagrams, and UI mockups
  • Costs $0 vs $240/year for ChatGPT Plus
  • Trade-offs: slower responses (5-10 sec), requires storage for models

LM Studio lets you run large language models locally on Apple Silicon Macs — completely free, offline, and private. On an M4 Pro with 48GB RAM, the best models are Qwen3-30B for coding/analysis and Qwen3-VL-30B for vision tasks. You trade cloud speed (responses take 5-10 seconds) for zero cost, full privacy, and no dependency on OpenAI’s uptime.

I was knee-deep in a coding problem when ChatGPT went dark. The little error message mocked me. “Try again later,” it said. My deadline wasn’t going to wait.

That outage was annoying. But it got me thinking. Every time I use ChatGPT, my data flies to OpenAI’s servers. They log it. They train on it. Maybe that’s fine for “what’s a good pizza recipe.” But code snippets? Research notes? That felt wrong.

I needed something different. Something local. Something mine.

The Cloud Problem No One Talks About

Here’s the thing about cloud AI. It’s convenient. Type a question, get an answer. But convenience has a cost.

Your prompts live on someone else’s computer. They say they don’t use it for training anymore. Maybe they don’t. But can you be sure? What about government requests? Data breaches? Server logs?

I’m not paranoid. I just value control. My Mac has plenty of power. Why send my data across the internet when I can process it right here?

How I Found LM Studio

A friend mentioned LM Studio in passing. “Run AI models on your Mac,” he said. “Totally free.”

I was skeptical. Local AI sounded slow. Complicated. Probably worse than the cloud versions.

I downloaded it anyway. The install was simple. No account. No credit card. Just download and go.

The interface looked clean. Like a chat app, but with a model picker. I could browse thousands of open-source models. Download them. Run them locally.

No API keys. No monthly bills. No data leaving my machine.

My M4 Pro Makes This Possible

I upgraded to an M4 Pro MacBook Pro earlier this year. 48GB of unified memory. At the time, I thought I’d gone overboard.

Turns out, that RAM is perfect for AI. Most people run smaller models. Maybe 8B parameters. Those work fine on 16GB machines.

But with 48GB? I can run Qwen3-VL-30B. That’s a big model. A smart model. And it can do something most AI tools can’t.

It can see.

Model Recommendations by RAM

Not sure which model to run? Here’s what works at different RAM levels:

| RAM | Best Models | Use Case |
|---|---|---|
| 16GB | Qwen-7B, Llama-8B, Mistral-7B | Basic chat, simple coding questions |
| 32GB | Qwen-14B, Llama-13B, Deepseek-Coder-6.7B | Complex reasoning, longer context windows |
| 48GB+ | Qwen3-VL-30B, Llama-70B (quantized), CodeLlama-34B | Vision models, advanced analysis, code generation |

More RAM means bigger models. Bigger models mean better reasoning. It’s that simple.
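A quick way to sanity-check whether a model fits: multiply the parameter count by the bits per weight of the quantization, then add a couple of gigabytes for the KV cache and runtime. This is my own back-of-the-envelope formula, not an official LM Studio number, so treat the result as a floor rather than a guarantee:

```python
def estimated_ram_gb(params_billion: float, quant_bits: float,
                     overhead_gb: float = 2.0) -> float:
    """Rough RAM footprint: quantized weights plus a flat allowance
    for KV cache and runtime overhead."""
    weights_gb = params_billion * quant_bits / 8  # 1B params at 8 bits is ~1 GB
    return weights_gb + overhead_gb

print(round(estimated_ram_gb(30, 5), 1))  # a 30B model at ~5 bits: prints 20.8
```

That lines up with the tables below: a 30B model at a Q5-class quant lands around 20GB, comfortable on 48GB but hopeless on a 16GB machine.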

Best Models for Apple Silicon Macs (2026)

| Mac Config | Best Model | Quant | VRAM Usage | Best For | Rating |
|---|---|---|---|---|---|
| M4 Pro 24GB | Qwen 2.5 Coder 14B | Q5_K_M | ~12GB | Coding, general chat | ⭐⭐⭐⭐⭐ |
| M4 Pro 24GB | Llama 3.3 8B | Q8_0 | ~10GB | Fast general purpose | ⭐⭐⭐⭐ |
| M4 Max 36GB | Qwen 2.5 32B | Q5_K_M | ~24GB | Best all-rounder | ⭐⭐⭐⭐⭐ |
| M4 Max 48GB | Llama 3.3 70B | Q4_K_M | ~42GB | Maximum capability | ⭐⭐⭐⭐⭐ |
| M4 Max 48GB | Qwen 2.5 VL 32B | Q5_K_M | ~24GB | Vision + text | ⭐⭐⭐⭐ |
| Any Mac 16GB+ | Phi-4 Mini 3.8B | Q8_0 | ~5GB | Light tasks, testing | ⭐⭐⭐ |

My daily driver for quick tasks: Qwen 2.5 Coder 14B at Q5_K_M. It handles coding tasks, document analysis, and general questions without breaking a sweat — and everything stays on my machine.

Vision Models Changed Everything

Qwen3-VL is a vision-language model. Feed it text, it responds. Feed it an image, it understands that too.

Last week I got a weird error in my terminal. Red text everywhere. I took a screenshot. Pasted it into LM Studio. Asked “what’s wrong here?”

The model looked at my screenshot. Pointed out the exact line causing trouble. Explained why. Suggested a fix.

Here’s the kind of prompt that works:

Look at this terminal screenshot. The red error text starts with “TypeError”. What’s causing this and how do I fix it?

And the model responds with context-aware analysis:

The error “TypeError: Cannot read properties of undefined” on line 47 indicates you’re trying to access a property on a variable that wasn’t initialized. Check that userData is defined before calling userData.profile.name. Add a null check or optional chaining: userData?.profile?.name.
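Under the hood, this kind of exchange also works over LM Studio's local OpenAI-compatible server: a screenshot travels as a base64 data URL inside the chat message. Here's a minimal sketch of the payload; the model name and the dummy image bytes are placeholders for whatever you actually load:

```python
import base64

def vision_payload(model: str, question: str, image_bytes: bytes) -> dict:
    """Build an OpenAI-style chat payload with an inline base64 image,
    the message format vision models like Qwen3-VL accept."""
    data_url = "data:image/png;base64," + base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }],
    }

# Dummy bytes stand in for a real screenshot read with open("shot.png", "rb")
payload = vision_payload("qwen3-vl-30b",
                         "What's causing this TypeError and how do I fix it?",
                         b"fake image bytes")
```

POST that to the local server's chat completions endpoint and you get the same analysis the chat window shows.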

This isn’t OCR. The model actually understands visual context. Diagrams. Charts. UI mockups. Code screenshots with syntax highlighting intact.

I also run the smaller Qwen3-VL-8B. It’s faster. Good for quick questions. But the 30B model? That’s where the magic happens. Complex reasoning. Better context. More accurate answers.

Most people can’t run 30B models. Not enough RAM. I can. That’s my edge.

Speed Comparison: Cloud vs Local

Let’s be honest about performance. Here’s what I measured:

| Task | ChatGPT | LM Studio (Qwen-30B) |
|---|---|---|
| Simple question | ~1 sec | ~5 sec |
| Code explanation (100 lines) | ~2 sec | ~10 sec |
| Image analysis | ~3 sec | ~15 sec |
| Long document summary | ~4 sec | ~20 sec |

The slower speed is the main trade-off. But here's the thing: I've found it rarely matters for actual work. While waiting 10 seconds for a response, I'm reading the question I just asked. By the time I look up, the answer is there.

What I Actually Use It For

Every day looks different. Sometimes I’m researching a new library. I paste documentation. Ask questions. The model helps me understand faster than reading alone.

Other times I’m experimenting. Testing prompts. Trying different models. Seeing what works. LM Studio makes this easy. Switch models with one click.

I’ve used it for code reviews. Explaining legacy code. Brainstorming architecture. Even writing. If you’re curious how local models compare to cloud-based options for coding, I broke that down in my AI coding tools comparison. The AI isn’t perfect. But it’s always there. Always private.

Yesterday I analyzed a competitor’s UI design. Screenshot → LM Studio → detailed breakdown of their layout choices. All offline. No one tracking what I’m researching.

That’s the real win. Privacy isn’t about hiding. It’s about control.

Alternatives I Considered

LM Studio isn’t the only option. Here’s how it compares:

  • Ollama – Command-line focused, no built-in GUI. Great for devs who prefer terminal. Excellent for automation and scripting.
  • GPT4All – Simpler UI, fewer models, easier for beginners. Good starting point if LM Studio feels overwhelming.
  • Jan.ai – Similar to LM Studio, newer project, currently has fewer model options. Worth watching.

I picked LM Studio for the model variety and vision model support. The interface is clean. The model selection is massive. And it just works.

The Honest Downsides

LM Studio isn’t perfect. Let me be real about the problems.

First, it’s slower than ChatGPT. Cloud models have massive GPU clusters. My Mac has… one M4 Pro. Responses take longer. Maybe 5-10 seconds instead of instant.

Second, you manage everything yourself. Want a new model? Download it. That’s a few gigabytes. Models pile up fast. You’ll need storage space.

Third, there’s a learning curve. Which model for which task? What’s the difference between 7B and 70B? You have to learn this stuff.

The interface is simpler than the web version of ChatGPT. No plugins. No browsing. Just you and the model.

For some tasks, cloud AI still wins. Web search? Up-to-date info? Yeah, ChatGPT is better there. And when AI-generated code fails in production, running models locally gives you more control over the fallout — something I explored in my take on vibe coding.

Why I Keep Using It Anyway

Because my data stays mine.

Because I don’t need internet to think.

Because I’m not paying per token.

I ran the numbers. ChatGPT Plus is $20/month. That’s $240/year. LM Studio is free. The models are free. I paid for the Mac anyway.

When my internet goes down, LM Studio still works. When OpenAI has an outage, I don’t care. When they change their pricing, it doesn’t affect me.

I’m not locked into their ecosystem. Don’t like Qwen? Try Llama. Or Mistral. Or Deepseek. Or whatever comes next week.

Open source moves fast. Really fast. Models improve constantly. And they’re all free.

Getting Started Is Easier Than You Think

Go to lmstudio.ai. Download the app. Open it.

Click “Discover” to browse models. Search for “Qwen3-VL-8B” if you want to try vision. Or “Qwen-7B” for text-only.

Download a model. Wait. It’s big. Go make coffee.

Once downloaded, click “Load.” Pick your model. Start chatting.

That’s it. Seriously.
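One more trick once a model is loaded: LM Studio can serve it over a local OpenAI-compatible HTTP API, so your scripts can talk to it too. A standard-library-only sketch, assuming the server is running at its default address of http://localhost:1234/v1 (check the app if you've changed it):

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # LM Studio's default local server address

def chat_request(prompt: str) -> urllib.request.Request:
    """Build the HTTP request for one chat message to the local server."""
    body = json.dumps({
        "model": "local-model",  # LM Studio answers with whichever model is loaded
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def ask(prompt: str) -> str:
    """Send the request and return the model's reply (server must be running)."""
    with urllib.request.urlopen(chat_request(prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# With the server running and a model loaded:
# reply = ask("Explain unified memory in one sentence.")
```

Same request shape as the OpenAI API, so most client libraries work by just pointing the base URL at localhost.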

If you have a Mac with Apple Silicon, you’re golden. 16GB RAM minimum. More is better. Windows and Linux work too.

The models live at Hugging Face. Thousands of them. Some good. Some bad. LM Studio shows you ratings and downloads to help pick.

My Final Take

I still use ChatGPT sometimes. It has its place. But for daily work? LM Studio wins.

The M4 Pro with 48GB was expensive. But running serious AI models locally? That alone justifies the cost. The vision models are a game-changer. Most people don’t realize what’s possible.

If you care about privacy, try it. If you hate subscriptions, try it. If you just want to own your tools, try it.

Your data belongs to you. Not to some company’s training dataset.

Download LM Studio. Pick a model. See what local AI can do.

You might not go back.

Frequently Asked Questions

How much RAM do I need to run LM Studio?

Minimum 16GB for smaller 7B-8B models. 32GB+ recommended for 13B models. 48GB+ needed for 30B+ parameter models like Qwen3-VL-30B.

Is LM Studio really free?

Yes, completely free. No account, no credit card, no subscription. The models from Hugging Face are also free and open-source.

Can LM Studio analyze images and screenshots?

Yes, with vision-language models like Qwen3-VL. You can paste screenshots of code errors, UI designs, charts, or diagrams and the model will understand the visual context.

How does LM Studio compare to ChatGPT?

LM Studio is slower (5-10 seconds vs instant), works offline, has complete privacy, costs nothing, but lacks web browsing and real-time information. For daily coding and analysis work, it's highly capable.

What Mac models support LM Studio?

Any Mac with Apple Silicon (M1, M2, M3, M4). Performance scales with RAM and chip tier. M4 Pro/Max with 48GB+ RAM offers the best experience for large models.

How to Set Up LM Studio on Mac

Total time: about 15 minutes

  1. Download LM Studio: Go to lmstudio.ai and download the Mac installer. Open and drag to Applications.

  2. Browse Available Models: Click 'Discover' tab. Search for 'Qwen3-VL-8B' for vision capabilities or 'Qwen-7B' for text-only.

  3. Download a Model: Click download on your chosen model. Wait for download to complete (several GB).

  4. Load and Chat: Click 'Load' to activate the model. Start chatting in the conversation interface.

Divanshu Chauhan (@divkix)

Software Engineer based in Tempe, Arizona, USA.