I am still hoping, but for the moment… I have been trying every 30-80B model that came out in the last several months, with crush and opencode, and they're just useless. They do produce some output, but it's nowhere near the level that claude code gets me out of the box. It's not even in the same league.
With LLMs, I feel like price isn't the main factor: my time is valuable, and a tool that doesn't improve the way I work is just a toy.
That said, I do have hope, as the small models are getting better.
I use Opus 4.5 and GPT 5.2-Codex through VS Code all day long, and the closest I've come is Devstral-Small-2-24B-Instruct-2512 running on a DGX Spark, hosted with vLLM as an OpenAI-compatible API endpoint that I use to power the Cline VS Code extension.
It works, but it's slow. It's much more like: set it up, come back in an hour, and it's done. I am incredibly impressed by it. There are quantized GGUF and MLX builds of the 123B that I haven't tried yet, which could fit on my 36GB M3 MacBook.
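For anyone curious, the client side of that setup is just the standard OpenAI Python client pointed at the local server. A minimal sketch, assuming vLLM's default port and that the model id matches whatever `vllm serve` was launched with (the Hugging Face prefix here is an assumption):

```python
# Minimal sketch: querying a local vLLM OpenAI-compatible endpoint.
# The port, model id, and API key are assumptions; match them to how
# the vLLM server was actually started.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM's default serve address (assumed)
    api_key="not-needed-locally",         # vLLM ignores the key unless configured
)

resp = client.chat.completions.create(
    model="mistralai/Devstral-Small-2-24B-Instruct-2512",  # assumed model id
    messages=[{"role": "user", "content": "Write a shell one-liner to count TODOs."}],
)
print(resp.choices[0].message.content)
```

Cline then just needs the same base URL and model id in its OpenAI-compatible provider settings.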
But overall, it feels about 50% too slow, which blows my mind, because we are probably 9 months away from a local model that is fast and good enough for my script kiddie work.
Claude Code is largely about prompting and orchestration of the conversation; the LLM is just a tool in these agentic frameworks. What's truly ingenious is how the context is engineered and managed, how code RAG is approached, and how LLM memory is used.
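To make the "context is engineered" point concrete, here's a deliberately naive sketch of one such trick: trimming the conversation to a token budget while always keeping the system prompt. This is not Claude Code's actual logic, and the character-based token estimate is just a stand-in:

```python
# Illustrative only: one naive way an agent harness might keep a
# conversation inside the model's context window. Real harnesses
# summarize dropped turns instead of discarding them outright.
def trim_history(messages, budget_tokens=32_000, chars_per_token=4):
    """Keep the system prompt plus the most recent turns that fit the budget."""
    system, rest = messages[0], messages[1:]
    kept, used = [], len(system["content"]) // chars_per_token
    for msg in reversed(rest):            # walk newest-first
        cost = len(msg["content"]) // chars_per_token
        if used + cost > budget_tokens:
            break                         # older turns get dropped (or summarized)
        kept.append(msg)
        used += cost
    return [system] + list(reversed(kept))
```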
So my guess would be: we need open conversation, or something along the lines of "useful linguistic-AI approaches for combing and grooming code".
Agreed. I've been trying to use opencode and crush, and neither of them does anything useful for me. In contrast, claude code "just works" and does genuinely useful work. And it's not just because of the specific LLM used; it's the overall engineering of the tool, the prompt behind the scenes, and so on.
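To illustrate what "the overall engineering of the tool" means in practice, here's a minimal sketch of the system-prompt-plus-tool-loop pattern these agents run. The `read_file` tool and the model choice are hypothetical; the chat-completions calls are the standard OpenAI API:

```python
# Minimal sketch of an agentic tool loop: the model asks for tools,
# the harness executes them and feeds results back until the model
# answers in plain text. Not any specific tool's implementation.
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

TOOLS = [{
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Return the contents of a file in the repo.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

messages = [
    {"role": "system", "content": "You are a coding agent. Inspect code with tools before answering."},
    {"role": "user", "content": "What does main.py do?"},
]

while True:
    resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=TOOLS)
    msg = resp.choices[0].message
    if not msg.tool_calls:
        print(msg.content)            # final plain-text answer
        break
    messages.append(msg)              # keep the tool request in history
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        result = read_file(**args)    # dispatch; real agents sandbox this
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
```

Most of the quality difference between tools lives in the system prompt, the tool set, and how that history is managed, not in the loop itself.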
But the bottom line is that I still can't find a way to use local LLMs, or opencode and crush, for coding.
Which is very sad, and perhaps we should be aiming to introduce some very smart linguists into the whole ML/LLM thing, who can learn and explore how best to interact with the funny archive that models are.