Local LLM Real-World Tests

Local LLM Real-World Tests: real hardware tests, screenshots, and tokens per second

This series keeps hands-on local model tests separate from the model finder. Each article uses a fixed hardware setup, captures the important steps, and reports measured speed instead of vague hardware advice.

First article · 2026-06-16

8GB RAM CPU-Only Local LLM Benchmark

Docker 8GiB, no GPU passthrough, six Ollama models tested with screenshots and measured tokens per second.

qwen2.5:0.5b: 58.02 tokens/s · qwen2.5:7b: 7.15 tokens/s · CPU-only, no GPU
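
This page does not show how the tokens-per-second figures were measured. As a minimal sketch of one common approach, assuming the speed is derived from the `eval_count` (generated tokens) and `eval_duration` (nanoseconds) fields that Ollama reports in its generate response, the calculation could look like the following. The function name and the sample values are hypothetical and are not measurements from the article.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Return generation speed from an Ollama-style token count and a duration in nanoseconds."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    # Convert nanoseconds to seconds before dividing.
    return eval_count / (eval_duration_ns / 1e9)


# Hypothetical response fields for illustration only (not data from the benchmark).
sample = {"eval_count": 290, "eval_duration": 5_000_000_000}

speed = tokens_per_second(sample["eval_count"], sample["eval_duration"])
print(f"{speed:.2f} tokens/s")
```

Using only the generation counters keeps model load time and prompt processing out of the figure, which matters on CPU-only machines where loading a model can take far longer than generating a short answer.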

Series rule

Every benchmark page must show the setup, the model list, the measured speed, and the point at which models become too slow to use.

Future articles can cover 16GB RAM, 8GB VRAM, 16GB VRAM, MacBook unified memory, and multi-GPU workstations without crowding the model finder page. This gives search users a clearer path: use the finder to shortlist models, then read benchmark pages when they want real machine results.