📝 See https://ojitha.blogspot.com.au for my lengthy articles.

Google Gemma 4 MoE (26B) on AMD Ryzen AI

May 10, 2026
Overview:
This technical log documents the installation and optimisation of the Google Gemma 4 Mixture-of-Experts (MoE) model on the MINISFORUM AI X1 Pro, a mini PC featuring the AMD Ryzen AI 9 HX 470 processor. The report details the challenges of running a large 26-billion-parameter model on a consumer-grade Unified Memory Architecture, focusing on critical RAM allocation and BIOS UMA adjustments. It explains how to resolve memory-mapping failures and hardware-specific OOM errors by bypassing standard Linux kernel overcommit limits and fine-tuning the vLLM and ROCm software stack. Performance comparisons highlight that while Ollama offers higher speeds for individual users, the vLLM backend provides superior efficiency for multi-user API environments. Ultimately, the guide provides a comprehensive resolution matrix and a definitive Docker configuration to achieve stable inference on this specific RDNA 3.5 hardware.
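To see why RAM allocation and the BIOS UMA split matter so much for a 26B model, a back-of-the-envelope estimate helps. The figures below are illustrative arithmetic only, not the measured Gemma 4 MoE footprint: KV cache, activations, and quantisation overhead all come on top.

```python
# Illustrative arithmetic only: weight memory for a 26B-parameter model
# under common quantisation levels. Excludes KV cache and runtime overhead.

PARAMS = 26e9  # 26 billion parameters

BYTES_PER_PARAM = {
    "FP16": 2.0,
    "Q8_0": 1.0,
    "Q4":   0.5,  # ~4 bits per weight, ignoring quantisation metadata
}

def weights_gib(params: float, bytes_per_param: float) -> float:
    """Weight memory in GiB for a given per-parameter width."""
    return params * bytes_per_param / 2**30

for name, bpp in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weights_gib(PARAMS, bpp):.1f} GiB")
```

Even at 4-bit quantisation the weights alone land north of 12 GiB, which is why the default UMA carve-out on this class of hardware is too small and the BIOS adjustment becomes unavoidable.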
More…

Running Gemma 4 E4B on the AMD ROCm

May 5, 2026
Gemma 4 · vLLM · ROCm · Ryzen AI · NPU

This deep-dive shows how to run the Google DeepMind Gemma 4 E4B model — a 4.5B-effective dense network with Per-Layer Embeddings — on a Minisforum AI X1 Pro driven by the AMD Ryzen AI 9 HX 470, Radeon 890M iGPU and XDNA 2 NPU. It walks through the verified vLLM Docker recipe on ROCm 7.2, decomposes the hybrid sliding-window plus global attention that makes a 128K context fit on a 16GB-class memory budget, and shows where MIGraphX can offload an ONNX sidecar to the NPU. The result is a layered guide from architecture math through tuning, quantisation, and benchmarking.
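The effect of mixing sliding-window and global layers on KV-cache size can be sketched numerically. All hyperparameters below (layer split, window span, head counts) are placeholders for illustration, not Gemma's published configuration:

```python
# Illustrative KV-cache arithmetic: sliding-window layers only cache the
# last WINDOW tokens, so a long context costs far less than all-global
# attention. Hyperparameters are assumed placeholders, not Gemma's.

def kv_cache_gib(tokens: int, layers: int, kv_heads: int,
                 head_dim: int, bytes_per_elem: int = 2) -> float:
    """KV cache size in GiB: K and V tensors per layer per cached token."""
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_elem / 2**30

CONTEXT = 128 * 1024            # 128K-token context
WINDOW = 4096                   # sliding-window span (assumed)
SW_LAYERS, GLOBAL_LAYERS = 28, 4  # assumed layer split

full = kv_cache_gib(CONTEXT, SW_LAYERS + GLOBAL_LAYERS, kv_heads=8, head_dim=128)
hybrid = (kv_cache_gib(WINDOW, SW_LAYERS, 8, 128)
          + kv_cache_gib(CONTEXT, GLOBAL_LAYERS, 8, 128))
print(f"all-global: {full:.2f} GiB, hybrid: {hybrid:.2f} GiB")
```

With these assumed numbers, all-global attention would need a 16 GiB KV cache for 128K tokens, while the hybrid scheme needs under 2.5 GiB — the kind of reduction that makes long context viable on a 16GB-class budget.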

More…

LLM Wiki

April 8, 2026
The Andrej Karpathy Obsidian method refers to his publicly shared system for using LLMs to build and maintain personal knowledge bases as interlinked Markdown wikis, all viewed and navigated through Obsidian. He calls the resulting system the LLM Wiki, or LLM Knowledge Base.
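Obsidian wires notes together with `[[wikilink]]` syntax, which is what makes such a wiki navigable as a graph. A minimal sketch of extracting that link graph from a vault of Markdown files (the regex and function are my own illustration, not part of Karpathy's published setup):

```python
# Sketch: build a note-to-note link graph from an Obsidian-style vault.
# Handles [[Note]], [[Note|alias]], and [[Note#heading]] forms.
import re
from pathlib import Path

WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def link_graph(vault: Path) -> dict[str, set[str]]:
    """Map each note name (file stem) to the set of notes it links to."""
    graph = {}
    for md in vault.rglob("*.md"):
        text = md.read_text(encoding="utf-8")
        graph[md.stem] = {m.strip() for m in WIKILINK.findall(text)}
    return graph
```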
More…

Running AMD ROCm AI Workloads locally

March 7, 2026
This guide demonstrates how to run AMD ROCm AI workloads on the MINISFORUM AI X1 Pro-470 mini PC powered by the AMD Ryzen AI 9 HX 470 (12-core Zen5, up to 5.2GHz), featuring the integrated Radeon 890M (gfx1150) GPU and an 86 TOPS NPU. Running Ubuntu with the OEM kernel, the setup includes installing ROCm 7.2, verifying HSA agents with rocminfo, and deploying PyTorch in Docker containers. It also details configuring MIGraphX and ONNX Runtime with the MIGraphX Execution Provider via Docker Compose, enabling high-performance on-device ML inference — fully local, no discrete GPU required.
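For containerised ROCm work, the Compose service generally needs the GPU device nodes and group memberships passed through. A sketch along these lines is typical (the image tag is illustrative; pin one matching your ROCm release, and note the gfx-override value is hardware-specific):

```yaml
# Illustrative docker-compose.yml sketch for a ROCm PyTorch container.
services:
  pytorch-rocm:
    image: rocm/pytorch:latest   # pin a tag matching your ROCm release
    devices:
      - /dev/kfd                 # ROCm compute interface
      - /dev/dri                 # GPU render nodes
    group_add:
      - video
      - render
    ipc: host
    security_opt:
      - seccomp=unconfined
    # environment:
    #   HSA_OVERRIDE_GFX_VERSION: "11.5.0"  # sometimes needed for iGPUs;
    #                                       # value depends on your hardware
```

Inside the container, `rocminfo` listing the gfx1150 agent confirms the device passthrough worked before moving on to PyTorch or MIGraphX.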
More…

Kubernetes Introduction

January 3, 2026
This post provides a practical introduction to Kubernetes, focusing on essential networking concepts and hands-on debugging techniques within a Linux-based cluster. It guides readers through inspecting node roles, understanding Container Network Interfaces (CNIs) such as Flannel, and analysing overlay networks using tools such as `ip route` and `crictl`. The tutorial examines kube-proxy modes, service discovery via CoreDNS, and the deployment of multi-tier applications using Redis and Nginx. Furthermore, it demonstrates how to expose services using NodePort, scale deployments for high availability, and utilise `kubectl exec` and `kubectl cp` for effective pod interaction and troubleshooting.
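As a taste of the NodePort exposure step, a minimal Service manifest looks like this (names, labels, and port numbers are placeholders, not the tutorial's exact values):

```yaml
# Minimal NodePort Service sketch; names and ports are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: NodePort
  selector:
    app: nginx        # must match the Deployment's pod labels
  ports:
    - port: 80        # cluster-internal service port
      targetPort: 80  # container port in the pod
      nodePort: 30080 # reachable on every node's IP (30000-32767 range)
```

Once applied, the service answers on `<any-node-ip>:30080`, which is what makes NodePort handy for quick external access in a lab cluster.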
More…