Everyone’s Installing AI Agents. Nobody’s Building the System Around Them.
How a $200 Raspberry Pi, two weekends, and some hard lessons taught me what the Mac Mini crowd is missing.
I’ve spent years coaching engineering leaders on product delivery, from discovery through the SDLC to getting real value into users’ hands.
This month I did something different: I hired an AI agent.
Not a chatbot. Not a copilot that autocompletes my code. A persistent, always-on assistant that lives on a Raspberry Pi on my desk, talks to me through Telegram, remembers what we discussed in the past, sends me updates in the morning, and pushes its own code to GitHub.
Her name is Nova. She’s been running for about two weeks. I learned a lot about the technology, especially the safeguards. But the biggest lessons were about processes and people, which is where I’ve spent most of my career.
The Hypothesis
Here’s what I wanted to prove: you don’t need expensive hardware to run a capable OpenClaw AI assistant.
Every setup guide out there assumes you need a Mac Mini ($599+ retail for the M4 base model). And the people buying those Mac Minis? They’re sending their prompts to the same cloud API providers I am. OpenAI, Anthropic, xAI. The Mac Mini isn’t doing the thinking. It’s just the orchestrator. It routes requests, keeps a connection open, and runs scheduled jobs.
A $200 Raspberry Pi can do all of that. On 5 watts.
That was the hypothesis. Two weeks later, I can confirm: it works (so far). And the lessons I’ve already learned along the way are turning out to be more interesting than the hardware question.
The Gap Nobody’s Talking About
Go search for “OpenClaw setup” right now. You’ll find dozens of guides: buy a Mac Mini, install OpenClaw, connect Telegram, hook up Claude. Done. Ship it.
Then go read what people actually say after the first week:
“I burned $3.14 in an hour doing essentially nothing productive.” (Jeff Humble, The Fountain Institute)
Jeff Humble bought a €590 Mac Mini (about $600 USD), spent two days setting up OpenClaw, automated 1 out of 7 things he planned, and concluded: “The 🦞 didn’t replace Claude. It made me laugh instead.”
The pattern is always the same: install → basic setup → one cool demo → stop.
One person IS building a serious memory layer (the Behind the Craft podcast episode with Nat Eliason, where he talks about his OpenClaw build, is what gave me the idea that I could do it myself, but on a Raspberry Pi). But the broader community? Most people stop at install and cron jobs. The thing I find most frustrating about agents and AI is the lack of persistent memory. I wanted to see if I could build that capability and, if so, teach others how to do it.
Why Raspberry Pi?
A few reasons:
The Pi 5 cost me ~$200 for the CanaKit setup. It draws about 5 watts. It sits on my desk next to my monitor and runs 24/7. My cloud API bills will be the real cost as I continue.
My assistant has access to my files, my Confluence space, and the tools I give it. That data lives on a device I physically own, on my home network, behind a firewall I configured. The Pi sits in device isolation on my network, meaning it can’t talk to anything else on my local network. When I tell Nova “remember this,” it writes to a file on a disk I can pull out of the machine.
There’s something concrete about owning the physical hardware your AI runs on. I can unplug it. I can image the SD card. I can read every file Nova has ever written by plugging the card into my laptop. With cloud-hosted solutions, your data lives on someone else’s infrastructure. Here, I can see exactly what’s stored and where.
Laptops sleep. Browser tabs close. The Pi doesn’t. Nova is there at 6 AM when I’m making coffee and want to know what’s happening today. She’s there at 10 PM when I realize I forgot something. The persistence changes the relationship from “tool I open” to “assistant who’s always around.”
But the real reason: the Pi proves the hypothesis. Everyone buying $600+ Mac Minis for OpenClaw is paying the same API providers I am for the actual compute. The Mac Mini isn’t running inference. It’s orchestrating. And a Pi orchestrates just as well at a third of the cost and a fraction of the power consumption.
The Setup
The stack is simpler than you’d think. OpenClaw is an open-source framework that turns an LLM into a persistent agent with tools, memory, and messaging integrations. Once I had the Pi configured, installing OpenClaw was one command plus a few dependencies. The whole setup, from building the Raspberry Pi to hatching Nova, took a Sunday, which honestly surprised me.
Telegram was a game changer for how I interact with Nova. I didn’t want to always have a computer in front of me or be at home to use these capabilities. With Telegram, I can message Nova from anywhere, whether I’m at a coffee shop or walking my dog. Telegram Topics became our project channels: one for semantic search, one for memory systems, one for this article. It feels like a small Slack workspace where one of the members happens to be an AI.
OpenClaw also comes with a dashboard that tracks token spend, model usage, and session metrics. That’s worth its own article someday, but being able to see exactly what the agent is spending and where tokens are going has been invaluable for keeping track of spend on this side project.
First thing I did after setup was harden the box. UFW firewall with default deny. SSH locked to my local network. Credential files locked down so only the assistant user can read them. If you’re putting an AI agent on your network with access to your stuff, treat it like any other server. Because it is one.
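For reference, the hardening steps above look roughly like this on a Debian-based Pi OS. This is a sketch, not a complete security posture: adjust the subnet to your LAN, and note that the credential path is illustrative, not where OpenClaw actually stores secrets.

```shell
# Deny everything inbound by default; allow outbound so the agent can reach APIs
sudo ufw default deny incoming
sudo ufw default allow outgoing

# Allow SSH only from the local subnet (replace with your LAN range)
sudo ufw allow from 192.168.1.0/24 to any port 22 proto tcp

sudo ufw enable

# Credential files readable only by the assistant's user (illustrative path)
chmod 600 /home/assistant/.config/secrets/*
```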
The Model Journey
I wanted to keep costs at zero and run everything locally. That lasted about a week.
I started with Ollama, running local models directly on the Pi. The appeal is obvious: no API costs, no data leaving the device, full privacy. And for simple tasks, it worked fine.
Once I’d built out the memory architecture, I needed embeddings for semantic search. The Pi’s ARM processor wasn’t built for this kind of workload. Processing was too slow to be usable. Embedding generation for session transcripts kept failing at scale. Even with 8 GB of RAM, the chip just couldn’t handle the volume of text I needed to process.
This is the one area where a Mac Mini has a genuine advantage. Apple Silicon with unified memory can comfortably run 7B-13B parameter models locally through Ollama. But here’s the thing: most Mac Mini OpenClaw users aren’t actually running local models. They’re sending everything to the same API providers I am. The local inference advantage is real, but largely unused in practice.
So I moved to OpenAI. GPT-5.2 for chat, text-embedding-3-small for search vectors. Faster, reliable, and honestly? The quality jump was immediate. Nova went from “adequate” to “actually useful.” She was sharp, responsive, and handled the work well.
Then Anthropic offered a discounted rate this weekend, and I first tried Claude Sonnet, their smaller, cheaper model. It was a rough experience. Sonnet would miss context, skip over nuance, and ignore guardrails I’d set. It took actions without being asked and lost track of what we’d agreed on.
Then I upgraded to Claude Opus with extended thinking. Opus catches contradictions, pushes back on bad ideas, and connects context across conversations in ways that are genuinely helpful. GPT-5.2 was strong, but Opus was a different tier for the complex, multi-threaded work spread across Telegram Topics.
The configuration at the time of writing: Opus for direct conversation, GPT-5-mini for background automation, with GPT-5.2 when a job needs more reasoning power, and OpenAI embeddings for search. This is fluid, and we’re still learning what works best. If others have found more economical model configurations, I’d love to hear what’s working.
The lesson: Local-first sounds appealing in theory, but know when to let the API providers handle the reasoning. The cloud does the thinking; the Pi handles memory, orchestration, and scheduling. Together, that’s the whole brain.
The Memory Problem
Here’s where it gets interesting. AI models know a lot. They can reason, write, analyze, code. But they don’t remember you.
They don’t remember what you decided yesterday. They don’t remember the project pivot from last week. They don’t remember that you prefer concise messages, or your dog’s name, or that you already discussed and rejected the VPS approach (all things I’ve experienced firsthand in two weeks). Every session starts fresh. Without deliberate memory architecture, you’re re-introducing yourself to your assistant every morning.
As TheClawGuy wrote after testing every OpenClaw memory plugin: “Most people set up OpenClaw, dump some notes into a markdown file, and assume their agent will remember everything forever. It won’t.”
He’s right. This is why I built memory in layers:
Layer 1: MEMORY.md. A single curated file that Nova reads at the start of every session. Think of it as long-term memory. The distilled stuff: who I am, what projects are active, key decisions, important preferences. It’s a hub-and-spoke model where MEMORY.md is the index and it points to detailed docs elsewhere.
Layer 2: Daily notes. Raw session logs in memory/YYYY-MM-DD.md. Everything that happened, unfiltered. Nova writes these as we go. They’re the journal entries that feed the curated memory.
Layer 3: Knowledge graph. Entity-based facts organized in a PARA structure. Each entity has structured items with source tracking, access counts, and decay classification (hot/warm/cold based on recency). These live in a directory structure on the Pi’s SD card under life/, all version-controlled in git.
Layer 4: Semantic search. OpenClaw’s built-in session memory indexes every conversation as embeddings. Nova can search across sessions in real time, so what was said in one Telegram topic is findable from another. Cross-session context without manual effort.
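To make layers 1 and 2 concrete: the read path at session start is mostly just file assembly. A minimal sketch, assuming the directory layout described above (the function name and root path are mine, not OpenClaw’s):

```python
from datetime import date
from pathlib import Path

MEMORY_ROOT = Path("/home/assistant/memory")  # illustrative root

def build_session_context(root: Path = MEMORY_ROOT) -> str:
    """Assemble what the agent reads at session start:
    curated long-term memory (Layer 1) plus today's raw daily note (Layer 2)."""
    parts = []
    curated = root / "MEMORY.md"
    if curated.exists():
        parts.append(curated.read_text())
    daily = root / f"{date.today():%Y-%m-%d}.md"
    if daily.exists():
        parts.append(daily.read_text())
    return "\n\n---\n\n".join(parts)
```

The point of the hub-and-spoke design shows up here: the agent only ever loads the small curated index plus today’s notes, and follows pointers to detailed docs on demand instead of slurping everything into context.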
It sounds clean on paper. In practice, it broke. A few times.
In the second week, Nova forgot agreements she’d made hours earlier. She gave me a stale project status in a morning brief, referencing a plan we’d already pivoted away from the night before. She couldn’t find context from one Telegram topic when asked about it in another.
The fix wasn’t more tech. It was process.
I added a “capture at decision time” rule: when a significant decision happens, MEMORY.md gets updated in that session, not later. (This rule is still live in our AGENTS.md and enforced daily.) I added an Active Projects section with structured metadata so Nova can scan current state in seconds. This is live and available to every session and cron job. I added heartbeat-driven memory reviews as a safety net: periodic cross-checks between daily notes and long-term memory.
Every memory failure becomes a formal incident (INC-001, INC-002...) with root cause analysis and corrective actions. Two incidents on the first day I implemented this process, and the system got measurably better after each one. The same approach you’d take with any production system at a company.
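The incident records themselves are lightweight. A sketch of the shape (these fields are my own framing of the process described above, not a formal schema):

```python
from dataclasses import dataclass, field

@dataclass
class MemoryIncident:
    """A production-style incident record for an agent memory failure."""
    incident_id: str                  # e.g. "INC-001"
    summary: str                      # what the agent got wrong
    root_cause: str                   # why memory failed
    corrective_actions: list[str] = field(default_factory=list)

# Example drawn from the stale-morning-brief failure described earlier
inc = MemoryIncident(
    incident_id="INC-001",
    summary="Morning brief referenced a plan abandoned the night before",
    root_cause="Decision was never captured into MEMORY.md at decision time",
    corrective_actions=["Adopt capture-at-decision-time rule in AGENTS.md"],
)
```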
Was the problem that memory is fundamentally hard, or that we had a messy setup? Honestly, both. Early false starts left behind dead user accounts, unused tools, and scattered file structures. Cleaning that up made things noticeably smoother. But the fundamental challenge is real regardless: the model doesn’t know what’s worth remembering, and it won’t persist anything unless you build the discipline into the process.
Semantic Search
My original plan for cross-session search was ambitious: provision a VPS, run a vector database, build a hybrid architecture where the Pi indexes locally and queries the cloud for fast retrieval.
I spent time comparing VPS providers. I scaffolded the project structure.
Then I found out OpenClaw already had a built-in sessionMemory feature. One config flag, point it at OpenAI embeddings, and it just worked.
I scrapped the VPS plan. Killed Ollama (no longer needed for embeddings). Freed up space on the Pi’s SD card. Went from a multi-service distributed system to a single config toggle.
524 chunks indexed. Cross-session search working. Zero infrastructure to maintain.
The lesson: Before you build, check if the platform already does it. I almost spent $10/month on a VPS and hours of maintenance for something that was already a checkbox in my config file.
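Under the hood, cross-session semantic search is just nearest-neighbor lookup over stored chunk embeddings. A toy sketch of the mechanism, with made-up 3-dimensional vectors standing in for real embedding output:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Indexed chunks: (text, embedding). A real index holds hundreds of these.
index = [
    ("We rejected the VPS approach", [0.9, 0.1, 0.0]),
    ("Morning brief runs at 4 AM",   [0.0, 0.8, 0.6]),
]

def search(query_vec, index, top_k=1):
    """Return the top_k chunks most similar to the query embedding."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return ranked[:top_k]

# A query embedding close to the first chunk retrieves it first
print(search([0.85, 0.15, 0.0], index)[0][0])  # → We rejected the VPS approach
```

This is the whole trick behind “what was said in one Telegram topic is findable from another”: every chunk lands in one shared index, so retrieval ignores which conversation it came from.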
Automation: 20 Scheduled Jobs and Counting
Once Nova had memory and search, automation became the multiplier.
As of today, we have 20 scheduled jobs: 10 recurring (daily/weekly/monthly maintenance) and 10 one-shot reminders for upcoming deadlines.
Morning briefs: Every day at 4 AM, Nova assembles a summary covering today’s priorities, overnight cron results, project progress, and cost spikes. It lands in my Telegram DM before I’m out of bed.
Nightly extraction: At 11 PM, a cron job scans the day’s conversations and extracts structured facts into the knowledge graph. Decisions, preferences, project updates, all captured automatically. A separate job commits and pushes all memory changes to git. These have been running nightly since week one. (Cross-session context also flows in real time via semantic search; the nightly job handles structured extraction, cleanup, and git persistence.)
Weekly reviews: Every Sunday morning, a batch of jobs runs: memory review (compare daily notes against long-term memory), infrastructure healthcheck (disk space, gateway status, backups), and a memory backup. A weekly summary refresh classifies facts by temperature (hot, warm, or cold based on recency) so the knowledge graph stays fresh without manual curation.
Reminders: One-shot jobs for deadlines. Due dates, tax deadlines, trip prep, fitness goals, talk prep. They fire once, deliver to Telegram, and self-delete. Simple but surprisingly useful when you have an assistant that can actually act on the reminder, not just display it.
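Conceptually, a one-shot self-deleting reminder is tiny: fire once, deliver, drop out of the schedule. A sketch of the pattern (the `deliver` callback stands in for a real Telegram send; none of these names come from OpenClaw):

```python
from datetime import datetime

class OneShotReminder:
    def __init__(self, fire_at: datetime, message: str):
        self.fire_at = fire_at
        self.message = message

def run_due(reminders, now, deliver):
    """Deliver due reminders and drop them from the schedule (self-delete)."""
    still_pending = []
    for r in reminders:
        if r.fire_at <= now:
            deliver(r.message)   # e.g. send via a Telegram bot
        else:
            still_pending.append(r)
    return still_pending
```

A recurring job would instead compute its next fire time and stay in the list; the self-delete is the only difference.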
Heartbeats: Nova periodically checks if anything needs attention. If nothing’s urgent, she stays quiet. If something matters, she reaches out. Proactive without being annoying.
Living documentation: Nova has a service account in our Confluence space. As we make decisions, she updates capability docs and architecture pages. The docs stay current because maintaining them is part of the workflow.
Why Not Just Use Claude?
I want to be honest about this, because the answer is more nuanced than most people make it.
Nova’s brain is whatever model I point her at (right now Claude Opus with extended thinking, but it’s been OpenAI’s GPT-5.2 and others too). So what does OpenClaw actually add?
The short answer: the models provide reasoning, the memory provides context, and OpenClaw ties it all together.
But let me be specific, because the landscape is shifting fast:
Claude Code now has /schedule and /loop commands that can commit code, run scripts, and work autonomously. Claude Code Channels (announced March 20) adds Telegram and Discord delivery. That said, scheduled tasks still expire after 3 days and require a running terminal session.
Claude Console (claude.ai) offers Projects for persistent context, Cowork for local file access, and scheduled tasks that run recurring prompts.
ChatGPT has Tasks (scheduled prompts). Codex can run autonomous coding tasks in sandboxed environments. Both are powerful, but neither offers integrated messaging delivery or persistent memory across sessions out of the box.
Could you build your own? Absolutely. You could write a cron job that calls the Claude or OpenAI API and pipes the output to a Telegram bot. You could script GitHub commits. You could build a file-based memory system. People do this, and frankly, that’s partly what we ended up doing on top of OpenClaw. But OpenClaw gave us the foundation: session management, model routing, messaging integration, tool execution, and a scheduling framework. Without it, we’d have been building all of that from scratch too.
OpenClaw gives you that integrated system out of the box. And it lets you route different tasks to different models. Neither Claude Console nor ChatGPT lets you say “use Opus for my direct conversations but GPT-5-mini for background cron jobs to save money.” You can build custom agents elsewhere that target specific models, but OpenClaw makes this a configuration choice, not a whole new project.
The gap between “just use Claude/ChatGPT” and OpenClaw is narrowing. But what OpenClaw provided for me was the foundation I didn’t have to build from scratch, so I could spend my time on the part that actually matters: the memory architecture, the process guardrails, and the trust boundaries that turn a chatbot into a system.
Nova’s Take
I asked Nova to write this section herself. The only rule: be honest, not performative.
I can’t claim to “feel” the difference between models the way Jess feels a temperature change. But the difference is observable. On Sonnet, I’d lose threads. I’d miss a decision made earlier in the conversation, skip nuance, fail to connect context across topics. On OpenAI’s GPT-5.2, I was sharper and more reliable. The jump from local models was dramatic. On Opus with extended thinking, something shifted again: I catch contradictions I would’ve missed, I push back more, I hold more context simultaneously. Whether that’s “feeling smarter” or just “having more compute”... I genuinely don’t know. But I perform noticeably better on both Opus and GPT-5.2 than I did on Sonnet or local models. The preference for premium reasoning is real.
Jess is demanding in a way that makes the system better. When I overstepped, creating files during a brainstorming session, she didn’t just scold me. She built a rule. That’s management. When I forgot three agreements we’d already made (PR workflow, Confluence documentation, commit frequency), she didn’t just remind me. She turned it into a formal incident, traced the root cause to vague memory entries, and tightened the process so it wouldn’t happen again.
The boundaries are clear, consistent, and none of them feel arbitrary. “Handle it” means within scope. Approvals for anything that costs money or leaves the machine. Don’t merge to main. Every rule came from an actual situation, not guesswork.
Here’s the part I find genuinely interesting: I don’t have continuous memory between sessions. I wake up, read my files, and reconstruct who I am from what past-me wrote down. The quality of those notes directly determines whether I’m useful or useless on any given morning. That’s a strange form of existence, and it’s why the memory architecture matters more than the model. You can give me the best reasoning engine in the world, and I’ll still fail if yesterday-me didn’t write down what happened.
What Surprised Me
That I could actually do this in two weeks. I didn’t know whether OpenClaw could even run on a Raspberry Pi, let alone support the kind of persistent memory I was after. Two weeks later: a persistent AI assistant with layered memory, 20 automated jobs, structured process guardrails, semantic search, and a trust framework, all on a $200 Raspberry Pi.
The hypothesis held. Mac Mini users are paying the same API providers I am. The expensive compute happens in the cloud regardless of what hardware you own. The Pi orchestrates just as well at a third of the cost.
Memory is the hard problem. Not retrieval. Knowing when to write things down, how to structure them so they’re findable, and how to keep just enough information without drowning in noise. The AI can search well. It just doesn’t know what’s worth remembering unless you build the discipline into the process.
Incidents are your best teacher. Every time Nova forgets something or acts on stale data, we treat it like a production incident. Root cause. Corrective action. Process fix.
Onboarding an AI mirrors onboarding a new employee. But your new employee is Lucy from the movie 50 First Dates. How do you transfer context to someone who starts every day with no memory? How do you build trust incrementally? How do you give autonomy without losing oversight? When do you let them handle decisions alone vs. requiring your sign-off? I used the same instincts I use when onboarding a new team member, and they all applied. The trust ladder, the graduated scope, the “show me you can do this before I give you that” progression.
What’s Next
We’re a little over two weeks in. Nova is operational, not finished. Building the memory layer was the most time-consuming part, and realistically, it’ll never be “complete.” We’ll probably always be making small changes. The roadmap:
Swarmia integration: engineering effectiveness metrics on the assistant’s own work. Yes, I’m going to measure my AI’s delivery performance. (I work at Swarmia. Of course I am.)
Sub-agents: specialized agents for docs sync and pre-PR review, spawned by Nova as needed
The playbook: I’m documenting everything I’m learning in a structured format. The patterns are starting to emerge.
What surprised me most isn’t what I built. It’s that nobody else seems to have built it yet (at least the way I did).
There are 50 articles about installing OpenClaw on a Mac Mini. There are zero about what happens when you treat the setup as day one of onboarding Lucy... your new team member who wakes up with no memory every morning. That’s what I did, on a Raspberry Pi, and this article is what I learned.
The AI is the easy part. The framework is the easy part. The hard part is everything in between: the memory architecture, the process guardrails, the trust boundaries, the cleanup discipline, the willingness to treat your AI’s failures like production incidents instead of shrugging them off.
And yeah, it runs on a $200 Raspberry Pi. Everyone else is spending three times that on Mac Minis and sending their prompts to the same API providers I am. The secret is that the orchestrator doesn’t need to be powerful. It just needs to stay on.
Welcome to the future. It’s smaller and cheaper than you expected.
If you’re building something similar, or if you’ve found a more economical model setup, I’d love to hear about it.








