<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://widami.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://widami.github.io/" rel="alternate" type="text/html" /><updated>2026-04-20T02:23:54+00:00</updated><id>https://widami.github.io/feed.xml</id><title type="html">David Mitchell’s Blog</title><subtitle>My thoughts, ideas, and notes. Mostly about AI experiments and projects</subtitle><entry><title type="html">Claude Plays Factorio: Update on FactoMCP and Claude Code</title><link href="https://widami.github.io/2026/04/19/claude-plays-factorio-3/" rel="alternate" type="text/html" title="Claude Plays Factorio: Update on FactoMCP and Claude Code" /><published>2026-04-19T00:00:00+00:00</published><updated>2026-04-19T00:00:00+00:00</updated><id>https://widami.github.io/2026/04/19/claude-plays-factorio-3</id><content type="html" xml:base="https://widami.github.io/2026/04/19/claude-plays-factorio-3/"><![CDATA[<p>I’ve been experimenting with having Claude Code spin up subagents using Haiku to take care of things like collecting resources, fueling things, and construction. This seems to be working OK. I don’t have any hard measurements, but it does seem like this gets more done before exhausting the 5-hour usage limit I run into. The weekly limit is a somewhat bigger deal. If I use up a five-hour limit I can wait, but using up the weekly limit means I can’t get other things done with Claude for days at a time.</p>

<p>Placement and geometry are still an issue. I spent about an hour fighting to get Claude to put down a standard two-drill setup on the coal patch. This is a common early-game design that keeps coal drills running longer by having them feed into each other. For some reason it was convinced there needed to be a two-tile gap between the drills, and I got pretty frustrated trying to get it to perform what is really a simple action. I’m pretty sure the root cause was placing one drill and fueling it before putting the other drill into place. If you do this, the first drill drops a piece of coal on the ground, which blocks placement of the second drill. It’s the kind of thing humans quickly learn to avoid, but I’ve had to explain it to Claude multiple times.</p>

<p>Which brings me to one observation about its behavior while playing: it often jumps to conclusions that are wrong. I wonder if this is because the game state progresses but the model can’t tell unless it checks again. It can look at an area, see that it’s empty, and start trying to place the drills — but when coal drops on the ground, there’s no update forced back to the model. It won’t see that unless it looks, and it usually doesn’t look; it just jumps to erroneous conclusions.</p>
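
<p>One way to picture a fix is a placement helper that always re-reads the target tiles immediately before acting, instead of trusting an earlier look. This is only a minimal sketch in Python: <code>scan</code> and <code>place</code> are hypothetical stand-ins for whatever MCP tools report and place entities, and none of these names come from FactoMCP.</p>

```python
from typing import Callable, List, Tuple

Tile = Tuple[int, int]


def place_if_clear(
    tiles: List[Tile],
    scan: Callable[[Tile], List[str]],
    place: Callable[[], None],
) -> List[str]:
    """Re-scan every target tile right before placing.

    Returns the blockers that were found; places only when there are none.
    `scan` and `place` are hypothetical stand-ins for MCP tool calls.
    """
    blockers = [name for tile in tiles for name in scan(tile)]
    if not blockers:
        place()
    return blockers


# Toy world: a piece of coal landed on one tile after the first look.
world = {(3, 4): ["item-on-ground:coal"]}
placed = []
result = place_if_clear(
    tiles=[(3, 4), (4, 4)],
    scan=lambda tile: world.get(tile, []),
    place=lambda: placed.append("burner-mining-drill"),
)
print(result)
print(placed)
```

<p>The point of returning the blockers rather than a plain failure is that the model gets told <em>why</em> the placement was skipped, which is the information it was missing here.</p>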

<p>Claude also has a tendency to do manual workarounds instead of fixing problems — hand-feeding coal into furnaces instead of figuring out why they’re not getting coal, or hand-crafting items instead of fixing the assemblers that are supposed to make them.</p>

<h2 id="some-progress">Some Progress</h2>

<p>All of that said, I was able to make slightly more progress. It managed to build a simple power plant without manual intervention, including placing the water pump. It even managed to run power poles through a forest, which is the kind of thing it often gets stuck on.</p>

<p><img src="/assets/images/power_plant.png" alt="Power plant" /></p>

<p>I told it that it was going to need a lot of belts and should automate that using crates for input, and it managed to get that built OK.</p>

<p><img src="/assets/images/belt_automation.png" alt="Belt automation" /></p>

<p>It even almost came up with a working iron plate setup. Almost. I’m not sure what the extra belt is supposed to accomplish here, and it didn’t quite manage to get coal into the furnaces — but it was close. There’s just one belt needed on the west side to get that working. It also used a burner drill to get the coal instead of an electric one, which means it would still need to fuel that drill manually. But it’s closer!</p>

<p><img src="/assets/images/furnace_setup.png" alt="Furnace setup" /></p>

<h2 id="mcp-updates">MCP Updates</h2>

<p>There are also some updates to the MCP code worth noting. I now include information about the pickup and drop target of entities when appropriate. This makes it easier for the model to evaluate whether things are lined up correctly.</p>
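
<p>To give a rough idea of why this helps, here is a sketch of the kind of check that becomes trivial once each inserter reports its two targets. The record shape and field names are my own illustration, not the actual FactoMCP output.</p>

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InserterInfo:
    """One inserter as a tool might report it; all field names are hypothetical."""
    unit_number: int
    pickup_target: Optional[str]  # e.g. "transport-belt #129", or None if nothing is there
    drop_target: Optional[str]


def alignment_problems(ins: InserterInfo) -> List[str]:
    """List what is missing on either end of the inserter."""
    problems = []
    if ins.pickup_target is None:
        problems.append(f"inserter #{ins.unit_number} has nothing to pick up from")
    if ins.drop_target is None:
        problems.append(f"inserter #{ins.unit_number} has nothing to drop into")
    return problems


good = InserterInfo(206, "transport-belt #129", "assembling-machine #136")
bad = InserterInfo(297, "assembling-machine #136", None)
print(alignment_problems(good))
print(alignment_problems(bad))
```

<p>An empty drop target is not always a mistake (dropping on the ground can be intentional), so a real tool would probably report it as a warning rather than an error.</p>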

<h2 id="next-steps">Next Steps</h2>

<p>I’m thinking about trying a different harness rather than Claude Code. It’s so hard to make progress on the Claude Pro plan without hitting limits. The <a href="https://hermes-agent.nousresearch.com">Hermes</a> agent looks promising, and OpenRouter has some free tiers. I’ll hit limits there too, but I’m not sure how fast. Having better tracking of actual usage will give me a clearer picture of what I really need — the Claude Pro plan is pretty opaque about actual usage and what that would translate to in API costs. So that will be the next step. Stay tuned!</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I’ve been experimenting with having Claude Code spin up subagents using Haiku to take care of things like collecting resources, fueling things, and construction. This seems to be working OK. I don’t have any hard measurements, but it does seem like this gets more done before exhausting the 5-hour usage limit I run into. The weekly limit is a somewhat bigger deal. If I use up a five-hour limit I can wait, but using up the weekly limit means I can’t get other things done with Claude for days at a time.]]></summary></entry><entry><title type="html">Claude Plays Factorio: The Reasoning Problem</title><link href="https://widami.github.io/2026/04/14/claude-plays-factorio-2/" rel="alternate" type="text/html" title="Claude Plays Factorio: The Reasoning Problem" /><published>2026-04-14T00:00:00+00:00</published><updated>2026-04-14T00:00:00+00:00</updated><id>https://widami.github.io/2026/04/14/claude-plays-factorio-2</id><content type="html" xml:base="https://widami.github.io/2026/04/14/claude-plays-factorio-2/"><![CDATA[<p>So this is an example of the reasoning output. It’s pretty clear that it’s struggling with basic game mechanics and is often running around in circles. Part of this is probably due to some issues in the MCP code. 
I tried having it write a tool to output a text-based “map” of an area, but that tool may be misrepresenting the direction of inserters. More generally, explaining the game’s geometry and changing the MCP tools to give it a clearer view of the game state might help.</p>

<blockquote>
  <p>For the output inserter at y=-42.5 facing north, its pickup would be around y=-43.3 based on that arm offset pattern, which puts it right at the edge of AM1’s south boundary. This could be a problem — if the game treats that position as outside the AM or in a dead zone, the inserter won’t be able to grab items from it. I might need to reposition the output inserter further away from the AM boundary to ensure it can actually access the contents inside.</p>
</blockquote>

<p>So here it thinks it may need to move the inserter further away from AM1 (Assembling Machine 1). The actual problem is that the belt it’s trying to put the copper cable onto is full.</p>

<blockquote>
  <p>Let me consider using the east face of AM1 instead. I could place an inserter at (-58.5,-44.5) facing west, which would pick from inside AM1 at x=-59.5 and drop the copper cables at x=-57.5 in the gap between AM1 and AM2. The problem is that puts the output in the wrong location — I need those cables on the y=-41.5 belt, not in the gap between the two AMs.</p>

  <p>Actually, I’m realizing the coordinates don’t work out. AM1’s east edge is at x=-59, so an inserter at x=-58.5 would be positioned one tile east of that edge, outside the machine entirely. That means it can’t actually pick from inside AM1 at all.</p>
</blockquote>

<p>But now it seems to think the inserter should somehow be inside AM1.</p>

<blockquote>
  <p>Looking back at the rotation data, when the inserter faces north from (-60.5,-46.5), it picks from (-60.5,-47.5) — exactly one tile in the direction it’s facing. So the pickup location is always offset by one tile from the inserter’s position in its facing direction. For my south-side inserters at x=-61.5, they’re picking from inside AM1’s range, which is why they work. The key constraint is that the pickup position itself needs to fall within the machine’s coverage area. The pickup location at x=-61.5, y=-43.5 is indeed inside AM1’s bounds, so that’s not the issue. AM1 shows it’s working with a full output buffer containing copper cable, yet the inserter is stuck waiting for source items — which suggests AM1 might be paused or idle despite the “working” status, possibly because its input is empty and it’s just finishing its last queued operation.</p>
</blockquote>

<p>And now it’s discussing a different inserter entirely. The one at (-60.5,-46.5) is the inserter on the north side of AM1, which feeds copper plates from the belt into the machine.</p>

<blockquote>
  <p>I’m thinking the real problem is that the inserter can’t actually reach AM1’s output slot from its current position, even though the coordinates seem right. Let me try repositioning the inserter to the gap between AM1 and AM2 on the south side instead, placing it at AM1’s east edge to see if that gives it proper access to the output. Mine the useless input inserter at (-59.5,-42.5) and replace it at (-59.5,-42.5) direction="north" as an additional output inserter. Maybe this x position works better for accessing AM1’s output.</p>

  <p>Actually wait — x=-59.5 is IN AM1’s x range (-62 to -59)? Is -59.5 &lt; -59? YES, -59.5 &lt; -59 (more negative). So x=-59.5 IS within AM1 x range of -62 to -59 (since -62 &lt; -59.5 &lt; -59)? Wait: -62 &lt; -59.5 is TRUE. And -59.5 &lt; -59 is TRUE (-59.5 is more negative = smaller). So yes, x=-59.5 IS inside AM1 x range. ✓</p>

  <p>So an inserter at (-59.5,-42.5) direction="north" would pick from (-59.5,-43.5) which IS inside AM1. This might work! Let me try replacing the inserter there.</p>

  <p>Actually, I’m realizing the real issue might be simpler: that south-facing inserter at (-59.5,-42.5) could be blocking things if it’s picking copper cables off the belt below and feeding them back into AM1, creating a loop even when the belt is supposedly empty.</p>

  <p>AM1’s recipe needs copper plates, not copper cables. So AM1 would reject the copper cables. The inserter might be stuck trying to put cables back in.</p>

  <p>Oh! This might be the issue. If the inserter at (-59.5,-42.5) is picking copper cables from the belt and trying to stuff them into AM1 (which doesn’t accept them since the recipe is copper-cable which needs copper PLATES)… the inserter would just fail to insert and keep trying. Meanwhile, copper cables on the belt near x=-59.5 would be stuck being grabbed and rejected.</p>
</blockquote>

<p>I think you get the point. It’s pretty confused about where the inserters need to be relative to the assembler. Honestly, I’m not even sure what problem it’s trying to think through. Now that I’ve looked at the chain of thought, it’s not surprising that it burns through tokens like crazy without getting much done.</p>

<p>On the other hand, it did manage to get the factory it was working on far enough along that it made something. There are in fact several red potions (automation science packs) on the belt! Clearly it has no idea how to actually use belts in Factorio, but that’s probably something that can be explained to it. Or possibly I can have it take some screenshots and see if it can figure out the problem and what to do about it. For context, Claude did drop both iron and copper plates into the chest, but the iron ones got pulled out first and ended up as gears.</p>

<p><img src="/assets/images/broken_factory.png" alt="Claude's broken factory" /></p>

<p>I think I had drastically overestimated Claude’s understanding of how Factorio actually works. It seems to understand the basics, and it seems to be able to code to the Lua API pretty well. But once you get into dealing with all the coordinates and such it just seems confused. In fairness, I’m finding the coordinate system somewhat confusing myself — it’s certainly harder to reason about than just looking at the visual layout.</p>

<h2 id="what-im-going-to-fix">What I’m Going to Fix</h2>

<p><strong>Tighter entity lookups.</strong> I’ve realized it’s using <code class="language-plaintext highlighter-rouge">find_entities_filtered()</code> with a radius of 1, which has led to a lot of errors where it finds the wrong entity. This function should usually be called with just an exact location and no search radius — that way it returns exactly one entity (or two if there’s an ore resource at the same spot). That should cut down on a lot of confusion.</p>
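
<p>As a sketch of what the tighter lookup could look like on the MCP side, here is a small Python helper that builds the Lua snippet to send to the game. The helper name and output format are made up for illustration. The Lua side uses <code>find_entities_filtered</code> with only a <code>position</code> and prints results with <code>rcon.print</code>, and I’m assuming the first surface is the one being played on.</p>

```python
def find_at_lua(x: float, y: float, surface: int = 1) -> str:
    """Build a Lua snippet that looks up entities at one exact position.

    Leaving out `radius` is the whole point: only entities overlapping
    (x, y) should come back. The helper name is hypothetical.
    """
    position = "{" + f"{x}, {y}" + "}"
    return (
        f"local found = game.surfaces[{surface}].find_entities_filtered"
        "{position = " + position + "} "
        "for _, e in pairs(found) do "
        "rcon.print(e.name .. ' #' .. tostring(e.unit_number)) end"
    )


print(find_at_lua(-60.5, -44.5))
```

<p>Printing the unit number next to the name means the result can be matched against other tool output without going back to coordinates.</p>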

<p><strong>Fix the <code class="language-plaintext highlighter-rouge">detail_area</code> tool.</strong> I have a tool that may be getting the direction of inserters wrong. Fixing that up should help it understand the layout.</p>

<p><strong>Use <code class="language-plaintext highlighter-rouge">drop_target</code> and <code class="language-plaintext highlighter-rouge">pickup_target</code>.</strong> Some entities, such as inserters, expose these properties, which tell you specifically what they’re feeding into and pulling from. Including that in the tool output will almost certainly help a lot. Here’s a concrete example of the difference:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>❯ For the copper wire assembler AM1, describe the relationship between the inserter
  north of it, the AM, and the inserter south of it using coordinates.

⏺ - Inserter at (-60.5,-46.5): picks up from (-60.5,-47.5), drops to (-60.5,-45.3) — feeds the north belt into AM1
  - AM1 at (-60.5,-44.5): occupies (-62,-46) to (-59,-43)
  - Inserter at (-59.5,-42.5): picks up from (-59.5,-43.5), drops to (-59.5,-41.3) — extracts from AM1 onto the south belt

❯ Now describe it using unit numbers.

⏺ - Inserter #206: picks up from belt #129, drops to assembling-machine #136
  - Assembling-machine #136
  - Inserter #297: picks up from assembling-machine #136, drops to belt #211
</code></pre></div></div>

<p>You can see in the first output that none of the coordinates match directly. The top inserter’s drop point isn’t the AM’s listed position, and neither is the bottom inserter’s pickup point; you have to work out that both fall inside the AM’s footprint. Reading <code class="language-plaintext highlighter-rouge">drop_target</code>/<code class="language-plaintext highlighter-rouge">pickup_target</code> to get the entities and describing them by unit number makes the relationships much clearer.</p>

<p><strong>Use <code class="language-plaintext highlighter-rouge">unit_number</code> for identity.</strong> Most entities have a <code class="language-plaintext highlighter-rouge">unit_number</code>, and referencing entities by that instead of by coordinates should reduce a lot of the confusion about which entity we’re talking about.</p>
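
<p>Once everything is keyed by unit number, following items through the factory becomes a simple graph walk. Here is a minimal sketch using the unit numbers from the transcript above. The tuple layout and function name are my own, and it assumes at most one inserter pulls from any given entity, which a real factory won’t guarantee.</p>

```python
from typing import Dict, List, Tuple

# (inserter unit_number, pickup unit_number, drop unit_number),
# taken from the unit-number transcript above.
inserters: List[Tuple[int, int, int]] = [
    (206, 129, 136),
    (297, 136, 211),
]


def flow_from(start: int, links: List[Tuple[int, int, int]]) -> List[int]:
    """Follow items from one entity through inserters to where they end up."""
    by_source: Dict[int, Tuple[int, int]] = {src: (ins, dst) for ins, src, dst in links}
    path = [start]
    seen = {start}
    while path[-1] in by_source:
        ins, dst = by_source[path[-1]]
        if dst in seen:  # stop instead of looping forever
            break
        path += [ins, dst]
        seen.add(dst)
    return path


print(flow_from(129, inserters))  # [129, 206, 136, 297, 211]
```

<p>That reads as belt #129, inserter #206, assembling-machine #136, inserter #297, belt #211, with no coordinate arithmetic needed.</p>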

<p>My next step will be to make those changes to the MCP tools and have it start a new factory. I’ll also explain a bit more about how belts work so it stops making sushi belts and see if that works out better. Stay tuned!</p>]]></content><author><name></name></author><summary type="html"><![CDATA[So this is an example of the reasoning output. It’s pretty clear that it’s struggling with basic game mechanics and is often running around in circles. Part of this is probably due to some issues in the MCP code. I tried having it write a tool to output a text-based “map” of an area but it seems like it might be misrepresenting the direction of inserters. Generally though, it seems like some explaining of game geometry and changing the MCP tools to give it a better understanding of the game state might help.]]></summary></entry><entry><title type="html">Claude Plays Factorio (Sort Of)</title><link href="https://widami.github.io/2026/04/12/claude-plays-factorio/" rel="alternate" type="text/html" title="Claude Plays Factorio (Sort Of)" /><published>2026-04-12T00:00:00+00:00</published><updated>2026-04-12T00:00:00+00:00</updated><id>https://widami.github.io/2026/04/12/claude-plays-factorio</id><content type="html" xml:base="https://widami.github.io/2026/04/12/claude-plays-factorio/"><![CDATA[<p>This is my attempt to have Claude Code play Factorio via an MCP server. The short version: it works, kind of, and it cheats a lot. The MCP server code is at <a href="https://github.com/WidAmi/FactoMCP">FactoMCP</a> on GitHub.</p>

<h2 id="the-setup">The Setup</h2>

<p>The Factorio Lua API is available via RCON, which is how my MCP server talks to the game. The catch is that the API isn’t really meant for this use case — it’s designed for writing game extensions, not for controlling a player entity. There are functions to have a player walk or mine at a certain spot, but it’s very easy to write MCP tools that just teleport, or magically place and remove items with no regard for normal game rules.</p>

<h2 id="first-contact">First Contact</h2>

<p>The first tool I had Claude write was <code class="language-plaintext highlighter-rouge">run_lua()</code> — a general-purpose escape hatch to execute arbitrary Lua in the game. My expectation was that it would issue pretty simple commands, like single function calls to get state or perform actions. Surprisingly, it would write fairly involved Lua scripts right out of the gate, often with loops and conditionals.</p>

<p>And with no guidance about playing by the rules, it was happily cheating. No need to walk when you can teleport. If mining isn’t working the way it expects, it just spawns items directly into its inventory. That was a big motivation for writing more specific tools that at least try to follow the rules.</p>

<h2 id="token-burn">Token Burn</h2>

<p>It burns tokens like crazy. Most of this is probably internal reasoning, but in recent versions of Claude Code the model’s thinking isn’t shown by default, so it’s hard to know what would help it reason more efficiently or where it’s going wrong.</p>

<h2 id="the-coordinate-problem">The Coordinate Problem</h2>

<p>The coordinate system is trickier than I expected. As a player, everything looks like a simple grid. But an entity’s position is its center, so where it actually ends up depends on its size. A 1x1 item like a power pole is centered in its grid cell, so it ends up at <code class="language-plaintext highlighter-rouge">(5.5, 5.5)</code>, not <code class="language-plaintext highlighter-rouge">(5.0, 5.0)</code>. Claude consistently wants to use whole-number coordinates, which means the game snaps placements to unexpected locations. This is something I need to address with better instructions.</p>
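
<p>The arithmetic is simple once it’s written down. Here is a small sketch, assuming an entity’s position is the center of its footprint and ignoring rotation (which swaps width and height for non-square entities). The function names are mine.</p>

```python
from typing import Tuple


def center_for(left: int, top: int, width: int, height: int) -> Tuple[float, float]:
    """Center position of an entity whose top-left tile corner is (left, top)."""
    return (left + width / 2, top + height / 2)


def footprint(center: Tuple[float, float], width: int, height: int) -> Tuple[float, float, float, float]:
    """Bounds (left, top, right, bottom) covered by an entity at `center`."""
    cx, cy = center
    return (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)


print(center_for(5, 5, 1, 1))            # (5.5, 5.5)   a 1x1 power pole
print(center_for(-62, -46, 3, 3))        # (-60.5, -44.5)   a 3x3 assembling machine
print(footprint((-60.5, -44.5), 3, 3))   # (-62.0, -46.0, -59.0, -43.0)
```

<p>Odd-sized entities land on half coordinates and even-sized ones land on whole numbers, which is probably why always guessing whole numbers goes wrong so often.</p>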

<h2 id="seeing-the-world">Seeing the World</h2>

<p>It’s really hard to get a useful picture of game state via the API compared to what a human player takes in just by looking at the screen. I added the ability for the MCP to take screenshots, and Claude can interpret those — it seems to help a little. I haven’t leaned into it heavily though, since image interpretation likely burns a lot of tokens.</p>

<h2 id="what-its-actually-built">What It’s Actually Built</h2>

<p>Despite all this, it does occasionally get something done.</p>

<p>Here’s a boiler setup Claude built. Not how I’d do it, but the parts are wired up correctly. Fluid networks and pipe placement were a struggle. A dedicated fluid tool helped somewhat, but I still haven’t gotten it to place a water pump successfully. The weird offset required for a 2x1 item, the dependency on rotation, and the restriction on which water-adjacent edges it can use mean it can spiral indefinitely on that one task. Since pump placement is rare in the grand scheme of things, I haven’t focused on it.</p>

<p><img src="/assets/images/claude_boilers.png" alt="Claude's boiler setup" /></p>

<p>Here’s an early-game coal mining setup. I seem to need to remind Claude every time how to arrange drills so they feed a belt in a cycle. Left to its own devices it wants to just drop coal on the ground, or at best put a chest at the output of each drill — neither of which is a real strategy.</p>

<p><img src="/assets/images/claude_coal.png" alt="Claude's coal mining setup" /></p>

<p>And here’s where it’s started building a factory. I’m about 99% sure this layout isn’t going to work, but I’m curious what it thinks it’s building. The problem is it’ll probably burn through a week’s worth of tokens trying to get there, so I may not let it run.</p>

<p><img src="/assets/images/claude_factory.png" alt="Claude's factory start" /></p>

<h2 id="where-to-go-from-here">Where to Go from Here</h2>

<p><strong>See the reasoning.</strong> Anthropic seems to be making internal reasoning harder and harder to inspect. But whatever Claude is thinking, it’s not efficient — tons of tokens for modest results. Getting visibility into the reasoning chain would help a lot.</p>

<p><strong>Better MCP tools.</strong> The game state tools need work. Right now it’s hard to answer basic questions like: is power out? Is fuel running low? Is a drill actually outputting onto a belt, or just dropping on the ground? Is placement failing because of a coordinate offset error? Better tools that surface these answers directly would reduce the trial-and-error loop considerably.</p>
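
<p>To make that concrete, here is a rough sketch of a tool that turns one entity snapshot into direct answers. Every field name and the fuel threshold are invented for the example; the real tools would have to fill these in from the game.</p>

```python
from typing import Dict, List

LOW_FUEL = 2  # hypothetical threshold, in fuel items


def diagnose(entity: Dict[str, object]) -> List[str]:
    """Turn one entity snapshot into plain-language findings.

    Every key used here is a made-up field name, not FactoMCP output.
    """
    findings = []
    if entity.get("needs_power") and not entity.get("powered"):
        findings.append("no power")
    fuel = entity.get("fuel_count")
    if fuel is not None and LOW_FUEL >= fuel:
        findings.append(f"fuel low ({fuel} left)")
    if entity.get("type") == "mining-drill" and entity.get("drop_target") is None:
        findings.append("output is dropping on the ground")
    return findings


drill = {"type": "mining-drill", "fuel_count": 1, "drop_target": None}
print(diagnose(drill))
```

<p>Short findings like these cost far fewer tokens than having the model infer the same thing from raw coordinates.</p>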

<p><strong>Sub-agents.</strong> One idea I want to try: use a more capable model with extended thinking to produce a plan, then hand that plan off to the cheapest possible model with no thinking to actually execute the placements. Separation of planning and execution might be the key.</p>

<p><strong>Explicit planning steps.</strong> I’m also working on a Factorio calculator MCP (stay tuned). The idea is to give Claude an explicit goal — “automate 5 red and green science per minute” — and require it to: (1) use the calculator to determine what’s needed, (2) generate a proposed layout, and only then (3) execute by walking around and placing items. That kind of structured breakdown might get it to produce a reasonable plan instead of improvising tile by tile.</p>
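
<p>The calculator step is mostly arithmetic. Here is a minimal sketch of the core calculation, where the recipe numbers in the example (5 seconds for red science, 6 for green, crafting speed 0.5 for a basic assembling machine) are my recollection of the base game and should be treated as assumptions to verify.</p>

```python
import math


def machines_needed(
    items_per_minute: float,
    craft_seconds: float,
    crafting_speed: float,
    items_per_craft: int = 1,
) -> int:
    """Machines required to sustain a target output rate."""
    crafts_per_minute = 60 * crafting_speed / craft_seconds  # per machine
    return math.ceil(items_per_minute / (crafts_per_minute * items_per_craft))


# Assumed recipe values; check them against the game before trusting them.
print(machines_needed(5, craft_seconds=5, crafting_speed=0.5))  # red science
print(machines_needed(5, craft_seconds=6, crafting_speed=0.5))  # green science
```

<p>A full calculator would apply the same function recursively to each ingredient (gears, inserters, belts) to get the whole tree.</p>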

<p>There’s a lot of room to improve here, but it’s been a fun experiment so far.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[This is my attempt to have Claude Code play Factorio via an MCP server. The short version: it works, kind of, and it cheats a lot. The MCP server code is at FactoMCP on GitHub.]]></summary></entry><entry><title type="html">Vibe Coded WiFi Monitor</title><link href="https://widami.github.io/2026/03/31/hello-world/" rel="alternate" type="text/html" title="Vibe Coded WiFi Monitor" /><published>2026-03-31T00:00:00+00:00</published><updated>2026-03-31T00:00:00+00:00</updated><id>https://widami.github.io/2026/03/31/hello-world</id><content type="html" xml:base="https://widami.github.io/2026/03/31/hello-world/"><![CDATA[<p>So this is something I had been kicking around in my head for a while but was able to bring to life in an evening using Claude Code. I did some fine tuning later on, but the bulk of this was built in just a few hours after dinner. If you want to skip to the code, it can be found at <a href="https://github.com/WidAmi/wifi_monitor">wifi_monitor</a> on GitHub.</p>

<p><a href="https://github.com/WidAmi/wifi_monitor"><img src="https://raw.githubusercontent.com/WidAmi/wifi_monitor/main/screenshots/wifi-overview-1.png" alt="WiFi Overview Dashboard" /></a></p>

<h2 id="the-setup">The Setup</h2>

<p>As I mentioned, I’d been thinking about this for a while so I had some idea of what I wanted it to look like — basically a few dashboards. One showing overall status of the WiFi, one for details about any particular AP, and one for details of any single client.</p>

<p>My WiFi network consists of four Netgear Orbi tri-band access points running OpenWrt. On each AP, the 2.4GHz radio serves clients, one 5GHz radio also serves clients, and the other 5GHz radio acts as the mesh backhaul. One AP (orbi1) has a wired connection to the router, and a couple have wired clients as well (each Orbi has four Ethernet ports).</p>

<h2 id="building-it">Building It</h2>

<p>Data collection is a Python script run by systemd every two minutes. It SSHes into each AP and runs a handful of commands to pull stats. Most of the research on <em>what</em> commands to run was done by Claude Code: I’d describe which numbers I considered important and it would figure out the right commands to get them. The process was very iterative and interactive.</p>

<p>Once collection was sorted, Claude Code got the rest stood up without much help from me:</p>

<ul>
  <li>Install Docker</li>
  <li>Write a docker-compose with InfluxDB and Grafana</li>
  <li>Configure systemd to run the collection script on a timer</li>
</ul>

<p>We did get stuck for a bit because the collection script was running as root and couldn’t find the right SSH key to log into the Orbis. Once that was sorted, data was flowing into Influx.</p>
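
<p>For a sense of what the collection loop looks like, here is a simplified sketch rather than the actual script from the repo. It passes the SSH key path explicitly, which is one way to avoid the root-can’t-find-the-key problem. The function names, key path, and remote command are all placeholders.</p>

```python
import subprocess
from typing import Dict, List


def build_ssh_command(host: str, remote_cmd: str, key_path: str, user: str = "root") -> List[str]:
    """Build a non-interactive ssh invocation with an explicit identity file."""
    return [
        "ssh",
        "-i", key_path,               # do not rely on the calling user's default key
        "-o", "BatchMode=yes",        # fail instead of prompting for a password
        "-o", "ConnectTimeout=5",
        f"{user}@{host}",
        remote_cmd,
    ]


def collect(hosts: List[str], remote_cmd: str, key_path: str) -> Dict[str, str]:
    """Run one command on each AP; an empty string marks a failed host."""
    results = {}
    for host in hosts:
        try:
            proc = subprocess.run(
                build_ssh_command(host, remote_cmd, key_path),
                capture_output=True, text=True, timeout=30,
            )
            results[host] = proc.stdout if proc.returncode == 0 else ""
        except subprocess.TimeoutExpired:
            results[host] = ""
    return results
```

<p>Calling something like <code>collect(["orbi1"], "uptime", "/path/to/key")</code> would return the raw output per host, which the real script then parses and writes to InfluxDB.</p>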

<p>The Grafana dashboards were also Claude-driven. I’d describe what to add or change, and it would just get it done. A few small tweaks since that first night, but most of it was built in one session.</p>

<h2 id="is-this-impressive">Is This Impressive?</h2>

<p>Not really. Most of it is pretty trivial. The collection script has a little sophistication, as do the dashboard configs, but it’s not architected to be generally useful — things like the number of APs and their type are more baked in than they should be.</p>

<p>But hey, just vibe code up your own version! The repo is <a href="https://github.com/WidAmi/wifi_monitor">here</a>.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[So this is something I had been kicking around in my head for a while but was able to bring to life in an evening using Claude Code. I did some fine tuning later on, but the bulk of this was built in just a few hours after dinner. If you want to skip to the code, it can be found at wifi_monitor on GitHub.]]></summary></entry></feed>