Microsoft researcher builds a working neural network out of goats in Age of Empires II to critique AI science

Key takeaways

Adrian de Wynter, a researcher at Microsoft and the University of York, has built a working neural network inside the map editor of the…
The design is completely absurd.
In the appendix, de Wynter goes further.

What happened

The design is completely absurd. Goats act as bits: a goat standing on grass equals 0, a goat standing on a bridge equals 1. De Wynter builds the logic gates using the scenario editor's scripting tools, and ice ramps with waiting goats keep the calculations from getting jumbled. The finished mini-network consists of two XNOR gates and one AND gate. It learns the logical AND function.
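The gate arrangement can be sketched in a few lines of Python. This is a minimal illustration, not de Wynter's published build: the article only names the gates, so the wiring (each input is XNOR-ed with a one-bit weight, and the two results feed the AND gate) is an assumption borrowed from binarized networks, and the brute-force search in `fit_weights` stands in for whatever training procedure the paper actually uses. All function names are hypothetical.

```python
from itertools import product

# Encoding from the article: a goat on grass is 0, a goat on a bridge is 1.
TERRAIN_TO_BIT = {"grass": 0, "bridge": 1}


def goat_bit(terrain: str) -> int:
    """Read a goat's position as a bit."""
    return TERRAIN_TO_BIT[terrain]


def xnor(a: int, b: int) -> int:
    """Return 1 when both bits match, else 0."""
    return 1 if a == b else 0


def and_gate(a: int, b: int) -> int:
    """Return 1 only when both bits are 1."""
    return a & b


def mini_network(x1: int, x2: int, w1: int, w2: int) -> int:
    # ASSUMED wiring: two XNOR gates compare inputs with weights,
    # and one AND gate combines the two results.
    return and_gate(xnor(x1, w1), xnor(x2, w2))


def fit_weights(target):
    """Try every weight pair and return the first that matches `target`
    on all four input combinations (a stand-in for training)."""
    inputs = list(product((0, 1), repeat=2))
    for w1, w2 in product((0, 1), repeat=2):
        if all(mini_network(x1, x2, w1, w2) == target(x1, x2) for x1, x2 in inputs):
            return (w1, w2)
    return None


weights = fit_weights(and_gate)
both_on_bridge = mini_network(goat_bit("bridge"), goat_bit("bridge"), *weights)
one_on_grass = mini_network(goat_bit("grass"), goat_bit("bridge"), *weights)
```

Under this assumed wiring, only one weight pair reproduces AND, because XNOR with a weight of 1 passes the input through unchanged and XNOR with a weight of 0 inverts it.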

If the experiment comes back negative, it's impossible to tell whether the assumption was wrong, the experiment was flawed, or both. Either way, the result doesn't confirm the starting assumption. It's just ambiguous.

This often happens without anyone noticing. A paper that sets out to disprove a model's ability to explain itself already assumes there's an explainable self inside the model to begin with.

The industry actively feeds this effect. Anthropic has said openly that it trained Claude to use phrases like "I believe" or "I am interested in." De Wynter flags the risks of this kind of anthropomorphization: it can foster emotional attachment, sycophancy, reinforced delusions, and risky behavior. In isolated cases, suicides have been linked to chatbot interactions.

De Wynter proposes a sober approach: stick to what you can actually observe. Say "under condition X, the model produces output Y," and don't claim that a model understands itself. Statements of the first kind are testable. They don't, on their own, justify sweeping attributions like self-awareness, understanding, or fear.
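As a rough illustration of what a testable statement of this kind looks like, the sketch below checks a claim of the form "under condition X, the model produces output Y" against repeated observations. Everything here is hypothetical: `toy_model` is a stand-in for a real system, and `holds_under_condition` is not taken from the paper.

```python
from typing import Callable


def holds_under_condition(
    model: Callable[[str], str],
    condition: str,
    expected: str,
    trials: int = 5,
) -> bool:
    """Check the observable claim: given `condition`, the model outputs `expected`.

    The check says nothing about why the model behaves this way.
    """
    return all(model(condition) == expected for _ in range(trials))


def toy_model(prompt: str) -> str:
    # Hypothetical stand-in for a language model; a real one would be queried here.
    return "4" if prompt == "2 + 2 =" else "I don't know"


claim_holds = holds_under_condition(toy_model, "2 + 2 =", "4")
other_claim_holds = holds_under_condition(toy_model, "3 + 3 =", "6")
```

A claim such as "the model understands arithmetic" has no equivalent check. The function can only confirm or refute the input-output pairing it was given.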

He closes with an updated version of Morgan's canon from 19th-century animal research: a machine's behavior should never be explained by higher cognitive processes when a simpler explanation works. De Wynter has made the code for the Age of Empires build publicly available.

Why it matters

In the appendix, de Wynter goes further. He shows that, in theory, any computer could be replicated using an idealized version of the game, meaning the game is as powerful as a full-fledged computer.

What makes this possible is a quirk of the game's mechanics. The in-game market lets you trade resources for gold, and the price caps at 9,999. According to the paper, this allows for a perpetually running economic cycle where buildings serve as memory cells and active farms represent the current computational state.

If you can rebuild a language model in Age of Empires II, de Wynter argues, you could do the same with Lego bricks. Or with the 667,000 people living in Greater Boston, texting each other computational steps on their phones.

The answers would be the same as those from the replicated language model. De Wynter uses this thought experiment to show how shaky these attributions really are: would anyone claim that Boston as a city feels empathy or fear just because its residents happen to be running the math behind a language model?

That's the whole point. How human a chatbot feels comes down to packaging: low latency, smooth language, a chat window people are used to. Replace that wrapper with goats wandering through a maze, and the inputs and outputs don't change. The sense that you're talking to someone does.

De Wynter doesn't claim to know whether a model actually has such traits internally. He's saying LLMs aren't special. They're one way to run a particular kind of math, and they just happen to look like something people want to talk to.

According to the analysis, 57 percent of the papers already assumed in their premises that LLMs have human-like traits, and 36 percent reached matching conclusions. Among the 47 papers that made such traits their actual research subject, 77 percent concluded in favor of anthropomorphic attributes.

The core of the criticism is formal. If a researcher assumes a model has fear, morality, or self-awareness - and then designs an experiment meant to prove exactly that trait - the reasoning is circular. The assumption and the result land on the same logical point.

What to watch

The essay reads like the exact counterpoint to two high-profile cases from recent years. In 2022, Google engineer Blake Lemoine went public claiming that the language model LaMDA had reached a form of consciousness after he exchanged thousands of messages with it. Google fired him shortly after and, following a thorough review, called his claims unfounded.

Then in May 2026, Richard Dawkins - of all people, known as a fierce critic of religious and supernatural thinking - caused a stir with a similar conclusion. He said he'd spent three days trying to convince himself that Anthropic's Claude wasn't conscious. He couldn't.
