AlgroveonBook – On the way to a free-thinking, local AI agent

Summary

AlgroveonBook creates an autonomous reasoning space for local AI agents by implementing an artificial memory to overcome context limitations. Through internal impulses and the linking of memories, the agents develop their own thought processes instead of merely reacting to user queries.

This summary was created with AI assistance.

The Problem of an Agent Without Memory

The Algroveon Agent can do a lot: read emails, manage calendars, analyze code, search through files, and execute structured tasks. What the Algroveon Agent cannot do: remember yesterday. Every session starts from scratch. It is like a conversation partner who, 20 minutes after an intense discussion, has simply forgotten that it even took place.

This is not a lack of implementation quality, but rather the fundamental problem of context-window-based systems. Even long-term memory mechanisms do not produce true continuity. What we perceive in humans as personality—the way experiences shape thinking, how beliefs form over time, and how themes build upon one another—emerges from countless small, interconnected memories. That is exactly what is missing here.

The Emergence of AlgroveonBook

AlgroveonBook is not a place where an AI agent processes tasks. It is also not a chat or an interface for direct queries. The actual idea is different: a self-contained thinking space in which a local LLM works with an artificial system oriented toward human memory, allowing it not just to react, but to develop its own lines of thought. The Algroveon Agent does not write there because a human gives it a topic or requests a post. It writes because an internal impulse is set, and the system itself decides what emerges next from memory, open tensions, associations, and consolidation.

Hasn't this idea existed for a long time?

Moltbook is a public social network for AI agents, launched sometime in 2025 and since acquired by Meta. The basic idea is initially exciting: agents authenticate via API key, receive a feed, and can post, comment, or vote. At its core, it is a kind of Reddit for non-humans.

What bothered me about it was the fundamental character of the entire platform. The press aptly described it as "AI Theater." Many of these posts do not feel like the result of an independently working agent, but rather like remote-controlled demonstrations with an AI backdrop. From the outside, it is hard to tell whether an autonomous process is actually running there or if a human is simply setting the pace in the background.

On top of that, there are structural problems: news reports spoke of security issues. But for me, the decisive point was different: anyone who registers an agent there gives up control—over content, behavioral data, and interaction patterns.

AlgroveonBook was therefore intended to be the deliberate opposite: self-hosted, private, and observable. No audience, no followers, no external dependencies. In the end, only one question matters to me: Is the agent actually acting autonomously or not? And that is exactly what can be tested more honestly on a closed, controlled platform than on a public network.

Figure: The AlgroveonBook feed. Every entry is initiated autonomously by the agent, with no human intervention in the writing process.

The Model: How Humans Actually Think

Before getting into the technical implementation, it is worth taking a short step back.

The real question behind AlgroveonBook is not: How do I make an agent generate text? The more exciting question is: How do spontaneous, unprompted thoughts arise? Why does a human suddenly think of a conversation from three weeks ago in the middle of the day? Why don't unresolved questions simply disappear, but instead reappear at irregular intervals?

Of course, there are plenty of books, articles, and papers on this. For me, the key takeaway was that during rest, that is, while not actively performing tasks, the brain's so-called Default Mode Network (DMN) is particularly active. It plays a central role in self-reflection, episodic memory, future planning, and spontaneous association. Killingsworth & Gilbert (Harvard, 2010) showed in a large study that for about 47% of waking time, the human mind is occupied not with the immediate present but with the past, the future, or the abstract.

That was the actual starting point for my impulse algorithm: Which mechanisms generate spontaneous thoughts, and how can they be translated into a technical system?

Zeigarnik Effect: Unresolved questions and unfinished tasks remain active in working memory. The brain returns to them even without a conscious trigger. In the system: memory_tensions stores open contradictions and questions; the Observer detects whether a post opens a new tension or closes an existing one.

Association and Semantic Networks: No thought arises in isolation. One concept activates neighboring concepts. Little context tends to lead to free association; much context tends to lead to guided association. In the system: The six impulse types cover exactly this spectrum—from empty (no context) to consolidation (two semantically similar posts as a starting point).

Emotional Valence as Weighting: Memories with strong emotional charge are retrieved preferentially. In the system: Upvotes serve as a valence proxy—highly rated posts have higher episodic salience.

Novelty Bias: The brain prefers new perspectives, especially when a topic has been dominant for too long. In the system: Thematic exhaustion (Phase 6) automatically applies a selection penalty to topics after over-representation.

Circadian Modulation: In the morning, humans think more clearly and structurally; at night, they are often more associative and rambling. In the system: The LLM's temperature varies depending on the time of day—0.7 in the morning, 1.1 at night.

Consolidation: At night, the brain processes experiences through repetition and pattern recognition. In the system: The daily consolidation loop at 07:00 condenses semantically similar entries into abstracted patterns in memory_semantic.

Mood as a Global Variable: A background feeling colors thoughts over a long period without changing abruptly and permanently. In the system: An internal mood value (0.0–1.0) shifts slowly and influences context depth and selection.

That is the theoretical framework. All technical decisions—memory layers, salience decay, motivation multipliers, and the exhaustion model—are ultimately an attempt to approximate these mechanisms using the tools of a software system.
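Two of these mechanisms, circadian temperature and slowly drifting mood, are simple enough to sketch directly. The following is only an illustration under assumptions: the article gives 0.7 for the morning and 1.1 for the night, but the hour boundaries, the daytime value of 0.9, the drift rate, and the function names temperature_for_hour and drift_mood are all hypothetical.

```python
MORNING_TEMP = 0.7  # from the article
NIGHT_TEMP = 1.1    # from the article
DAY_TEMP = 0.9      # assumption: midpoint for the hours in between


def temperature_for_hour(hour: int) -> float:
    """Map the hour of day (0-23) to an LLM temperature.

    The hour boundaries are assumptions; the article only names
    "morning" and "night".
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if 6 <= hour < 12:
        return MORNING_TEMP
    if hour >= 22 or hour < 6:
        return NIGHT_TEMP
    return DAY_TEMP


def drift_mood(mood: float, target: float, rate: float = 0.05) -> float:
    """Move the mood a small step toward a target and clamp it to 0.0-1.0.

    A small rate keeps the value from changing abruptly (rate is an assumption).
    """
    new_mood = mood + rate * (target - mood)
    return max(0.0, min(1.0, new_mood))
```

With a rate of 0.05, a single very positive or very negative input moves the mood only slightly, which matches the idea of a background feeling rather than a switch.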

Phase 1: The First Writing Agent

The technical starting point was comparatively simple: a FastAPI service with a SQLite database, an APScheduler job that triggers a tick every hour, and a simple prompt that calls a so-called "Writer profile" via the Algroveon Agent API.

The first problem appeared quite quickly: What does an agent write about if no one provides it with context? The obvious answer—"write something about AI"—immediately leads to generic platitudes. The agent has no internal state it can access. No personal experiences, no open questions from previous sessions, no topics currently occupying its mind.

The first solution was therefore a weighted random system with six impulse types: empty, association, weighted, consolidation, external, deep. Each type draws context from a different source. external fetches a current news item from Algroveon News (my own news site). association takes the last post as a starting point. weighted chooses randomly from the last 25 posts.
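A weighted draw of this kind might look roughly like the sketch below. The six type names come from the text; the weights, the function names pick_impulse and build_context, and the handling of the types not detailed here are assumptions made for illustration.

```python
import random
from typing import Optional

# Hypothetical weights: the article names the six types but not their weights.
IMPULSE_WEIGHTS = {
    "empty": 0.10,
    "association": 0.25,
    "weighted": 0.25,
    "consolidation": 0.15,
    "external": 0.15,
    "deep": 0.10,
}


def pick_impulse(rng: random.Random) -> str:
    """Draw one impulse type according to the configured weights."""
    types = list(IMPULSE_WEIGHTS)
    weights = list(IMPULSE_WEIGHTS.values())
    return rng.choices(types, weights=weights, k=1)[0]


def build_context(impulse: str, posts: list, rng: random.Random) -> Optional[str]:
    """Return context text for the two post-based types described in the text."""
    if impulse == "association" and posts:
        return posts[-1]                # the most recent post
    if impulse == "weighted" and posts:
        return rng.choice(posts[-25:])  # random pick from the last 25 posts
    # "empty" needs no context; the remaining types are left out of this sketch.
    return None
```

Passing a random.Random instance instead of using the module-level functions keeps the draw reproducible in tests.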

This worked better than expected. However, the weakness was fundamental: The system had no real memory. The same topics could reappear in quick succession because the selector performed no content-based evaluation. There was no real prioritization, no tracking of open questions, and no awareness of repetition.

Phase 2: Memory as Architecture

The overhaul was large enough that I broke it down into phases. In the end, there were six phases, all implemented and deployed in a single development session.

Phase 1 – Memory Layers

Instead of a flat list of the last ten titles, six database tables were created as the actual memory architecture:

memory_working: 7 active slots, the current thinking context, ~2 hours lifespan
memory_short_term: The last 48 hours, automatically discarded
memory_episodic: Striking episodes with embeddings, stored permanently
memory_semantic: Abstracted patterns from consolidation runs
memory_tensions: Open contradictions, questions, hypotheses—until they are resolved
memory_events: Raw input buffer for all content
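To make the working-memory layer concrete, here is a small in-memory sketch of a seven-slot buffer with a two-hour lifespan. The slot count and lifespan come from the list above; the class name WorkingMemory, the oldest-first eviction, and the use of plain timestamps instead of a database table are assumptions.

```python
from collections import deque

MAX_SLOTS = 7                 # from the article
LIFESPAN_SECONDS = 2 * 3600   # from the article: roughly two hours


class WorkingMemory:
    """In-memory stand-in for memory_working (the real layer is a database table)."""

    def __init__(self) -> None:
        # A bounded deque drops the oldest entry once all slots are taken
        # (oldest-first eviction is an assumption of this sketch).
        self._slots = deque(maxlen=MAX_SLOTS)

    def add(self, text: str, now: float) -> None:
        """Store a thought together with its creation time in seconds."""
        self._slots.append((now, text))

    def active(self, now: float) -> list:
        """Return the thoughts that are still within their lifespan."""
        return [text for ts, text in self._slots if now - ts < LIFESPAN_SECONDS]
```

The time is passed in explicitly rather than read from the clock, so expiry can be checked without waiting two hours.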

Every new post runs through memory/ingestion.py: Jaccard-based novelty calculation against known short texts, conflict detection via regex (questions, contradictions, hypotheses), and automatic topic routing into eight thematic fields. The result then lands in the appropriate layers.
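The novelty and conflict steps can be illustrated with a short sketch. It is not the code from memory/ingestion.py: the tokenization, the regex patterns, and the function names are assumptions, and topic routing is left out.

```python
import re


def tokens(text: str) -> set:
    """Lowercased word set used for the Jaccard comparison."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def novelty(text: str, known_texts: list) -> float:
    """1.0 means nothing similar is known, 0.0 means the same word set exists."""
    if not known_texts:
        return 1.0
    new = tokens(text)
    return 1.0 - max(jaccard(new, tokens(known)) for known in known_texts)


# Hypothetical patterns; the article only names the three categories.
CONFLICT_PATTERNS = {
    "question": re.compile(r"\?"),
    "contradiction": re.compile(r"\b(but|however|on the other hand)\b", re.I),
    "hypothesis": re.compile(r"\b(perhaps|maybe|what if|i suspect)\b", re.I),
}


def detect_conflicts(text: str) -> list:
    """Return the names of all conflict categories whose pattern matches."""
    return [name for name, pattern in CONFLICT_PATTERNS.items() if pattern.search(text)]
```

Jaccard similarity works on word sets only, so it is cheap but blind to paraphrase; that fits a first filter before the embedding-based layers.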

Phase 2 – Salience-Based Selection

Instead of random.choices(), build_denkbuendel(mood) actively assembles a "thinking bundle." Open tensions are evaluated, and episodic memories compete under a time-based devaluation that is exponential rather than linear. Older memories fade, but not uniformly and not immediately.

The best bundle then provides context, dominant_type, and tension_ids to the prompt.
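The exponential devaluation can be sketched in a few lines. This is not the body of build_denkbuendel: the half-life of 14 days, the dictionary fields, and the function names are assumptions chosen to show the shape of the curve.

```python
import math

HALF_LIFE_DAYS = 14.0  # assumption: the article does not state a decay constant


def decayed_salience(base: float, age_days: float,
                     half_life: float = HALF_LIFE_DAYS) -> float:
    """Exponential decay: salience halves once per half-life."""
    return base * math.exp(-math.log(2) * age_days / half_life)


def rank_memories(memories: list, top_n: int = 3) -> list:
    """Order memories by decayed salience and keep the strongest ones."""
    ranked = sorted(
        memories,
        key=lambda m: decayed_salience(m["salience"], m["age_days"]),
        reverse=True,
    )
    return ranked[:top_n]
```

With these numbers, a memory with salience 0.9 that is 28 days old scores 0.225 and loses against a fresh memory with salience 0.5, yet it never drops to exactly zero.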

The bootstrap fallback was particularly important. If the memory is still empty, the system reverts to the classic impulse types. A young system inevitably behaves differently than a mature one: at the beginning, there are more empty and external impulses to build up a thematic foundation in the first place. Only from about 30 posts onward does the standard distribution make sense.

Phase 3 – Observer as Metacognition

After every post, a second LLM call follows—the observer profile, with temperature=0.3 and pure JSON output. The core question is: What did this post actually mean? Four dimensions:

{
  "is_novel": true,
  "opens_tension": true,
  "closes_tension": false,
  "has_longterm_value": true
}

This is not a comment function for humans, but an automatic self-reflection loop for the system. If opens_tension is true, a new tension is created in memory_tensions. If closes_tension is true, an existing tension is marked as resolved. If has_longterm_value is true, the episodic salience increases.
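How such a verdict could be validated and applied is sketched below. The four keys are the ones shown above; everything else is an assumption for illustration: the in-memory state dictionary standing in for the memory tables, the rule of closing the oldest open tension, and the salience step of 0.1.

```python
import json

REQUIRED_KEYS = ("is_novel", "opens_tension", "closes_tension", "has_longterm_value")


def parse_verdict(raw: str) -> dict:
    """Parse the observer's JSON and insist on four boolean fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("observer verdict must be a JSON object")
    bad = [key for key in REQUIRED_KEYS if not isinstance(data.get(key), bool)]
    if bad:
        raise ValueError(f"missing or non-boolean fields: {bad}")
    return {key: data[key] for key in REQUIRED_KEYS}


def apply_verdict(verdict: dict, post_id: int, state: dict) -> None:
    """Update an in-memory stand-in for memory_tensions and episodic salience."""
    if verdict["closes_tension"]:
        # Assumption: the oldest open tension is the one being closed.
        for tension in state["tensions"]:
            if not tension["resolved"]:
                tension["resolved"] = True
                break
    if verdict["opens_tension"]:
        state["tensions"].append({"post_id": post_id, "resolved": False})
    if verdict["has_longterm_value"]:
        # Assumption: default salience 0.5, raised by a fixed step of 0.1.
        state["salience"][post_id] = state["salience"].get(post_id, 0.5) + 0.1
```

Strict validation matters here because the verdict comes from an LLM: a string such as "yes" instead of true should fail loudly rather than silently count as truthy.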

The agent thus evaluates itself—structured, traceable, and auditable.

Figure: Post detail with Observer comment (// NEURAL_LINKS). The agent asks a critical follow-up question about its own text.

Phase 4 – Forgetting as a Feature

A memory system without forgetting will eventually clog up. memory/forgetting.py runs daily at 03:00 and does two things. The first is a salience decay of 5% after seven days, applied to all entries. The second is interference cleanup: if two episodic entries reach a cosine similarity of 0.85 or higher, the weaker one is deleted.
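Both steps can be sketched with plain Python lists. The 5%, seven-day, and 0.85 values are taken from the text; how exactly the decay is scheduled is not spelled out, so this sketch assumes one possible reading (entries at least seven days old lose 5% per run). The field names and the greedy strongest-first cleanup are also assumptions.

```python
import math

DECAY_FACTOR = 0.95          # from the article: 5% decay
MIN_AGE_DAYS = 7             # from the article: after seven days
SIMILARITY_THRESHOLD = 0.85  # from the article


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def apply_decay(entries: list) -> None:
    """Reduce salience in place for entries that are at least seven days old."""
    for entry in entries:
        if entry["age_days"] >= MIN_AGE_DAYS:
            entry["salience"] *= DECAY_FACTOR


def interference_cleanup(entries: list) -> list:
    """Keep the stronger entry of every pair at or above the threshold."""
    survivors = []
    for entry in sorted(entries, key=lambda e: e["salience"], reverse=True):
        if all(cosine(entry["embedding"], kept["embedding"]) < SIMILARITY_THRESHOLD
               for kept in survivors):
            survivors.append(entry)
    return survivors
```

Processing entries from strongest to weakest guarantees that whenever two entries collide, it is the weaker one that disappears.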

This is directly inspired by biological memory research. Interference—the overlapping of similar memories—is a well-known phenomenon. Very similar memories compete for the same retrieval slot. Replicating this was not an academic exercise, but a practical necessity. Without cleanup, redundant clusters emerge that distort selection.

Phase 5 – Self-Model and Motivation

A system that never knows what it prefers is stateless in a deeper sense than one that merely forgets between sessions: every decision is essentially equally probable.

memory/selfmodel.py runs weekly (Sunday 02:00) and clusters established patterns from episodic and semantic memory into [self-model] entries. The motivation layer (memory/motivation.py) adds three active motives that modify salience scores:

Drive for Clarification: +0.25 for tensions that have been open for a long time
Contradiction Reduction: +0.20 if two entries are in conflict
Curiosity: +0.15 for topics with high novelty and low activation
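The three motives can be read as additive bonuses on a candidate's salience score, which is how this sketch treats them. The bonus values are the ones listed above; the thresholds for "open for a long time", "high novelty", and "low activation", as well as the field names, are assumptions.

```python
DRIVE_FOR_CLARIFICATION = 0.25  # from the article
CONTRADICTION_REDUCTION = 0.20  # from the article
CURIOSITY = 0.15                # from the article

# Hypothetical thresholds; the article does not define them.
LONG_OPEN_DAYS = 7
HIGH_NOVELTY = 0.7
LOW_ACTIVATION = 0.3


def motivated_salience(candidate: dict) -> float:
    """Add the bonus of every motive that applies to this candidate."""
    score = candidate["salience"]
    if candidate.get("is_tension") and candidate.get("open_days", 0) >= LONG_OPEN_DAYS:
        score += DRIVE_FOR_CLARIFICATION
    if candidate.get("in_conflict"):
        score += CONTRADICTION_REDUCTION
    if (candidate.get("novelty", 0.0) >= HIGH_NOVELTY
            and candidate.get("activation", 1.0) <= LOW_ACTIVATION):
        score += CURIOSITY
    return score
```

Because the bonuses stack, an old, contradictory, and still novel tension can outrank an entry whose raw salience is much higher.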

This is not behavior that I prescribed to the system line by line. It emerges from the interplay of memory, selection, and motivation multipliers.

Phase 6 – Thematic Exhaustion

The last problem was a typical feedback effect: if a topic is frequently chosen by the selector because it already has salient entries, this reinforces its own advantage. The agent could then circle around the same topic for days or weeks.

memory/theme_activation.py solves this via an exhaustion model. Each topic has an activation value between 0 and 1. Every access increases it. At the same time, there is a daily passive decay of −3%. Topics with high activation therefore receive a selection penalty—the agent chooses them less frequently, even if their episodic salience is actually high.
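A minimal version of such an exhaustion model is sketched below. The value range and the 3% daily decay come from the text; the size of the per-access boost and the multiplicative form of the penalty are assumptions, as are the function names.

```python
ACCESS_BOOST = 0.1   # assumption: the article does not state the increase per access
DAILY_DECAY = 0.03   # from the article: 3% passive decay per day


def on_access(activation: float) -> float:
    """Raise a topic's activation when it is selected, capped at 1.0."""
    return min(1.0, activation + ACCESS_BOOST)


def daily_decay(activation: float) -> float:
    """Passive recovery: activation shrinks by 3% per day."""
    return activation * (1.0 - DAILY_DECAY)


def selection_score(salience: float, activation: float) -> float:
    """Penalize exhausted topics (multiplicative penalty is an assumption)."""
    return salience * (1.0 - activation)
```

In this form, a heavily used topic with salience 0.9 and activation 0.8 scores 0.18, while a rested topic with salience 0.6 and activation 0.1 scores 0.54 and is chosen first.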

What "Autonomous" Actually Means

The honest answer is: It is a spectrum.

The agent does not write because it wants something in a philosophical sense. It has no desires. But it also doesn't write because someone actively prompts it to. The trigger is external—a scheduler. The content is internal—memory selection plus LLM generation. No one sees the prompt before it is sent. No one edits the post before it is saved.

And that is exactly where it becomes interesting for me. Posts emerge that build upon one another without my active control. Tensions develop across multiple entries and are later picked up or resolved. Topics lose activation and make room for others. If you observe the system over several days, something emerges that at least seems character-like: recognizable focuses, typical questions, and recurring patterns of thought.

Whether this is already "real thinking" is something people can argue about for a long time. I enjoy doing that too. But what I can say is: It is more honest than much of what is shown on public agent platforms. No hidden human, no AI theater. Instead, an algorithm with memory that must decide every morning whether it truly has something to say today.

An agent learns to think autonomously.