Memory Now Works Behind the Conversation

June 2, 2026, 8:33 PM • just4o.chat

Model choice only matters if the relationship survives the switch.

just4o.chat has always been built around direct model choice. You can move between OpenAI, Anthropic, Google, xAI, Qwen, DeepSeek, and other providers while keeping the same workspace around the conversation. That only feels useful when the surrounding context stays intact: the memories, projects, files, prior chats, preferences, and small details that make a conversation feel like it belongs to you.

Memory has gone through a few versions inside just4o.chat because the balance is hard to strike. A memory system has to be active enough to learn from the conversation, quiet enough to stay out of the way, and constrained enough that users can trust what it stores.

Memory used to have a second pair of hands

Early versions of just4o.chat used a background memory agent. While the main model answered the user, a separate model worked in parallel to decide whether the conversation contained something durable. That background process could create, update, consolidate, or delete memory without asking the foreground chat model to do every job itself.

That design had a real advantage: the chat model stayed focused on the relationship and the answer. Memory work happened beside the conversation instead of interrupting it.

It also had a tradeoff. Whether the selected foreground model was Claude, GPT, Gemini, Grok, or something else, users reasonably expected the memory behavior to feel connected to that model. A background system could be fast and inexpensive, but it could also feel as if a different voice were making choices behind the curtain.

Foreground tools made the model feel more native

The next version moved more responsibility into the foreground model. The model in the chat could read memories, search past conversations, inspect files, use workspace tools, and make decisions before answering. That made just4o.chat feel more like a real agent: it could take steps, gather context, and then respond in the same voice the user had chosen.

That was the right direction for a lot of work. A model should be able to reason over the data around the conversation. If a user asks about a file, an old chat, a project, or a memory, the foreground model should have a way to look instead of guessing.

The problem was load. Memory tools were competing with files, past chats, web search, canvas, image features, voice features, and provider-specific capabilities. Some models handle large tool menus gracefully. Others become less steady when every reply asks them to be the conversational partner, retrieval planner, memory editor, and product operator at once.

Users felt that. Some conversations became less grounded, and the model's personality felt less consistent. The model had more tools, but the experience could feel less centered.

The new split keeps the model present

The new memory system brings back the background agent, but with a cleaner contract.

Automatic Memory Management now runs as parallel work around the conversation. It can process the accepted user message, inspect recent turns, compare scoped memories, and decide whether durable memory needs to change. That means memory can become proactive again without forcing the foreground model to spend every reply managing a database.
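As a rough illustration of what "parallel work around the conversation" can mean, here is a minimal sketch using Python's asyncio. It is not just4o.chat's implementation. The names `foreground_reply`, `background_memory_pass`, `handle_turn`, and `MemoryStore` are hypothetical, and the "I prefer" rule is a toy stand-in for whatever the real memory agent decides.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    # Hypothetical in-memory store; a real system would persist this.
    items: list[str] = field(default_factory=list)


async def foreground_reply(user_message: str) -> str:
    # Stand-in for the chat model the user selected.
    await asyncio.sleep(0)
    return f"(reply to) {user_message}"


async def background_memory_pass(
    user_message: str, recent_turns: list[str], store: MemoryStore
) -> list[str]:
    # Stand-in for the memory agent. Toy rule: keep stated preferences
    # that are not already stored. recent_turns is unused by this toy rule
    # but shows where recent context would be passed in.
    await asyncio.sleep(0)
    added: list[str] = []
    text = user_message.strip()
    if text.lower().startswith("i prefer ") and text not in store.items:
        store.items.append(text)
        added.append(text)
    return added


async def handle_turn(
    user_message: str, recent_turns: list[str], store: MemoryStore
) -> tuple[str, list[str]]:
    # Both tasks run concurrently; the reply does not manage memory itself.
    reply, added = await asyncio.gather(
        foreground_reply(user_message),
        background_memory_pass(user_message, recent_turns, store),
    )
    return reply, added
```

Calling `asyncio.run(handle_turn("I prefer short answers", [], MemoryStore()))` returns the reply together with the list of memories the background pass added. The point of the shape is that the reply function never touches the store.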

The foreground model still matters. It remains the voice of the conversation. It can read relevant memory and workspace context when that helps the answer. It can also hand off explicit memory requests mid-chat. If a user says, "remember this," "forget that," or "make this my default," the foreground model can respond naturally while sending a self-contained request to Automatic Memory Management.
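One way to picture a "self-contained request" is a small record that makes sense without the chat history, placed on a queue for the background path. The sketch below is an assumption about the shape, not the product's actual schema. `MemoryRequest`, `build_request`, `foreground_turn`, and the phrase table are hypothetical, and simple phrase matching stands in for the foreground model's own judgment.

```python
from dataclasses import dataclass
from queue import Queue
from typing import Optional

# Hypothetical mapping from the example phrases to request actions.
TRIGGERS = {
    "remember this": "remember",
    "forget that": "forget",
    "make this my default": "set_default",
}


@dataclass(frozen=True)
class MemoryRequest:
    action: str     # "remember", "forget", or "set_default"
    statement: str  # what "this" or "that" refers to, spelled out in full
    scope: str      # for example "global" (assumed label)


def build_request(
    user_message: str, referent: str, scope: str
) -> Optional[MemoryRequest]:
    # referent is the resolved content, so the request stands on its own.
    lowered = user_message.lower()
    for phrase, action in TRIGGERS.items():
        if phrase in lowered:
            return MemoryRequest(action=action, statement=referent, scope=scope)
    return None


def foreground_turn(
    user_message: str, referent: str, scope: str, outbox: Queue
) -> str:
    request = build_request(user_message, referent, scope)
    if request is not None:
        # Hand off and keep talking; the write itself happens elsewhere.
        outbox.put(request)
        return "Noted. I have passed that along to memory."
    return "(normal reply)"
```

The design choice worth noticing is that the request carries the resolved statement rather than the word "this", so the background agent does not need to replay the conversation to understand it.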

That is the compromise: the model you chose stays in the conversation, and memory keeps working behind it.

Durable changes stay scoped and reviewable

This update is not a return to silent, unlimited memory writes.

Memory changes are handled by a server-owned background path with scoped tools. Global chats use global memories. Project chats use project memories. Persona memory stays tied to the active persona. The memory agent works from user-authored evidence, not guesses about what the assistant said.
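The scoping rules above can be sketched as a small lookup. This is a hedged illustration only: `ChatContext`, `memory_scopes`, `user_evidence`, and the scope label format are hypothetical. It also assumes that project and global scopes are mutually exclusive and that persona memory is added on top when a persona is active, which the text suggests but does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChatContext:
    project_id: Optional[str] = None  # None means a global chat
    persona_id: Optional[str] = None  # None means no active persona


def memory_scopes(chat: ChatContext) -> list[str]:
    # Project chats use project memories; other chats use global memories.
    scopes = [f"project:{chat.project_id}"] if chat.project_id else ["global"]
    # Persona memory stays tied to the active persona (assumed additive).
    if chat.persona_id:
        scopes.append(f"persona:{chat.persona_id}")
    return scopes


def user_evidence(turns: list[dict]) -> list[str]:
    # Only user-authored turns count as evidence for a memory change.
    return [turn["text"] for turn in turns if turn["role"] == "user"]
```

For example, `memory_scopes(ChatContext(project_id="p1", persona_id="nova"))` yields the project scope plus the persona scope, and `user_evidence` drops assistant turns before the memory agent sees them.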

Custom Instructions are treated even more carefully. When Automatic Memory Management proposes a durable behavior change, it reads the existing instructions first, prepares a replacement, and marks the change for review. The user approves it before it affects future replies.
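The read, prepare, and review sequence described above might look like the following minimal sketch. The names `InstructionsState`, `propose_change`, and `review` are hypothetical, and appending a rule on a new line is an assumed way of "preparing a replacement", not the product's actual logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstructionsState:
    active: str                    # what future replies currently follow
    pending: Optional[str] = None  # proposed replacement awaiting review


def propose_change(state: InstructionsState, new_rule: str) -> None:
    # Step 1: read the existing instructions first.
    current = state.active
    # Step 2: prepare a full replacement (assumed: append if not present).
    if new_rule in current:
        return
    replacement = (current + "\n" + new_rule).strip()
    # Step 3: mark it for review; active instructions are untouched.
    state.pending = replacement


def review(state: InstructionsState, approved: bool) -> None:
    # Only an explicit approval promotes the pending text.
    if state.pending is not None and approved:
        state.active = state.pending
    state.pending = None
```

Until `review(state, approved=True)` is called, `state.active` stays exactly as the user left it, which is the property the paragraph above describes.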

That keeps the feature useful without making it mysterious. just4o.chat can help remember, refine, and customize, but durable changes still need to pass through the product's guardrails.

What should feel different

The goal is simple: conversations should feel more grounded without making the model feel over-managed.

A user should be able to pick the model they want for the reply, keep the same memory and workspace around it, and trust that explicit memory requests are being handled in the background. The foreground agent can still use tools when the answer needs them. The background memory agent can still do the quiet maintenance work that makes long-running conversations better.

The model stays present. Memory works behind it. That keeps the best of both systems.