01 — Does it make the output more reliable?

This is the first question for any new feature or architectural change. Not "is it useful", not "do users want it", not "can we ship it by the end of the quarter". Does it actually make the outputs of our system more reliable? If the answer is no, the feature does not ship — regardless of how much we want it or how much pressure there is to build it.
02 — Does it reduce transparency?

The second question is whether the change hides something that should be visible. If a feature requires us to obscure the debate, compress the disagreement, or present a contested conclusion as settled, it does not ship. This applies even when the hidden information might confuse or disappoint users. Confusion from honest uncertainty is better than confidence from manufactured clarity.
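One way to make this principle concrete, offered as a sketch rather than a description of our actual pipeline: the output type itself can preserve each model's position and expose whether the question is still contested, so nothing has to be compressed into a single verdict. The `ModelPosition` and `DebateResult` names below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ModelPosition:
    """One model's stance, kept verbatim rather than summarized away."""
    model_id: str
    conclusion: str
    confidence: float  # 0.0-1.0
    reasoning: str


@dataclass
class DebateResult:
    """The shipped output: every position plus an honest contested flag."""
    question: str
    positions: list[ModelPosition] = field(default_factory=list)

    @property
    def contested(self) -> bool:
        # A conclusion counts as settled only if every model actually agrees.
        return len({p.conclusion for p in self.positions}) > 1

    def render(self) -> str:
        # Surface the disagreement instead of averaging it away.
        header = "CONTESTED" if self.contested else "AGREED"
        lines = [f"[{header}] {self.question}"]
        for p in self.positions:
            lines.append(f"- {p.model_id} ({p.confidence:.0%}): {p.conclusion}")
        return "\n".join(lines)
```

The design choice this illustrates is that honesty lives in the data structure, not in a presentation layer that could later be tuned to look more confident than the models actually are.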
03 — Would we be comfortable publishing the methodology?

Every scoring algorithm, every prompt, every orchestration decision should be something we are willing to describe publicly and defend. If we find ourselves designing something we would not want to explain, that is a signal that we are building for appearance rather than for substance. We stop and redesign.
04 — Does it advantage any single model unfairly?

We are not in the business of ranking AI models. Our system should be neutral between providers and give each model the best possible conditions to contribute its genuine perspective. Any change that systematically advantages one provider — even if that provider has a commercial relationship with us — requires explicit justification that it serves users' interests, not ours.
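A minimal sketch of what enforcing that neutrality could look like in code; `score_response` and `audit_neutrality` are hypothetical names, not our actual implementation. The point is twofold: the scorer's signature simply has no provider parameter, so provider identity cannot leak into a score even by accident, and a separate audit flags any provider whose scores drift from the rest, which is the trigger for the explicit justification the principle demands.

```python
from statistics import mean


def score_response(text: str) -> float:
    """Hypothetical scorer: it sees only the response text.

    Provider and model identity are deliberately absent from the
    signature, so they cannot influence the score.
    """
    if not text.strip():
        return 0.0
    score = 0.25                      # baseline for a non-empty answer
    if "http" in text:                # cites a source
        score += 0.5
    if "uncertain" in text.lower():   # acknowledges uncertainty
        score += 0.25
    return min(score, 1.0)


def audit_neutrality(scores_by_provider: dict[str, list[float]],
                     tolerance: float = 0.05) -> list[str]:
    """Flag providers whose mean score drifts from the overall mean.

    A persistent gap is not proof of unfairness, but it is the signal
    that requires an explicit, user-interest justification.
    """
    overall = mean(s for scores in scores_by_provider.values() for s in scores)
    return [
        provider
        for provider, scores in scores_by_provider.items()
        if abs(mean(scores) - overall) > tolerance
    ]
```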