Sourcegraph Cody: how to review AI suggestions when codebase context creates false confidence
Sourcegraph Cody is an AI coding assistant that indexes your entire codebase — not just the open file, but the full repository, including your internal APIs, shared utilities, naming conventions, and architectural patterns. When you ask Cody a question or accept a completion, it draws on that indexed context to generate output that references your actual function names, your actual import paths, and the patterns your team already uses.
This is Cody’s genuine advantage over tools that only see the current file. It is also its specific review problem. When AI output mirrors your own codebase back at you — same variable naming, same idioms, same module structure — the familiarity response fires before evaluation does. Code that looks like your code is not the same as code that is correct. But the two are hard to separate under normal working conditions, and Cody’s design creates three distinct points where that separation is most likely to fail.
Why Cody creates a different attention problem than other AI assistants
Most AI coding tools generate code that is generically plausible — syntactically correct, structurally common, but recognizably “AI-flavored” in ways that are easy to notice. Cody generates code that is specifically plausible: it calls your real helper functions, imports your real module paths, and follows the patterns a new team member would learn by reading your codebase for a week. The output doesn’t feel like AI output. It feels like code someone on your team wrote.
That familiarity creates a trust transfer that doesn’t exist with tools working from a blank-slate context. When Tabnine suggests an unfamiliar pattern, you notice it as a suggestion. When Cody suggests a pattern that matches your team’s established conventions, it is harder to hold it at arm’s length as something that still needs evaluation. The cognitive stance that code review requires — treat this as untrusted until verified — is much harder to maintain when the output passes the initial “is this ours?” check before the “is this right?” check has run.
The three Sourcegraph Cody attention traps
1. The codebase-mirror confidence bypass
When Cody uses @codebase context to answer a question, it retrieves relevant excerpts from your repository and generates its response in relation to that retrieved content. The response will reference your real function names, your real data models, your real configuration patterns. Reading it produces recognition rather than evaluation: “yes, that’s how we do auth” or “right, that’s the service layer pattern.” Recognition is not the same as correctness verification.
Cody may retrieve the right codebase context but still generate a suggestion that applies it incorrectly — calling a function with the right name but the wrong argument order, using the right pattern but in a context where an invariant that pattern depends on doesn’t hold, or referencing a utility that was correct at index time but has since been refactored. The familiarity of the output suppresses the scrutiny that would catch any of these. You are reading for recognition when you should be reading for correctness.
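To make that concrete, here is a minimal sketch of the failure using invented names (nothing here comes from a real codebase): a helper whose parameters are both strings, so a swapped argument order compiles cleanly and reads as perfectly familiar.

```typescript
// Hypothetical helper as it exists in the repository.
// Both parameters are strings, so the type checker cannot catch a swap.
function createAuditLog(action: string, userId: string): string {
  return `[audit] action=${action} user=${userId}`;
}

// What a codebase-aware suggestion might produce: the real function name,
// the familiar call shape, but the arguments in the wrong order.
const suggested = createAuditLog("user-4821", "delete-project");

// What the call should have been.
const correct = createAuditLog("delete-project", "user-4821");

console.log(suggested); // [audit] action=user-4821 user=delete-project  (wrong, but compiles)
console.log(correct);   // [audit] action=delete-project user=user-4821
```

Nothing in that suggested call trips the recognition check: the name is right, the types line up, and the result looks like a line your team could have written.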
2. Inline context panel authority bleed
Cody’s chat panel can show the context it used to generate a response: the file excerpts it retrieved, the symbols it resolved, the relevant code it considered. This context panel is a genuine transparency feature — it makes Cody’s reasoning visible in a way most AI tools don’t. But the presence of real codebase content in the same panel as the generated suggestion creates an authority transfer problem.
When you see a snippet from your own auth/middleware.ts displayed next to a Cody suggestion about authentication, the real snippet’s authority transfers implicitly to the suggestion. “Cody based this on our actual middleware” becomes a reason to trust the output rather than verify it. The retrieved context tells you what Cody looked at. It does not tell you whether Cody understood it correctly, applied it correctly, or avoided the edge cases that your team’s code handles in ways that aren’t visible in a brief excerpt.
Showing the context is good. Treating shown context as a validation of the output is the failure mode.
3. The /commands-as-review substitution
Cody provides slash commands for specific tasks: /explain to describe what a block of code does, /fix to correct a problem, /test to generate tests, /doc to add documentation. These commands feel like review actions because they ask Cody to analyze existing code rather than generate new code from scratch. Running /explain on a function produces a description that reads like understanding. Running /test produces test cases that look like coverage.
Neither is review. /explain generates a natural-language description of what Cody believes the code does — if the code has a subtle bug, the description will describe the buggy behavior as intended behavior. /test generates tests that match Cody’s model of how the code works — which means the tests will pass against the code they were generated from, including the edge cases the code mishandles. AI-generated tests applied to AI-generated code confirm the model’s assumptions rather than testing against an independent standard of correctness. The result is coverage without verification.
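A small sketch of that circularity, with an invented pagination helper (none of these names come from a real codebase or from Cody’s actual output): the code has an off-by-one bug, and a test generated from the code’s own behavior asserts the bug as the expected result.

```typescript
// Hypothetical helper: callers pass a 1-based page number, but the slice
// math treats it as 0-based, so page 1 silently skips the first `size` items.
function getPage<T>(items: T[], page: number, size: number): T[] {
  return items.slice(page * size, (page + 1) * size); // bug: should be (page - 1) * size
}

// The kind of test a generate-tests command tends to produce: it asserts
// what the code currently returns, so the off-by-one is locked in as "expected".
const items = ["a", "b", "c", "d", "e", "f"];
console.assert(
  JSON.stringify(getPage(items, 1, 2)) === JSON.stringify(["c", "d"]),
  "page 1 of size 2"
); // passes -- but page 1 of size 2 should have been ["a", "b"]
```

The test suite goes green, and the green suite is exactly what makes the bug harder to see later.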
Three fixes
Check the retrieved context before reading the suggestion
When Cody shows context it retrieved to generate a response, read the retrieved snippets before reading the suggestion. Ask whether the context it retrieved is actually the right context: is this the current version of the function, or a version from before a recent refactor? Is this the shared utility, or a similar-but-different function in a different module? Does the retrieved snippet include the invariants the suggestion depends on, or does it show a happy path that hides a constraint?
If the retrieved context is wrong or outdated, the suggestion built on it is wrong regardless of how fluent it looks. Reading context quality before suggestion quality prevents the authority transfer from firing. It also catches the case where Cody retrieved the right file but the wrong portion — a common failure mode when the relevant invariant lives in a different function than the one retrieved.
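As a sketch of what that comparison looks like (invented names again), picture a retrieved excerpt that reflects the pre-refactor function while the current file has a different signature and a different error contract. A suggestion built on the excerpt is wrong on both counts.

```typescript
// What the retrieved excerpt shows (pre-refactor):
//   export function getUser(id: string): User   // throws if the user is missing

// What the current file actually contains (post-refactor):
interface User {
  id: string;
  name: string;
}

// Lookup is now scoped to an organization and returns null instead of throwing.
export function getUser(id: string, orgId: string): User | null {
  return null; // stub for illustration only
}

// A suggestion built on the stale excerpt calls getUser(id) and assumes it
// can never return null -- both assumptions fail against the current code.
```

Reading the excerpt first, with the current file open beside it, is what surfaces that gap before the fluent suggestion gets a chance to look right.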
Read imports and call signatures, not just logic
Cody’s codebase awareness makes it good at importing real modules and calling real functions by name. It is less reliable about argument order, optional parameter defaults, and the subtle differences between two similarly named utilities. Before accepting any Cody suggestion that makes function calls, check each call against the actual function signature — not the name (which will be correct) but the argument list, the return type, and whether the calling code handles errors the callee can throw.
This is the check that the familiarity bypass most reliably suppresses. When you recognize the function name, the argument list feels correct by association. Reading the actual signature separately — opening the file, checking the type definition — breaks the association and replaces the familiarity check with a factual one.
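Here is a hedged sketch of the kind of mismatch that check catches, again with invented names: the suggested call uses a recognizable function and omits an options argument whose defaults quietly disable the behavior the caller assumed.

```typescript
interface FetchOptions {
  retries?: number;   // defaults to 0, not 3
  timeoutMs?: number; // defaults to 30_000
}

// Hypothetical utility: the real implementation would retry up to `retries`
// times and abort after `timeoutMs`; the stub just reports what it was given.
async function fetchWithPolicy(url: string, opts: FetchOptions = {}): Promise<string> {
  const { retries = 0, timeoutMs = 30_000 } = opts;
  return `GET ${url} (retries=${retries}, timeout=${timeoutMs}ms)`;
}

// Suggested call: recognizable name, no options object -- so no retries at all.
// Reading the signature and its defaults, not the name, is what surfaces that.
fetchWithPolicy("https://api.example.internal/jobs").then(console.log);
```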
Run /test after review, not instead of review
If you use Cody’s /test command, run it after you have reviewed and accepted the implementation, not as a substitute for reviewing it. Use the generated tests as a starting point and then extend them: add the edge case you identified during review, the error path the generated tests didn’t cover, the concurrent-access scenario the model’s tests ignored. If you identified a specific correctness concern during review and the generated tests don’t address it, add the test yourself.
The useful question is not “did Cody generate tests?” but “do the tests cover the thing I was worried about?” AI-generated tests are a starting point for test coverage, not a completion signal. The completion signal is that your review concern has a test that would fail if the concern were realized — and that test has to come from your review, not from the model’s interpretation of its own output.
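A short sketch of what “extend, don’t accept” looks like in practice, reusing the invented pagination helper from earlier: the generated happy-path test stays, and the review concerns become explicit additional assertions.

```typescript
// Corrected version of the invented helper from the earlier example.
function getPageFixed<T>(items: T[], page: number, size: number): T[] {
  return items.slice((page - 1) * size, page * size);
}

// What a generated test typically covers: the happy path.
console.assert(
  JSON.stringify(getPageFixed(["a", "b", "c", "d"], 1, 2)) === JSON.stringify(["a", "b"]),
  "happy path: first page"
);

// What the review concern adds: the edge cases the generated tests skipped.
console.assert(
  JSON.stringify(getPageFixed([], 1, 2)) === JSON.stringify([]),
  "empty input returns an empty page"
);
console.assert(
  JSON.stringify(getPageFixed(["a", "b", "c"], 5, 2)) === JSON.stringify([]),
  "page past the end returns empty, not undefined entries"
);
```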
Sourcegraph Cody’s codebase indexing is the feature that makes it more useful than tools without repository context. The review problem is exactly that usefulness: code that references your real modules, your real patterns, and your real conventions looks pre-validated in a way that makes the familiarity bypass feel warranted. It is not. Recognition is not review. The retrieved context is not verification. The generated explanation is not understanding. Each of those steps — retrieving context, generating a fluent suggestion, explaining the output — is Cody working from its model of your codebase. The check against your actual intent is something only you can do.
ZenCode — breathing for vibe coders
A VS Code extension that fires a 10-second breathing pause during AI generation gaps. Keeps you in review mode instead of done-signal mode.
Get ZenCode free
Related reading
- Amp Code: how to review AI-generated code when a codebase-indexed CLI agent references your actual functions and patterns
- Bito AI: how to review code when an AI reviewer has already flagged the issues
- Cline AI agent: how to stay in review mode when the agent codes for minutes at a time
- Continue.dev inline edits: how to stay focused when the diff replaces your code
- Tabnine autocomplete: how to catch subtle errors when completions arrive before you finish thinking
- Cursor Composer: how to review AI-generated multi-file edits before you apply them
- What is vibe coding fatigue (and how to fix it)
- Best AI coding tools 2026: review habits compared across 20 tools
- How to review AI-generated code: a practical checklist
- ChatGPT code review: what happens to your judgment when the chat window explains your code
- GitHub Copilot Chat: how to review code when the chat interface explains it for you
- Lovable.dev: how to review AI-generated app code when everything looks finished
- Qodo Gen: how to review code when AI-generated tests make it feel already verified
- Cursor AI: how to review code when the IDE itself is the AI
- OpenHands: how to review code when an autonomous agent builds the whole feature
- Pieces for Developers: how to review AI suggestions when the tool knows your entire workflow
- GitHub Copilot CLI: how to review AI-suggested terminal commands before running them
- GitLab Duo Code Suggestions: how to review AI suggestions when the CI pipeline makes code feel already approved
- GitHub Copilot code review: how to maintain your judgment when AI reviewer comments arrive in your PR thread
- Firebase Studio: how to review AI-generated full-stack code in Google’s cloud IDE