Pulumi AI: how to review infrastructure code when natural language generates your cloud configuration
Pulumi AI is an AI-powered assistant built into the Pulumi platform that lets you describe cloud infrastructure in plain language and receive working Pulumi programs in return. You type “a load-balanced Node.js application on AWS with an RDS PostgreSQL database and S3 bucket for uploads” and Pulumi AI returns TypeScript, Python, or Go code defining the VPC, subnets, security groups, EC2 instances, load balancer, RDS cluster, IAM roles, and S3 bucket, ready to pass to pulumi up. The output is not a template or a scaffold — it is a complete, runnable Pulumi program, generated from the description you provided.
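For a sense of scale, a trimmed sketch of the kind of TypeScript a prompt like that produces. This is illustrative, not actual Pulumi AI output; resource names and sizes are assumptions, and a real generated program would also define the VPC, subnets, load balancer, and IAM roles.

```typescript
import * as aws from "@pulumi/aws";

// Illustrative sketch only -- not actual Pulumi AI output.
// An uploads bucket for the application.
const uploads = new aws.s3.Bucket("app-uploads");

// A PostgreSQL instance; note the defaults a generator tends to pick.
const db = new aws.rds.Instance("app-db", {
    engine: "postgres",
    instanceClass: "db.r6g.large", // sized for production, not a prototype
    allocatedStorage: 100,
    multiAz: true,                 // doubles the instance cost
    username: "app",
    password: "change-me",         // a real program should use a Pulumi secret
});

export const bucketName = uploads.id;
export const dbEndpoint = db.endpoint;
```

Running pulumi up on a program of this shape provisions everything it declares, which is exactly why the review traps below matter.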
That makes Pulumi AI a qualitatively different tool from AI coding assistants that generate application logic. Infrastructure code provisions real cloud resources, incurs real costs, creates real security surfaces, and manages real state that must stay synchronized with what is actually deployed. The review dynamics are distinct from reviewing a generated React component or a Python function: the consequences of merging without review run in your cloud account, generate invoices, and can persist indefinitely. Understanding the specific traps that Pulumi AI’s generation model creates does not make it less valuable — AI-generated infrastructure dramatically accelerates initial provisioning and reduces the time to a working reference configuration. The traps lie where the generation model’s scope ends and human review must begin.
The three Pulumi AI review traps
1. Cost blindspot in generated resource configurations
Pulumi AI generates resource configurations that work. Working, for a generated infrastructure program, means that the resources are correctly defined, correctly connected, and that pulumi preview produces a plan without errors. Cost does not enter the generation model’s objective. The model produces configurations that satisfy the architectural description, not configurations that satisfy the architectural description at the minimum cost consistent with the requirements.
The concrete patterns that surface in generated code: multi-AZ database clusters when a single-AZ development instance would suffice; db.r6g.large RDS instances when db.t3.micro is adequate for the described load; 35-day retention windows where 7 days was intended; NAT gateways provisioned in every availability zone when one would handle the described traffic; S3 Intelligent-Tiering enabled by default without evaluating whether the access pattern makes it more expensive than Standard; CloudWatch log groups with no retention policy set, which defaults to indefinite retention and accumulates charges silently. Each individual choice is defensible. A multi-AZ RDS instance is a reasonable default for production. The problem is that Pulumi AI applies these defaults uniformly regardless of whether your description was for a production environment or a development prototype, and pulumi preview does not surface estimated cost as part of the plan output.
The consequences arrive on the monthly bill rather than in the terminal. A generated “simple staging environment” that provisions a multi-AZ RDS cluster, three NAT gateways, and a large EC2 instance can cost $400–$800 per month. That is not visible in the diff, not visible in pulumi preview, and not flagged during code review if reviewers are focused on correctness rather than cost. The fix: before reviewing generated infrastructure code for correctness, read it for cost. For every resource definition, identify the instance type, the replication factor, the storage size, and any retention or lifecycle configuration. Cross-reference with the environment description — production, staging, or development — and verify that the generated defaults match the cost tolerance for that environment. Run pulumi preview with cost estimation tooling if available in your Pulumi organization tier. For prototypes and staging environments, explicitly downgrade instance types and disable multi-AZ before merging.
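The cost pass can be partly mechanized by scanning the machine-readable plan that pulumi preview --json emits. A minimal sketch, assuming a simplified step shape (op, newState.type, newState.inputs) projected from the preview JSON; the acceptable-instance-class list is an assumption you would tune to your own cost tolerance:

```typescript
// Sketch of a cost pre-check over steps parsed from `pulumi preview --json`.
// The simplified step shape below is an assumption; verify the exact JSON
// layout against your Pulumi CLI version.

interface PreviewStep {
    op: string; // "create", "update", "same", ...
    newState?: { type: string; inputs: Record<string, unknown> };
}

// Instance classes considered acceptable for a dev/staging stack --
// an assumption for this sketch, not a recommendation.
const DEV_OK_CLASSES = new Set(["db.t3.micro", "db.t3.small", "db.t4g.micro"]);

function costRedFlags(steps: PreviewStep[]): string[] {
    const flags: string[] = [];
    for (const s of steps) {
        if (s.op !== "create" || !s.newState) continue;
        const { type, inputs } = s.newState;
        if (type === "aws:rds/instance:Instance") {
            if (inputs["multiAz"] === true) flags.push("RDS multi-AZ enabled");
            const cls = inputs["instanceClass"];
            if (typeof cls === "string" && !DEV_OK_CLASSES.has(cls))
                flags.push(`RDS instance class ${cls} above dev baseline`);
        }
        if (type === "aws:cloudwatch/logGroup:LogGroup" &&
            inputs["retentionInDays"] === undefined)
            flags.push("log group with indefinite retention");
        if (type === "aws:ec2/natGateway:NatGateway")
            flags.push("NAT gateway (roughly $32/month each before data charges)");
    }
    return flags;
}
```

A check like this does not replace the manual cost read; it only guarantees the obvious red flags cannot slip through a review focused on correctness.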
2. IAM over-permissioning in generated roles and policies
Pulumi AI generates IAM roles and policies that allow the defined resources to perform the operations they need to perform. The generation model resolves permission uncertainty by granting more rather than less: if the generated EC2 instance might need to read from S3, the generated IAM policy grants s3:*; if the Lambda function might need to write to DynamoDB, the policy grants dynamodb:*; if cross-account access is involved, the trust policy may include a wildcard on the principal rather than a specific account ID. The generated code does not error. The resources can do everything they need to do. They can also do considerably more.
The specific patterns to look for in generated IAM output: action wildcards (s3:*, ec2:*, sts:*) where scoped actions (s3:GetObject, s3:PutObject) would satisfy the described requirement; resource wildcards ("Resource": "*") where specific resource ARNs would suffice; trust policies with overly broad principals, including "AWS": "*" or conditions missing entirely; missing permission boundaries on roles that are assumed by AI-generated code; absence of condition keys that would restrict usage to specific VPCs, IP ranges, or MFA state. These patterns produce code that passes pulumi preview and successfully deploys — the resources are valid IAM configurations, just not minimal ones.
The security surface that over-permissioned IAM creates is not theoretical. A Lambda function with s3:* on "Resource": "*" can read from any bucket in the account, including buckets containing secrets, backups, and other services’ data stores. An EC2 instance with iam:* can create new IAM roles and escalate privileges. A compromised resource with excessive permissions becomes an account-level security incident rather than an isolated resource-level one. The fix: review every generated IAM role and policy as a separate pass after reviewing infrastructure architecture. For each action wildcard, enumerate the specific actions the resource actually requires and replace the wildcard. For each resource wildcard, replace it with specific ARNs where the generated code makes the resource identifiers available. Check every trust policy’s principal and conditions. This pass is independent of correctness review: an over-permissioned policy is not a bug, so reading the code for correctness will not surface it.
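The mechanical part of this pass, finding the wildcards, can be sketched as a small linter over the standard AWS policy JSON format. The Statement shape here is simplified (it omits Principal, Condition, and NotAction, which the manual review still has to cover):

```typescript
// Sketch of a wildcard linter for IAM policy statements, using the
// standard AWS policy document fields. Simplified: Principal, Condition,
// and NotAction are out of scope and still need manual review.

interface Statement {
    Effect: "Allow" | "Deny";
    Action: string | string[];
    Resource: string | string[];
}

function asList(v: string | string[]): string[] {
    return Array.isArray(v) ? v : [v];
}

function policyWarnings(statements: Statement[]): string[] {
    const warnings: string[] = [];
    for (const st of statements) {
        if (st.Effect !== "Allow") continue; // wildcards in a Deny narrow, not widen
        for (const action of asList(st.Action)) {
            // "s3:*" grants every action in the service; a bare "*" grants everything
            if (action === "*" || action.endsWith(":*"))
                warnings.push(`action wildcard: ${action}`);
        }
        for (const res of asList(st.Resource)) {
            if (res === "*")
                warnings.push(`resource wildcard on ${asList(st.Action).join(", ")}`);
        }
    }
    return warnings;
}
```

A zero-warning result means only that the obvious wildcards are gone; whether the remaining scoped actions are actually minimal is still a human judgment.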
3. State drift from modified generated configurations
Pulumi AI generates a program. You run pulumi up and the program provisions the infrastructure. Then you modify the program — adjusting an instance type, adding a tag, changing a security group rule, renaming a resource — to fit your actual requirements rather than the generated defaults. This is the expected workflow; generated output is a starting point, not a final configuration. The trap is that Pulumi’s state model tracks the relationship between program resources and deployed cloud resources by the resource’s logical name in the program. When generated code names resources generically — webServer, database, securityGroup — and you rename them to match your naming conventions, or when you restructure generated component hierarchies to match your project layout, Pulumi’s state file loses the mapping between program names and cloud resource IDs.
The visible symptoms: pulumi up plans to delete the old resource and create a new one rather than updating the existing one, because it cannot match the renamed program resource to the deployed cloud resource. For stateful resources — RDS instances, EBS volumes, ElastiCache clusters — this means unintended data loss. For resources with dependencies — security groups referenced by multiple instances, IAM roles referenced by multiple services — a deletion cascade can affect resources the generated code did not touch. The error does not appear during code review; it appears when you run pulumi up in the environment where the original generated code was already deployed, and the plan shows unexpected replacements.
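Unexpected replacements of stateful resources can also be caught mechanically before pulumi up runs. A sketch, again over a simplified step shape projected from the preview JSON (the stateful-type list is an assumption to extend for your stack):

```typescript
// Sketch: flag destructive plan steps on stateful resource types before
// running `pulumi up`. The Step shape is simplified; in the real preview
// JSON, project the type from the step's oldState/newState.

interface Step {
    op: string; // "create", "delete", "replace", "delete-replaced", "same", ...
    urn: string;
    type: string;
}

// Types where a delete means data loss -- an assumed starter list.
const STATEFUL = new Set([
    "aws:rds/instance:Instance",
    "aws:rds/cluster:Cluster",
    "aws:ebs/volume:Volume",
    "aws:elasticache/cluster:Cluster",
]);

function dangerousSteps(steps: Step[]): Step[] {
    return steps.filter(
        (s) =>
            STATEFUL.has(s.type) &&
            (s.op === "delete" || s.op === "replace" || s.op === "delete-replaced"),
    );
}
```

A non-empty result is the signal to stop and add aliases or perform explicit state surgery rather than letting the plan proceed.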
The underlying cause is structural: generated code does not know what you have already deployed. If you generated infrastructure code, deployed it, and are now modifying the generated program, the state file is authoritative about the deployed resources. Renaming program resources without aliasing them breaks the state mapping. The fix: before modifying resource names or hierarchies in generated Pulumi code that has already been deployed, run pulumi stack export to inspect what the current state file tracks. Use Pulumi’s aliases resource option to preserve state mappings when renaming program resources. For generated code you have not yet deployed, rename freely. For generated code that is already in state, treat renaming as a migration that requires explicit aliasing or state manipulation, not a simple refactor. Review any generated code modification against the current state file before running pulumi up, and confirm that the preview plan does not contain unexpected replacements of stateful resources.
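The aliasing step can be sketched like this, assuming the generated program named the instance database and you want to rename it to ordersDb (both names, and all the instance arguments, are illustrative):

```typescript
import * as aws from "@pulumi/aws";

// Originally generated as: new aws.rds.Instance("database", { ... }).
// The alias tells the Pulumi engine that "ordersDb" is the same deployed
// resource as the old "database", so `pulumi up` plans an in-place rename
// instead of a delete-and-create that would destroy the database.
const ordersDb = new aws.rds.Instance("ordersDb", {
    engine: "postgres",
    instanceClass: "db.t3.micro",
    allocatedStorage: 20,
    username: "app",
    password: "change-me", // use a Pulumi secret in real code
}, {
    aliases: [{ name: "database" }],
});
```

Once a subsequent pulumi up has run with the alias in place, the state file records the new name and the alias can be removed.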
Reviewing Pulumi AI output without mistaking generation for configuration design
The three traps share a structural cause: Pulumi AI generates code that satisfies the stated architectural description accurately and is silent about the unstated constraints that govern real infrastructure decisions. Cost tolerance is an unstated constraint — the generation model cannot know whether you are building a production workload or a weekend prototype. Least-privilege IAM is an unstated constraint — the model optimizes for a policy that works, not one that is minimal. State consistency is an unstated constraint — the model generates a fresh program without knowledge of what you have already deployed.
Using Pulumi AI well means treating the generated output as a correct architectural starting point and then applying the three review passes that the generation model cannot perform. First, read for cost: for every resource definition, verify that the instance type, replication factor, storage size, and retention configuration match your environment’s cost tolerance. Second, read for IAM scope: for every generated role and policy, replace action wildcards with specific actions and resource wildcards with specific ARNs. Third, read for state impact: if generated code will be applied to an environment that already has Pulumi state, review every resource name and hierarchy change against the current state file before running pulumi up. Each pass is independent and addresses a gap the generation model cannot close.
Related reading: Terraform AI on the same plan-correctness gap in HCL-based IaC, plus community module trust and state operation irreversibility. GitHub Copilot for GitHub Actions on AI-generated workflow code and the permissions scope creep trap in a CI/CD context. OpenAI Codex on the sandbox success as production proxy trap, structurally similar to pulumi preview passing without surfacing cost or security gaps. Semgrep on the rule coverage as security coverage trap — the same pattern of a tool that is accurate within its scope while silent about what falls outside it. Snyk Code on security scanning that finds supply-chain vulnerabilities but does not evaluate IAM policy scope. How to review AI-generated code for the base checklist that applies after the automated tools have run their pass.
Pulumi AI generated the config. ZenCode asks whether you checked the cost and IAM scope.
ZenCode surfaces one concrete review question before you run pulumi up — the check that the generation model, pulumi preview, and standard code review cannot perform for your specific cloud environment.