FinOps Meets IAM: How AWS Finally Delivered Granular Cost Attribution in Bedrock

Let me start with a confession. Over the past eighteen months, my inbox received at least a dozen messages with subject lines like “we need to figure out who is spending so much on AI.” And invariably, the answer I gave was a diplomatic version of: “we can’t really know, but we can estimate by cross-referencing CloudWatch with CloudTrail and then pasting into the billing report.” It wasn’t satisfying for anyone. Finance wanted precise chargeback, tech leads wanted per-team visibility, and I wanted to stop building scripts that were basically glorified spreadsheets.

On April 17, 2026, AWS launched granular cost attribution in Amazon Bedrock. Overnight, the `line_item_iam_principal` column appeared in CUR 2.0, Cost Explorer started accepting IAM principal tags as filtering dimensions, and the tedious conversation about “who is consuming our Claude and Nova budget” finally has a direct answer. This is one of those features that look minor when announced but solve a specific operational pain that had been dragging down enterprise-grade generative AI adoption.

The pre-feature reality

Bedrock launched with a billing model that makes sense for an inference service: you pay per token processed, input and output priced separately, each model with its own rate. That’s fair. The problem wasn’t the billing model; it was how it appeared on the bill.

Before this feature, if you had ten teams sharing the same AWS account and making Bedrock calls, the cost showed up aggregated as a single line item. “Bedrock: $23,457 this month.” That was it. If the CFO asked “how much did the analytics team spend?”, you had to resort to a combination of gymnastics: enable CloudTrail data events (which aren’t free), run queries in Athena against the logs, sum tokens manually by identity, cross-reference with model pricing (which changes), and do the math by hand. Every month. For every team. For every project.

In organizations where I worked, I saw this end in three distinct patterns. Pattern 1: someone created separate accounts per team to get automatic billing. It solved the attribution issue, but created AWS Organizations overhead and disrupted shared prompts and knowledge bases. Pattern 2: a custom gateway was adopted (API Gateway with Lambda over Bedrock) that tagged each call with identity metadata, and later someone built ETL to feed it into internal billing. Functional, but with constant maintenance overhead. Pattern 3: the problem was simply ignored and costs remained diffuse, which was terrible for mature FinOps.

None of the three is good. The new feature makes all of them unnecessary.

Feature overview

Technically, it’s elegant. Bedrock now records, for each inference call (InvokeModel, Converse, Chat Completions), the IAM identity of the principal that made the call. That information goes directly into CUR 2.0 in a column called `line_item_iam_principal`, which holds the full ARN of the caller, whether it’s an IAM user, IAM role, federated session, or assumed role.

Figure: Cost attribution flow by identity

With that, you get two levels of visibility:

Level 1, automatic identity attribution. Without doing anything beyond enabling the option in CUR 2.0, you already see which user or role called and how much they spent. “Alice called Claude Opus 4.6, spent $234 on input tokens.” “app-backend-prod-role spent $1,800 on Claude Sonnet 4.6.” It’s immediate, requires no tagging, no code changes.
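To make Level 1 concrete, here is a minimal sketch of summing Bedrock cost per caller from CUR 2.0 rows. It assumes the export is delivered as CSV, that Bedrock rows carry the product code `AmazonBedrock`, and that cost lives in `line_item_unblended_cost`; the sample ARNs and amounts are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def cost_by_principal(cur_csv_text: str) -> dict[str, Decimal]:
    """Sum Bedrock cost per caller ARN from CUR 2.0 rows in CSV form."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(cur_csv_text)):
        # Assumption: Bedrock line items use this product code.
        if row["line_item_product_code"] != "AmazonBedrock":
            continue
        principal = row.get("line_item_iam_principal") or "(unattributed)"
        totals[principal] += Decimal(row["line_item_unblended_cost"])
    return dict(totals)


# Illustrative rows only; a real export has many more columns.
SAMPLE = """line_item_product_code,line_item_iam_principal,line_item_unblended_cost
AmazonBedrock,arn:aws:iam::111122223333:user/alice,120.50
AmazonBedrock,arn:aws:iam::111122223333:role/app-backend-prod-role,1800.00
AmazonBedrock,arn:aws:iam::111122223333:user/alice,113.50
AmazonS3,,42.00
"""

if __name__ == "__main__":
    for arn, cost in sorted(cost_by_principal(SAMPLE).items()):
        print(f"{arn}: ${cost}")
```

In practice you would more likely run the equivalent `GROUP BY line_item_iam_principal` in Athena, but the logic is the same.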

Level 2, tag-based aggregation. If you attach tags to your IAM users and roles (and the modern pattern is to do that anyway), those tags are captured automatically and become filtering and grouping dimensions. If the role `app-backend-prod` has tags `team=payments`, `env=prod`, `costCenter=CC-101`, you can ask “how much did team=payments spend this month” and the answer comes from Cost Explorer with no additional effort.

IAM principal tags appear in billing with the `iamPrincipal/` prefix to avoid collision with resource tags. You activate these tags as cost allocation tags in the billing console, wait 24 to 48 hours, and the data starts appearing.
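Once the tags are active, the "how much did team=payments spend" question maps to a grouped Cost Explorer query. The sketch below only builds the request and parses the response, so it runs without credentials. It assumes the tag key is addressed as `iamPrincipal/<key>` in `GroupBy`, that the service dimension value is `Amazon Bedrock`, and that group keys come back in Cost Explorer's usual `key$value` form; verify all three against your own account.

```python
from datetime import date


def bedrock_cost_by_tag_request(tag_key: str, start: date, end: date) -> dict:
    """Build GetCostAndUsage parameters grouped by an IAM principal tag."""
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        # Assumption: Bedrock appears under this SERVICE dimension value.
        "Filter": {"Dimensions": {"Key": "SERVICE", "Values": ["Amazon Bedrock"]}},
        # Assumption: principal tags are grouped with the iamPrincipal/ prefix.
        "GroupBy": [{"Type": "TAG", "Key": f"iamPrincipal/{tag_key}"}],
    }


def spend_by_tag_value(response: dict) -> dict[str, float]:
    """Flatten a GetCostAndUsage response into {tag value: total cost}."""
    totals: dict[str, float] = {}
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            # Group keys look like "<tag key>$<tag value>"; empty means untagged.
            value = group["Keys"][0].partition("$")[2] or "(untagged)"
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[value] = totals.get(value, 0.0) + amount
    return totals
```

With boto3 installed and credentials configured, the request would be sent as `boto3.client("ce").get_cost_and_usage(**bedrock_cost_by_tag_request("team", date(2026, 5, 1), date(2026, 6, 1)))`.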

The four access patterns that matter

Where things get interesting is with the real access patterns applications use to talk to Bedrock. AWS categorizes four scenarios, each with nuances:

Scenario 1: IAM user calling directly. The simplest case. The call carries the user’s identity. Billing attributes directly. Tags on the user aggregate automatically.

Scenario 2: application assuming a role. The most common case in production. The application runs with temporary credentials from a role, and every Bedrock call is attributed to the role’s ARN. If you have one role per application, per-application attribution comes for free. If multiple applications share the same role, you need another strategy.

Scenario 3: federation via IdP (Okta, Entra ID, Google Workspace). User logs in via SSO, assumes a role, calls Bedrock. The role ARN and session name are recorded. If your IdP passes session tags (for example, a `team` tag derived from the user’s AD group), those tags flow to billing automatically. It’s powerful and few people use it. But they should.

Scenario 4: gateway pattern. Here’s the gotcha. Many organizations place a gateway (API Gateway + Lambda, or an internal proxy like LiteLLM or similar) between end users and Bedrock. The gateway assumes a single role, and all calls show up attributed to that role. If you don’t do anything, all organizational inference appears in billing as a single expense under the gateway’s role. Terrible for FinOps.

The solution for scenario 4 is per-user session tags. The gateway, when processing the end user’s request, calls `sts:AssumeRole` with session tags that identify the user (`user=alice`, `team=payments`). The resulting session carries those tags, and Bedrock records them in CUR 2.0. Implementing this requires a code change in the gateway, but it’s perhaps the highest-ROI change in the entire adoption plan: you gain maximum granularity without having to rearchitect.
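A minimal sketch of that gateway change, using the boto3 `assume_role` call with its `Tags` parameter. The helper names, the `gw-` session-name prefix, and the choice of `user` and `team` as tag keys are illustrative assumptions, not a prescribed schema. The STS client is passed in so the logic can be exercised with a stub.

```python
import re

MAX_TAG_VALUE = 256  # STS limit on session tag value length


def build_session_tags(user: str, team: str) -> list[dict]:
    """Return the session tags that identify the end user behind the gateway."""
    tags = [{"Key": "user", "Value": user}, {"Key": "team", "Value": team}]
    for tag in tags:
        if not tag["Value"] or len(tag["Value"]) > MAX_TAG_VALUE:
            raise ValueError(f"invalid value for session tag {tag['Key']!r}")
    return tags


def session_name_for(user: str) -> str:
    """Derive a RoleSessionName; STS allows only [\\w+=,.@-], up to 64 chars."""
    cleaned = re.sub(r"[^\w+=,.@-]", "-", user, flags=re.ASCII)
    return f"gw-{cleaned}"[:64]


def assume_role_for_user(sts_client, role_arn: str, user: str, team: str) -> dict:
    """Assume the Bedrock-calling role with per-user session tags attached."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name_for(user),
        Tags=build_session_tags(user, team),
    )
    # Use these temporary credentials to build the bedrock-runtime client.
    return response["Credentials"]
```

In the gateway this would be called as `assume_role_for_user(boto3.client("sts"), role_arn, user, team)`. Note that the role's trust policy has to allow `sts:TagSession` in addition to `sts:AssumeRole`, otherwise the call is rejected.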

Practical implementation steps

If you want to take advantage of this today, the roadmap is relatively short:

First, identify the existing CUR 2.0 export and check whether it includes IAM principal data. If it doesn’t (which is the case for any export created before this feature), you need to create a new export. Existing exports are not updated retroactively, and this isn’t communicated well. I’ve already seen people lose a month waiting for data to appear in the old export.
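One way to check an existing export is to look at the SQL statement that defines it, which lists the selected columns. The sketch below assumes you already have that statement as a string (for example, copied from the Data Exports console); the function name and sample statements are illustrative.

```python
import re


def export_selects_column(query_statement: str,
                          column: str = "line_item_iam_principal") -> bool:
    """Return True if the export's SELECT list includes the given column."""
    match = re.search(r"select\s+(.*?)\s+from\s", query_statement,
                      re.IGNORECASE | re.DOTALL)
    if not match:
        return False
    selected = {name.strip().lower() for name in match.group(1).split(",")}
    return column.lower() in selected


# Illustrative statements, shortened to a few columns.
OLD_EXPORT = "SELECT line_item_usage_type, line_item_unblended_cost FROM COST_AND_USAGE_REPORT"
NEW_EXPORT = ("SELECT line_item_usage_type, line_item_iam_principal, "
              "line_item_unblended_cost FROM COST_AND_USAGE_REPORT")

if __name__ == "__main__":
    print("old export has principal column:", export_selects_column(OLD_EXPORT))
    print("new export has principal column:", export_selects_column(NEW_EXPORT))
```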

Second, define which organizational dimension you want to track. Team? Cost center? Application? Project? End customer (if you’re SaaS)? That determines the tagging schema. I recommend something simple at first: `team`, `env`, `costCenter`. Add more later if needed.

Third, tag existing IAM users and roles. Quick script with the CLI, or via Terraform if you have your infrastructure as code (which you should). Tags only appear in cost allocation after the principal has made at least one Bedrock call, so if you tag but never invoke, nothing shows up.
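If you go the script route, a sketch with boto3's `tag_role` and `tag_user` could look like this. The role and user names and the tag values are placeholders; the IAM client is passed in so the function can be tested with a stub.

```python
def to_iam_tags(tags: dict[str, str]) -> list[dict]:
    """Convert {"team": "payments"} into the Key/Value list IAM expects."""
    return [{"Key": key, "Value": value} for key, value in sorted(tags.items())]


def tag_principals(iam_client,
                   roles: dict[str, dict[str, str]],
                   users: dict[str, dict[str, str]]) -> int:
    """Apply tags to the given roles and users; return how many were tagged."""
    for role_name, tags in roles.items():
        iam_client.tag_role(RoleName=role_name, Tags=to_iam_tags(tags))
    for user_name, tags in users.items():
        iam_client.tag_user(UserName=user_name, Tags=to_iam_tags(tags))
    return len(roles) + len(users)


# Placeholder principals and tag values for illustration.
ROLES = {"app-backend-prod": {"team": "payments", "env": "prod", "costCenter": "CC-101"}}
USERS = {"alice": {"team": "analytics", "env": "dev", "costCenter": "CC-202"}}
```

With credentials configured, `tag_principals(boto3.client("iam"), ROLES, USERS)` applies the tags; the same mapping translates directly to `tags` blocks in Terraform.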

Fourth, activate the tags as cost allocation tags in the Billing console. This is the step most commonly forgotten. Without explicit activation, the tag exists but doesn’t appear in Cost Explorer.
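Activation can also be scripted through the Cost Explorer `UpdateCostAllocationTagsStatus` API, which accepts up to 20 tag keys per call. This sketch assumes the keys must be submitted with the `iamPrincipal/` prefix, as they appear in billing; confirm the exact key names in the cost allocation tags list before running it.

```python
def activation_batches(tag_keys: list[str], batch_size: int = 20) -> list[list[dict]]:
    """Build activation entries, split into batches of at most batch_size."""
    entries = [
        # Assumption: principal tags are activated under the prefixed key.
        {"TagKey": f"iamPrincipal/{key}", "Status": "Active"}
        for key in tag_keys
    ]
    return [entries[i:i + batch_size] for i in range(0, len(entries), batch_size)]


def activate_principal_tags(ce_client, tag_keys: list[str]) -> int:
    """Activate the tags as cost allocation tags; return the number of API calls."""
    batches = activation_batches(tag_keys)
    for batch in batches:
        ce_client.update_cost_allocation_tags_status(CostAllocationTagsStatus=batch)
    return len(batches)
```

Usage would be `activate_principal_tags(boto3.client("ce"), ["team", "env", "costCenter"])`, run from the management (payer) account.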

Fifth, wait 24 to 48 hours. Go to Cost Explorer, group by tag, and confirm data is there.

For gateway scenarios, add a sixth step: modify the gateway code to pass session tags when assuming the role. It’s not complex (it’s a small change in boto3, aws-sdk-js, or equivalent), but it requires security review because session tags appear in CloudTrail and can leak information if poorly chosen.

Current limitations

I’ll be honest: the feature solves 80% of the problem, but the remaining 20% is frustrating.

Cache tokens. Bedrock charges cache-write tokens (when you enable prompt caching) separately from regular tokens. The current feature attributes the total cost per principal, but doesn’t disaggregate by token type. If you want to know “how much did team=analytics spend on cache write vs regular input”, you still need to cross-reference CUR 2.0 with CloudWatch.

bedrock-mantle APIs. Attribution today works for bedrock-runtime (InvokeModel, Converse, Chat Completions). For administrative and fine-tuning APIs that go through bedrock-mantle, not yet. AWS says it’s coming, without a date.

Cross-account attribution. If your organization has Bedrock in a central account and multiple application accounts call via cross-account role, attribution rests with the assumed role, not the end user in the origin account. It’s solvable with session tags, but requires discipline.

Time granularity. Data appears with 24 to 48 hours of delay. If you want near-real-time alerting on team or user consumption, you need to complement with CloudWatch metrics or CloudTrail data events, which are billed separately.
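For the near-real-time case, the usual workaround is to estimate cost yourself from token counts (Bedrock publishes `InputTokenCount` and `OutputTokenCount` metrics in CloudWatch) multiplied by the model's rates. A small sketch of that arithmetic follows; the per-1,000-token prices are hypothetical placeholders, not actual Bedrock pricing.

```python
from decimal import Decimal


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_1k: Decimal,
                  output_price_per_1k: Decimal) -> Decimal:
    """Estimate inference cost from token counts and per-1,000-token prices."""
    input_cost = Decimal(input_tokens) / 1000 * input_price_per_1k
    output_cost = Decimal(output_tokens) / 1000 * output_price_per_1k
    return input_cost + output_cost


# Hypothetical rates for illustration only; look up the real ones per model.
HYPOTHETICAL_INPUT_PRICE = Decimal("0.003")
HYPOTHETICAL_OUTPUT_PRICE = Decimal("0.015")

if __name__ == "__main__":
    cost = estimate_cost(2_000_000, 500_000,
                         HYPOTHETICAL_INPUT_PRICE, HYPOTHETICAL_OUTPUT_PRICE)
    print(f"estimated spend so far: ${cost:.2f}")
```

The estimate will drift from the invoice (discounts, cache tokens, price changes), so treat it as an early-warning signal and reconcile against CUR 2.0 once the data lands.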

These limitations aren’t dealbreakers, but they’re the kind of thing that determines how much value you extract from the feature. Being aware of them helps set expectations with financial stakeholders.

Recommended approach for a greenfield rollout

If I were structuring a FinOps program for generative AI adoption in a mid-sized organization, I would follow this sequence.

I’d start by defining a minimum mandatory tag schema: `team`, `env`, `costCenter`, `application`. Document it in an internal ADR and enforce it via SCP or IAM policy that blocks role creation without the tags. That prevents the classic “we start with tags and in six months the environment is messy” problem.
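As a sketch of the enforcement piece, the policy below denies `iam:CreateRole` whenever a required tag is missing from the request, using the `aws:RequestTag/<key>` condition key with the `Null` operator. It generates one statement per tag because multiple keys inside a single condition operator are ANDed, which would only deny requests missing every tag at once. The function name and `Sid` format are illustrative.

```python
import json


def require_tags_policy(required_tags: list[str]) -> dict:
    """Build a deny policy that blocks role creation without the given tags."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Sid must be alphanumeric, so tag keys are assumed to be too.
                "Sid": f"DenyCreateRoleWithout{tag[0].upper()}{tag[1:]}",
                "Effect": "Deny",
                "Action": "iam:CreateRole",
                "Resource": "*",
                # Null = "true" matches when the tag is absent from the request.
                "Condition": {"Null": {f"aws:RequestTag/{tag}": "true"}},
            }
            for tag in required_tags
        ],
    }


if __name__ == "__main__":
    policy = require_tags_policy(["team", "env", "costCenter", "application"])
    print(json.dumps(policy, indent=2))
```

The same document can be attached as an SCP or as an IAM policy on the principals allowed to create roles. It only checks that the tags are present, not that their values are valid.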

Next, I’d migrate the LLM gateway (if it exists) to pass session tags. It’s the highest-ROI code change. Without it, the Bedrock feature becomes almost useless in any environment using a gateway, which is most enterprise environments.

I’d configure Cost Explorer alerts by tag, with thresholds per team. “If team=payments exceeds $5,000 in a month, send an alert to the team Slack.” That transforms the feature from passive observation into active governance. Cost you see and cost you get alerted about are very different in terms of organizational behavior.
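The threshold logic itself is small. Here is a minimal sketch that assumes you already have month-to-date spend per team as a dictionary (for example, from a Cost Explorer query grouped by the team tag); the team names, amounts, and message format are made up, and delivery to Slack is left out.

```python
def over_threshold_alerts(spend_by_team: dict[str, float],
                          thresholds: dict[str, float]) -> list[str]:
    """Return one alert message per team whose spend exceeds its threshold."""
    alerts = []
    for team, limit in sorted(thresholds.items()):
        spent = spend_by_team.get(team, 0.0)
        if spent > limit:
            alerts.append(
                f"team={team} spent ${spent:,.2f}, "
                f"over the ${limit:,.2f} monthly threshold"
            )
    return alerts


# Illustrative numbers only.
SPEND = {"payments": 5200.0, "analytics": 900.0}
THRESHOLDS = {"payments": 5000.0, "analytics": 2000.0}

if __name__ == "__main__":
    for message in over_threshold_alerts(SPEND, THRESHOLDS):
        print(message)
```

AWS Budgets can do this natively once the tags are active as cost allocation tags; a script like this is mainly useful when you want custom routing or messages.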

Finally, I’d establish a monthly ritual for reviewing AI spend. Not a meeting just to talk about cost, because no one tolerates that. But a review focused on understanding which usage patterns are generating value and which are generating spend without clear return. With granular attribution, this conversation becomes data-backed instead of feelings-backed.

Closing thoughts

Granular cost attribution per IAM principal in Bedrock is one of those features that look small in release notes but change the economics of generative AI adoption at enterprise scale. Without it, GenAI in large organizations was always a political tug-of-war: who pays, who uses, who justifies. With it, it becomes what it should have been from the start: an infrastructure resource with mature FinOps, auditable, attributable.

For anyone building internal AI platforms, this is the kind of capability you should design around. You no longer need to build your own token accounting system. AWS delivered, and delivered well. The rest is adopting it with discipline.

I personally am going to spend next week retiring homegrown scripts that became obsolete. I won’t miss them.

Thiago Souza is a cloud architect working with regulated AWS environments, an AWS Community Builder, and writes about security, architecture, and large-scale operations.

References:

AWS ML Blog: Introducing granular cost attribution for Amazon Bedrock (Apr/2026)
AWS Cloud Financial Management Blog: Track Amazon Bedrock Costs by Caller Identity
AWS Billing Documentation: Using IAM principal for cost allocation
Amazon Bedrock Documentation: IAM principal attribution
