At a glance
If you cannot explain why Azure SQL costs moved, you do not have cost clarity. You have a bill.
Azure billing shows where money landed. It rarely shows who caused it.
You need a simple cost model: direct costs stay with apps; shared costs get split by a usage signal.
Start with an 80/20 split. Then tighten it with telemetry from Query Store and Azure Monitor.
Why this matters
SQL is one of the fastest ways for cloud costs to become a political issue. Platform teams get blamed for “expensive shared services.” App teams get blamed for “bad queries.” FinOps gets stuck in the middle, trying to explain a bill that does not map cleanly to ownership.
The real issue is not that SQL is expensive. It is that SQL is easy to share and hard to attribute. Elastic pools, managed instances, central monitoring, private endpoints, backups, and reservations all mix. The invoice looks clean. The accountability does not.
The three truths you must separate
Billing truth: what Azure billed and where it landed (subscription, resource group, meter).
Operational truth: what actually consumed the resources (CPU, I/O, storage, connections, query patterns).
Accountability truth: who should pay based on your operating model (shared platform vs app-specific use).
Cost clarity is the discipline of reconciling all three truths in a way that is repeatable, explainable, and fair enough that teams accept it. Not perfect. Accepted.
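Reconciling the three truths has one hard requirement: whatever you allocate must add back up to what was billed. A minimal sketch of that check, with illustrative numbers and team names (none of this comes from a real Azure export):

```python
def reconcile(billed: float, allocations: dict[str, float],
              tolerance: float = 0.01) -> bool:
    """Return True if per-team allocations sum back to the billed amount."""
    return abs(billed - sum(allocations.values())) <= tolerance


# Hypothetical month: what Azure billed for a shared pool vs who we charged.
pool_bill = 1200.00
allocated = {"app-a": 700.00, "app-b": 380.00, "platform-base": 120.00}

assert reconcile(pool_bill, allocated)  # accountability must sum to billing
```

If this assertion ever fails, your model is inventing or losing money, and teams will stop trusting the showback.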
Define “platform” vs “app” spend in plain language
Do not let this become a philosophical debate. Write definitions you can put in a chargeback policy and defend in a meeting.
Platform spend
Shared capacity that exists so multiple apps can run (elastic pool, shared managed instance, shared SQL IaaS cluster).
Baseline controls required for the environment (monitoring, security baselines, backups, private networking patterns).
Central operational overhead that is intentionally shared (automation, patching, platform tooling).
App spend
Dedicated database resources purchased for one app (single database, dedicated MI, dedicated SQL VM).
App-specific premium features or scaling choices (zone redundancy, higher tier, extra replicas, aggressive retention).
App-driven consumption inside shared capacity that can be measured and attributed (CPU, reads/writes, storage growth).
A practical spend map for Azure SQL
This table is a starting point. Adjust based on how you run SQL in your estate.
| Cost component | What Azure bills | Usually owned by | How to allocate |
| --- | --- | --- | --- |
| Elastic pool compute | Pool (not per DB) | Platform | Split by per-DB CPU time signal |
| Elastic pool storage | Pool storage | Platform | Split by per-DB data size or growth |
| Single DB compute | Database | App | Direct to app |
| Backup / long-term retention | Backup meters | Shared, but often app-driven | Direct if DB is dedicated; split if shared |
| Log Analytics + diagnostics | Workspace + ingestion | Platform | Split by volume per DB/app if you collect per DB |
| Defender for SQL / security tooling | Per-resource / per-server | Platform | Split evenly or by DB count if signals are weak |
| Private endpoints + DNS | NICs, zones, resolvers | Platform | Treat as platform unless you measure per app |
| Reserved instances / savings plans | Amortized discount | FinOps + platform | Allocate the discount back to the consumers |
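The spend map above is easy to encode as a lookup so your allocation job never guesses. A sketch with hypothetical component keys (these are not Azure meter names):

```python
# Map each cost component to its allocation method from the spend map.
# "metered:<signal>" means split by that per-DB signal.
ALLOCATION = {
    "elastic_pool_compute": "metered:cpu_time",
    "elastic_pool_storage": "metered:db_size",
    "single_db_compute": "direct",
    "backup_ltr_dedicated": "direct",
    "defender_for_sql": "even_split",
    "private_endpoints": "platform",
}


def method_for(component: str) -> str:
    # Unmapped components default to platform-owned, so gaps are visible
    # on the platform bill instead of silently disappearing.
    return ALLOCATION.get(component, "platform")
```

Defaulting unknowns to "platform" is deliberate: it gives the platform team an incentive to keep the map complete.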
Where most orgs lose cost clarity
They try to use billing tags to solve a telemetry problem.
They centralize platform resources but do not design a fair allocation method.
They adopt elastic pools to simplify operations, then act surprised when attribution gets messy.
They apply reservations, savings plans, or hybrid benefits and do not push the savings back to the consuming apps.
If you are using shared SQL capacity, you must plan for allocation. Otherwise, the platform team becomes the dumping ground for everyone else’s consumption.
A simple cost model that scales
Use four buckets. Keep them consistent across your dashboards, exports, and conversations.
App direct: costs billed to an app-owned resource (single DB, dedicated MI, dedicated SQL VM).
App shared: an app’s share of shared capacity (elastic pool, shared MI, shared tooling).
Platform base: baseline services you run, whether apps exist or not (core workspaces, central automation, base networking).
Platform shared: shared services that scale with adoption (pools, shared MI fleets, shared monitoring per DB).
Your reporting should show all four buckets. The goal is not to make the platform spend disappear. The goal is to show what is truly baseline versus what grew because consumption grew.
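The four buckets can be assigned mechanically from two facts about a cost row: does it carry an app tag, and is the resource shared. A sketch, assuming a simplified row shape (the real Cost Management export schema differs):

```python
def bucket(row: dict) -> str:
    """Classify a cost row into one of the four buckets.

    `row` is a hypothetical shape: an optional "app" tag and a "shared"
    flag derived from your platform inventory, not Azure's export schema.
    """
    has_app = bool(row.get("app"))
    shared = bool(row.get("shared"))
    if has_app and not shared:
        return "app_direct"       # app-owned dedicated resource
    if has_app and shared:
        return "app_shared"       # app's slice of shared capacity
    if shared:
        return "platform_shared"  # shared service that scales with adoption
    return "platform_base"        # baseline, exists with or without apps
```

Keeping the classifier this dumb is the point: every row lands in exactly one bucket, and disputes become arguments about tags and inventory, not about the model.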
Allocation strategies that actually work
There are only three methods worth debating. Pick one per cost component. Document it. Then automate it.
1) Direct allocation (the easy wins)
If a resource is dedicated to one app, bill it directly. Do not overcomplicate it.
Use separate resource groups or subscriptions per app where possible.
Enforce required tags (app, environment, owner) with policy and deny when missing.
Keep dedicated databases out of elastic pools unless there is a strong reason.
2) Even split (acceptable when signals are weak)
Sometimes you do not have a clean usage signal. An even split is fine when you are transparent about it.
Shared security tooling costs where per-app telemetry is not feasible
Base networking and name resolution patterns (private DNS zones, resolvers)
Platform automation run costs if every team benefits similarly
3) Metered split (the long-term answer for shared SQL)
For elastic pools and shared managed instances, split costs based on a usage signal that approximates who caused consumption.
A practical signal stack (pick the strongest you can get):
CPU time per database (Query Store or DMVs)
Data reads/writes per database (Query Store, DMVs, or Azure Monitor where available)
Storage footprint or growth per database
Connection count per database (a weak proxy, but sometimes enough)
Then allocate pool cost using a weighted formula. Example:
Compute share per DB = (DB CPU time / total CPU time)
Storage share per DB = (DB size / total size)
Pool cost allocation per DB = (pool compute cost × compute share) + (pool storage cost × storage share)
This is not perfect. It is defensible. And it gives teams a lever: tune queries, reduce I/O, control growth.
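The weighted formula above is a few lines of code. A sketch, assuming you already have per-DB CPU time (e.g. from Query Store) and data size as plain dicts; the signal names are illustrative:

```python
def allocate_pool(pool_compute_cost: float, pool_storage_cost: float,
                  cpu_time: dict, db_size: dict) -> dict:
    """Split pool cost per database using the weighted formula.

    cpu_time: per-DB CPU seconds (e.g. from Query Store or DMVs).
    db_size: per-DB data size (e.g. GB). Both are assumed inputs.
    """
    total_cpu = sum(cpu_time.values())
    total_size = sum(db_size.values())
    allocation = {}
    for db in cpu_time:
        compute_share = cpu_time[db] / total_cpu
        storage_share = db_size[db] / total_size
        allocation[db] = (pool_compute_cost * compute_share
                          + pool_storage_cost * storage_share)
    return allocation
```

Note the useful property: a CPU-heavy database with little data pays mostly for compute, while a cold archive database pays mostly for storage, which matches the levers each team actually controls.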
Data you need (and where it comes from)
You can do this without buying another FinOps tool. You need two feeds: costs and signals.
Cost feed
Azure Cost Management exports (daily) to storage
Amortized view for reservations/savings plans where possible
Consistent tag set on resources (at least app, environment, owner)
Signal feed
Query Store (CPU time, reads, writes, duration) at the database level
Azure Monitor metrics for SQL (CPU %, storage %, workers, sessions)
DMVs for elastic pools and database resource stats (snapshot and trend)
Optional: diagnostic logs to Log Analytics if you want a single query surface
The trick is joining them. Costs are billed per resource and meter. Signals are measured per database. Your allocation layer is the glue.
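The glue layer is a join between resource-level costs and database-level signals. A minimal sketch with illustrative data; the pool name, membership map, and field names are assumptions, not Azure schema:

```python
# Cost feed: billed per resource (here, one elastic pool).
costs = {"pool-prod-01": 1200.00}

# Inventory: which databases live in which pool.
pool_members = {"pool-prod-01": ["db-a", "db-b"]}

# Signal feed: per-DB CPU seconds, e.g. aggregated from Query Store.
cpu_seconds = {"db-a": 900.0, "db-b": 300.0}

# The join: spread each pool's cost across its member databases by signal.
allocated = {}
for pool, cost in costs.items():
    members = pool_members[pool]
    total = sum(cpu_seconds[db] for db in members)
    for db in members:
        allocated[db] = cost * cpu_seconds[db] / total
```

In practice each feed is a daily export (Cost Management to storage, a scheduled Query Store snapshot), but the join itself stays this simple.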
A 30-day adoption plan that does not melt ops
Week 1: Establish the taxonomy
Agree on the four buckets and publish a one-page definition.
Standardize tags: appId, env, ownerTeam, costCenter, dataClass (minimum viable set).
Decide which SQL patterns count as “platform shared” (pools, shared MI).
Week 2: Build the first allocation
Export costs daily and build a simple Power BI or workbook view by subscription and resource group.
Pull a weekly CPU-time signal per database from Query Store or DMVs.
Allocate pool compute using CPU time. Allocate pool storage using DB size.
Publish a first showback report with assumptions in writing.
Week 3: Make it operational
Automate the allocation job (scheduled notebook, function, or dataflow).
Add an exception process: missing tags, unknown owner, shared database edge cases.
Add a monthly review with platform + FinOps + top app owners. Keep it short.
Week 4: Enforce and improve
Enforce required tags on new SQL resources using policy and a paved road template.
Push reservations or savings benefits back into app showback (do not keep the discount centralized).
Add the second-best signal (reads/writes or storage growth) if CPU alone is too noisy.
Pitfalls and sharp edges
Elastic pools hide per-DB billing. You must measure usage to split fairly.
Backup and long-term retention can spike without anyone noticing. Treat retention as an explicit app choice.
Central Log Analytics workspaces can become “platform spend” even when apps generate the logs. Track ingestion by source.
Reservations and savings plans change what “cost” means. Decide whether your showback uses actual, amortized, or blended.
Tagging breaks on day 2 unless you enforce it. Policy is not optional if you want cost clarity at scale.
If your goal is to reduce spend, the most common win is not “optimize SQL.” It is “stop paying for shared ambiguity.” Once teams see what they own, behavior changes.
The operator rule
If you cannot explain an Azure SQL cost change in one minute using data, you do not have cost ownership. You have surprise billing. Fix the model first, then tune the queries.
