Hi @Janice Chi
A fair cost comparison across Databricks Serverless SQL Warehouse, Azure SQL Hyperscale, and ADLS Archive/Cold requires normalizing against three dimensions: compute units, storage charges, and billing granularity.
Compute equivalency (DBUs vs vCores)
- Databricks Serverless SQL Warehouse: Billed in DBUs per hour, pro-rated per second of usage. The X-Small size (6 DBUs/hr) has no direct equivalent in SQL Hyperscale vCores, because Databricks abstracts CPU/memory allocation and manages elasticity internally. Since the execution engines also differ (Spark vs. the SQL Server engine), there is no official 1:1 mapping (e.g., "X-Small = 16 vCores").
- Azure SQL Hyperscale (Serverless): Billed in vCore-seconds, with compute automatically paused/resumed. Here, compute is explicitly tied to vCores + memory per vCore. For comparisons, the recommended approach is to use benchmark workload performance (TPC-H or internal queries) instead of attempting a direct DBU↔vCore equivalence.
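The benchmark-based comparison above can be sketched as follows. This is a minimal illustration only: the runtimes, DBU rate, and vCore rate below are placeholder values, not published prices; substitute your own TPC-H (or internal query) timings and the current list prices for your region.

```python
# Sketch: normalize compute cost via benchmark runtime rather than attempting
# a direct DBU <-> vCore mapping. All figures are illustrative placeholders.

def cost_per_benchmark_run(runtime_seconds: float, rate_per_hour: float) -> float:
    """Cost of one benchmark run at a per-second pro-rated hourly rate."""
    return runtime_seconds / 3600 * rate_per_hour

# Hypothetical: X-Small warehouse = 6 DBU/hr at an assumed $0.70/DBU,
# vs. Hyperscale serverless at 8 vCores and an assumed $0.60/vCore-hr.
dbx_xsmall = cost_per_benchmark_run(runtime_seconds=240, rate_per_hour=6 * 0.70)
hyperscale = cost_per_benchmark_run(runtime_seconds=300, rate_per_hour=8 * 0.60)

print(f"Databricks X-Small, cost per run: ${dbx_xsmall:.4f}")
print(f"Hyperscale 8 vCores, cost per run: ${hyperscale:.4f}")
```

Comparing cost per completed benchmark run keeps the comparison engine-neutral: whichever platform finishes the same workload for fewer dollars wins, regardless of how its capacity units are labeled.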
Storage tiers and costs
- ADLS Archive/Cold: Very low storage cost, but with rehydration latency (Archive) and early deletion penalties. Reads are charged per GB, and retrieval can add noticeable delay.
- Hyperscale: HA replicas and background operations such as write-ahead logging, checkpointing, and backups are built into the service; storage is billed separately for data, log, and backup pages.
- Databricks: Storage cost is externalized to your ADLS account. You pay only for compute (DBUs) plus the underlying ADLS storage tier.
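The Archive-tier trade-off above (cheap storage, paid reads, early-deletion penalty) can be modeled roughly as below. The rates are placeholder assumptions, not Azure list prices, and the early-deletion fee is simplified as a pro-rated charge for the unmet portion of the 180-day minimum; check Azure Storage pricing for your region before relying on the numbers.

```python
# Sketch: estimated total cost of holding data in ADLS Archive, including
# per-GB reads and the early-deletion penalty for data removed before 180 days.
# store_rate and read_rate are illustrative placeholders, not real prices.

def archive_cost(gb_stored: float, gb_read: float, months_retained: float,
                 store_rate: float = 0.002,   # assumed $/GB-month
                 read_rate: float = 0.02,     # assumed $/GB read (incl. rehydration)
                 min_retention_months: float = 6.0) -> float:
    storage = gb_stored * store_rate * months_retained
    reads = gb_read * read_rate
    # Simplified early-deletion fee: pay storage for the months not retained.
    early_delete = 0.0
    if months_retained < min_retention_months:
        early_delete = gb_stored * store_rate * (min_retention_months - months_retained)
    return storage + reads + early_delete

# 1 TB kept 3 months then deleted, with 100 GB read back:
print(f"${archive_cost(1024, 100, 3):.2f}")
```

Running scenarios like this makes the break-even point visible: Archive wins only when data is rarely read and retained past the minimum period.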
Billing models and granularity
- Databricks Serverless SQL Warehouse: Pro-rated per second, with a minimum billing time for active sessions. Startup/idle time may incur cost until the warehouse suspends.
- SQL Hyperscale (serverless): Per-second billing, with no charge when compute is auto-paused.
- ADLS Archive/Cold: Storage is billed per GB/month, with additional per-GB read and rehydration charges. Archive also has early deletion charges if data is deleted before 180 days.
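The billing-granularity difference above is easiest to see with a small calculation. The rates and the usage pattern are illustrative placeholders; the point is how per-second pro-rating with auto-pause compares to paying for provisioned hours.

```python
# Sketch: per-second pro-rated billing with auto-pause (serverless) vs.
# billing for the full provisioned interval. Rates are placeholders.

def serverless_cost(active_seconds: float, rate_per_hour: float) -> float:
    # Per-second pro-rating; paused/suspended time costs nothing.
    return active_seconds / 3600 * rate_per_hour

def provisioned_cost(hours_provisioned: float, rate_per_hour: float) -> float:
    # Provisioned compute bills for the whole interval, idle or not.
    return hours_provisioned * rate_per_hour

# 2 hours of actual query activity spread across a 24-hour day:
print(serverless_cost(2 * 3600, 4.0))   # -> 8.0
print(provisioned_cost(24, 4.0))        # -> 96.0
```

For bursty, intermittent workloads the gap above dominates the comparison; for sustained 24/7 workloads the two models converge and provisioned capacity (or reserved pricing) often comes out cheaper.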
Hidden/indirect costs to account for
- Databricks: Warehouse warm-up time before auto-stop kicks in, and coarser DBU/hr rounding on classic (vs. serverless) warehouses.
- Hyperscale: Background operations (auto statistics, backups, HA replicas) are included but may slightly increase storage cost.
- ADLS Archive/Cold: Rehydration delay, egress/read charges, and early deletion costs.
Best practice for cost modeling
- Normalize workloads by expected query concurrency, frequency, and data scanned per query.
- Compare total monthly TCO, not just per-hour compute: (Compute + Storage + Data Access + HA/replica overhead).
- Use the Azure Pricing Calculator and Databricks pricing documentation with your workload assumptions for an apples-to-apples model.
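The TCO formula above (Compute + Storage + Data Access + HA/replica overhead) can be captured in a small model. Every dollar figure below is a hypothetical placeholder to be replaced with outputs from the Azure Pricing Calculator and Databricks pricing pages for your actual workload.

```python
# Sketch: apples-to-apples monthly TCO comparison across options.
# All amounts are placeholder assumptions, not quoted prices.

from dataclasses import dataclass

@dataclass
class MonthlyTco:
    compute: float       # DBUs or vCore-seconds, converted to $/month
    storage: float       # data + log + backup (or ADLS tier) $/month
    data_access: float   # per-GB reads, rehydration, egress $/month
    ha_overhead: float = 0.0  # replicas/backups if billed on top

    @property
    def total(self) -> float:
        return self.compute + self.storage + self.data_access + self.ha_overhead

# Hypothetical scenario figures for one workload:
options = {
    "Databricks Serverless + ADLS Hot": MonthlyTco(compute=1200, storage=90, data_access=15),
    "SQL Hyperscale serverless":        MonthlyTco(compute=950, storage=140, data_access=0, ha_overhead=120),
    "ADLS Archive (retrieval-only)":    MonthlyTco(compute=0, storage=8, data_access=60),
}

for name, tco in sorted(options.items(), key=lambda kv: kv[1].total):
    print(f"{name}: ${tco.total:,.2f}/month")
```

Ranking by total monthly TCO rather than per-hour compute rate is what keeps the comparison honest, since the cheapest hourly rate is often not the cheapest month.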
References:
- Azure SQL Database serverless compute tier
- Azure Storage pricing (Hot, Cool, Archive)
- Databricks SQL pricing
Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.