Compute Purchasing Options

Every time a virtual machine spins up in an AWS data center, a financial meter begins ticking. The fundamental proposition of cloud computing replaces capital expenditure—the agonizing process of predicting peak loads, ordering hardware, and racking physical servers—with operational expenditure. Yet, this agility introduces a distinct engineering friction. If a solutions architect provisions compute resources without precisely aligning the AWS purchasing model to the structural behavior of the workload, they will bleed capital.

Every cloud instance is a virtual machine provisioned atop physical hardware. The compute purchasing model you choose dictates exactly how and when those physical resources are allocated.

Understanding how to acquire compute capacity is not merely an accounting exercise. It is a core architectural competency. The pricing model dictates the physical availability of your infrastructure, the resilience mechanisms you must engineer, and ultimately, the viability of your business's technical operations.

We must begin with the default state of the cloud. Amazon EC2 On-Demand Instances serve as the baseline pricing model for Amazon EC2. When you provision an On-Demand Instance, you are asking AWS to allocate physical CPU and RAM to your workload right now, and to keep it there until you explicitly terminate it.

On-Demand Instances require no long-term commitment and no upfront payment. Instead, you pay for compute capacity by the hour or by the second, and only for the time the instance actually runs.
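
The billing arithmetic can be sketched in a few lines. This is an illustrative estimate only: the hourly rate is a made-up figure, and the sketch assumes per-second billing carries a 60-second minimum while hourly billing rounds a partial hour up.

```python
import math

# Hypothetical On-Demand rate in USD per hour, for illustration only.
HYPOTHETICAL_RATE = 0.096


def on_demand_cost(hourly_rate: float, seconds_running: int,
                   per_second: bool = True) -> float:
    """Estimate the charge for a single On-Demand instance run.

    Assumptions: per-second billing has a 60-second minimum, and
    hourly billing charges every started hour in full.
    """
    if per_second:
        billable_seconds = max(seconds_running, 60)
        return hourly_rate * billable_seconds / 3600
    return hourly_rate * math.ceil(seconds_running / 3600)


# A 30-minute job under each billing granularity.
half_hour_per_second = on_demand_cost(HYPOTHETICAL_RATE, 1800)
half_hour_hourly = on_demand_cost(HYPOTHETICAL_RATE, 1800, per_second=False)
```

Under these assumptions, the 30-minute job costs half as much with per-second billing as it does with hourly billing, which is why short-lived workloads benefit most from the finer granularity.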

This model charges a premium for absolute flexibility. On-Demand Instances provide the guaranteed continuous availability required by production workloads, making them the de facto choice for systems that must serve live user traffic without interruption. They also suit unpredictable workloads that cannot be interrupted, as well as short-term workloads whose long-term baseline is not yet understood.

Think about the physical reality of an AWS data center. To ensure that millions of customers can launch On-Demand instances instantly, AWS must maintain a massive surplus of unutilized servers. Idle silicon generates no revenue. To extract value from this surplus, AWS sells it at a steep discount.

To guarantee instantaneous On-Demand provisioning for enterprise clients, cloud providers must maintain vast data centers with surplus physical servers. The Spot market monetizes this otherwise idle hardware.

Amazon EC2 Spot Instances utilize unused Amazon EC2 compute capacity. Because AWS wants this hardware doing something, Amazon EC2 Spot Instances offer discounts of up to 90 percent compared to On-Demand pricing.

However, there is a fundamental catch dictated by the laws of supply and demand: when an On-Demand customer needs that hardware, AWS will abruptly take it back. Amazon EC2 provides a two-minute warning before reclaiming a Spot Instance. Because you can lose your underlying hardware at a moment's notice, Spot Instances are inappropriate for critical workloads requiring guaranteed continuous uptime.
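
A workload can react to the two-minute warning by watching for the interruption notice, which EC2 exposes through the instance metadata path spot/instance-action as a small JSON document. The sketch below only parses such a document; the sample payload is illustrative, the function name is hypothetical, and the HTTP polling of the metadata service is deliberately left out.

```python
import json
from datetime import datetime, timezone


def seconds_until_reclaim(notice_body: str, now: datetime) -> float:
    """Return the seconds remaining before the action time in a notice.

    Assumes the notice is JSON with an "action" field and a UTC "time"
    field formatted like 2024-01-01T12:02:00Z.
    """
    notice = json.loads(notice_body)
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    return (deadline - now).total_seconds()


# Illustrative payload, not captured from a real instance.
sample_notice = '{"action": "terminate", "time": "2024-01-01T12:02:00Z"}'
observed_at = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
remaining = seconds_until_reclaim(sample_notice, observed_at)
```

In a real worker, the remaining time would bound how long the process has to checkpoint state, stop accepting new work, and deregister from a load balancer.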

Spot Instance availability is governed by strict economic equilibrium. As On-Demand capacity requests increase, the supply of available Spot instances drops, triggering abrupt instance interruptions.

If we know the server might vanish, how do we architect for it? We build systems that simply don't care if a server dies. Spot Instances are highly suitable for fault-tolerant workloads. If you are operating stateless web servers, where user session data is offloaded to a database like DynamoDB or Redis, they are excellent candidates for Spot Instances. Similarly, batch processing jobs are highly suitable for Spot Instances, because if a node processing a queue of background jobs is terminated, the message simply returns to an SQS queue for another worker to pick up.
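
The reason a lost worker does not lose the job is that a queue message is hidden, not deleted, while a worker holds it. The toy in-memory model below illustrates that idea under simplifying assumptions; ToyQueue and its methods are hypothetical names, not the SQS API, and time is passed in explicitly to keep the example deterministic.

```python
from typing import Dict, Optional, Tuple


class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout."""

    def __init__(self, visibility_timeout: float) -> None:
        self.visibility_timeout = visibility_timeout
        # message id -> (body, time at which the message is visible again)
        self._messages: Dict[str, Tuple[str, float]] = {}

    def send(self, msg_id: str, body: str) -> None:
        self._messages[msg_id] = (body, 0.0)

    def receive(self, now: float) -> Optional[Tuple[str, str]]:
        """Hand out one visible message and hide it for the timeout."""
        for msg_id, (body, visible_at) in self._messages.items():
            if now >= visible_at:
                self._messages[msg_id] = (body, now + self.visibility_timeout)
                return msg_id, body
        return None

    def delete(self, msg_id: str) -> None:
        """Called by a worker only after the job finished successfully."""
        self._messages.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=30.0)
queue.send("job-1", "resize image 42")

first_attempt = queue.receive(now=0.0)    # Spot worker A takes the job
while_hidden = queue.receive(now=10.0)    # nothing visible: A still holds it
# Worker A is reclaimed before calling delete(), so the message reappears.
second_attempt = queue.receive(now=31.0)  # worker B picks it up
queue.delete("job-1")                     # B finishes and acknowledges
```

Because a job can be delivered more than once in this model, the work itself should be safe to repeat.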

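Historically, users could play an active bidding game. Today, AWS sets the Spot price for each capacity pool and adjusts it gradually based on longer-term supply and demand. Users can still define a maximum price they are willing to pay per hour for a Spot Instance, but doing so is optional, and the default maximum is the On-Demand price.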

Designing Resilient Spot Architectures

Architects rarely rely on Spot instances in isolation. Instead, we use Auto Scaling groups to provision both On-Demand Instances and Spot Instances within a single group. Combining On-Demand Instances and Spot Instances within an Auto Scaling group balances cost optimization with baseline capacity—ensuring that even if a massive Spot interruption occurs, your base tier of On-Demand instances keeps the lights on.
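
As a rough sketch, the On-Demand and Spot mix in an Auto Scaling group is expressed as a mixed instances policy. The helper below builds a dictionary shaped like the MixedInstancesPolicy argument of the Auto Scaling CreateAutoScalingGroup call without calling AWS. The template name and instance types are placeholders, and the rounding in expected_split is an assumption made for illustration rather than documented behavior.

```python
import math
from typing import Dict, List, Tuple


def mixed_instances_policy(template_name: str, instance_types: List[str],
                           on_demand_base: int,
                           on_demand_pct_above_base: int) -> Dict:
    """Build a policy that keeps a fixed On-Demand floor under Spot."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": template_name,
                "Version": "$Latest",
            },
            # Several instance types widen the set of Spot pools to draw on.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": on_demand_pct_above_base,
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    }


def expected_split(desired: int, on_demand_base: int,
                   on_demand_pct_above_base: int) -> Tuple[int, int]:
    """Estimate (on_demand, spot) counts; rounding up is an assumption."""
    base = min(desired, on_demand_base)
    above_base = desired - base
    extra = math.ceil(above_base * on_demand_pct_above_base / 100)
    on_demand = base + extra
    return on_demand, desired - on_demand


# Placeholder template name and instance types.
policy = mixed_instances_policy("web-tier-template",
                                ["m5.large", "m5a.large"], 2, 25)
split = expected_split(desired=10, on_demand_base=2,
                       on_demand_pct_above_base=25)
```

With a desired capacity of 10, a base of 2, and 25 percent On-Demand above the base, the sketch yields 4 On-Demand and 6 Spot instances, so a total Spot interruption would still leave 4 instances serving traffic.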

Alternatively, an Amazon EC2 Fleet allows users to launch a mix of On-Demand Instances and Spot Instances using a single API call, defining the exact proportions of each.

When asking AWS to fulfill your Spot request, you must choose an allocation strategy. This tells AWS how to hunt for available capacity:

Capacity-optimized Spot allocation strategy: Provisions instances from the most available Spot capacity pools. Because AWS draws from pools with the deepest reserves of idle hardware, this strategy actively reduces the frequency of Spot Instance interruptions.
Price-capacity-optimized Spot allocation strategy: Provisions instances based on both pool availability and lowest price, providing a delicate balance between budget reduction and interruption resilience.
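
To make the single-call idea concrete, here is a minimal sketch that assembles keyword arguments shaped like those of the EC2 CreateFleet call, including the allocation strategy. It builds the request only and never contacts AWS. The function name and template name are hypothetical, and the sketch limits the strategy to the two discussed above even though the service accepts others.

```python
from typing import Dict

# Only the two strategies described in this section are allowed here.
STRATEGIES = {"capacity-optimized", "price-capacity-optimized"}


def fleet_request(template_name: str, total: int, on_demand: int,
                  strategy: str = "price-capacity-optimized") -> Dict:
    """Describe a fleet with an exact On-Demand and Spot proportion."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unsupported strategy in this sketch: {strategy}")
    if not 0 <= on_demand <= total:
        raise ValueError("on_demand must be between 0 and total")
    return {
        "Type": "instant",
        "LaunchTemplateConfigs": [{
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": template_name,
                "Version": "$Latest",
            },
        }],
        "TargetCapacitySpecification": {
            "TotalTargetCapacity": total,
            "OnDemandTargetCapacity": on_demand,
            "SpotTargetCapacity": total - on_demand,
            "DefaultTargetCapacityType": "spot",
        },
        "SpotOptions": {"AllocationStrategy": strategy},
    }


# Placeholder template name; 20 instances, 5 of them On-Demand.
request = fleet_request("batch-template", total=20, on_demand=5)
```

Switching the strategy argument to capacity-optimized would favor fewer interruptions over price, which may suit jobs that are expensive to restart.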

Once your workloads establish a predictable baseline, paying the On-Demand premium becomes hard to justify. If you know you will run a database 24/7 for the next year, you should negotiate a lease.

Amazon EC2 Reserved Instances require a purchasing commitment of either one year or three years. To maximize your financial efficiency, Reserved Instances can be purchased with an All Upfront, Partial Upfront, or No Upfront payment option. Unsurprisingly, the All Upfront Reserved Instance payment option provides the highest discount, as AWS has your money in hand immediately.
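
Comparing payment options is easier once the upfront fee is spread across the term. The sketch below computes an effective hourly rate and the utilization at which a reservation breaks even against On-Demand. All prices are invented for illustration and do not reflect any real AWS price list.

```python
HOURS_PER_YEAR = 8760


def effective_hourly(upfront: float, hourly: float,
                     term_years: int = 1) -> float:
    """Spread the upfront fee over every hour of the term."""
    return upfront / (HOURS_PER_YEAR * term_years) + hourly


def savings_vs_on_demand(on_demand_hourly: float, upfront: float,
                         hourly: float, term_years: int = 1) -> float:
    """Fractional saving, assuming the instance runs the entire term."""
    ri_rate = effective_hourly(upfront, hourly, term_years)
    return 1 - ri_rate / on_demand_hourly


def break_even_utilization(on_demand_hourly: float, upfront: float,
                           hourly: float, term_years: int = 1) -> float:
    """Fraction of the term the instance must run for the RI to pay off.

    The RI is billed for every hour of the term, while On-Demand is
    billed only for hours actually used.
    """
    ri_rate = effective_hourly(upfront, hourly, term_years)
    return ri_rate / on_demand_hourly


# Hypothetical one-year All Upfront offer against a 0.16 USD/hour rate.
saving = savings_vs_on_demand(0.16, upfront=876.0, hourly=0.0)
break_even = break_even_utilization(0.16, upfront=876.0, hourly=0.0)
```

With these made-up numbers the reservation saves 37.5 percent when the instance runs all year, but only pays off if the instance runs more than 62.5 percent of the time.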

Like On-Demand, Reserved Instances provide the guaranteed continuous availability required by production workloads, but they come in two distinct flavors:

Feature	Standard Reserved Instances	Convertible Reserved Instances
Maximum Discount	Provide discounts of up to 72 percent compared to On-Demand pricing.	Offer lower maximum discounts than Standard Reserved Instances.
Attribute Flexibility	Cannot be exchanged for different instance families.	Allow users to exchange the instance family, operating system, and tenancy attributes.
Secondary Market	Can be sold on the AWS Reserved Instance Marketplace if your needs change.	Cannot be sold on the AWS Reserved Instance Marketplace.

The "Capacity vs. Discount" Trap

A profound source of confusion for engineers is the difference between a billing discount and a physical reservation of hardware. To resolve this, AWS splits RIs into two scopes:

Zonal Reserved Instances: These provide a capacity reservation in a specific Availability Zone. If you buy a Zonal RI, AWS sets aside capacity for you in that Availability Zone. You are guaranteed the space.
Regional Reserved Instances: These apply the discount to usage across any Availability Zone within the selected AWS Region. Because AWS doesn't know which AZ you will launch into, Regional Reserved Instances do not provide a capacity reservation. However, for Linux instances with default tenancy, they offer instance size flexibility within the same instance family (e.g., if you reserve an m5.xlarge, the discount will automatically apply if you spin up two m5.large instances instead).
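
Instance size flexibility is driven by normalization factors, where each size step doubles the footprint. The sketch below uses the factors for a handful of sizes to show how much running usage one reservation can cover; the function name is hypothetical and the table is deliberately partial.

```python
# Normalization factors for a subset of instance sizes.
NORMALIZATION = {
    "small": 1,
    "medium": 2,
    "large": 4,
    "xlarge": 8,
    "2xlarge": 16,
    "4xlarge": 32,
}


def covered_instances(reserved_size: str, reserved_count: int,
                      running_size: str) -> float:
    """How many running instances of one size the reservation covers.

    Both sizes are assumed to belong to the same instance family.
    """
    reserved_units = NORMALIZATION[reserved_size] * reserved_count
    return reserved_units / NORMALIZATION[running_size]


two_large = covered_instances("xlarge", 1, "large")
half_xlarge = covered_instances("large", 1, "xlarge")
```

One xlarge reservation covers two large instances, while one large reservation covers only half of an xlarge instance, leaving the other half billed at the On-Demand rate.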

Tracking RIs, exchanging Convertible RIs, and balancing AZs became an exhausting spreadsheet exercise. In response, AWS released a simpler, vastly more flexible model.

Like RIs, AWS Savings Plans require a term commitment of either one year or three years. However, instead of committing to specific server types, AWS Savings Plans commit a user to a consistent amount of usage measured in dollars per hour (e.g., committing to spend $10.00/hour).

The Mechanics of a Savings Plan: AWS automatically applies Savings Plans discounts to the usage with the highest savings percentage first, maximizing your financial return. AWS bills any compute usage beyond the AWS Savings Plan commitment at regular On-Demand Instance rates.
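
The mechanics can be approximated with a small simulation of one billing hour. This is a simplified model under stated assumptions: each usage item is described by its On-Demand cost and a hypothetical discount fraction, the commitment is consumed at the discounted rate, and the commitment is charged in full even when usage falls short.

```python
from typing import List, Tuple


def hourly_bill(commitment: float,
                usage: List[Tuple[float, float]]) -> float:
    """Total charge for one hour.

    usage holds (on_demand_cost, discount_fraction) pairs. Items with
    the highest discount consume the commitment first, and whatever is
    left uncovered is billed at its On-Demand cost.
    """
    remaining = commitment
    overflow = 0.0
    for od_cost, discount in sorted(usage, key=lambda u: u[1], reverse=True):
        discounted_cost = od_cost * (1 - discount)
        covered = min(remaining, discounted_cost)
        remaining -= covered
        uncovered = 1 - covered / discounted_cost if discounted_cost else 0.0
        overflow += od_cost * uncovered
    return commitment + overflow


# A 10 USD/hour commitment against two hypothetical usage items.
bill = hourly_bill(10.0, [(10.0, 0.2), (10.0, 0.5)])
idle_bill = hourly_bill(10.0, [])
```

In this example the 50 percent item is covered first for 5.00, the remaining 5.00 covers five eighths of the 20 percent item, and the uncovered three eighths is billed at 3.75 On-Demand, for a total of 13.75 instead of 20.00. The idle hour still costs the full 10.00 commitment.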

There are two primary tiers of Savings Plans:

Compute Savings Plans: These are the ultimate "easy button" for modern cloud architectures, offering discounts of up to 66 percent compared to On-Demand rates. Compute Savings Plans provide flexibility regardless of the Amazon EC2 instance family, and flexibility regardless of the AWS Region. Most importantly, Compute Savings Plans apply discount rates to Amazon EC2 usage, AWS Fargate usage, and AWS Lambda usage. If your engineering team decides to migrate a legacy EC2 application into containerized Fargate tasks or serverless Lambda functions, your financial commitment automatically follows the workload.
EC2 Instance Savings Plans: These offer deeper discounts of up to 72 percent compared to On-Demand rates, but mandate a tighter technical constraint. EC2 Instance Savings Plans apply discounts strictly to a selected instance family within a specific AWS Region (e.g., the m5 family in us-east-1). They do, however, allow flexibility for instance size, operating system, and instance tenancy within that chosen instance family.
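
The difference in scope between the two plan types can be expressed as a pair of eligibility checks. The sketch below is a simplification that considers only service, Region, and instance family; the UsageLine type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

COMPUTE_SP_SERVICES = {"ec2", "fargate", "lambda"}


@dataclass(frozen=True)
class UsageLine:
    service: str                       # "ec2", "fargate", or "lambda"
    region: str                        # e.g. "us-east-1"
    instance_family: Optional[str] = None  # only meaningful for EC2


def compute_sp_applies(usage: UsageLine) -> bool:
    """Compute Savings Plans ignore Region and instance family."""
    return usage.service in COMPUTE_SP_SERVICES


def ec2_instance_sp_applies(usage: UsageLine, family: str,
                            region: str) -> bool:
    """EC2 Instance Savings Plans are pinned to one family in one Region."""
    return (usage.service == "ec2"
            and usage.instance_family == family
            and usage.region == region)


legacy_vm = UsageLine("ec2", "us-east-1", "m5")
migrated_task = UsageLine("fargate", "eu-west-1")
```

Here the migrated Fargate task keeps its Compute Savings Plan discount, while an EC2 Instance Savings Plan bought for m5 in us-east-1 would no longer match it.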

Cloud computing is built on multitenancy—your virtual machine shares a physical motherboard, CPU, and RAM with workloads belonging to other AWS customers. A hypervisor enforces strict isolation, but you are still sharing the metal.

In standard multitenant environments, a hypervisor securely partitions shared physical hardware among multiple AWS customers. Dedicated environments bypass this structure to provide exclusive access to the underlying metal.

Certain regulatory frameworks or legacy enterprise software vendors refuse to accept this. To accommodate them, AWS provides two single-tenant options:

Dedicated Hosts: A Dedicated Host is a physical Amazon EC2 server fully dedicated to a single customer's use. Because you control the metal, Dedicated Hosts allow customers to use existing per-socket or per-core software licenses (like legacy Oracle or Microsoft SQL Server licenses). Consequently, Dedicated Hosts provide visibility into the number of physical sockets and cores on the server.
Dedicated Instances: These run in a virtual private cloud on hardware dedicated to a single AWS account. While no other customer's instances will run on that hardware, Dedicated Instances do not provide visibility into the underlying physical server hardware. You simply know your neighbors are your own applications, but you cannot map the virtual machines to physical socket topologies.

Recall the earlier distinction: a billing discount is a coupon; a capacity reservation is a reserved table at a restaurant. Savings Plans and Regional RIs are coupons. If an Availability Zone runs out of physical hardware during a massive regional event, having a Savings Plan will not force AWS to magically manifest a server for you.

If you require an ironclad guarantee that space will exist when you need it (for a heavily anticipated product launch or a disaster recovery failover), you must use On-Demand Capacity Reservations.

On-Demand Capacity Reservations allow users to reserve compute capacity in a specific Availability Zone. Unlike RIs, On-Demand Capacity Reservations do not require a long-term commitment. However, because you are hoarding physical space, On-Demand Capacity Reservations incur standard On-Demand instance rates when not combined with a billing discount. You pay for the reserved space whether you run an instance in it or not.

To eliminate the On-Demand premium while maintaining the reserved space, On-Demand Capacity Reservations can be combined with Savings Plans to reduce the hourly capacity rate. Alternatively, On-Demand Capacity Reservations can be combined with Regional Reserved Instances to reduce the hourly capacity rate. This combination gives you the ultimate enterprise posture: guaranteed physical space coupled with heavily discounted long-term pricing.
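
The cost of holding a reservation can be sketched with simple arithmetic. The rates and discount below are hypothetical, and whether a given discount also covers unused slots is modeled as an explicit flag rather than asserted, since that depends on how the discount matches the reservation.

```python
def reservation_hourly_cost(reserved: int, running: int,
                            on_demand_rate: float,
                            discount: float = 0.0,
                            discount_covers_unused: bool = True) -> float:
    """Hourly charge for a Capacity Reservation and the instances in it.

    Every reserved slot is billed whether or not an instance occupies
    it. The discount fraction stands in for a matching Savings Plan or
    Regional RI (an assumption of this sketch).
    """
    if not 0 <= running <= reserved:
        raise ValueError("running must be between 0 and reserved")
    used_cost = running * on_demand_rate * (1 - discount)
    unused_discount = discount if discount_covers_unused else 0.0
    unused_cost = (reserved - running) * on_demand_rate * (1 - unused_discount)
    return used_cost + unused_cost


# Ten reserved slots at a hypothetical 0.10 USD/hour, four in use.
undiscounted = reservation_hourly_cost(10, 4, 0.10)
discounted = reservation_hourly_cost(10, 4, 0.10, discount=0.5)
```

Without a discount the reservation costs the same as ten running instances even though only four are in use, which is the price of the guarantee.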

When you synthesize these purchasing options, the architectural choices become deeply logical.

Production workloads demand high availability to serve live user traffic without interruption. You cannot drop a customer's shopping cart because AWS reclaimed your Spot capacity for another customer. Therefore, your base production tier must rely on On-Demand Instances or Reserved Instances.

Production architectures rely on the guaranteed hardware access of On-Demand and Reserved Instances to provide high availability across multiple nodes, ensuring that single-point failures do not disrupt service.

Conversely, non-production workloads are generally highly suitable for Spot Instances. Software development environments can typically tolerate sudden instance interruptions; if a developer's compile job fails, they simply restart it. Similarly, software testing environments can typically tolerate sudden instance interruptions. By ruthlessly applying Spot to these non-critical tiers, an architect frees up vast amounts of capital that can be reinvested into engineering innovation.
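
The selection logic described in this section can be condensed into a small decision helper. It is a deliberately coarse sketch: the function name, its three inputs, and the order of the checks are illustrative assumptions, and a real decision would weigh more factors such as licensing and tenancy.

```python
from typing import List


def recommend_purchasing(interruptible: bool, steady_baseline: bool,
                         needs_capacity_guarantee: bool = False) -> List[str]:
    """Map coarse workload traits to the purchasing options above."""
    if interruptible:
        # Fault-tolerant or non-production work can absorb interruptions.
        options = ["Spot Instances"]
    elif steady_baseline:
        # Predictable, always-on usage justifies a term commitment.
        options = ["Reserved Instances or Savings Plans"]
    else:
        # Unpredictable or short-term work that must not be interrupted.
        options = ["On-Demand Instances"]
    if needs_capacity_guarantee and not interruptible:
        options.append("On-Demand Capacity Reservation")
    return options


dev_environment = recommend_purchasing(interruptible=True,
                                       steady_baseline=False)
production_database = recommend_purchasing(interruptible=False,
                                           steady_baseline=True,
                                           needs_capacity_guarantee=True)
```

A development environment lands on Spot, while an always-on production database lands on a term commitment paired with a capacity reservation.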

Ultimately, mastering AWS compute purchasing is about matching the financial model to the physical and behavioral reality of your code. It is an engineering discipline in its own right, and a foundational pillar of a Well-Architected solution.