Saving $200,000 in a year on RDS - ConvertKit year in review
This is the twelfth breakdown I've written at ConvertKit. I wrote four internally before we decided to publish them on our engineering blog starting last August. For this post, I'd like to do a year in review that compares our bill from April 2019 with our bill from April 2020.
- EC2-Instances - $21,253.89 (+38%)
- Relational Database Service - $18,840.28 (-50%)
- EC2-Other - $12,784.41 (+68%)
- S3 - $8,337.49 (-2%)
- Savings Plans for Compute usage - $6,912.00 (-%)
- Support - $5,368.58 (-5%)
Relational Database Service - $18,840.28 (-50%)
RDS is the most exciting change in our AWS bill over the last year. In April 2019, we spent $37,575.58 on RDS. At the time, we had one primary and four replicas. Three replicas were used for the application, and the fourth was used to migrate data from MySQL to our new Cassandra cluster.
RDS accounted for 48% of our AWS bill in April 2019. In April 2020, RDS accounted for 24% of the bill.
We had been chipping away at our RDS costs since the beginning of 2019. In January 2019, we spent $45,616.90 on RDS alone. We made performance improvements in the application anywhere we could, which brought the bill down slowly, but the most significant change happened in July 2019. At that time, we were able to drop our biggest MySQL table, get rid of two replicas, and stop using Provisioned IOPS (PIOPS). Our RDS bill is now 59% cheaper than it was at its peak in 2019.
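As a quick check on those percentages, here is a minimal Python sketch that recomputes them from the dollar figures quoted above. It assumes the January 2019 figure is the 2019 peak; the helper name `pct_change` is just for illustration.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new; negative means a decrease."""
    return (new - old) / old * 100


# Monthly RDS spend, taken from the figures in this post.
rds_jan_2019 = 45_616.90  # assumed to be the 2019 peak
rds_apr_2019 = 37_575.58
rds_apr_2020 = 18_840.28

yoy = pct_change(rds_apr_2019, rds_apr_2020)
from_peak = pct_change(rds_jan_2019, rds_apr_2020)

print(f"April 2019 -> April 2020: {yoy:.0f}%")
print(f"January 2019 -> April 2020: {from_peak:.0f}%")
```

The same helper works for any other line item in the bill, such as EC2-Instances or S3.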
EC2-Instances - $21,253.89 (+38%)
In April 2019, we spent $15,453.05 on EC2 instances. The two biggest increases in the last year came from our Cassandra cluster and our Elastic Stack.
Our Cassandra cluster consists of 18 i3.2xlarge instances that we've reserved. These instances alone cost almost $6,000/month. Cassandra was the most vital piece for reducing our RDS bill. It allowed us to move our most active table to a data store better suited to a very write-heavy workload. Without Cassandra, we could not have removed the extra read replicas, and we would still be using PIOPS.
Our Elastic Stack is mainly composed of 8 i3en.2xlarge instances that cost about $3,600/month. These instances give us 36TB of storage, which is enough for 30 days' worth of logs and APM retention. Gaining ownership of our logs allowed us to increase our log retention and consolidate our monitoring in a single location.
Even though Cassandra and Elasticsearch increased our EC2 costs, we were able to make optimizations and purchase the correct reservations to help bring our overall bill down.
EC2-Other - $12,784.41 (+68%)
Data transfer has been a difficult problem for ConvertKit to solve. In April 2019, we spent $7,600.91 on EC2-Other. Most of that was USE2-NatGateway-Bytes, because we were running backups through our NAT gateway. We solved that issue by using a VPC endpoint for our S3 bucket. In April 2020, we spent $8,276.67 on USE2-DataTransfer-Regional-Bytes. The more data we ingest, the more we'll spend on EC2-Other. We're getting better at preventing data from moving when it doesn't need to, but we still have to be mindful of how we route traffic through our infrastructure. There's a performance benefit to keeping these costs down, too. By finding the most efficient path for the data to flow, you shave small amounts of time off every request. It may not be noticeable at first, but it compounds across the whole infrastructure.
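For readers who haven't set one up, here is a hedged sketch of creating a gateway VPC endpoint for S3 with boto3. It is not our actual configuration: the VPC and route table IDs are placeholders, and the region is assumed to be us-east-2 based on the USE2 prefix in the usage types above.

```python
def s3_gateway_endpoint_params(vpc_id, route_table_ids, region="us-east-2"):
    """Build the arguments for EC2's create_vpc_endpoint call.

    A gateway endpoint adds routes to the given route tables so that
    S3 traffic stays inside the VPC instead of passing through a NAT gateway.
    """
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(vpc_id, route_table_ids, region="us-east-2"):
    """Create the endpoint. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    params = s3_gateway_endpoint_params(vpc_id, route_table_ids, region)
    return ec2.create_vpc_endpoint(**params)


# Placeholder IDs for illustration only.
params = s3_gateway_endpoint_params("vpc-0example", ["rtb-0example"])
print(params["ServiceName"])
```

The route tables passed in should be the ones used by the private subnets that talk to S3; otherwise that traffic keeps going through the NAT gateway.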
S3 - $8,337.49 (-2%)
Even though it's only down 2% from last year, we've made incredible progress in decreasing our S3 bill. One outage hurt the S3 bill for a couple of months before it stabilized again.
In March 2020, our CDN had an outage that made us serve traffic directly from S3 instead of from cache through our CDN. Once an S3 link gets sent out, there is no way for us to change it, so it takes months before the links become mostly stale. As the links become stale over time, the amount of data we serve directly from S3 decreases, and the bill goes down.
Despite that outage, we've done an excellent job suppressing data transfer. In April 2019, TimedStorage-ByteHrs was only 12% of S3 costs; in April 2020, it was 40%. That means even though we're storing 5x more data than we were last year, we're paying less now because our data transfer is down about 56%.
Savings Plans for Compute usage - $6,912.00 (-%)
Purchasing this Savings Plan saved us $2,630.19 in April. This cost did not exist last year.
Support - $5,368.58 (-5%)
- 7% of monthly AWS usage from $10K-$80K - $3,423.61 (-6%)
- This is the cost of only our production account.
- Because our production account costs less now than it did last year, we paid less for support.
- 10% of monthly AWS usage for the first $0-$10K - $1,944.98 (-3%)
- This is the cost of our production account and billing account.
- Because both our billing and production accounts cost less this year than they did last year, support costs less now.
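The two support line items follow a tiered percentage of monthly usage. The sketch below models only the two tiers listed above (10% of the first $10K, 7% from $10K to $80K) for a single account. The $50,000 usage figure is a made-up example, not a number from our bill.

```python
# (upper bound of tier in dollars, rate), using only the tiers named in this post.
TIERS = [
    (10_000, 0.10),  # first $0-$10K at 10%
    (80_000, 0.07),  # $10K-$80K at 7%
]


def support_fee(monthly_usage: float) -> float:
    """Support fee for one account. Usage above $80K is not modeled here."""
    fee = 0.0
    lower = 0.0
    for upper, rate in TIERS:
        if monthly_usage > lower:
            # Only the slice of usage that falls inside this tier is charged at its rate.
            fee += (min(monthly_usage, upper) - lower) * rate
        lower = upper
    return fee


example_usage = 50_000  # hypothetical monthly usage
print(f"${support_fee(example_usage):,.2f}")
```

Because each tier is a percentage of usage, every dollar trimmed from an account's bill also trims 7 to 10 cents from the support charge.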
ConvertKit's AWS bill has increased by 1.2% since April 2019, while our MRR has grown 40%. We have made significant strides from both a stability and cost perspective. By optimizing our database and configuring our CDN, we have saved at least $200,000 in the past 12 months.
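One rough way to sanity-check these closing numbers is to annualize the April-to-April RDS difference and to compare bill growth against MRR growth. This is a back-of-the-envelope sketch using only figures from this post. It is not necessarily how the $200,000 figure was calculated, and it assumes a constant monthly run rate, which the real bill did not have.

```python
rds_apr_2019 = 37_575.58
rds_apr_2020 = 18_840.28

# Annualized difference between the two monthly run rates.
monthly_diff = rds_apr_2019 - rds_apr_2020
annualized = monthly_diff * 12
print(f"RDS run-rate difference: ${monthly_diff:,.2f}/month, ${annualized:,.2f}/year")

# AWS spend relative to revenue: bill up 1.2%, MRR up 40%.
bill_growth = 1.012
mrr_growth = 1.40
spend_per_mrr_change = (bill_growth / mrr_growth - 1) * 100
print(f"AWS spend per dollar of MRR: {spend_per_mrr_change:.1f}%")
```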