Preface
This article is a cheatsheet and a collection of tips and tricks for doing back-of-the-envelope calculations.
Numbers
Data Types to Bytes
Note: keep in mind that these are general estimates. Depending on the language that implements them, the actual size stored in memory may vary.
Data Type | Byte(s) | Explanation |
CHAR | 1-4 bytes | 1 byte is enough to cover all the characters in ASCII and then some (1 byte = 256 possible values). Languages that allow Unicode characters (144,697 character choices) have to allocate more bytes per character, so they may use 2-4 bytes rather than just 1. |
BOOL | 1 byte | True or False state can technically be represented by 1 bit, but the CPU can't address anything smaller than a byte. |
INT | 4 bytes | Integers range over \([-2^{31}, 2^{31} - 1]\), about 4.3 billion distinct values. Each byte has 8 bits, so 32 bits take \(\frac{32}{8} = 4\) bytes. |
BIGINT | 8 bytes | It uses twice the number of bits as INT, which squares the number of representable values rather than doubling it, hence the name BIGINT. Integers in the range \([-2^{63}, 2^{63} - 1]\) can now be stored. |
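FLOAT | 4 bytes | Same number of bytes as INT. Single precision, roughly 7 significant decimal digits. In SQL dialects such as MySQL, a declared precision of 0 to 23 (counted in bits, not decimal places) maps to this 4-byte type. |
DOUBLE | 8 bytes | It uses twice the number of bits as FLOAT, so it's called a DOUBLE (double precision, roughly 15-16 significant decimal digits). FLOAT has accuracy issues because many values cannot be represented exactly in 4 bytes; DOUBLE reduces that rounding error but does not eliminate it. |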
DATETIME | 8 bytes | Contains date and time. A four-byte integer for date packed as `YYYY×10000 + MM×100 + DD` and a four-byte integer for time packed as `HH×10000 + MM×100 + SS`. Some implementations make this more compact (down to 5 bytes in newer versions of MySQL), but 8 bytes is a reasonable estimate for a DATETIME value. |
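As a rough check on the fixed-width rows above, Python's `struct` module reports sizes for standard-sized formats (the `<` prefix), and UTF-8 encoding shows the 1-4 byte range for a single character. This is only a sketch: the `FORMATS` mapping from the table's type names to `struct` format codes is an illustrative assumption, and what a given database or language actually stores may differ.

```python
import struct

# Pair the table's type names with struct format codes (standard sizes, no padding).
# This pairing is an assumption made for illustration.
FORMATS = {"BOOL": "<?", "INT": "<i", "BIGINT": "<q", "FLOAT": "<f", "DOUBLE": "<d"}

SIZES = {name: struct.calcsize(fmt) for name, fmt in FORMATS.items()}


def utf8_len(ch: str) -> int:
    """Number of bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    for name, size in SIZES.items():
        print(f"{name}: {size} byte(s)")
    # ASCII letter, Latin-1 letter, euro sign, emoji
    for ch in ("a", "\u00e9", "\u20ac", "\U0001F600"):
        print(repr(ch), utf8_len(ch), "byte(s)")
```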
Size Tables
Pre-requisites: See the chart here for a quick primer on numbers.
Number (numerical) | Number (english) | Power | Byte |
1 | One | \(10 ^ 0\) | 1 byte |
1,000 | Thousand | \(10 ^ {3}\) | 1 KB (1 kilobyte) |
1,000,000 | Million | \(10 ^ {6}\) | 1 MB (1 megabyte) |
1,000,000,000 | Billion | \(10 ^ {9}\) | 1 GB (1 gigabyte) |
1,000,000,000,000 | Trillion | \(10 ^ {12}\) | 1 TB (1 terabyte) |
1,000,000,000,000,000 | Quadrillion | \(10 ^ {15}\) | 1 PB (1 petabyte) |
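The powers of ten above make storage estimates a matter of counting zeros. A minimal sketch, assuming a hypothetical workload of 100 million rows, each holding a BIGINT, an INT, and a DATETIME (the workload and the `human_size` helper are invented for illustration):

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB"]


def human_size(num_bytes: float) -> str:
    """Format a byte count using the decimal (power-of-1000) units from the table."""
    size = float(num_bytes)
    for unit in UNITS:
        if size < 1000 or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000


# Hypothetical row layout: BIGINT id (8) + INT counter (4) + DATETIME timestamp (8).
ROW_BYTES = 8 + 4 + 8
TOTAL_BYTES = 100_000_000 * ROW_BYTES

if __name__ == "__main__":
    print(human_size(TOTAL_BYTES))
```

The estimate ignores indexes, padding, and per-row overhead, which real storage engines add on top.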
Time Tables
Millisecond (\(10^{-3}_{sec}\)) | \(1_{sec} = 1,000_{ms}\) |
Microsecond (\(10^{-6}_{sec}\)) | \(1_{sec} = 1,000,000_{μs}\) |
Nanosecond (\(10^{-9}_{sec}\)) | \(1_{sec} = 1,000,000,000_{ns}\) |
Time to Seconds
Hour | Day | Month | Year |
\(60 * 60 = 3600\) | \(3600 * 24 = 86400\) | \(86400 * 30 = 2592000\) | \(2592000 * 12 = 31104000\) |
3600 secs | approx. 86k secs | approx. 2.6 million secs (commonly rounded to 2.5 million) | approx. 31 million secs |
ISO8601
Requests
Requests | Requests per second |
\(\text{2.5 million}_{req/month}\) | \(\text{1}_{req/sec}\) |
\(\text{86,400}_{req/day}\) | \(\text{1}_{req/sec}\) |
This is really all you need to know. A 30-day month has about 2.6 million seconds, commonly rounded to 2.5 million, so 1 request per second adds up to roughly 2.5 million requests over the whole month.
Monthly request counts usually range from millions to billions, so this back-of-the-envelope ratio makes conversions easy: divide the monthly count by 2.5 million to get requests per second.
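The conversion can be sketched in a few lines. The function names here are hypothetical, and a 30-day month is assumed, matching the Time to Seconds table:

```python
SECONDS_PER_DAY = 60 * 60 * 24            # 86,400
SECONDS_PER_MONTH = SECONDS_PER_DAY * 30  # 2,592,000 (30-day month)


def daily_to_rps(requests_per_day: float) -> float:
    """Average requests per second for a given daily request count."""
    return requests_per_day / SECONDS_PER_DAY


def monthly_to_rps(requests_per_month: float) -> float:
    """Average requests per second for a given monthly request count."""
    return requests_per_month / SECONDS_PER_MONTH


if __name__ == "__main__":
    # 2.5 billion requests per month is on the order of 1,000 requests per second.
    print(f"{monthly_to_rps(2_500_000_000):.0f} req/sec")
```

These are averages; peak traffic is often several times higher, so capacity estimates usually multiply the result by a peak factor.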
Availability
99.9% availability - three 9s
Duration | Acceptable downtime |
Downtime per year | 8h 45m 57s |
Downtime per month | 43m 50s |
Downtime per week | 10m 5s |
Downtime per day | 1m 26s |
99.99% availability - four 9s
Duration | Acceptable downtime |
Downtime per year | 52m 36s |
Downtime per month | 4m 23s |
Downtime per week | 1m 0.5s |
Downtime per day | 9s |
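The downtime figures follow directly from multiplying a period's length by the unavailable fraction. A minimal sketch with hypothetical helper names; it uses a plain 365-day year, so yearly results can differ from the table by a few seconds depending on the year length assumed:

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # assumption: 365-day year


def downtime_seconds(availability_pct: float, period_seconds: float) -> float:
    """Allowed downtime in seconds for an availability given in percent."""
    return period_seconds * (1 - availability_pct / 100)


def format_duration(seconds: float) -> str:
    """Render a duration as hours, minutes, and whole seconds."""
    total = round(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h {minutes}m {secs}s"


if __name__ == "__main__":
    for pct in (99.9, 99.99):
        for label, period in (("day", SECONDS_PER_DAY),
                              ("week", SECONDS_PER_WEEK),
                              ("year", SECONDS_PER_YEAR)):
            print(pct, label, format_duration(downtime_seconds(pct, period)))
```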
In sequence formula
Overall availability decreases when two components with availability < 100% are in sequence:
Availability (Total) = Availability (Foo) * Availability (Bar)
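For example, two 99.9% components in sequence give roughly 99.8% overall. A minimal sketch, where `serial_availability` is a hypothetical helper that generalizes the formula to any number of components:

```python
from math import prod


def serial_availability(*components: float) -> float:
    """Availability of components chained in sequence (each given as a fraction)."""
    return prod(components)


if __name__ == "__main__":
    # Foo and Bar both at 99.9%
    print(f"{serial_availability(0.999, 0.999):.6f}")
```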
In parallel formula
Overall availability increases when two components with availability < 100% are in parallel:
Availability (Total) = 1 - (1 - Availability (Foo)) * (1 - Availability (Bar))
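For example, two 99.9% components in parallel fail together only 0.0001% of the time, which gives roughly 99.9999% overall. A minimal sketch, where `parallel_availability` is a hypothetical helper that generalizes the formula and assumes component failures are independent:

```python
from math import prod


def parallel_availability(*components: float) -> float:
    """Availability of redundant components in parallel (each given as a fraction).

    Assumes failures are independent: the system is down only when all are down.
    """
    return 1 - prod(1 - a for a in components)


if __name__ == "__main__":
    # Foo and Bar both at 99.9%
    print(f"{parallel_availability(0.999, 0.999):.6f}")
```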
Latency to Remember
These numbers from the System Design Primer are a good reference sheet to remember when designing systems.
Actions | Nanoseconds | Microseconds | Milliseconds |
Blazing Fast | |||
L1 cache reference | \(0.5 ns\) | ||
Branch mispredict | \(5 ns\) | ||
L2 cache reference (L1 is about 14x faster) | \(7 ns\) | ||
Mutex lock/unlock | \(25 ns\) | ||
Main memory reference (about 14x slower than L2 cache, 200x slower than L1 cache) | \(100 ns\) | ||
Very Fast | |||
Compress 1 KB (e.g. with Zippy) | \(10,000 ns\) | \(10 μs\) | |
Send 1 KB over 1 Gbps network | \(10,000 ns\) | \(10 μs\) | |
Read 4 KB randomly from SSD | \(150,000 ns\) | \(150 μs\) | \(0.15 ms\) |
Read 1 MB sequentially from memory | \(250,000 ns\) | \(250 μs\) | \(0.25 ms\) |
Round trip within same datacenter | \(500,000 ns\) | \(500 μs\) | \(0.5 ms\) |
Read 1 MB sequentially from SSD (~1 GB/sec SSD, 4 times slower than RAM) | \(1,000,000 ns\) | \(1,000 μs\) | \(1 ms\) |
Somewhat fast | |||
HDD seek (e.g. 7200 RPM disk drives) | \(10,000,000 ns\) | \(10,000 μs\) | \(10 ms\) |
Read 1 MB sequentially from 1 Gbps network | \(10,000,000 ns\) | \(10,000 μs\) | \(10 ms\) |
Read 1 MB sequentially from HDD | \(30,000,000 ns\) | \(30,000 μs\) | \(30 ms\) |
Send a packet CA -> Netherlands -> CA | \(150,000,000 ns\) | \(150,000 μs\) | \(150 ms\) |
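The per-MB rows in the table scale linearly for rough estimates of larger sequential reads. A minimal sketch, assuming the table's figures and ignoring seeks, caching, and parallelism; the `MS_PER_MB` mapping and the function name are illustrative:

```python
# Sequential read cost per MB, taken from the table above (milliseconds per MB).
MS_PER_MB = {
    "memory": 0.25,
    "ssd": 1.0,
    "network_1gbps": 10.0,
    "hdd": 30.0,
}


def sequential_read_ms(medium: str, megabytes: float) -> float:
    """Estimated time in milliseconds to read `megabytes` sequentially from `medium`."""
    return MS_PER_MB[medium] * megabytes


if __name__ == "__main__":
    # Reading 1 GB (1,000 MB) from each medium
    for medium in MS_PER_MB:
        print(f"{medium}: {sequential_read_ms(medium, 1000) / 1000:.2f} s")
```

Under these assumptions, 1 GB takes about a quarter of a second from memory and about half a minute from an HDD, which is the kind of gap these numbers are meant to make obvious.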