Search Results


16 results matched

Virtual Memory

What is Virtual Memory? Virtual memory is an abstraction that gives each running application its own address space, with a swap file on your hard disk holding the pages that do not fit in physical memory. Memory is structured and managed in two different ways: paging and segmentation.
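To make the paging half concrete: under paging, a virtual address is split into a page number and an offset within that page. The sketch below assumes a 4 KB page size (the common default); `split_address` is a hypothetical helper for illustration, not a real OS API.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE  # the OS page size, typically 4096 bytes

def split_address(vaddr: int, page_size: int = PAGE_SIZE):
    """Split a virtual address into (page number, offset within page)."""
    return vaddr // page_size, vaddr % page_size

# 8195 = 2 * 4096 + 3, so this address lives at offset 3 of page 2
page, offset = split_address(8195, 4096)
```

The page number is what the OS translates (via page tables) to a physical frame or a swap-file slot; the offset is carried through unchanged.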


B-Trees vs. LSM Trees

... usage, most notably SQL databases. With a B-Tree indexing structure, data is written to disk in fixed-size page segments. These page segments are often about 4 KB in size and hold key-value pairs sorted by key.
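Keeping the keys in a page sorted is what makes lookups fast: a binary search within the page replaces a linear scan. Below is a minimal sketch of a single leaf page (the `Page` class is illustrative, not a real storage-engine structure); a real B-Tree would also split pages that exceed the fixed size.

```python
from bisect import bisect_left

class Page:
    """Simplified B-Tree leaf page: keys kept sorted for binary search."""
    def __init__(self):
        self.keys = []
        self.values = []

    def insert(self, key, value):
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value       # overwrite an existing key
        else:
            self.keys.insert(i, key)     # keep keys sorted on the way in
            self.values.insert(i, value)

    def lookup(self, key):
        i = bisect_left(self.keys, key)  # binary search, O(log n) per page
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None

p = Page()
p.insert("b", 2)
p.insert("a", 1)
p.insert("c", 3)
```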


PostgreSQL - a powerhouse relational database

... database engine. It is also incredibly vertically scalable: adding more memory, CPU cores, and disk space gives it a significant boost in capabilities. On average commodity hardware, we can expect the ballpark performance to be as follows:

Read Performance
Index reads can range from 10,000 TPS - 20,000 TPS per CPU core
Complex join queries: around 1,000 - 2,000 TPS
Full table scan: naturally slow, but especially egregious if the records cannot fit into memory

Write Performance
Single-table INSERT INTO: ~5,000 TPS per CPU core
Single-table UPDATE (plus index updates): ~1,000 TPS per CPU core
Multi-table/index writes from complex transactions: ~100 TPS
Bulk operations: ~10,000 TPS

Bottlenecks
PostgreSQL performance on a single node will degrade if any of the following is true:
The size of all data cannot be contained in memory
Disk space is exceeded
Disk I/O: the write performance above is bounded by disk I/O constraints while writing to the WAL
Complex joins or updates to multiple tables/indexes are always going to be a bottleneck
If records exceed 10 million, queries and joins start to stagnate, and full-text search also starts to degrade

ACID Guarantees
PostgreSQL is widely known for its ACID properties: Atomicity, Consistency, Isolation, and Durability.
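The ballpark figures above lend themselves to a quick back-of-envelope check. The sketch below uses the snippet's rough numbers (assumptions, not benchmarks) to compare loading a million rows one INSERT at a time versus via bulk operations.

```python
# Ballpark figures from the text above -- rough assumptions, not benchmarks.
single_insert_tps = 5_000   # single-table INSERT INTO, per CPU core
bulk_tps = 10_000           # bulk operations (e.g. COPY / multi-row inserts)

rows = 1_000_000
seconds_single = rows / single_insert_tps  # one row per transaction
seconds_bulk = rows / bulk_tps             # batched writes
```

Even with these optimistic numbers, row-at-a-time loading takes twice as long, which is why bulk paths matter for large imports.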


Big Data Processing: Batching vs. Streaming

... relies on bounded data. Traditionally, the inputs for batch processing are files stored on disk. This is the case for MapReduce implementations such as Hadoop. These files may come from daily cronjobs or be exported from copies of OLTP (Online Transaction Processing) databases, such as a SQL database for inventory or customer purchases.


Apache Kafka and Event Streaming

... message broker. Log-based message brokers will, as the name implies, append entries to a log on disk, meaning they are durable. This is what allows Kafka (and other log-based message brokers) to replay events that might already have been consumed by another client.
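The replay property follows from the design: consuming an event only advances a per-consumer offset, while the entry itself stays on the log. A toy in-memory sketch (the `Log` class is illustrative; real Kafka persists the log to disk and partitions it):

```python
class Log:
    """Toy log-based broker: an append-only list plus per-consumer offsets."""
    def __init__(self):
        self.entries = []
        self.offsets = {}  # consumer id -> next offset to read

    def append(self, event):
        self.entries.append(event)  # events are only ever appended

    def poll(self, consumer):
        off = self.offsets.get(consumer, 0)
        batch = self.entries[off:]
        self.offsets[consumer] = len(self.entries)
        return batch

    def replay(self, consumer, offset=0):
        # Rewinding is just resetting the offset; the log itself is untouched.
        self.offsets[consumer] = offset

log = Log()
log.append("order-created")
log.append("order-paid")
first = log.poll("billing")      # billing consumer reads both events
log.replay("billing", offset=0)  # rewind to the start of the log
again = log.poll("billing")      # the same events are delivered again
```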


A primer on MapReduce

... not fit into RAM, and a single machine will not be able to hold the entire file on its hard disk either. Therefore, the entire file has to be split into pieces and scattered across multiple machines.
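Once the file is split across machines, each machine can map its own piece independently, and a reduce step merges the partial results. The classic word-count example, sketched in miniature (here the "machines" are just list elements):

```python
from collections import defaultdict
from itertools import chain

def map_phase(chunk):
    # Each machine maps its own piece of the file to (word, 1) pairs.
    return [(word, 1) for word in chunk.split()]

def reduce_phase(pairs):
    # A reducer sums the counts for each word across all machines.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

chunks = ["the quick fox", "the lazy dog"]  # one file, split across machines
counts = reduce_phase(chain.from_iterable(map_phase(c) for c in chunks))
```

In a real MapReduce run, a shuffle step between the two phases routes all pairs with the same key to the same reducer; the single `reduce_phase` call above stands in for that.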


Working with Production at Amazon Retail Website

... can include any kind of metrics for the hosts that are vital to their uptime. For example, low disk space, high CPU usage, or high memory usage can indicate a server just waiting to crash and go down, causing your end users to suffer.


OS 101

... kernel, Linux kernel. Hypervisor: a hypervisor manages hardware resources such as CPU, memory, and disk space as an abstraction across multiple operating systems or (virtual) instances.


2PC - Two Phase Commit and Why it Sucks

... Prepare, Commit), it will record this state change onto the transaction log, which is also on disk. This is to provide a recovery mechanism in the case of failures such as system restarts on the Coordinator host.
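The key discipline is write-ahead: the coordinator logs each state change to durable storage before acting on it, so that after a restart it can read the log back and resume. A minimal sketch, assuming a simple JSON-lines log format (the `Coordinator` class and its methods are illustrative, not a real 2PC implementation):

```python
import json
import os
import tempfile

class Coordinator:
    """Toy 2PC coordinator: each state change is logged to disk before acting."""
    def __init__(self, log_path):
        self.log_path = log_path

    def record(self, txid, state):
        # Append and fsync so the entry survives a crash or system restart.
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"tx": txid, "state": state}) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def recover(self):
        # After a restart, the last logged state per transaction tells the
        # coordinator where to resume (e.g. keep telling participants to commit).
        last = {}
        with open(self.log_path) as f:
            for line in f:
                entry = json.loads(line)
                last[entry["tx"]] = entry["state"]
        return last

log_path = tempfile.NamedTemporaryFile(delete=False).name
c = Coordinator(log_path)
c.record("tx1", "prepare")
c.record("tx1", "commit")
```

Because "commit" was logged before any participant was told to commit, a restarted coordinator that sees it must finish the commit rather than abort.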


Web Development 101

... and core clock frequency has been capped at around 3 GHz for quite some time. You can expand your hard disk space and RAM, but even those have size limits. You can spend $1,000 on a single machine to get state-of-the-art equipment, but spending $10,000 more on that machine for the best-of-the-best isn't going to boost its performance nearly as much as that first $1,000 did.


Seattle Conference on Scalability: YouTube Scalability

... by 20%. Linux sees 5 volumes instead of 1 logical volume, allowing it to schedule disk I/O more aggressively. Same hardware. Eventually did database partitions to spread writes AND reads, partitioned by user.


Comparison Charts of File Storage Formats

... convenient way for developers and end-users, with less importance placed on data size (in memory or on disk). For this reason, these formats are typically human-readable. For example, CSV and TSV are very popular output formats for data analysts who may use programs like Microsoft Excel.


RDBMS Indexing

... for those new entries as well. This means that writes take longer and take up a lot more disk space. Therefore, if you are write-heavy, you might want to avoid over-indexing, or be wary of which fields you index.
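The write amplification is easy to see in miniature: every secondary index is one more structure that must be updated on each insert. A toy sketch (the `Table` class is illustrative; real databases update B-Tree index pages, not dicts):

```python
class Table:
    """Toy table: every secondary index must be updated on each write."""
    def __init__(self, indexed_fields):
        self.rows = []
        self.indexes = {f: {} for f in indexed_fields}

    def insert(self, row):
        self.rows.append(row)
        writes = 1  # the row itself
        for field, idx in self.indexes.items():
            # Each index maps a field value to the row positions holding it.
            idx.setdefault(row[field], []).append(len(self.rows) - 1)
            writes += 1  # one extra structure touched per index
        return writes

t = Table(["email", "city", "age"])  # three indexed fields
cost = t.insert({"email": "a@x", "city": "NYC", "age": 30})
```

One logical insert became four physical writes; dropping an unused index removes one of them.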


Big Data Cheat Sheet

... written to the HDFS (Hadoop Distributed File System) storage layer, whose files ultimately reside on the storage disks of the nodes. YARN is used to manage the nodes in a Hadoop cluster and schedule MapReduce tasks to the nodes appropriately.


Quick Numbers in Software Engineering Cheatsheet

... 1,000,000 ns (1,000 μs; 1 ms). Somewhat fast. HDD seek (i.e. 7200 RPM disk drives): 10,000,000 ns (10,000 μs; 10 ms). Read 1 MB seque
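The 10 ms seek figure above has a useful corollary: it caps how many random reads a spinning disk can serve per second. A quick back-of-envelope check:

```python
SEEK_NS = 10_000_000  # one HDD seek at 7200 RPM, per the table above
NS_PER_SECOND = 1_000_000_000

# If every random read costs a full seek, this is the throughput ceiling.
seeks_per_second = NS_PER_SECOND // SEEK_NS
```

On the order of 100 random reads per second is why random I/O workloads on HDDs fall off a cliff compared to sequential ones, and why SSDs (no seek arm) changed the picture.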


Data stores in Software Architectures

... for objects that are rarely fetched. An alternative approach is to store data on actual hard disks (e.g. Amazon EBS); however, there are major scaling drawbacks to this, as well as resiliency issues.