Search Results


15 matches found for 'sql'

Distributed scaling with Relational Databases

Background A lot of articles talk about how to scale databases. Typically, they cover the purpose and general idea of sharding and replication, but these topics are often explained separately rather than in conjunction.


SQL: Indexing

Introduction As illustrated in this article, indexing is one of the easiest and most effective tweaks you can add to your SQL database. However, indexing might seem like magic, and you might also not be too sure which field to index in the first place.


Data stores in Software Architectures

Use Cases There are many ways to store your data. In this article we'll walk through some examples of data storage in common system designs. Reminder: There is no single best storage choice; the right one may vary heavily depending on things such as access patterns and scale.


Snowflake

Introduction Twitter's Snowflake is an ID generation scheme that tackles all of the requirements below: ID fits under 64 bits ID will be used with distribution in mind (horizontal scale SQL, Cassandra, etc.


Storing passwords into a database

Don'ts Don't put raw passwords in the database Don't put encoded passwords in the database (e.g. Base64) Don't put simple hashed passwords in the database (e.g. MD5, SHA-256) Whys For obvious reasons, putting raw passwords means that the DBA or anyone who has access to the database can steal the passwords.


NoSQL - the Radical Databases

NoSQL NoSQL is a category of databases that aren't relational. For example, MySQL would be a relational database, whereas MongoDB would be a NoSQL database.


Big Data Processing: Batching vs. Streaming

Intro In data processing, we often have to work with large amounts of data. The way in which this data is gathered comes in a few variants: batching, where we aggregate a collection of data (e.g., by hourly time), streaming for data that needs to be processed in real-time, and a unified variant which simply does not distinguish the technical difference between batching and streaming, allowing you to programmatically use the same API for both.


Quick Numbers in Software Engineering Cheatsheet

This is the front... just so you know Preface This article is a cheatsheet and a collection of tips/tricks for doing back of the envelope calculations. Numbers Data Types to Bytes Note: keep in mind that these are general estimates.


Authentications

Authentication Authentication means to verify who you are. Basic Auth Sensitive data required for login is encoded with Base64. Base64 is very easy to decode. Not recommended and probably the least secure authentication method, but easy to implement.


B-Trees vs. LSM Trees

B-Trees Modern databases are typically built on B-Trees or LSM Trees (log-structured merge trees). B-trees are "tried and true" data structures that are popular in database usage, most notably SQL databases.


Design Concepts

In this article, I want to go over some fundamental design concepts that are useful for coming up with system design. Requirements Functional Requirements Describes specific behaviors, e.g. if a URL is generated, it is composed of a Base64-encoded alias Non-functional Requirements Describes architectural requirements i.


RDBMS Optimization

Indexing Probably the easiest tweak to implement. It can usually be done with one SQL command. However, an index should be made based on a good column. For example, if you are frequently querying your rows by timestamp, then the timestamp can be chosen for an index.


Data Sharding: Twitter Posts

Scenario Let's begin with a Twitter-like service that allows you to tweet new posts. The service has very high read and write traffic; we'll say ~10k read TPS (transactions per second) for starters.


Web Development 101

HTTP vs. HTTPS HTTP stands for Hypertext Transfer Protocol. It typically runs on TCP port 80. It is a protocol for sending data through browsers in the form of webpages and such. One major flaw with HTTP is that it is vulnerable to man-in-the-middle attacks.


Atomic operations with Elasticsearch

Preface Elasticsearch is a distributed, open source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. Key Terms: Document - Serialized JSON data.