| Properties | SQL | NoSQL |
|---|---|---|
| Type | Relational database, requires pre-work to decide the structure, how they are going to relate | Not relational, more dynamic |
| Scaling | Vertical scaling - more memory, computing power Increase size of instance (RAM, CPU etc.) | Horizontal scaling - adding servers to handle and balance out usage Add more instances |
| Structure | Table based - put records and entities into tables with relationships, similar to a warehouse | Document approach - key value hashes, graphs, wide column stores |
| Accessing data | SQL language | Dynamic |
| What operations | Multi-row transactions | Unstructured data, usually JSON |
| Benefits | SQL is relational: easy querying on relationships between data among multiple tables, good for structuring and organising data Well structured: Room for potential errors is reduced as SQL schemas require the data model and format to be known before storing ACID compliant: SQL transactions are groups of statements that are executed atomically, all executed or none at all if any statement in a group fails | Flexible and simple to set up - does not support table relationships, data is usually stored in documents or key value pairs (better choice for unstructured data) Data sharding - allows for sharding unstructured data across different data stores, allowing for distributed databases ⇒ horizontal scaling is easier, and large amounts of data can be stored without having to purchase an expensive server |
| Weaknesses | Structure must be created in advance Not effective for storing and querying unstructured data when format is unknown Difficult to scale horizontally because the relational nature: easy to provision multiple read-only replicas for read heavy systems, but for write heavy systems, require vertically scaling, which is generally more expensive than provision additional servers | Loss of consistency - since it is mostly designed for distributed use cases, we need multiple write shards for the same data partition (peer-to-peer replication). This means that after a write to a shard in a distributed NoSQL cluster, small delay before the update can be propagated to other clusters ⇒ reading from replica mean accessing stale data (eventual consistency, fault of distributed databases) |
What is ACID
- Atomicity
- All or nothing transactions
- Consistency
- Data is valid before and after
- Isolation
- Multiple transactions at the same time
- Durability
- Committed data is never lost