Databases in the development process

Developers need a database for local development

  • Option 1
    • Each developer installs a database locally
    • Each developer has their own test data
    • Can experiment with the database freely without affecting anyone else
  • Option 2
    • A shared DB is hosted remotely
    • Access it with an endpoint and credentials
    • Can't experiment without affecting other developers
  • The ideal solution is to have both, so you can switch back and forth
  • How does the app talk to the DB?
    • In your application code you configure the DB connectivity
    • Each language has libraries or modules for DB connections
    • Use the database endpoint and credentials to connect
    • The credentials used by the code need to be configured
    • Use environment variables instead of hard-coding endpoints and credentials
    • Can also use configuration files
      • In Node.js, can use config.json
      • Other settings, such as the logging level, can go in config files too
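
The points above about environment variables and configuration files can be sketched roughly as follows. This is a minimal illustration in Python rather than Node, and every name in it is an assumption made for the example: the variable names DB_HOST, DB_PORT, DB_USER and DB_PASSWORD, the config keys db_host, db_port and log_level, and the default port are not taken from any particular project.

```python
import json
import os
from dataclasses import dataclass


@dataclass
class DbSettings:
    host: str
    port: int
    user: str
    password: str
    log_level: str


def load_settings(env=os.environ, config_path="config.json"):
    """Build DB settings from environment variables plus an optional config file."""
    # Non-secret settings (such as the logging level) may come from the
    # config file; a missing file simply means "use the defaults".
    file_cfg = {}
    if os.path.exists(config_path):
        with open(config_path) as fh:
            file_cfg = json.load(fh)

    return DbSettings(
        # Environment variables win over the file, so the same code can
        # point at a local or a remote database without being edited.
        host=env.get("DB_HOST", file_cfg.get("db_host", "localhost")),
        port=int(env.get("DB_PORT", file_cfg.get("db_port", 5432))),
        # Credentials are required and are never hard-coded:
        # a missing variable raises KeyError.
        user=env["DB_USER"],
        password=env["DB_PASSWORD"],
        log_level=file_cfg.get("log_level", "INFO"),
    )
```

With a setup like this, switching between a local database and the shared remote one is a matter of changing the environment variables before starting the app; the application code stays the same.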

Databases in production

  • Replicate and back up the data
  • Make sure the DB performs under high load
  • System admins, DB engineers, or DevOps engineers handle the databases
  • As a DevOps engineer, you should know how to
    • set up a DB
    • configure it
    • manage it
    • set up replication
    • take backups
    • restore a DB from a backup
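
To make the backup and restore items concrete, here is a small sketch that uses SQLite from Python's standard library as a stand-in for a production database. The function names backup_database and restore_database are hypothetical. Real servers such as Postgres or MySQL have their own backup tooling, so treat this only as an illustration of the backup-then-restore cycle.

```python
import sqlite3


def backup_database(source: sqlite3.Connection, backup_path: str) -> None:
    """Copy the contents of a live database into a backup file."""
    target = sqlite3.connect(backup_path)
    try:
        # Connection.backup copies the whole database page by page.
        source.backup(target)
    finally:
        target.close()


def restore_database(backup_path: str) -> sqlite3.Connection:
    """Load a backup file into a fresh in-memory database and return it."""
    saved = sqlite3.connect(backup_path)
    restored = sqlite3.connect(":memory:")
    try:
        saved.backup(restored)
    finally:
        saved.close()
    return restored
```

A backup is only useful if the restore has been tried, so it is worth running both steps and checking that the restored data matches the original.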

Database types

  • Key-value
    • e.g. Redis, Memcached, etcd (used by Kubernetes)
    • Very fast
    • Limited storage, since Redis and Memcached keep data in memory
    • Good for caching
    • Can sometimes be used as a message queue
    • etcd: stores cluster state in real time
  • Wide Column
    • e.g. Cassandra, Apache HBase
    • Key-value data with more structure: a row key maps to many columns
    • Schemaless
    • Scalable
    • Queries are similar to SQL but simpler
    • Suited to large amounts of unstructured data
    • Use cases
      • Time series
      • IoT records
      • Historical records
  • Document Databases
    • e.g. MongoDB, DynamoDB, CouchDB
    • Documents are containers for key-value pairs
    • Schemaless
    • Denormalized
    • Slower to update
    • Faster to read
    • Scalable
    • Use cases
      • Mobile apps
      • Games
      • CMS
      • Most apps
    • Shouldn't be used for
      • Highly interrelated data
      • Graphs with a lot of related data
      • e.g. social media, where users have friends, etc.
  • Relational Databases
    • e.g. Postgres, MySQL
    • Structured database
    • Schema and data types need to be created first
    • The query language is called Structured Query Language (SQL)
    • Data is organized into tables that have rows and columns
    • Data is normalized to avoid duplication
    • Are ACID compliant
      • Atomicity, Consistency, Isolation, Durability
      • Whenever there is a transaction, data consistency and validity are guaranteed
      • All changes get applied or none get applied
      • These guarantees make them difficult to scale
    • Modern databases like CockroachDB are designed to solve this scalability issue
  • Graph Databases
    • e.g. Neo4j, Dgraph
    • Suited to data with a lot of many-to-many relationships
    • Entities are connected to each other directly
    • Edges are the relationships between entities
  • Search Databases
    • e.g. Elasticsearch, Solr
    • Fast, efficient full-text search
    • Builds an index of words
    • Searches the index for relevant results instead of scanning the whole database
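
The idea behind a search database, an index of words that is consulted instead of scanning every record, can be sketched in a few lines of Python. This is a toy inverted index that splits text on whitespace. The function names build_index and search are made up for the example, and real engines such as Elasticsearch or Solr do far more (tokenizing, stemming, ranking).

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start from the matches for the first word, then narrow the set down
    # word by word. Only the index is consulted, never the document texts.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


docs = {
    1: "Redis is a fast key-value store",
    2: "Postgres is a relational database",
    3: "Elasticsearch is a fast search database",
}
index = build_index(docs)
matches = search(index, "fast database")  # ids of docs with both words
```

The index is built once, when documents are added, so each query only touches the entries for the words it contains.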