LinuxCBT.com

Syllabus

Focus: Cassandra Database 1.2x

Duration: 8-Hours

  • Intro to Clustered - Distributed - NoSQL Database - Cassandra DB

    • Introduction - Cassandra DB - Features - Discussion
      • NoSQL Discussion
      • Features and Benefits of Cassandra DB
      • Data Distribution - Peer-to-Peer
      • Data replication strategies
      • Scalability implementation
      • Data-Center fault-tolerance
      • Various features
      • Explore Cassandra Cluster Topology (CCT)
    • Single-Node - Implementation
      • Identify and obtain sources
      • Prep cluster nodes with Java environments
      • Ensure cross-platform support
      • Peruse configuration hierarchy
      • Tweak initial settings - ante-cluster invocation
      • Identify key network sockets
      • Start instance of Cassandra
      • Evaluate footprint
      • Evaluate results
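
      A minimal bring-up sketch for the steps above, assuming a tarball install of the 1.2 series with Java already present; the version number and paths are illustrative:

        $ tar xzf apache-cassandra-1.2.19-bin.tar.gz && cd apache-cassandra-1.2.19
        $ grep -E '^(cluster_name|listen_address|rpc_address):' conf/cassandra.yaml  # settings to tweak pre-invocation
        $ bin/cassandra -f                             # run in the foreground; Ctrl-C stops it
        $ netstat -ntlp | egrep '7000|7199|9160|9042'  # storage, JMX, Thrift; 9042 only if the native transport is enabled
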
    • Cassandra CLI
      • Discuss applicability - Legacy support - et cetera
      • Reveal cluster details using CLI
      • Define sample key space
      • Discuss data-types supported
      • Set | Get | Delete simple records
      • Evaluate data representation
      • Drop key space accordingly
      • Evaluate results
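
      An illustrative legacy cassandra-cli session covering the workflow above; the key space, column family, and values are hypothetical:

        $ bin/cassandra-cli -h localhost -p 9160
        [default@unknown] describe cluster;
        [default@unknown] create keyspace Demo;
        [default@unknown] use Demo;
        [default@Demo] create column family Users with comparator = UTF8Type;
        [default@Demo] set Users[utf8('jdoe')][utf8('email')] = utf8('jdoe@example.com');
        [default@Demo] get Users[utf8('jdoe')];
        [default@Demo] del Users[utf8('jdoe')];
        [default@Demo] drop keyspace Demo;
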
    • Cassandra Query Language (CQL) Client
      • Contrast with Legacy Cassandra CLI client
      • Reveal cluster details via CQL
      • Create | Drop key spaces
      • Create Column Families (CF)
      • Populate CF with rudimentary values for general usage
      • Query CF using standard CQL lingo
      • Explore Indices
      • Update data as needed
      • [Dis]Connect as needed
      • Evaluate environment and prepare for usage
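
      A comparable CQL 3 sketch via cqlsh; the key space, table, and values are hypothetical:

        $ bin/cqlsh localhost
        cqlsh> DESCRIBE CLUSTER;
        cqlsh> CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
        cqlsh> USE demo;
        cqlsh:demo> CREATE TABLE users (user_id text PRIMARY KEY, email text, city text);
        cqlsh:demo> INSERT INTO users (user_id, email, city) VALUES ('jdoe', 'jdoe@example.com', 'Reno');
        cqlsh:demo> CREATE INDEX ON users (city);   -- secondary index
        cqlsh:demo> SELECT * FROM users WHERE city = 'Reno';
        cqlsh:demo> UPDATE users SET email = 'john@example.com' WHERE user_id = 'jdoe';
        cqlsh:demo> DROP KEYSPACE demo;
        cqlsh> exit
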
    • Multi-Node - Cluster - Configuration - Deployment
      • Highlight benefits of distributed | peer-to-peer environment
      • Identify other nodes for inclusion
      • Discuss data replication strategies
      • Discuss tokenization and virtual node support
      • Peruse and update primary Cassandra configuration for multi-node support
      • Restart services and confirm data-availability
      • Introduce new nodes to Cassandra cluster
      • Query data accordingly
      • Update data replication to target an ideal number of nodes
      • Confirm availability of data across nodes
      • Prep configuration for other nodes
      • Distribute configuration using parallel SSH (pssh) as needed
      • Add remaining nodes to cluster
      • Evaluate current configuration
      • Fail nodes arbitrarily and confirm data-survival
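
      A sketch of the key per-node settings and the distribution step, assuming three nodes on a 192.168.75.0/24 network; addresses, cluster name, and host file are hypothetical:

        # conf/cassandra.yaml - multi-node essentials (per node)
        cluster_name: 'LinuxCBT-Cluster'
        num_tokens: 256                    # virtual nodes (vnodes)
        listen_address: 192.168.75.101     # this node's address
        # under seed_provider -> parameters:
        - seeds: "192.168.75.100"          # contact point(s) for joining nodes

        $ pssh -h nodes.txt -i 'service cassandra restart'   # restart cluster-wide
        $ bin/nodetool status                                # all nodes should report UN (Up/Normal)
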
    • Cassandra Node Management
      • Discuss available tools: Commercial | Open Source
      • Expose Cassandra Cluster configuration
      • Reveal cluster network details
      • Abbreviate cluster data and evaluate
      • Decommission node as desired for hypothetical scenario
      • Discuss various Cassandra protocols
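
      Representative Open Source tooling for the tasks above, via the bundled nodetool utility:

        $ bin/nodetool status          # per-node state, load, and ownership
        $ bin/nodetool ring            # token ring and network details
        $ bin/nodetool info            # single-node summary
        $ bin/nodetool decommission    # run ON the departing node; streams its data elsewhere
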
    • Consistency Levels - Affirmatives | Negatives
      • Highlight default configuration
      • Contrast various approaches with respect to risk-tolerance
      • Discuss tunable consistency levels
      • Set Client Read | Write consistency levels
      • Evaluate data access post-consistency tweak
      • Repair data consistency across nodes
      • Confirm QUORUM-level consistency on ALL nodes of cluster
      • Evaluate results
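
      A sketch of tuning client-side consistency in cqlsh and repairing replicas; the key space name is hypothetical:

        cqlsh> CONSISTENCY;            -- show the current level (ONE by default)
        cqlsh> CONSISTENCY QUORUM;     -- require a majority of replicas per read | write
        cqlsh> SELECT * FROM demo.users WHERE user_id = 'jdoe';
        $ bin/nodetool repair demo     # anti-entropy repair so QUORUM reads succeed on ALL nodes
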
    • Node Issues
      • Introduce Hypotheticals
      • Identify current data repositories
      • Fail redundant nodes and access data
      • Vary consistency level to reflect shifts in cluster dynamics
      • Confirm data-accessibility
      • Discuss tradeoffs of the relegated (lowered-consistency) stance
      • Up replication of data to reflect current cluster description
      • Down ALL nodes responsible for data and evaluate
      • Recover accordingly
      • Discuss risks and evaluate
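
      One way to sketch the recovery above, assuming a three-node cluster and the hypothetical key space demo:

        cqlsh> ALTER KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
        $ bin/nodetool repair demo     # stream existing rows onto the new replicas
        cqlsh> CONSISTENCY ONE;        # tolerate more failed replicas, at weaker guarantees
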
    • Multiple Data Centers (DCs) - Custom Replication
      • Discuss available replication algorithms
      • Define accurate topology map for Cassandra Cluster
      • Update cluster configuration and re-initialize the cluster
      • Define new - simple key space for multi-DC replication
      • Reveal current multi-DC topology
      • Ensure data are replicated accordingly
      • Update existing key space to support multiple DCs
      • Evaluate configuration
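
      A multi-DC sketch using the PropertyFileSnitch with NetworkTopologyStrategy; DC | rack names, addresses, and key spaces are hypothetical:

        # conf/cassandra-topology.properties (read by the PropertyFileSnitch)
        192.168.75.100=DC1:RAC1
        192.168.75.101=DC1:RAC1
        192.168.75.102=DC2:RAC1
        # cassandra.yaml: endpoint_snitch: PropertyFileSnitch

        cqlsh> CREATE KEYSPACE multidc WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 2, 'DC2': 1};
        cqlsh> ALTER KEYSPACE demo WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 2, 'DC2': 1};
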
    • Data Snapshots - Restoration
      • Explore available tools
      • Discuss backup | restoration model
      • Snapshot single node and evaluate traces
      • Use 'pssh' to snapshot ALL nodes
      • Confirm existence of various snapshots on various nodes
      • Purposely remove data
      • Confirm cluster-wide updates reflecting lost data
      • Restore key space as needed
      • Confirm availability
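
      A snapshot | restore sketch; the tag, key space, and data path are illustrative (default data directory shown):

        $ bin/nodetool snapshot -t pre-change demo                      # snapshot key space demo on this node
        $ pssh -h nodes.txt -i 'nodetool snapshot -t pre-change demo'   # snapshot ALL nodes
        $ ls /var/lib/cassandra/data/demo/users/snapshots/pre-change/   # hard-linked SSTables
        # restore (per node): stop Cassandra, copy the snapshot SSTables back into
        # /var/lib/cassandra/data/demo/users/, restart, then repair if required
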

LinuxCBT CassDB Edition
