1. Cassandra: The Definitive Guide (2010)
What could you do with data if scalability wasn’t a problem? With this hands-on guide, you’ll learn how Apache Cassandra handles hundreds of terabytes of data while remaining highly available across multiple data centers — capabilities that have attracted Facebook, Twitter, and other data-intensive companies. Cassandra: The Definitive Guide provides the technical details and practical examples you need to assess this database management system and put it to work in a production environment.
Author Eben Hewitt demonstrates the advantages of Cassandra’s nonrelational design, and pays special attention to data modeling. If you’re a developer, DBA, application architect, or manager looking to solve a database scaling issue or future-proof your application, this guide shows you how to harness Cassandra’s speed and flexibility.
- Understand the tenets of Cassandra’s column-oriented structure
- Learn how to write, update, and read Cassandra data
- Discover how to add or remove nodes from the cluster as your application requires
- Examine a working application that translates from a relational model to Cassandra’s data model
- Use examples for writing clients in Java, Python, and C#
- Use the JMX interface to monitor a cluster’s usage, memory patterns, and more
- Tune memory settings, data storage, and caching for better performance
Author(s): Eben Hewitt
This book is a step-by-step beginner's guide to learning Cassandra. The book uses many charts, graphs, images, and code samples to aid your Cassandra learning.
The book gives a detailed introduction to Cassandra, then provides step-by-step instructions for installing it. Cassandra's architecture and replication factor strategy are lucidly explained. Data modelling, keyspaces, and CQL are also described in detail.
The book will teach you enough to get started with Cassandra.
Here is what is included:
Chapter 1: Introduction
NoSQL Cassandra Database
NoSQL Cassandra Database vs. Relational Databases
Apache Cassandra Features
Cassandra Use Cases
Chapter 2: Download and Install
Prerequisite for Apache Cassandra Installation
How to Download and Install Cassandra
Chapter 3: Architecture
Components of Cassandra
Chapter 4: Data Model and Rules
Cassandra Data Model Rules
Model Your Data in Cassandra
Handling One to One Relationship
Handling One to Many Relationship
Handling Many to Many Relationship
Chapter 5: Cassandra CQL
Create, Alter & Drop Keyspace
Cassandra Table: Create, Alter, Drop & Truncate
Cassandra Query Language (CQL): Insert, Update, Delete, Read Data
Create & Drop INDEX
Data Types & Expiration
SET, LIST & MAP
Chapter 6: Cassandra Cluster
Prerequisites for Cassandra Cluster
Enterprise Edition Installation
Starting Cassandra Node
Chapter 7: DevCenter & OpsCenter Installation
Chapter 8: Security
What is Internal Authentication and Authorization
Configure Authentication and Authorization
Create New User
Enabling JMX Authentication
Author(s): Krishna Rungta
3. NoSQL for Mere Mortals (2015)
The Easy, Common-Sense Guide to Solving Real Problems with NoSQL
The Mere Mortals ® tutorials have earned worldwide praise as the clearest, simplest way to master essential database technologies. Now, there’s one for today’s exciting new NoSQL databases. NoSQL for Mere Mortals guides you through solving real problems with NoSQL and achieving unprecedented scalability, cost efficiency, flexibility, and availability.
Drawing on 20+ years of cutting-edge database experience, Dan Sullivan explains the advantages, use cases, and terminology associated with all four main categories of NoSQL databases: key-value, document, column family, and graph databases. For each, he introduces pragmatic best practices for building high-value applications. Through step-by-step examples, you’ll discover how to choose the right database for each task, and use it the right way.
–Getting started: What NoSQL databases are, how they differ from relational databases, when to use them, and when not to
–Data management principles and design criteria: Essential knowledge for creating any database solution, NoSQL or relational
–Key-value databases: Gaining more utility from data structures
–Document databases: Schemaless databases, normalization and denormalization, mutable documents, indexing, and design patterns
–Column family databases: Google’s BigTable design, table design, indexing, partitioning, and Big Data
–Graph databases: Graph/network modeling, design tips, query methods, and traps to avoid
Whether you’re a database developer, data modeler, database user, or student, learning NoSQL can open up immense new opportunities. As thousands of database professionals already know, For Mere Mortals is the fastest, easiest route to mastery.
Author(s): Dan Sullivan
4. Big Data
Big Data teaches you to build big data systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to implement them in practice, and how to deploy and operate them once they’re built.
Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the Book
Web-scale applications like social networks, real-time analytics, or e-commerce sites deal with a lot of data, whose volume and velocity exceed the limits of traditional database systems. These applications require architectures built around clusters of machines to store and process data of any size, or speed. Fortunately, scale and simplicity are not mutually exclusive.
Big Data teaches you to build big data systems using an architecture designed specifically to capture and analyze web-scale data. This book presents the Lambda Architecture, a scalable, easy-to-understand approach that can be built and run by a small team. You’ll explore the theory of big data systems and how to implement them in practice. In addition to discovering a general framework for processing big data, you’ll learn specific technologies like Hadoop, Storm, and NoSQL databases.
This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.
- Introduction to big data systems
- Real-time processing of web-scale data
- Tools like Hadoop, Cassandra, and Storm
- Extensions to traditional database skills
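As a rough illustration of the Lambda Architecture named above, and not code taken from the book, the following Python sketch shows the core idea: a query is answered by merging a view precomputed by the batch layer with a view kept by the speed layer for recent data. The function names and the page-view events are hypothetical.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute the view from the full, append-only dataset."""
    return Counter(event["url"] for event in master_dataset)


def realtime_view(recent_events):
    """Speed layer: cover only events the last batch run has not absorbed."""
    return Counter(event["url"] for event in recent_events)


def query(url, batch, realtime):
    """Serving side: merge both views to answer a page-view count query."""
    return batch[url] + realtime[url]


# Hypothetical data: three events already processed in batch, one still recent.
master = [{"url": "/home"}, {"url": "/home"}, {"url": "/about"}]
recent = [{"url": "/home"}]

batch = batch_view(master)
speed = realtime_view(recent)
print(query("/home", batch, speed))
```

In a real system the batch view would be produced by a tool such as Hadoop and the realtime view by a stream processor such as Storm; here both are plain in-memory counters to keep the sketch self-contained.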
About the Authors
Nathan Marz is the creator of Apache Storm and the originator of the Lambda Architecture for big data systems. James Warren is an analytics architect with a background in machine learning and scientific computing.
Table of Contents
- A new paradigm for Big Data
PART 1 BATCH LAYER
- Data model for Big Data
- Data model for Big Data: Illustration
- Data storage on the batch layer
- Data storage on the batch layer: Illustration
- Batch layer
- Batch layer: Illustration
- An example batch layer: Architecture and algorithms
- An example batch layer: Implementation
PART 2 SERVING LAYER
- Serving layer
- Serving layer: Illustration
PART 3 SPEED LAYER
- Realtime views
- Realtime views: Illustration
- Queuing and stream processing
- Queuing and stream processing: Illustration
- Micro-batch stream processing
- Micro-batch stream processing: Illustration
- Lambda Architecture in depth
Author(s): Nathan Marz, James Warren
5. Programming Pig
This guide is an ideal learning tool and reference for Apache Pig, the open source engine for executing parallel data flows on Hadoop. With Pig, you can batch-process data without having to create a full-fledged application—making it easy for you to experiment with new datasets.
Programming Pig introduces new users to Pig, and provides experienced users with comprehensive coverage on key features such as the Pig Latin scripting language, the Grunt shell, and User Defined Functions (UDFs) for extending Pig. If you need to analyze terabytes of data, this book shows you how to do it efficiently with Pig.
- Delve into Pig’s data model, including scalar and complex data types
- Write Pig Latin scripts to sort, group, join, project, and filter your data
- Use Grunt to work with the Hadoop Distributed File System (HDFS)
- Build complex data processing pipelines with Pig’s macros and modularity features
- Embed Pig Latin in Python for iterative processing and other advanced tasks
- Create your own load and store functions to handle data formats and storage mechanisms
- Get performance tips for running scripts on Hadoop clusters in less time
Author(s): Alan Gates
6. Expert Apache Cassandra Administration (2017)
- Takes you through building a Cassandra database from installation of the software and creation of a single database, through to complex clusters and data centers
- Provides numerous examples of actual commands in a real-life Cassandra environment that show how to confidently configure, manage, troubleshoot, and tune Cassandra databases
- Shows how to use the Cassandra configuration properties to build a highly stable, available, and secure Cassandra database that always operates at peak efficiency
- Install the Cassandra software and create your first database
- Understand the Cassandra data model, and the internal architecture of a Cassandra database
- Create your own Cassandra cluster, step-by-step
- Run a Cassandra cluster on Docker
- Work with Apache Spark by connecting to a Cassandra database
- Deploy Cassandra clusters in your data center, or on Amazon EC2 instances
- Back up and restore mission-critical Cassandra databases
- Monitor, troubleshoot, and tune production Cassandra databases, and cut your spending on resources such as memory, servers, and storage
Author(s): Sam R. Alapati
- Install Cassandra and set up multi-node clusters
- Design rich schemas that capture the relationships between different data types
- Master the advanced features available in Cassandra 3.x through a step-by-step tutorial and build a scalable, high performance database layer
Cassandra is a distributed database that stands out thanks to its robust feature set and intuitive interface, while providing high availability and scalability of a distributed data store. This book will introduce you to the rich feature set offered by Cassandra, and empower you to create and manage a highly scalable, performant and fault-tolerant database layer.
The book starts by explaining the new features implemented in Cassandra 3.x and gets you set up with Cassandra. Then you’ll walk through data modeling in Cassandra and the rich feature set available to design a flexible schema. Next you’ll learn to create tables with composite partition keys, collections, and user-defined types, and get to know different methods to avoid denormalization of data. You will then proceed to create user-defined functions and aggregates in Cassandra. Then, you will set up a multi-node cluster and see how the dynamics of Cassandra change with it. Finally, you will implement some application-level optimizations using a Java client.
By the end of this book, you’ll be fully equipped to build powerful, scalable Cassandra database layers for your applications.
What you will learn
- Install Cassandra
- Create keyspaces and tables with multiple clustering columns to organize related data
- Use secondary indexes and materialized views to avoid denormalization of data
Author(s): Sandeep Yarabarla
Build, manage, and configure a high-performing, reliable NoSQL database for your application with Cassandra
About This Book
- Develop applications for modelling data with Cassandra 2
- Manage large amounts of structured, semi-structured, and unstructured data with Cassandra
- Explore a wide range of Cassandra components and how they interact to create a robust, distributed system
Who This Book Is For
The book is aimed at intermediate developers with an understanding of core database concepts who want to become a master at implementing Cassandra for their application.
What You Will Learn
- Write programs using Cassandra’s features more efficiently
- Get the most out of a given infrastructure, improve performance, and tweak JVM
- Use CQL3 in your application, which makes working with Cassandra simpler
- Configure Cassandra and fine-tune its parameters depending on your needs
- Set up a cluster and learn how to scale it
- Monitor a Cassandra cluster in different ways
- Use Hadoop and other big data processing tools with Cassandra
With ever-increasing rates of data creation comes the demand to store data as fast and reliably as possible, a demand met by modern databases such as Cassandra. Apache Cassandra is a strong choice for building fault-tolerant and scalable databases. Through this practical guide, you will learn to program pragmatically and gain a thorough understanding of Cassandra's capabilities. Starting with a brief recap of the basics to get everyone up and running, you will move on to deploy and monitor a production setup, dive under the hood, and optimize and integrate it with other software.
You will explore the integration and interaction of Cassandra components, and explore great new features such as CQL3, vnodes, lightweight transactions, and triggers. Finally, by learning Hadoop and Pig, you will be able to analyze your big data.
Author(s): Nishant Neeraj
9. Beginning Apache Cassandra Development (2014)
Beginning Apache Cassandra Development introduces you to one of the most robust and best-performing NoSQL database platforms on the planet. Apache Cassandra is a distributed wide-column store. It is specifically designed to manage large amounts of data across many commodity servers without there being any single point of failure. This design approach makes Apache Cassandra a robust and easy-to-implement platform when high availability is needed.
Cassandra is one of the leading NoSQL databases, meaning you get unparalleled throughput and performance without the sort of processing overhead that comes with traditional proprietary databases. Beginning Apache Cassandra Development will therefore help you create applications that generate search results quickly, stand up to high levels of demand, scale as your user base grows, ensure operational simplicity, and—not least—provide delightful user experiences.
Author(s): Vivek Mishra
10. Cassandra High Availability (2014)
The book starts with the fundamentals, helping you to understand how the architecture of Apache Cassandra allows it to achieve 100 percent uptime when other systems struggle to do so. You’ll have an excellent understanding of data distribution, replication, and Cassandra’s highly tunable consistency model. This is followed by an in-depth look at Cassandra’s robust support for multiple data centers, and how to scale out a cluster. Next, the book explores the domain of application design, with chapters discussing the native driver and data modeling. Lastly, you’ll find out how to steer clear of common antipatterns and take advantage of Cassandra’s ability to fail gracefully.
What you will learn:
- Understand how the core architecture of Cassandra enables highly available applications
- Use replication and tunable consistency levels to balance consistency, availability, and performance
- Set up multiple data centers to enable failover, load balancing, and geographic distribution
- Add capacity to your cluster with zero down time
- Take advantage of high availability features in the native driver
- Create data models that scale well and maximize availability
- Understand common anti-patterns so you can avoid them
- Keep your system working well even during failure scenarios
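The trade-off between replication and tunable consistency mentioned in the list above can be made concrete with a little arithmetic. The Python sketch below is an illustration only, not material from the book, and its helper names are hypothetical. It relies on the standard rule that a read is guaranteed to see the latest write when the number of replicas acknowledging the write plus the number consulted by the read exceeds the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for a QUORUM operation: a strict majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    return write_replicas + read_replicas > replication_factor


def tolerated_failures(replication_factor: int, required_replicas: int) -> int:
    """Replicas that can be down while the operation still succeeds."""
    return replication_factor - required_replicas


rf = 3
q = quorum(rf)
print(q)                                  # replicas needed for QUORUM at RF=3
print(reads_see_latest_write(rf, q, q))   # QUORUM writes with QUORUM reads
print(reads_see_latest_write(rf, 1, 1))   # ONE writes with ONE reads
print(tolerated_failures(rf, q))          # nodes that may fail under QUORUM
```

With a replication factor of 3, QUORUM reads and writes overlap on at least one replica while still tolerating one unavailable node, whereas reading and writing at ONE favors availability and latency over consistency.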
Author(s): Robbie Strickland
11. Big Data SMACK
Learn how to integrate full-stack open source big data architecture and to choose the correct technology (Scala/Spark, Mesos, Akka, Cassandra, and Kafka) in every layer.
Big data architecture is becoming a requirement for many different enterprises. So far, however, the focus has largely been on collecting, aggregating, and crunching large data sets in a timely manner. In many cases now, organizations need more than one paradigm to perform efficient analyses.
Big Data SMACK explains each of the full-stack technologies and, more importantly, how to best integrate them. It provides detailed coverage of the practical benefits of these technologies and incorporates real-world examples in every situation. This book focuses on the problems and scenarios solved by the architecture, as well as the solutions provided by every technology. It covers the six main concepts of big data architecture and how to integrate, replace, and reinforce every layer.
What You Will Learn:
- Build a big data architecture without using complex Greek-letter architectures
- Build a cheap but effective cluster infrastructure
- Make the queries, reports, and graphs that the business demands
- Manage and exploit unstructured and NoSQL data sources
- Use tools to monitor the performance of your architecture
- Integrate all technologies and decide which ones replace and which ones reinforce
Who This Book Is For:
Developers, data architects, and data scientists looking to integrate the most successful big data open stack architecture and to choose the correct technology in every layer.
Author(s): Raul Estrada, Isaac Ruiz