We want to apply our decade of experience building and operating enterprise-class Cassandra deployments to create a world-class cloud-native service for Apache Cassandra.
We built Cassandra-as-a-service on Kubernetes, and we are actively working to make C* the best open source, scalable cloud-native database in the world.
01 Cloud native Apache Cassandra
We want Apache Cassandra to become cloud native. In that, we stand alongside thousands of developers and operators, and companies ranging from startups to giants such as Apple and Netflix.
To earn the right to hold that opinion, we did a lot of work: Astra is built on Kubernetes, Prometheus, and Envoy, and it uses the native control and management planes of GKE and EKS. We also considered what others have done in this area, especially the people who have contributed open source Kubernetes operators for Cassandra.
To run C* at scale on Kubernetes, we made the necessary technical changes and trade-offs. From that work we formed the opinions that underpin Astra. As we continue to deliver and improve Astra, those opinions will evolve with it.
We share these opinions by contributing to the C* open source ecosystem, especially through OSS projects (a Kubernetes operator, a management sidecar, a metrics collector, a configuration builder, and a NoSQL testing system).
The natural consequence of sharing opinions is listening to others. In the Cassandra Enhancement Proposal process, we have seen a range of vibrant, hard-won experience, embodied in the many Kubernetes operators written for Cassandra.
Listening to each other helps us think and collaborate, refine our opinions with collective experience, and ship better code to users.
02 Our focus
We have learned how Apache Cassandra's architecture relates to cloud deployments, and this helps us contribute to the community's thinking. After C* 4.0 is released, we will have to make choices about architecture together.
In our fork of Cassandra we have put some of what we learned into practice, and we hope to share it in the future. To make Cassandra meet the specific requirements of Kubernetes and cloud-native operation, we have added as much code as we could.
Running a cloud-native service on Apache Cassandra 3.11 involves four areas of work: gateway, operations, management, and deployment.
Figure: High-level architecture of Astra on GKE/GCP
2-1 Gateway
Astra clusters receive traffic through a gateway. Astra uses Envoy to route requests to the Cassandra binary (native protocol) port and to simplify driver configuration. In a cloud-native configuration, connecting through a single ingress IP is critical for managing an elastic back end. It also reduces the number of open ports, which lets Astra support CQL in the cloud flexibly and securely.
Astra also uses the gateway to expose REST and GraphQL APIs, easing the burden on developers when they interact with data assets. We believe these APIs will give a better experience to the new generation of developers building full-stack applications.
2-2 Operations
Cluster operations need to be automated. Astra does this with a pair of components: Cass Operator and the Management API for Apache Cassandra (MAAC). Cass Operator is a Kubernetes operator for Apache Cassandra. MAAC is a Kubernetes sidecar.
Cass Operator provisions a StatefulSet made up of C* nodes and automates tasks that C* administrators usually perform by hand. These include ordering cluster startup, draining nodes before they are removed, and placing the right nodes in the right containers (for example, preventing multiple nodes from being deployed to the same host).
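The placement rule mentioned above (no two C* nodes on the same host) can be illustrated with a small check. This is only a sketch under stated assumptions: the pod and host names are hypothetical, and in a real cluster the Kubernetes scheduler enforces this rule through anti-affinity settings rather than through application code like this.

```python
from collections import defaultdict
from typing import Dict, List


def crowded_hosts(placement: Dict[str, str]) -> Dict[str, List[str]]:
    """Return every host that carries more than one Cassandra pod.

    `placement` maps a (hypothetical) pod name to the host it runs on.
    """
    pods_by_host: Dict[str, List[str]] = defaultdict(list)
    for pod, host in placement.items():
        pods_by_host[host].append(pod)
    # Keep only hosts that violate the one-node-per-host rule.
    return {host: sorted(pods) for host, pods in pods_by_host.items() if len(pods) > 1}


# Hypothetical placement: cass-0 and cass-2 share host-a, which breaks the rule.
placement = {"cass-0": "host-a", "cass-1": "host-b", "cass-2": "host-a"}
print(crowded_hosts(placement))
```

An empty result means every C* node has a host to itself.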
To participate gracefully in a Kubernetes environment, we have to expose the state of the cluster. In practice, this means that some operations that used to be internal to the database (for example, automatic retries, or creating gossip links to track internal cluster state) now have to be promoted to the application layer.
Kubernetes can then make decisions based on the health of the whole cluster, rather than each C* node deciding for itself. For example, during a rolling restart, instead of relying on internal checks to see whether a node is working properly, these signals are sent to Cass Operator, which makes sure the C* nodes in the cluster still meet the quorum (majority) requirement.
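The quorum rule behind a safe rolling restart can be written down in a few lines. The sketch below is an illustration, not Cass Operator's actual logic: the function names are hypothetical, and it assumes the usual Cassandra definition of quorum as a strict majority of replicas, floor(RF / 2) + 1.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def safe_to_restart(replication_factor: int, replicas_up: int) -> bool:
    """True if taking one more replica down still leaves a quorum available."""
    return replicas_up - 1 >= quorum(replication_factor)


# With RF = 3 the quorum is 2, so one node may restart only while all three are up.
print(quorum(3))
print(safe_to_restart(3, replicas_up=3))
print(safe_to_restart(3, replicas_up=2))
```

Under these assumptions, a coordinator would restart nodes one at a time and wait for each to rejoin before moving on.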
MAAC provides a JSON interface for invoking nodetool commands and injecting Vault secrets. By keeping the standard C* tools, C* maintains operational consistency while feeling more native to Kubernetes. This component is responsible for start, stop, and configure operations, and for liveness and consistency-level checks.
MAAC runs as a sidecar and communicates with C* over CQL through a local Unix socket (with mTLS as an option). Each sidecar process is responsible only for its local C* instance. This simplifies the topology and strengthens Cass Operator's role as the coordinator of the cluster.
2-3 Management
Metrics are essential for operating C* at scale. Astra uses the Metrics Collector for Apache Cassandra (MCAC) to provide management functions that integrate with Kubernetes and the cloud environment.
MCAC simplifies metrics collection for Cassandra users. The traditional metrics mechanism in C* is JMX, which does not match the performance and scaling requirements of Kubernetes deployments. We built MCAC on top of collectd and bundled it with Cass Operator. It works with all open source Apache Cassandra versions from 2.2 to 4.0 beta.
Using collectd means that each node can export thousands of metric series with minimal performance impact on C*. It is consistent with Prometheus (the standard for monitoring in Kubernetes environments), and it lets us analyze MCAC metrics together with operating-system-level metrics such as context switches, disk performance, and network performance. Finally, MCAC writes a history log that records metrics and lifecycle events (for example flushes, compactions, exceptions, and garbage collection).
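To make the idea of a "metric series" concrete, here is a minimal sketch that formats one sample in the Prometheus text exposition format (`name{label="value"} value`). The metric name and labels are hypothetical, and the code does not show how MCAC itself exports data.

```python
from typing import Dict


def escape_label(value: str) -> str:
    """Escape backslash, double quote, and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def prometheus_line(name: str, labels: Dict[str, str], value: float) -> str:
    """Render one sample as a line of Prometheus text exposition format."""
    label_text = ",".join(
        f'{key}="{escape_label(val)}"' for key, val in sorted(labels.items())
    )
    return f"{name}{{{label_text}}} {value}"


# Hypothetical series: pending compactions on one C* node.
print(prometheus_line("cassandra_pending_compactions", {"node": "cass-0", "rack": "r1"}, 4))
```

Each distinct combination of metric name and label values is one series, which is why a single node can easily produce thousands of them.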
MCAC is a key part of Astra: it lets users monitor the real-time operating characteristics of their instances. Astra can manage the underlying Cassandra node operations, but users still need to understand how their data model and queries affect the operating characteristics of the cluster.
2-4 Deployment
Deployment in a Kubernetes environment is continuous. By handing control of configuration changes to the system that operates the cluster, Astra can provide a more dynamic and reliable data layer. Astra uses cass-config-builder to drive configuration and NoSQLBench to test the environment continuously.
cass-config-builder generates the cassandra.yaml file from parameters, according to the needs of the environment. When a C* container starts, it ingests this configuration. IP addresses, network information, performance tuning, security, disk optimization, and seed providers are all key to running C* correctly. Kubernetes controls the whole environment and is constantly scaling up and down, responding to hardware failures, or changing cluster-wide properties, so C* needs new configuration to keep up.
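The idea of generating cassandra.yaml from parameters can be sketched as follows. This is not how cass-config-builder is implemented. It is a minimal illustration in which the function name and example values are hypothetical. The keys it writes (`cluster_name`, `num_tokens`, `listen_address`, `rpc_address`, `seed_provider`) are standard cassandra.yaml settings, and a real file contains many more.

```python
from typing import List


def render_cassandra_yaml(
    cluster_name: str,
    listen_address: str,
    seeds: List[str],
    num_tokens: int = 256,
) -> str:
    """Render a minimal cassandra.yaml fragment from a few parameters."""
    if not seeds:
        raise ValueError("at least one seed address is required")
    seed_list = ",".join(seeds)
    lines = [
        f"cluster_name: '{cluster_name}'",
        f"num_tokens: {num_tokens}",
        f"listen_address: {listen_address}",
        f"rpc_address: {listen_address}",
        "seed_provider:",
        "  - class_name: org.apache.cassandra.locator.SimpleSeedProvider",
        "    parameters:",
        f'      - seeds: "{seed_list}"',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values, as an orchestrator might supply them when a pod is scheduled.
print(render_cassandra_yaml("demo", "10.0.0.5", ["10.0.0.5", "10.0.0.6"]))
```

Because the file is derived from parameters, it can be regenerated whenever the environment changes, for example when a pod receives a new IP address.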
To ensure cluster reliability, NoSQLBench lets users create synthetic datasets of any size and use them for large-scale load tests that follow the workload of the real application. NoSQLBench can be applied to any NoSQL database.
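The notion of a synthetic dataset of any size can be illustrated with a seeded generator. This sketch does not use NoSQLBench or its workload format. The row fields are hypothetical, and it only shows why generated data can be reproduced on demand instead of being stored.

```python
import random
from typing import Dict, Iterator, Union

Row = Dict[str, Union[int, str]]


def synthetic_rows(count: int, seed: int = 42) -> Iterator[Row]:
    """Yield `count` pseudo-random rows; the same seed always gives the same rows."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    regions = ["eu", "us", "ap"]
    for user_id in range(count):
        yield {
            "user_id": user_id,
            "region": rng.choice(regions),
            "score": rng.randint(0, 100),
        }


# Rows are produced lazily, so the dataset size is limited by time, not memory.
first_run = list(synthetic_rows(5))
second_run = list(synthetic_rows(5))
print(first_run == second_run)  # same seed, same data
```

A load test could stream such rows into write statements and regenerate the same keys later to verify reads.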
As part of a continuous deployment environment, automated database testing can drive billions of write and read operations, sharply reduce the amount of storage required, and demonstrate the performance characteristics of the cloud environment with practical evidence.
03 Summary
We want to apply our decade of experience building and operating enterprise-class Cassandra deployments to create a world-class cloud-native service for Apache Cassandra.
Feedback from Cassandra users tells us that Kubernetes needs Cassandra, and Cassandra needs Kubernetes. We built Cassandra-as-a-service on Kubernetes, and we are actively working to make C* the best open source, scalable cloud-native database in the world.
Cassandra is becoming cloud native. We are glad to have you join us on this journey.