Deploy Cassandra Cluster on AppAgile
Version: apache-cassandra, version 3.9
This deployment supports different storage models:

- ephemeral storage
- persistent storage with NFS storage and host-mounted directories
It supports resource requests and limits for memory and CPU consumption.
If you want to create a cluster with host-mounted volumes, first set up a cluster with ephemeral storage. Then follow the instructions for using host-mounted volumes.
=== Make template available
Add the template to OpenShift in the namespace openshift.
$ oc create -f ose-artefacts/cassandra-deployer.yaml -n openshift
Create a project for the cassandra cluster and its resources. The following steps use the project cassandra-prj.
oc new-project cassandra-prj
=== Service Account
A pod is used to set up, configure and deploy all the various cassandra resources. This deployer pod runs under the service account cassandra-deployer. Create the service account cassandra-deployer and allow it to edit resources.
$ oc create -f - <<API
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cassandra-deployer
secrets:
- name: cassandra-deployer
API
Grant the service account permission to edit resources in the project:
oc adm policy add-role-to-user edit system:serviceaccount:cassandra-prj:cassandra-deployer
== Deploying all resources of the Cassandra Cluster
=== Persistent Storage
You can deploy the cassandra cluster with or without persistent storage.
Running with persistent storage means that your data will be stored on a link:https://docs.openshift.org/latest/architecture/additional_concepts/storage.html[persistent volume] and will survive a pod being restarted or recreated. This requires an admin to have set up and made available a persistent volume of sufficient size. Running with persistent storage is highly recommended if you require your data to be guarded against loss.
Running with non-persistent storage means that any stored data will be deleted when the pod is deleted or restarted. Only a restart of the container itself leaves the data intact. It is much easier to run with non-persistent data, but with the tradeoff of potentially losing this data. Non-persistent data should only be used when data loss under certain situations is acceptable.
When using persistent storage, make sure that the storage size is appropriate for your needs. The Cassandra database can and will use up all the space allocated to the persistent volume, which will cause serious errors.
You will need to monitor your data usage to make sure your database volume is sized properly.
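One way to keep an eye on volume usage is to parse the output of `df -P` for the filesystem that holds the Cassandra data. The following is a minimal sketch, not part of the deployer: the function names and the threshold of 80 percent are assumptions, and the data path has to be replaced with the one used in your environment.

```shell
# Read POSIX `df -P` output on stdin and print the used-space percentage
# (without the % sign) of the first listed filesystem.
parse_usage_percent() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print the used-space percentage of the filesystem holding path $1.
usage_percent() {
  df -P "$1" | parse_usage_percent
}

# Hypothetical check: warn when usage reaches the threshold ($2, default 80).
check_usage() {
  used=$(usage_percent "$1")
  threshold=${2:-80}
  if [ "$used" -ge "$threshold" ]; then
    echo "WARNING: $1 is ${used}% full"
  else
    echo "OK: $1 is ${used}% full"
  fi
}
```

For example, `check_usage /data/pv/cassandra_data 80` could be run on the host that provides the directory, or the same `df -P` call could be executed inside the pod with `oc exec`.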
==== Deployer Template
To deploy the cassandra cluster initially, you will need to deploy the ‘cassandra-deployer’ template.
If you are using non-persistent data, the following command will deploy a cassandra cluster without requiring a persistent volume to be created beforehand:
oc process -f cassandra-deployer.yaml -v USE_PERSISTENT_STORAGE=false,CLUSTER_NAME=my-cluster,IMAGE_PREFIX=vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-,IMAGE_VERSION=latest,MEMORY_LIMIT=1073741824,CPU_LIMIT=500,CASSANDRA_NODES=2,CASSANDRA_SEEDS=1 | oc create -f -
If you are using persistent data with NFS Storage, the following command will deploy the cassandra cluster but requires a storage volume of sufficient size to be available:
oc process -f cassandra-deployer.yaml -v USE_PERSISTENT_STORAGE=true,CLUSTER_NAME=my-cluster,IMAGE_PREFIX=vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-,IMAGE_VERSION=latest,MEMORY_LIMIT=1073741824,CPU_LIMIT=500,CASSANDRA_NODES=2,CASSANDRA_SEEDS=1,CASSANDRA_PV_SIZE=1Gi | oc create -f -
Both calls create a cluster consisting of three nodes: 1 seed node and 2 nodes. Each uses 1 GiB (1073741824 bytes) of RAM and 500 millicores. To deploy, for example, a cluster with 5 nodes consisting of 2 seeds and 3 nodes, each with 2 GiB of RAM and 2 cores, use: ‘CASSANDRA_NODES=3,CASSANDRA_SEEDS=2,MEMORY_LIMIT=2147483648,CPU_LIMIT=2000’.
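Because MEMORY_LIMIT is given in bytes and CPU_LIMIT in millicores, it can help to compute the values instead of typing them by hand. The following sketch shows the arithmetic; the helper names are hypothetical and not part of the template.

```shell
# Convert whole GiB to bytes for MEMORY_LIMIT.
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Convert whole cores to millicores for CPU_LIMIT.
cores_to_millicores() {
  echo $(( $1 * 1000 ))
}

# Hypothetical helper: assemble the sizing part of the -v parameter list.
# $1 = seeds, $2 = non-seed nodes, $3 = GiB per node, $4 = cores per node
sizing_params() {
  echo "CASSANDRA_NODES=$2,CASSANDRA_SEEDS=$1,MEMORY_LIMIT=$(gib_to_bytes "$3"),CPU_LIMIT=$(cores_to_millicores "$4")"
}

# 2 seeds and 3 nodes, each with 2 GiB of RAM and 2 cores
sizing_params 2 3 2 2
# prints CASSANDRA_NODES=3,CASSANDRA_SEEDS=2,MEMORY_LIMIT=2147483648,CPU_LIMIT=2000
```

The printed string can be appended to the remaining parameters of the `oc process` call.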
Deployer Template Parameters
The following table describes the deployer’s template parameters.
|Parameter Name |Description |Required
|Specify prefix for cassandra components; e.g. for “vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-cassandra:latest”, set prefix “vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-”
|Specify version for cassandra image; e.g. for “vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-cassandra:1.0.0”, set version “1.0.0”
|Not used. If set to true, the deployer will delete and redeploy all the existing cassandra components. All persisted and non-persisted data will be lost.
|Can be used to set the deployment options.
deploy is to be used to perform the initial install.
remove is to be used to perform the deletion of all resources created.
refresh will delete and redeploy all existing cassandra components except the persistent volume claims and route. Persisted data will remain available but all non-persisted data will be lost.
redeploy will delete and redeploy all existing cassandra components. All persisted and non-persisted data will be lost.
|Set to true to use persistent storage, set to false to use non-persistent storage
|The number of Cassandra Nodes to deploy for the initial cluster
|The number of Cassandra Seeds to deploy for the initial cluster
|The persistent volume size for each of the Cassandra nodes
|* not supported *
|Memory resources provided for each Cassandra node. Provide this value in bytes.
Default: “8589934592” (8 GiB)
|CPU resources provided for each Cassandra node. Provide this value in millicores.
Default: “2000” (2 cores)
|Number of seed nodes that should be created in the cluster.
|Number of nodes that should be created in the cluster.
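Since the deployment option accepts only the four values deploy, remove, refresh and redeploy, a wrapper script can reject anything else before calling `oc process`. This is a small sketch under the assumption that the option is passed as the MODE parameter, as in the remove command of this document; the function name is hypothetical.

```shell
# Return 0 if the argument is one of the documented deployment options.
is_valid_mode() {
  case "$1" in
    deploy|remove|refresh|redeploy) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: refuse to continue on an unknown mode.
mode="refresh"
if is_valid_mode "$mode"; then
  echo "MODE=$mode"
else
  echo "unknown mode: $mode" >&2
fi
```

Note that, according to the descriptions above, only refresh keeps persisted data; redeploy discards it.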
== Using hostmounted directories
It is recommended to run cassandra with local storage. This can be achieved by using direct-attached storage from the host machine.
First, create an initial cassandra cluster with ephemeral storage. Then follow the instructions below to replace the ephemeral storage with persistent host storage.
=== Run a cassandra node on a dedicated openshift-node
Every cassandra node should be deployed on a dedicated OpenShift node. To achieve this, label the OpenShift node.
oc label node <node-name> <key=value>, e.g.:
oc label node vmapgnodapp4n1.appad4.tsi-af.de cassandra=seed-1
=== Add node selector to deployment configuration of the cassandra node
With the nodeSelector you make sure that a pod is scheduled only on OpenShift nodes that carry the label.
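One possible way to set the nodeSelector is a JSON patch on the pod template of the deployment configuration. The sketch below only builds and prints the patch; the function name is hypothetical, and the label cassandra=seed-1 and the deployment configuration cassandra-seed-1 are taken from the examples in this document.

```shell
# Build a JSON patch that sets a nodeSelector on the pod template.
# $1 = label key, $2 = label value
node_selector_patch() {
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"%s":"%s"}}}}}\n' "$1" "$2"
}

patch=$(node_selector_patch cassandra seed-1)
echo "$patch"

# To apply it (requires a logged-in oc client), something like:
#   oc patch dc cassandra-seed-1 -p "$patch"
```

Alternatively, the same nodeSelector entry can be added by hand with `oc edit dc cassandra-seed-1`.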
=== Allow serviceaccount cassandra to access host mounted filesystem
Add the hostaccess security context constraint to the service account:
oadm policy add-scc-to-user hostaccess \
Note: replace cassandra-prj with your project
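Policy commands address a service account by its fully qualified name of the form system:serviceaccount:<project>:<name>. The following sketch assembles that name and prints the resulting command. It assumes the service account is called cassandra (as in the heading of this section) and lives in the project cassandra-prj; the helper name is hypothetical.

```shell
# Build the fully qualified service account name expected by policy commands.
# $1 = project, $2 = service account name
sa_subject() {
  echo "system:serviceaccount:$1:$2"
}

# Assumed values: project cassandra-prj, service account cassandra.
subject=$(sa_subject cassandra-prj cassandra)

# Print the command instead of running it, so it can be reviewed first.
echo "oadm policy add-scc-to-user hostaccess $subject"
# prints oadm policy add-scc-to-user hostaccess system:serviceaccount:cassandra-prj:cassandra
```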
==== Access privileges for host directory
Make sure the host directory is accessible. Log in to the host, create the directory to be used and grant all users access to it.
Note: The directory should preferably be part of a logical volume group so that it can be extended.
For example, for the directory /data/pv/cassandra_data:
mkdir /data/pv/cassandra_data
chmod 777 /data/pv/cassandra_data
=== Add host mount to deployment
Add the host mount to the deployment configuration of the cassandra node (here seed-1):
oc set volume deploymentconfig/cassandra-seed-1 --add --overwrite --name=cassandra-data --type=hostPath --path=/data/pv/cassandra_data
== Adding a node
Adding a node after an initial setup should be as easy as setting up a cluster.
Use the template that has been provided by the initial setup.
=== Set number of replicas to 0
To prevent the pod from being deployed immediately, edit the template you want to use. Use the template ‘cassandra-node-emptydir’ to create a cassandra instance with ephemeral storage; use it as well to create a cassandra node with persistent storage provided by a host-mounted directory. When adding a node with persistent storage provided by a host-mounted directory, follow the steps above for using host-mounted directories. To add a cassandra instance with persistent storage provided by NFS, use the template ‘cassandra-node-pv’.
oc edit template <template-name>, e.g.
oc edit template cassandra-node-emptydir
Set replicas to 0 and save the changes.
=== Add a seed node with ephemeral storage
oc process cassandra-node-emptydir -v "IMAGE_PREFIX=,IMAGE_VERSION=,MASTER=true,NODE=,CLUSTER_NAME=,NODE_TYPE="seed",CPU_LIMIT=,MEMORY_LIMIT=" | oc create -f -
=== Add a node with ephemeral storage
oc process cassandra-node-emptydir -v "IMAGE_PREFIX=,IMAGE_VERSION=,NODE=,CLUSTER_NAME=,NODE_TYPE="node",CPU_LIMIT=,MEMORY_LIMIT=" | oc create -f -
=== Add a seed node with persistent storage provided by NFS
oc process cassandra-node-pv -v "IMAGE_PREFIX=,IMAGE_VERSION=,MASTER=true,NODE=,PV_SIZE=,CLUSTER_NAME=,NODE_TYPE="seed",CPU_LIMIT=,MEMORY_LIMIT=" | oc create -f -
=== Add a node with persistent storage provided by NFS
oc process cassandra-node-pv -v "IMAGE_PREFIX=,IMAGE_VERSION=,MASTER=false,NODE=<node-id>,PV_SIZE=,CLUSTER_NAME=,NODE_TYPE="node",CPU_LIMIT=,MEMORY_LIMIT=" | oc create -f -
Note: If you need information about which values to use, take a look at a deployment configuration that is already available.
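The commands above leave the parameter values to be filled in. Before passing a parameter list to `oc process`, a script can verify that no value was left empty. The following is a minimal sketch with a hypothetical function name; it assumes that values contain neither commas nor shell wildcard characters.

```shell
# Return 0 if every KEY=VALUE pair in a comma-separated list has a value.
# Otherwise report the offending keys on stderr and return 1.
check_params() {
  status=0
  old_ifs=$IFS
  IFS=,
  for pair in $1; do
    key=${pair%%=*}
    value=${pair#*=}
    # An empty value or a pair without "=" counts as missing.
    if [ -z "$value" ] || [ "$pair" = "$key" ]; then
      echo "missing value for $key" >&2
      status=1
    fi
  done
  IFS=$old_ifs
  return $status
}

# Usage, with the parameter list held in a variable:
#   check_params "$params" && oc process cassandra-node-emptydir -v "$params" | oc create -f -
```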
==== Add node with host mounted storage
If you need to set up a node with persistent storage provided by a host-mounted directory, follow the instructions above.
=== Start new deployment
After changing the deployment configuration to your needs, set replicas to 1 and start the deployment.
oc deploy <deploymentconfig-name> --latest
== Delete all artefacts
With the following command, every single resource created can be removed from the project.
oc process -f cassandra-deployer.yaml -v MODE=remove,IMAGE_PREFIX=vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-,IMAGE_VERSION=latest | oc create -f -
What should I use?::
Use persistent storage :-).