appagile-cassandra3

Deploy a Cassandra cluster on AppAgile

Version: Apache Cassandra, version 3.9

This deployment supports different storage models:

- ephemeral storage
- persistent storage with NFS storage or host mounted directories

It supports resource requests and limits for memory and CPU consumption.

If you want to create a cluster with host mounted volumes, first set up a cluster with ephemeral storage. Then follow the instructions to use host mounted volumes.
[%hardbreaks]
== Prerequisites

=== Make template available

Add the template to OpenShift in the namespace openshift.

$ oc create -f ose-artefacts/cassandra-deployer.yaml -n openshift

=== Project

Create a project for the cassandra cluster and its resources. The following examples use the project cassandra-prj.

oc new-project cassandra-prj

=== Service Account
A pod is used to set up, configure and deploy all the various cassandra resources. This deployer pod runs under the service account cassandra-deployer. Create the service account cassandra-deployer and let it edit resources.

Create the service account:

$ oc create -f - <<API
 apiVersion: v1
 kind: ServiceAccount
 metadata:
   name: cassandra-deployer
 secrets:
 - name: cassandra-deployer
API

Grant the service account permission to edit resources on OpenShift:

oc adm policy add-role-to-user edit system:serviceaccount:cassandra-prj:cassandra-deployer

== Deploying all resources of the Cassandra Cluster

=== Persistent Storage

You can deploy the cassandra cluster with or without persistent storage.

Running with persistent storage means that your data will be stored on a link:https://docs.openshift.org/latest/architecture/additional_concepts/storage.html[persistent volume] and can survive a pod being restarted or recreated. This requires an admin to have set up and made available a persistent volume of sufficient size. Running with persistent storage is highly recommended if you require your data to be guarded against loss.

Running with non-persistent storage means that any stored data will be deleted when the pod is deleted or recreated. Data survives only a restart of the container within the same pod. It is much easier to run with non-persistent data, but with the tradeoff of potentially losing this data. Running with non-persistent data should only be done when data loss under certain situations is acceptable.

[IMPORTANT]
====
When using persistent storage you will need to make sure that your storage size is appropriate for your needs. The Cassandra database can and will use up all the available space allocated to the persistent volume, which will cause serious errors.

You will need to monitor your data usage to make sure your database volume is sized properly.
====
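
One way to keep an eye on the volume is a small check against `df` output. The following is a minimal sketch and not part of this deployment: the function names `usage_percent` and `check_usage` and the threshold value are assumptions chosen for illustration.

```shell
#!/bin/sh
# usage_percent: read `df -P` output on stdin and print the use percentage
# (column 5, without the % sign) of the last data line.
usage_percent() {
  awk 'NR > 1 { sub(/%/, "", $5); pct = $5 } END { print pct }'
}

# check_usage DIR THRESHOLD: warn when the filesystem holding DIR
# is at or above THRESHOLD percent full.
check_usage() {
  pct=$(df -P "$1" | usage_percent)
  if [ "$pct" -ge "$2" ]; then
    echo "WARNING: $1 is ${pct}% full (threshold ${2}%)"
    return 1
  fi
  echo "OK: $1 is ${pct}% full"
}
```

For a host mounted directory this could be called on the host as `check_usage /data/pv/cassandra_data 80`, for example from a cron job.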

==== Deployer Template

To deploy the cassandra cluster initially, you will need to deploy the ‘cassandra-deployer’ template.

If you are using non-persistent data, the following command will deploy a cassandra cluster without requiring a persistent volume to be created beforehand:


oc process -f cassandra-deployer.yaml -v USE_PERSISTENT_STORAGE=false,CLUSTER_NAME=my-cluster,IMAGE_PREFIX=vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-,IMAGE_VERSION=latest,MEMORY_LIMIT=1073741824,CPU_LIMIT=500,CASSANDRA_NODES=2,CASSANDRA_SEEDS=1 | oc create -f -

If you are using persistent data with NFS Storage, the following command will deploy the cassandra cluster but requires a storage volume of sufficient size to be available:


oc process -f cassandra-deployer.yaml -v USE_PERSISTENT_STORAGE=true,CLUSTER_NAME=my-cluster,IMAGE_PREFIX=vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-,IMAGE_VERSION=latest,MEMORY_LIMIT=1073741824,CPU_LIMIT=500,CASSANDRA_NODES=2,CASSANDRA_SEEDS=1,CASSANDRA_PV_SIZE=1Gi | oc create -f -

Both calls create a cluster consisting of three nodes: 1 seed node and 2 regular nodes. Each node is limited to 1 GiB (1073741824 bytes) of RAM and 500 millicores. To deploy, for example, a cluster of 5 nodes consisting of 2 seeds and 3 regular nodes, each with 2 GiB of RAM and 2 cores, use: ‘CASSANDRA_NODES=3,CASSANDRA_SEEDS=2,MEMORY_LIMIT=2147483648,CPU_LIMIT=2000’.
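
The byte and millicore values can be computed instead of typed by hand. The following sketch is only an illustration: the helper names `gib_to_bytes`, `cores_to_millicores` and `cluster_params` are assumptions and are not provided by the template. It handles whole numbers only, so a value such as 500 millicores still has to be given directly.

```shell
#!/bin/sh
# Convert whole GiB to bytes, the unit expected by MEMORY_LIMIT.
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Convert whole cores to millicores, the unit expected by CPU_LIMIT.
cores_to_millicores() {
  echo $(( $1 * 1000 ))
}

# cluster_params SEEDS NODES GIB_PER_NODE CORES_PER_NODE:
# print the sizing part of the parameter list for `oc process -v`.
cluster_params() {
  echo "CASSANDRA_NODES=$2,CASSANDRA_SEEDS=$1,MEMORY_LIMIT=$(gib_to_bytes "$3"),CPU_LIMIT=$(cores_to_millicores "$4")"
}

# 2 seeds and 3 nodes, each with 2 GiB of RAM and 2 cores.
# prints: CASSANDRA_NODES=3,CASSANDRA_SEEDS=2,MEMORY_LIMIT=2147483648,CPU_LIMIT=2000
cluster_params 2 3 2 2
```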

Deployer Template Parameters

The following table contains descriptions of the deployer’s template parameters.

[cols="2,10,1",options="header"]
|===

|Parameter Name |Description |Required

|IMAGE_PREFIX
|Specify the prefix for cassandra components; e.g. for “vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-cassandra:latest”, set the prefix “vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-”

Default: “vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-”
|

|IMAGE_VERSION
|Specify version for cassandra image; e.g. for “vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-cassandra:1.0.0”, set version “1.0.0”

Default: “latest”
|

|REDEPLOY
|Not used. If set to true, the deployer would delete and redeploy all existing cassandra components, and all persisted and non-persisted data would be lost.

Default: “false”
|

|MODE
|Sets the deployment mode.

deploy performs the initial install.

remove deletes all resources that were created.

refresh deletes and redeploys all existing cassandra components except the persistent volume claims and route. Persisted data remains available, but all non-persisted data is lost.

redeploy deletes and redeploys all existing cassandra components. All persisted and non-persisted data is lost.

Default: “deploy”
|

|USE_PERSISTENT_STORAGE
|Set to true to use persistent storage, set to false to use non-persistent storage

Default: “true”
|

|CASSANDRA_NODES
|The number of Cassandra Nodes to deploy for the initial cluster

Default: “0”
|

|CASSANDRA_SEEDS
|The number of Cassandra Seeds to deploy for the initial cluster

Default: “1”
|

|CASSANDRA_PV_SIZE
|The persistent volume size for each of the Cassandra nodes

Default: “10Gi”
|

|DYNAMICALLY_PROVISION_STORAGE
|Not supported

Default: “-”
|

|MEMORY_LIMIT
|Memory resources provided for each Cassandra node. Provide this value in bytes.

Default: “8589934592” (8 GiB)

|

|CPU_LIMIT
|CPU resources provided for each Cassandra node. Provide this value in millicores.

Default: “2000” (2 cores)

|


|===
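
Because several MODE values delete data, a wrapper script may want to state the consequences before calling the deployer. The following is a sketch under that assumption: the function name `mode_effect` is hypothetical, and the messages only restate the MODE descriptions from the table.

```shell
#!/bin/sh
# mode_effect MODE: print what the given MODE does according to the
# parameter table, or fail for a value the table does not list.
mode_effect() {
  case "$1" in
    deploy)   echo "initial install" ;;
    remove)   echo "deletes all created resources" ;;
    refresh)  echo "keeps persistent volume claims and route; non-persisted data is lost" ;;
    redeploy) echo "all persisted and non-persisted data is lost" ;;
    *)        echo "unknown MODE: $1" >&2; return 1 ;;
  esac
}

mode_effect refresh
```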

== Using host mounted directories
It is recommended to run cassandra with local storage. This can be achieved by using direct attached storage from the host machine.
[%hardbreaks]
First, create an initial cassandra cluster with ephemeral storage. Then follow the instructions below to replace the ephemeral storage with persistent host storage.

[%hardbreaks]
=== Run a cassandra node on a dedicated openshift-node

Every cassandra node should be deployed on a dedicated OpenShift node. To achieve this, label the node:

oc label node <node-name> <key=value>

e.g.:

oc label node vmapgnodapp4n1.appad4.tsi-af.de cassandra=seed-1

=== Add node selector to deployment configuration of the cassandra node
The nodeSelector makes sure that a pod is scheduled only on OpenShift nodes that carry the given label.

e.g.:

 kind: DeploymentConfig
 spec:
   template:
     spec:
       nodeSelector:
         cassandra: "seed-1"
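
Instead of editing the deployment configuration by hand, the same nodeSelector can be expressed as a JSON patch. This is a sketch only: the helper name `node_selector_patch` is an assumption, and the block just prints the patch body rather than applying it.

```shell
#!/bin/sh
# node_selector_patch KEY VALUE: print a JSON patch that sets
# spec.template.spec.nodeSelector to the given label.
node_selector_patch() {
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"%s":"%s"}}}}}\n' "$1" "$2"
}

# Print the patch for the label used in the example above.
node_selector_patch cassandra seed-1
```

The printed patch could then be applied with `oc patch dc/cassandra-seed-1 -p "$(node_selector_patch cassandra seed-1)"`, assuming the deployment configuration is named cassandra-seed-1 as in the host mount example.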

=== Allow serviceaccount cassandra to access host mounted filesystem

Add the hostaccess security context constraint to the service account:

oc adm policy add-scc-to-user hostaccess system:serviceaccount:cassandra-prj:cassandra

Note: replace cassandra-prj with your project.

==== Access privileges for host directory
Make sure the host directory is accessible. Log in to the host, create the directory to be used and grant all users access to it.

Note: the directory should preferably be part of a logical volume group so that it can be extended later.
[%hardbreaks]
e.g. /data/pv/cassandra_data

mkdir /data/pv/cassandra_data

chmod 777 /data/pv/cassandra_data

=== Add host mount to deployment

Add the host mount to the deployment configuration of the cassandra node (here seed-1):

oc set volume deploymentconfig/cassandra-seed-1 --add --overwrite --name=cassandra-data --type=hostPath --path=/data/pv/cassandra_data

== Adding a node
Adding a node after an initial setup should be as easy as setting up a cluster.
Use the template that has been provided by the initial setup.

[%hardbreaks]
=== Set number of replicas to 0
To avoid deploying the pod immediately, edit the template you want to use. Use the template ‘cassandra-node-emptydir’ to create a cassandra instance with ephemeral storage. Use the same template to create a cassandra node with persistent storage provided by a host mounted directory, and then follow the steps above for host mounted directories. To add a cassandra instance with persistent storage provided by NFS, use the template ‘cassandra-node-pv’.


oc edit template <template-name>

e.g.:

oc edit template cassandra-node-emptydir

Set replicas to 0 and save the changes:


 spec:
   replicas: 0
   selector:
     name: cassandra-${NODE}

=== Add a seed node with ephemeral storage

oc process cassandra-node-emptydir -v "IMAGE_PREFIX=<image-prefix>,IMAGE_VERSION=<image-version>,MASTER=true,NODE=<node-id>,CLUSTER_NAME=<cluster-name>,NODE_TYPE=seed,CPU_LIMIT=<cpu-limit>,MEMORY_LIMIT=<memory-limit>" | oc create -f -

=== Add a node with ephemeral storage

oc process cassandra-node-emptydir -v "IMAGE_PREFIX=<image-prefix>,IMAGE_VERSION=<image-version>,NODE=<node-id>,CLUSTER_NAME=<cluster-name>,NODE_TYPE=node,CPU_LIMIT=<cpu-limit>,MEMORY_LIMIT=<memory-limit>" | oc create -f -

=== Add a seed node with persistent storage provided by NFS

oc process cassandra-node-pv -v "IMAGE_PREFIX=<image-prefix>,IMAGE_VERSION=<image-version>,MASTER=true,NODE=<node-id>,PV_SIZE=<pv-size>,CLUSTER_NAME=<cluster-name>,NODE_TYPE=seed,CPU_LIMIT=<cpu-limit>,MEMORY_LIMIT=<memory-limit>" | oc create -f -

=== Add a node with persistent storage provided by NFS

oc process cassandra-node-pv -v "IMAGE_PREFIX=<image-prefix>,IMAGE_VERSION=<image-version>,MASTER=false,NODE=<node-id>,PV_SIZE=<pv-size>,CLUSTER_NAME=<cluster-name>,NODE_TYPE=node,CPU_LIMIT=<cpu-limit>,MEMORY_LIMIT=<memory-limit>" | oc create -f -

Note: If you need information about which values to use, take a look at a deployment configuration that already exists.
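
The long comma-separated parameter list is easy to mistype, so it can help to assemble it from separate values and review the resulting command first. The following is a sketch under stated assumptions: the helper `join_params` and the node id `node-3` are hypothetical, and the remaining values are taken from the examples earlier in this document. The block only prints the command and does not run it.

```shell
#!/bin/sh
# join_params KEY=VALUE...: join the arguments with commas, the format
# used for `oc process -v` in this document.
join_params() {
  ( IFS=,; printf '%s\n' "$*" )
}

NODE_ID="node-3"   # hypothetical node id, replace with your own

PARAMS=$(join_params \
  "IMAGE_PREFIX=vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-" \
  "IMAGE_VERSION=latest" \
  "NODE=$NODE_ID" \
  "CLUSTER_NAME=my-cluster" \
  "NODE_TYPE=node" \
  "CPU_LIMIT=500" \
  "MEMORY_LIMIT=1073741824")

# Dry run: print the command for review instead of executing it.
echo "oc process cassandra-node-emptydir -v \"$PARAMS\" | oc create -f -"
```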

[%hardbreaks]
==== Add node with host mounted storage
If you need to set up a node with persistent storage provided by a host mounted directory, follow the instructions in the section on host mounted directories above.

[%hardbreaks]
=== Start new deployment

After changing the deployment configuration to your needs, set replicas to 1 and start the deployment.

oc deploy <deploymentconfig-name> --latest

== Delete all artefacts

With the following command, every single resource created can be removed from the project.

oc process -f cassandra-deployer.yaml -v MODE=remove,IMAGE_PREFIX=vmapgmucrep01.appacd.tsi-af.de:5000/public/appagile-cassandra3-,IMAGE_VERSION=latest | oc create -f -

[%hardbreaks]
[qanda]
What should I use?::
Use persistent storage :-).