Chaos Monkey Alternatives - OpenShift

Monkey-Ops

Monkey-Ops is an open-source Chaos Monkey implementation written in Go and designed to be deployed alongside an OpenShift application. Monkey-Ops will randomly perform one of two possible attacks:

Delete a random pod by calling the DELETE /api/v1/namespaces/{namespace}/pods Kubernetes API endpoint.
Scale the number of replicas for the associated deployment config by calling the PUT /oapi/v1/namespaces/{namespace}/deploymentconfigs/{name}/scale OpenShift API endpoint.
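
As a rough illustration of the two endpoints above, the following shell sketch assembles each URL from its parts. The function names (`pods_url`, `scale_url`), the example server, and the `my-app` deployment config are hypothetical and not part of Monkey-Ops. The commented `curl` line only shows how an auth token is typically sent as a bearer header.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Monkey-Ops) that assemble the two
# endpoint URLs named above from an API server, a namespace, and a name.

# pods_url API_SERVER NAMESPACE
pods_url() {
  printf '%s/api/v1/namespaces/%s/pods\n' "$1" "$2"
}

# scale_url API_SERVER NAMESPACE DEPLOYMENT_CONFIG
scale_url() {
  printf '%s/oapi/v1/namespaces/%s/deploymentconfigs/%s/scale\n' "$1" "$2" "$3"
}

# Example with placeholder values:
pods_url "https://api.example.com:443" "chaos-demo"
scale_url "https://api.example.com:443" "chaos-demo" "my-app"

# A read-only request listing pods would carry the token like this:
#   curl -H "Authorization: Bearer $TOKEN" "$(pods_url "$API_SERVER" "$PROJECT_NAME")"
```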

You can install Monkey-Ops either via Docker or as a separate OpenShift project.

Docker Installation

Create a Docker container with the following command. Be sure to replace TOKEN with your own OpenShift auth token and PROJECT_NAME with the appropriate value.

docker run produban/monkey-ops /monkey-ops \
  --TOKEN="<TOKEN>" \
  --PROJECT_NAME="chaos-demo" \
  --API_SERVER="https://api.starter-us-west-2.openshift.com:443" \
  --INTERVAL=30 \
  --MODE="background"

This will randomly execute one of the two possible attacks every INTERVAL seconds. If you wish to have more control over attacks, change MODE to "rest" and use the /chaos REST API to launch an attack.
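
Because a mistyped INTERVAL or MODE only shows up once the container is running, a small pre-flight check can help. The sketch below is a hypothetical helper, not part of Monkey-Ops, and it assumes that "background" and "rest" are the only two modes, as described above.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of Monkey-Ops): verifies the
# INTERVAL and MODE values before they are passed to `docker run`.

# check_settings INTERVAL MODE
# Prints "ok" and returns 0 when both values look valid;
# otherwise prints a message and returns 1.
check_settings() {
  interval=$1
  mode=$2
  case $interval in
    ''|*[!0-9]*)
      echo "INTERVAL must be a whole number of seconds"
      return 1
      ;;
  esac
  if [ "$interval" -le 0 ]; then
    echo "INTERVAL must be greater than zero"
    return 1
  fi
  case $mode in
    background|rest) ;;
    *)
      echo "MODE must be 'background' or 'rest'"
      return 1
      ;;
  esac
  echo "ok"
}

check_settings 30 background
```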

OpenShift Installation

Installing Monkey-Ops as an OpenShift project is a bit more complex.

Clone the Git repo to a local directory.

 git clone https://github.com/Produban/monkey-ops.git

Create a monkey-ops.json file and paste the following, which will be used to create a Service Account.

 {
   "apiVersion": "v1",
   "kind": "ServiceAccount",
   "metadata": {
     "name": "monkey-ops"
   }
 }

Create the OpenShift Service Account using the OpenShift CLI and grant it privileges for your project (e.g. chaos-demo).
```
 oc create -f monkey-ops.json && oc policy add-role-to-user edit system:serviceaccount:chaos-demo:monkey-ops
```
Now add the Monkey-Ops template to your project using the monkey-ops-template.yaml file found in the Monkey-Ops repository.
```
 oc create -f ./openshift/monkey-ops-template.yaml -n chaos-demo
```

Finally, create a new app called monkey-ops from the monkey-ops template, passing a value for each --param to control when and how attacks are executed.

 oc new-app \
   --name=monkey-ops \
   --template=monkey-ops \
   --param APP_NAME=monkey-ops \
   --param INTERVAL=30 \
   --param MODE=background \
   --param TZ=America/Los_Angeles \
   --labels=app_name=monkey-ops -n chaos-demo

Engineering Chaos In OpenShift with Gremlin

Gremlin’s Failure as a Service simplifies your Chaos Engineering workflow for OpenShift by making it safe and effortless to execute Chaos Experiments across all application containers. As a distributed architecture, OpenShift is particularly sensitive to instability and unexpected failures. Gremlin can perform a variety of attacks on your OpenShift applications, including draining disk space, hogging CPU and memory, overloading IO, manipulating network traffic, and terminating instances.

Check out this tutorial for installing Gremlin on CentOS or this guide for installing Gremlin on OpenShift via a Kubernetes DaemonSet to get started!

Pumba

As discussed in the Chaos Monkey Alternatives - Docker chapter, Pumba is a Chaos injection tool primarily built for Docker. However, it can also be deployed on Kubernetes and, by extension, on OpenShift using a DaemonSet. Pumba can stop, pause, kill, and remove containers, which means it works fairly well with OpenShift pods that are made up of one or more containers.

To deploy Pumba on OpenShift nodes using a DaemonSet, you must first grant the cluster-admin role to the OpenShift developer user so that it can administer the cluster.
```
 oc adm policy --as system:admin add-cluster-role-to-user cluster-admin developer
```

Add the privileged security context constraint (SCC) to the default service account for your project.

 oc adm policy add-scc-to-user privileged system:serviceaccount:<project>:default

Set the allowHostDirVolumePlugin option to true in the restricted security context constraint. This allows pods to mount host directories, which Pumba needs in order to reach the Docker socket on each node.

 oc edit scc restricted

 # Please edit the object below. Lines beginning with a '#' will be ignored,
 # and an empty file will abort the edit. If an error occurs while saving this file will be
 # reopened with the relevant failures.
 #
 allowHostDirVolumePlugin: true
 allowHostIPC: false
 allowHostNetwork: false
 allowHostPID: false
 allowHostPorts: false
 allowPrivilegedContainer: false
 allowedCapabilities: null
 apiVersion: security.openshift.io/v1
 # [...]
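
The same change can, in principle, be made without opening an editor. The sketch below only prints a JSON merge patch for the allowHostDirVolumePlugin field. The helper name `scc_patch` is hypothetical, and the commented `oc patch` line assumes a logged-in session with cluster-admin rights and a cluster version that accepts merge patches on this resource.

```shell
#!/bin/sh
# Sketch of a non-interactive alternative to `oc edit scc restricted`.
# scc_patch is a hypothetical helper; it only prints a JSON merge patch.

# scc_patch VALUE   (VALUE is true or false)
scc_patch() {
  printf '{"allowHostDirVolumePlugin": %s}\n' "$1"
}

scc_patch true

# Applying the patch (assumes cluster-admin rights and a logged-in session):
#   oc patch scc restricted --type=merge -p "$(scc_patch true)"
```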

Download the pumba_openshift.yml file and modify it as necessary. By default, every 30 seconds it kills a random container whose name matches the regular expression .*hello.*, that is, any name containing "hello".

 curl -O https://raw.githubusercontent.com/alexei-led/pumba/master/deploy/pumba_openshift.yml

 apiVersion: extensions/v1beta1
 kind: DaemonSet
 metadata:
   name: pumba
 spec:
   template:
     metadata:
       labels:
         app: pumba
       name: pumba
     spec:
       containers:
       - image: gaiaadm/pumba:master
         imagePullPolicy: Always
         name: pumba
         command: ["pumba"] 
         args: ["--random", "--debug", "--interval", "30s", "kill", "--signal", "SIGKILL", "re2:.*hello.*"]
         securityContext:
           runAsUser: 0
         volumeMounts:
           - name: dockersocket
             mountPath: /var/run/docker.sock
       volumes:
         - hostPath:
             path: /var/run/docker.sock
           name: dockersocket
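
Before deploying, it can be useful to preview which names the pattern in the args line would select. The following is a minimal dry-run sketch: `filter_targets` and the sample names are hypothetical, and `grep -E` stands in for Pumba's RE2 matching, an approximation that should hold for a simple pattern like this one.

```shell
#!/bin/sh
# Hypothetical dry run: shows which names the pattern from the DaemonSet
# args would select. grep -E approximates Pumba's RE2 matching here.

PATTERN='.*hello.*'

# filter_targets: reads one name per line on stdin and prints the names
# that match PATTERN. `|| true` keeps the exit status 0 when nothing matches.
filter_targets() {
  grep -E "$PATTERN" || true
}

# Sample names (made up for illustration):
printf '%s\n' hello-openshift-1 database-1 my-hello-app | filter_targets
```

With these sample names, only hello-openshift-1 and my-hello-app are selected. Narrowing the pattern in the DaemonSet args is the simplest way to limit what Pumba is allowed to kill.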

Finally, create the DaemonSet from the pumba_openshift.yml.

 oc create -f pumba_openshift.yml
 daemonset.extensions "pumba" created

That’s it. Now add some pods to your project whose names match the regex used in the DaemonSet, and Pumba should pick them up and start killing their containers. Check out this handy video tutorial for all the details.