Chaos Monkey Alternatives - OpenShift

5 minute read

Monkey-Ops

Monkey-Ops is an open-source Chaos Monkey implementation written in Go and designed to be deployed alongside an OpenShift application. Monkey-Ops will randomly perform one of two possible attacks:

You can install Monkey-Ops either via Docker or as a separate OpenShift project.

Docker Installation

Create a Docker container with the following command. Be sure to replace TOKEN with your own OpenShift auth token and PROJECT_NAME with the appropriate value.

docker run produban/monkey-ops /monkey-ops \
  --TOKEN="<TOKEN>" \
  --PROJECT_NAME="chaos-demo" \
  --API_SERVER="https://api.starter-us-west-2.openshift.com:443" \
  --INTERVAL=30 \
  --MODE="background"

This will randomly execute one of the two possible attacks every INTERVAL seconds. If you wish to have more control over attacks, change MODE to "rest" and use the /chaos REST API to launch an attack.

OpenShift Installation

Installing Monkey-Ops as an OpenShift project is a bit more complex.

  1. Clone the Git repo to a local directory.

     git clone https://github.com/Produban/monkey-ops.git
    
  2. Create a monkey-ops.json file and paste the following, which will be used to create a Service Account.

     {
       "apiVersion": "v1",
       "kind": "ServiceAccount",
       "metadata": {
         "name": "monkey-ops"
       }
     }
    
  3. Create the OpenShift Service Account using the OpenShift CLI and grant it privileges for your project (e.g. chaos-demo).

     oc create -f monkey-ops.json && oc policy add-role-to-user edit system:serviceaccount:chaos-demo:monkey-ops
    
  4. Now create a new pod using the monkey-ops-template.yaml found in the Monkey-Ops project.

     oc create -f ./openshift/monkey-ops-template.yaml -n chaos-demo
    
  5. Finally, create a new app called monkey-ops and pass appropriate values for each PARAM indicating when and how attacks will be executed.

     oc new-app \
       --name=monkey-ops \
       --template=monkey-ops \
       --param APP_NAME=monkey-ops \
       --param INTERVAL=30 \
       --param MODE=background \
       --param TZ=America/Los_Angeles \
       --labels=app_name=monkey-ops -n chaos-demo
    

Engineering Chaos In OpenShift with Gremlin

Gremlin’s Failure as a Service simplifies your Chaos Engineering workflow for OpenShift by making it safe and effortless to execute Chaos Experiments across all application containers. As a distributed architecture OpenShift is particularly sensitive to instability and unexpected failures. Gremlin can perform a variety of attacks on your OpenShift applications including draining disk space, hogging CPU and memory, overloading IO, manipulating network traffic, terminating instances, and much more.

Check out this tutorial for installing Gremlin on CentOS or this guide for installing Gremlin on OpenShift via a Kubernetes DaemonSet to get started!

Pumba

As discussed in the Chaos Monkey Alternatives - Docker chapter, Pumba is a Chaos injection tool primarily built for Docker. However, it can also be deployed on Kubernetes and, by extension, on OpenShift using a DaemonSet. Pumba can stop, pause, kill, and remove containers, which means it works fairly well with OpenShift pods that are made up of one or more containers.

  1. To deploy Pumba in OpenShift nodes using a DaemonSet you must first add a security policy to allow the OpenShift developer user to administer Kubernetes clusters.

     oc adm policy --as system:admin add-cluster-role-to-user cluster-admin developer
    
  2. Add the privileged security context restraint to the default user for your project.

     oc adm policy add-scc-to-user privileged system:serviceaccount:<project>:default
    
  3. Set the allowHostDirVolumePlugin option to true in the restricted security restraint, which will allow OpenShift to connect to the Docker container.

     oc edit scc restricted
    
     # Please edit the object below. Lines beginning with a '#' will be ignored,
     # and an empty file will abort the edit. If an error occurs while saving this file will be
     # reopened with the relevant failures.
     #
     allowHostDirVolumePlugin: true
     allowHostIPC: false
     allowHostNetwork: false
     allowHostPID: false
     allowHostPorts: false
     allowPrivilegedContainer: false
     allowedCapabilities: null
     apiVersion: security.openshift.io/v1
     # [...]
    
  4. Download the pumba_openshift.yml file and modify it as necessary. By default every 30 seconds it will kill a container within a pod containing the string "hello" in its name.

     curl -O https://raw.githubusercontent.com/alexei-led/pumba/master/deploy/pumba_openshift.yml
    
     apiVersion: extensions/v1beta1
     kind: DaemonSet
     metadata:
       name: pumba
     spec:
       template:
         metadata:
           labels:
             app: pumba
           name: pumba
         spec:
           containers:
           - image: gaiaadm/pumba:master
             imagePullPolicy: Always
             name: pumba
             command: ["pumba"] 
             args: ["--random", "--debug", "--interval", "30s", "kill", "--signal", "SIGKILL", "re2:.*hello.*"]
             securityContext:
               runAsUser: 0
             volumeMounts:
               - name: dockersocket
                 mountPath: /var/run/docker.sock
           volumes:
             - hostPath:
                 path: /var/run/docker.sock
               name: dockersocket
    
  5. Finally, create the DaemonSet from the pumba_openshift.yml.

     oc create -f pumba_openshift.yml
     daemonset.extensions "pumba" created
    

That’s it. Now just add some pods to your project that match the regex used in the DaemonSet, if any, and Pumba should pick up on them and start killing them off. Check out this handy video tutorial for all the details.