Chaos Monkey Alternatives - OpenShift
Monkey-Ops
Monkey-Ops is an open-source Chaos Monkey implementation written in Go and designed to be deployed alongside an OpenShift application. Monkey-Ops will randomly perform one of two possible attacks:
- Delete a random pod by calling the DELETE /api/v1/namespaces/{namespace}/pods Kubernetes API endpoint.
- Scale the number of replicas for the associated deployment config by calling the PUT /oapi/v1/namespaces/{namespace}/deploymentconfigs/{name}/scale OpenShift API endpoint.
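To make the two attacks concrete, here is a minimal Python sketch that builds, but does not send, the corresponding HTTP requests. It is not Monkey-Ops code (Monkey-Ops is written in Go). The function names, the bearer-token header, the pod name appended to the pods path, and the shape of the scale request body are all assumptions made for illustration and should be checked against your cluster's API documentation.

```python
import json
import random
import urllib.request

# Hypothetical labels for the two attacks described above.
ATTACKS = ("delete_pod", "scale")


def pick_attack(rng=random):
    """Choose one of the two attacks at random."""
    return rng.choice(ATTACKS)


def build_attack_request(api_server, namespace, token, attack, name, replicas=1):
    """Build (without sending) the HTTP request for one attack.

    `name` is the pod name for "delete_pod" and the deployment config
    name for "scale".
    """
    base = api_server.rstrip("/")
    headers = {"Authorization": f"Bearer {token}"}  # assumed auth scheme

    if attack == "delete_pod":
        # Assumption: a single pod is addressed by appending its name
        # to the pods collection path quoted in the text.
        url = f"{base}/api/v1/namespaces/{namespace}/pods/{name}"
        return urllib.request.Request(url, headers=headers, method="DELETE")

    if attack == "scale":
        url = f"{base}/oapi/v1/namespaces/{namespace}/deploymentconfigs/{name}/scale"
        # Assumption: illustrative body shape; verify before real use.
        body = {
            "kind": "Scale",
            "apiVersion": "extensions/v1beta1",
            "metadata": {"name": name, "namespace": namespace},
            "spec": {"replicas": replicas},
        }
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
        return urllib.request.Request(url, data=data, headers=headers, method="PUT")

    raise ValueError(f"unknown attack: {attack}")
```

Keeping request construction separate from sending makes it possible to inspect exactly what an attack would do before pointing it at a real cluster.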
You can install Monkey-Ops either via Docker or as a separate OpenShift project.
Docker Installation
Create a Docker container with the following command. Be sure to replace TOKEN with your own OpenShift auth token and PROJECT_NAME with the appropriate value.
docker run produban/monkey-ops /monkey-ops \
--TOKEN="<TOKEN>" \
--PROJECT_NAME="chaos-demo" \
--API_SERVER="https://api.starter-us-west-2.openshift.com:443" \
--INTERVAL=30 \
--MODE="background"
This will randomly execute one of the two possible attacks every INTERVAL seconds. If you wish to have more control over attacks, change MODE to "rest" and use the /chaos REST API to launch an attack.
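The background mode described above amounts to a simple loop: pick one attack at random, run it, wait INTERVAL seconds, repeat. The following Python sketch illustrates that behavior under stated assumptions. The function name, the `attacks` mapping, and the order of attacking before sleeping are hypothetical and not taken from the Monkey-Ops source.

```python
import random
import time


def run_background(attacks, interval_seconds, rounds, sleep=time.sleep, rng=random):
    """Run one randomly chosen attack per interval, `rounds` times.

    `attacks` maps an attack name to a zero-argument callable.
    Returns the names of the attacks that ran, in order.
    """
    executed = []
    for _ in range(rounds):
        name, action = rng.choice(list(attacks.items()))
        action()                  # launch the chosen attack
        executed.append(name)
        sleep(interval_seconds)   # wait before the next attack
    return executed


if __name__ == "__main__":
    # Placeholder attacks that only print; no cluster is contacted.
    demo_attacks = {
        "delete_pod": lambda: print("would delete a random pod"),
        "scale": lambda: print("would scale a deployment config"),
    }
    # A no-op sleep keeps the demo instant.
    run_background(demo_attacks, 30, rounds=3, sleep=lambda seconds: None)
```

Injecting `sleep` and `rng` keeps the loop testable without waiting for real intervals.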
OpenShift Installation
Installing Monkey-Ops as an OpenShift project is a bit more complex.
- Clone the Git repo to a local directory.
  git clone https://github.com/Produban/monkey-ops.git
- Create a monkey-ops.json file and paste the following, which will be used to create a Service Account.
  {
    "apiVersion": "v1",
    "kind": "ServiceAccount",
    "metadata": {
      "name": "monkey-ops"
    }
  }
- Create the OpenShift Service Account using the OpenShift CLI and grant it privileges for your project (e.g. chaos-demo).
  oc create -f monkey-ops.json && oc policy add-role-to-user edit system:serviceaccount:chaos-demo:monkey-ops
- Now load the monkey-ops-template.yaml found in the Monkey-Ops project into your OpenShift project, so that the next step can reference it by template name.
  oc create -f ./openshift/monkey-ops-template.yaml -n chaos-demo
- Finally, create a new app called monkey-ops and pass appropriate values for each PARAM indicating when and how attacks will be executed.
  oc new-app \
    --name=monkey-ops \
    --template=monkey-ops \
    --param APP_NAME=monkey-ops \
    --param INTERVAL=30 \
    --param MODE=background \
    --param TZ=America/Los_Angeles \
    --labels=app_name=monkey-ops -n chaos-demo
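The installation steps above can be parameterized so that the project name, interval, and mode are set in one place. The Python sketch below only generates the Service Account manifest and the command argument lists; it does not run anything. The function names and defaults are hypothetical, and the template path assumes the commands are run from inside the cloned repository, as in the steps above.

```python
import json


def service_account_manifest(name="monkey-ops"):
    """Return the Service Account definition as a dict."""
    return {"apiVersion": "v1", "kind": "ServiceAccount", "metadata": {"name": name}}


def install_commands(project="chaos-demo", app="monkey-ops", service_account="monkey-ops",
                     interval=30, mode="background", tz="America/Los_Angeles"):
    """Return the install steps as argument lists, in order."""
    sa_user = f"system:serviceaccount:{project}:{service_account}"
    return [
        ["git", "clone", "https://github.com/Produban/monkey-ops.git"],
        ["oc", "create", "-f", "monkey-ops.json"],
        ["oc", "policy", "add-role-to-user", "edit", sa_user],
        ["oc", "create", "-f", "./openshift/monkey-ops-template.yaml", "-n", project],
        ["oc", "new-app",
         f"--name={app}",
         "--template=monkey-ops",
         "--param", f"APP_NAME={app}",
         "--param", f"INTERVAL={interval}",
         "--param", f"MODE={mode}",
         "--param", f"TZ={tz}",
         f"--labels=app_name={app}",
         "-n", project],
    ]


if __name__ == "__main__":
    # Print the manifest and a readable form of each command for review.
    print(json.dumps(service_account_manifest(), indent=2))
    for argv in install_commands():
        print(" ".join(argv))
```

Printing the commands for review before running them is a deliberate choice here, since they change privileges in the target project.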
Engineering Chaos In OpenShift with Gremlin
Gremlin’s Failure as a Service simplifies your Chaos Engineering workflow for OpenShift by making it safe and effortless to execute Chaos Experiments across all application containers. As a distributed architecture, OpenShift is particularly sensitive to instability and unexpected failures. Gremlin can perform a variety of attacks on your OpenShift applications, including draining disk space, hogging CPU and memory, overloading IO, manipulating network traffic, and terminating instances.
Check out this tutorial for installing Gremlin on CentOS or this guide for installing Gremlin on OpenShift via a Kubernetes DaemonSet to get started!
Pumba
As discussed in the Chaos Monkey Alternatives - Docker chapter, Pumba is a Chaos injection tool primarily built for Docker. However, it can also be deployed on Kubernetes and, by extension, on OpenShift using a DaemonSet. Pumba can stop, pause, kill, and remove containers, which means it works fairly well with OpenShift pods that are made up of one or more containers.
- To deploy Pumba on OpenShift nodes using a DaemonSet, you must first add a security policy to allow the OpenShift developer user to administer Kubernetes clusters.
  oc adm policy --as system:admin add-cluster-role-to-user cluster-admin developer
- Add the privileged security context constraint to the default service account for your project.
  oc adm policy add-scc-to-user privileged system:serviceaccount:<project>:default
- Set the allowHostDirVolumePlugin option to true in the restricted security context constraint, which will allow pods to mount host paths such as the Docker socket that Pumba uses.
  oc edit scc restricted
  # Please edit the object below. Lines beginning with a '#' will be ignored,
  # and an empty file will abort the edit. If an error occurs while saving this file will be
  # reopened with the relevant failures.
  #
  allowHostDirVolumePlugin: true
  allowHostIPC: false
  allowHostNetwork: false
  allowHostPID: false
  allowHostPorts: false
  allowPrivilegedContainer: false
  allowedCapabilities: null
  apiVersion: security.openshift.io/v1
  # [...]
- Download the pumba_openshift.yml file and modify it as necessary. By default, every 30 seconds it will kill a container within a pod containing the string "hello" in its name.
  curl -O https://raw.githubusercontent.com/alexei-led/pumba/master/deploy/pumba_openshift.yml

  apiVersion: extensions/v1beta1
  kind: DaemonSet
  metadata:
    name: pumba
  spec:
    template:
      metadata:
        labels:
          app: pumba
        name: pumba
      spec:
        containers:
        - image: gaiaadm/pumba:master
          imagePullPolicy: Always
          name: pumba
          command: ["pumba"]
          args: ["--random", "--debug", "--interval", "30s", "kill", "--signal", "SIGKILL", "re2:.*hello.*"]
          securityContext:
            runAsUser: 0
          volumeMounts:
            - name: dockersocket
              mountPath: /var/run/docker.sock
        volumes:
          - hostPath:
              path: /var/run/docker.sock
            name: dockersocket
- Finally, create the DaemonSet from the pumba_openshift.yml file.
  oc create -f pumba_openshift.yml
  daemonset.extensions "pumba" created
That’s it. Now just add some pods to your project that match the regex used in the DaemonSet, if any, and Pumba should pick up on them and start killing them off. Check out this handy video tutorial for all the details.
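Before deploying, it can help to check which names the DaemonSet's target pattern would select. The Python sketch below approximates that check. It assumes the re2: prefix marks a regular expression and that a target without the prefix is an exact name; it also uses Python's re module, which agrees with RE2 on a simple pattern like .*hello.* but is not identical in general. The function name and sample names are hypothetical.

```python
import re

RE2_PREFIX = "re2:"


def targeted_containers(names, target="re2:.*hello.*"):
    """Return the names that a Pumba-style target would select."""
    if target.startswith(RE2_PREFIX):
        # Strip the prefix and do an unanchored regex search.
        pattern = re.compile(target[len(RE2_PREFIX):])
        return [name for name in names if pattern.search(name)]
    # Assumption: without the prefix, the target is an exact name.
    return [name for name in names if name == target]


if __name__ == "__main__":
    sample = ["hello-openshift-1", "postgres", "k8s_hello_web"]
    print(targeted_containers(sample))
```

Running a check like this against a list of real container names before applying the DaemonSet reduces the chance of killing something unintended.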