In this article I want to share my experience of filing a feature request for a CNCF open source project, implementing the feature myself, going through the pull request process, and seeing it released.
The Open Source Project
Chaos Mesh is a CNCF open source sandbox project that lets you introduce chaos into your Kubernetes clusters through a growing set of chaos experiments. These experiments are defined as custom resources (via CRDs), so you just need to apply an experiment to your cluster for it to take effect, just as you would apply a deployment manifest.
One such Chaos Mesh experiment is called Pod Failure. It lets us target one or more pods by their labels and inject a fault by changing the container image(s) to a pause image, specifically the pause image from Google’s public gcr.io registry (gcr.io/google-containers/pause:latest). The effect is that the affected container(s) never enter a running state (they stay in a waiting state), so the pod as a whole never has all of its containers running and never becomes READY.
The feature request I posted (https://github.com/chaos-mesh/chaos-mesh/issues/1925) was to be able to specify a custom pause image when deploying Chaos Mesh to your clusters, rather than using the hard-coded gcr.io one. This matters for highly regulated companies that cannot access public Docker registries and/or have preventative policies in place (using a policy addon like OPA) that block public registries from being used at all.
The implementation is quite simple: expose the pause image as an environment variable that can be set at addon installation time via Helm values.
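To make the idea concrete, here is a minimal shell sketch of the override-with-default pattern. The real change lives in Chaos Mesh’s Go code and Helm chart; the variable name POD_FAILURE_PAUSE_IMAGE, the function resolve_pause_image and the private registry name are assumptions used purely for illustration.

```shell
# Illustrative only: resolve the pause image from an environment variable,
# falling back to the public default when the variable is unset or empty.
# POD_FAILURE_PAUSE_IMAGE is an assumed variable name.
DEFAULT_PAUSE_IMAGE="gcr.io/google-containers/pause:latest"

resolve_pause_image() {
  echo "${POD_FAILURE_PAUSE_IMAGE:-$DEFAULT_PAUSE_IMAGE}"
}

# With nothing set, the default image is used.
unset POD_FAILURE_PAUSE_IMAGE
default_image="$(resolve_pause_image)"

# A private-registry image (hypothetical name) overrides the default.
custom_image="$(POD_FAILURE_PAUSE_IMAGE="registry.example.com/pause:3.2" resolve_pause_image)"

echo "default: ${default_image}"
echo "custom:  ${custom_image}"
```

Keeping the public image as the fallback means existing installations behave exactly as before unless the operator opts in to a custom image.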
Where to start?
Once I knew the feature I wanted, and how I intended to add it, I followed the Chaos Mesh Contributing Guide: https://github.com/chaos-mesh/chaos-mesh/blob/master/CONTRIBUTING.md
Before we dive into the contributing guide: for this article, I’ll spin up a fresh Azure Ubuntu VM with Docker Community Edition installed and set up the initial software we need.
Next, SSH into the machine and first configure your Git username and email:
git config --global user.name "aido123"
git config --global user.email ***@gmail.com
Switch to the root user
Upgrade your Ubuntu machine and install make and build-essential tooling
apt-get -y upgrade
apt install make
apt install make-guile
apt install build-essential
Install Golang. I’ll install the version mentioned in the go.mod file (1.15)
tar -xvf go1.15.13.linux-amd64.tar.gz
mv go /usr/local
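The commands above assume the tarball has already been downloaded and that the Go binaries end up on your PATH. A possible sketch of those two missing pieces follows; the download URL and the helper name fetch_go are assumptions, so check the official Go downloads page for the exact link.

```shell
# Sketch: locate the Go tarball named above and put the toolchain on PATH.
# GO_URL is an assumed download location, not taken from the article.
GO_VERSION="1.15.13"
GO_TARBALL="go${GO_VERSION}.linux-amd64.tar.gz"
GO_URL="https://go.dev/dl/${GO_TARBALL}"

fetch_go() {
  # Downloads the tarball into the current directory (needs network access).
  wget "${GO_URL}"
}

# After `mv go /usr/local`, the binaries live in /usr/local/go/bin.
export PATH="${PATH}:/usr/local/go/bin"

echo "tarball: ${GO_TARBALL}"
echo "url:     ${GO_URL}"
```

Call fetch_go before the tar step, and run go version afterwards to confirm the toolchain is picked up.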
Follow Step 1 of the contributing guide: fork the Chaos Mesh repo, create a local branch on your fork, and check that the master code works as expected by running make check.
In Step 2, run the unit tests.
NOTE: I hit an issue in Step 2 where I needed to copy the kubebuilder binary to the /usr/local directory before I could proceed:
cp -r /home/myhynes/chaos-mesh/output/bin/kubebuilder /usr/local/
At this stage, you can make your changes, add unit tests as required, and run make test again. Once you are happy with your local tests, you can test manually in a kind cluster.
As per Step 3, a set of scripts builds each Docker image required for the Chaos Mesh addon, installs a kind Kubernetes cluster on your machine, loads each locally built Chaos Mesh image onto every node in the kind cluster, installs the CRDs, and installs Chaos Mesh using the locally built images. You can then perform your manual tests.
In my case, I’ll deploy a simple nginx deployment.
- name: nginx
- containerPort: 80
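Only part of the manifest is shown above. A minimal complete deployment could look like the sketch below; the app: nginx label, the replica count and the image tag are assumptions chosen for illustration, not the exact values I used.

```shell
# Write a minimal nginx Deployment manifest to a file.
# The label app=nginx, replicas=1 and nginx:latest are assumed values.
cat > nginx-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
EOF

# Apply it to the kind cluster with:
#   kubectl apply -f nginx-deployment.yaml
```

The pod template label is what a chaos experiment selects on, so it is worth choosing it deliberately.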
I’ll then add a Pod Failure chaos experiment targeting this deployment by its labels.
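A Pod Failure experiment for such a deployment might look roughly like the sketch below. The experiment name, the default namespace and the app: nginx label are assumptions, and the field names reflect my understanding of the PodChaos resource, so check the Chaos Mesh documentation for the release you are running.

```shell
# Write a minimal PodChaos manifest that selects pods labelled app=nginx.
# The name, namespace and label values are assumed for illustration.
cat > pod-failure.yaml <<'EOF'
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: nginx-pod-failure
  namespace: default
spec:
  action: pod-failure
  mode: one
  selector:
    namespaces:
      - default
    labelSelectors:
      app: nginx
EOF

# Apply it once Chaos Mesh and its CRDs are installed:
#   kubectl apply -f pod-failure.yaml
# Then check which image the targeted pod is running:
#   kubectl get pods -l app=nginx -o jsonpath='{.items[*].spec.containers[*].image}'
```

With mode: one, a single matching pod is chosen, which keeps the blast radius small while you verify the image swap.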
In my case, the container image in the nginx deployment gets replaced with the custom pause image I specified in the install.sh script.
Once you’re happy with your changes and tests, continue to Step 4 to sign off your commit and push your changes. It’s a good idea to run make check again at this stage to fix any linting issues.
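If you have not signed off a commit before, the sketch below shows what the -s flag does, using a throwaway repository so it is safe to run anywhere. The author name, email and commit message are placeholders, not values from my actual pull request.

```shell
# Demonstrate a signed-off commit in a temporary repository.
repo="$(mktemp -d)"
git -C "$repo" init -q
git -C "$repo" config user.name "Example Contributor"
git -C "$repo" config user.email "contributor@example.com"

echo "demo" > "$repo/README.md"
git -C "$repo" add README.md

# -s appends a Signed-off-by trailer built from user.name and user.email.
git -C "$repo" commit -q -s -m "Make pause image configurable"

last_message="$(git -C "$repo" log -1 --format=%B)"
echo "$last_message"

# In your fork you would then push the branch, for example:
#   git push origin <your-branch>
```

The trailer is what sign-off checks on a pull request look for, so it is easier to add it at commit time than to amend commits later.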
Now you can proceed to Step 5, where you can create a pull request for your feature/fix into the master branch.
Next, your code will be reviewed and various checks (make check, e2e, integration) will be kicked off, some manually and some triggered by the core contributor team. Once the core contributors are happy with the change and all the tests have passed, they add an LGTM (Looks Good To Me) comment. Once you have two of these, a core contributor can add a /merge comment, which the GitHub TiChi bot picks up to merge the changes into the master branch.
That’s it! Your changes in the master branch will now go into the next Chaos Mesh release. You can consult the Releases page to see how frequently releases happen: https://github.com/chaos-mesh/chaos-mesh/releases
Thanks for stopping by. I hope this article helps you on your open source contribution journey.