Fourth - The Build System

So the engineers have written a bunch of code. Now what?

We have to build it so it can be deployed!

This brings me on to a very important sub-topic - namely 'do you need humans to test most code?'.

The best software engineers I have worked with always put tests for the pieces of code they are writing. Of course they also have a peer review system going where another good engineer will ensure they have actually written the appropriate tests before the code is submitted for a build/deploy. By doing this they enable two things:

no need for Quality Assurance (QA) staff for mindless testing of positive/negative/false positive testing of systems such as REST APIs
testing can be automated within the build system for positive/negative/false-positive testing...

Personally I'm a great fan of only putting actual people in positions where the human brain is still capable of being better than a 'dumb' computer. Artificial Intelligence (AI) can sound like a big deal, but in reality we aren't anywhere near being able to replace a human being able to look at a website and see formatting issues. So yes, I still believe in QA staff, just not in areas where we can automate their jobs ;)

OK, enough of a sidetrack. Given the engineers have written some awesome code that doesn't have any bugs, and have implemented their testing, we now have to get that code ready for the deploy system - truly the bridge between engineering and DevOps.

There are quite a few build engines out there - some are old, and some are new. Most of the old ones use an older way of thinking and therefore don't always mesh well with newer ways of working, or for that matter, deploying. That can be said about some of the newer ones as well of course!

Functionality of course is king, so having a list of requirements your build system must give you for how you want to run your deploys is very important - my list looks (at a high level) similar to this:

must be able to take code from github using a commit trigger
must be able to automatically configure a build environment based on packaging instructions within the code (i.e. package.json for a npm install)
must be able to install any needed tools within the build environment
must be able to notify both a chat system (I use Slack for example) as well as the deploy system (a custom node app in my case)
must have an interface for abstracting any private information needed for triggering the deploy (API key or similar) so it is not located in the code
must be easily configurable
must have testing components available, and must fail to continue if a test fails during the build (important for deploys!)
must have the ability to add in custom actions (bash for example) as part of the build config
must have a debug interface (i.e. can SSH to the build server part way through a build if needed for debugging)
must have a good GUI for interacting with jobs and the ability to quickly and easily cancel a running job in an emergency
must be cheap or free - after all, no humans are involved so the cost savings are huge and must be passed on as exactly that - cost savings
must be able to run a build without a deploy being triggered based on who just committed code or where the code is being committed to (a feature branch for example)
optional: ideally no management of the build servers needed (just say no to the legacy 'run this server 24x7 just in case you need to run a build' approach)

Sounds like a horrible list of requirements right? I know of many systems that do some, or even most of those, but so far have only come across one that fits every requirement I have wanted, and then some.

There is one major name that I've found most companies use by default: Jenkins. If they aren't using Jenkins, I've found most of the remaining companies end up using Bamboo just because they are already tied into the Atlasssian suite of tools, or Travis which isn't free (hence the plans link) but is hosted and is therefore the main competitor for what I use personally.

I've also seen a ton of custom written build systems (bunches of perl scripts are not uncommon) - these always end up with one scenario though - the guy that wrote them is invaluable for the consistent building of the applications involved, and replacing that person is always more painful than its worth.

My ambition in this area? Try to use systems that any good DevOps person can pick up in a matter of minutes (ok, maybe hours).

Jenkins

So it turns out Jenkins is Free. But only if you ignore two important pieces - the systems you need in order to have Jenkins running, and the people you require in order to keep it running (any '24x7' server requires DevOps people - that's just life!). If you use Jenkins, you will have an additional overhead beyond what is required for just maintaining your deploy system.

Other than that, it's a very good system, albeit one that due to its more complex nature (1000+ configuration module available and counting!), also requires more care and feeding that in my opinion could be better spent on other activities regardless of having to also ensure the servers are kept up to date etc.

Bamboo

Bamboo is really great if you are already heavily into Atlasssian products. This does however mean that the company doesn't make it easy to work with other systems. Why would they do that you ask? Well it's obvious - they want everybody to use their entire suite, and from a business perspective I totally agree with that as a concept. I just don't use them in reality as the 'suite' approach is not always (and hardly ever is) the right one.

Travis

Travis is a great tool, albeit quite expensive if you aren't involved in open source software. I think it's great that they give back to the open source community, but in the end I'm not always involved in open source, so I would have to pay to use the service, and I don't believe the expense outweighs the benefits when compared to my product of choice.

So what do I use (and therefore recommend)? CircleCI.

CircleCI

First and foremost, CircleCI meets every one of my requirements. Including price. This is no small accomplishment as I'm very demanding when it comes to my services, and I really don't like paying passed-on costs such as for people where people are not necessary.

CircleCI also bring another hidden requirement that we all have, but usually forget about:

must have good customer support if the system is unreliable/unstable, alternatively optional good customer support if the system is reliable/unstable

Let's just say CircleCI have a very reliable/stable system, and somehow also manage to have great customer support. Nay, lets instead say they have great customer service. They really do.

Can you tell I'm a fan? :)

In the end, we have chosen to use GitHub for our source control, and while there are many other code control systems out there, and there are many other build systems out there that support GitHub, CircleCI hits the mark on all the important stuff that we require, and then some.

For the overall develop/build/deploy flow, whenever any branch of a repo in GitHub has a commit, CircleCI will run a build. With our configuration, if that build is into a specific branch (development or production), then a deploy will also happen.

Regardless of if a deploy is going to happen, CircleCI provides a great interface and alerting system for if (and why!) a build fails, and for the engineers they get to see that their code (and tests!) are working as planned. Failed builds never result in a deploy, and therein lies the beauty of a good code, build and then deploy system.

On the pricing side, originally CircleCI charged a token amount (~$20/month) for the capability to run a private system build. Last year they removed that charge for the first instance and only charge if you would like parallelism or two simultaneous builds (or more if you want to buy more). I find I'm more than happy to buy multiple containers as more consistent builds are required, although this is tempered with how long a build takes and how often those commits happen. On most of our projects is actually quite hard for most of the engineers to write enough code to consistently keep a single build node running given how fast the build itself takes.

Deploys are a different issue of course. Given the nature of High Availability (HA) in the 'cloud', rolling deploys (update 1 of x sets of servers, then the 2nd set, then the 3rd set and so on) are mandatory, and while a build may take 3 minutes, a deploy can easily take 15 - so running multiple deploys at once for most cloud based HA scenarios is a big no.

Well, after four main articles, we are getting out of theory, and I'll start to get into practical examples. Now that we have a way for engineers to submit code (GitHub), a build system (CircleCI) that triggers deploys (Rundeck which uses Ansible), we have the overall flow covered and hopefully the actual examples will make more sense.