The Iconic Mr. Nick Russo joins us once again for today's topic of CI/CD ... See Why?! Today we are picking up where Nick and I stopped in ZNDP Episode 30 - Designing for DevOps. If you didn't get a chance to listen to that episode yet, you can find it at zigbits.tech/30, and if you need some context around DevOps you should definitely give it a listen first. For today's topic we are discussing CI/CD, which is Continuous Integration / Continuous Delivery/Deployment. And we are live in 3...2...1!!! :)
Continuous Integration in CI/CD
Traditionally, developers would work in their own feature/topic branches for months, then deal with what is called "merge hell" toward the end of the project when entering the "integration testing" phase. This phase typically takes weeks or months, as combining many silos of work is fraught with problems. This goes way beyond software development; the "divide and conquer" approach for any complex project is best accomplished with continuous lateral communication. CI means merging code together regularly (daily), which kicks off a comprehensive set of tests to ensure code functionality and quality. CI provides immediate feedback on any problems that arise.
CI is commonly associated with software development, but it can be applied more generically. In this use case, it is applied to a complex Ansible library used at a large customer.
Continuing from last time (ZNDP Episode 30), we discussed our specific purpose, along with the processes and tools to support it. How do we ensure quality? How do we know the machine built the right product? We checked the inputs; what about the fabrication steps?
Consider an automotive assembly operation (greatly oversimplified). First build the frame, then put the wheels on, then the body, then the glass. If the frame is defective, why put the wheels on? Let's try to achieve "quality at the source" through all the integration steps. CI runs whenever code is committed or merged.
The CI Process in CI/CD:
1. Lint all the code. Look for syntax/styling issues, and run static code analysis to detect security threats. This takes seconds. Fail fast!
2. Unit (filter) tests. Execute the individual units locally to ensure they function.
3. Role tests. Roles typically rely on filters, so run those against some virtual devices.
4. Playbook tests. Playbooks typically rely on roles, so run those against some virtual devices. For cost/runtime savings, you can stub out the virtual devices and provide mock data to the rest of the playbook, but this is less effective. This takes tens of minutes.
It usually doesn’t make sense to continue to the next step if a previous one fails. For example, if static code analysis reveals a security flaw, executing the code may pose a security risk.
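The fail-fast ordering of the four stages above can be sketched as a small pipeline runner. This is a minimal illustration, not a real CI system; the stage commands (`yamllint`, `pytest` paths) are assumptions about project tooling and would be replaced with whatever your pipeline actually calls.

```python
import subprocess
import sys

# Stages ordered fastest-first so we "fail fast": cheap linting runs in
# seconds, while full playbook tests against virtual devices take tens of
# minutes. Commands here are illustrative placeholders.
STAGES = [
    ("lint", ["yamllint", "."]),
    ("unit (filter) tests", ["pytest", "tests/unit"]),
    ("role tests", ["pytest", "tests/roles"]),
    ("playbook tests", ["pytest", "tests/playbooks"]),
]

def run_pipeline(stages=STAGES):
    """Run each stage in order; abort on the first failure."""
    for name, cmd in stages:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"Stage '{name}' failed; skipping remaining stages.")
            return False
        print(f"Stage '{name}' passed.")
    return True

if __name__ == "__main__":
    sys.exit(0 if run_pipeline() else 1)
```

Real CI servers (GitLab CI, Jenkins, GitHub Actions, etc.) express the same idea declaratively, but the logic is the same: each stage gates the next.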
Bonus feature: CI integrates with chat programs (chatops) to notify developers on-the-fly about activities. New comment/issue, code committed, pipeline pass/fail, etc. Common chat tools like Slack, Ryver, and Mattermost are used for this.
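As a rough sketch of that chatops integration, here is how a pipeline step might post a notification to a chat tool. The payload shape follows Slack's incoming-webhook convention (a JSON body with a "text" field); the webhook URL and message format are assumptions, and other tools like Ryver or Mattermost have similar but not identical APIs.

```python
import json
from urllib import request

def format_message(event, detail):
    """Build a Slack-style webhook payload for a CI event.

    The "[CI]" prefix is just an illustrative convention.
    """
    return {"text": f"[CI] {event}: {detail}"}

def notify(webhook_url, event, detail):
    """POST a chatops notification to an incoming webhook.

    Fire-and-forget for brevity; production code would handle errors.
    """
    req = request.Request(
        webhook_url,
        data=json.dumps(format_message(event, detail)).encode(),
        headers={"Content-Type": "application/json"},
    )
    request.urlopen(req)
```

A pipeline would call something like `notify(url, "pipeline", "pass")` at the end of a run, or on every new comment, commit, or failure.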
Consider even more creative examples, such as Nick's OSPF Cisco Live discussion. The configs and markdown READMEs are up on GitHub. Here is Nick's GitHub repository link. There is no real code here, just text documents and a diagram.
Nick's Troubleshooting OSPF - BRKRST-3010 Cisco Live Session:
- Session Presentation
- Video Recording
What's the value of CI in CI/CD?
First, linting the markdown documents ensures they look fresh and high quality. Second, maybe we can do some basic quality checking of the device configurations. There are 19 devices in the lab, and there are two folders (initial and final configs) each with 19 configs, for a total of 38. We can ensure that there are exactly 38 config files. We can also search the files for critical information, such as the author's name and email (for assistance/questions). Last, we can check that the hostname inside each config (R1) matches the filename (R1.txt) once the file extension is removed.
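Those three config checks could be scripted in a few lines for the CI pipeline. This is a sketch under assumptions: the folder names ("initial", "final"), the ".txt" extension, and the author string to search for are all guesses about the repository layout, not details confirmed in the episode.

```python
import re
from pathlib import Path

def check_configs(repo_root, author="Nick Russo", expected=19):
    """Sanity-check the lab configs; return a list of problem strings.

    Checks: each folder holds the expected number of configs, every
    config mentions the author, and the 'hostname' line matches the
    filename with the extension removed (R1 inside R1.txt).
    """
    problems = []
    for folder in ("initial", "final"):  # assumed folder names
        configs = sorted(Path(repo_root, folder).glob("*.txt"))
        if len(configs) != expected:
            problems.append(
                f"{folder}: expected {expected} configs, found {len(configs)}"
            )
        for cfg in configs:
            text = cfg.read_text()
            if author not in text:
                problems.append(f"{cfg.name}: missing author info")
            match = re.search(r"^hostname (\S+)", text, re.MULTILINE)
            if not match or match.group(1) != cfg.stem:
                problems.append(f"{cfg.name}: hostname != filename")
    return problems
```

An empty return list means all 38 configs passed; anything else fails the pipeline with a specific, actionable message.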