DevOps and CI/CD Resources

DZone's Featured DevOps and CI/CD Resources

Ansible by Example

By Jan-Rudolph Bührmann

In my previous posting, I explained how to run Ansible scripts using a Linux virtual machine on Windows Hyper-V. This article aims to ease novices into Ansible IAC at the hand of an example. The example being booting one's own out-of-cloud Kubernetes cluster. As such, the intricacies of the steps required to boot a local k8s cluster are beyond the scope of this article. The steps can, however, be studied at the GitHub repo, where the Ansible scripts are checked in. The scripts were tested on Ubuntu20, running virtually on Windows Hyper-V. Network connectivity was established via an external virtual network switch on an ethernet adaptor shared between virtual machines but not with Windows. Dynamic memory was switched off from the Hyper-V UI. An SSH service daemon was pre-installed to allow Ansible a tty terminal to run commands from. Bootstrapping the Ansible User Repeatability through automation is a large part of DevOps. It cuts down on human error, after all. Ansible, therefore, requires a standard way to establish a terminal for the various machines under its control. This can be achieved using a public/private key pairing for SSH authentication. The keys can be generated for an Elliptic Curve Algorithm as follows: ssh-keygen -f ansible -t ecdsa -b 521 The Ansible script to create and match an account to the keys is: YAML --- - name: Bootstrap ansible hosts: all become: true tasks: - name: Add ansible user ansible.builtin.user: name: ansible shell: /bin/bash become: true - name: Add SSH key for ansible ansible.posix.authorized_key: user: ansible key: "{{ lookup('file', 'ansible.pub') }" state: present exclusive: true # to allow revocation # Join the key options with comma (no space) to lock down the account: key_options: "{{ ','.join([ 'no-agent-forwarding', 'no-port-forwarding', 'no-user-rc', 'no-x11-forwarding' ]) }" # noqa jinja[spacing] become: true - name: Configure sudoers community.general.sudoers: name: ansible user: ansible state: present commands: ALL nopassword: true runas: ALL # ansible user should be able to impersonate someone else become: true Ansible is declarative, and this snippet depicts a series of tasks that ensure that: The Ansible user exists; The keys are added for SSH authentication and The Ansible user can execute with elevated privilege using sudo Towards the top is something very important, and it might go unnoticed under a cursory gaze: hosts: all What does this mean? The answer to this puzzle can be easily explained at the hand of the Ansible inventory file: YAML masters: hosts: host1: ansible_host: "192.168.68.116" ansible_connection: ssh ansible_user: atmin ansible_ssh_common_args: "-o ControlMaster=no -o ControlPath=none" ansible_ssh_private_key_file: ./bootstrap/ansible comasters: hosts: co-master_vivobook: ansible_connection: ssh ansible_host: "192.168.68.109" ansible_user: atmin ansible_ssh_common_args: "-o ControlMaster=no -o ControlPath=none" ansible_ssh_private_key_file: ./bootstrap/ansible workers: hosts: client1: ansible_connection: ssh ansible_host: "192.168.68.115" ansible_user: atmin ansible_ssh_common_args: "-o ControlMaster=no -o ControlPath=none" ansible_ssh_private_key_file: ./bootstrap/ansible client2: ansible_connection: ssh ansible_host: "192.168.68.130" ansible_user: atmin ansible_ssh_common_args: "-o ControlMaster=no -o ControlPath=none" ansible_ssh_private_key_file: ./bootstrap/ansible It is the register of all machines the Ansible project is responsible for. Since our example project concerns a high availability K8s cluster, it consists of sections for the master, co-masters, and workers. Each section can contain more than one machine. The root-enabled account atmin on display here was created by Ubuntu during installation. The answer to the question should now be clear — the host key above specifies that every machine in the cluster will have an account called Ansible created according to the specification of the YAML. The command to run the script is: ansible-playbook --ask-pass bootstrap/bootstrap.yml -i atomika/atomika_inventory.yml -K The locations of the user bootstrapping YAML and the inventory files are specified. The command, furthermore, requests password authentication for the user from the inventory file. The -K switch, on its turn, asks that the superuser password be prompted. It is required by tasks that are specified to be run as root. It can be omitted should the script run from the root. Upon successful completion, one should be able to login to the machines using the private key of the ansible user: ssh ansible@172.28.110.233 -i ansible Note that since this account is not for human use, the bash shell is not enabled. Nevertheless, one can access the home of root (/root) using 'sudo ls /root' The user account can now be changed to ansible and the location of the private key added for each machine in the inventory file: YAML host1: ansible_host: "192.168.68.116" ansible_connection: ssh ansible_user: ansible ansible_ssh_common_args: "-o ControlMaster=no -o ControlPath=none" ansible_ssh_private_key_file: ./bootstrap/ansible One Master To Rule Them All We are now ready to boot the K8s master: ansible-playbook atomika/k8s_master_init.yml -i atomika/atomika_inventory.yml --extra-vars='kubectl_user=atmin' --extra-vars='control_plane_ep=192.168.68.119' The content of atomika/k8s_master_init.yml is: YAML # k8s_master_init.yml - hosts: masters become: yes become_method: sudo become_user: root gather_facts: yes connection: ssh roles: - atomika_base vars_prompt: - name: "control_plane_ep" prompt: "Enter the DNS name of the control plane load balancer?" private: no - name: "kubectl_user" prompt: "Enter the name of the existing user that will execute kubectl commands?" private: no tasks: - name: Initializing Kubernetes Cluster become: yes # command: kubeadm init --pod-network-cidr 10.244.0.0/16 --control-plane-endpoint "{{ ansible_eno1.ipv4.address }:6443" --upload-certs command: kubeadm init --pod-network-cidr 10.244.0.0/16 --control-plane-endpoint "{{ control_plane_ep }:6443" --upload-certs #command: kubeadm init --pod-network-cidr 10.244.0.0/16 --upload-certs run_once: true #delegate_to: "{{ k8s_master_ip }" - pause: seconds=30 - name: Create directory for kube config of {{ ansible_user }. become: yes file: path: /home/{{ ansible_user }/.kube state: directory owner: "{{ ansible_user }" group: "{{ ansible_user }" mode: 0755 - name: Copy /etc/kubernetes/admin.conf to user home directory /home/{{ ansible_user }/.kube/config. copy: src: /etc/kubernetes/admin.conf dest: /home/{{ ansible_user }/.kube/config remote_src: yes owner: "{{ ansible_user }" group: "{{ ansible_user }" mode: '0640' - pause: seconds=30 - name: Remove the cache directory. file: path: /home/{{ ansible_user }/.kube/cache state: absent - name: Create directory for kube config of {{ kubectl_user }. become: yes file: path: /home/{{ kubectl_user }/.kube state: directory owner: "{{ kubectl_user }" group: "{{ kubectl_user }" mode: 0755 - name: Copy /etc/kubernetes/admin.conf to user home directory /home/{{ kubectl_user }/.kube/config. copy: src: /etc/kubernetes/admin.conf dest: /home/{{ kubectl_user }/.kube/config remote_src: yes owner: "{{ kubectl_user }" group: "{{ kubectl_user }" mode: '0640' - pause: seconds=30 - name: Remove the cache directory. file: path: /home/{{ kubectl_user }/.kube/cache state: absent - name: Create Pod Network & RBAC. become_user: "{{ ansible_user }" become_method: sudo become: yes command: "{{ item }" with_items: kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml - pause: seconds=30 - name: Configure kubectl command auto-completion for {{ ansible_user }. lineinfile: dest: /home/{{ ansible_user }/.bashrc line: 'source <(kubectl completion bash)' insertafter: EOF - name: Configure kubectl command auto-completion for {{ kubectl_user }. lineinfile: dest: /home/{{ kubectl_user }/.bashrc line: 'source <(kubectl completion bash)' insertafter: EOF ... From the host keyword, one can see these tasks are only enforced on the master node. However, two things are worth explaining. The Way Ansible Roles The first is the inclusion of the atomika_role towards the top: YAML roles: - atomika_base The official Ansible documentation states that: "Roles let you automatically load related vars, files, tasks, handlers, and other Ansible artifacts based on a known file structure." The atomika_base role is included in all three of the Ansible YAML scripts that maintain the master, co-masters, and workers of the cluster. Its purpose is to lay the base by making sure that tasks common to all three member types have been executed. As stated above, an ansible role follows a specific directory structure that can contain file templates, tasks, and variable declaration, amongst other things. The Kubernetes and ContainerD versions are, for example, declared in the YAML of variables: YAML k8s_version: 1.28.2-00 containerd_version: 1.6.24-1 In short, therefore, development can be fast-tracked through the use of roles developed by the Ansible community that open-sourced it at Ansible Galaxy. Dealing the Difference The second thing of interest is that although variables can be passed in from the command line using the --extra-vars switch, as can be seen, higher up, Ansible can also be programmed to prompt when a value is not set: YAML vars_prompt: - name: "control_plane_ep" prompt: "Enter the DNS name of the control plane load balancer?" private: no - name: "kubectl_user" prompt: "Enter the name of the existing user that will execute kubectl commands?" private: no Here, prompts are specified to ask for the user that should have kubectl access and the IP address of the control plane. Should the script execute without error, the state of the cluster should be: atmin@kxsmaster2:~$ kubectl get pods -o wide -A NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kube-flannel kube-flannel-ds-mg8mr 1/1 Running 0 114s 192.168.68.111 kxsmaster2 <none> <none> kube-system coredns-5dd5756b68-bkzgd 1/1 Running 0 3m31s 10.244.0.6 kxsmaster2 <none> <none> kube-system coredns-5dd5756b68-vzkw2 1/1 Running 0 3m31s 10.244.0.7 kxsmaster2 <none> <none> kube-system etcd-kxsmaster2 1/1 Running 0 3m45s 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-apiserver-kxsmaster2 1/1 Running 0 3m45s 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-controller-manager-kxsmaster2 1/1 Running 7 3m45s 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-proxy-69cqq 1/1 Running 0 3m32s 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-scheduler-kxsmaster2 1/1 Running 7 3m45s 192.168.68.111 kxsmaster2 <none> <none> All the pods required to make up the control plane run on the one master node. Should you wish to run a single-node cluster for development purposes, do not forget to remove the taint that prevents scheduling on the master node(s). kubectl taint node --all node-role.kubernetes.io/control-plane:NoSchedule- However, a cluster consisting of one machine is not a true cluster. This will be addressed next. Kubelets of the Cluster, Unite! Kubernetes, as an orchestration automaton, needs to be resilient by definition. Consequently, developers and a buggy CI/CD pipeline should not touch the master nodes by scheduling load on it. Therefore, Kubernetes increases resilience by expecting multiple worker nodes to join the cluster and carry the load: ansible-playbook atomika/k8s_workers.yml -i atomika/atomika_inventory.yml The content of k8x_workers.yml is: YAML # k8s_workers.yml --- - hosts: workers, vmworkers remote_user: "{{ ansible_user }" become: yes become_method: sudo gather_facts: yes connection: ssh roles: - atomika_base - hosts: masters tasks: - name: Get the token for joining the nodes with Kuberenetes master. become_user: "{{ ansible_user }" shell: kubeadm token create --print-join-command register: kubernetes_join_command - name: Generate the secret for joining the nodes with Kuberenetes master. become: yes shell: kubeadm init phase upload-certs --upload-certs register: kubernetes_join_secret - name: Copy join command to local file. become: false local_action: copy content="{{ kubernetes_join_command.stdout_lines[0] } --certificate-key {{ kubernetes_join_secret.stdout_lines[2] }" dest="/tmp/kubernetes_join_command" mode=0700 - hosts: workers, vmworkers #remote_user: k8s5gc #become: yes #become_metihod: sudo become_user: root gather_facts: yes connection: ssh tasks: - name: Copy join command to worker nodes. become: yes become_method: sudo become_user: root copy: src: /tmp/kubernetes_join_command dest: /tmp/kubernetes_join_command mode: 0700 - name: Join the Worker nodes with the master. become: yes become_method: sudo become_user: root command: sh /tmp/kubernetes_join_command register: joined_or_not - debug: msg: "{{ joined_or_not.stdout }" ... There are two blocks of tasks — one with tasks to be executed on the master and one with tasks for the workers. This ability of Ansible to direct blocks of tasks to different member types is vital for cluster formation. The first block extracts and augments the join command from the master, while the second block executes it on the worker nodes. The top and bottom portions from the console output can be seen here: YAML janrb@dquick:~/atomika$ ansible-playbook atomika/k8s_workers.yml -i atomika/atomika_inventory.yml [WARNING]: Could not match supplied host pattern, ignoring: vmworkers PLAY [workers, vmworkers] ********************************************************************************************************************************************************************* TASK [Gathering Facts] ************************************************************************************************************************************************************************ok: [client1] ok: [client2] ........................................................................... TASK [debug] **********************************************************************************************************************************************************************************ok: [client1] => { "msg": "[preflight] Running pre-flight checks\n[preflight] Reading configuration from the cluster...\n[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'\n[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"\n[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"\n[kubelet-start] Starting the kubelet\n[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...\n\nThis node has joined the cluster:\n* Certificate signing request was sent to apiserver and a response was received.\n* The Kubelet was informed of the new secure connection details.\n\nRun 'kubectl get nodes' on the control-plane to see this node join the cluster." } ok: [client2] => { "msg": "[preflight] Running pre-flight checks\n[preflight] Reading configuration from the cluster...\n[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'\n[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"\n[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"\n[kubelet-start] Starting the kubelet\n[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...\n\nThis node has joined the cluster:\n* Certificate signing request was sent to apiserver and a response was received.\n* The Kubelet was informed of the new secure connection details.\n\nRun 'kubectl get nodes' on the control-plane to see this node join the cluster." } PLAY RECAP ************************************************************************************************************************************************************************************client1 : ok=3 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0 client1 : ok=23 changed=6 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0 client2 : ok=23 changed=6 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0 host1 : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0 Four tasks were executed on the master node to determine the join command, while 23 commands ran on each of the two clients to ensure they were joined to the cluster. The tasks from the atomika-base role accounts for most of the worker tasks. The cluster now consists of the following nodes, with the master hosting the pods making up the control plane: atmin@kxsmaster2:~$ kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME k8xclient1 Ready <none> 23m v1.28.2 192.168.68.116 <none> Ubuntu 20.04.6 LTS 5.4.0-163-generic containerd://1.6.24 kxsclient2 Ready <none> 23m v1.28.2 192.168.68.113 <none> Ubuntu 20.04.6 LTS 5.4.0-163-generic containerd://1.6.24 kxsmaster2 Ready control-plane 34m v1.28.2 192.168.68.111 <none> Ubuntu 20.04.6 LTS 5.4.0-163-generic containerd://1.6.24 With Nginx deployed, the following pods will be running on the various members of the cluster: atmin@kxsmaster2:~$ kubectl get pods -A -o wide NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES default nginx-7854ff8877-g8lvh 1/1 Running 0 20s 10.244.1.2 kxsclient2 <none> <none> kube-flannel kube-flannel-ds-4dgs5 1/1 Running 1 (8m58s ago) 26m 192.168.68.116 k8xclient1 <none> <none> kube-flannel kube-flannel-ds-c7vlb 1/1 Running 1 (8m59s ago) 26m 192.168.68.113 kxsclient2 <none> <none> kube-flannel kube-flannel-ds-qrwnk 1/1 Running 0 35m 192.168.68.111 kxsmaster2 <none> <none> kube-system coredns-5dd5756b68-pqp2s 1/1 Running 0 37m 10.244.0.9 kxsmaster2 <none> <none> kube-system coredns-5dd5756b68-rh577 1/1 Running 0 37m 10.244.0.8 kxsmaster2 <none> <none> kube-system etcd-kxsmaster2 1/1 Running 1 37m 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-apiserver-kxsmaster2 1/1 Running 1 37m 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-controller-manager-kxsmaster2 1/1 Running 8 37m 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-proxy-bdzlv 1/1 Running 1 (8m58s ago) 26m 192.168.68.116 k8xclient1 <none> <none> kube-system kube-proxy-ln4fx 1/1 Running 1 (8m59s ago) 26m 192.168.68.113 kxsclient2 <none> <none> kube-system kube-proxy-ndj7w 1/1 Running 0 37m 192.168.68.111 kxsmaster2 <none> <none> kube-system kube-scheduler-kxsmaster2 1/1 Running 8 37m 192.168.68.111 kxsmaster2 <none> <none> All that remains is to expose the Nginx pod using an instance of NodePort, LoadBalancer, or Ingress to the outside world. Maybe more on that in another article... Conclusion This posting explained the basic concepts of Ansible at the hand of scripts booting up a K8s cluster. The reader should now grasp enough concepts to understand tutorials and search engine results and to make a start at using Ansible to set up infrastructure using code. More

Paved vs. Golden Paths in Platform Engineering

By Steve Fenton

If you follow the platform engineering trend, you'll have heard people talking about paved paths and golden paths. They're sometimes used as synonyms but can also reflect different approaches. In this article, I discuss the critical difference between paved paths and golden paths in platform engineering. Paved Paths If you were a city planner designing a park, you'd need to provide areas for people to stop and routes to pass through. The perfect park is a public space you can see, stroll through, and use for recreational activities. One of the tricky parts of park design is where to place the paths. People who want to take a leisurely walk prefer a winding scenic stroll with pleasant views. But people passing through from the coffee shop to their office prefer more direct routes. Most parks offer winding trails through the park or a series of direct paths forming a giant X crisscrossing the park. As an alternative to planning the routes through the park, you can let people use it for a while. People crossing the park will wear tracks in the grass that indicate where paths may be most helpful. They literally vote with their feet. Building these paths after the demand for a route is visible means you're more likely to put them in the right place, though you can't please everyone. This approach is also dangerous, as Sam Walter Foss warned. His 1895 poem tells how a playful calf influences the design of a large city. The city's main road gets built around the trail the calf made through the woods some 300 years earlier. Paved Paths in Software You can use the paved path technique in software. You can observe how users currently achieve a goal and use what you find to generate a design for the software. Before people created software source control systems, the source wall was a common way to avoid change collisions. To create a source wall, you'd write each file name on a sticky note and add it to the wall. If you wanted to edit a file, you'd go to the source wall and find the file you wanted to change. If the sticky note was on the wall, you could take it back to your desk and make your edit. If you couldn't find the sticky note, you had to wait for its return before making your change. This manual process meant your changes would never clash, and you'd never overwrite another developer's changes. The first source control system paved this path. You'd check out a file, and the system would prevent another developer from changing it until you checked it back in. This pattern was the paved path equivalent of the source wall. If you use a modern source control system, you'll notice it doesn't work this way. That's because something better has replaced the paved path — a golden path. Golden Paths Going back to the city park example, if you had a design in mind for the use of different spaces, you might want to tempt people to take a slightly longer route that lets you make better use of the overall space. Instead of optimizing the park for commuters, you want to balance the many different uses. In this case, you'd need to find ways to attract people to your preferred route to avoid them damaging the grass and planting. In Brisbane, the park in the South Bank area features just such a path. Instead of offering an efficient straight line between common destinations, it has sweeping curves along its entire length. The path has a decorative arbor that provides shelter from the hot sun and light showers. Instead of attempting to block other routes with fences, people are attracted to the path because they can stay cool or dry. The Brisbane Grand Arbor walk is 150 meters longer than a straight-line route, but it creates spaces for restaurants, a pond, a rainforest walk, and a lagoon. Golden paths are a system-level design technique. They're informed by a deep understanding of the different purposes of the space. Golden Paths in Platform Engineering In Platform Engineering, Golden Paths are just like Brisbane's Grand Arbor. Instead of forcing developers to do things a certain way, you design the internal developer platform to attract developers by reducing their burden and removing pain points. It's the optimal space between anything goes and forced standardization. Golden paths provide a route toward alignment. Say you have 5 teams, all using different continuous integration tools. As a platform engineer, you'd work out the best way to build, test, and package all the software and provide this as a golden path. It needs to be better than what developers currently do and easy to adopt, as you can't force it on a team. The teams that adopt the golden path have an easy life as far as their continuous integration activities are concerned. Nothing makes a platform more attractive than seeing happy users. When done well, an internal development platform may feel like a paved path to the developers, but it should reduce the overall cognitive load. This often involves both consolidation and standardization. You won't solve all developer pain at once. Platform engineers will need to go and see what pain exists and think about how they might design a product that will remove it. When you start this journey, it's worth understanding the patterns and anti-patterns of platform engineering. Take the High Road World champion weightlifter Jerzy Gregorek once said: "Hard choices, easy life. Easy choices, hard life." You need to make many hard choices to create a great internal developer platform. You have to decide what problems the platform will solve and which it won't. You need to determine when a feature should flex to meet the needs of a development team and when you should let them strike out on their own path. These hard choices are the difference between a golden path and a paved path. With a paved path, you can reduce the burden on developers; the pain just moves into your platform team. A golden path will reduce the total cognitive load for everyone by dedicating the platform team to its elimination. Happy deployments! More

Platform Engineering Trends in Cloud Native: Q&A With Thomas Graf

By Tom Smith CORE

cdCon + GitOpsCon: Co-evolving Open Source DevOps Communities in One Conference

By Dwayne McDaniel

Avoid Merge Conflicts, Don't Manage Them

By Jonathan Hall

Automated Testing: The Missing Piece of Your CI/CD Puzzle

This is an article from DZone's 2023 Automated Testing Trend Report.For more: Read the Report DevOps and CI/CD pipelines help scale application delivery drastically — with some organizations reporting over 208 times more frequent code deployments. However, with such frequent deployments, the stability and reliability of the software releases often become a challenge. This is where automated testing comes into play. Automated testing acts as a cornerstone in supporting efficient CI/CD workflows. It helps organizations accelerate applications into production and optimize resource efficiency by following a fundamental growth principle: build fast, fail fast. This article will cover the importance of automated testing, some key adoption techniques, and best practices for automated testing. The Importance of Automated Testing in CI/CD Manual tests are prone to human errors such as incorrect inputs, misclicks, etc. They often do not cover a broad range of scenarios and edge cases compared to automated testing. These limitations make automated testing very important to the CI/CD pipeline. Automated testing directly helps the CI/CD pipeline through faster feedback cycles to developers, testing in various environments simultaneously, and more. Let's look at the specific ways in which it adds value to the CI/CD pipeline. Validate Quality of Releases Releasing a new feature is difficult and often very time-consuming. Automated testing helps maintain the quality of software releases, even on a tight delivery timeline. For example, automated smoke tests ensure new features work as expected. Similarly, automated regression tests check that the new release does not break any existing functionality. Therefore, development teams can have confidence in the release's reliability, quality, and performance with automated tests in the CI/CD pipeline. This is especially useful in organizations with multiple daily deployments or an extensive microservices architecture. Identify Bugs Early Another major advantage of automated testing in CI/CD is its ability to identify bugs early in the development cycle. Shifting testing activities earlier in the process (i.e., shift-left testing) can detect and resolve potential issues during the non-development phases. For example, instead of deploying a unit of code to a testing server and waiting for testers to find the bugs, you can add many unit tests in the test suite. This will allow developers to identify and fix issues on their local systems, such as data handling or compatibility with third-party services in the proof of concept (PoC) phase. Figure 1: Shift-left testing technique Faster Time to Market Automated testing can help reduce IT costs and ensure faster time to market, giving companies a competitive edge. With automated testing, the developer receives rapid feedback instantly. Thus, organizations can catch defects early in the development cycle and reduce the inherent cost of fixing them. Ease of Handling Changes Minor changes and updates are common as software development progresses. For example, there could be urgent changes based on customer feedback on a feature, or an issue in a dependency package, etc. With automated tests in place, developers receive quick feedback on all their code changes. All changes can be validated quickly, making sure that new functionalities do not introduce unintended consequences or regressions. Promote Collaboration Across Teams Automated testing promotes collaboration among development, testing, and operations teams through DevTestOps. The DevTestOps approach involves ongoing testing, integration, and deployment. As you see in Figure 2, the software is tested throughout the development cycle to proactively reduce the number of bugs and inefficiencies at later stages. Using automated testing allows teams to be on the same page regarding the expected output. Teams can communicate and align their understanding of the software requirements and expected behavior with a shared set of automated tests. Figure 2: DevTestOps approach Maintain Software Consistency Automated testing also contributes to maintaining consistency and agility throughout the CI/CD pipeline. Teams can confirm that software behaves consistently by generating and comparing multiple test results across different environments and configurations. This consistency is essential in achieving predictable outcomes and avoiding deployment issues. Adoption Techniques Adopting automated testing in a CI/CD pipeline requires a systematic approach to add automated tests at each stage of the development and deployment processes. Let's look at some techniques that developers, testers, and DevOps can follow to make the entire process seamless. Figure 3: Automated testing techniques in the CI/CD process Version Control for Test Data Using version control for your test assets helps synchronize tests with code changes, leading to collaboration among developers, testers, and other stakeholders. Organizations can effectively manage test scripts, test data, and other testing artifacts with a version control system, such as Git, for test assets. For example, a team can use centralized repositories to keep all test data in sync instead of manually sharing Java test cases between different teams. Using version control for your test data also allows for quick database backups if anything goes wrong during testing. Test data management involves strategies for handling test data, such as data seeding, database snapshots, or test data generation. Managing test data effectively ensures automated tests are performed with various scenarios and edge cases. Test-Driven Development Test-driven development (TDD) is an output-driven development approach where tests are written before the actual code, which guides the development process. As developers commit code changes, the CI/CD system automatically triggers the test suite to check that the changes adhere to the predefined requirements. This integration facilitates continuous testing, and allows developers to get instant feedback on the quality of their code changes. TDD also encourages the continuous expansion of the automated test suite, and hence, greater test coverage. Implement Continuous Testing By implementing continuous testing, automated tests can be triggered when code is changed, a pull request (PR) is created, a build is generated, or before a PR is merged within the CI/CD pipeline. This approach helps reduce the risk of regression issues, and ensures that software is always in a releasable state. With continuous testing integration, automated tests are seamlessly integrated into the development and release process, providing higher test coverage and early verification of non-functional requirements. Use Industry Standard Test Automation Frameworks Test automation frameworks are crucial to managing test cases, generating comprehensive reports, and seamlessly integrating with CI/CD tools. These frameworks provide a structured approach to organizing test scripts, reducing redundancy, and improving maintainability. Test automation frameworks offer built-in features for test case management, data-driven testing, and modular test design, which empower development teams to streamline their testing efforts. Example open-source test automation frameworks include — but are not limited to — SpecFlow and Maven. Low-Code Test Automation Frameworks Low-code test automation platforms allow testers to create automated tests with minimal coding by using visual interfaces and pre-built components. These platforms enable faster test script creation and maintenance, making test automation more accessible to non-technical team members. A few popular open-source low-code test automation tools include: Robot Framework Taurus Best Practices for Automated Testing As your automated test suite and test coverage grow, it's important to manage your test data and methods efficiently. Let's look at some battle-tested best practices to make your automated testing integration journey simpler. Parallel vs. Isolated Testing When implementing automated testing in CI/CD, deciding whether to execute tests in isolation or parallel is important. Isolated tests run independently and are ideal for unit tests, while parallel execution is great for higher-level tests such as integration and end-to-end tests. Prioritize tests based on their criticality and the time required for execution. To optimize testing time and accelerate feedback, consider parallelizing test execution. Developers can also significantly reduce the overall test execution time by running multiple tests simultaneously across different environments or devices. However, make sure to double-check that the infrastructure and test environment can handle the increased load to avoid any resource constraints that may impact test accuracy. DECISION MATRIX FOR ISOLATED vs. PARALLEL TESTING Factor Isolated Tests Parallel Tests Test execution time Slower execution time Faster execution time Test dependencies Minimal dependencies Complex dependencies Resources Limited resources Abundant resources Environment capacity Limited capacity High capacity Number of test cases Few test cases Many test cases Scalability Scalable Not easily scalable Resource utilization efficiency High Low Impact on CI/CD pipeline performance Minimal Potential bottleneck Testing budget Limited Sufficient Table 1 One-Click Migration Consider implementing a one-click migration feature in the CI/CD pipeline to test your application under different scenarios. Below is how you can migrate automated test scripts, configurations, and test data between different environments or testing platforms: Store your automated test scripts and configurations in version control. Create a containerized test environment. Create a build automation script to automate building the Docker image with the latest version of test scripts and all other dependencies. Configure your CI/CD tool (e.g., Jenkins, GitLab CI/CD, CircleCI) to trigger the automation script when changes are committed to the version control system. Define a deployment pipeline in your CI/CD tool that uses the Docker image to deploy the automated tests to the target environment. Finally, to achieve one-click migration, create a single button or command in your CI/CD tool's dashboard that initiates the deployment and execution of the automated tests. Use Various Testing Methods The next tip is to include various testing methods in your automated testing suite. Apart from traditional unit tests, you can incorporate smoke tests to quickly verify critical functionalities and regression tests to check that new code changes do not introduce regressions. Other testing types, such as performance testing, API testing, and security testing, can be integrated into the CI/CD pipeline to address specific quality concerns. In Table 2, see a comparison of five test types. COMPARISON OF VARIOUS TEST TYPES Test Type Goal Scope When to Perform Time Required Resources Required Smoke test Verify if critical functionalities work after changes Broad and shallow After code changes — build Quick — minutes to a few hours Minimal Sanity test Quick check to verify if major functionalities work Focused and narrow After smoke test Quick — minutes to a few hours Minimal Regression test Ensure new changes do not negatively impact existing features Comprehensive — retests everything After code changes — build or deployment Moderate — several hours to a few days Moderate Performance test Evaluate software's responsiveness, stability, and scalability Load, stress, and scalability tests Toward end of development cycle or before production release Moderate — several hours to a few days Moderate Security test Identify and address potential vulnerabilities and weaknesses Extensive security assessments Toward end of development cycle or before production release Moderate to lengthy — several days to weeks Extensive Table 2 According to the State of Test Automation Survey 2022, the following types of automation tests are preferred by most developers and testers because they have clear pass/fail results: Functional testing (66.5%) API testing (54.2%) Regression testing (50.5%) Smoke testing (38.2%) Maintain Your Test Suite Next, regularly maintain the automated test suite to match it to changing requirements and the codebase. An easy way to do this is to integrate automated testing with version control systems like Git. This way, you can maintain a version history of test scripts and synchronize your tests with code changes. Additionally, make sure to document every aspect of the CI/CD pipeline, including the test suite, test cases, testing environment configurations, and the deployment process. This level of documentation helps team members access and understand the testing procedures and frameworks easily. Documentation facilitates collaboration and knowledge sharing while saving time in knowledge transfers. Conclusion Automated testing processes significantly reduce the time and effort for testing. With automated testing, development teams can detect bugs early, validate changes quickly, and guarantee software quality throughout the CI/CD pipeline. In short, it helps development teams to deliver quality products and truly unlock the power of CI/CD. This is an article from DZone's 2023 Automated Testing Trend Report.For more: Read the Report

By Lipsa Das CORE

CI/CD Docker: How To Create a CI/CD Pipeline With Jenkins, Containers, and Amazon ECS

If you’re still building and delivering your software applications the traditional way, then you are missing out on a major innovation in the software development process or software development life cycle. To show you what I’m talking about, in this article, I will share how to create a CI/CD Pipeline with Jenkins, Containers, and Amazon ECS that deploys your application and overcomes the limitations of the traditional software delivery model. This innovation greatly affects deadlines, time to market, quality of the product, etc. I will take you through the whole step-by-step process of setting up a CI/CD Docker pipeline for a sample Node.js application. What Is a CI/CD Pipeline? A CI/CD Pipeline or Continuous Integration Continuous Delivery Pipeline is a set of instructions to automate the process of Software tests, builds, and deployments. Here are a few benefits of implementing CI/CD in your organization. Smaller code change: The ability of CI/CD Pipelines to allow the integration of a small piece of code at a time helps developers recognize any potential problem before too much work is completed. Faster delivery: Multiple daily releases or continual releases can be made a reality using CI/CD Pipelines. Observability: Having automation in place that generates extensive logs at each stage of the development process helps to understand if something goes wrong. Easier rollbacks: There are chances that the code that has been deployed may have issues. In such cases, it is very crucial to get back to the previous working release as soon as possible. One of the biggest advantages of using the CI/CD Pipelines is that you can quickly and easily roll back to the previous working release. Reduce costs: Having automation in place for repetitive tasks frees up the Developer and Operation guys’ time that could be spent on Product Development. Now, before we proceed with the steps to set up a CI/CD Pipeline with Jenkins, Containers, and Amazon ECS, let’s see, in short, what tools and technologies we will be using. CI/CD Docker Tool Stack GitHub: It is a web-based application or a cloud-based service where people or developers collaborate, store, and manage their application code using Git. We will create and store our sample Nodejs application code here. AWS EC2 Instance: AWS EC2 is an Elastic Computer Service provided by Amazon Web Services used to create Virtual Machines or Virtual Instances on AWS Cloud. We will create an EC2 instance and install Jenkins and other dependencies in it. Java: This will be required to run the Jenkins Server. AWS CLI: aws-cli i.e AWS Command Line Interface, is a command-line tool used to manage AWS Services using commands. We will be using it to manage AWS ECS Task and ECS Service. Node.js and NPM: Node.js is a back-end JavaScript runtime environment, and NPM is a package manager for Node. We will be creating a CI CD Docker Pipeline for the Node.js application. Docker: Docker is an open-source containerization platform used for developing, shipping, and running applications. We will use it to build Docker Images of our sample Node.js application and push/pull them to/from AWS ECR. Jenkins: Jenkins is an open-source, freely available automation server used to build, test, and deploy software applications. We will be creating our CI/CD Docker Pipeline to build, test, and deploy our Node.js application on AWS ECS using Jenkins AWS ECR: AWS Elastic Container Registry is a Docker Image Repository fully managed by AWS to easily store, share, and deploy container images. We will be using AWS ECR to store Docker Images of our sample Node.js application. AWS ECS: AWS Elastic Container Service is a container orchestration service fully managed by AWS to easily deploy, manage, and scale containerized applications. We will be using it to host our sample Node.js application. Architecture This is how our architecture will look like after setting up the CI/CD Pipeline with Docker. After the CI/CD Docker Pipeline is successfully set up, we will push commits to our GitHub repository, and in turn, GitHub Webhook will trigger the CI/CD Pipeline on Jenkins Server. Jenkins Server will then pull the latest code, perform unit tests, build a docker image, and push it to AWS ECR. After the image is pushed to AWS ECR, the same image will be deployed in AWS ECS by Jenkins. CI/CD Workflow and Phases Workflow CI and CD Workflow allows us to focus on Development while it carries out the tests, build, and deployments in an automated way. Continuous Integration: This allows the developers to push the code to the Version Control System or Source Code Management System, build & test the latest code pushed by the developer, and generate and store artifacts. Continuous Delivery: This is the process that lets us deploy the tested code to the Production whenever required. Continuous Deployment: This goes one step further and releases every single change without any manual intervention to the customer system every time the production pipeline passes all the tests. Phases The primary goal of the automated CI/CD pipeline is to build the latest code and deploy it. There can be various stages as per the need. The most common ones are mentioned below. Trigger: The CI/CD pipeline can do its job on the specified schedule when executed manually or triggered automatically on a particular action in the Code Repository. Code pull: In this phase, the pipeline pulls the latest code whenever the pipeline is triggered. Unit tests: In this phase, the pipeline performs tests that are there in the codebase. This is also referred to as unit tests. Build or package: Once all the tests pass, the pipeline moves forward and builds artifacts or docker images in case of dockerized applications. Push or store: In this phase, the code that has been built is pushed to the Artifactory or Docker Repository in case of dockerized applications. Acceptance tests: This phase or stage of the pipeline validates if the software behaves as intended. It is a way to ensure that the software or application does what it is meant to do. Deploy: This is the final stage in any CI/CD pipeline. In this stage, the application is ready for delivery or deployment. Deployment Strategy A deployment strategy is a way in which containers of the micro-services are taken down and added. There are various options available; however, we will only discuss the ones that are available and supported by ECS Rolling Updates In rolling updates, the scheduler in the ECS Service replaces the currently running tasks with new ones. The tasks in the ECS cluster are nothing but running containers created out of the task definition. Deployment configuration controls the number of tasks that Amazon ECS adds or removes from the service. The lower and the upper limit on the number of tasks that should be running is controlled by minimumHealthyPercent and maximumPercent, respectively. minimumHealthyPercent example: If the value of minimumHealthyPercent is 50 and the desired task count is four, then the scheduler can stop two existing tasks before starting two new tasks maximumPercent example: If the value of maximumPercent is four and the desired task is four, then the scheduler can start four new tasks before stopping four existing tasks. If you want to learn more about this, visit the official documentation here. Blue/Green Deployment Blue/Green deployment strategy enables the developer to verify a new deployment before sending traffic to it by installing an updated version of the application as a new replacement task set. There are primarily three ways in which traffic can shift during blue/green deployment. Canary — Traﬃc is shifted in two increments. The percentage of traﬃc shifted to your updated task set in the ﬁrst increment and the interval, in minutes, before the remaining traﬃc is shifted in the second increment. Linear — Traﬃc is shifted in equal increments, the percentage of traﬃc shifted in each increment, and the number of minutes between each increment. All-at-once — All traﬃc is shifted from the original task set to the updated task set all at once. To learn more about this, visit the official documentation here. Out of these two strategies, we will be using the rolling-updates deployment strategy in our demo application. Dockerize Node.js App Now, let’s get started and make our hands dirty. The Dockerfile for the sample Nodejs application is as follows. There is no need to copy-paste this file. It is already available in the sample git repository that you cloned previously. Let’s just try to understand the instructions of our Dockerfile. FROM node:12.18.4-alpineThis will be our base image for the container. WORKDIR /appThis will be set as a working directory in the container. ENV PATH /app/node_modules/.bin:$PATHPATH variable is assigned a path to /app/node_modules/.bin. COPY package.json ./Package.json will be copied in the working directory of the container. RUN npm installInstall dependencies. COPY . ./Copy files and folders with dependencies from the host machine to the container. EXPOSE 3000Allow to port 300 of the container. CMD [“node”, “./src/server.js”]Start the application This is the Docker file that we will use to create a Docker image. Setup GitHub Repositories Create a New Repository Go to GitHub, create an account if you don’t have it already else log in to your account and create a new repository. You can name it as per your choice; however, I would recommend using the same name to avoid any confusion. You will get the screen as follows: copy the repository URL and keep it handy. Call this URL a GitHub Repository URL and note it down in the text file on your system. Note: Create a new text file on your system and note down all the details that will be required later. Create a GitHub Token This will be required for authentication purposes. It will be used instead of a password for Git over HTTP or can be used to authenticate to the API over Basic Authentication. Click on the user icon in the top-right, go to “Settings,” then click on the “Developers settings” option in the left panel. Click on the “Personal access tokens” options and “Generate new token” to create a new token. Tick the “repo” checkbox. The token will then have “full control of private repositories” You should see your token created now. Clone the Sample Repository Check your present working directory.pwd Note: You are in the home directory, i.e.,/home/ubuntu. Clone my sample repository containing all the required code.git clone Create a new repository. This repository will be used for CI/CD Pipeline setup.git clone Copy all the code from my node.js repository to the newly created demo-nodejs-app repository.cp -r nodejs/* demo-nodejs-app/ Change your working directory.cd demo-nodejs-app/ Note: For the rest of the article, do not change your directory. Stay in the same directory. Here it is /home/ubuntu/demo-nodejs-app/, and execute all the commands from there. ls -l git status Push Your First Commit to the Repository Check your present working directory. It should be the same. Here it is:/home/ubuntu/demo-nodejs-app/pwd Set a username for your git commit message.git config user.name “Rahul” Set an email for your git commit message.git config user.email “<>” Verify the username and email you set.git config –list Check the status, see files that have been changed or added to your git repository.git status Add files to the git staging area.git add Check the status, see files that have been added to the git staging area.git status Commit your files with a commit message.git commit -m “My first commit” Push the commit to your remote git repository.git push Setup the AWS Infrastructure Create an IAM User With Programmatic Access Create an IAM user with programmatic access in your AWS account and note down the access key and secret key in your text file for future reference. Provide administrator permissions to the user. We don’t need admin access; however, to avoid permission issues and for the sake of the demo, let’s proceed with administrator access. Create an ECR Repository Create an ECR Repository in your AWS account and note its URL in your text file for future reference. Create an ECS Cluster Go to ECS Console and click on “Get Started” to create a cluster. Click on the “Configure” button available in the “custom” option under “Container definition.” Specify a name to the container as “nodejs-container,” the ECR Repository URL in the “Image” text box, and “3000” port in the Port mappings section, and then click on the “Update” button. You can specify any name of your choice for the container. You can now see the details you specified under “Container definition.” Click on the “Next” button to proceed. Select “Application Load Balancer” under “Define your service” and then click on the “Next” button. Keep the cluster name as “default” and proceed by clicking on the “Next” button. You can change the cluster name if you want. Review the configuration, and it should look as follows. If the configurations match, then click on the “Create” button. This will initiate the ECS Cluster creation. After a few minutes, you should have your ECS cluster created, and the Launch Status should be something as follows. Create an EC2 Instance for Setting up the Jenkins Server Create an EC2 Instance with Ubuntu 18.04 AMI and open Port 22 for your IP and Port 8080 for 0.0.0.0/0 in its Security Group. Port 22 will be required for ssh and 8080 for accessing the Jenkins Server. Port 8080 is where GitHub Webhook will try to connect to on Jenkins Server hence we need to allow it for 0.0.0.0/0 Setup Jenkins on the EC2 Instance After the instance is available, let’s install Jenkins Server on it along with all the dependencies. Prerequisites of the EC2 Instance Verify if the OS is Ubuntu 18.04 LTScat /etc/issue Check the RAM, minimum of 2 GB is what we require.free -m The User that you use to log in to the server should have sudo privileges. “ubuntu” is the user available with sudo privileges for EC2 instances created using “Ubuntu 18.04 LTS” AMI.whoami Check your present working directory, it will be your home directory.pwd Install Java, JSON Processor jq, Node.js/NPM, and aws-cli on the EC2 Instance Update your system by downloading package information from all configured sources.sudo apt update Search and Install Java 11sudo apt search openjdksudo apt install openjdk-11-jdk Install jq command, the JSON processor.sudo apt install jq Install Nodejs 12 and NPMcurl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash –sudo apt install nodejs Install aws cli tool.sudo apt install awscli Check the Java version.java –version Check the jq version.jq –version Check the Nodejs versionnode –version Check the NPM versionnpm –version Check the aws cli versionaws –version Note: Make sure all your versions match the versions seen in the above image. Install Jenkins on the EC2 Instance Jenkins can be installed from the Debian repositorywget -q -O – http://pkg.jenkins-ci.org/debian/jenkins-ci.org.key | sudo apt-key add -sudo sh -c ‘echo deb http://pkg.jenkins-ci.org/debian binary/ > /etc/apt/sources.list.d/jenkins.list’ Update the apt package indexsudo apt-get update Install Jenkins on the machinesudo apt-get install jenkins Check the service status if it is running or not.service jenkins status You should have your Jenkins up and running now. You may refer to the official documentation here if you face any issues with the installation. Install Docker on the EC2 Instance Install packages to allow apt to use a repository over HTTPS:sudo apt-get install apt-transport-https ca-certificates curl gnupg lsb-release Add Docker’s official GPG key:curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg –dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg Set up the stable repositoryecho “deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable” | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null Update the apt package indexsudo apt-get update Install the latest version of Docker Engine and containerd,sudo apt-get install docker-ce docker-ce-cli containerd.io Check the docker version.docker –version Create a “docker” group, this may exit.sudo groupadd docker Add “ubuntu” user to the “docker” groupsudo usermod -aG docker ubuntu Add “jenkins” user to the “docker” groupsudo usermod -aG docker jenkins Test if you can create docker objects using “ubuntu” user.docker run hello-world Switch to “root” usersudo -i Switch to “jenkins” usersu jenkins Test if you can create docker objects using “jenkins” user.docker run hello-world Exit from “jenkins” userexit Exit from “root” userexit Now you should be back in “ubuntu” user. You may refer to the official documentation here if you face any issues with the installation. Configure the Jenkins Server After Jenkins has been installed, the first step is to extract its password.sudo cat /var/lib/jenkins/secrets/initialAdminPassword Hit the URL in the browserJenkins URL: http://<public-ip-of-the-ec2-instace>:8080 Select the “Install suggested plugins” option Specify the user-name, and password for the new admin user to be created. You can use this user as an admin user. This URL field will be auto-filled. Click on the “Save and Finish” button to proceed. Your Jenkins Server is ready now. Here is what its Dashboard looks like. Install Plugins Let’s install all the plugins that we will need. Click on “Manage Jenkins” in the left panel. Here is a list of plugins that we need to install CloudBees AWS Credentials:Allows storing Amazon IAM credentials keys within the Jenkins Credentials API. Docker Pipeline:This plugin allows building, testing, and using Docker images from Jenkins Pipeline. Amazon ECR:This plugin provides integration with AWS Elastic Container Registry (ECR)Usage: AWS Steps:This plugin adds Jenkins pipeline steps to interact with the AWS API. In the “Available” tab, search all these plugins and click on “Install without restart.” You will see the screen as follows after the plugins have been installed successfully. Create Credentials in Jenkins CloudBees AWS Credentials plugin will come to the rescue here. Go to “Manage Jenkins,” and then click on “Manage Credentials." Click on “(global)” “Add credentials”. Select Kind as “AWS Credentials” and provide ID as “demo-admin-user.” This can be provided as per your choice. Keep a note of this ID in the text file. Specify the Access Key and Secret Key of the IAM user we created in the previous steps. Click on “OK” to store the IAM credentials. Follow the same step, and this time select Kind as “Username with password” to store the GitHub Username and Token we created earlier. Click on “Ok” to store the GitHub credentials. You should now have IAM and GitHub credentials in your Jenkins. Create a Jenkins Job Go to the main dashboard and click on “New Item” to create a Jenkins Pipeline. Select the “Pipeline” and name it “demo-job,” or provide a name of your choice. Tick the “GitHub project” checkbox under the “General” tab, and provide the GitHub Repository URL of the one we created earlier. Also, tick the checkbox “GitHub hook trigger for GitScm polling” under the “Build Trigger” tab. Under the “Pipeline” tab, select “Pipeline script from the SCM” definition, specify our repository URL, and select the credential we created for Github. Check the branch name if it matches the one you will be using for your commits. Review the configurations and click on “Save” to save your changes to the pipeline. Now you can see the pipeline we just created. Integrate GitHub and Jenkins The next step is to integrate Github with Jenkins so that whenever there is an event on the Github Repository, it can trigger the Jenkins Job. Go to the settings tab of the repository and click on “Webhooks” in the left panel. You can see the “Add webhook” button. Click on it to create a webhook. Provide the Jenkins URL with context as “/github-webhook/.” The URL will look as follows.Webhook URL: http://<Jenkins-IP>:8080/github-webhook/You can select the events of your choice; however, for the sake of simplicity, I have chosen “Send me everything.” Make sure the “Active” checkbox is checked. Click on “Add webhook” to create a webhook that will trigger the Jenkins job whenever there is any kind of event in the GitHub Repository. You should see your webhook. Click on it to see if it has been configured correctly or not. Click on the “Recent Deliveries” tab, and you should see a green tick mark. The green tick mark shows that the webhook was able to connect to the Jenkins Server. Deploy the Node.js Application to the ECS Cluster Before we trigger the Pipeline from GitHub Webhook, let's try to execute it manually. Build the Job Manually Go to the Job we created and Build it. If you see its logs, you will see that it failed. The reason is we have not yet assigned values to the variable we have in our Jenkinsfile. Push Your Second Commit Reminder Note: For the rest of the article, do not change your directory. Stay in the same directory, i.e.,/home/ubuntu/demo-nodejs-app, and execute all the commands from here. Assign values to the variable in the Jenkinsfile To overcome the above error, you need to make some changes to the Jenkinsfile. We have variables in that file, and we need to assign values to those variables to deploy our application to the ECS cluster we created. Assign correct values to the variables having “CHANGE_ME.”cat Jenkinsfile Here is the list of variables for your convenience.We have the following variables in the Jenkinsfile. AWS_ACCOUNT_ID=”CHANGE_ME”Assign your AWS Account Number here. AWS_DEFAULT_REGION=”CHANGE_ME”Assign the region you created your ECS Cluster in CLUSTER_NAME=”CHANGE_ME”Assign the name of the ECS Cluster that you created. SERVICE_NAME=”CHANGE_ME”Assign the Service name that got created in the ECS Cluster. TASK_DEFINITION_NAME=”CHANGE_ME”Assign the Task name that got created in the ECS Cluster. DESIRED_COUNT=”CHANGE_ME”Assing the number of tasks you want to be created in the ECS Cluster. IMAGE_REPO_NAME=”CHANGE_ME”Assign the ECR Repositoy URL IMAGE_TAG=”${env.BUILD_ID}”Do not change this. REPOSITORY_URI = “${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com/${IMAGE_REPO_NAME}”Do not change this. registryCredential = “CHANGE_ME”Assign the name of the credentials you created in Jenkins to store the AWS Access Key and Secret Key Check the status to confirm that the file has been changed.git statuscat Jenkinsfile Add a file to the git staging area, commit it, and then push it to the remote Github Repository.git statusgit add Jenkinsfilegit commit -m “Assigned environment specific values in Jenkinsfile”git push Error on Jenkins Server After pushing the commit, the Jenkins Pipeline will get triggered. However, you will see an error “Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock” in your Jenkins Job. The reason for this is a “Jenkins” user that is used by the Jenkins Job is not allowed to create docker objects. To give permission to a “Jenkins” user, we added it to the “docker” group in the previous step; however, we did not restart the Jenkins service after that. I kept this deliberately so that I could show you the need to add the “Jenkins” user to the “docker” group in your EC2 Instance. Now you know what needs to be done to overcome the above error. Restart the Jenkins service.sudo service jenkins restart Check if the Jenkins service has started or not.sudo service jenkins status Push Your Third Commit Make some changes in README.md to commit, push, and test if the Pipeline gets triggered automatically or not.vim README.md Add, commit, and push the file.git statusgit diff README.mdgit add README.mdgit commit -m “Modified README.md to trigger the Jenkins job after restarting the Jenkins service”git push This time, you can observe that the job must have been triggered automatically. Go to the Jenkins job and verify the same. This is what the Stage View looks like. It shows us the stages that we have specified in our Jenkinsfile. Check the Status of the Task in the ECS Cluster Go to the Cluster, click on the “Tasks” tab, and then open the running “Task.” Click on the “JSON” tab and verify the image. The image tag should match the Jenkins Build number. In this case, it is “6,” and it matches my Jenkins Job Build number. Hit the ELB URL to check if the Nodejs application is available or not. You should get the message as follows in the browser after hitting the ELB URL. Push Your Fourth Commit Open the “src/server.js” file and make some changes in the display message to test the CI CD Pipeline again.vim src/server.js Check the files that have been changed. In this case, only one file can be seen as changed.git status Check the difference that your change has caused in the file.git diff src/server.js Add the file that you changed to the git staging area.git add src/server.js Check the status of the local repository.git status Add a message to the commit.git commit -m “Updated welcome message” Push your change to the remote repository.git push Go to the Task. This time, you will see two tasks running. One with the older revision and one with the newer revision. You see two tasks because of the rolling-update deployment strategy configured by default in the cluster. Wait for around 2-3 minutes, and you should only have one task running with the latest revision. Again, hit the ELB URL, and you should see your changes. In this case, we had changed the display message.Congratulations! You have a working Jenkins CI CD Pipeline to deploy your Nodejs containerized application on AWS ECS whenever there is a change in your source code. Cleanup the Resources We Created If you were just trying to set up a CI/CD pipeline to get familiar with it or for POC purposes in your organization and no longer need it, it is always better to delete the resources you created while carrying out the POC. As part of this CI/CD pipeline, we created a few resources. We created the below list to help you delete them. Delete the GitHub Repository Delete the GitHub Token Delete the IAM User Delete the EC2 Instance Delete the ECR Repository Delete the ECS Cluster Deregister the Task Definition Summary Finally, here is the summary of what you have to do to set up a CI/CD Docker pipeline to deploy a sample Node.js application on AWS ECS using Jenkins. Clone the existing sample GitHub Repository Create a new GitHub Repository and copy the code from the sample repository in it Create a GitHub Token Create an IAM User Create an ECR Repository Create an ECS Cluster Create an EC2 Instance for setting up the Jenkins Server Install Java, JSON processor jq, Node.js, and NPM on the EC2 Instance Install Jenkins on the EC2 Instance Install Docker on the EC2 Instance Install Plugins Create Credentials in Jenkins Create a Jenkins Job Integrate GitHub and Jenkins Check the deployment Cleanup the resources Conclusion A CI/CD Pipeline serves as a way of automating your software applications’ builds, tests, and deployments. It is the backbone of any organization with a DevOps culture. It has numerous benefits for software development, and it boosts your business greatly. In this blog, we demonstrated the steps to create a Jenkins CI/CD Docker Pipeline to deploy a sample Node.js containerized application on AWS ECS. We saw how GitHub Webhooks can be used to trigger the Jenkins pipeline on every push to the repository, which in turn deploys the latest docker image to AWS ECS. CI/CD Pipelines with Docker is best for your organization to improve code quality and deliver software releases quickly without any human errors. We hope this blog helped you learn more about the integral parts of the CI/CD Docker Pipeline.

By Rahul Shivalkar

Continuous Delivery (CD): A New Approach to Deliver Your Software

Continuous delivery (CD) is a software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time and following a pipeline through a “production-like environment” without doing so manually. It aims at building, testing, and releasing software with greater speed and frequency. The approach helps reduce the cost, time, and risk of delivering changes by allowing for more incremental updates to applications in production. A straightforward and repeatable deployment process is important for continuous delivery. What Is Continuous Delivery? Continuous delivery is a software development practice that automates the process of delivering software to production. This means that code changes are automatically built, tested, and deployed to production without any manual intervention. Continuous delivery is an extension of continuous integration (CI). CI is a software development practice that automates the process of building and testing software. CI ensures that code changes are compatible with each other and that the software is working as expected. Continuous delivery and continuous integration are often used together. CI ensures that code changes are working as expected, and CD ensures that those changes can be deployed to production quickly and easily. Benefits of Continuous Delivery Continuous delivery is a software development practice that automates the process of delivering software to production. This means that code changes are automatically built, tested, and deployed to production without any manual intervention. Continuous delivery is an extension of continuous integration (CI), which is a software development practice that automates the process of building and testing software. CI ensures that code changes are compatible with each other and that the software is working as expected. Continuous delivery builds on top of CI by automating the deployment of software to production. This means that developers can release new features or bug fixes to production much more quickly than they could with manual deployments. There are many benefits to using continuous delivery, including: Faster time to market: Continuous delivery allows developers to release new features or bug fixes to production much more quickly than they could with manual deployments. This can help businesses to stay ahead of the competition and to meet the needs of their customers. Increased reliability: Continuous delivery helps to improve the reliability of software by automating the testing and deployment process. This can help to reduce the number of bugs and errors in software, which can lead to a better user experience. Reduced costs: Continuous delivery can help to reduce the costs associated with software development and deployment. This is because it eliminates the need for manual deployments, which can be time-consuming and expensive. Improved collaboration: Continuous delivery can help to improve collaboration between developers, testers, and operations teams. This is because it provides a shared environment for everyone to work in, which can help to improve communication and reduce the risk of errors. Here are some additional benefits of continuous delivery: Increased customer satisfaction: Continuous delivery can help to increase customer satisfaction by providing them with access to new features and bug fixes more quickly. Improved security: Continuous delivery can help to improve security by automating the testing of software for security vulnerabilities. Increased agility: Continuous delivery can help businesses to be more agile by allowing them to quickly respond to changes in the market or customer needs. How To Implement Continuous Delivery Continuous delivery is a software development practice that automates the process of delivering software to production. This means that code changes are automatically built, tested, and deployed to production without any manual intervention. Continuous delivery is an extension of continuous integration (CI), which is a software development practice that automates the process of building and testing software. CI ensures that code changes are compatible with each other and that the software is working as expected. Continuous delivery builds on top of CI by automating the deployment of software to production. This means that developers can release new features or bug fixes to production much more quickly than they could with manual deployments. Here are the steps on how to implement continuous delivery: Establish a culture of automation: Continuous delivery requires a culture of automation. This means that developers, testers, and operations teams need to be comfortable using automated tools and processes. Create a continuous integration pipeline: A continuous integration pipeline is a set of automated tools and processes that build, test, and deploy software. The pipeline should be designed to be as efficient and reliable as possible. Implement a continuous delivery process: A continuous delivery process is a set of steps that developers follow to release software to production. The process should be well-defined and documented so that everyone knows what they need to do. Use a continuous delivery platform: A continuous delivery platform is a software application that automates the continuous integration and continuous delivery process. There are a number of different continuous delivery platforms available, so businesses should choose one that meets their specific needs. Here are some additional tips for implementing continuous delivery: Start small: Don’t try to implement continuous delivery all at once. Start with a small project and gradually scale it up. Get buy-in from all stakeholders: It’s important to get buy-in from all stakeholders before implementing continuous delivery. This includes developers, testers, operations teams, and management. Use the right tools: There are a number of tools available to help with continuous delivery. Choose the tools that are right for your needs. Test thoroughly: It’s important to test thoroughly before releasing software to production. This will help to reduce the risk of errors and outages. Be prepared for failure: Even if you do everything right, there’s always a chance that something will go wrong. Be prepared for failure and have a plan in place to recover. Challenges of Continuous Delivery Continuous delivery is a powerful software development practice that can help businesses to improve their time to market, reliability, costs, and collaboration. However, it is important to be aware of the challenges that can be associated with implementing continuous delivery. Businesses should carefully consider their needs before deciding whether or not to implement continuous delivery. Here are some of the challenges of continuous delivery: Culture change: Implementing continuous delivery requires a culture change. This means that developers, testers, and operations teams need to be comfortable using automated tools and processes. Technical challenges: Continuous delivery can be technically challenging. This is because it requires a complex set of automated tools and processes to be in place. Cost: Continuous delivery can be expensive. This is because it requires a number of different tools and services to be purchased. Risk: Continuous delivery can be risky. This is because it involves automating the release of software to production. If something goes wrong, it could lead to a major outage. Here are some tips for overcoming the challenges of continuous delivery: Start small: Don’t try to implement continuous delivery all at once. Start with a small project and gradually scale it up. Get buy-in from all stakeholders: It’s important to get buy-in from all stakeholders before implementing continuous delivery. This includes developers, testers, operations teams, and management. Use the right tools: There are a number of tools available to help with continuous delivery. Choose the tools that are right for your needs. Test thoroughly: It’s important to test thoroughly before releasing software to production. This will help to reduce the risk of errors and outages. Be prepared for failure: Even if you do everything right, there’s always a chance that something will go wrong. Be prepared for failure and have a plan in place to recover. Conclusion Continuous delivery is a powerful tool that can help teams to deliver software more quickly, reliably, and efficiently. However, it is important to be aware of the challenges that may be faced when implementing continuous delivery. By understanding the challenges and taking steps to overcome them, teams can reap the benefits of continuous delivery.

By Aditya Bhuyan

How To Start a Successful Career in DevOps

If you're considering working in DevOps, you're likely aware that it can be a challenging and rewarding field. The good news is that there are several key steps that you can take to launch a successful career. Starting a successful career in DevOps requires more than just technical skills. It's equally important to understand industry-specific knowledge, tools, and methodologies to maximize the impact of your technical skills. One of the most important things you can do is to dive in with determination and adaptability. 1. Understand the Fundamentals of DevOps DevOps is a term that has been gaining popularity in the world of tech. It is a combination of the cultural and technical aspects of the software development process. In simple terms, DevOps is a way of deploying and managing technology. It focuses on collaboration, communication, automation, monitoring, and fast feedback. DevOps brings together the development and operations teams to work on the same projects with the same goals in mind. The main idea behind DevOps is to streamline the software development process, reduce inefficiencies, and improve the speed of delivery. It is a new way of thinking about software development that has transformed a lot of organizations, especially those that leverage distributed systems and cloud computing. 2. Acquire the Necessary Technical Skills Any aspiring DevOps professional must learn the essential DevOps tools such as Jenkins, Docker, Kubernetes, and Ansible or Terraform. Each of these tools provides a unique feature set that allows for better processes and automation in software development. It is important to understand how these components interact with each other to create an efficient system. Infrastructure as Code and CI/CD pipelines are concepts that are the backbone of the DevOps culture and provide the foundation for streamlined and efficient software development processes. Infrastructure as Code ensures the entire infrastructure is programmable, making it easier to manage and reproduce. Continuous integration allows for frequent code changes and testing to be done, ensuring that the code is always functional and improving. Continuous deployment is the heart of DevOps, as it allows developers to release new features and updates in a timely and efficient manner. Without a deep understanding of these core concepts, aspiring DevOps engineers will struggle to succeed in their roles and keep up with the demands of the industry. 3. Gain Practical Experience When it comes to starting a successful career in any technical field, gaining practical experience is essential. Learning theory is only useful to a certain extent. To truly understand the ins and outs of the day-to-day responsibilities, it is necessary to have hands-on experience. One way to gain practical experience is by joining open-source projects or contributing to community initiatives. This offers an opportunity to learn from others while also building a portfolio of work. Pursuing internships or entry-level positions is another great way to gain practical experience. This could help you understand how to work on real projects with an experienced team, further developing your skills. Finally, creating personal projects can serve as a way to practice and demonstrate skills to potential employers. Having a well-maintained GitHub repository also helps you build your online presence. 4. Attend DevOps Events Networking has always been an important aspect of any industry. It allows professionals to broaden their horizons, generate new ideas, and expand their knowledge. As a DevOps professional, you can benefit greatly from attending popular events, webinars, and meetups dedicated to discussing and showcasing breakthrough approaches and practices. Follow a DevOps conferences list to make sure you don’t miss anything happening near you and add the most interesting events to your calendar. These events can provide an opportunity to connect with people who share your interests and aspirations while also honing your skills through engaging in discussions, workshops, and training sessions. You may also discover new technologies, methodologies, and solutions that could help you become more competitive and successful in your work. 5. Continuously Update Your Skills Continuous learning is no longer just an option - it's necessary to succeed personally and professionally. This is where platforms like Udemy, Coursera, and Pluralsight come in handy, offering advanced DevOps courses that help professionals stay up-to-date with the latest industry developments. With the help of these courses, you can continuously update your skills without having to sacrifice your commitments. This point also ties nicely with gaining more practical experience. Consider challenging yourself with new problems to solve to learn even more while building your technical portfolio. 6. Understand the Broader Tech Ecosystem Having an interdisciplinary knowledge base is essential for top performance — not just from a technical standpoint but also from an organizational one in the future. This means that you should have at least a basic understanding of different aspects of technology like networks, databases, coding languages, and tools used in development environments so you can identify potential issues before they become real problems. Moreover, great communication between colleagues who use different technologies is key to successful collaboration within teams—interdisciplinary knowledge is crucial here as well. Understanding the broader tech ecosystem will help you come up with creative solutions to complex problems or develop innovative approaches to address customer needs better than competitors do. In practice, this means keeping active track of new trends, such as AI/ML applications or serverless computing, so that you can adjust strategies accordingly when these new mechanisms are introduced into your organization's systems and subsequently craft smarter solutions faster than ever before. Wrapping Up DevOps can be a challenging yet rewarding path, with endless opportunities to grow and learn. The key to success lies in your determination and adaptability. The world of DevOps is constantly evolving, and the only way to stay ahead of the game is to continuously learn and improve your skills. Launching your career requires taking key steps, such as learning essential coding languages, earning relevant certifications, and gaining practical experience through internships or personal projects. So don't hesitate. Jump in with both feet and embrace the ever-changing landscape of DevOps.

By Mariusz Michalowski

How DevOps Can Cost You Millions if Not Implemented the Right Way

The word DevOps comes from the term development and operations. The development and operations team had their separate functions and objectives. As both teams worked separately, it led to long development hours, smaller batch releases, and unhappy customers. Both teams merged to bring uniformity to speed up the developmental process. Since then, DevOps has become a popular application development and deployment approach for various companies. However, with the increase in companies using DevOps, it has also brought challenges. Many questions remain unanswered: “Where to begin?” What will be the challenges? “How to solve them?” In this blog, you will know the difficulties faced by companies and the solutions for them. Top Challenges of DevOps You Should Be Aware Of The biggest pitfall lies not only in knowing what could be the challenge but also in how to mitigate it. Keeping the above statement in check, let’s discuss some common challenges companies face today. 1. Cultural Adaptation It requires a lot of patience when dealing with such a transformation. Since it is a long process, the workplace undergoes a major shuffle while implementing DevOps. Organizations should ease the atmosphere and maintain positivity in the environment. 2. Shift from Legacy Applications to Microservices Sticking to old technologies could reduce your company’s prospects in the competitive marketplace. You cannot expect a fast development process just by shifting to microservices. The most significant burden that comes with transition is complexity. Organizations should update their software and hardware so that new technologies can co-exist with existing ones. 3. Tools Confusion After implementing DevOps, there is a possibility that developers could be dependent on tools to solve minor issues. It may seem an advantage in the short run. However, it could be detrimental in the long run. Additionally, selecting a new tool requires scrutiny, as they need to meet security requirements and should be able to integrate with the software. 4. The Bottleneck in the SDLC Process The effectiveness of SDLC (Software Development Life Cycle) has a direct relation to the effect on software delivery and deployment. An enterprise can deliver top-notch quality and trusted software if SDLC is carried out systematically. In DevOps, the software is offered in a short time with a higher level of reliability. Having a mature process becomes necessary for the team. However, a few organizations are unable to move forward with the speed of DevOps. 5. Monitoring the Overall Process Companies face issues in adopting DevOps if they follow specific rules and guidelines. DevOps doesn't have any particular frameworks stating procedures that developers should follow to achieve their goals. DevOps consists of various applications, each with its respective parameters to measure. For example, metrics like deployment frequency might deal with CI/CD processes, while Defect Escape Rate is a part of the continuous testing pipeline. How Netflix Mastered DevOps The way Netflix managed to implement DevOps in its work culture is truly remarkable. They didn’t go out to build a DevOps culture, nor did they set predefined rules. Instead, they DevOps culture organically. The turning point for them was the worst outage in their history. In 2008, Netflix was a pioneer in online DVD rental services. During the outrage, one-third of the 8.4 million customers were affected by it. The incident pushed Netflix to move its servers to the cloud and overhaul the entire ecosystem. Netflix successfully converted its data-centric Java application into a cloud-based Java microservices architecture. Netflix: The Chaos Monkey and Simian Army If you have ever used Netflix, you may have noticed that though the software is reliable, the “Recommended Picks” stream will not appear. It happens because the server in AWS that serves it is down. Chaos Monkey Source Despite that, your Netflix does not crash, nor does it have errors. Netflix simply removes the stream or displays an alternative one. Netflix was able to achieve this by introducing a tool called Chaos Monkey. It is one of the first series of tools called the Netflix Simian Army. Chaos Monkey is a script that runs in all environments, causing chaos and shutting down servers. While writing codes, developers are in a constant atmosphere of unexpected outages. The tool provides an opportunity for developers to not only test the server but also incentivize them to build fault-tolerant systems. Simian Army Source After the success of Chaos Monkey, engineers at Netflix built a set of tools to check all kinds of failures and identify abnormal conditions. That's when the Simian Army came into existence. Let’s discuss each of them in brief. Latency Monkey The tool causes false delays in the RESTful client-server communication layer to simulate service degradation and measure if upstream servers respond perfectly. Moreover, creating delays for a longer time can simulate an entire service downtime without physically bringing it down. Latency Monkey was useful while testing new services by simulating the failure without affecting the rest. Conformity Monkey This tool finds occurrences that don’t follow best practices and shuts them down. Let’s say you find something that doesn’t belong to an auto-scaling group. It feels like waiting for something wrong to happen. Conformity Monkey shuts it down and provides the opportunity to relaunch it properly. Doctor Monkey The tool detects unhealthy checks that run on each instance as well as monitors external factors. Then, those occurrences are removed from service after service owners find out the problem. Janitor Monkey It checks that the cloud environment runs without clutter and waste and disposes of unused resources. Security Monkey It is an extension of Conformity Monkey that identifies security violations (such as improperly configured AWS security groups) and ceases offending instances. 10-18 Monkey Short for Localization-Internationalization, it scans configuration and run-time errors in instances serving users in multiple geographical areas with diverged languages and character sets. Chaos Gorilla Like Chaos Monkey, this tool simulates an outage of the entire Amazon availability zone to verify if the services automatically re-balance to the functional availability zones without manual intervention or any visible impact on users. What Can We Learn From Netflix’s DevOps Strategy? Netlfix practices are unique to their work environment and might not be suitable for all organizations. Yet, we can learn some of their software product development strategies in DevOps. In DevOps organizations, the leader must ask, “ What should we do to encourage enterprises to achieve the desired results?” This kind of thinking is required to improve the outcomes in the future. Netflix focuses on providing developers the freedom to solve problems on their own. So, it doesn’t create artificial restrictions on developers of what they need to do. The end goal of DevOps is to be customer-driven and focus on enhancing the user experience with every release.

By Pritesh Patel

Experience vs. Certifications in DevOps: Bridging Theory and Practice

"When hiring for DevOps engineering roles, what matters more—certifications or experience?" This question reverberates through the corridors of countless tech companies as the significance of DevOps engineering roles only grows in the evolving digital landscape. Both elements — certifications and experience — offer valuable contributions to an engineer's career. Certifications such as AWS, CKA, GCP, Azure, Docker, and Jenkins represent the structured, theoretical understanding of the technology landscape. On the other hand, experience serves as the real-world proving ground for that theoretical knowledge. But which of these two carries more weight? Here's an analysis infused with curiosity and passion, grounded in the technical and business realities of our day. The Case for Certifications Certifications provide a clear, standardized benchmark of an engineer's skill set. They attest to the individual's current knowledge of various tools, systems, and methodologies, ensuring their technical prowess aligns with industry standards. For businesses, hiring certified professionals can bring assurance of the engineer's ability to handle specific systems or technologies. This is particularly crucial in the early stages of one's career, where the lack of hands-on experience can be supplemented by formal, industry-recognized credentials. Certifications also speak to an engineer’s dedication to continuous learning — an invaluable attribute in a sector driven by relentless innovation. Furthermore, they can offer competitive advantages when dealing with clients, projecting the organization's commitment to expertise and quality. The Strength of Experience However, while certifications ensure theoretical knowledge, the chaotic, unpredictable terrain of DevOps often demands a kind of learning that only experience can provide. Real-world situations seldom stick to the script. Experience helps engineers tackle these unpredictable scenarios, providing them with a nuanced understanding that's hard to derive from certifications alone. Experience translates into tangible skills: problem-solving, strategizing, decision-making, and team collaboration — all of which are critical to managing DevOps. An experienced engineer can leverage past learnings, understanding when to apply standard procedures and when to think outside the box. The maturing engineer who has faced the heat of critical system failures or the pressure of ensuring uptime during peak loads often develops a tenacity that cannot be simulated in a testing environment. Such experiential learning is priceless and can make a marked difference in high-stakes situations. Perception and Certifications: The "Customer's" View While businesses are right to weigh the benefits of certification against experience, they must also factor in another crucial element — the perspective of "customers," who can be either paying customers in a B2B relationship or internal stakeholders from other teams or departments. Often, these "customers" feel more confident knowing that certified professionals are managing their critical infrastructure. Certifications serve as a validation of a service provider's technical skills, reassuring "customers" of the team's capability to manage complex tasks efficiently. From the "customers'" viewpoint, seeing a certified engineer indicates that the individual, and by extension, the company, has met stringent, industry-approved standards of knowledge and skills. While experience is highly valued, it is sometimes seen as more subjective and challenging to quantify, leading to "customers" placing substantial emphasis on certifications. Certification Renewals and Organizational Goals Certifications, particularly those that require renewals, ensure that engineers stay current with the evolving technology landscape. However, it's important to assess whether pursuing certification renewals aligns with the organizational goals. If a particular certification does not contribute directly to the objectives of a project or the broader organizational strategy, its renewal might not be necessary. The resources spent on such renewals might be better directed toward areas that contribute directly to the organization's mission. The Organizational Benefits of Certification Furthermore, when an organization itself earns certification, such as becoming an AWS Partner or a Kubernetes Certified Service Provider (KCSP), it opens a new realm of possibilities. These certifications not only validate the company's expertise and capabilities but also enhance its market credibility and competitive edge. As an AWS Partner, for example, companies can access a range of resources such as training, marketing support, and sales-enablement tools. They can also avail of AWS-sponsored promotional credits, allowing them to test and build solutions on AWS. Being a KCSP, on the other hand, demonstrates a firm's commitment to delivering high-quality Kubernetes services. This certification also assures "customers" that they are partnering with a knowledgeable and experienced service provider. Such partnerships and certifications can help organizations win more significant contracts, attract more clients, and also retain talented engineers seeking to work with recognized industry leaders. They demonstrate the organization's commitment to industry best practices, continual learning, and staying at the forefront of technological advancements. Bridging the Gap Certifications Experience Provide a structured, theoretical understanding of technology Provide practical, hands-on knowledge Prove an individual's skills against industry standards Offer real-world problem-solving abilities Indicate dedication to continuous learning Display adaptability and tenacity in face of real-world challenges Provide an edge in competitive scenarios Offer insights into effective team collaboration and decision-making It's crucial to remember that neither certifications nor experience can stand alone as the defining factor in DevOps engineering roles. The stage of an engineer's career and the maturity they bring to the role are products of a judicious blend of both. For those at the early stages, certifications can help them stand out and demonstrate a foundational knowledge of DevOps principles. As their career progresses, their accumulated experience, coupled with advanced certifications, exhibits a growth mindset, adaptability, and an in-depth understanding of DevOps systems and practices. Final Thoughts As we draw this discussion to a close, let's return to our initial question: "When hiring for DevOps engineering roles, what matters more — certifications or experience?" Well, we've navigated through the different stages of a DevOps engineer's career, weighed the importance of certification against the gold of experience, and taken into account the perspectives of various "customers." The conclusion is clear: it's not a case of either-or. The debate should not be about choosing one over the other, but understanding how they can symbiotically contribute to an engineer's career. Can we truly measure the importance of the structured learning that certifications offer? Can we quantify the practical wisdom that comes with experience? These are questions we may ponder, but what remains unquestionable is the unique value they both bring to the table. When we consider the perspective of the "customers", who wouldn't want the assurance that their DevOps team is armed with both certified skills and hands-on experience? And for organizations seeking to boost their reputations, why not aspire to hold industry-recognized certifications and partnerships? After all, they enhance market credibility and pave the way for bigger opportunities and promising collaborations. In conclusion, experience is an invaluable asset, a truth universally acknowledged, but the value of certifications — for individuals and businesses alike — should never be understated. Certifications and experience form a powerful combination that assures "customers," motivates teams, and drives business growth in the world of DevOps. The question then is not whether we choose between them, but how we harmoniously integrate both in our practices and operations. No matter where you stand on the spectrum of experience versus certification, remember this: they are not mutually exclusive. Both can coexist, intertwining to form a stronger, more versatile DevOps engineer. For professionals seeking to stay relevant and competitive in the fast-paced world of DevOps, the path forward is clear — embrace both theory and practice. Pursue certifications to keep up with the evolving landscape, and continually hone your skills through hands-on experience. This is the recipe for success in the thriving and dynamic field of DevOps. In the realm of DevOps, the balance between experience and certification is a delicate one, and the pendulum should never swing too far in either direction. Instead, let's allow them to work in concert, building a stronger, more comprehensive understanding of DevOps and its practices. After all, isn't that the essence of DevOps itself — bridging the gap, fostering collaboration, and creating more holistic, efficient, and powerful systems? "Knowledge comes, but wisdom lingers." — Alfred Lord Tennyson

By Rob Newsome

Compliance Automated Standard Solution (COMPASS), Part 6: Compliance to Policy for Multiple Kubernetes Clusters

(Note: A list of links for all articles in this series can be found at the conclusion of this article.) In Part 4 of this multi-part series on continuous compliance, we presented designs for Compliance Policy Administration Centers (CPAC) that facilitate the management of various compliance artifacts connecting the Regulatory Policies expressed as Compliance-as-Code with technical policies implemented as Policy-as-Code. The separation of Compliance-As-Code and Policy-As-Code is purposeful, as different personas (see Part 1) need to independently manage their respective responsibilities according to their expertise, be they controls and parameters selection, crosswalks mapping across regulations, or policy check implementations. The CPAC enables users to deploy and run technical policy checks according to different Regulatory Policies on different Policy Validation Points (PVPs) and, depending upon the level of generality or specialty of the inspected systems, the CPAC performs specific normalization and aggregation transformations. We presented three different designs for CPAC: two for handling specialized PVPs with their declarative vs. imperative policies, and one for orchestrating diverse PVP formats across heterogeneous IT stack levels and cloud services. In this blog, we present an example implementation of CPAC that supports the declarative policy in Kubernetes, whose design was introduced in section 2 of COMPASS Part 4. There are various policy engines in Kubernetes, such as GateKeeper/OPA, Kyverno, Kube-bench, etc. Here, we explore a CPAC using Open Cluster Management (OCM) to administer the different policy engines. This design is just one example of how a CPAC can be integrated with a PVP, and a CPAC is not limited to this design only. We flexibly allow the extension of our CPAC through plugins to any specific PVP, as we will see in upcoming blog posts in this series. We also describe how our CPAC can connect the compliance artifacts from Compliance-as-Code produced using our OSCAL-based Agile Authoring methodology to artifacts in Policy-as-Code. This bridging is the key enabler of end-to-end continuous compliance: from authoring controls and profiles to mapping to technical policies and rules, to collecting assessment results from PVPs, to aggregating them against regulatory compliance into an encompassing posture for the whole environment. We assume the compliance artifacts have been authored and approved for production runtime using our open-source Trestle-based Agile Authoring tool. Now the challenge is how to deal with the runtime policy execution and integrate the policy with compliance artifacts represented in NIST OSCAL. In this blog, we focus on the Kubernetes policies and related PVPs and show end-to-end compliance management with NIST OSCAL and the technical policies for Kubernetes. Using Open Cluster Management for Managing Policies in Kubernetes In Kubernetes, the cluster configuration comprises policies that are written in a YAML manifest, and its format depends upon which particular policy engine is used. In order to accommodate the differences among policy engines, we have used Open Cluster Management (OCM) in our CPAC. OCM provides various functionalities for managing Kubernetes clusters: Governance Policy Framework to distribute manifests to managed clusters (by a unified object called OCM Policy) and collect the status from managed clusters, PolicyGenerator to compose OCM Policy from raw Kubernetes manifests, Template function to embed parameters in OCM Policy, PolicySets for grouping of policies, and Placement (or PlacementRule)/PlacementBinding for cluster selection. Once an OCM Policy is composed from a Kubernetes manifest specific to a policy engine, it can be deployed and compliance posture status can be collected using the OCM unified approach. The OCM community maintains OCM Policies in the Policy Collection repository. However, these policies are published with compliance metadata and PlacementRule/PlacementBinding embedded, making it difficult to maintain and reuse policies across regulation programs without constant editing of the policies themselves, while considering them regulation agnostic. Figure 1 is a schematic diagram of policy-kyverno-image-pull-policy.yaml. It illustrates the OCM Policy containing not only the Kubernetes manifests, but also additional compliance metadata, PlacementRule, and PlacementBinding. Figure 1: Example of Today's OCM Policy. Compliance metadata, PlacementRule, and PlacementBinding are embedded in OCM Policy In order to make the policies reusable and composable by the OCM PolicyGenerator, we decompose from each policy its set of Kubernetes manifests. We call this manifest set “Policy Resource." Figure 2 is an example of a decomposed policy that contains three raw Kubernetes manifests (in the middle), along with a PolicyGenerator manifest and its associated kustomization.yaml (on the right). The original OCM Policy can be re-composed by running PolicyGenerator in the directory displayed on the left. Figure 2: Decomposed OCM Policy C2P for End-To-End Compliance Automation Enablement Now that we have completely decoupled compliance and policy as OSCAL artifacts and Policy Resource, we bridge compliance into policy that takes compliance artifacts in OSCAL format and applies policies (including installing policy engines) on managed Kubernetes clusters. We call this bridging process "Compliance-to-Policy" (C2P). The Component Definition is an OSCAL entity that provides a mapping of controls to specific rules for a service and its implementation (check) by a PVP. For example, we can have a Component Definition defined for Kubernetes that specifies that cm-6 in NIST SP 800-53 maps to a rule checked by policy-kyverno-image-pull-policy in Kubernetes. Then, C2P interprets this Component Definition by fetching policy-kyverno-image-pull-policy directory and running PolicyGenerator with given compliance metadata to generate OCM Policy. The generated OCM Policy is pushed to GitHub along with Placement and PlacementBinding. OCM automatically distributes the OCM policy to managed clusters specified in Placement and PlacementBinding. Each managed cluster periodically updates the status field of OCM policy in the OCM Hub. C2P collects and summarizes the OCM policy statuses from OCM Hub and pushes it as the compliance posture. Figure 3 illustrates the end-to-end flow diagram of the compliance management and policy administration thus achieved. Figure 3: Diagram of end-to-end of C2P with Trestle, OSCAL, and OCM for multiple Kubernetes clusters Figure 3 depicts the end-to-end flow steps as follows: Regulators provide OSCAL Catalog and Profile by using Trestle-based agile authoring methodology (see also COMPASS Part 3). Vendors or service providers create Component Definition referring to Catalog, Profile, and Policy Resource by Trestle (Component Definition representation in the spreadsheet below). Compliance officers or auditors create Compliance Deployment CR that defines: Compliance information OSCAL artifact URLs Policy Resources URL Inventory information clusterGroups for grouping clusters by label selectors Binding of cluster group and compliance OCM connection information The example Compliance Deployment CR is as follows: YAML apiVersion: compliance-to-policy.io/v1alpha1 kind: ComplianceDeployment metadata: name: nist-high spec: compliance: name: NIST_SP-800-53-HIGH # name of compliance catalog: url: https://raw.githubusercontent.com/usnistgov/oscal-content/main/nist.gov/SP800-53/rev5/json/NIST_SP-800-53_rev5_catalog.json profile: url: https://raw.githubusercontent.com/usnistgov/oscal-content/main/nist.gov/SP800-53/rev5/json/NIST_SP-800-53_rev5_HIGH-baseline_profile.json componentDefinition: url: https://raw.githubusercontent.com/IBM/compliance-to-policy/template/oscal/component-definition.json policyResources: url: https://github.com/IBM/compliance-to-policy/tree/template/policy-resources clusterGroups: - name: cluster-nist-high # name of clusterGroup matchLabels: level: nist-high # label's key value pair of clusterlabel binding: compliance: NIST_SP-800-53-HIGH # compliance name clusterGroups: - cluster-nist-high # clusterGroup name ocm: url: http://localhost:8080 # OCM Hub URL token: secretName: secret-ocm-hub-token # name of secret volume that stores access to hub namespace: c2p # namespace to which C2P deploys generated resources 4. C2P takes OSCAL artifacts and CR, retrieves required policies from Policy Resources, generates OCM Policy using PolicyGenerator, and pushes the generated policies with Placement/PlacementBindingto GitHub. GitOps engine (for example, ArgoCD) pulls the OCM Policies and Placement/PlacementBinding into OCM Hub. OCM Hub distributes them to managed clusters. OCM Hub updates the statuses of OCM Policies of each managed cluster. 5. C2P periodically fetches the statuses of OCM Policy from OCM Hub and pushes compliance posture summary to GitHub. An example compliance posture summary: 6. Compliance officers or auditors check the compliance posture and take appropriate actions. As a result of the decoupling of Compliance and Policy and bridging them by C2P, each persona can effectively play their role without needing to be aware of the specifics of different Kubernetes Policy Engines. Conclusion In this blog, we detailed the making of a Compliance and Policy Administration Center implementation for integrating Regulatory Programs with supportive Kubernetes declarative policies and showed how this design can be applied for the compliance management of the Kubernetes multi-cluster environment. Coming Next Besides configuration policies, regulatory programs also require complex processes and procedures that entail batch processing for their validation such as provided by Policy Validation Points which support imperative language for policies. In our next blog, we will introduce another design of CPAC for integrating PVPs supporting imperative policies such as Auditree. Learn More If you would like to use our C2P tool, see the compliance-to-policy GitHub project. For our open-source Trestle SDK see compliance-trestle to learn about various Trestle CLIs and their usage. For more details on the markdown formats and commands for authoring various compliance artifacts see this tutorial from Trestle. Below are the links to other articles in this series: Compliance Automated Standard Solution (COMPASS), Part 1: Personas and Roles Compliance Automated Standard Solution (COMPASS), Part 2: Trestle SDK Compliance Automated Standard Solution (COMPASS), Part 3: Artifacts and Personas Compliance Automated Standard Solution (COMPASS), Part 4: Topologies of Compliance Policy Administration Centers Compliance Automated Standard Solution (COMPASS), Part 5: A Lack of Network Boundaries Invites a Lack of Compliance

By Takumi Yanagawa

Top 7 Best Practices DevSecOps Team Must Implement in the CI/CD Process

Almost every organization has implemented CI/CD processes to accelerate software delivery. However, with this increased speed, a new security challenge has emerged. Deployment speed is one thing, but without proper software checks, developers may inadvertently introduce security vulnerabilities, leading to grave risks to business operations. As a result, most organizations are either making DevOps responsible for ensuring the security of their delivery process or creating a dedicated DevSecOps team with the same goal. In this article, we will discuss the top seven best practices that DevSecOps teams can implement in their CI/CD process to make their software delivery process more secure. Top Security Challenges Faced by DevSecOps The top challenges that DevSecOps teams face when trying to secure the CI/CD process include: Diverse DevOps toolchains in CI/CD processes lead to fragmented security-related data. Evaluation of software risks involves processing a lot of data at various stages – build, test, and deploy. Manual checks to ensure SDLC compliance with numerous incremental software changes every week are overwhelming the DevSecOps team. Vulnerability management is especially challenging because there is no proper mechanism to allow specific low-risk vulnerabilities and keep track of them. Auditing an application at regular intervals is very time-consuming because it requires the DevSecOps team to review system logs manually. To overcome these challenges, DevSecOps teams must rethink their strategies and implement the necessary tools and best practices to ensure security in the software delivery process. 1. Integrate Security Testing Into CI/CD Pipelines The rise of open-source tools and libraries makes applications more vulnerable. Hence, security testing to analyze source code and find security vulnerabilities should be automated. Without proper checks, vulnerabilities can go into production, and applications will be susceptible to attack. A well-known security testing strategy to find source code and binary vulnerabilities is Static/Dynamic Application Security Testing (SAST/DAST), which is offered in scanning solutions such as Sonarqube, Prisma Cloud, HCL AppScan, Jfrog Xray Scanning, and Aquawave. The goal is to seamlessly integrate this scanning technology into the CI/CD pipeline by automatically executing it after the build process is complete. Then, based on the software quality and vulnerabilities, the pipeline proceeds to the next step or fails. 2. Understand the Risk of Applications and Dependent Services DevSecOps must empower application owners to understand the security risks of an application and all its services. Holistic information about the security vulnerabilities of each service, deployment date, and the development team will help owners make decisions faster regarding deployments and delivery. Collecting and centralizing this information should be automated, not manual. 3. Track the Delivery Bill of Materials for Each Software Release Going Through the CI/CD Process The Delivery Bill of Materials (DBOM) is a software supply chain and security management building block. DBOM includes reports related to security risks, quality, performance, and testing, as well as the development and deployment history of the tools used to deliver the application. To expedite the decision-making and delivery process, DevSecOps must attempt to centralize the DBOM. However, doing so in larger software development organizations is cumbersome without a purpose-built solution that includes a dashboard that tracks DBOM, integrates with various DevOps tools in the ecosystem, and provides key information for each phase of the delivery process – Source, Build, Artifact, and Deploy. In the Source phase, for example, the solution could present all the vulnerabilities from the security scanning technology to the application owner. This would enable faster decision-making by the DevSecOps team while keeping all stakeholders in the loop to ensure no exceptions or bugs are introduced without management being informed. In the Build phase, the solution could connect with build or CI tools such as Jenkins or Travis CI to aggregate the data. If there is a policy violation, the DevSecOps team can quickly hinder the progress of a pipeline or inform the individual team to stall the release process. In the Artifact phase, the solution could empower the DevSecOps team with information about vulnerabilities related to dependencies by integrating with tools such as AquaSec, which provides information about supply chain security, malware protection, cloud security, etc. Similarly, in the Deploy phase, the DevSecOps team must get security benchmarking information, such as from the Center for Internet Security (CIS) benchmarking. The system should fetch this data from tools that provide deployment verification and scores related to log analysis, metrics analysis, quality, reliability, and business impact. This information would help the DevSecOps team answer key questions before deciding to roll out software to an environment: Was the right image used in the deployment? What is the risk of the new application with regard to various dimensions? Who approved the deployment? All this verification information from various tools is vital to securely progressing a release. At an operational level, a consolidated dashboard to track the DBOM has several benefits, including: Eliminate the need to fetch information from multiple sources, such as source code systems, CI systems, scanners, etc. Independently monitor software delivery and control it from a security perspective. See various security patterns from an organizational perspective (something that can be significantly different for developers working in silos) and implement policies to improve security posture. Bring various teams and stakeholders under the same umbrella using the DBOM dashboard to discuss and quickly resolve an issue. 4. Make SBOM a Part of Your DBOM for Dynamic Deployments One of the core responsibilities of the DevSecOps team is maintaining a Software Bill of Materials (SBOM), which helps track associated security and license risks. SBOM includes a list of all the open-source and third-party components in a codebase, as well as: A list of names of the licensed application components. The versions of the ingredients used in the codebase. Their patch status Today’s deployments are dynamic, with constantly changing infrastructure and dependencies. DevSecOps teams must automate the tracking and maintenance of SBOM in a single place. As delivery accelerates, teams must be able to react faster regarding each deployment and their respective vulnerabilities. By making SBOM part of DBOM, teams can see supply chain and security-related information in one place and formulate and implement policies faster. 5. Create Policies and Take Automated Actions in CI/CD Once all the information is in place, the DevSecOps team should be able to create and implement policies, such as a policy to prevent a deployment based on a threshold of particular metrics related to code vulnerabilities or security scores. An automated mechanism to create new rules and enforce them in the delivery pipeline should be implemented to attain a risk-free software delivery process. Policy creation tools should easily integrate with CD tools like Spinnaker, Argo, and Jenkins. 6. Central Exception Management for Distributed Teams There may be instances within a specific environment when a scanner and risk-assessment tool misidentify a high risk to specific dependencies or libraries. The DevSecOps team should have a way to identify situations where open-source libraries are not exploitable given the setup and allow developers to proceed with their deployment activities. For example, developers testing applications with open-source libraries in a sandbox environment or Dev instance may introduce vulnerable open-source libraries. But if the environment is in a VPC behind a firewall, then the DevSecOps team may consider allowing those exceptions for a specific period. Having an exception management capability as part of the DevSecOps toolset is critical. 7. Audit and Attestation DevSecOps conducts auditing exercises to ensure applications adhere to SDLC standards and regulations. Instead of manually fetching the information from ten disparate systems, they should be able to automate software auditing in their ecosystem via an audit and attestation dashboard that provides the who, what, and when of pipeline execution and policy violations. The solution should allow internal or external auditors to list, search, and filter on a date, deployment, environment, and event data collected from workflow tasks or pipeline activities and deployments. Conclusion DevSecOps is a new concept with new responsibilities for enforcing security in CI/CD processes. Leading American organizations prioritize security and look to their DevSecOps team to lead the transformation. Due to the speed and scale of the CI/CD process, the DevSecOps team might find themselves in firefighting mode, ensuring security with manual checks of each release process. However, this manual approach at each application level can be overwhelming, leading to the risk of missing key security checks from an organizational perspective. To ensure security and avoid the burnout that comes from manual activities, DevSecOps teams must understand the above best practices and adopt the new tools and technologies necessary to implement them.

By Shashank Srivastava

Continuous Integration vs. Continuous Deployment

The terms Continuous Integration and Continuous Delivery/Deployment tend to be combined into the acronym CI/CD to describe the process of building and deploying software, often without distinction between the two. The terms describe distinct processes, even if combining them suggests that Continuous Delivery and Continuous Deployment are an extension of Continuous Integration and the execution of both processes is the responsibility of a single tool. Assuming CI/CD is just CI with a deployment step ignores some fundamental differences between the two processes. In this post, we look at: The reasons why CI and CD are distinct processes. The features provided by good CD tools. Why you may consider using separate tools for your CI/CD workflow. What Is Continuous Integration? At a high level, Continuous Integration tooling is concerned with: Taking the code written by developers and compiling it into an artifact. Running automated tests. Capturing the log files so any failed builds or tests can be resolved. A Continuous Integration server facilitates this process by running builds and tests with each commit. Continuous Integration servers can be described as solving the equation: code + dependencies + build tools + execution environment = test results + logs + compiled artifact The left side of the equation takes the code written by developers, any dependencies of the code, a build tool, and the environment where the build and tests are executed. When these inputs are available, a Continuous Integration server completes the build to produce the elements on the right side of the equation. When a Continuous Integration server has been configured correctly, each commit to a repository results in the build being run, thus solving the equation without manual intervention from a human. This means the process implemented by Continuous Integration servers is machine-driven, so much so that it's common for Continuous Integration servers to have read-only user interfaces, like the Jenkins Blue Ocean UI. The other important aspect of the Continuous Integration equation is that developers provide the inputs, and the outputs are created for developers or people in other technical roles. Employees outside the IT department rarely interact with the Continuous Integration server. What Are Continuous Deployment and Continuous Delivery? Continuous Deployment takes the compiled artifacts from a successful build performed by the Continuous Integration server and deploys them into the production environment, resulting in a completely automated deployment workflow. In this scenario, Continuous Deployment is quite rightly an extension of Continuous Integration, and the distinction between the two becomes somewhat arbitrary. Such commit-to-consumer workflows are common in simple projects. More complex projects can also have a completely automated deployment workflow if the appropriate tests and monitoring systems are in place. But while fully automated deployments have many benefits, it's not uncommon for deployments to involve human decision-making. There are many valid reasons for not automatically deploying every commit to the main branch into production, including: Coordinating deployments with legacy systems. Acquiring sign-off from product owners. Usability testing that is impossible to automate. Regulatory requirements Dogfooding your own product. Integrating deployments with back-end changes like databases. Not having 100% confidence in your tests. The term Continuous Delivery is used to distinguish workflows that incorporate human decision-making from Continuous Deployment workflows that are fully automated. Where Continuous Integration tooling is machine-driven for many teams, Continuous Delivery is human-driven. Much of the grunt work of performing a deployment is still automated, but the decision to promote a release through to production is a human one. Importantly, the decision may not be made by technical employees but rather by product owners, managers, or someone who stayed up until midnight to click the deploy button. Why Use Separate Continuous Integration and Continuous Delivery Tools? A typical CI/CD pipeline, with no distinction between the two. This slide is from a talk titled How to build cloud-native CI/CD pipelines with Tekton on Kubernetes. It's a classic example of how simple projects merge Continuous Integration and Continuous Deployment into a single process where a production deployment starts as soon as the code has been compiled. There's nothing wrong with this process, and it works as intended if every part of the pipeline remains fully automated. But what happens if a human needs to test and approve the application before it's released? For this decision to be made, the deployment process must be interrupted. For example, we'd first deploy the application to a test environment, allow the appropriate parties to verify the changes, and when everyone is happy, the release is promoted to production. This single decision point means our once machine-driven equation now: Requires a UI to expose the releases that have been made to the testing environments. Introduces auditing and security concerns so we can limit and then review who promoted which releases to which environments. Requires a UI to allow deployments to be promoted to the next environment. Requires a system that can model environments in a first-class manner so they can be reliably secured and managed through the UI, API, and reporting interfaces. This focus on the human element is frequently lost when CI/CD is presented as nothing more than a deployment step, automatically performed after the code has been compiled. For instance, the Jenkins documentation recommends that the test and production environments be modeled as stages in a Continuous Integration pipeline. At first glance, this example appears to provide a point in the process for a human to approve the deployment, but what happens to a build that was never intended to be pushed to production? Such a build would be canceled before the application is exposed to customers, resulting in a failed build. These failed builds are difficult to distinguish from builds that failed to compile or failed their tests, even though not promoting to production is the expected behavior of the Continuous Delivery process in this instance. In short, a good deployment tool facilitates the human decision-making process that is so common (if not essential) to deployments, or at the very least surfaces the current state of the deployments between environments and automates the deployment, so promotions between environments are easy and reliable. Conclusion Recognizing the different requirements between a machine-driven Continuous Integration process and a human-driven Continuous Delivery process is essential for delivering features to your customers in a fast, reliable, and repeatable manner. This is why using dedicated tools for Continuous Integration and Continuous Delivery can make sense. Happy deployments!

By Matthew Casperson

DevOps and Platform Engineering

In this post, you discover where Platform Engineering fits into your broader software delivery process. You see how Platform Engineering works with a DevOps process and why both DevOps and Platform Engineering can help your organization attain high performance. The Quick Version of DevOps DevOps stems from the simple idea of developers and ops working together. This became difficult to do in many organizations because these teams had conflicting goals. Organizations had aligned goals to the specialism of each team. The operations team needed to keep systems stable, while developers had to deliver more value more frequently. When the teams work in isolation, the increase in change from developers lowers system stability. You can see how this might create the conditions for conflict. You could overcome these conflicting goals by having dev and ops work more collaboratively. When people tried this, they found it was possible to deliver more changes in shorter timescales and increase reliability. Over ten years, the vague value statement of "developers and ops working together" grew into a well-defined set of capabilities, thanks to extensive research by Puppet and DORA. The DevOps structural equation model maps the capabilities and relationships found in the research. It was initially described in the book Accelerate, and DORA has continued to update it as part of their research program. The 2021 DevOps structural equation model. This model is helpful for teams looking for improvement opportunities and organizations looking to adopt DevOps and attain the benefits of high performance. You may have seen an older version of this diagram with fewer boxes. As you can see, the 2021 model is packed with ideas for specific capabilities you can adopt to become more DevOps. If you feel overwhelmed, read how to start using Continuous Delivery in our DevOps engineer's handbook. The crucial insight in this model is the importance of culture to your organization's technical performance and its performance against commercial and non-commercial goals. In 2022, DevOps has grown to mean: Developers and ops working together. A well-defined set of technical and non-technical capabilities. Assessing your success using whole-system measures. If you've been around long enough, you might notice that many of the changes that DevOps encouraged look like how we developed systems before dev and ops silos were created. Specialist teams were created for a reason, so as we back up and try another path, you should ensure you resolve those original issues without recreating the nasty side effects. The problem of scale and specialization still exists, so how do we overcome them healthily? Enter the Platform Engineering team. Platform Engineering Despite many new teams and job titles springing up around DevOps, the Platform Engineering team is, perhaps, the most aligned with the mindset and objectives of DevOps. Platform teams work with development teams to create one or more golden pathways representing a supported set of technology choices. These pathways don't prevent teams from using something else. Pathways encourage alignment without enforcing centralized decisions on development teams. Rather than pick up tickets, such as "create a test environment," platform teams create easy-to-use self-service tools for the development teams' use. A critical part of Platform Engineering is treating developers as customers, solving their problems, and reducing friction while advocating the adoption of aligned technology choices. For example, say your organization has plenty of experience running MySQL databases and has worked out how to solve issues such as: Scaling Backups Maintenance Security Replication Deployments Test databases A team choosing MySQL will get all these for free at the push of a button. Another team might still need to use something different, but they'll be responsible for their selections when they step off the pathway. Choosing a golden pathway accelerates your software delivery, lets you focus on the differentiating work, and gives you a support channel when things go wrong. Your time as a developer is best spent on the features that provide value to your customers, not setting up builds, environments, and other similar activities. Platform Engineering can make many tasks easier: Build pipelines Test and production environments Automated deployments Test frameworks Logging and monitoring Security features Platform Engineering reduces your operations burden when you scale up your software delivery team. You need fewer of these hard-to-find platform engineers overall, and by working on a platform team, they can make more impact than they could if they were embedded in development. Platform Engineering helps your organization scale its software delivery without losing some of the best small-team benefits. DevOps and Platform Engineering As you can see, Platform Engineering complements, rather than competes, with DevOps. To provide further proof of this positive relationship, the Puppet State of DevOps report found that DevOps high-performers are more likely to have Platform Engineering teams than low-performers. Category % with Platform Engineering Low 8% Mid 25% High 48% Platform Engineering alone doesn't provide a complete organizational view of performance. The DevOps structural equation model shows us capabilities for leadership, management, culture, and product that are outside a platform team's scope. This is why Platform Engineering belongs in a broader process, such as DevOps, rather than offering a replacement for one. Used together with DevOps, Platform Engineering is an excellent tool for scaling your software delivery capability. DevOps wants you to: Measure the performance of the whole system. Shorten and amplify feedback loops. Create a culture of continuous learning and improvement. Platform Engineering wants you to: Smooth the development experience. Create tools and workflows that enable self-service. Make it easy for developers to achieve system quality attributes (such as performance, observability, and security.) Conclusion As you grow your software delivery team, you must carefully manage the complexity of scale. Some organizations limit complexity by limiting a team's autonomy, but Platform Engineering provides a mechanism that tames complexity while preserving development team autonomy. Happy deployments!

By Steve Fenton

DevOps and CI/CD

DZone's Featured DevOps and CI/CD Resources

Top DevOps and CI/CD Experts

The Latest DevOps and CI/CD Topics