Does this happen to you a lot? You write a shell script, run it, and it exits halfway because of an error. You patch the error and run the script again, but now half of its actions fail because they were already executed on your system the first time. To build a robust application, you need to write idempotent software.
The term idempotency was introduced by Benjamin Peirce in 1870 as a mathematical concept, but it now applies across programming, REST APIs, DevOps, and IT administration.
An idempotent script can run on the same system multiple times, and the result will always be the same. The script specifies a resource, and that specification determines which actions are performed on the system state. An idempotent script performs actions only when the resource has changed or is not yet in the desired state, and any action it does perform leads to the same result each time.
Well-written scripts are usually idempotent, especially when they run in a distributed system where some critical action in the script must take effect exactly once.
An idempotent script lets us carry out that critical task only once, even if the script runs several times. Put more simply, idempotency is the property that performing an activity multiple times has no further effect on the system beyond the first time. In effect, it is only performed once.
Let’s look at an easy example.
Say we need to create a directory and move some files into it before doing something critical. If we call mkdir directly, it creates the folder and the script may run properly the first time. But what happens when you run the script again? The folder already exists, mkdir reports an error, and a script that stops on errors fails at that point.
An idempotent script first checks whether the directory already exists and creates it only if it does not.
In this case we use mkdir -p, because the -p option makes sure that mkdir does not report an error if the directory already exists.
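The difference can be sketched in a few lines. This is a minimal illustration that works in a throwaway directory from mktemp -d; the names workdir and data/incoming are placeholders chosen for the example.

```shell
#!/usr/bin/env bash

# Work in a throwaway location so the example is safe to run anywhere.
workdir="$(mktemp -d)"
target="$workdir/data/incoming"   # hypothetical path, for illustration only

# First run: creates the directory and any missing parents.
first_status=0
mkdir -p "$target" || first_status=$?

# Second run: the directory already exists, yet -p lets mkdir succeed quietly.
second_status=0
mkdir -p "$target" || second_status=$?

# Plain mkdir, by contrast, reports an error for the existing directory.
plain_status=0
mkdir "$target" 2>/dev/null || plain_status=$?

echo "mkdir -p: $first_status then $second_status; plain mkdir: $plain_status"
```

Both mkdir -p calls should exit with status 0, while the plain mkdir call exits with a non-zero status because the directory is already there.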
The same thing happens when you remove a non-existent file with the rm command. Without the -f option, rm fails because the file does not exist; rm -f succeeds either way.
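A small sketch of the rm case, again in a temporary directory. The file name build.lock is made up for the example.

```shell
#!/usr/bin/env bash

workdir="$(mktemp -d)"
stale="$workdir/build.lock"   # hypothetical file name

touch "$stale"

# rm -f removes the file if it is present...
rm -f "$stale"

# ...and stays silent with exit status 0 if it is already gone.
force_status=0
rm -f "$stale" || force_status=$?

# Plain rm complains when the file is missing.
plain_status=0
rm "$stale" 2>/dev/null || plain_status=$?

echo "rm -f: $force_status; plain rm: $plain_status"
```

Keep in mind that -f also suppresses confirmation prompts, so it is worth double-checking the path before relying on it in a script.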
Another example:
Say we have a function(x) that performs a set of operations when executed and produces the result y. If we run function(x) again, the result is the same, although the set of operations it executes may differ, for instance because work that is already done gets skipped. Calling function(x) a second time has the same result as calling it the first time.
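One way to picture such a function in Bash is a helper that adds a line to a file only if it is missing. The function name ensure_line and the sample line are assumptions made up for this sketch, not part of any standard tool.

```shell
#!/usr/bin/env bash

# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
ensure_line() {
    local file="$1" line="$2"
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string.
    if grep -qxF -- "$line" "$file" 2>/dev/null; then
        echo "unchanged"
    else
        printf '%s\n' "$line" >> "$file"
        echo "appended"
    fi
}

conf="$(mktemp)"   # stand-in for a real config file

ensure_line "$conf" "export EDITOR=vim"   # prints: appended
ensure_line "$conf" "export EDITOR=vim"   # prints: unchanged
```

The file ends up with exactly one copy of the line no matter how many times the function is called; only the first call performs the append.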
Most of these tips are already well known, but they are easy to overlook when we write Bash scripts. Some of them are very basic (such as mounting or formatting), but building idempotent and robust applications often pays off in the long term, so they are still worth learning.
An idempotent script needs to verify the current state and act accordingly. To do that, some parts of the script must check the system state to see whether certain actions have already been performed, comparing the actual state against the required one.
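The check-then-act pattern described above might look like the following sketch. The helper install_file and the file app.conf are hypothetical names; the comparison uses cmp -s, which exits with status 0 when two files have identical content.

```shell
#!/usr/bin/env bash

# install_file SRC DEST: copy SRC to DEST only when DEST is missing or differs.
install_file() {
    local src="$1" dest="$2"
    # Verify the current state first: same content means nothing to do.
    if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
        echo "up to date"
        return 0
    fi
    mkdir -p "$(dirname "$dest")"
    cp "$src" "$dest"
    echo "installed"
}

work="$(mktemp -d)"
printf 'port=8080\n' > "$work/app.conf"   # hypothetical source file

install_file "$work/app.conf" "$work/etc/app.conf"   # prints: installed
install_file "$work/app.conf" "$work/etc/app.conf"   # prints: up to date
```

Because the function compares actual content rather than remembering that it ran before, it also repairs the destination if someone edits or deletes it between runs.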
Benefits of idempotent software in automated configuration management
This is where a truly idempotent script shines. Whenever the current state of the system is in question, the whole script can be run again to return the system to a known state.
Aaron Hunter, a software architect from Florida, says: “Idempotency, if implemented as a true specification, would also enable security, auditing, monitoring, and many other tasks. If I specify the state of a system, for example, any deviation from that state is a potential security issue.”
As he says, with truly idempotent scripts or software we can not only configure a VM but also write down the specification of that system, check the system against that specification at regular intervals, and keep it at its original specification.
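As a rough illustration of checking a system against a specification, the sketch below treats a plain text file as the spec, with one required directory per line. The spec format, the check_spec function, and the directory names are all assumptions invented for this example; real configuration management tools use far richer specifications.

```shell
#!/usr/bin/env bash

# check_spec SPEC_FILE: each non-empty line names a directory that must exist.
# Prints one "drift:" line per missing directory; returns non-zero on drift.
check_spec() {
    local spec="$1" status=0 path
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        if [ ! -d "$path" ]; then
            echo "drift: missing directory $path"
            status=1
        fi
    done < "$spec"
    return "$status"
}

root="$(mktemp -d)"
printf '%s\n' "$root/logs" "$root/cache" > "$root/spec.txt"
mkdir -p "$root/logs"   # only part of the spec is satisfied so far

# The cache directory is missing, so the check reports drift.
drift_status=0
check_spec "$root/spec.txt" || drift_status=$?

# Re-running the idempotent setup returns the system to the specified state.
mkdir -p "$root/logs" "$root/cache"
clean_status=0
check_spec "$root/spec.txt" || clean_status=$?

echo "before: $drift_status, after: $clean_status"
```

A check like this could be run on a schedule, with any reported drift treated as something to investigate.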
Traditionally, system admins might configure a highly defined and audited workload in a VM using DevOps tools like Jenkins, Chef, or Puppet. Without an idempotent tool, however, the administrator can’t be sure that the same results will occur every time.
Configuration management tools that are not idempotent can cause critical issues through phantom script fragments, which affect memory and storage.
In many cases, idempotency is not just a best practice but a fundamental behavior that must be applied to keep the system safe and robust. Good design requires that the state of an object does not change because an activity was unintentionally repeated. Idempotent systems conform to this rule, which makes them safe against unintentional retries. Retries can be triggered for many reasons, including network and system faults.
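To make the retry point concrete, here is a small sketch of a retry wrapper around an idempotent step. The retry and flaky_step functions are hypothetical; flaky_step deliberately fails on its first call, after doing its work, to imitate a fault that arrives once the action has already taken effect.

```shell
#!/usr/bin/env bash

# retry MAX CMD...: run CMD until it succeeds or MAX attempts are used up.
retry() {
    local max="$1" attempt=1
    shift
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
    done
}

state="$(mktemp -d)"
calls_file="$state/calls"
echo 0 > "$calls_file"

# Hypothetical step: does its work idempotently, then reports failure once.
flaky_step() {
    local calls
    calls=$(( $(cat "$calls_file") + 1 ))
    echo "$calls" > "$calls_file"
    mkdir -p "$state/output"   # safe to repeat on every attempt
    [ "$calls" -ge 2 ]         # the first call "fails" after doing the work
}

retry_status=0
retry 3 flaky_step || retry_status=$?

echo "status: $retry_status, attempts: $(cat "$calls_file")"
```

Since the step uses mkdir -p, the second attempt simply finds the directory in place and succeeds. With a plain mkdir, the retry itself would fail. A production wrapper would likely also wait between attempts, which this sketch leaves out.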
However, idempotent systems have a weakness: they know the desired state and will do whatever is necessary to return the system to it. If the administrator manually applies a quick fix on the system, an idempotent configuration treats that manual change like a bug and reverts it. To persist, the fix has to go through a more complex process and be added to the idempotent tool itself, which adds overhead to an already convoluted process.