TL;DR: Caching is one of the most efficient ways to speed up your builds. Optimize your workflows. Talk to us today.
Caching helps optimize CI/CD build times with a minimal amount of effort — and yet, our research has shown that few developers know how to cache, or what the true benefits of caching are. Long build times slow down development, hamper your release frequency, and negatively impact the user experience — affecting your bottom line. One solution to this is caching.
In this article, we cover topics like what cache is, how caching works, how it can speed up your build times, why you should cache your builds, and more.
CI/CD caching with Bitrise is a series of articles that takes you through all there is to know about caching. Other caching articles in the series currently include "Dependency caching — then vs now" and "Dependency caching with Bitrise".
What is a cache and what does caching mean?
You may be wondering what a cache is and what it means to cache data. In broad terms, caching is a data-management technique that lets you reuse a previously created piece of information (such as a downloaded file) instead of creating it again (such as by downloading it from a remote server once more). Data can then be served faster and computations finish sooner, because the same work is not repeated. In its simplest form, a cache is just temporary data storage.
In the context of mobile CI/CD systems, caching relates to moving data between isolated builds. Each CI build runs in its own ephemeral, isolated virtual machine. This means that a typical CI workflow needs to take extra steps to bootstrap the environment that is normally already available on a developer’s local machine. This bootstrapping includes installing CLI tools, downloading third-party dependencies, and fetching the source code. These operations take precious execution time, so correctly caching any of them makes your CI workflows faster.
The image below illustrates this. Without a cache, the builds (Build#1, Build#2, and Build#3) are completely isolated from one another. With caching, however, data can be shared across builds, reducing execution time and making CI validation faster.
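To make the idea of reusing previously created data concrete, here is a minimal Python sketch. It is only an illustration: the `SimpleCache` class and the `slow_download` function are hypothetical names invented for this example, and the "download" is simulated with a short sleep rather than a real network request.

```python
import time


class SimpleCache:
    """Temporary storage that returns a stored value instead of recreating it."""

    def __init__(self):
        self._store = {}
        self.hits = 0    # how often a stored value was reused
        self.misses = 0  # how often the value had to be created

    def get_or_create(self, key, create):
        # Reuse the stored value if this key has been seen before.
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Otherwise do the expensive work once and remember the result.
        self.misses += 1
        value = create()
        self._store[key] = value
        return value


def slow_download(name):
    """Hypothetical stand-in for fetching a file from a remote server."""
    time.sleep(0.01)  # simulated network delay
    return f"contents of {name}"


cache = SimpleCache()
first = cache.get_or_create("library.zip", lambda: slow_download("library.zip"))
second = cache.get_or_create("library.zip", lambda: slow_download("library.zip"))
```

The second call never reaches `slow_download`: the value comes straight from the temporary store, which is the whole point of a cache.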
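The difference between isolated builds and builds that share a cache can also be sketched as a small simulation. This is a toy model under stated assumptions, not how any particular CI system works internally: `run_build`, the dependency names, and the dictionary standing in for a shared cache are all made up for illustration.

```python
def run_build(build_name, required, shared_cache=None):
    """Simulate one CI build on a fresh machine and return what it downloaded."""
    workspace = {}   # every build starts with an empty, ephemeral workspace
    downloaded = []
    for dep in required:
        if shared_cache is not None and dep in shared_cache:
            # Cache hit: reuse what an earlier build already fetched.
            workspace[dep] = shared_cache[dep]
        else:
            # Cache miss (or no cache at all): simulate a download.
            workspace[dep] = f"{dep} (downloaded by {build_name})"
            downloaded.append(dep)
            if shared_cache is not None:
                shared_cache[dep] = workspace[dep]
    return downloaded


DEPENDENCIES = ["cli-tool", "third-party-lib"]  # hypothetical names

# Without a cache, each of the three builds downloads everything again.
isolated = [run_build(f"Build#{n}", DEPENDENCIES) for n in (1, 2, 3)]

# With a shared cache, only the first build has to download.
cache = {}
cached = [run_build(f"Build#{n}", DEPENDENCIES, cache) for n in (1, 2, 3)]
```

In this model, `isolated` records two downloads for every build, while `cached` records downloads only for Build#1.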
Caching and the cooking concept: How it helps speed up build times
In “Caching – An Introduction”, GeeksforGeeks explained caching by means of a cooking concept.
“Let’s say you prepare dinner every day and you need some ingredients for food preparation. Whenever you prepare the food, will you go to your nearest shop to buy these ingredients? Absolutely no. That’s a time-consuming process and every time instead of visiting the nearest shop, you would like to buy the ingredients once and you will store that in your refrigerator. That will save a lot of time. This is caching and your refrigerator works like a cache/local store/temporary store. The cooking time gets reduced if the food items are already available in your refrigerator.
The same things happen in the system. In a system accessing data from primary memory (RAM) is faster than accessing data from secondary memory (disk). Caching acts as the local store for the data and retrieving the data from this local or temporary storage is easier and faster than retrieving it from the database. Consider it as a short-term memory that has limited space but is faster and contains the most recently accessed items. So If you need to rely on a certain piece of data often then cache the data and retrieve it faster from the memory rather than the disk.”
In a similar way, if you want to speed up your build times, you should store all the required content ‘locally’, just like someone would store cooking ingredients in a fridge. This way, all your content is close by and on the same system, so it is readily available when you start your builds. 'Content' includes things like your source code and your software dependencies*.
*A software dependency is a code library or package that is reused in a new piece of software. It contains built-in functionality that you can use directly in your own software. Developers use dependencies to avoid reinventing the wheel, which speeds up coding, saves time, and increases efficiency.
Thinking about this in practice: when you run your builds on your local system, the source code is already available locally. Software dependencies, however, may not be, as they are usually developed and published by other developers on the internet. You therefore need to download dependencies before starting your build, which can take a lot of time.
With caching, on the other hand, you can save downloaded dependencies to use when needed. So, when you run your CI build, dependencies are automatically restored (from low-latency storage) and the need to download them off the public internet is eliminated. This helps optimize your workflow and saves you time.
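The save-and-restore flow described above could look roughly like the following Python sketch. It is not how Bitrise implements caching. Everything here is an assumption for illustration: the function names, the `deps.lock` file, the simulated download, and the choice to key the cache on a hash of the lockfile (a common convention that this article does not itself prescribe). A temporary directory stands in for the low-latency storage.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def cache_key(lockfile: Path) -> str:
    # Assumption: identical lockfile contents mean identical dependencies.
    return hashlib.sha256(lockfile.read_bytes()).hexdigest()[:16]


def restore_dependencies(cache_root: Path, key: str, deps_dir: Path) -> bool:
    """Copy cached dependencies into the workspace; return True on a hit."""
    entry = cache_root / key
    if not entry.is_dir():
        return False
    shutil.copytree(entry, deps_dir, dirs_exist_ok=True)
    return True


def save_dependencies(cache_root: Path, key: str, deps_dir: Path) -> None:
    """Store the workspace's dependencies so that later builds can reuse them."""
    entry = cache_root / key
    if not entry.exists():
        shutil.copytree(deps_dir, entry)


def build(workspace: Path, cache_root: Path) -> str:
    lockfile = workspace / "deps.lock"
    lockfile.write_text("example-lib==1.0\n")  # stand-in for the checked-out source
    deps_dir = workspace / "deps"
    key = cache_key(lockfile)
    if restore_dependencies(cache_root, key, deps_dir):
        return "restored"
    # Cache miss: simulate downloading from the public internet.
    deps_dir.mkdir()
    (deps_dir / "example-lib.txt").write_text("pretend package")
    save_dependencies(cache_root, key, deps_dir)
    return "downloaded"


with tempfile.TemporaryDirectory() as cache_dir:
    results = []
    for _ in range(2):
        # Each build gets a brand-new, empty workspace, like an ephemeral VM.
        with tempfile.TemporaryDirectory() as ws:
            results.append(build(Path(ws), Path(cache_dir)))
```

The first build finds nothing in the cache and has to "download", while the second build restores the same files from the cache directory even though its workspace started out empty.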