Decreasing CI build times by up to 50% by caching derived data using GitHub Actions

We had a problem. Our CI pipeline was increasingly becoming a bottleneck in our iOS development. We here at Kogan like to develop at a fast pace, but we were constantly held up waiting for builds to complete, which led to a lot of frustration within the team. The rest of the engineering team had already switched to GitHub Actions (GHA), and with us still on CircleCI, it was time to make the change. This was the perfect opportunity to re-evaluate how our pipeline worked and make sure it was as efficient as it could be. With build times of over 30 minutes, there was a lot of room for improvement.

As we were making the switch, we brainstormed ways to improve the overall efficiency of our pipeline, and we kept returning to derived data. This is how Apple handles build caching within Xcode, but could we use it within our CI pipeline? After a bit of investigation, it turned out we weren’t the first to have this thought, and we used this blog post (https://michalzaborowski.medium.com/circleci-60-faster-builds-use-xcode-deriveddata-for-caching-96fb9a58930) as a base for our improvement.

So, where to begin? We started by replicating our existing CI pipeline in GHA, which went pretty smoothly apart from a few challenges accessing some of our private repositories. Build times improved slightly with the switch, but builds still regularly took more than 30 minutes to complete. We were already caching the Swift packages used in the project, but there was still plenty of room for improvement.

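First, we need to make sure the checkout fetches the full history of the repository rather than only the latest commit, which can be done simply by setting the option fetch-depth: 0 on the checkout action in our existing initial step. The later steps that look up the develop commit and restore file modification times rely on this history.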

  - uses: actions/checkout@v4
    with:
      token: ${{ secrets.GITHUB_TOKEN }}
      fetch-depth: 0

We then need to cache the derived data. This is done in two parts: saving the derived data to the cache when a pull request has been successfully merged, and restoring the latest derived data cache in the CI pipeline whenever a pull request is opened.

In order to identify the latest develop commit, we use the nrwl/nx-set-shas GHA marketplace action, which finds the commit and exposes its SHA as a step output. The step is given the id setSHAs so that later steps can reference that output.

  - name: Create variable for the nearest develop commit SHA
    id: setSHAs
    uses: nrwl/nx-set-shas@v3
    with:
      main-branch-name: 'develop'

Then we need to create a separate pipeline, which is used to save the derived data whenever a pull request is successfully merged into develop. The flow is very similar to our original one; the difference is that we save the cache at the end, as shown below. This caches the tmp/derived-data directory (which we have set as the derived data location in fastlane) against the latest develop commit SHA.

  - uses: actions/cache/save@v3
    name: Save Derived Data Cache
    with:
      path: tmp/derived-data
      key: v1-derived-data-cache-${{ steps.setSHAs.outputs.head }}
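
Put together, the cache-saving pipeline might look roughly like the following. This is a minimal sketch rather than our exact workflow: the workflow and job names, the push trigger on develop, the macos-latest runner and the fastlane lane name build_and_test are all assumptions made for illustration.

```yaml
# Sketch of a workflow that saves derived data after a merge to develop.
# Assumed for illustration: workflow/job names, the trigger, the runner,
# and the fastlane lane name (hypothetical).
name: Save Derived Data

on:
  push:
    branches:
      - develop

jobs:
  build-and-cache:
    runs-on: macos-latest
    steps:
      # Full history is needed so the develop commit can be resolved.
      - uses: actions/checkout@v4
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          fetch-depth: 0

      - name: Create variable for the nearest develop commit SHA
        id: setSHAs
        uses: nrwl/nx-set-shas@v3
        with:
          main-branch-name: 'develop'

      # The remaining preparation steps of the pull request pipeline would
      # go here too, since the flow is otherwise the same.

      # Hypothetical lane; assumed to write derived data to tmp/derived-data.
      - name: Build and test
        run: bundle exec fastlane build_and_test

      # Saving last means only a completed build populates the cache.
      - uses: actions/cache/save@v3
        name: Save Derived Data Cache
        with:
          path: tmp/derived-data
          key: v1-derived-data-cache-${{ steps.setSHAs.outputs.head }}
```

Because the save step runs last, a failed build stops the job before anything is written to the cache.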

Next we need to get the correct cached derived data in our CI pipeline for pull requests. We again use the latest develop commit SHA to find the correct derived data cache, this time with the restore version of the action used above. The key is tried first for an exact match; if none exists, the prefix listed under restore-keys is used to fall back to the most recent derived data cache with a partial match.

  - uses: actions/cache/restore@v3
    name: Restore Derived Data Cache
    with:
      path: tmp/derived-data
      key: v1-derived-data-cache-${{ steps.setSHAs.outputs.head }}
      restore-keys: |
        v1-derived-data-cache-

As in the blog post mentioned above, GHA sets the last modified time of each file to the time the repository was cloned. Xcode uses this time when deciding what needs rebuilding, so we need to reset each file's modification time from its commit history in order to take advantage of the derived data cache. We found a GHA marketplace action that does this for us.

  - name: Update mtime for incremental builds
    uses: chetan/git-restore-mtime-action@v2

Last but not least, we need to set IgnoreFileSystemDeviceInodeChanges to YES to ensure Xcode does not consider our cached derived data to be out of date.

  - name: Set IgnoreFileSystemDeviceInodeChanges flag
    run: defaults write com.apple.dt.XCBuild IgnoreFileSystemDeviceInodeChanges -bool YES
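
For reference, here is how the steps above could fit together in a single pull request workflow. Treat it as a sketch under stated assumptions, not our exact configuration: the workflow and job names, the pull_request trigger, the macos-latest runner and the fastlane lane name build_and_test are hypothetical, and the fallback prefix is written using the cache action's restore-keys input.

```yaml
# Sketch of the pull request pipeline with derived data caching.
# Assumed for illustration: workflow/job names, the trigger, the runner,
# and the fastlane lane name (hypothetical).
name: Pull Request CI

on:
  pull_request:
    branches:
      - develop

jobs:
  build-and-test:
    runs-on: macos-latest
    steps:
      # Full history is needed for the SHA lookup and the mtime restore.
      - uses: actions/checkout@v4
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          fetch-depth: 0

      - name: Create variable for the nearest develop commit SHA
        id: setSHAs
        uses: nrwl/nx-set-shas@v3
        with:
          main-branch-name: 'develop'

      # Exact key first, then the most recent cache sharing the prefix.
      - uses: actions/cache/restore@v3
        name: Restore Derived Data Cache
        with:
          path: tmp/derived-data
          key: v1-derived-data-cache-${{ steps.setSHAs.outputs.head }}
          restore-keys: |
            v1-derived-data-cache-

      # Reset file modification times from git history.
      - name: Update mtime for incremental builds
        uses: chetan/git-restore-mtime-action@v2

      - name: Set IgnoreFileSystemDeviceInodeChanges flag
        run: defaults write com.apple.dt.XCBuild IgnoreFileSystemDeviceInodeChanges -bool YES

      # Hypothetical lane; assumed to read derived data from tmp/derived-data.
      - name: Build and test
        run: bundle exec fastlane build_and_test
```

All of the cache preparation happens before the build step, so Xcode sees the restored derived data, the corrected timestamps and the flag from the very first compile.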

With all of that in place, we have successfully sped up our CI pipelines and decreased our build times by up to 50%. Before the change, on CircleCI, our builds were regularly exceeding 30 minutes; after caching derived data and switching to GHA, we got them down to roughly 15 minutes. This is a massive improvement and has definitely made us developers much happier!

Looking forward, we want to carry on improving our pipeline, and we are always looking for ways to keep it up to date and as fast as it can be. One problem we have encountered with this caching method is that, if a build problem occurs, there is no way to fully clear the cache and force a build to run without cached data other than manually deleting each cache individually. This can be time consuming, so we would like to investigate further and try to find ways to mitigate it.
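
One direction we could explore is a manually triggered workflow that deletes the caches through the GitHub CLI. The sketch below is untested against our setup and rests on several assumptions: the workflow and job names are hypothetical, the runner image is assumed to ship a GitHub CLI version recent enough to include the gh cache command, and the workflow token is assumed to be allowed the actions: write permission.

```yaml
# Sketch of a manually triggered workflow that clears repository caches.
# Assumed for illustration: workflow/job names, the runner, and a GitHub CLI
# version that includes the `gh cache` command.
name: Clear Caches

on:
  workflow_dispatch:

permissions:
  actions: write

jobs:
  clear-caches:
    runs-on: ubuntu-latest
    steps:
      - name: Delete all caches for this repository
        env:
          # The GitHub CLI reads its token and target repository from these.
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GH_REPO: ${{ github.repository }}
        run: gh cache delete --all
```

Note that --all removes every cache in the repository, including the Swift package cache, so the next build would start completely cold.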