---
title: System images
layout: docs
permalink: /docs/system-images/
---

This documentation shares what we currently know about making system images build reproducibly: for example, VM and cloud images, live systems, and OS installer ISO images.

General problems of reproducible system images
==============================================

The usual problems are:

* The filesystem needs to be created *at once*
* Filesystems have creation and/or modification timestamps
* Filesystems contain a UUID or a label that is not set explicitly
* Included files have timestamps
* Included files may be generated or updated at build time in a non-reproducible manner
* The integrated bootloader might have timestamps
* The integrated initramfs images may be built non-reproducibly while building the system image
* Timestamps that don't depend on `SOURCE_DATE_EPOCH` are not reproducible

For example, if a filesystem is created by `mkfs.ext4`, then mounted, modified and saved as an image, the result is not reproducible, because the allocation of inodes is undefined.

Thankfully, there are known solutions to most of these problems; read on.

ext2, ext3, ext4
================

Instead of using the `mkfs.ext*` tools, `make_ext4fs` can be used. `make_ext4fs` creates the whole filesystem at once. Its `-T` option allows setting the mtime to `$SOURCE_DATE_EPOCH`.

The `make_ext4fs` build you use should include the following commit:

* https://git.lede-project.org/?p=project/make_ext4fs.git;a=commit;h=bd53eaafbc2a89a57b8adda38f53098a431fa8f4

ISO filesystem
==============

When building ISO filesystems with `xorriso`:

- use a recent version of xorriso that honors [`$SOURCE_DATE_EPOCH`](https://reproducible-builds.org/specs/source-date-epoch/) for various ISO image metadata
- pass `$SOURCE_DATE_EPOCH` to xorriso's `--modification-date` to clamp all mtimes.

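
As a rough illustration of the `--modification-date` item above: to our knowledge, xorriso's mkisofs emulation expects this value as 16 decimal digits (`YYYYMMDDhhmmsscc`, in GMT) rather than as a raw Unix timestamp, so `$SOURCE_DATE_EPOCH` needs converting first. The sketch below assumes that format; please check it against the man page of your xorriso version. The helper name `xorriso_modification_date` is hypothetical.

```python
import os
from datetime import datetime, timezone


def xorriso_modification_date(epoch: int) -> str:
    """Format a Unix timestamp as YYYYMMDDhhmmsscc in UTC.

    Assumption: "cc" is hundredths of a second; SOURCE_DATE_EPOCH has
    whole-second resolution, so it is always "00" here.
    """
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%Y%m%d%H%M%S") + "00"


if __name__ == "__main__":
    # Falling back to 0 when the variable is unset is a choice made for this sketch only.
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))
    print("--modification-date=" + xorriso_modification_date(epoch))
```

The printed argument can then be appended to the xorriso command line used by the build.
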

It might also be necessary to:

* pass a fixed value to `isohybrid --id`
* ensure initrd images are built reproducibly

SquashFS metadata & compression
===============================

When building SquashFS images, both metadata and compression can make the output unreproducible, and older versions of the tools sometimes yielded unreproducible results.

A good `mksquashfs` will:

* Honor `$SOURCE_DATE_EPOCH` for various timestamps
* 'Clamp' content timestamps to `$SOURCE_DATE_EPOCH`
* Not reorder fragments based on multithreading conditions

[squashfs-tools](https://github.com/plougher/squashfs-tools) 4.5.1 (in Debian *bookworm*) should work here, having absorbed important features from [squashfskit](https://github.com/squashfskit/squashfskit).

Root filesystem content
=======================

A system image often contains a root filesystem that is generated during the build and packed in some format such as SquashFS.

Exclude unneeded files
----------------------

A number of files can simply be emptied or excluded when creating the root filesystem image (some to optimize size, some because they are not needed). Tails, for example, does this.

Beware: this can have hard-to-predict consequences. For example, Tails considered dropping even more files (such as the fontconfig cache), but saw unexpected results and performance issues when doing so, and finally discarded the idea.

Files metadata
--------------

The build process for a system image often creates or updates files, which generates file metadata that depends on when the build is performed.

One approach that has been used successfully to fix this problem is to clamp the mtimes of files in the root filesystem to `$SOURCE_DATE_EPOCH` before generating the system image.

Files generated or updated at build time
----------------------------------------

Package managers such as **dpkg** or **RPM** support `postinst` scripts and triggers that are run on the target system after unpacking a package, e.g.
to generate or update caches and indices… potentially in a non-deterministic manner.

To counter this, one possible approach is to replace these scripts and do the same work later (e.g. at first boot).

Another approach is to ensure these scripts generate or update files in a reproducible manner. This approach has the advantage of addressing the root cause of the problem and fixing it for every project that uses these programs. For example, such problems were fixed for:

- `/etc/kernel/postinst.d/apt-auto-removal`
- `/etc/shadow`
- fontconfig cache
- gdk-pixbuf's `loaders.cache`
- `giomodule.cache`
- GTK+ `immodules.cache`
- `/usr/share/applications/mimeinfo.cache`
- `/var/cache/cracklib/src-dicts`
- `/var/lib/gconf/defaults/%gconf-tree-*.xml`

Finally, one may occasionally want to use `strip-nondeterminism` to normalize the content of such files.

gettext
-------

GNU gettext's POT, PO and MO files were an interesting challenge. One way to approach this problem is to:

* only update POT files when it is really needed: if the `POT-Creation-Date` field is the only change after refreshing a POT file, the file doesn't need to be updated;
* avoid updating PO (and thus MO) files when only comments (e.g. line numbers) have changed.
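
The two gettext rules above could be approximated as follows. This is only a minimal sketch, not the method any particular project uses: the function names are hypothetical, it compares files line by line, and for PO files it assumes that only `#:` reference comments (source file and line number) are safe to ignore. Other comment types, such as `#, fuzzy` flags, are treated as real changes.

```python
def _without_prefix(text: str, prefix: str) -> list:
    """Return the lines of text, dropping those that start with prefix."""
    return [line for line in text.splitlines() if not line.startswith(prefix)]


def pot_needs_update(old_pot: str, new_pot: str) -> bool:
    """True if the refreshed POT differs in more than its POT-Creation-Date header."""
    marker = '"POT-Creation-Date:'
    return _without_prefix(old_pot, marker) != _without_prefix(new_pot, marker)


def po_needs_update(old_po: str, new_po: str) -> bool:
    """True if the refreshed PO differs in more than its '#:' reference comments."""
    return _without_prefix(old_po, "#:") != _without_prefix(new_po, "#:")
```

A build script could call these on the old and the freshly generated file contents, and keep the old file in place whenever the function returns `False`.
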