---
layout: event_detail
title: Collaborative Working Sessions - Filtering diffoscope output
event: hamburg2023
order: 301
permalink: /events/hamburg2023/filtering-diffoscope/
---

Goal: add patterns to filter out some parts of the output, or filters to show only some parts of the output.

Requirements:

- print a notice that parts of the output are being ignored
- indicate in the return code that the files are not identical

A number of options exist:

- `--exclude`
- `--exclude-command=REGEXP`: this skips commands matching REGEXP (`--exclude-command '^readelf.*gdb_index'`), but diffoscope then tries the next command, possibly falling back to a hexdump comparison
- output formats: `--json`, `--html`, `--htmldir`. Multiple output formats can be used together.
- `--load-existing-diff FILE`: diffoscope can produce all kinds of output from JSON. This can be combined with `jq` filtering or some other way to filter.

Internally, state is a series of deeply nested dictionaries. The comparator is called with a path of keys.

Issues about `--exclude*` already exist:

- https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/130
- https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/53
- https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/52

Filtering by "output level" is not enough. For example, in an RPM header, some specific fields should be ignored, but only those.

Idea: provide a command to filter the output using a jq-like path.
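
The jq-like path idea could be sketched as a small post-processing step on the `--json` output. This is only an illustration, not an existing diffoscope feature: it assumes each node of the JSON diff is a dictionary with a `source1` name, an optional `unified_diff`, and an optional `details` list of child nodes. The function names and the tuple-of-globs pattern format are hypothetical. It covers both requirements: ignored paths are reported, and the return code stays non-zero whenever anything differed, ignored or not.

```python
import fnmatch
import json
import sys


def path_matches(path, pattern):
    """True if every level of `path` matches the glob at the same level of `pattern`."""
    return len(path) == len(pattern) and all(
        fnmatch.fnmatchcase(name, glob) for name, glob in zip(path, pattern)
    )


def has_difference(node):
    """True if this node or any descendant still carries a unified diff."""
    return bool(node.get("unified_diff")) or any(
        has_difference(child) for child in node.get("details", [])
    )


def prune(node, patterns, path=()):
    """Remove children whose path matches a pattern; return the ignored paths.

    Paths are tuples of `source1` names below the root (assumed JSON layout).
    Containers left without any remaining difference are dropped as well.
    """
    ignored, kept = [], []
    for child in node.get("details", []):
        child_path = path + (child.get("source1", ""),)
        if any(path_matches(child_path, p) for p in patterns):
            ignored.append(child_path)
            continue
        ignored.extend(prune(child, patterns, child_path))
        if has_difference(child):
            kept.append(child)
    if "details" in node:
        node["details"] = kept
    return ignored


def main(argv):
    """Hypothetical CLI: filter.py DIFF.json PATTERN... with levels split on '::'."""
    with open(argv[1]) as handle:
        diff = json.load(handle)
    patterns = [tuple(arg.split("::")) for arg in argv[2:]]
    ignored = prune(diff, patterns)
    for ignored_path in ignored:
        # Requirement: tell the user that parts of the output are being ignored.
        print("ignoring: " + " -> ".join(ignored_path), file=sys.stderr)
    json.dump(diff, sys.stdout, indent=2)
    # Requirement: ignored differences still mean the files are not identical.
    return 1 if ignored or has_difference(diff) else 0
```

Saved as a script ending in `sys.exit(main(sys.argv))`, it could be run with a pattern such as `'header::BUILD*'`, and the filtered JSON could presumably be rendered afterwards with `--load-existing-diff`. Matching level by level instead of on a joined string avoids ambiguity when a source name itself contains a separator such as `/`.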