Comment 12 for bug 1770877

Revision history for this message
Alex Murray (alexmurray) wrote :

I reviewed tracker-miners (2.0.4-2) as checked into disco. This is not
a full security audit but rather a quick gauge of maintainability.

tracker-miners is used to automatically crawl and index a users documents and
other files, extracting useful metadata from the documents, images, videos and
other various file types and storing it in the main tracker store so it can be
queried by other applications (gnome-documents, nautilus etc)

- Build dependencies:
  - libglib2.0-dev, libgstreamer1.0-dev, libgstreamer-plugins-base1.0-dev,
    libstemmer-dev, libtracker-miner-2.0-dev, libtracker-sparql-2.0-dev,
    libpoppler-glib-dev, libgsf-1-dev, libexif-dev, libgexiv2-dev, libpng-dev,
    libtiff-dev, libvorbis-dev, libflac-dev, libtotem-plparser-dev, zlib1g-dev,
    libexempi-dev, libxml2-dev, libupower-glib-dev, libenca-dev,
    libiptcdata0-dev, valac, libgif-dev, libgxps-dev, libosinfo-1.0-dev,
    libtagc0-dev, libcue-dev, libseccomp-dev, dbus
  - Note this includes valac however at no point in the build log does this
    appear to actually be used / executed - should this be removed?

- No CVE history itself but has been used to leverage other vulnerabilities
  automatically in the past
  https://scarybeastsecurity.blogspot.com/2016/11/0day-poc-risky-design-decisions-in.html
  - this is now mitigated by a seccomp sandbox - see below and
    https://blogs.gnome.org/carlosg/2016/12/08/oh-the-security/ for some
    discussion of this

- preinst script to remove old obsolete conffile /etc/sysctl.d/30-tracker.conf
- postinst script on configure to increases the maximum number of user inotify
  watches via sysctl
- three user systemd unit files, dbus-activated and auto-started on login
- no system dbus services
- no setuid files
- no binaries in PATH
- no sudo fragments
- no udev rules
- tests run during the build, quite a few tests, but I'm unsure of coverage
- no cron jobs
- clean build log

- one instance of spawning a process, to spawn gunzip to extract compressed
  postscript files; it uses g_spawn_async_with_pipes() with an array for argv
  and reads the decompressed postscript from stdin of the spawned process
- memory management looked careful
- extensive file IO - all opened read-only and a bunch of custom parsers for
  different file formats, as well as piggybacking of external libs to extract
  metadata
- logging looked fine
- Uses environment variables: TRACKER_USE_CONFIG_FILES, TRACKER_EXTRACTORS_DIR,
  TRACKER_EXTRACTOR_RULES_DIR, TRACKER_BUS_TYPE,
  TRACKER_TEST_DOMAIN_ONTOLOGY_RULE, TRACKER_DB_ONTOLOGIES_DIR,
  TRACKER_USE_LOG_FILES, TRACKER_VERBOSITY, LANG, HOME
  - Will also try and evaluate anything that looks like an environment variable
    within the list of user configured directories to index - so if user
    specified "$HOME/Downloads" this will evaluate $HOME etc
- uses ioctl() to read FAT filesystem attributes to check if is marked hidden
  or not
- Does not use cryptography
- DBus is used to communicate back to main tracker store with results of
  extracted files
- Does not use webkit
- Temporary file handling uses g_mkstemp_full() during atomic writeback to
  files
- Does not use javascript
- No cppcheck errors
- Does not use polkit

- Doesn't use bubblewrap / AppArmor to sandbox extractors which would have been good to see however
  - Does use libseccomp to setup a reasonably tight syscall filter - only
    allows open for read, not write, and socket() for unix sockets only,
    doesn't permit exec*() so should generally be pretty safe
    https://gitlab.gnome.org/GNOME/tracker/commit/9a924b172074aa15b6c8e14bfc442d3d03cafb6b
    https://gitlab.gnome.org/GNOME/tracker/commit/5b4132cd2f8528800c187df20336b67063adf2e3
  - Good discussion of seccomp vs bubblewrap on upstream bug
    https://bugzilla.gnome.org/show_bug.cgi?id=764786
  - Also most extractors don't use full libs to extract
    - Some are hand-coded and just look for metadata within docs - eg. PS is
      hand-coded and doesn't try and interpret the whole doc just extracts a
      few particular key elements
    - Others use libs (eg poppler for PDF, giflib, libtiff etc) but again since
      doesn't usually try and do full render / interpretation of doc is safer -
      for instance is immune to the crash in poppler in LP #1803059
  - Would be interesting to try and use AppArmor to confine extractors - at
    least to use a similar profile to evince-thumbnail to restrict access to
    sensitive bits from HOME etc

- ACK from security team to promote to main.