2023-10-06

[Technology] Element and Signal crashing on notification closure :(: novel-edition

I run Fedora and have the messaging apps Element and Signal installed via Flatpak.  Last week, I upgraded to versions that bumped their runtime, org.freedesktop.Platform, and their base image, org.electronjs.Electron2.BaseApp, from 22.08 to 23.08.  Since then, clicking on notifications has made Element go kaboom :(

I decided to see if I could identify the problem and submit a fix.  (Spoiler: someone else solved it first, but I learned a lot along the way.)

 tl;dr: filed some issues, and learned new things about Flatpak

0. tl;dr #2

In the general case, following debugging instructions from these two resources has been sufficient in the past:

Basically, for im.riot.Riot, I would normally want to:

  • $ flatpak install im.riot.Riot.Debug   # install .Debug extension
  • $ flatpak run --devel --command=sh im.riot.Riot # run a shell in devel mode for debug info and tools like 'gdb'

In this case, that was insufficient because the debugging information was in the Electron base layer, so I ultimately rebuilt Element locally, specifying the electron .Debug extension in Element's YAML file's "base-extensions:" list, and spent too much time figuring out how to install my locally-built Flatpak's own .Debug extension from my build directory.  (See "Detour 2.1" below for details.)

Here are all my steps in case that helps someone else:

1. Investigate

Check for existing bug reports

First I wanted to see whether another user or developer had already noticed the issue.  I thought the problem might be specific to the Flatpak package, given the constraints on how a Flatpak can interact with the host system (e.g. portals).  So my first stop was Flathub, going to the app page for Element (im.riot.Riot):

Detour 1: broken app pages

That led to the above issue getting filed, as visiting Element's app page .  No one directly replied but I was flattered to see it referenced in the #flathub:matrix.org chat:

Check for existing bug reports, take 2

So for now I just googled and found that Flathub Flatpaks have their package repos stored under Flathub's GitHub organisation, like:

There wasn't an existing issue created for this, so I made one!

Finding the change that broke things

First I had to see if I could find the version/change that introduced the problem.  I referred to Flatpak's bisect documentation but for reasons that I will investigate later, `flatpak bisect` didn't actually work for me, so I went the classic route of ... clone the package repo and bisect it myself!

Building locally

I referred to more documentation, "Building your first Flatpak" (even though this is like my third!):

[~]$ git clone https://github.com/flathub/im.riot.Riot.git
[~]$ cd im.riot.Riot/
[im.riot.Riot]$ git clone https://github.com/flathub/shared-modules.git
[im.riot.Riot]$ flatpak-builder build-dir im.riot.Riot.yaml
error: org.freedesktop.Sdk/x86_64/23.08 not installed
Failed to init: Unable to find sdk org.freedesktop.Sdk version 23.08

:(  Let's install our prerequisites:

$ flatpak remote-add --user --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo
$ flatpak install --user --no-related  org.freedesktop.Sdk/x86_64/23.08
$ flatpak install --user --no-related  org.freedesktop.Platform/x86_64/23.08
$ flatpak install --user --no-related  org.electronjs.Electron2.BaseApp/x86_64/23.08

Notes:

  • remote-add: for my test environment, flathub exists already for the system, but I just want to test this locally, so adding it for user.
  • --no-related: I use this to skip Locale extensions that are very large and likely unnecessary for testing (in this case, they're going to get installed anyway, ah well)

Now it basically builds!

Manually bisecting

I added a branch tag to im.riot.Riot.yaml to mark it as my debugging version:

app-id: im.riot.Riot
branch: dbg
base: org.electronjs.Electron2.BaseApp
...

Repeat this a few times:

$ git log       # view changes and commit IDs
$ git checkout <COMMIT_ID>    # test a commit
$ flatpak-builder --user --install --force-clean build-dir im.riot.Riot.yaml 
$ flatpak run im.riot.Riot/x86_64/dbg    # run my dbg branch to test
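
The loop above can also be driven by `git bisect`, which picks the next commit to test for you.  This is only a minimal sketch: the helper names `app_ref` and `build_and_run` and the variables `MANIFEST` and `BUILD_DIR` are my own placeholders, and `<COMMIT_ID>` stands for any older commit that did not crash.

```shell
# Sketch: wrap the build-and-test step so it can be repeated under `git bisect`.
# MANIFEST, BUILD_DIR and the function names are illustrative assumptions.
MANIFEST="im.riot.Riot.yaml"
BUILD_DIR="build-dir"

# Compose the ref passed to `flatpak run`, e.g. im.riot.Riot/x86_64/dbg.
app_ref() {
    local app_id="$1" arch="$2" branch="$3"
    printf '%s/%s/%s\n' "$app_id" "$arch" "$branch"
}

# Build and install whatever commit is currently checked out, then launch it.
build_and_run() {
    flatpak-builder --user --install --force-clean "$BUILD_DIR" "$MANIFEST" &&
        flatpak run "$(app_ref im.riot.Riot x86_64 dbg)"
}

# Typical session, run by hand inside the im.riot.Riot checkout:
#   git bisect start
#   git bisect bad                 # the current commit crashes
#   git bisect good <COMMIT_ID>    # an older commit that did not crash
#   build_and_run                  # click a notification, watch for a crash
#   git bisect good                # or: git bisect bad
#   ...repeat build_and_run plus good/bad until git names the first bad commit
#   git bisect reset
```

One caveat: the `branch: dbg` line is an uncommitted edit, so git may refuse to switch commits whenever the manifest itself changed between them.  Stashing the edit and re-applying it after each step is one way around that.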

Testing in this case saw me messaging one of my Matrix accounts from another to generate the notification, clicking, and seeing if it crashed.

This quickly narrowed it down to this:

While the commit name references the org.freedesktop.Platform runtime change, spoiler: the issue is more related to the org.electronjs.Electron2.BaseApp base image change to 23.08.

 This led to issue 406 (above) being filed, but while working on this, a friend sent me a message via Signal (also based on Electron) and that crashed when I clicked its notification!  Uh oh.

Check for existing bug reports, Signal-edition

In Signal's case, someone had already filed an issue:

They had identified the issue as existing in libnotify (rather than specifically how the Flatpak used it), and had already filed an issue there:

2. Debug

Finding the change that broke things, part 2: libnotify edition

 So from there I decided to try to debug the source of the crash.  First I tried installing the debug info for Element (im.riot.Riot) and running it in devel mode, so I could use gdb to catch the crash and look at the backtrace:

$ flatpak install --user im.riot.Riot.Debug
$ flatpak run --user --devel --command=sh im.riot.Riot
[📦 im.riot.Riot ~]$ cd /app/bin
[📦 im.riot.Riot bin]$ ./element &
[📦 im.riot.Riot bin]$ gdb --pid=<PID>
...
Enable debuginfod for this session? (y or [n]) y
...
(gdb) continue
...
Downloading separate debug info for /app/lib/libnotify.so.4
(element-desktop:6): libnotify-WARNING **: 18:22:08.633: Running in confined mode, using Portal notifications. Some features and hints won't be supported
Thread 1 "element-desktop" received signal SIGSEGV, Segmentation fault.
0x00007f945f8d1418 in ?? () from /app/lib/libnotify.so.4
(gdb) bt
#0  0x00007f945f8d1418 in  () at /app/lib/libnotify.so.4
#1  0x00007f946f4fb4ea in g_closure_invoke (closure=0x173001ee90b0, return_value=0x0, n_param_values=4, param_values=0x7ffd598a9cb0, invocation_hint=0x7ffd598a9c30) at ../gobject/gclosure.c:832
...

Huh, despite installing the .Debug extension for Element, running it using --devel, and even having gdb try to download debug info for libnotify.so.4, there are no symbols.

I took a peek inside my Flatpak to see that running with --devel meant /app/lib/debug/ did have debug info for various libraries and packages, but not libnotify.
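
A quick way to do that peek from the sandbox shell is sketched below.  The function names `find_debuginfo` and `has_debuginfo` are hypothetical helpers, and the sketch assumes debug files carry the library's name somewhere in their filename, which I have not verified for every package.

```shell
# List files under a debug directory whose name mentions a given library.
# Arguments: <debug-dir> <library-name-fragment>
find_debuginfo() {
    local debug_dir="$1" lib="$2"
    find "$debug_dir" -type f -name "*${lib}*" 2>/dev/null
}

# Succeed only if at least one matching debug file exists.
has_debuginfo() {
    [ -n "$(find_debuginfo "$1" "$2")" ]
}

# Inside the sandbox shell, for example:
#   has_debuginfo /app/lib/debug libnotify && echo "found" || echo "missing"
```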

Note:

  • for PID above, element launches multiple element-desktop processes, but the one I want is the following invocation, which abrt was kind enough to identify when I first encountered this problem:
    • /app/Element/element-desktop --enable-wayland-ime --ozone-platform-hint=auto --enable-features=WaylandWindowDecorations,WebRTCPipeWireCapturer 
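
If abrt isn't around to point at the right process, something like the following may help pick it out.  It is a sketch under two assumptions: that the Electron helper processes (renderer, zygote, GPU and so on) carry a `--type=...` argument while the main process does not, and that a `ps` supporting `-eo pid,args` is available inside the sandbox.  The function name `main_electron_pid` is made up for this example.

```shell
# Read "PID COMMAND ARGS..." lines on stdin and print the PID of the first
# process whose executable path ends in the given name and whose arguments
# contain no --type=... flag (assumed to mark Electron helper processes).
main_electron_pid() {
    local name="$1"
    awk -v name="$name" '
        { n = length($2) - length(name) + 1 }
        n > 0 && substr($2, n) == name && !index($0, "--type=") { print $1; exit }
    '
}

# Usage inside the sandbox shell:
#   ps -eo pid,args | main_electron_pid element-desktop
#   gdb --pid="$(ps -eo pid,args | main_electron_pid element-desktop)"
```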

Detour 2.0: the quest for libnotify debugging symbols

So I tried a variety of things and did a lot of reading.

Looking around, I saw that libnotify wasn't directly included in im.riot.Riot but in its base layer, org.electronjs.Electron2.BaseApp:

  • https://github.com/flathub/org.electronjs.Electron2.BaseApp/

I went back to #flatpak:matrix.org and asked around, and barthalion helpfully noted that apps using Electron usually don't retain their debugging symbols.

I tried to build a local version of that under my own dbg branch, and then rebuilt my local dbg build of im.riot.Riot using that dbg branch of Electron2, but they were still missing.  Hmm.

Detour 2.1: successfully including debugging symbols from a base image

After some more reading, I found this very helpful blog post by Red Hat's Stefan Hajnoczi, and dug deeper into the flatpak-builder docs:

So I was finally able to test Element with libnotify debugging info by:

  1. Adding the following to my im.riot.Riot.yaml, telling it to actually use the Debug extension for org.electronjs.Electron2.BaseApp (using the Flathub version, rather than my dbg build):
    app-id: im.riot.Riot
    branch: dbg
    base: org.electronjs.Electron2.BaseApp
    base-version: '23.08'
    base-extensions:
      - org.electronjs.Electron2.BaseApp.Debug
    runtime: org.freedesktop.Platform
    ...
  2. actually locally installing the Debug extension of my locally-built dbg build (thanks to Stefan Hajnoczi's blog post above!).  From my local git repo,
    [im.riot.Riot]$ flatpak install --user im.riot.Riot.Debug
    [im.riot.Riot]$ flatpak install --user --reinstall --assumeyes "$(pwd)/.flatpak-builder/cache" im.riot.Riot.Debug

Repeating earlier steps to run gdb and catch the crash, I got to:

...
[New Thread 0x7f61bff1b6c0 (LWP 184)]
(element-desktop:4): libnotify-WARNING **: 18:20:44.116: Running in confined mode, using Portal notifications. Some features and hints won't be supported
Thread 1 "element-desktop" received signal SIGSEGV, Segmentation fault.
0x00007f61b33cb418 in close_notification (reason=NOTIFY_CLOSED_REASON_DISMISSED, notification=0x25d801795f20) at ../libnotify/notification.c:708
708            if (notification->priv->closed_reason != NOTIFY_CLOSED_REASON_UNSET ||
(gdb) bt
#0  0x00007f61b33cb418 in close_notification (reason=NOTIFY_CLOSED_REASON_DISMISSED, notification=0x25d801795f20) at ../libnotify/notification.c:708
#1  proxy_g_signal_cb (proxy=<optimized out>, sender_name=<optimized out>, signal_name=<optimized out>, parameters=0x25d8002ae990, notification=0x25d801795f20) at ../libnotify/notification.c:795
#2  0x00007f61c2f154ea in g_closure_invoke (closure=0x25d8002d1150, return_value=0x0, n_param_values=4, param_values=0x7ffc29493d70, invocation_hint=0x7ffc29493cf0) at ../gobject/gclosure.c:832
...

The notification was no longer valid by the time we got to close_notification.  Likely, it was unref'd prematurely, possibly by Electron.  From there, I spent some time reading source code, setting breakpoints, etc.  However, I ultimately ran out of time, and Maximiliano beat me to it!

3. Resolution

As you can see in the libnotify issue #34 noted above, Maximiliano (@msandova) was hard at work today on a fix in libnotify's fix-electron branch.

I rebuilt Electron's BaseApp using a GitLab-generated .tar.gz from the fix-electron branch, then rebuilt Element atop that, and that resolved the problem!
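
For anyone repeating this: flatpak-builder wants a sha256 checksum alongside an archive source's URL, so swapping in a different tarball means recomputing it.  Below is a minimal sketch that prints such a source entry for a tarball you have already downloaded.  The function name `archive_source_yaml` is hypothetical, `<TARBALL_URL>` is a placeholder rather than the URL I actually used, and the indentation will need adjusting to wherever the libnotify module lives in the manifest.

```shell
# Print a flatpak-builder "archive" source entry for a local tarball.
# Arguments: <url-the-tarball-came-from> <path-to-local-copy>
archive_source_yaml() {
    local url="$1" tarball="$2" sum
    [ -r "$tarball" ] || return 1                    # bail out if the file is unreadable
    sum="$(sha256sum "$tarball" | cut -d' ' -f1)"    # keep only the hex digest
    printf '    sources:\n'
    printf '      - type: archive\n'
    printf '        url: %s\n' "$url"
    printf '        sha256: %s\n' "$sum"
}

# Example (placeholder URL and filename):
#   archive_source_yaml '<TARBALL_URL>' ./libnotify-fix-electron.tar.gz
```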

While I have to switch to other projects in life now, and I didn't really help solve the actual problem, it's been fun learning more about Flatpak and hopefully these notes will prove useful when debugging Flatpak issues in the future.

Thank yous

Thank you to the following very-helpful people:

  • Maximiliano (@msandova)
  • Stefan Hajnoczi
  • barthalion
