I run Fedora and have the messaging apps Element and Signal installed via Flatpak. Last week, I upgraded to versions that bumped their runtime, org.freedesktop.Platform, and their base app, org.electronjs.Electron2.BaseApp, from 22.08 to 23.08. Since then, clicking on a notification has made Element go kaboom :(
I decided to see if I could identify the problem and submit a fix. (Spoiler: someone else solved it first, but I learned a lot along the way.)
tl;dr: filed some issues, and learned new things about Flatpak
0. tl;dr #2
In the general case, following debugging instructions from these two resources has been sufficient in the past:
- https://docs.flatpak.org/en/latest/debugging.html "Debugging"
- https://blogs.gnome.org/mclasen/2017/01/20/debugging-a-flatpak-application/ "Debugging a Flatpak application" by mclasen, from 2017-01-20
Basically, for im.riot.Riot, I would normally want to:
- $ flatpak install im.riot.Riot.Debug # install .Debug extension
- $ flatpak run --devel --command=sh im.riot.Riot # run a shell in devel mode for debug info and tools like 'gdb'
In this case, that was insufficient because the debugging information was in the Electron base layer, so I ultimately rebuilt Element locally, specifying the electron .Debug extension in Element's YAML file's "base-extensions:" list, and spent too much time figuring out how to install my locally-built Flatpak's own .Debug extension from my build directory. (See "Detour 2.1" below for details.)
Here are all my steps in case that helps someone else:
1. Investigate
Check for existing bug reports
First I wanted to see if another user or developer had already noticed the issue. I thought the problem might be specific to the Flatpak package, given the constraints on how a Flatpak can interact with the host system (e.g. portals). So my first stop was Flathub, going to the app page for Element (im.riot.Riot):
Detour 1: broken app pages
- https://github.com/flathub/website/issues/2045 "App pages broken"
That led to the above issue getting filed, since Element's app page was broken when I visited it. No one replied directly, but I was flattered to see it referenced in the #flathub:matrix.org chat.
Check for existing bug reports, take 2
So for now I just googled, and found that Flathub keeps each Flatpak's packaging repo under its GitHub organization, named after the app ID.
There wasn't an existing issue created for this, so I made one!
- https://github.com/flathub/im.riot.Riot/issues/406 'Notifications cause Element to crash after commit 6d9d107 "Use newer freedesktop platform 23.08"'
Finding the change that broke things
First I had to see if I could find the version/change that introduced the problem. I referred to Flatpak's bisect documentation but for reasons that I will investigate later, `flatpak bisect` didn't actually work for me, so I went the classic route of ... clone the package repo and bisect it myself!
Building locally
I referred to more documentation (Building your first Flatpak (even though this is like my third!))
[~]$ git clone https://github.com/flathub/im.riot.Riot.git
[~]$ cd im.riot.Riot/
[im.riot.Riot]$ git clone https://github.com/flathub/shared-modules.git
[im.riot.Riot]$ flatpak-builder build-dir im.riot.Riot.yaml
error: org.freedesktop.Sdk/x86_64/23.08 not installed
Failed to init: Unable to find sdk org.freedesktop.Sdk version 23.08
:( Let's install our prerequisites:
$ flatpak remote-add --user --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo
$ flatpak install --user --no-related org.freedesktop.Sdk/x86_64/23.08
$ flatpak install --user --no-related org.freedesktop.Platform/x86_64/23.08
$ flatpak install --user --no-related org.electronjs.Electron2.BaseApp/x86_64/23.08
Notes:
- remote-add: for my test environment, flathub exists already for the system, but I just want to test this locally, so adding it for user.
- --no-related: I use this to skip Locale extensions that are very large and likely unnecessary for testing (in this case, they're going to get installed anyway, ah well)
Now it basically builds!
Manually bisecting
I added a branch tag to im.riot.Riot.yaml to mark it as my debugging version:
app-id: im.riot.Riot
branch: dbg
base: org.electronjs.Electron2.BaseApp
...
Repeat this a few times:
$ git log # view changes and commit IDs
$ git checkout <COMMIT_ID> # test a commit
$ flatpak-builder --user --install --force-clean build-dir im.riot.Riot.yaml
$ flatpak run im.riot.Riot/x86_64/dbg # run my dbg branch to test
Testing in this case saw me messaging one of my Matrix accounts from another to generate the notification, clicking, and seeing if it crashed.
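The same loop can also be driven by `git bisect`, which picks the next commit to test instead of choosing one by hand from `git log`. Below is a minimal sketch, not what I actually ran: `bisect_step_cmds` is a hypothetical helper that only prints the build and run commands for whichever commit is checked out. It assumes flatpak-builder's `--default-branch` option is an acceptable substitute for editing `branch: dbg` into the manifest, which keeps the working tree clean between checkouts.

```shell
# Hypothetical helper: print the build-and-run commands for the commit that
# is currently checked out. Nothing is executed unless the output is piped
# to `sh`.
#   $1: app ID (assumed to match the manifest file name), default im.riot.Riot
#   $2: branch name for the test build, default dbg
bisect_step_cmds() {
  app_id=${1:-im.riot.Riot}
  branch=${2:-dbg}
  # --default-branch only applies when the manifest sets no branch itself.
  echo "flatpak-builder --user --install --force-clean --default-branch=$branch build-dir $app_id.yaml"
  echo "flatpak run $app_id/x86_64/$branch"
}

# Typical session, run by hand from the packaging repo:
#   git bisect start
#   git bisect bad                  # the current commit crashes
#   git bisect good <COMMIT_ID>     # an older commit known to work
#   bisect_step_cmds | sh           # build, install, launch; click a notification
#   git bisect good                 # or: git bisect bad, depending on the result
#   git bisect reset                # when git names the first bad commit
```

Marking each commit stays manual here because reproducing the crash means clicking a real notification.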
This quickly narrowed it down to this:
- https://github.com/flathub/im.riot.Riot/commit/6d9d107941143997e40649192116d93eefe38796 "Use newer freedesktop platform 23.08"
While the commit name references the org.freedesktop.Platform runtime change, spoiler: the issue is more related to the org.electronjs.Electron2.BaseApp base image change to 23.08.
This led to issue 406 (above) being filed, but while working on this, a friend sent me a message via Signal (also based on Electron) and that crashed when I clicked its notification! Uh oh.
Check for existing bug reports, Signal-edition
In Signal's case, someone had already filed an issue:
- https://github.com/flathub/org.signal.Signal/issues/521 "libnotify crash"
They had identified the issue as existing in libnotify (rather than specifically how the Flatpak used it), and had already filed an issue there:
- https://gitlab.gnome.org/GNOME/libnotify/-/issues/34 "Signal crashes in flatpak when notification is dismissed"
2. Debug
Finding the change that broke things, part 2: libnotify edition
So from there I decided to try to debug the source of the crash. First I tried installing the debug info for Element (im.riot.Riot) and running it in devel mode, so I could use gdb to catch the crash and look at the backtrace:
$ flatpak install --user im.riot.Riot.Debug
$ flatpak run --user --devel --command=sh im.riot.Riot
[📦 im.riot.Riot ~]$ cd /app/bin
[📦 im.riot.Riot bin]$ ./element &
[📦 im.riot.Riot bin]$ gdb --pid=<PID>
...
Enable debuginfod for this session? (y or [n]) y
...
(gdb) continue
...
Downloading separate debug info for /app/lib/libnotify.so.4
(element-desktop:6): libnotify-WARNING **: 18:22:08.633: Running in confined mode, using Portal notifications. Some features and hints won't be supported
Thread 1 "element-desktop" received signal SIGSEGV, Segmentation fault.
0x00007f945f8d1418 in ?? () from /app/lib/libnotify.so.4
(gdb) bt
#0 0x00007f945f8d1418 in () at /app/lib/libnotify.so.4
#1 0x00007f946f4fb4ea in g_closure_invoke (closure=0x173001ee90b0, return_value=0x0, n_param_values=4, param_values=0x7ffd598a9cb0, invocation_hint=0x7ffd598a9c30) at ../gobject/gclosure.c:832
...
Huh, despite installing the .Debug extension for Element, running it using --devel, and even having gdb try to download debug info for libnotify.so.4, there are no symbols.
I took a peek inside my Flatpak and saw that, when running with --devel, /app/lib/debug/ did have debug info for various libraries and packages, but not for libnotify.
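That check can be scripted from the sandbox shell. This is a minimal sketch under assumptions: `list_debug_files` is a hypothetical helper name, it assumes `find` is available inside the --devel sandbox, and it matches on file name only rather than relying on any particular layout under /app/lib/debug.

```shell
# Hypothetical helper: list debug files whose name mentions a given library.
#   $1: library name fragment, e.g. libnotify
#   $2: debug root to search, default /app/lib/debug
list_debug_files() {
  lib_name=$1
  debug_root=${2:-/app/lib/debug}
  # Errors (such as a missing debug root) are silenced so that "no output"
  # simply means "no debug info found".
  find "$debug_root" -type f -name "*${lib_name}*" 2>/dev/null
}
```

Running `list_debug_files libnotify` inside the sandbox and getting no output would match what I saw here.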
Note:
- for PID above, element launches multiple element-desktop processes, but the one I want is the following invocation, which abrt was kind enough to identify when I first encountered this problem:
- /app/Element/element-desktop --enable-wayland-ime --ozone-platform-hint=auto --enable-features=WaylandWindowDecorations,WebRTCPipeWireCapturer
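To pick that process out without reading through a process list, its command line can be matched against one of the flags above. The following is only a sketch: `find_pid_by_arg` is a hypothetical helper, it reads /proc directly because I have not checked which process tools exist inside the sandbox, and it assumes the `--ozone-platform-hint=auto` flag appears only on the main process.

```shell
# Hypothetical helper: print the PIDs whose command line contains a string.
#   $1: fixed string to look for, e.g. --ozone-platform-hint=auto
#   $2: proc root to scan, default /proc
find_pid_by_arg() {
  pattern=$1
  proc_root=${2:-/proc}
  for cmdline in "$proc_root"/[0-9]*/cmdline; do
    [ -r "$cmdline" ] || continue
    # Arguments in cmdline are NUL-separated; turn them into spaces to match.
    if tr '\0' ' ' < "$cmdline" | grep -qF -- "$pattern"; then
      pid_dir=${cmdline%/cmdline}
      echo "${pid_dir##*/}"
    fi
  done
}
```

For example, `gdb --pid="$(find_pid_by_arg '--ozone-platform-hint=auto')"` would attach to it, provided exactly one PID is printed.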
Detour 2.0: the quest for libnotify debugging symbols
So I tried a variety of things and did a lot of reading.
Looking around, I saw that libnotify wasn't directly included in im.riot.Riot but in its base layer, org.electronjs.Electron2.BaseApp:
- https://github.com/flathub/org.electronjs.Electron2.BaseApp/
I went back to #flatpak:matrix.org and asked around, and barthalion helpfully noted that apps using Electron usually don't retain their debugging symbols.
I tried to build a local version of the BaseApp under my own dbg branch, and then rebuilt my local dbg build of im.riot.Riot on top of it, but the libnotify symbols were still missing. Hmm.
Detour 2.1: successfully including debugging symbols from a base image
After some more reading, I found this very helpful blog post by Red Hat's Stefan Hajnoczi, and dug deeper into the Flatpak builder docs:
- http://blog.vmsplice.net/2022/04/debugging-flatpak-applications.html "Debugging Flatpak applications" by Stefan Hajnoczi
- https://docs.flatpak.org/en/latest/flatpak-builder-command-reference.html "Flatpak Builder Command Reference" on base-extensions
So I was finally able to test Element with libnotify debugging info by:
- Adding the following to my im.riot.Riot.yaml, telling it to actually use the Debug extension for org.electronjs.Electron2.BaseApp (using the flathub version, rather than my dbg build):
app-id: im.riot.Riot
branch: dbg
base: org.electronjs.Electron2.BaseApp
base-version: '23.08'
base-extensions:
  - org.electronjs.Electron2.BaseApp.Debug
runtime: org.freedesktop.Platform
...
- Actually installing the Debug extension of my locally-built dbg build (thanks to Stefan Hajnoczi's blog post above!). From my local git repo:
[im.riot.Riot]$ flatpak install --user im.riot.Riot.Debug
[im.riot.Riot]$ flatpak install --user --reinstall --assumeyes "$(pwd)/.flatpak-builder/cache" im.riot.Riot.Debug
Repeating earlier steps to run gdb and catch the crash, I got to:
...
[New Thread 0x7f61bff1b6c0 (LWP 184)]
(element-desktop:4): libnotify-WARNING **: 18:20:44.116: Running in confined mode, using Portal notifications. Some features and hints won't be supported
Thread 1 "element-desktop" received signal SIGSEGV, Segmentation fault.
0x00007f61b33cb418 in close_notification (reason=NOTIFY_CLOSED_REASON_DISMISSED, notification=0x25d801795f20) at ../libnotify/notification.c:708
708 if (notification->priv->closed_reason != NOTIFY_CLOSED_REASON_UNSET ||
(gdb) bt
#0 0x00007f61b33cb418 in close_notification (reason=NOTIFY_CLOSED_REASON_DISMISSED, notification=0x25d801795f20) at ../libnotify/notification.c:708
#1 proxy_g_signal_cb (proxy=<optimized out>, sender_name=<optimized out>, signal_name=<optimized out>, parameters=0x25d8002ae990, notification=0x25d801795f20) at ../libnotify/notification.c:795
#2 0x00007f61c2f154ea in g_closure_invoke (closure=0x25d8002d1150, return_value=0x0, n_param_values=4, param_values=0x7ffc29493d70, invocation_hint=0x7ffc29493cf0) at ../gobject/gclosure.c:832
...
The notification was no longer valid by the time we got to close_notification. Likely it was unref'd prematurely, possibly by Electron. From there, I spent some time reading source code, setting breakpoints, etc. However, I ultimately ran out of time, and Maximiliano beat me to it!
3. Resolution
As you can see in libnotify issue #34, noted above, Maximiliano (@msandova) was hard at work today on a fix in libnotify's fix-electron branch.
I rebuilt Electron's BaseApp using a gitlab-generated .tar.gz from the fix-electron branch, and rebuilt Element atop that, and that resolved the problem!
While I have to switch to other projects in life now, and I didn't really help solve the actual problem, it's been fun learning more about Flatpak and hopefully these notes will prove useful when debugging Flatpak issues in the future.
Thank yous
Thank you to the following very-helpful people:
- Maximiliano (@msandova)
- Stefan Hajnoczi
- barthalion