Bug #1553

[ARM]: pacman -Su systemd segfault => non-booting system

Added by GNUtoo almost 5 years ago. Updated almost 5 years ago.

Status: fixed
Priority: critical
Assignee: -
% Done: 100%


Description

Hi,

I recently upgraded my machine and got this:

# pacman -Su
:: Starting full system upgrade...
[...]
(28/40) upgrading systemd                                           [#####################################] 100%
/tmp/alpm_NHb9SO/.INSTALL: line 13: 14267 Segmentation fault      (core dumped) systemd-sysusers

I did it through SSH, and at this point the machine (a BeagleBone Green) hung.
I don't know whether systemd itself (PID 1) segfaulted, but if it did, this would have caused a kernel panic.

I later looked at what had happened using a serial console, and /sbin/init was gone.

Repair attempt:
Since I had an initramfs, I could chroot into the rootfs like this:

# chroot /new_root

If you are not booting from an initramfs, you can skip this step, as you are already in your rootfs.

Then I mounted the required directories:

# mount -t proc none /proc
# mount -t sysfs none /sys
# mount -t devtmpfs none /dev

Pacman didn't work either, because some of its libraries were missing or truncated, so I had to repair it too.
Pacman had checked the package signatures before applying the upgrade, though, so the packages in its cache had already been verified.
Here's an example of a broken library:

# pacman 
pacman: error while loading shared libraries: /usr/lib/libnghttp2.so.14: file too short

So I had to find out which package "/usr/lib/libnghttp2.so.14" belongs to.
This can be done on a working Parabola system with:
$ pacman -Q -o /usr/lib/libnghttp2*
/usr/lib/libnghttp2.so is owned by libnghttp2 1.23.1-1
/usr/lib/libnghttp2.so.14 is owned by libnghttp2 1.23.1-1
/usr/lib/libnghttp2.so.14.13.3 is owned by libnghttp2 1.23.1-1

or by guessing the name from the files in /var/cache/pacman/pkg/.
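
On the damaged system itself, `pacman -Q -o` is not available while pacman is broken. As a minimal sketch, assuming the cached package archives can still be read by `tar` (the helper name `find_owner_in_cache` is hypothetical, not a pacman tool), the cache can be searched for the package that ships a given file:

```shell
# Hypothetical helper: print every cached package archive that contains
# the given path. Paths inside packages have no leading slash.
find_owner_in_cache() {
    cache_dir=$1
    wanted=${2#/}                              # strip the leading "/"
    for pkg in "$cache_dir"/*.pkg.tar*; do
        case $pkg in *.sig) continue ;; esac   # skip detached signatures
        [ -f "$pkg" ] || continue
        # tar detects the compression format when listing an archive
        if tar tf "$pkg" 2>/dev/null | grep -qF -- "$wanted"; then
            printf '%s\n' "$pkg"
        fi
    done
}
```

For example, `find_owner_in_cache /var/cache/pacman/pkg /usr/lib/libnghttp2.so.14` would list the cached archives containing that library. Every cached version of the package matches, so the newest one still has to be picked by hand.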

Then I added the libraries back like this:

# cd $(mktemp -d)
# cp /var/cache/pacman/pkg/libnghttp2-1.27.0-1-armv7h.pkg.tar.xz .
# tar xf libnghttp2-1.27.0-1-armv7h.pkg.tar.xz
# cp usr/lib/* /usr/lib/

After doing that for each missing library, I re-installed them properly like this:

# pacman --force -U /var/cache/pacman/pkg/libnghttp2-1.27.0-1-armv7h.pkg.tar.xz
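
The manual copy has to be repeated for every damaged library package. A minimal sketch of that step, assuming only the files under `usr/lib` need to be restored (the function `restore_libs` and its arguments are hypothetical), extracts one package into a scratch directory and copies its libraries into a target root:

```shell
# Hypothetical helper: restore usr/lib from a package archive.
#   $1 = package archive, $2 = target root (defaults to /)
restore_libs() {
    pkg=$1
    root=${2:-/}
    work=$(mktemp -d) || return 1
    if ! tar xf "$pkg" -C "$work"; then
        rm -rf "$work"
        return 1
    fi
    status=0
    if [ -d "$work/usr/lib" ]; then
        mkdir -p "$root/usr/lib"
        # -a keeps symlinks such as libfoo.so -> libfoo.so.1 intact
        cp -a "$work/usr/lib/." "$root/usr/lib/" || status=1
    fi
    rm -rf "$work"
    return "$status"
}
```

This only makes pacman runnable again. The packages should still be re-installed with pacman afterwards so that its database matches the files on disk.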

I then installed systemd again. For each package in:

# ls /var/cache/pacman/pkg/*systemd*

I installed it with pacman -U
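
That per-package step can be sketched as a loop. This is only an illustration: the function name `reinstall_cached` and the `PACMAN` override variable are assumptions introduced here so the loop can be dry-run.

```shell
# Hypothetical helper: run "pacman -U" on every cached package whose file
# name contains the given pattern. Set PACMAN=echo for a dry run.
reinstall_cached() {
    pattern=$1
    cache_dir=${2:-/var/cache/pacman/pkg}
    # The .pkg.tar.xz suffix excludes the detached .sig files
    for pkg in "$cache_dir"/*"$pattern"*.pkg.tar.xz; do
        [ -f "$pkg" ] || continue
        ${PACMAN:-pacman} -U "$pkg" || return 1
    done
}
```

Running `PACMAN=echo reinstall_cached systemd` first shows which files would be installed. Like the `ls` pattern, the glob matches every cached version, so older versions may need to be moved out of the cache beforehand.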

Then, while it was being installed, it segfaulted again:

# pacman -U /var/cache/pacman/pkg/systemd-235.38-2.parabola1-arm
loading packages...
warning: systemd-235.38-2.parabola1 is up to date -- reinstalling
resolving dependencies...
looking for conflicting packages...

Package (1)  Old Version         New Version         Net Change

systemd      235.38-2.parabola1  235.38-2.parabola1    0.00 MiB

Total Installed Size:  18.07 MiB
Net Upgrade Size:       0.00 MiB

:: Proceed with installation? [Y/n] 
(1/1) checking keys in keyring                     [######################] 100%
(1/1) checking package integrity                   [######################] 100%
(1/1) loading package files                        [######################] 100%
(1/1) checking for file conflicts                  [######################] 100%
(1/1) checking available disk space                [######################] 100%
:: Processing package changes...
(1/1) reinstalling systemd                         [######################] 100%
/tmp/alpm_hYnttY/.INSTALL: line 13:  2122 Segmentation fault      systemd-sysusers
/tmp/alpm_hYnttY/.INSTALL: line 13:  2123 Segmentation fault      journalctl --update-catalog
:: Running post-transaction hooks...
(1/5) Updating linux-libre initcpios
==> Building image from preset: /etc/mkinitcpio.d/linux-libre.preset: 'default'
  -> -k 4.13.11-gnu-1 -c /etc/mkinitcpio.conf -g /boot/initramfs-linux-libre.img
==> Starting build: 4.13.11-gnu-1
  -> Running build hook: [base]
  -> Running build hook: [udev]
  -> Running build hook: [autodetect]
find: 'sort' terminated by signal 13
modprobe: ERROR: missing parameters. See -h.
  -> Running build hook: [modconf]
  -> Running build hook: [block]
  -> Running build hook: [lvm2]
  -> Running build hook: [filesystems]
  -> Running build hook: [keyboard]
  -> Running build hook: [fsck]
==> Generating module dependencies
==> Creating xz-compressed initcpio image: /boot/initramfs-linux-libre.img
==> Image generation successful
==> Building image from preset: /etc/mkinitcpio.d/linux-libre.preset: 'fallback'
  -> -k 4.13.11-gnu-1 -c /etc/mkinitcpio.conf -g /boot/initramfs-linux-libre-fallback.img -S autodetect
==> Starting build: 4.13.11-gnu-1
  -> Running build hook: [base]
  -> Running build hook: [udev]
  -> Running build hook: [modconf]
  -> Running build hook: [block]
  -> Running build hook: [lvm2]
  -> Running build hook: [filesystems]
  -> Running build hook: [keyboard]
  -> Running build hook: [fsck]
==> Generating module dependencies
==> Creating xz-compressed initcpio image: /boot/initramfs-linux-libre-fallback.img
==> Image generation successful
(2/5) Updating udev hardware database...
error: command terminated by signal 11: Segmentation fault
(3/5) Updating system user accounts...
/bin/sh: line 1:  3941 Segmentation fault      /usr/bin/systemd-sysusers "$(basename "$f")" 
/bin/sh: line 1:  3944 Segmentation fault      /usr/bin/systemd-sysusers "$(basename "$f")" 
/bin/sh: line 1:  3947 Segmentation fault      /usr/bin/systemd-sysusers "$(basename "$f")" 
error: command failed to execute correctly
(4/5) Creating temporary files...
/bin/sh: line 1:  3951 Segmentation fault      /usr/bin/systemd-tmpfiles --create "$(basename "$f")" 
/bin/sh: line 1:  3954 Segmentation fault      /usr/bin/systemd-tmpfiles --create "$(basename "$f")" 
/bin/sh: line 1:  3957 Segmentation fault      /usr/bin/systemd-tmpfiles --create "$(basename "$f")" 
/bin/sh: line 1:  3960 Segmentation fault      /usr/bin/systemd-tmpfiles --create "$(basename "$f")" 
/bin/sh: line 1:  3963 Segmentation fault      /usr/bin/systemd-tmpfiles --create "$(basename "$f")" 
/bin/sh: line 1:  3966 Segmentation fault      /usr/bin/systemd-tmpfiles --create "$(basename "$f")" 
/bin/sh: line 1:  3969 Segmentation fault      /usr/bin/systemd-tmpfiles --create "$(basename "$f")" 
/bin/sh: line 1:  3972 Segmentation fault      /usr/bin/systemd-tmpfiles --create "$(basename "$f")" 
/bin/sh: line 1:  3975 Segmentation fault      /usr/bin/systemd-tmpfiles --create "$(basename "$f")" 
/bin/sh: line 1:  3978 Segmentation fault      /usr/bin/systemd-tmpfiles --create "$(basename "$f")" 
error: command failed to execute correctly
(5/5) Arming ConditionNeedsUpdate...

And then after rebooting:

:: running cleanup hook [udev]
[   17.732128] systemd[1]: System time before build time, advancing clock.
[   17.796344] ip_tables: (C) 2000-2006 Netfilter Core Team
[   17.825391] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000008b
[   17.825391] 
[   17.834595] CPU: 0 PID: 1 Comm: systemd Not tainted 4.13.11-gnu-1 #1
[   17.840973] Hardware name: Generic AM33XX (Flattened Device Tree)
[   17.847132] [<c0111124>] (unwind_backtrace) from [<c010c420>] (show_stack+0x10/0x14)
[   17.854921] [<c010c420>] (show_stack) from [<c0b81d74>] (dump_stack+0x8c/0xa0)
[   17.862183] [<c0b81d74>] (dump_stack) from [<c01429dc>] (panic+0xf0/0x27c)
[   17.869096] [<c01429dc>] (panic) from [<c01469e0>] (complete_and_exit+0x0/0x1c)
[   17.876442] [<c01469e0>] (complete_and_exit) from [<dc051e64>] (0xdc051e64)
[   17.883457] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000008b
[   17.883457] 
[   29.498812] random: crng init done
[   62.434769] BUG: workqueue lockup - pool cpus=0 flags=0x4 nice=0 stuck for 43s!
[   62.442144] Showing busy workqueues and worker pools:
[   62.447231] workqueue events: flags=0x0
[   62.451093]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[   62.457137]     pending: vmstat_shepherd
[   62.461111] workqueue events_power_efficient: flags=0x80
[   62.466457]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=3/256
[   62.472499]     pending: neigh_periodic_work, do_cache_clean, neigh_periodic_work
[   62.480063] workqueue mm_percpu_wq: flags=0x8
[   62.484449]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
[   62.490490]     pending: vmstat_update
[   62.494276] workqueue pm: flags=0x4
[   62.497789]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=3/256
[   62.503829]     pending: pm_runtime_work, pm_runtime_work, pm_runtime_work
[   62.510777] workqueue writeback: flags=0x4e
[   62.514989]   pwq 2: cpus=0 flags=0x4 nice=0 active=1/256
[   62.520417]     pending: wb_workfn
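
The `exitcode=0x0000008b` in the panic message can be decoded with a little arithmetic. A minimal sketch, assuming the usual Linux convention that the low seven bits hold the terminating signal number and bit 0x80 marks a core dump (the function name `decode_exitcode` is hypothetical):

```shell
# Split a kernel-reported exit code into signal number and core-dump flag.
decode_exitcode() {
    code=$(($1))
    sig=$((code & 0x7f))               # low 7 bits: signal number
    core=$(( (code & 0x80) != 0 ))     # bit 0x80: core dump flag
    printf 'signal=%d core_dumped=%d\n' "$sig" "$core"
}

decode_exitcode 0x8b    # prints: signal=11 core_dumped=1
```

Signal 11 is SIGSEGV, which is consistent with PID 1 itself having segfaulted.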

History

#1

Updated by GNUtoo almost 5 years ago

After a long setup, I was able to somehow boot to bash and compile systemd.

Here's the result after booting with init=/bin/sh and running some commands:

[root@(none) /]# gdb /sbin/init 
GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying" 
and "show warranty" for details.
This GDB was configured as "armv7l-unknown-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /sbin/init...Reading symbols from /usr/lib/debug/usr/lib/systemd/systemd.debug...done.
done.
(gdb) run
Starting program: /usr/bin/init 
[tcsetpgrp failed in terminal_inferior: Inappropriate ioctl for device]
[tcsetpgrp failed in terminal_inferior: Inappropriate ioctl for device]
[tcsetpgrp failed in terminal_inferior: Inappropriate ioctl for device]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[tcsetpgrp failed in terminal_inferior: Inappropriate ioctl for device]

Program received signal SIGSEGV, Segmentation fault.
0xb6fe1480 in __tls_get_addr () from /lib/ld-linux-armhf.so.3
(gdb) bt
#0  0xb6fe1480 in __tls_get_addr () from /lib/ld-linux-armhf.so.3
#1  0x005013c4 in is_main_thread ()
    at ../systemd-stable/src/basic/process-util.c:844
#2  0x004ee144 in hashmap_base_new (hash_ops=0x572988 <string_hash_ops>, 
    type=HASHMAP_TYPE_PLAIN) at ../systemd-stable/src/basic/hashmap.c:730
#3  0x004dfefc in conf_files_list_strv_internal (strv=strv@entry=0xbefff748, 
    suffix=suffix@entry=0x53cd20 ".conf", root=root@entry=0x0, 
    flags=flags@entry=0, dirs=dirs@entry=0x593408)
    at ../systemd-stable/src/basic/conf-files.c:134
#4  0x004e0228 in conf_files_list_nulstr (strv=0xbefff748, 
    suffix=0x53cd20 ".conf", root=0x0, flags=0, 
    d=0x52260c "/etc/systemd/user.conf.d")
    at ../systemd-stable/src/basic/conf-files.c:193
#5  0x004ba798 in config_parse_many_nulstr (
    conf_file=0x5248f0 "/etc/systemd/user.conf", 
    conf_file_dirs=<optimized out>, sections=0x522680 "Manager", 
    lookup=0x4b957c <config_item_table_lookup>, table=0xbefff78c, 
    relaxed=false, userdata=0x0)
    at ../systemd-stable/src/shared/conf-parser.c:460
#6  0x0041fa10 in parse_config_file () at ../systemd-stable/src/core/main.c:771
#7  0x0041ad7c in main (argc=1, argv=0xbefffeb4)
    at ../systemd-stable/src/core/main.c:1625

#2

Updated by GNUtoo almost 5 years ago

I'll try to bisect it if I find the time, as there aren't many changes since the last working version. However, I don't have fast ARM devices, and I don't have a lot of time either.

As I understand it, the only changes to the PKGBUILD are:

commit 537adb86225271d3e895cdb8e75b193196c0d6aa
Author: Omar Vega Ramos <ovruni@gnu.org.pe>
Date:   Sat Nov 18 11:56:18 2017 -0500

    systemd-235.38-2.parabola1: updating version

diff --git a/libre/systemd/PKGBUILD b/libre/systemd/PKGBUILD
index 1bfef5d50..bac1e3f2c 100644
--- a/libre/systemd/PKGBUILD
+++ b/libre/systemd/PKGBUILD
@@ -10,11 +10,11 @@ pkgname=('systemd' 'libsystemd' 'systemd-sysvcompat')
 _libsystemd=('libsystemd-standalone' 'libudev' 'nss-systemd' 'nss-myhostname' 'nss-mymachines' 'nss-resolve')
 pkgname+=("${_libsystemd[@]}")
 # latest commit on stable branch
-_commit='7ba74d5f939d0322d6ea730dd0b5ceefd7d7f527'
+_commit='743b771c559c6101544f7358a42c8c519fe4b0db'
 # Bump this to latest major release for signed tag verification,
 # the commit count is handled by pkgver() function.
-pkgver=235.8
-pkgrel=1
+pkgver=235.38
+pkgrel=2
 pkgrel+=.parabola1
 arch=('i686' 'x86_64')
 arch+=('armv7h')
@@ -70,6 +70,8 @@ sha512sums=('SKIP'
             'e276fd1aedd7718333324fa9d99493fe99d951f446e3b590a99e2cc9562a0bd0e29693907997cb52096c39168c5be62ded3feedf93bacd3c9659d58775b6ca8d')

 _backports=(
+    # Fix typo in statx macro (#7180) (FS#56289)
+    '8e6a7a8b2be409d356bcaface00f6d44390c07ff'
 )

 _reverts=(

So we have one "backport" (8e6a7a8b2be409d356bcaface00f6d44390c07ff) and only 30 commits between 7ba74d5f939d0322d6ea730dd0b5ceefd7d7f527 and 743b771c559c6101544f7358a42c8c519fe4b0db, which should take about 4 to 5 bisect steps.
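
The number of bisect steps can be estimated with a short calculation. This is a sketch under the assumption of a linear history with exactly one offending commit among the candidates; the function name `bisect_steps` is hypothetical.

```shell
# Worst-case number of test builds needed to isolate one bad commit
# among N candidates by repeated halving.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))   # worst case: the larger half remains
        steps=$((steps + 1))
    done
    printf '%d\n' "$steps"
}

bisect_steps 30    # prints: 5
```

In the worst case 30 candidate commits need 5 test builds, and some outcomes finish in 4. With git, such a session would start with `git bisect start 743b771c559c6101544f7358a42c8c519fe4b0db 7ba74d5f939d0322d6ea730dd0b5ceefd7d7f527` in the systemd-stable tree, with each step requiring a build and a boot test on the ARM device.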

#3

Updated by GNUtoo almost 5 years ago

First download the old systemd packages:

mkdir systemd-packages && cd systemd-packages
wget -c https://repomirror.parabola.nu/libre/os/armv7h/libsystemd-235.8-1.parabola1-armv7h.pkg.tar.xz
wget -c https://repomirror.parabola.nu/libre/os/armv7h/libsystemd-235.8-1.parabola1-armv7h.pkg.tar.xz.sig
wget -c https://repomirror.parabola.nu/libre/os/armv7h/libsystemd-standalone-235.8-1.parabola1-armv7h.pkg.tar.xz
wget -c https://repomirror.parabola.nu/libre/os/armv7h/libsystemd-standalone-235.8-1.parabola1-armv7h.pkg.tar.xz.sig
wget -c https://repomirror.parabola.nu/libre/os/armv7h/libudev-235.8-1.parabola1-armv7h.pkg.tar.xz
wget -c https://repomirror.parabola.nu/libre/os/armv7h/libudev-235.8-1.parabola1-armv7h.pkg.tar.xz.sig
wget -c https://repomirror.parabola.nu/libre/os/armv7h/nss-myhostname-235.8-1.parabola1-armv7h.pkg.tar.xz
wget -c https://repomirror.parabola.nu/libre/os/armv7h/nss-myhostname-235.8-1.parabola1-armv7h.pkg.tar.xz.sig
wget -c https://repomirror.parabola.nu/libre/os/armv7h/nss-mymachines-235.8-1.parabola1-armv7h.pkg.tar.xz
wget -c https://repomirror.parabola.nu/libre/os/armv7h/nss-mymachines-235.8-1.parabola1-armv7h.pkg.tar.xz.sig
wget -c https://repomirror.parabola.nu/libre/os/armv7h/nss-resolve-235.8-1.parabola1-armv7h.pkg.tar.xz
wget -c https://repomirror.parabola.nu/libre/os/armv7h/nss-resolve-235.8-1.parabola1-armv7h.pkg.tar.xz.sig
wget -c https://repomirror.parabola.nu/libre/os/armv7h/nss-systemd-235.8-1.parabola1-armv7h.pkg.tar.xz
wget -c https://repomirror.parabola.nu/libre/os/armv7h/nss-systemd-235.8-1.parabola1-armv7h.pkg.tar.xz.sig
wget -c https://repomirror.parabola.nu/libre/os/armv7h/systemd-235.8-1.parabola1-armv7h.pkg.tar.xz
wget -c https://repomirror.parabola.nu/libre/os/armv7h/systemd-235.8-1.parabola1-armv7h.pkg.tar.xz.sig
wget -c https://repomirror.parabola.nu/libre/os/armv7h/systemd-sysvcompat-235.8-1.parabola1-armv7h.pkg.tar.xz
wget -c https://repomirror.parabola.nu/libre/os/armv7h/systemd-sysvcompat-235.8-1.parabola1-armv7h.pkg.tar.xz.sig
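
The same download list can be generated instead of typed out. A minimal sketch that reuses the mirror URL, version, and package names from the commands above (the function name `old_package_urls` is hypothetical):

```shell
# Print the URL of each old package and of its detached signature.
old_package_urls() {
    base=https://repomirror.parabola.nu/libre/os/armv7h
    ver=235.8-1.parabola1
    for name in libsystemd libsystemd-standalone libudev nss-myhostname \
                nss-mymachines nss-resolve nss-systemd systemd \
                systemd-sysvcompat; do
        for ext in pkg.tar.xz pkg.tar.xz.sig; do
            printf '%s/%s-%s-armv7h.%s\n' "$base" "$name" "$ver" "$ext"
        done
    done
}
```

The output can be fed to wget with `old_package_urls | wget -c -i -`, which reads the URLs from standard input.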

Then put the downloaded files in /var/cache/pacman/pkg/ on the target.

Then, on the target, verify the signatures of the old packages:

# for p in *235.8-1*.sig ; do pacman-key -v "$p" ; done

And install the old packages:

# for p in *235.8-1*.pkg.tar.xz ; do pacman -U "$p" ; done
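
The two loops run independently, so a failed signature check does not stop the installation. A minimal sketch that ties them together, assuming each package has a detached signature named `<package>.sig` next to it (the function name and the `VERIFY` and `INSTALL` override variables are hypothetical, added so the logic can be dry-run):

```shell
# Verify every signature first, then install only if all checks passed.
verify_and_install() {
    for pkg in "$@"; do
        ${VERIFY:-pacman-key -v} "$pkg.sig" || return 1
    done
    for pkg in "$@"; do
        ${INSTALL:-pacman -U} "$pkg" || return 1
    done
}
```

On the target this would be called as `verify_and_install *235.8-1*.pkg.tar.xz` from the directory holding the downloaded files.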

And the system will boot again.

#4

Updated by PaulK almost 5 years ago

I am experiencing the same problem and I also tried to rebuild systemd as described in #1564, without luck.

I will first try to install the packages from ALARM, which use the same git commit. I suspect they work, but would like to double-check. If they do work, then the cause is in the FSDG-related patches or in some other difference between the two builds. Otherwise, I'll go with a bisect as suggested.

#5

Updated by bill-auger almost 5 years ago

  • Priority changed from bug to critical
  • Subject changed from pacman -Su systemd segfault => non-booting system to [ARM]: pacman -Su systemd segfault => non-booting system

#6

Updated by PaulK almost 5 years ago

Looks like it's not failing with the following packages from ALARM:
  • libsystemd-235.38-4-armv7h.pkg.tar.xz
  • systemd-235.38-4-armv7h.pkg.tar.xz
  • systemd-sysvcompat-235.38-4-armv7h.pkg.tar.xz

#7

Updated by isacdaavid almost 5 years ago

  • Assignee set to isacdaavid

#8

Updated by isacdaavid almost 5 years ago

  • Status changed from open to in progress

thanks for looking into this.

systemd 235.38-4 from ALARM is also using backport 8e6a7a8b2be409d356bcaface00f6d44390c07ff, so that's likely not the issue. in addition they have another backport that we don't [1]. it isn't the kind of change I would expect to cause a broken init either, but I'm gonna try it first before moving on to investigating what has been added by our PKGBUILD.

there aren't many changes between Arch and ALARM:

--- a/PKGBUILD.arch
+++ b/PKGBUILD.alarm
@@ -3,6 +3,11 @@
 # Maintainer: Dave Reisner <dreisner@archlinux.org>
 # Maintainer: Tom Gundersen <teg@jklm.no>

+# ALARM: Kevin Mihelich <kevin@archlinuxarm.org>
+#  - disable gold/LTO
+#  - removed makedepend on gnu-efi-libs, set -Dgnuefi=false
+#  - backport fix for https://github.com/systemd/systemd/issues/7135
+
 pkgbase=systemd
 pkgname=('systemd' 'libsystemd' 'systemd-sysvcompat')
 # latest commit on stable branch
@@ -16,7 +21,7 @@ url="https://www.github.com/systemd/systemd" 
 makedepends=('acl' 'cryptsetup' 'docbook-xsl' 'gperf' 'lz4' 'xz' 'pam' 'libelf'
              'intltool' 'iptables' 'kmod' 'libcap' 'libidn' 'libgcrypt'
              'libmicrohttpd' 'libxslt' 'util-linux' 'linux-api-headers'
-             'python-lxml' 'quota-tools' 'shadow' 'gnu-efi-libs' 'git'
+             'python-lxml' 'quota-tools' 'shadow' 'git'
              'meson' 'libseccomp')
 options=('strip')
 validpgpkeys=('63CDA1E5D3FC22B998D20DD6327F26951A015CC4')  # Lennart Poettering <lennart@poettering.net>
@@ -50,6 +55,8 @@ sha512sums=('SKIP'
 _backports=(
        # Fix typo in statx macro (#7180) (FS#56289)
        '8e6a7a8b2be409d356bcaface00f6d44390c07ff'
+       # seccomp: include ARM set_tls in @default
+       'ce5faeac1f79f3afefcc129025a1cec0211313fb'
 )

 _reverts=(
@@ -114,9 +121,13 @@ prepare() {
 build() {
   local timeservers=({0..3}.arch.pool.ntp.org)

+  LDFLAGS+=" -Wl,-fuse-ld=bfd" 
+  CFLAGS+=" -fno-lto" 
+  CXXFLAGS+=" -fno-lto" 
+
   local meson_options=(
     -Daudit=false
-    -Dgnuefi=true
+    -Dgnuefi=false
     -Dima=false
     -Dlz4=true

[1]: https://github.com/systemd/systemd/issues/7135

#9

Updated by isacdaavid almost 5 years ago

this is gonna be more difficult than i originally planned. my ability to test systemd rests on my ability to boot from QEMU, but this is currently blocked by #1591.

any pointers would be appreciated.

#10

Updated by GNUtoo almost 5 years ago

Testing the most recent systemd resulted in this bug: #1553

#11

Updated by ovruni almost 5 years ago

  • % Done changed from 0 to 100
  • Assignee changed from isacdaavid to ovruni
  • Due date set to 2017-12-27

systemd-236.0-2.parabola1 was pushed and works fine

#12

Updated by ovruni almost 5 years ago

  • Status changed from in progress to fixed
