Project

General

Profile

Bug #238

Fastd crash with stack smashing

Added by Ruben Kelevra almost 3 years ago. Updated over 2 years ago.

Status:
New
Priority:
High
Assignee:
-
Start date:
08/23/2016
Due date:
% Done:

0%


Description

Aug 23 12:49:44 fastd[2740]: *** stack smashing detected ***: /usr/bin/fastd terminated
Aug 23 12:49:45 fastd[2740]: ======= Backtrace: =========
Aug 23 12:49:45 fastd[2740]: /usr/lib/libc.so.6(+0x70c4b)[0x7f7e0d03dc4b]
Aug 23 12:49:45 fastd[2740]: /usr/lib/libc.so.6(__fortify_fail+0x37)[0x7f7e0d0c5f17]
Aug 23 12:49:45 fastd[2740]: /usr/lib/libc.so.6(__fortify_fail+0x0)[0x7f7e0d0c5ee0]
Aug 23 12:49:45 fastd[2740]: /usr/bin/fastd[0x415b7f]
Aug 23 12:49:45 fastd[2740]: [0x1e1f9a0]
Aug 23 12:49:45 fastd[2740]: ======= Memory map: ========
Aug 23 12:49:45 fastd[2740]: 00400000-00430000 r-xp 00000000 fe:02 403820                             /usr/bin/fastd
Aug 23 12:49:45 fastd[2740]: 0062f000-00630000 r--p 0002f000 fe:02 403820                             /usr/bin/fastd
Aug 23 12:49:45 fastd[2740]: 00630000-00631000 rw-p 00030000 fe:02 403820                             /usr/bin/fastd
Aug 23 12:49:45 fastd[2740]: 01dde000-01e60000 rw-p 00000000 00:00 0                                  [heap]
Aug 23 12:49:45 fastd[2740]: 7f7e04000000-7f7e04021000 rw-p 00000000 00:00 0
Aug 23 12:49:45 fastd[2740]: 7f7e04021000-7f7e08000000 ---p 00000000 00:00 0
Aug 23 12:49:45 fastd[2740]: 7f7e0b48d000-7f7e0b4a3000 r-xp 00000000 fe:02 400481                     /usr/lib/libgcc_s.so.1
Aug 23 12:49:45 fastd[2740]: 7f7e0b4a3000-7f7e0b6a2000 ---p 00016000 fe:02 400481                     /usr/lib/libgcc_s.so.1
Aug 23 12:49:45 fastd[2740]: 7f7e0b6a2000-7f7e0b6a3000 r--p 00015000 fe:02 400481                     /usr/lib/libgcc_s.so.1
Aug 23 12:49:45 fastd[2740]: 7f7e0b6a3000-7f7e0b6a4000 rw-p 00016000 fe:02 400481                     /usr/lib/libgcc_s.so.1
Aug 23 12:49:45 fastd[2740]: 7f7e0b6a4000-7f7e0b6a5000 ---p 00000000 00:00 0
Aug 23 12:49:45 fastd[2740]: 7f7e0b6a5000-7f7e0bea5000 rw-p 00000000 00:00 0
Aug 23 12:49:45 fastd[2740]: 7f7e0bea5000-7f7e0beaf000 r-xp 00000000 fe:02 402665                     /usr/lib/libnss_files-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0beaf000-7f7e0c0af000 ---p 0000a000 fe:02 402665                     /usr/lib/libnss_files-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0c0af000-7f7e0c0b0000 r--p 0000a000 fe:02 402665                     /usr/lib/libnss_files-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0c0b0000-7f7e0c0b1000 rw-p 0000b000 fe:02 402665                     /usr/lib/libnss_files-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0c0b1000-7f7e0c0b7000 rw-p 00000000 00:00 0
Aug 23 12:49:45 fastd[2740]: 7f7e0c0bd000-7f7e0c0bf000 r-xp 00000000 fe:02 402677                     /usr/lib/libdl-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0c0bf000-7f7e0c2bf000 ---p 00002000 fe:02 402677                     /usr/lib/libdl-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0c2bf000-7f7e0c2c0000 r--p 00002000 fe:02 402677                     /usr/lib/libdl-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0c2c0000-7f7e0c2c1000 rw-p 00003000 fe:02 402677                     /usr/lib/libdl-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0c2c5000-7f7e0c2cf000 r-xp 00000000 fe:02 404289                     /usr/lib/libjson-c.so.2.0.2
Aug 23 12:49:45 fastd[2740]: 7f7e0c2cf000-7f7e0c4ce000 ---p 0000a000 fe:02 404289                     /usr/lib/libjson-c.so.2.0.2
Aug 23 12:49:45 fastd[2740]: 7f7e0c4ce000-7f7e0c4cf000 r--p 00009000 fe:02 404289                     /usr/lib/libjson-c.so.2.0.2
Aug 23 12:49:45 fastd[2740]: 7f7e0c4cf000-7f7e0c4d0000 rw-p 0000a000 fe:02 404289                     /usr/lib/libjson-c.so.2.0.2
Aug 23 12:49:45 fastd[2740]: 7f7e0c4d5000-7f7e0c721000 r-xp 00000000 fe:02 400188                     /usr/lib/libcrypto.so.1.0.0
Aug 23 12:49:45 fastd[2740]: 7f7e0c721000-7f7e0c921000 ---p 0024c000 fe:02 400188                     /usr/lib/libcrypto.so.1.0.0
Aug 23 12:49:45 fastd[2740]: 7f7e0c921000-7f7e0c93d000 r--p 0024c000 fe:02 400188                     /usr/lib/libcrypto.so.1.0.0
Aug 23 12:49:45 fastd[2740]: 7f7e0c93d000-7f7e0c949000 rw-p 00268000 fe:02 400188                     /usr/lib/libcrypto.so.1.0.0
Aug 23 12:49:45 fastd[2740]: 7f7e0c949000-7f7e0c94c000 rw-p 00000000 00:00 0
Aug 23 12:49:45 fastd[2740]: 7f7e0c94d000-7f7e0c9b1000 r-xp 00000000 fe:02 401553                     /usr/lib/libsodium.so.18.1.1
Aug 23 12:49:45 fastd[2740]: 7f7e0c9b1000-7f7e0cbb1000 ---p 00064000 fe:02 401553                     /usr/lib/libsodium.so.18.1.1
Aug 23 12:49:45 fastd[2740]: 7f7e0cbb1000-7f7e0cbb2000 r--p 00064000 fe:02 401553                     /usr/lib/libsodium.so.18.1.1
Aug 23 12:49:45 fastd[2740]: 7f7e0cbb2000-7f7e0cbb3000 rw-p 00065000 fe:02 401553                     /usr/lib/libsodium.so.18.1.1
Aug 23 12:49:45 fastd[2740]: 7f7e0cbb5000-7f7e0cbbf000 r-xp 00000000 fe:02 403815                     /usr/lib/libuecc.so.0.7
Aug 23 12:49:45 fastd[2740]: 7f7e0cbbf000-7f7e0cdbe000 ---p 0000a000 fe:02 403815                     /usr/lib/libuecc.so.0.7
Aug 23 12:49:45 fastd[2740]: 7f7e0cdbe000-7f7e0cdbf000 r--p 00009000 fe:02 403815                     /usr/lib/libuecc.so.0.7
Aug 23 12:49:45 fastd[2740]: 7f7e0cdbf000-7f7e0cdc0000 rw-p 0000a000 fe:02 403815                     /usr/lib/libuecc.so.0.7
Aug 23 12:49:45 fastd[2740]: 7f7e0cdc5000-7f7e0cdc9000 r-xp 00000000 fe:02 394023                     /usr/lib/libcap.so.2.25
Aug 23 12:49:45 fastd[2740]: 7f7e0cdc9000-7f7e0cfc8000 ---p 00004000 fe:02 394023                     /usr/lib/libcap.so.2.25
Aug 23 12:49:45 fastd[2740]: 7f7e0cfc8000-7f7e0cfc9000 rw-p 00003000 fe:02 394023                     /usr/lib/libcap.so.2.25
Aug 23 12:49:45 fastd[2740]: 7f7e0cfcd000-7f7e0d162000 r-xp 00000000 fe:02 401712                     /usr/lib/libc-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0d162000-7f7e0d361000 ---p 00195000 fe:02 401712                     /usr/lib/libc-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0d361000-7f7e0d365000 r--p 00194000 fe:02 401712                     /usr/lib/libc-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0d365000-7f7e0d367000 rw-p 00198000 fe:02 401712                     /usr/lib/libc-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0d367000-7f7e0d36b000 rw-p 00000000 00:00 0
Aug 23 12:49:45 fastd[2740]: 7f7e0d36d000-7f7e0d385000 r-xp 00000000 fe:02 401689                     /usr/lib/libpthread-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0d385000-7f7e0d584000 ---p 00018000 fe:02 401689                     /usr/lib/libpthread-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0d584000-7f7e0d585000 r--p 00017000 fe:02 401689                     /usr/lib/libpthread-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0d585000-7f7e0d586000 rw-p 00018000 fe:02 401689                     /usr/lib/libpthread-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0d586000-7f7e0d58a000 rw-p 00000000 00:00 0
Aug 23 12:49:45 fastd[2740]: 7f7e0d58d000-7f7e0d5b0000 r-xp 00000000 fe:02 401710                     /usr/lib/ld-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0d799000-7f7e0d79d000 rw-p 00000000 00:00 0
Aug 23 12:49:45 fastd[2740]: 7f7e0d7af000-7f7e0d7b0000 r--p 00022000 fe:02 401710                     /usr/lib/ld-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0d7b0000-7f7e0d7b1000 rw-p 00023000 fe:02 401710                     /usr/lib/ld-2.24.so
Aug 23 12:49:45 fastd[2740]: 7f7e0d7b1000-7f7e0d7b3000 rw-p 00000000 00:00 0
Aug 23 12:49:45 fastd[2740]: 7f7e0d7b3000-7f7e0d7b5000 rw-p 00000000 00:00 0
Aug 23 12:49:45 fastd[2740]: 7fffd24a3000-7fffd24c4000 rw-p 00000000 00:00 0                          [stack]
Aug 23 12:49:45 fastd[2740]: 7fffd25ad000-7fffd25af000 r--p 00000000 00:00 0                          [vvar]
Aug 23 12:49:45 fastd[2740]: 7fffd25af000-7fffd25b1000 r-xp 00000000 00:00 0                          [vdso]
Aug 23 12:49:45 fastd[2740]: ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Aug 23 12:49:45  systemd[1]: fastd@gronau.service: Main process exited, code=killed, status=6/ABRT
Aug 23 12:49:45  systemd[1]: fastd@gronau.service: Unit entered failed state.
Aug 23 12:49:45  systemd[1]: fastd@gronau.service: Failed with result 'signal'.

While no coredump is created:

  1. coredumpctl list COMM fastd
    No coredumps found.

usr.7z (138 KB) Ruben Kelevra, 09/14/2016 06:21 PM

History

#1 Updated by Matthias Schiffer almost 3 years ago

How reproducible is this issue? If you can reproduce it easily, could you try running fastd with debug information in gdb and provide a backtrace?

#2 Updated by Ruben Kelevra almost 3 years ago

Actually not, the only thing which changed in the last days:

We update nodes from very old firmware with fastd v14 to a firmware which got fastd v16 and then step up to fastd v18. The last step includes a complete wipe of the old firmware, so a new key is generated, but currently there are many v16 nodes out there, on this instance it might be something around 40 nodes.

This is the first crash of this manner.

Actually I recompile libuecc and fastd after each update of the system after a reboot and reboot after this to make sure everything is fine, so the versions used to compile are just the latest available on archlinux:

$ yaourt -Qi libuecc
Name : libuecc
Version : 7-1
$ yaourt -Qi fastd
Name : fastd
Version : 18-1
$ yaourt -Qi libsodium
Name : libsodium
Version : 1.0.11-1

#3 Updated by Matthias Schiffer almost 3 years ago

Could you send me the fastd binary that generated the above crashdump?

#4 Updated by Ruben Kelevra almost 3 years ago

Sure, I've added all files mentioned in the trace from the snapshot 2016-08-23T12:00:01, the nearest snapshot of this machine before the crash happend.

#5 Updated by Ruben Kelevra almost 3 years ago

Sure, I've added all files mentioned in the trace from the snapshot 2016-08-23T12:00:01, the nearest snapshot of this machine before the crash happend.

please delete the post before :)

#6 Updated by Ruben Kelevra almost 3 years ago

Hm it seems to be impossible to add a file with 2 MB to the post... so here:

https://cloud.vfn-nrw.de/s/FszM8OSK0cmqWQ7

#7 Updated by Matthias Schiffer almost 3 years ago

Unfortunately, fastd[0x415b7f] is at the very end of the main() function, so the stack corruption could have happened at any point.

Have you seen this issue more than once? Could you try running fastd in valgrind using the same configuration and see if any warnings pop up?

Rebuilding fastd to include debug information (option -g) and not stripping the binary would help analysis as well.

#8 Updated by Ruben Kelevra almost 3 years ago

Have you seen this issue more than once?

Nope

Rebuilding fastd to include debug information (option -g) and not stripping the binary would help analysis as well.
Could you try running fastd in valgrind using the same configuration and see if any warnings pop up?

Rebuilding libuecc and fastd with "-O0 -g". Then I gonna try this at the gronau instance.

#9 Updated by Ruben Kelevra almost 3 years ago

Valgrind won't let me start fastd:

[fastd@rig fastd]$  valgrind --leak-check=yes /usr/bin/fastd --syslog-level info --syslog-ident fastd@gronau -c /etc/fastd/gronau/fastd.conf
==28429== Memcheck, a memory error detector
==28429== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==28429== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==28429== Command: /usr/bin/fastd --syslog-level info --syslog-ident fastd@gronau -c /etc/fastd/gronau/fastd.conf
==28429== 
vex amd64->IR: unhandled instruction bytes: 0x8F 0xEA 0xF8 0x10 0xC2 0x1A 0x1 0x0
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==28429== valgrind: Unrecognised instruction at address 0x4180dd.
==28429==    at 0x4180DD: ??? (in /usr/bin/fastd)
==28429==    by 0x422A99: fastd_cipher_init (in /usr/bin/fastd)
==28429==    by 0x407C00: ??? (in /usr/bin/fastd)
==28429==    by 0x404DD7: main (in /usr/bin/fastd)
==28429== Your program just tried to execute an instruction that Valgrind
==28429== did not recognise.  There are two possible reasons for this.
==28429== 1. Your program has a bug and erroneously jumped to a non-code
==28429==    location.  If you are running Memcheck and you just saw a
==28429==    warning about a bad jump, it's probably your program's fault.
==28429== 2. The instruction is legitimate but Valgrind doesn't handle it,
==28429==    i.e. it's Valgrind's fault.  If you think this is the case or
==28429==    you are not sure, please let us know and we'll try to fix it.
==28429== Either way, Valgrind will now raise a SIGILL signal which will
==28429== probably kill your program.
==28429== 
==28429== Process terminating with default action of signal 4 (SIGILL): dumping core
==28429==  Illegal opcode at address 0x4180DD
==28429==    at 0x4180DD: ??? (in /usr/bin/fastd)
==28429==    by 0x422A99: fastd_cipher_init (in /usr/bin/fastd)
==28429==    by 0x407C00: ??? (in /usr/bin/fastd)
==28429==    by 0x404DD7: main (in /usr/bin/fastd)
==28429== 
==28429== HEAP SUMMARY:
==28429==     in use at exit: 0 bytes in 0 blocks
==28429==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==28429== 
==28429== All heap blocks were freed -- no leaks are possible
==28429== 
==28429== For counts of detected and suppressed errors, rerun with: -v
==28429== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Ung├╝ltiger Maschinenbefehl (Speicherabzug geschrieben)

#10 Updated by Ruben Kelevra almost 3 years ago

I think it's just my tuning to the processor, I gonna recompile it again with -march=x86-64 -mtune=generic

#11 Updated by Ruben Kelevra almost 3 years ago

==7076== Memcheck, a memory error detector
==7076== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==7076== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==7076== Command: /usr/bin/fastd --syslog-level info --syslog-ident fastd@gronau -c /etc/fastd/gronau/fastd.conf
==7076== 
==7076== 
==7076== HEAP SUMMARY:
==7076==     in use at exit: 127,533 bytes in 2,745 blocks
==7076==   total heap usage: 3,538 allocs, 793 frees, 344,888 bytes allocated
==7076== 
==7076== LEAK SUMMARY:
==7076==    definitely lost: 0 bytes in 0 blocks
==7076==    indirectly lost: 0 bytes in 0 blocks
==7076==      possibly lost: 0 bytes in 0 blocks
==7076==    still reachable: 127,533 bytes in 2,745 blocks
==7076==         suppressed: 0 bytes in 0 blocks
==7076== Reachable blocks (those to which a pointer was found) are not shown.
==7076== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==7076== 
==7076== For counts of detected and suppressed errors, rerun with: -v
==7076== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

#12 Updated by Matthias Schiffer over 2 years ago

The weirdest thing about this crash is the [0x1e1f9a0] on the call stack, which is a heap address.

Unfortunately, I have no idea where I should start looking for this bug without a core dump. You should really try to fix core dumps on your system (you can test if they work by manually sending signals like SIGSEGV or SIGABRT to processes).

#13 Updated by Ruben Kelevra over 2 years ago

Coredumps still working fine, I've tested them for the segfault-ticket. When I send Segfault to a sleep a dump is recorded.

Fastd isn't getting dumped. Which setting do I have to change to enable coredumps on fastd?

#14 Updated by Matthias Schiffer over 2 years ago

Linux disables coredumps by default for processes changing their user ID. You can enable them by setting the fs.suid_dumpable sysctl to 1.

#15 Updated by Ruben Kelevra over 2 years ago

Thanks, I've added this to our config. If a another crash occours I've add the dump here.

#16 Updated by Matthias Schiffer over 2 years ago

Please don't post the dump in a public place like this, it will contain your secret keys.

#17 Updated by Ruben Kelevra over 2 years ago

Allright, I've send you a mail then. :)

Also available in: Atom PDF