| Summary: | kernel was updated to 3.8.12 version | ||
|---|---|---|---|
| Product: | [ROSA-based products] ROSA Fresh | Reporter: | Alexander Burmashev <alex.burmashev> |
| Component: | Packages from Main | Assignee: | ROSA Linux Bugs <bugs> |
| Status: | RESOLVED FIXED | QA Contact: | ROSA Linux Bugs <bugs> |
| Severity: | normal | ||
| Priority: | Normal | CC: | alexander.kazantsev, dmitry.postnikov, eugene.shatokhin, v.potapov |
| Version: | Fresh | Flags: | v.potapov: qa_verified+; alex.burmashev: published+ |
| Target Milestone: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Platform: | --- | ROSA Vulnerability identifier: | |
| RPM Package: | kernel | ISO-related: | |
| Bad POT generating: | Upstream: | ||
| Bug Depends on: | 2039 | ||
| Bug Blocks: | |||
| Attachments: | Kernel panic - screenshot; Output of 'lspci -vvnn' (HP EliteBook); krn4103-8.png; krn4103-9.png; grub-krn4812.cfg; Postnikov-all(broadcom+dracut-27-4+krn3.8.12+kde4.10.3+ssystemd194-17).odt; lspci-actl1c.txt; atl1c1.png; krn-sravn.png; krn-sravn1.png; tsc.png; dmesg double wake up; syslog double wakeup; messages double wakeup; test_suspend_hibernate |
||
|
Description
Alexander Burmashev
2013-05-15 18:35:02 MSK
Advisory:
In order to address http://packetstormsecurity.com/files/121616/semtex.c the kernel was updated to version 3.8.12.

Additional info: http://habrahabr.ru/post/179735/

Buildlists:
i586
https://abf.rosalinux.ru/build_lists/1089218
https://abf.rosalinux.ru/build_lists/1089216
x86_64
https://abf.rosalinux.ru/build_lists/1089217
https://abf.rosalinux.ru/build_lists/1089219

Created attachment 1360 [details]
Kernel panic - screenshot
Installed the released ROSA Fresh i586 on an HP EliteBook, then added the containers with kernel 3.8.12 and updated the kernel packages from there. Rebooted and got a kernel panic (see the attached screenshot). I observed such crashes on my QEMU VMs too, so I guess it does not depend on any particular hardware. From the call stack, I suppose it has something to do with zswap. If it can be disabled at boot, I'll disable it and try again.

Created attachment 1362 [details]
Output of 'lspci -vvnn' (HP EliteBook)
Attached the output of 'lspci -vvnn' for that HP EliteBook where the mentioned panic occurred - just in case.
Added zswap.enabled=0 to the kernel parameters at boot. The system booted OK, so it might be a zswap issue indeed.

Discussed this with our kernel maintainer, and he will disable and remove zswap.

3.8.12 was rebuilt with BFQ switched off and zswap removed:
i586
https://abf.rosalinux.ru/build_lists/1097309
https://abf.rosalinux.ru/build_lists/1097311
x86_64
https://abf.rosalinux.ru/build_lists/1097312
https://abf.rosalinux.ru/build_lists/1097310

Fixed the latest packages:
i586
https://abf.rosalinux.ru/build_lists/1097836
https://abf.rosalinux.ru/build_lists/1097309
x86_64
https://abf.rosalinux.ru/build_lists/1097837
https://abf.rosalinux.ru/build_lists/1097310

It is necessary to test the kernel with the new Broadcom driver:
https://abf.rosalinux.ru/build_lists/1075200
https://abf.rosalinux.ru/build_lists/1075201
and with the new dracut:
http://bugs.rosalinux.ru/show_bug.cgi?id=2008

The package is routed to extended testing.

(In reply to comment #8)
> Fixed the latest packages:
> i586
> https://abf.rosalinux.ru/build_lists/1097836
> https://abf.rosalinux.ru/build_lists/1097309
>
> x86_64
> https://abf.rosalinux.ru/build_lists/1097837
> https://abf.rosalinux.ru/build_lists/1097310

(In reply to comment #9)
> It is necessary to test the kernel with the new Broadcom driver:
>
> https://abf.rosalinux.ru/build_lists/1075200
> https://abf.rosalinux.ru/build_lists/1075201

Took the job. Will test on KDE 4.10.3+systemd-194-17.

Created attachment 1384 [details]
krn4103-8.png
Hm... I updated to kernel 3.8.12 and now see an error before Plymouth and in the log. See the attachment.
https://bbs.archlinux.org/viewtopic.php?pid=1177701
Is this normal?

Created attachment 1385 [details]
krn4103-9.png
I then decided to go back to kernel 3.6.10 and got errors there too, but only from urpm-reposync and cpupower.
See the attachment.
Errors as text:
===============
[root@localhost pastordi]# cat /var/log/dmesg | grep -i -E "tsc|error|fail|segfau"
[ 0.000000] tsc: Fast TSC calibration failed
[ 0.003000] tsc: Unable to calibrate against PIT
[ 0.003000] tsc: using HPET reference calibration
[ 0.003000] tsc: Detected 3113.761 MHz processor
[ 0.214602] pci0000:00: ACPI _OSC support notification failed, disabling PCIe ASPM
[ 1.040659] i8042: Failed to disable AUX port, but continuing anyway... Is this a SiS?
[ 1.949789] tsc: Refined TSC clocksource calibration: 3113.772 MHz
[ 1.949795] Switching to clocksource tsc
[ 8.537728] sp5100_tco: failed to find MMIO address, giving up.
[root@localhost pastordi]#
[root@localhost pastordi]# uname -a
Linux localhost.localdomain 3.8.12-nrj-desktop-2rosa #1 SMP PREEMPT Thu May 16 21:53:13 MSK 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost pastordi]#
==============
==============
[2/6] perf-3.6.10-1-rosa2012.1.x86_64.rpm
[3/6] cpupower-3.6.10-1-rosa2012.1.x86_64.rpm
urpm-reposync: error while processing package {'cur_file': '/tmp/urpm-reposync.rpms/cpupower-3.6.10-1-rosa2012.1.x86_64.rpm', 'data': 'RPMCALLBACK_SCRIPT_ERROR; 1024, 1, /tmp/urpm-reposync.rpms/cpupower-3.6.10-1-rosa2012.1.x86_64.rpm, 1'}. Data: RPMCALLBACK_SCRIPT_ERROR; 1024, 1, /tmp/urpm-reposync.rpms/cpupower-3.6.10-1-rosa2012.1.x86_64.rpm, 1
error: %post(cpupower-3.6.10-1.x86_64) scriptlet failed, exit status 1
[4/6] kernel-nrj-desktop-devel-latest-3.6.10-1-rosa2012.1.x86_64.rpm
[5/6] kernel-headers-3.6.10-1-rosa2012.1.noarch.rpm
[6/6] kernel-nrj-desktop-latest-3.6.10-1-rosa2012.1.x86_64.rpm
Removing kernel-nrj-desktop-latest
Removing cpupower
Removing perf
Removing kernel-headers
Removing dracut
Removing kernel-nrj-desktop-devel-latest
[root@localhost pastordi]#
==============
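The filter used on /var/log/dmesg in the excerpt above can be reproduced without grep. The following is only a minimal Python sketch of that same case-insensitive match; the function name `filter_dmesg` and the last sample line are assumptions made for illustration and do not come from the report.

```python
import re

# Same alternation as the `grep -i -E "tsc|error|fail|segfau"` call above.
PATTERN = re.compile(r"tsc|error|fail|segfau", re.IGNORECASE)


def filter_dmesg(lines):
    """Return only the lines that contain one of the keywords (hypothetical helper)."""
    return [line for line in lines if PATTERN.search(line)]


sample = [
    "[ 0.000000] tsc: Fast TSC calibration failed",
    "[ 0.214602] pci0000:00: ACPI _OSC support notification failed, disabling PCIe ASPM",
    "[ 1.949795] Switching to clocksource tsc",
    "[ 2.000000] usb 1-1: new full-speed USB device",  # hypothetical line, should be dropped
]

matches = filter_dmesg(sample)
for line in matches:
    print(line)
```

Reading the real file instead of the sample would amount to passing `open("/var/log/dmesg")` to the helper, assuming the log is readable by the current user.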
Did you try adding clocksource=acpi_pm to the kernel boot string?

(In reply to comment #16)
> Did you try adding clocksource=acpi_pm to the kernel boot string?

No. I had not seen this error before. I will now go back to the old kernel and then install the new kernel again. If I see this error, I will try your suggestion.

The error is the same. But there is another, bigger problem:
1. With kernel 3.6.10, the keyboard works in Grub2.
2. Install kernel 3.8.12, and the keyboard does NOT work in Grub2. I do not tweak the kernel parameters.
3. Reinstall kernel 3.6.10, and the keyboard works in Grub2 again.
4. Install kernel 3.8.12, and the keyboard does NOT work in Grub2.
I installed the new kernel twice, and both times the keyboard did not work in Grub2. The error is clearly associated with the new kernel. It writes something wrong into Grub2.

Can you explain better how I can check whether this error is present? You can't move to another boot option in the grub menu, or what is the problem?
I doubt that the kernel can cause a grub2 error, because grub is loaded before the kernel and does not really depend on it at all.

(In reply to comment #19)
> Can you explain better how I can check whether this error is present? You can't
> move to another boot option in the grub menu, or what is the problem?
> I doubt that the kernel can cause a grub2 error, because grub is loaded
> before the kernel and does not really depend on it at all.

Yes, I can't move to another boot option in the grub menu. Grub2 does not respond to key presses. I did not do anything special. The standard procedure:
1. Connect repo1, connect repo2
2. urpmi --auto-update, and the new kernel is installed
3. reboot
4. Grub2 does not respond to my keyboard.

USB keyboard (Genius) + wifi mouse (A4tech)
[root@localhost pastordi]# lsusb
Bus 004 Device 004: ID 09da:054f A4 Tech Co., Ltd
Bus 004 Device 005: ID 04d9:1702 Holtek Semiconductor, Inc.
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 006 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 007 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
[root@localhost pastordi]#

I understand that Grub2 is loaded BEFORE the kernel. However, this happens only after installing the new kernel. After all, once the kernel .img is generated, the Grub2 configuration file is edited. Apparently something wrong gets written there. When I go back to the old kernel, it also writes to the Grub2 configuration file, but after a reboot EVERYTHING works.

Created attachment 1386 [details]
grub-krn4812.cfg
My Grub2 config file after installing and booting kernel 3.8.12
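To check whether the kernel update really changed the Grub2 configuration, a copy saved before the update could be compared with the attached one. The following is a rough sketch using Python's standard difflib; the file labels and the sample contents are hypothetical and are not taken from the attachment.

```python
import difflib


def diff_configs(before_text, after_text):
    """Return a unified diff of two config texts as a list of lines (hypothetical helper)."""
    return list(
        difflib.unified_diff(
            before_text.splitlines(),
            after_text.splitlines(),
            fromfile="grub.cfg.before",  # assumed label, not a real path from the report
            tofile="grub.cfg.after",     # assumed label, not a real path from the report
            lineterm="",
        )
    )


# Hypothetical contents, only to show the shape of the result.
before = "set timeout=5\nmenuentry 'ROSA 3.6.10' {\n}\n"
after = "set timeout=5\nmenuentry 'ROSA 3.8.12' {\n}\n"

changes = diff_configs(before, after)
print("\n".join(changes))
```

Lines starting with `-` exist only in the older file and lines starting with `+` only in the newer one, so anything beyond the expected new menu entries would be worth a closer look.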
If you can, please attach the grub config from before the update to the 3.8 kernel.

Created attachment 1389 [details]
Postnikov-all(broadcom+dracut-27-4+krn3.8.12+kde4.10.3+ssystemd194-17).odt
****************************************
Extended testing report - Broadcom
****************************************
*******************************************
Extended testing report - Kernel 3.8.12
*******************************************
http://cdn.2safe.com/611642033046/Postnikov-all(krn3.8.12+kde4.10.3+ssystemd194-17).odt

(In reply to comment #25)
Thanks for the report.

Could you also post the output of lspci from the machine where the system complains about the atl1c driver?

Created attachment 1390 [details]
lspci-actl1c.txt

(In reply to comment #26)
> (In reply to comment #25)
> Thanks for the report.
>
> Could you also post the output of lspci from the machine where the system
> complains about the atl1c driver?

See the attachment. Today nepomukservice segfaulted, but atl1c did not log an error. As I wrote in the report, all the testers noted that the kernel does not behave very stably.
=========
May 20 22:38:59 localhost NetworkManager[1591]: <info> (eth0): IP6 addrconf timed out or failed.
May 20 22:43:36 localhost kernel: [ 317.609454] atl1c 0000:02:00.0: vpd r/w failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update.
May 20 22:43:49 localhost kernel: [ 330.670145] atl1c 0000:02:00.0: vpd r/w failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update.
May 20 22:53:18 localhost systemd-tmpfiles[12454]: stat(/run/user/500/gvfs) failed: Permission denied
May 20 22:53:18 localhost systemd[1]: Failed to start Cleanup of Temporary Directories.
May 20 22:53:18 localhost systemd[1]: Unit systemd-tmpfiles-clean.service entered failed state.
May 21 00:15:25 localhost kernel: [ 5811.774990] GPT: Use GNU Parted to correct GPT errors.
May 21 02:11:55 localhost kernel: [12784.323425] psyncnotify[3725]: segfault at f63240 ip 0000000000f63240 sp 00007fff55c9cdf8 error 15
May 21 02:11:56 localhost kernel: [12785.401728] nepomukservices[3698]: segfault at 0 ip (null) sp 00007fff9ede2028 error 14 in nepomukservicestub[400000+7000]
May 21 11:20:40 localhost kernel: [ 0.001000] tsc: Fast TSC calibration using PIT
May 21 11:20:40 localhost kernel: [ 0.002000] tsc: Detected 3113.755 MHz processor
Ma
==============

Nepomuk is not kernel-related, and neither are most of the other errors )

(In reply to comment #27)
> Created attachment 1390 [details]
> lspci-actl1c.txt
>
> See the attachment.

OK, so the system does have an AR8151 network card by Atheros. Does wired networking work there?

(In reply to comment #29)
> Nepomuk is not kernel-related, and neither are most of the other errors )

I understand that Nepomuk deals with file semantics. But the messages in the logs come from the kernel. The kernel handles access to extended memory and file systems, and Nepomuk uses these. Perhaps something is buggy there.
PS. Comparing kernels 3.8.5 and 3.8.12, version 3.8.5 behaves much more stably.

(In reply to comment #30)
> (In reply to comment #27)
> > Created attachment 1390 [details]
> > lspci-actl1c.txt
> >
> > See the attachment.
>
> OK, so the system does have an AR8151 network card by Atheros. Does wired
> networking work there?

Sorry, it was too late... I was slow. :) Yes, this is an Ethernet card (no wifi) and it works well, although this error appeared.

Created attachment 1391 [details]
atl1c1.png
One more thing that I have only just noticed: atl1c reports the card as 100 Mbit, but this card is 1 Gigabit.
[root@localhost pastordi]# modinfo atl1c
filename: /lib/modules/3.8.12-nrj-desktop-2rosa/kernel/drivers/net/ethernet/atheros/atl1c/atl1c.ko.xz
version: 1.0.1.1-NAPI
license: GPL
description: Qualcom Atheros 100/1000M Ethernet Network Driver
author: Qualcomm Atheros Inc., <nic-devel@qualcomm.com>
author: Jie Yang
srcversion: FFDFE0E5402689DA77D23B4
alias: pci:v00001969d00001083sv*sd*bc*sc*i*
alias: pci:v00001969d00001073sv*sd*bc*sc*i*
alias: pci:v00001969d00002062sv*sd*bc*sc*i*
alias: pci:v00001969d00002060sv*sd*bc*sc*i*
alias: pci:v00001969d00001062sv*sd*bc*sc*i*
alias: pci:v00001969d00001063sv*sd*bc*sc*i*
depends:
intree: Y
vermagic: 3.8.12-nrj-desktop-2rosa SMP preempt mod_unload modversions
[root@localhost pastordi]#
[root@localhost pastordi]#

Small note - it is not handy to work with bugzilla when errors are reported in an external file; it is better to put everything directly into bugzilla.
Since 3.8.12 will almost surely be published if there are no critical errors, the best we can do is fix as many of them as possible.

1) USB flash drive error
Did you try running dosfsck -a on its device, and can you paste its output here?

Speaking of the speed - please paste the real timing output, for example from systemd-analyze, so that the results can be checked somehow.

(In reply to comment #35)
> Small note - it is not handy to work with bugzilla when errors are reported
> in an external file; it is better to put everything directly into bugzilla.
> Since 3.8.12 will almost surely be published if there are no critical
> errors, the best we can do is fix as many of them as possible.
>
>
> 1) USB flash drive error
> Did you try running dosfsck -a on its device, and can you paste its output
> here?
>
>
>
> Speaking of the speed - please paste the real timing output, for example from
> systemd-analyze, so that the results can be checked somehow.

1. There are problems with not writing the reports in external files:
a) I would need to get all the testers into Bugzilla.
b) It will be difficult to keep track of all the errors when there are a lot of them (I think).
c) My boss has not allowed it yet.
2. systemd-analyze is not present in the systemd package.
3. Output of the dosfsck -a command:
[root@noname user]# dosfsck -a /dev/sdc1
dosfsck 3.0.13, 30 Jun 2012, FAT32, LFN
There are differences between boot sector and its backup.
Differences: (offset:original/backup)
65:01/00
Not automatically fixing this.
FATs differ but appear to be intact. Using first FAT.
Cluster 16387 out of range (2097152 > 981119). Setting to EOF.
Cluster 16389 out of range (8388608 > 981119). Setting to EOF.
Cluster 16391 out of range (16777216 > 981119). Setting to EOF.
Cluster 16393 out of range (134217728 > 981119). Setting to EOF.
Cluster 16416 out of range (1048576 > 981119). Setting to EOF.
Cluster 16418 out of range (4194304 > 981119). Setting to EOF.
Cluster 16421 out of range (33554432 > 981119). Setting to EOF.
Cluster 16423 out of range (134217728 > 981119). Setting to EOF.
Cluster 16642 out of range (1572864 > 981119). Setting to EOF.
Cluster 16644 out of range (4194304 > 981119). Setting to EOF.
Cluster 16646 out of range (8388608 > 981119). Setting to EOF.
Cluster 16649 out of range (134217728 > 981119). Setting to EOF.
Cluster 16672 out of range (1048576 > 981119). Setting to EOF.
Cluster 16674 out of range (4194304 > 981119). Setting to EOF.
Cluster 16677 out of range (33554432 > 981119). Setting to EOF.
Cluster 16679 out of range (134217728 > 981119). Setting to EOF.
Cluster 16854 out of range (67108864 > 981119). Setting to EOF.
Cluster 17155 out of range (3145728 > 981119). Setting to EOF.
Cluster 17159 out of range (33554432 > 981119). Setting to EOF.
Cluster 17161 out of range (201326592 > 981119). Setting to EOF.
Cluster 17184 out of range (1048576 > 981119). Setting to EOF.
Cluster 17186 out of range (4194304 > 981119). Setting to EOF.
Cluster 17189 out of range (33554432 > 981119). Setting to EOF.
Cluster 17196 out of range (1). Setting to EOF.
Cluster 17257 out of range (1048576 > 981119). Setting to EOF.
Cluster 17259 out of range (4194304 > 981119). Setting to EOF.
Cluster 17666 out of range (1048576 > 981119). Setting to EOF.
Cluster 17668 out of range (6291456 > 981119). Setting to EOF.
Cluster 17672 out of range (67108864 > 981119). Setting to EOF.
Cluster 17695 out of range (134742016 > 981119). Setting to EOF.
Cluster 17697 out of range (2097152 > 981119). Setting to EOF.
Cluster 17699 out of range (8388608 > 981119). Setting to EOF.
Cluster 17701 out of range (33554432 > 981119). Setting to EOF.
Cluster 17703 out of range (134217728 > 981119). Setting to EOF.
Cluster 17708 out of range (1). Setting to EOF.
Cluster 17923 out of range (3145728 > 981119). Setting to EOF.
Cluster 17925 out of range (8388608 > 981119). Setting to EOF.
Cluster 17927 out of range (33554432 > 981119). Setting to EOF.
Cluster 17930 out of range (134217728 > 981119). Setting to EOF.
Cluster 17952 out of range (1048576 > 981119). Setting to EOF.
Cluster 17954 out of range (4194304 > 981119). Setting to EOF.
Cluster 17957 out of range (33554432 > 981119). Setting to EOF.
Cluster 17964 out of range (1). Setting to EOF.
Cluster 410627 out of range (3145728 > 981119). Setting to EOF.
Cluster 410629 out of range (4194304 > 981119). Setting to EOF.
Cluster 410632 out of range (67108864 > 981119). Setting to EOF.
Cluster 410651 out of range (1). Setting to EOF.
Cluster 410656 out of range (1048576 > 981119). Setting to EOF.
Cluster 410658 out of range (4194304 > 981119). Setting to EOF.
Cluster 410661 out of range (33554432 > 981119). Setting to EOF.
Cluster 410663 out of range (134217728 > 981119). Setting to EOF.
Cluster 410784 out of range (67108864 > 981119). Setting to EOF.
Cluster 410787 out of range (1048576 > 981119). Setting to EOF.
Cluster 540676 out of range (6291456 > 981119). Setting to EOF.
Cluster 540680 out of range (100663296 > 981119). Setting to EOF.
Cluster 540682 out of range (134217728 > 981119). Setting to EOF.
Cluster 540705 out of range (2097152 > 981119). Setting to EOF.
Cluster 540707 out of range (8388608 > 981119). Setting to EOF.
Cluster 540710 out of range (67108864 > 981119). Setting to EOF.
Cluster 540716 out of range (1). Setting to EOF.
Cluster 541698 out of range (1572864 > 981119). Setting to EOF.
Cluster 541701 out of range (8388608 > 981119). Setting to EOF.
Cluster 541703 out of range (33554432 > 981119). Setting to EOF.
Cluster 541705 out of range (134217728 > 981119). Setting to EOF.
Cluster 541729 out of range (2097152 > 981119). Setting to EOF.
Cluster 541733 out of range (33554432 > 981119). Setting to EOF.
Cluster 541735 out of range (134217728 > 981119). Setting to EOF.
Cluster 541892 out of range (4194304 > 981119). Setting to EOF.
Cluster 541909 out of range (33554432 > 981119). Setting to EOF.
/ROSA.MARATHON.X1.EE.x86_64.iso
Contains a free cluster (410630). Assuming EOF.
/ROSA.MARATHON.X1.EE.x86_64.iso
File size is 1561329664 bytes, cluster chain length is 283721728 bytes.
Truncating file to 283721728 bytes.
Broke cycle at cluster 410677 in free chain.
Segmentation fault (core dumped)

Is this the flash drive that caused the remount to read-only in your logs? If yes, try running dosfsck -a /dev/sdc

Created attachment 1393 [details]
krn-sravn.png
Comparison kernel-1
Created attachment 1394 [details]
krn-sravn1.png
Comparison kernel-2
According to that log, even the 3.6.10 boot times vary from 27 to 31 seconds. Am I right?
So, looking at both pictures, the average boot time did not change.

(In reply to comment #40)
> According to that log, even the 3.6.10 boot times vary from 27 to 31 seconds.
> Am I right?
> So, looking at both pictures, the average boot time did not change.

Yes, 3.6.10 = 27-31 sec. Yes, the average boot time did not change, but the boot looks slower visually, especially when connecting external devices.

OK, I understand that it may be a visual problem caused by the boot interruption.
What errors do you get at boot? Do you get the Fast TSC calibration failure?

Created attachment 1395 [details]
tsc.png

(In reply to comment #42)
> OK, I understand that it may be a visual problem caused by the boot
> interruption.
> What errors do you get at boot? Do you get the Fast TSC calibration failure?

Yes, I get the message "tsc: Fast TSC calibration failed". BUT! The old kernel 3.6.10 produced this message too, only it was shown ONLY in the log. Now, with the new kernel 3.8.12, it INTERRUPTS the Plymouth output.
I am also concerned about the message "tsc: using HPET reference calibration".

On the x64 test system, after wakeup the system goes to sleep again. The second wakeup works correctly. It's an annoying regression :-(

Created attachment 1397 [details]
dmesg double wake up
Created attachment 1398 [details]
syslog double wakeup
Created attachment 1399 [details]
messages double wakeup
1. Remove pm-utils
2. Install the new upower without pm-utils support
https://abf.rosalinux.ru/build_lists/1098836
https://abf.rosalinux.ru/build_lists/1098837
3. Update KDE 4.10.3 for the pm-utils drop
4. Try again

Also try the systemd power modes:
systemctl <mode>
where <mode> is one of:
suspend      Suspend the system
hibernate    Hibernate the system

Correct links to the upower packages:
https://abf.rosalinux.ru/build_lists/1098850
https://abf.rosalinux.ru/build_lists/1098851

Created attachment 1405 [details]
test_suspend_hibernate
The report text says there is no dependence on the kernel, so is there one or not? If the problem is common to all kernels, it should be handled in a separate bug )

(In reply to comment #51)
> The report text says there is no dependence on the kernel, so is there one
> or not? If the problem is common to all kernels, it should be handled in a
> separate bug )

Yes, I agree. But there are already so many necessary containers here that carrying them over would be exhausting.

ok
kernel-desktop-latest-3.8.12-2.1
cpupower-3.8.12-2-rosa2012.1

*************************
Advisory
*************************
kernel was updated to 3.8.12 version
************************************************************
QA Verified |