2022-01-07 15:38:19 UTC
I have a server that runs Slackware64 14.2 and has run Slackware
since before 14.2. A few weeks ago the system started crashing.
In most of the crashes the kernel was still running: the server
would respond to pings and the display stayed on,
but it would not accept keyboard or mouse input.
The system would run for a few days and crash again.
I swapped out the power supply for a brand new 750 W unit.
The crashes continued.
I swapped out the motherboard/cpu/memory with one
from a working machine. The crashes continued.
I updated the 10-year-old BIOS on the motherboard.
I tried different kernels.
I updated everything with slackpkg.
I updated Chrome to the latest version. Chrome runs all the time.
Only the computer case and NVIDIA graphics card remain
from the original system, and still the crashes persist.
When I got up this morning, the system had crashed
during the night. After rebooting I looked at the syslog
and I found a stream of:
rcu_sched self-detected stall on CPU
errors, which continued until I rebooted the system.
This seems to be related to some kind of kernel overload, as if
there were too many tasks for the system to keep up with.
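
To see how long the stall messages ran before the reboot, something like
the following could summarize them. This is only a rough sketch: the
function name summarize_stalls is made up for illustration, and it assumes
the traditional syslog line format where each line starts with
"Mmm dd hh:mm:ss".

```python
from collections import Counter

# The message text as it appears in the syslog excerpt above.
STALL_MARKER = "rcu_sched self-detected stall on CPU"


def summarize_stalls(lines):
    """Count stall messages and remember the first and last matching line.

    Returns (total, per_day, first_line, last_line).
    """
    per_day = Counter()
    first = last = None
    for line in lines:
        if STALL_MARKER not in line:
            continue
        line = line.rstrip("\n")
        # Assumption: the line starts with "Mmm dd hh:mm:ss", so the
        # first 6 characters are the month and day.
        per_day[line[:6]] += 1
        if first is None:
            first = line
        last = line
    return sum(per_day.values()), per_day, first, last
```

It takes any iterable of lines, so an open log file can be passed in
directly, for example open("/var/log/syslog", errors="replace") if that is
where the messages ended up. The first and last matching lines show when
the stream started and when it stopped.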
The CPU is an Intel Core i7 at 3.4 GHz, with 16 GB of memory.
Among other call traces in the syslog I see one
that must have originated within Chrome, and another
from kswapd, even though I have no swap partition.
I am pretty much out of ideas and would appreciate