阿里云服务器Ubuntu 18 高负载,低CPU,低IO问题处理

in 后端 with 0 comment

最近一台1C1G的阿里云服务器,负载异常高能达到4-5,任何时刻都超过1;
使用top命令查看cpu占用很低。
参照https://www.v2ex.com/t/519111 分析,也没有发现D状态进程( D 状态进程是由于在等待 io)

 ps -eTo stat,pid,tid,ppid,comm --no-header | sed -e 's/^ *//' | perl -nE 'chomp;say if (m!^\S*[RD]+\S*!)'

后来按照楼中的60秒分析大法

uptime
dmesg | tail
vmstat 1
mpstat -P ALL 1
pidstat 1
iostat -xz 1
free -m
sar -n DEV 1
sar -n TCP,ETCP 1
top

dmesg | tail 发现一条异常 7 urandom warning(s) missed due to ratelimiting

搜索查询了以下,都说是个bug(参考https://www.linode.com/community/questions/17915/why-did-my-server-miss-urandom-warnings-due-to-rate-limiting)

These messages are the result of a bug that exhausts the entropy pool on your system. While this pool is typically refilled over time through various actions (such as disk activity, keyboard timings, mouse movements, etc)

解决方案上面的链接也有,安装haveged

apt install haveged
systemctl enable haveged

重启,负载终于降下来了;

Comments are closed.