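On my CentOS 6.7 system running a 2.6.32 kernel, I have noticed that installing an rpm is very slow. This is not normally the case. What could be wrong? The system is idle, and no CPU-intensive applications are running. The output of top and vmstat looks fine.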
root@blr# rpm -ivh SystemSupport-10.5.1.i386_centos_el6.rpm
Preparing...                ########################################### [100%]
   1:SystemSupport          #######                                     (17%)
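It has now been stuck at 17% for more than an hour.
Any pointers for investigating this problem are sincerely appreciated.
lsof -p $(pgrep -o rpm) lists many entries. The ones related to the rpm I am trying to install are shown here, followed by the iostat output: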
rpm 2533 root 6r  REG 8,2 18375865 17481 /root/software/1.12.6.002_GA/SystemSupport-10.5.1.i386_centos_el6.rpm
rpm 2533 root 7u  REG 8,6    12288   853 /var/lib/rpm/Triggername
rpm 2533 root 8uW REG 8,6        0   610 /var/lib/rpm/.rpm.lock
rpm 2533 root 9r  REG 8,2 18375865 17481 /root/software/1.12.6.002_GA/SystemSupport-10.5.1.i386_centos_el6.rpm

[root@blr]# iostat -x -d 1
Linux 2.6.32-642.1.1.el6.i686 (blr) 01/06/2017 _i686_ (24 CPU)

Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz    await  r_await  w_await   svctm   %util
sda        0.30    6.70   1.22   1.66   29.17   66.92    33.32    15.16  5257.02   117.73  9029.41  318.10   91.73

Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz    await  r_await  w_await   svctm   %util
sda        0.00    0.00   0.00   2.00    0.00   40.00    20.00     5.12  1622.00     0.00  1622.00  500.00  100.00

Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz    await  r_await  w_await   svctm   %util
sda        0.00    1.00   0.00   2.00    0.00   16.00     8.00     5.25  3247.00     0.00  3247.00  500.00  100.00
dmesg output:
INFO: task rs:main Q:Reg:1961 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.1.1.el6.i686 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rs:main Q:Reg D c1ed5364     0  1961      1 0x00000080
 f44e7aa0 00000082 00000c88 c1ed5364 c1ed5364 00000000 00000000 c0469e95
 c15f8c80 00000557 02adfee8 00000557 c0b9bb20 c0b9bb20 f44e7d48 c0b9bb20
 c0b97364 c0b9bb20 f44e7d48 c1ed5364 00001084 00000000 c1a77824 f5244d40
Call Trace:
 [<c0469e95>] ? local_bh_enable+0x75/0x90
 [<c057171c>] ? __find_get_block+0x8c/0x1c0
 [<c04913da>] ? ktime_get_ts+0xea/0x120
 [<c08789a9>] ? io_schedule+0x59/0xa0
 [<c04f4c7c>] ? sync_page+0x2c/0x40
 [<c087909f>] ? __wait_on_bit_lock+0x3f/0x90
 [<c04f4c50>] ? sync_page+0x0/0x40
 [<c04f4c30>] ? __lock_page+0x80/0x90
 [<c0485c80>] ? wake_bit_function+0x0/0x60
 [<c04f5cac>] ? find_lock_page+0x3c/0x70
 [<c04f5d1d>] ? grab_cache_page_write_begin+0x3d/0xc0
 [<f7f2df84>] ? ext4_da_write_begin+0xc4/0x260 [ext4]
 [<f7f27309>] ? ext4_mark_iloc_dirty+0x349/0x570 [ext4]
 [<c04f5563>] ? generic_file_buffered_write+0x103/0x2c0
 [<c04f6344>] ? __generic_file_aio_write+0x1e4/0x540
 [<c0456656>] ? try_to_wake_up+0x206/0x3c0
 [<c04f6717>] ? generic_file_aio_write+0x77/0xf0
 [<c0543945>] ? do_sync_write+0xd5/0x120
 [<c0485be0>] ? autoremove_wake_function+0x0/0x40
 [<c04c0736>] ? audit_filter_rules+0x16/0xde0
 [<c0580139>] ? inotify_dentry_parent_queue_event+0x89/0xc0
 [<c05c49fc>] ? security_file_permission+0xc/0x10
 [<c0543b16>] ? rw_verify_area+0x66/0xe0
 [<c0543870>] ? do_sync_write+0x0/0x120
 [<c0543c30>] ? vfs_write+0xa0/0x190
 [<c08793e1>] ? mutex_lock+0x11/0x40
 [<c05447eb>] ? sys_write+0x4b/0xa0
 [<c0409bbf>] ? sysenter_do_call+0x12/0x28
[root@AG1K-1 stgadm]#
smartctl output:
smartctl 5.43 2012-06-30 r3573 [i686-linux-2.6.32-642.1.1.el6.i686] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     16GB CompactFlash Card
Serial Number:    20150511 00000017
Firmware Version: CFMAD01A
User Capacity:    16,391,340,032 bytes [16.3 GB]
Sector Size:      512 bytes logical/physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Fri Jan 6 20:31:51 2017 MYT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Total time to complete Offline
data collection:                 (   0) seconds.
Offline data collection
capabilities:                    (0x00) Offline data collection not supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x00) Error logging NOT supported.
                                        No General Purpose Logging support.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME     FLAG   VALUE WORST THRESH TYPE    UPDATED WHEN_FAILED RAW_VALUE
 12 Power_Cycle_Count  0x0200 100   100   000    Old_age Offline -           0
160 Unknown_Attribute  0x0200 100   100   000    Old_age Offline -           3
161 Unknown_Attribute  0x0200 100   100   000    Old_age Offline -           37
162 Unknown_Attribute  0x0200 100   100   000    Old_age Offline -           43
163 Unknown_Attribute  0x0200 100   100   000    Old_age Offline -           9355
164 Unknown_Attribute  0x0200 100   100   000    Old_age Offline -           9050
165 Unknown_Attribute  0x0200 100   100   000    Old_age Offline -           9050
241 Total_LBAs_Written 0x0200 100   100   000    Old_age Offline -           0

SMART Error Log not supported
Error SMART Error Self-Test Log Read failed: scsi error aborted command
Smartctl: SMART Self Test Log Read Failed
Device does not support Selective Self Tests/Logging
strace output. The following sequence repeats frequently:
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
getuid32() = 0
getuid32() = 0
chown32("/usr/local/lr/support/SystemSupport/10.5.1/Python/lib/python2.4/distutils/bcppcompiler.pyo;586f8d75", 0, 0) = 0
chmod("/usr/local/lr/support/SystemSupport/10.5.1/Python/lib/python2.4/distutils/bcppcompiler.pyo;586f8d75", 0644) = 0
utime("/usr/local/lr/support/SystemSupport/10.5.1/Python/lib/python2.4/distutils/bcppcompiler.pyo;586f8d75", [2016/10/24-19:22:15, 2016/10/24-19:22:15]) = 0
getuid32() = 0
lstat64("/usr/local/lr/support/SystemSupport/10.5.1/Python/lib/python2.4/distutils/bcppcompiler.pyo", {st_mode=S_IFREG|0644, st_size=8379, ...}) = 0
rename("/usr/local/lr/support/SystemSupport/10.5.1/Python/lib/python2.4/distutils/bcppcompiler.pyo;586f8d75", "/usr/local/lr/support/SystemSupport/10.5.1/Python/lib/python2.4/distutils/bcppcompiler.pyo") = 0
umask(0777) = 022
open("/usr/local/lr/support/SystemSupport/10.5.1/Python/lib/python2.4/distutils/ccompiler.py;586f8d75", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666
Run strace or lsof against the running process to see where it is stuck. This is not normal behavior.
strace -f -p $(pgrep -o rpm)
lsof -p $(pgrep -o rpm)
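Because the kernel log in the question reports a task blocked for more than 120 seconds, it may also help to list tasks in uninterruptible sleep (state D), which usually means they are waiting on I/O. The following is only a minimal sketch: the function name `filter_d_state` is made up for illustration, and it reads `ps` output on stdin so that it can also be tried on saved output.

```shell
# filter_d_state is a hypothetical helper name, not a standard tool.
# It keeps the header row plus any task whose STAT column (field 2)
# starts with "D", i.e. uninterruptible sleep.
# Expected input: the output of `ps -eo pid,stat,wchan:32,comm`.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}
```

A possible invocation is `ps -eo pid,stat,wchan:32,comm | filter_d_state`. If rpm and other writers such as the syslog thread keep showing up here, that points at the storage layer rather than at rpm itself.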
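As the smartctl output shows, the disk is a very small 16 GB CompactFlash card. Running hdparm -t /dev/sda produced no result, so the problem appears to be related to the CF card. This seems to be confirmed by iostat reporting 100% disk utilization (%util) even though very few read/write operations are taking place.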
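The pattern described above (a device that is nearly 100% busy while completing only a handful of requests per second) can be picked out of `iostat -x -d` output mechanically. This is a rough sketch, not a standard tool: the function name `flag_stalled_devices` and the default thresholds (90% utilization, 10 requests per second) are assumptions chosen for illustration. Column positions are taken from the header row, since they differ between sysstat versions.

```shell
# flag_stalled_devices is a hypothetical helper name.
# Reads `iostat -x -d` text on stdin and prints devices whose %util is at or
# above $1 (default 90) while r/s + w/s is at or below $2 (default 10).
# Both defaults are arbitrary illustration values, not recommended limits.
flag_stalled_devices() {
    awk -v util_min="${1:-90}" -v iops_max="${2:-10}" '
        # Header row: remember which field holds which column.
        $1 == "Device:" || $1 == "Device" {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        # Data row: only lines long enough to contain the %util column.
        ("%util" in col) && NF >= col["%util"] {
            iops = $(col["r/s"]) + $(col["w/s"])
            util = $(col["%util"])
            if (util + 0 >= util_min && iops <= iops_max)
                printf "%s util=%s%% iops=%.2f await=%sms\n", $1, util, iops, $(col["await"])
        }
    '
}
```

For example, `iostat -x -d 1 5 | flag_stalled_devices` would report sda for each of the samples shown in the question, where await is in the range of seconds while fewer than three requests per second complete.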
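Flash storage is a strange beast: it is very fast until all pages/blocks have been written. After that, any additional write that requires a read/modify/write or read/erase/write cycle causes performance stalls. To avoid this problem, SSDs usually have a background garbage collector and/or expose the TRIM command to explicitly clear unused blocks.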
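Whether a given device advertises TRIM can be checked from its identify data. The sketch below assumes the usual `hdparm -I` wording, a line containing "Data Set Management TRIM supported"; the function name `supports_trim` is made up for illustration. It says nothing about background garbage collection, which the device does not report.

```shell
# supports_trim is a hypothetical helper name.
# Reads `hdparm -I <device>` output on stdin and returns success (0) if the
# device advertises TRIM, non-zero otherwise.
supports_trim() {
    grep -qi 'TRIM supported'
}
```

A possible invocation is `hdparm -I /dev/sda | supports_trim && echo "TRIM advertised" || echo "no TRIM advertised"`.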
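I strongly suspect that your CF card has neither. Or perhaps it is simply failing. In either case, the safest approach is to replace it (perhaps with a larger one).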