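I recently inherited a GlusterFS setup that I know next to nothing about. One of the HDDs providing a brick to a volume failed. I was able to replace that HDD, and the host OS can see it. I have successfully formatted it and mounted it at the location where the replaced HDD used to be mounted.
This is where I need help.
I believe I need to run some kind of heal command, but I'm confused about how to do this with GlusterFS. Here is some background info.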
$ mount | grep glus
/dev/sdc1 on /data/glusterfs/sdc1 type xfs (rw,relatime,attr2,inode64,noquota)
/dev/sdg1 on /data/glusterfs/sdg1 type xfs (rw,relatime,attr2,inode64,noquota)
/dev/sdf1 on /data/glusterfs/sdf1 type xfs (rw,relatime,attr2,inode64,noquota)
/dev/sdb1 on /data/glusterfs/sdb1 type xfs (rw,relatime,attr2,inode64,noquota)
/dev/sdd1 on /data/glusterfs/sdd1 type xfs (rw,relatime,attr2,inode64,noquota)
127.0.0.1:/nova on /var/lib/nova/instances type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
127.0.0.1:/cinder on /var/lib/nova/mnt/92ef2ec54fd18595ed18d8e6027a1b3d type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
/dev/sde1 on /data/glusterfs/sde1 type xfs (rw,relatime,attr2,inode64,noquota)
The HDD I replaced is /dev/sde1. I have it mounted (as shown above), and when I run gluster volume info I see its brick listed there:
$ gluster volume info nova
Volume Name: nova
Type: Distributed-Replicate
Volume ID: f0d72d64-288c-4e72-9c53-2d16ce5687ac
Status: Started
Number of Bricks: 10 x 2 = 20
Transport-type: tcp
Bricks:
Brick1: icicle07:/data/glusterfs/sdb1/brick
Brick2: icicle08:/data/glusterfs/sdb1/brick
Brick3: icicle09:/data/glusterfs/sdb1/brick
Brick4: icicle10:/data/glusterfs/sdb1/brick
Brick5: icicle11:/data/glusterfs/sdb1/brick
Brick6: icicle07:/data/glusterfs/sdc1/brick
Brick7: icicle08:/data/glusterfs/sdc1/brick
Brick8: icicle09:/data/glusterfs/sdc1/brick
Brick9: icicle10:/data/glusterfs/sdc1/brick
Brick10: icicle11:/data/glusterfs/sdc1/brick
Brick11: icicle07:/data/glusterfs/sdd1/brick
Brick12: icicle08:/data/glusterfs/sdd1/brick
Brick13: icicle09:/data/glusterfs/sdd1/brick
Brick14: icicle10:/data/glusterfs/sdd1/brick
Brick15: icicle11:/data/glusterfs/sdd1/brick
Brick16: icicle07:/data/glusterfs/sde1/brick
Brick17: icicle08:/data/glusterfs/sde1/brick
Brick18: icicle09:/data/glusterfs/sde1/brick
Brick19: icicle10:/data/glusterfs/sde1/brick
Brick20: icicle11:/data/glusterfs/sde1/brick
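Since the volume is 10 x 2 distributed-replicate, it can help to know which brick is the replica partner of the replaced one, because that partner is where the heal copies data from. The sketch below groups the brick list into replica sets. It assumes the bricks are listed in creation order, so that each consecutive run of `replica` bricks forms one set; `replica_sets` is a hypothetical helper name, not part of the gluster CLI.

```shell
#!/usr/bin/env bash
# Group "BrickN: host:/path" lines (read from stdin) into replica sets.
# Assumption: consecutive bricks form a set, e.g. with replica 2
# Brick1+Brick2 are set 1, Brick3+Brick4 are set 2, and so on.
replica_sets() {
    local replica="$1"
    awk -v r="$replica" '
        # Match "Brick<number>:" lines only; the "Bricks:" header is skipped.
        /^[[:space:]]*Brick[0-9]+:/ {
            n++
            set = int((n - 1) / r) + 1
            printf "set %d: %s\n", set, $2
        }'
}

# Usage on a cluster node (not executed here):
#   gluster volume info nova | replica_sets 2
```

Under that assumption, Brick19 (icicle10) and Brick20 (icicle11) land in the same set, so the sde1 brick on one of those hosts would be healed from the other.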
Attempting to run a heal command results in:
$ gluster volume heal nova full
Locking failed on c551316f-7218-44cf-bb36-befe3d3df34b. Please check log file for details.
Locking failed on 79a6a414-3569-482c-929f-b7c5da16d05e. Please check log file for details.
Locking failed on ae62c691-ae55-4c99-8364-697cb3562668. Please check log file for details.
Locking failed on 5f43c6a4-0ccd-424a-ae56-0492ec64feeb. Please check log file for details.
Locking failed on cb78ba3c-256f-4413-ae7e-aa5c0e9872b5. Please check log file for details.
Locking failed on 6c0111fc-b5e7-4350-8be5-3179a1a5187e. Please check log file for details.
Locking failed on 88fcb687-47aa-4921-b3ab-d6c3b330b32a. Please check log file for details.
Locking failed on d73de03a-0f66-4619-89ef-b73c9bbd800e. Please check log file for details.
Locking failed on c7416c1f-494b-4a95-b48d-6c766c7bce14. Please check log file for details.
Locking failed on 4a780f57-37e4-4f1b-9c34-187a0c7e44bf. Please check log file for details.
Attempting to run the command again results in:
$ gluster volume heal nova full
Another transaction is in progress. Please try again after sometime.
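Restarting glusterd clears the lock, but I don't know what the heal command above is actually telling me. I don't find the logs helpful, because there are several of them and it's not entirely clear to me which ones go with what: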
$ ls -ltr /var/log/glusterfs
...
-rw------- 1 root root      41711 Aug 1 00:51 glfsheal-nova.log-20150801
-rw------- 1 root root          0 Aug 1 03:39 glfsheal-nova.log
-rw------- 1 root root       4297 Aug 1 14:29 cmd_history.log-20150531
-rw------- 1 root root     830449 Aug 1 17:03 var-lib-nova-instances.log
-rw------- 1 root root     307535 Aug 1 17:03 glustershd.log
-rw------- 1 root root     255801 Aug 1 17:03 nfs.log
-rw------- 1 root root       4544 Aug 1 17:12 cmd_history.log
-rw------- 1 root root      28063 Aug 1 17:12 cli.log
-rw------- 1 root root   17370562 Aug 1 17:14 etc-glusterfs-glusterd.vol.log
-rw------- 1 root root 1759170187 Aug 1 17:14 var-lib-nova-mnt-92ef2ec54fd18595ed18d8e6027a1b3d.log
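Any guidance would be greatly appreciated.
It looks like the system runs into a problem when it tries to start the corresponding glusterfsd for the brick/HDD I added. Here is the output from the log file /var/log/glusterfs/bricks/data-glusterfs-sde1-brick.log: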
[2015-08-01 21:40:25.143963] I [MSGID: 100030] [glusterfsd.c:2294:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.7.0 (args: /usr/sbin/glusterfsd -s icicle11 --volfile-id nova.icicle11.data-glusterfs-sde1-brick -p /var/lib/glusterd/vols/nova/run/icicle11-data-glusterfs-sde1-brick.pid -S /var/run/gluster/d0a51f364706915faa35c6cca46e9ce6.socket --brick-name /data/glusterfs/sde1/brick -l /var/log/glusterfs/bricks/data-glusterfs-sde1-brick.log --xlator-option *-posix.glusterd-uuid=5e09f3ec-bfbc-490b-bd93-8e083e8ebd05 --brick-port 49155 --xlator-option nova-server.listen-port=49155)
[2015-08-01 21:40:25.190863] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-08-01 21:40:48.359478] I [graph.c:269:gf_add_cmdline_options] 0-nova-server: adding option 'listen-port' for volume 'nova-server' with value '49155'
[2015-08-01 21:40:48.359513] I [graph.c:269:gf_add_cmdline_options] 0-nova-posix: adding option 'glusterd-uuid' for volume 'nova-posix' with value '5e09f3ec-bfbc-490b-bd93-8e083e8ebd05'
[2015-08-01 21:40:48.359696] I [server.c:392:_check_for_auth_option] 0-/data/glusterfs/sde1/brick: skip format check for non-addr auth option auth.login./data/glusterfs/sde1/brick.allow
[2015-08-01 21:40:48.359709] I [server.c:392:_check_for_auth_option] 0-/data/glusterfs/sde1/brick: skip format check for non-addr auth option auth.login.a9c47852-7dcf-4f89-80e5-110101943f36.password
[2015-08-01 21:40:48.359719] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2015-08-01 21:40:48.360606] I [rpcsvc.c:2213:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2015-08-01 21:40:48.360679] W [options.c:936:xl_opt_validate] 0-nova-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
[2015-08-01 21:40:48.361713] E [ctr-helper.c:250:extract_ctr_options] 0-gfdbdatastore: CTR Xlator is disabled.
[2015-08-01 21:40:48.361745] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-pagesize from params.Assigning default value: 4096
[2015-08-01 21:40:48.361762] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-cachesize from params.Assigning default value: 1000
[2015-08-01 21:40:48.361774] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-journalmode from params.Assigning default value: wal
[2015-08-01 21:40:48.361795] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-wal-autocheckpoint from params.Assigning default value: 1000
[2015-08-01 21:40:48.361812] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-sync from params.Assigning default value: normal
[2015-08-01 21:40:48.361825] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-autovacuum from params.Assigning default value: none
[2015-08-01 21:40:48.362666] I [trash.c:2363:init] 0-nova-trash: no option specified for 'eliminate', using NULL
[2015-08-01 21:40:48.362906] E [posix.c:5894:init] 0-nova-posix: Extended attribute trusted.glusterfs.volume-id is absent
[2015-08-01 21:40:48.362922] E [xlator.c:426:xlator_init] 0-nova-posix: Initialization of volume 'nova-posix' failed, review your volfile again
[2015-08-01 21:40:48.362930] E [graph.c:322:glusterfs_graph_init] 0-nova-posix: initializing translator failed
[2015-08-01 21:40:48.362956] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2015-08-01 21:40:48.363612] W [glusterfsd.c:1219:cleanup_and_exit] (--> 0-: received signum (0), shutting down
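Most of that output is informational. As a rough aid, the following sketch prints only the error-level entries of a GlusterFS log. It assumes the `[timestamp] LEVEL [source] ...` layout seen in the brick log, with one entry per line in the actual file; `gluster_errors` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print only error-level ("E") entries from a GlusterFS log read on stdin.
# Assumption: each entry starts with "[YYYY-MM-DD HH:MM:SS.ffffff] " followed
# by a one-letter level (I, W, E, ...).
gluster_errors() {
    grep -E '^\[[0-9-]+ [0-9:.]+\] E '
}

# Usage on the affected node (not executed here):
#   gluster_errors < /var/log/glusterfs/bricks/data-glusterfs-sde1-brick.log
```

Applied to the brick log, this would leave the CTR message plus the chain that starts with the missing trusted.glusterfs.volume-id attribute, which is the entry that matters here.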
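OK, so one problem seems to be that the extended attribute is not present on the file system of the mounted brick. This command is supposed to fix that: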
$ grep volume-id /var/lib/glusterd/vols/nova/info | cut -d= -f2 | sed 's/-//g'
f0d72d64288c4e729c532d16ce5687ac
$ setfattr -n trusted.glusterfs.volume-id -v 0xf0d72d64288c4e729c532d16ce5687ac /data/glusterfs/sde1
However, I still get the above error about the attribute being absent:
[2015-08-01 18:44:50.481350] E [posix.c:5894:init] 0-nova-posix: Extended attribute trusted.glusterfs.volume-id is absent
Full output from the glusterd restart:
[2015-08-01 22:03:41.467668] I [MSGID: 100030] [glusterfsd.c:2294:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.7.0 (args: /usr/sbin/glusterfsd -s icicle11 --volfile-id nova.icicle11.data-glusterfs-sde1-brick -p /var/lib/glusterd/vols/nova/run/icicle11-data-glusterfs-sde1-brick.pid -S /var/run/gluster/d0a51f364706915faa35c6cca46e9ce6.socket --brick-name /data/glusterfs/sde1/brick -l /var/log/glusterfs/bricks/data-glusterfs-sde1-brick.log --xlator-option *-posix.glusterd-uuid=5e09f3ec-bfbc-490b-bd93-8e083e8ebd05 --brick-port 49155 --xlator-option nova-server.listen-port=49155)
[2015-08-01 22:03:41.514878] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-08-01 22:04:00.334285] I [graph.c:269:gf_add_cmdline_options] 0-nova-server: adding option 'listen-port' for volume 'nova-server' with value '49155'
[2015-08-01 22:04:00.334330] I [graph.c:269:gf_add_cmdline_options] 0-nova-posix: adding option 'glusterd-uuid' for volume 'nova-posix' with value '5e09f3ec-bfbc-490b-bd93-8e083e8ebd05'
[2015-08-01 22:04:00.334518] I [server.c:392:_check_for_auth_option] 0-/data/glusterfs/sde1/brick: skip format check for non-addr auth option auth.login./data/glusterfs/sde1/brick.allow
[2015-08-01 22:04:00.334529] I [server.c:392:_check_for_auth_option] 0-/data/glusterfs/sde1/brick: skip format check for non-addr auth option auth.login.a9c47852-7dcf-4f89-80e5-110101943f36.password
[2015-08-01 22:04:00.334540] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2015-08-01 22:04:00.335316] I [rpcsvc.c:2213:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2015-08-01 22:04:00.335371] W [options.c:936:xl_opt_validate] 0-nova-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
[2015-08-01 22:04:00.336170] E [ctr-helper.c:250:extract_ctr_options] 0-gfdbdatastore: CTR Xlator is disabled.
[2015-08-01 22:04:00.336190] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-pagesize from params.Assigning default value: 4096
[2015-08-01 22:04:00.336197] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-cachesize from params.Assigning default value: 1000
[2015-08-01 22:04:00.336211] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-journalmode from params.Assigning default value: wal
[2015-08-01 22:04:00.336217] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-wal-autocheckpoint from params.Assigning default value: 1000
[2015-08-01 22:04:00.336235] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-sync from params.Assigning default value: normal
[2015-08-01 22:04:00.336241] W [gfdb_sqlite3.h:238:gfdb_set_sql_params] 0-nova-changetimerecorder: Failed to retrieve sql-db-autovacuum from params.Assigning default value: none
[2015-08-01 22:04:00.336951] I [trash.c:2363:init] 0-nova-trash: no option specified for 'eliminate', using NULL
[2015-08-01 22:04:00.337131] E [posix.c:5894:init] 0-nova-posix: Extended attribute trusted.glusterfs.volume-id is absent
[2015-08-01 22:04:00.337142] E [xlator.c:426:xlator_init] 0-nova-posix: Initialization of volume 'nova-posix' failed, review your volfile again
[2015-08-01 22:04:00.337148] E [graph.c:322:glusterfs_graph_init] 0-nova-posix: initializing translator failed
[2015-08-01 22:04:00.337154] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2015-08-01 22:04:00.337629] W [glusterfsd.c:1219:cleanup_and_exit] (--> 0-: received signum (0), shutting down
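OK, it looks like I have to do the following.
Add the extended attribute trusted.glusterfs.volume-id. Note that it needs to be set on the /brick directory itself. I had tried it one level up, on the mount point, and that did not work.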
$ setfattr -n trusted.glusterfs.volume-id -v 0xf0d72d64288c4e729c532d16ce5687ac /data/glusterfs/sde1/brick
Note: the value for volume-id comes from the following command:
$ grep volume-id /var/lib/glusterd/vols/nova/info | cut -d= -f2 | sed 's/-//g'
f0d72d64288c4e729c532d16ce5687ac
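To avoid copy-and-paste mistakes with the 32-digit value, the conversion from the dashed volume ID to the `0x`-prefixed form can be wrapped in a small function. This is only a sketch; `volume_id_hex` is a hypothetical helper name, and the commented usage reuses the paths from this post.

```shell
#!/usr/bin/env bash
# Convert a dashed volume UUID into the 0x-prefixed hex string passed to
# setfattr -v. Uses bash parameter expansion to strip the dashes.
volume_id_hex() {
    printf '0x%s\n' "${1//-/}"
}

# Usage as root on the node that owns the brick (not executed here):
#   vol_id="$(grep volume-id /var/lib/glusterd/vols/nova/info | cut -d= -f2)"
#   setfattr -n trusted.glusterfs.volume-id \
#            -v "$(volume_id_hex "$vol_id")" /data/glusterfs/sde1/brick
```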
Restart glusterd
$ systemctl restart glusterd.service
If I then look at the brick's log, /var/log/glusterfs/bricks/data-glusterfs-sde1-brick.log, I see messages to this effect:
[2015-08-01 22:28:01.510200] I [login.c:81:gf_auth] 0-auth/login: allowed user names: a9c47852-7dcf-4f89-80e5-110101943f36
[2015-08-01 22:28:01.510254] I [server-handshake.c:585:server_setvolume] 0-nova-server: accepted client from icicle07.td.teradata.com-44127-2015/08/01-21:08:06:639278-nova-client-19-0-0 (version: 3.7.0)
[2015-08-01 22:28:01.510584] I [login.c:81:gf_auth] 0-auth/login: allowed user names: a9c47852-7dcf-4f89-80e5-110101943f36
[2015-08-01 22:28:01.510614] I [server-handshake.c:585:server_setvolume] 0-nova-server: accepted client from icicle08.td.teradata.com-7291-2015/07/02-00:22:13:514999-nova-client-19-0-0 (version: 3.7.0)
[2015-08-01 22:28:01.513443] I [login.c:81:gf_auth] 0-auth/login: allowed user names: a9c47852-7dcf-4f89-80e5-110101943f36
Now, when I watch the brick, I can see it syncing with the rest of the cluster:
$ while [ 1 ]; do du -sh /data/glusterfs/sde1/brick; sleep 30; done
38G /data/glusterfs/sde1/brick
40G /data/glusterfs/sde1/brick
41G /data/glusterfs/sde1/brick
Once that's done, run a heal command to check on things.
$ gluster volume heal nova full
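To see whether anything is still waiting to be healed, `gluster volume heal nova info` lists pending entries per brick. The sketch below sums those counts. It assumes the output contains one `Number of entries: N` line per brick, which may differ between GlusterFS versions; `pending_heal_entries` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sum the per-brick "Number of entries: N" lines from heal-info output on stdin.
# Assumption: one such line per brick; a non-numeric value counts as 0.
pending_heal_entries() {
    awk -F': *' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}

# Usage on a cluster node (not executed here):
#   gluster volume heal nova info | pending_heal_entries
```

A result of 0 would suggest the replaced brick has caught up with its replica partner.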
I also saw these messages after restarting glusterd:
[2015-08-01 22:27:56.882271] W [graph.c:357:_log_if_unknown_option] 0-nova-quota: option 'timeout' is not recognized
[2015-08-01 22:27:56.882303] W [graph.c:357:_log_if_unknown_option] 0-nova-trash: option 'brick-path' is not recognized
Final graph:
+------------------------------------------------------------------------------+
1: volume nova-posix
2:     type storage/posix
3:     option glusterd-uuid 5e09f3ec-bfbc-490b-bd93-8e083e8ebd05
4:     option directory /data/glusterfs/sde1/brick
5:     option volume-id f0d72d64-288c-4e72-9c53-2d16ce5687ac
6: end-volume
7:
8: volume nova-trash
9:     type features/trash
10:     option trash-dir .trashcan
11:     option brick-path /data/glusterfs/sde1/brick
12:     option trash-internal-op off
13:     subvolumes nova-posix
14: end-volume
15:
16: volume nova-changetimerecorder
17:     type features/changetimerecorder
18:     option db-type sqlite3
19:     option hot-brick off
20:     option db-name brick.db
21:     option db-path /data/glusterfs/sde1/brick/.glusterfs/
22:     option record-exit off
23:     option ctr_link_consistency off
24:     option record-entry on
25:     option ctr-enabled off
26:     option record-counters off
27:     subvolumes nova-trash
28: end-volume
29:
30: volume nova-changelog
31:     type features/changelog
32:     option changelog-brick /data/glusterfs/sde1/brick
33:     option changelog-dir /data/glusterfs/sde1/brick/.glusterfs/changelogs
34:     option changelog-barrier-timeout 120
35:     subvolumes nova-changetimerecorder
36: end-volume
37:
38: volume nova-bitrot-stub
39:     type features/bitrot-stub
40:     option export /data/glusterfs/sde1/brick
41:     subvolumes nova-changelog
42: end-volume
43:
44: volume nova-access-control
45:     type features/access-control
46:     subvolumes nova-bitrot-stub
47: end-volume
48:
49: volume nova-locks
50:     type features/locks
51:     subvolumes nova-access-control
52: end-volume
53:
54: volume nova-upcall
55:     type features/upcall
56:     option cache-invalidation off
57:     subvolumes nova-locks
58: end-volume
59:
60: volume nova-io-threads
61:     type performance/io-threads
62:     subvolumes nova-upcall
63: end-volume
64:
65: volume nova-barrier
66:     type features/barrier
67:     option barrier disable
68:     option barrier-timeout 120
69:     subvolumes nova-io-threads
70: end-volume
71:
72: volume nova-index
73:     type features/index
74:     option index-base /data/glusterfs/sde1/brick/.glusterfs/indices
75:     subvolumes nova-barrier
76: end-volume
77:
78: volume nova-marker
79:     type features/marker
80:     option volume-uuid f0d72d64-288c-4e72-9c53-2d16ce5687ac
81:     option timestamp-file /var/lib/glusterd/vols/nova/marker.tstamp
82:     option xtime off
83:     option gsync-force-xtime off
84:     option quota off
85:     option inode-quota off
86:     subvolumes nova-index
87: end-volume
88:
89: volume nova-quota
90:     type features/quota
91:     option volume-uuid nova
92:     option server-quota off
93:     option timeout 0
94:     option deem-statfs off
95:     subvolumes nova-marker
96: end-volume
97:
98: volume nova-worm
99:     type features/worm
100:     option worm off
101:     subvolumes nova-quota
102: end-volume
103:
104: volume nova-read-only
105:     type features/read-only
106:     option read-only off
107:     subvolumes nova-worm
108: end-volume
109:
110: volume /data/glusterfs/sde1/brick
111:     type debug/io-stats
112:     option latency-measurement off
113:     option count-fop-hits off
114:     subvolumes nova-read-only
115: end-volume
116:
117: volume nova-server
118:     type protocol/server
119:     option transport.socket.listen-port 49155
120:     option rpc-auth.auth-glusterfs on
121:     option rpc-auth.auth-unix on
122:     option rpc-auth.auth-null on
123:     option transport-type tcp
124:     option auth.login./data/glusterfs/sde1/brick.allow a9c47852-7dcf-4f89-80e5-110101943f36
125:     option auth.login.a9c47852-7dcf-4f89-80e5-110101943f36.password XXXXXXX
126:     option auth.addr./data/glusterfs/sde1/brick.allow *
127:     subvolumes /data/glusterfs/sde1/brick
128: end-volume
129:
+------------------------------------------------------------------------------+
You can view the attributes that are present with the following command:
$ getfattr -d -m . -e hex /data/glusterfs/sde1/brick
getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/sde1/brick
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.nova-client-18=0x000000000000000000000000
trusted.afr.nova-client-19=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x00000001000000004ccccccb66666663
trusted.glusterfs.dht.commithash=0x3000
trusted.glusterfs.volume-id=0xf0d72d64288c4e729c532d16ce5687ac
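As a quick sanity check before restarting glusterd, the attribute on the brick directory can be compared against the expected value. The sketch below parses `getfattr` output in the hex format shown above; `brick_volume_id` and `volume_id_matches` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Extract the hex value of trusted.glusterfs.volume-id from getfattr output
# (hex encoding, one "name=value" pair per line) read on stdin.
brick_volume_id() {
    sed -n 's/^trusted\.glusterfs\.volume-id=//p'
}

# Succeed only if the attribute is present and equals the expected
# 0x-prefixed value given as the first argument.
volume_id_matches() {
    local expected="$1" actual
    actual="$(brick_volume_id)"
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage as root on the node that owns the brick (not executed here):
#   getfattr -d -m . -e hex /data/glusterfs/sde1/brick 2>/dev/null \
#     | volume_id_matches 0xf0d72d64288c4e729c532d16ce5687ac && echo "volume-id OK"
```

Running the same check against the mount point one level up would fail if the attribute was only set on the brick directory, which mirrors the mistake described earlier.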