Bacula client needs an OS reboot before it works on CentOS

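I have a Bacula server (version 5.2.13) running on CentOS 7 and three clients: one Windows and two Linux (Debian 8 and CentOS 7). The Windows and Debian clients work fine, but the CentOS client has a problem. Its jobs fail with the error ERR=Interrupted system call. It looks as if the client stops answering on port 9102: at that point telnet gets no answer when I run it from a remote host, but it does get an answer when I try it locally on the client. After a reboot it works for an hour or two, until the error appears again.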

Here are my configuration files:

bacula-dir.conf on 10.0.0.180

    #
    #
    # The only thing that MUST be changed is to add one or more
    # file or directory names in the Include directive of the
    # FileSet resource.
    #
    # For Bacula release 5.2.13 (19 February 2013) -- redhat (Core)
    #
    # You might also want to change the default email address
    # from root to your address. See the "mail" and "operator"
    # directives in the Messages resource.
    #

    Director {                            # define myself
      Name = bacula-dir
      DIRport = 9101                      # where we listen for UA connections
      QueryFile = "/etc/bacula/query.sql"
      WorkingDirectory = "/var/spool/bacula"
      PidDirectory = "/var/run"
      Maximum Concurrent Jobs = 1
      Password = "SECRETPASSWORD"         # Console password
      Messages = Daemon
      DirAddress = 10.0.0.180
    }

    JobDefs {
      Name = "DefaultJob"
      Type = Backup
      Level = Full
      Client = bacula-fd
      FileSet = "Full Set"
      Schedule = "WeeklyCycle"
      Storage = File
      Messages = Standard
      Pool = File
      Priority = 10
      Write Bootstrap = "/var/spool/bacula/%c.bsr"
    }

    Job{
      Name = "Bkp Client 182"
      JobDefs = "DefaultJob"
      Client = 10.0.0.182
    }

    Job{
      Name = "Bkp Client 182 inc"
      JobDefs = "DefaultJob"
      Client = 10.0.0.182
      Schedule = "testschedule"
      Level = Incremental
    }

    Job{
      Name = "Bkp Client 183"
      JobDefs = "DefaultJob"
      Client = 10.0.0.183
    }

    Job{
      Name = "Bkp Client 188"
      JobDefs = "DefaultJob"
      Client = 10.0.0.188
      FileSet = "Windows"
    }

    #
    # Define the main nightly save backup job
    # By default, this job will back up to disk in /tmp
    Job {
      Name = "IncBackup"
      JobDefs = "DefaultJob"
    }

    #Job {
    #  Name = "BackupClient2"
    #  Client = bacula2-fd
    #  JobDefs = "DefaultJob"
    #}

    # Backup the catalog database (after the nightly save)
    Job {
      Name = "FullBackup"
      JobDefs = "DefaultJob"
      Level = Full
      FileSet="Catalog"
      #FileSet="Full Set"
      Schedule = "WeeklyCycleAfterBackup"
      # This creates an ASCII copy of the catalog
      # Arguments to make_catalog_backup.pl are:
      #  make_catalog_backup.pl <catalog-name>
      RunBeforeJob = "/usr/libexec/bacula/make_catalog_backup.pl MyCatalog"
      # This deletes the copy of the catalog
      RunAfterJob = "/usr/libexec/bacula/delete_catalog_backup"
      Write Bootstrap = "/var/spool/bacula/%n.bsr"
      Priority = 11                       # run after main backup
    }

    #
    # Standard Restore template, to be changed by Console program
    # Only one such job is needed for all Jobs/Clients/Storage ...
    #
    Job {
      Name = "RestoreFiles"
      Type = Restore
      Client=bacula-fd
      FileSet="Full Set"
      Storage = File
      Pool = Default
      Messages = Standard
      Where = /tmp/bacula-restores
    }

    FileSet {
      Name = "Windows"
      Include {
        Options {
          signature = MD5
          compression = GZIP
        }
        File = "C:/"
      }
    }

    FileSet {
      Name = "Full Set"
      Include {
        Options {
          signature = MD5
          compression = GZIP
        }
    #
    # Put your list of files here, preceded by 'File =', one per line
    # or include an external list with:
    #
    # File = <file-name
    #
    # Note: / backs up everything on the root partition.
    # if you have other partitions such as /usr or /home
    # you will probably want to add them too.
    #
    # By default this is defined to point to the Bacula binary
    # directory to give a reasonable FileSet to backup to
    # disk storage during initial testing.
    #
        File = /
      }

    #
    # If you backup the root directory, the following two excluded
    # files can be useful
    #
      Exclude {
        File = /var/spool/bacula
        File = /tmp
        File = /proc
        File = /tmp
        File = /.journal
        File = /.fsck
        File = /bacula
      }
    }

    #
    # When to do the backups, full backup on first sunday of the month,
    # differential (ie incremental since full) every other sunday,
    # and incremental backups other days
    Schedule {
      Name = "WeeklyCycle"
      Run = Full 1st sun at 23:05
      Run = Differential 2nd-5th sun at 23:05
      Run = Incremental mon-sat at 23:05
    }

    # This schedule does the catalog. It starts after the WeeklyCycle
    Schedule {
      Name = "WeeklyCycleAfterBackup"
      Run = Full sun-sat at 23:10
    }

    Schedule {
      Name ="testschedule"
      Run = Incremental mon-sat at 11:05
    }

    # This is the backup of the catalog
    FileSet {
      Name = "Catalog"
      Include {
        Options {
          signature = MD5
          compression = GZIP
        }
        File = "/var/spool/bacula/bacula.sql"
      }
    }

    # Client (File Services) to backup
    Client {
      Name = bacula-fd
      Address = localhost
      FDPort = 9102
      Catalog = MyCatalog
      Password = "SECRETPASSWORD"         # password for FileDaemon
      File Retention = 30 days            # 30 days
      Job Retention = 6 months            # six months
      AutoPrune = yes                     # Prune expired Jobs/Files
    }

    #
    # Second Client (File Services) to backup
    # You should change Name, Address, and Password before using
    Client {
      Name = 10.0.0.182
      Address = 10.0.0.182
      FDPort = 9102
      Catalog = MyCatalog
      Password = "SECRETPASSWORD"         # password for FileDaemon 2
      File Retention = 30 days            # 30 days
      Job Retention = 6 months
      AutoPrune = yes
    }

    Client {
      Name = 10.0.0.183
      Address = 10.0.0.183
      FDPort = 9102
      Catalog = MyCatalog
      Password = "SECRETPASSWORD"         # password for FileDaemon 2
      File Retention = 30 days            # 30 days
      Job Retention = 6 months
      AutoPrune = yes
    }

    Client {
      Name = 10.0.0.189
      Address = 10.0.0.189
      FDPort = 9102
      Catalog = MyCatalog
      Password = "SECRETPASSWORD"         # password for FileDaemon 2
      File Retention = 30 days            # 30 days
      Job Retention = 6 months
      AutoPrune = yes
    }

    Client {
      Name = 10.0.0.188
      Address = 10.0.0.188
      FDPort = 9102
      Catalog = MyCatalog
      Password = "SECRETPASSWORD"         # password for FileDaemon 2
      File Retention = 30 days            # 30 days
      Job Retention = 6 months
      AutoPrune = yes
    }

    #
    #Client {
    #  Name = bacula2-fd
    #  Address = localhost2
    #  FDPort = 9102
    #  Catalog = MyCatalog
    #  Password = "@@FD_PASSWORD@@2"      # password for FileDaemon 2
    #  File Retention = 30 days           # 30 days
    #  Job Retention = 6 months           # six months
    #  AutoPrune = yes                    # Prune expired Jobs/Files
    #}

    # Definition of file storage device
    Storage {
      Name = File
    # Do not use "localhost" here
      Address = 10.0.0.180                # NB Use a fully qualified name here
      SDPort = 9103
      Password = "SECRETPASSWORD"
      Device = FileStorage
      Media Type = File
    }

    # Definition of DDS tape storage device
    #Storage {
    #  Name = DDS-4
    #  Do not use "localhost" here
    #  Address = 10.0.0.180               # NB Use a fully qualified name here
    #  SDPort = 9103
    #  Password = "SECRETPASSWORD"
    #  Device = DDS-4                     # must be same as Device in Storage daemon
    #  Media Type = DDS-4                 # must be same as MediaType in Storage daemon
    #  Autochanger = yes                  # enable for autochanger device
    #}

    # Definition of 8mm tape storage device
    #Storage {
    #  Name = "8mmDrive"
    #  Do not use "localhost" here
    #  Address = 10.0.0.180               # NB Use a fully qualified name here
    #  SDPort = 9103
    #  Password = "SECRETPASSWORD"
    #  Device = "Exabyte 8mm"
    #  MediaType = "8mm"
    #}

    # Definition of DVD storage device
    #Storage {
    #  Name = "DVD"
    #  Do not use "localhost" here
    #  Address = 10.0.0.180               # NB Use a fully qualified name here
    #  SDPort = 9103
    #  Password = "SECRETPASSWORD"
    #  Device = "DVD Writer"
    #  MediaType = "DVD"
    #}

    # Generic catalog service
    Catalog {
      Name = MyCatalog
    # Uncomment the following line if you want the dbi driver
    # dbdriver = "dbi:postgresql"; dbaddress = 127.0.0.1; dbport =
      dbname = "bacula"; dbuser = "bacula"; dbpassword = "SECRETPASSWORD"
    }

    # Reasonable message delivery -- send most everything to email address
    # and to the console
    Messages {
      Name = Standard
    #
    # NOTE! If you send to two email or more email addresses, you will need
    # to replace the %r in the from field (-f part) with a single valid
    # email address in both the mailcommand and the operatorcommand.
    # What this does is, it sets the email address that emails would display
    # in the FROM field, which is by default the same email as they're being
    # sent to. However, if you send email to more than one address, then
    # you'll have to set the FROM address manually, to a single address.
    # for example, a '[email protected]', is better since that tends to
    # tell (most) people that its coming from an automated source.
    #
      mailcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) \<%r\>\" -s \"Bacula: %t %e of %c %l\" %r"
      operatorcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) \<%r\>\" -s \"Bacula: Intervention needed for %j\" %r"
      mail = root@localhost = all, !skipped
      operator = root@localhost = mount
      console = all, !skipped, !saved
    #
    # WARNING! the following will create a file that you must cycle from
    # time to time as it will grow indefinitely. However, it will
    # also keep all your messages if they scroll off the console.
    #
      append = "/var/log/bacula/bacula.log" = all, !skipped
      catalog = all
    }

    #
    # Message delivery for daemon messages (no job).
    Messages {
      Name = Daemon
      mailcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) \<%r\>\" -s \"Bacula daemon message\" %r"
      mail = root@localhost = all, !skipped
      console = all, !skipped, !saved
      append = "/var/log/bacula/bacula.log" = all, !skipped
    }

    # Default pool definition
    Pool {
      Name = Default
      Pool Type = Backup
      Label Format = Local-
      Recycle = yes                       # Bacula can automatically recycle Volumes
      AutoPrune = yes                     # Prune expired volumes
      Volume Retention = 365 days         # one year
    }

    # File Pool definition
    Pool {
      Name = File
      Pool Type = Backup
      Label Format = Local-
      Recycle = yes                       # Bacula can automatically recycle Volumes
      AutoPrune = yes                     # Prune expired volumes
      Volume Retention = 365 days         # one year
      Maximum Volume Bytes = 50G          # Limit Volume size to something reasonable
      Maximum Volumes = 100               # Limit number of Volumes in Pool
    }

    # Scratch pool definition
    Pool {
      Name = Scratch
      Pool Type = Backup
      Label Format = Local-
    }

    #
    # Restricted console used by tray-monitor to get the status of the director
    #
    Console {
      Name = bacula-mon
      Password = "SECRETPASSWORD"
      CommandACL = status, .status
    }

bacula-fd.conf on 10.0.0.182

    #
    # Default Bacula File Daemon Configuration file
    #
    # For Bacula release 5.2.13 (19 February 2013) -- redhat (Core)
    #
    # There is not much to change here except perhaps the
    # File daemon Name to
    #

    #
    # List Directors who are permitted to contact this File daemon
    #
    Director {
      Name = bacula-dir
      Password = "SECRETPASSWORD"
    }

    #
    # Restricted Director, used by tray-monitor to get the
    # status of the file daemon
    #
    Director {
      Name = bacula-mon
      Password = "SECRETPASSWORD"
      Monitor = yes
    }

    #
    # "Global" File daemon configuration specifications
    #
    FileDaemon {                          # this is me
      Name = 10.0.0.182
      FDport = 9102                       # where we listen for the director
      WorkingDirectory = /var/spool/bacula
      Pid Directory = /var/run
      Maximum Concurrent Jobs = 20
    }

    # Send all messages except skipped files back to Director
    Messages {
      Name = Standard
      director = bacula-dir = all, !skipped, !restored
    }

And now the interesting part. When I check bacula.log in the morning to see whether the scheduled jobs ran, it looks like this:

    +-------+----------------+---------------------+------+-------+----------+-------------+-----------+
    | JobId | Name           | StartTime           | Type | Level | JobFiles | JobBytes    | JobStatus |
    +-------+----------------+---------------------+------+-------+----------+-------------+-----------+
    |   145 | Bkp Client 182 | 2016-12-07 23:05:02 | B    | I     |        0 |           0 | E         |
    |   146 | Bkp Client 183 | 2016-12-07 23:08:15 | B    | I     |       15 |   1,165,244 | T         |
    |   147 | Bkp Client 188 | 2016-12-07 23:08:18 | B    | I     |      308 | 176,285,953 | T         |
    |   148 | IncBackup      | 2016-12-07 23:10:02 | B    | I     |      831 | 118,708,240 | T         |
    |   149 | FullBackup     | 2016-12-07 23:10:40 | B    | F     |        1 |  40,523,735 | T         |
    +-------+----------------+---------------------+------+-------+----------+-------------+-----------+

The log tells me:

    07-Dec 23:05 bacula-dir JobId 145: Start Backup JobId 145, Job=Bkp_Client_182.2016-12-07_23.05.00_04
    07-Dec 23:05 bacula-dir JobId 145: Using Device "FileStorage" to write.
    07-Dec 23:08 bacula-dir JobId 145: Warning: bsock.c:132 Could not connect to Client: 10.0.0.182 on 10.0.0.182:9102. ERR=Interrupted system call
    Retrying ...
    07-Dec 23:08 bacula-dir JobId 145: Fatal error: bsock.c:138 Unable to connect to Client: 10.0.0.182 on 10.0.0.182:9102. ERR=Interrupted system call
    07-Dec 23:08 bacula-dir JobId 145: Fatal error: No Job status returned from FD.
    07-Dec 23:08 bacula-dir JobId 145: Error: Bacula bacula-dir 5.2.13 (19Jan13):
      Build OS:               x86_64-redhat-linux-gnu redhat (Core)
      JobId:                  145
      Job:                    Bkp_Client_182.2016-12-07_23.05.00_04
      Backup Level:           Incremental, since=2016-12-07 09:59:59
      Client:                 "10.0.0.182" 5.2.13 (19Jan13) x86_64-redhat-linux-gnu,redhat,(Core)
      FileSet:                "Full Set" 2016-11-11 07:28:59
      Pool:                   "File" (From Job resource)
      Catalog:                "MyCatalog" (From Client resource)
      Storage:                "File" (From Job resource)
      Scheduled time:         07-Dec-2016 23:05:00
      Start time:             07-Dec-2016 23:05:02
      End time:               07-Dec-2016 23:08:12
      Elapsed time:           3 mins 10 secs
      Priority:               10
      FD Files Written:       0
      SD Files Written:       0
      FD Bytes Written:       0 (0 B)
      SD Bytes Written:       0 (0 B)
      Rate:                   0.0 KB/s
      Software Compression:   None
      VSS:                    no
      Encryption:             no
      Accurate:               no
      Volume name(s):
      Volume Session Id:      61
      Volume Session Time:    1480582855
      Last Volume Bytes:      28,914,391,620 (28.91 GB)
      Non-fatal FD errors:    1
      SD Errors:              0
      FD termination status:  Error
      SD termination status:  Waiting on FD
      Termination:            *** Backup Error ***

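The job "Bkp Client 182" will not work until the client is rebooted. The firewall is disabled. I took SELinux out of enforcing mode on the client and checked the listening ports with ss -lnt | grep -i listen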

    LISTEN     0      50         *:9102            *:*
    LISTEN     0      128        *:22              *:*
    LISTEN     0      100        127.0.0.1:25      *:*
    LISTEN     0      128        :::22             :::*
    LISTEN     0      100        ::1:25            :::*

I rebooted the client and ran the job manually; this is what I got (it ran fine):

    08-Dec 11:05 bacula-dir JobId 153: Start Backup JobId 153, Job=Bkp_Client_182_inc.2016-12-08_11.05.00_06
    08-Dec 11:05 bacula-dir JobId 153: Using Device "FileStorage" to write.
    08-Dec 11:05 10.0.0.182 JobId 153: DIR and FD clocks differ by 7 seconds, FD automatically compensating.
    08-Dec 11:05 bacula-sd JobId 153: Volume "Local-0001" previously written, moving to end of data.
    08-Dec 11:05 bacula-sd JobId 153: Ready to append to end of Volume "Local-0001" size=30288206775
    08-Dec 11:05 10.0.0.182 JobId 153: /boot is a different filesystem. Will not descend from / into it.
    08-Dec 11:05 10.0.0.182 JobId 153: /dev is a different filesystem. Will not descend from / into it.
    08-Dec 11:05 10.0.0.182 JobId 153: /run is a different filesystem. Will not descend from / into it.
    08-Dec 11:05 10.0.0.182 JobId 153: /sys is a different filesystem. Will not descend from / into it.
    08-Dec 11:05 bacula-sd JobId 153: Elapsed time=00:00:01, Transfer rate=271 Bytes/second
    08-Dec 11:05 bacula-dir JobId 153: Bacula bacula-dir 5.2.13 (19Jan13):
      Build OS:               x86_64-redhat-linux-gnu redhat (Core)
      JobId:                  153
      Job:                    Bkp_Client_182_inc.2016-12-08_11.05.00_06
      Backup Level:           Incremental, since=2016-12-08 11:01:16
      Client:                 "10.0.0.182" 5.2.13 (19Jan13) x86_64-redhat-linux-gnu,redhat,(Core)
      FileSet:                "Full Set" 2016-11-11 07:28:59
      Pool:                   "File" (From Job resource)
      Catalog:                "MyCatalog" (From Client resource)
      Storage:                "File" (From Job resource)
      Scheduled time:         08-Dec-2016 11:05:00
      Start time:             08-Dec-2016 11:05:02
      End time:               08-Dec-2016 11:05:02
      Elapsed time:           0 secs
      Priority:               10
      FD Files Written:       4
      SD Files Written:       4
      FD Bytes Written:       0 (0 B)
      SD Bytes Written:       271 (271 B)
      Rate:                   0.0 KB/s
      Software Compression:   None
      VSS:                    no
      Encryption:             no
      Accurate:               no
      Volume name(s):         Local-0001
      Volume Session Id:      69
      Volume Session Time:    1480582855
      Last Volume Bytes:      30,288,207,524 (30.28 GB)
      Non-fatal FD errors:    0
      SD Errors:              0
      FD termination status:  OK
      SD termination status:  OK
      Termination:            Backup OK
    08-Dec 11:05 bacula-dir JobId 153: Begin pruning Jobs older than 6 months .
    08-Dec 11:05 bacula-dir JobId 153: No Jobs found to prune.
    08-Dec 11:05 bacula-dir JobId 153: Begin pruning Files.
    08-Dec 11:05 bacula-dir JobId 153: No Files found to prune.
    08-Dec 11:05 bacula-dir JobId 153: End auto prune.

Update

I tried to reproduce the situation, this time with Red Hat 7.3 / 7.2 as clients and Red Hat 7.3 as the server. So far they have worked without any problems.

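Update: I monitored the CentOS client and found that it switches back and forth between working and not working. The Red Hat machines have had no problems since 16:30 yesterday.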
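To pin down when the client flips between reachable and unreachable, a small poller that only reports state changes can help. The following is a minimal sketch, not part of Bacula: `port_open` and `watch` are hypothetical helper names, and the demo feeds in a canned sequence instead of polling a real host.

```python
import socket
import time
from datetime import datetime
from typing import Callable, Iterator, Tuple


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable all count as "not open".
        return False


def watch(probe: Callable[[], bool], rounds: int,
          interval: float) -> Iterator[Tuple[datetime, bool]]:
    """Call probe() `rounds` times and yield (time, state) on every change.

    The first sample always counts as a change, so the initial state is
    reported as well.
    """
    previous = None
    for i in range(rounds):
        state = probe()
        if state != previous:
            yield datetime.now(), state
            previous = state
        if i < rounds - 1:
            time.sleep(interval)


if __name__ == "__main__":
    # Demo with canned samples so the script runs without any network access.
    canned = iter([True, True, False, False, True])
    for when, state in watch(lambda: next(canned), rounds=5, interval=0.0):
        label = "reachable" if state else "UNREACHABLE"
        print(when.isoformat(timespec="seconds"), label)
```

On the director host, something like `watch(lambda: port_open("10.0.0.182", 9102), rounds=720, interval=60.0)` would sample once a minute for twelve hours; the timestamps of the transitions can then be compared with the system logs on the client.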

Update: and the output of ifconfig -a from the client

            inet 10.0.0.182  netmask 255.255.255.240  broadcast 10.0.0.191
            ether 52:54:00:b1:2c:b2  txqueuelen 1000  (Ethernet)
            RX packets 42585  bytes 2505692 (2.3 MiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 42868  bytes 146780428 (139.9 MiB)
            TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

    lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
            inet 127.0.0.1  netmask 255.0.0.0
            loop  txqueuelen 1  (Local Loopback)
            RX packets 68  bytes 5780 (5.6 KiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 68  bytes 5780 (5.6 KiB)
            TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
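The netmask in this output is unusually narrow, so it is worth checking which addresses the client treats as on-link. A small sketch using Python's standard `ipaddress` module follows; `same_subnet` is a hypothetical helper, and the only inputs are the addresses already quoted in this question.

```python
import ipaddress


def same_subnet(local_ip: str, netmask: str, peer_ip: str) -> bool:
    """Return True if peer_ip lies inside the network implied by local_ip/netmask."""
    network = ipaddress.ip_interface(f"{local_ip}/{netmask}").network
    return ipaddress.ip_address(peer_ip) in network


if __name__ == "__main__":
    # Client address and netmask as reported by ifconfig on 10.0.0.182.
    client = ipaddress.ip_interface("10.0.0.182/255.255.255.240")
    print("network:  ", client.network)
    print("broadcast:", client.network.broadcast_address)
    # Director address taken from DirAddress in bacula-dir.conf.
    print("director on-link:",
          same_subnet("10.0.0.182", "255.255.255.240", "10.0.0.180"))
```

With these values the client's network works out to 10.0.0.176/28 with broadcast 10.0.0.191, which matches the ifconfig output, and the director at 10.0.0.180 falls inside it. Running the same check with the netmask configured on the director would show whether both ends agree on the subnet; that netmask is not given here.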

I can tell you one thing for sure: the Bacula client does not need a reboot. Never. So the problem comes from somewhere below bacula-fd.

Since telnet gets no answer and Bacula relies on a TCP/IP session, it simply cannot work. Solve the network and/or security problem and you will have solved the Bacula one.
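The telnet test can be scripted so that it also reports how the connection fails. As a rough guide, a timeout typically means packets are being dropped somewhere on the path (filtering, routing, ARP), while a refusal means the host answered but nothing accepted the connection on that port. This is a minimal sketch; `probe` is a hypothetical helper, and the self-check only uses a throwaway listener on the loopback interface.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connect and classify the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No reply at all within the timeout.
        return "timeout"
    except ConnectionRefusedError:
        # The host replied with a reset: reachable, but nothing is listening.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. no route to host or network unreachable.
        return f"error: {exc}"


if __name__ == "__main__":
    # Self-check against a temporary local listener instead of a real client.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    print("while listening:", probe("127.0.0.1", port))
    listener.close()
    print("after close:    ", probe("127.0.0.1", port, timeout=1.0))
```

Running `probe("10.0.0.182", 9102)` on the director and `probe("127.0.0.1", 9102)` on the client at the moment a job fails reproduces the remote and local telnet tests from the question and shows which kind of failure the director sees.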