It looks as though the resolv.conf option use-vc is being ignored on the Amazon AMI (the latest version, 2016.09). Consider the following:
[hadoop@ip-172-20-40-202 ~]$ cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options use-vc ndots:5 timeout:2 attempts:5
nameserver 172.20.53.184
nameserver 172.20.0.2
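As a quick sanity check that the option is really present and spelled the way the man page describes, the file can be parsed by hand. This is only a minimal sketch in Python: the helper name parse_resolv_conf is made up for illustration, and it mimics the basic keyword/value layout from man resolv.conf rather than glibc's actual parser.

```python
def parse_resolv_conf(text):
    """Collect nameserver, search and options entries from resolv.conf text."""
    conf = {"nameserver": [], "search": [], "options": []}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (lines starting with '#' or ';').
        if not line or line[0] in "#;":
            continue
        keyword, *values = line.split()
        if keyword == "nameserver":
            # Only the first value on a nameserver line is the address.
            conf["nameserver"].extend(values[:1])
        elif keyword == "search":
            # Per the man page, only the last search line is used.
            conf["search"] = values
        elif keyword == "options":
            conf["options"].extend(values)
    return conf


# The file contents shown above, used here as sample input.
SAMPLE = """\
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options use-vc ndots:5 timeout:2 attempts:5
nameserver 172.20.53.184
nameserver 172.20.0.2
"""

if __name__ == "__main__":
    conf = parse_resolv_conf(SAMPLE)
    print("use-vc present:", "use-vc" in conf["options"])
    print("nameservers:", conf["nameserver"])
```

On a real host the same function could be fed open("/etc/resolv.conf").read() instead of SAMPLE.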
If I use nslookup interactively and force TCP with set vc, the queries work exactly as expected:
[hadoop@ip-172-20-40-202 ~]$ nslookup
> set vc
> kafka.default.svc.cluster.local
;; Got recursion not available from 172.20.53.184, trying next server
;; Got recursion not available from 172.20.53.184, trying next server
;; Got recursion not available from 172.20.53.184, trying next server
Server:         172.20.53.184
Address:        172.20.53.184#53

Name:   kafka.default.svc.cluster.local
Address: 100.96.14.2
Name:   kafka.default.svc.cluster.local
Address: 100.96.7.2
Name:   kafka.default.svc.cluster.local
Address: 100.96.13.2
> kafka
Server:         172.20.53.184
Address:        172.20.53.184#53

Name:   kafka.default.svc.cluster.local
Address: 100.96.14.2
Name:   kafka.default.svc.cluster.local
Address: 100.96.7.2
Name:   kafka.default.svc.cluster.local
Address: 100.96.13.2
> exit
Left to its own devices, however, nslookup fails:
[hadoop@ip-172-20-40-202 ~]$ nslookup kafka.default.svc.cluster.local
Server:         172.20.0.2
Address:        172.20.0.2#53

** server can't find kafka.default.svc.cluster.local: NXDOMAIN
The same goes for dig. Forcing TCP works as expected:
[hadoop@ip-172-20-40-202 ~]$ dig +vc kafka.default.svc.cluster.local

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.47.rc1.52.amzn1 <<>> +vc kafka.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55634
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;kafka.default.svc.cluster.local. IN A

;; ANSWER SECTION:
kafka.default.svc.cluster.local. 30 IN A 100.96.13.2
kafka.default.svc.cluster.local. 30 IN A 100.96.14.2
kafka.default.svc.cluster.local. 30 IN A 100.96.7.2

;; Query time: 2 msec
;; SERVER: 172.20.53.184#53(172.20.53.184)
;; WHEN: Thu Mar 16 20:45:06 2017
;; MSG SIZE rcvd: 97
Not forcing TCP fails:
[hadoop@ip-172-20-40-202 ~]$ dig kafka.default.svc.cluster.local

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.47.rc1.52.amzn1 <<>> kafka.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 9580
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;kafka.default.svc.cluster.local. IN A

;; AUTHORITY SECTION:
. 52 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2017031602 1800 900 604800 86400

;; Query time: 0 msec
;; SERVER: 172.20.0.2#53(172.20.0.2)
;; WHEN: Thu Mar 16 20:44:58 2017
;; MSG SIZE rcvd: 124
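For context, what use-vc and dig +vc change is only the transport: the same DNS message goes over a TCP stream, where RFC 1035 requires a two-byte length prefix, instead of being sent as a bare UDP datagram. The Python sketch below builds a minimal A query and frames it for TCP. The function names build_query and frame_for_tcp are invented for this illustration, and the code is a stand-in for explanation only, not what dig or the C library actually run.

```python
import struct


def build_query(name, query_id=0x1234, qtype=1):
    """Build a minimal DNS query message (qtype 1 = A record, class IN)."""
    # Header: ID, flags (only the RD bit set), QDCOUNT=1, other counts zero.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: every label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    return header + question


def frame_for_tcp(message):
    """Prefix the message with its 16-bit length, as DNS over TCP requires."""
    return struct.pack("!H", len(message)) + message


if __name__ == "__main__":
    msg = build_query("kafka.default.svc.cluster.local")
    print(len(msg), "bytes as a UDP payload")
    print(len(frame_for_tcp(msg)), "bytes on a TCP stream")
```

Over UDP the message would be handed to sendto() on a SOCK_DGRAM socket as-is; over TCP the framed bytes would be written to a SOCK_STREAM socket connected to port 53 of the nameserver.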
It looks as though the use-vc in options use-vc ndots:5 timeout:2 attempts:5 is being ignored.
How do I set up my configuration correctly to force TCP for all DNS queries? man resolv.conf says it should work!
It looks like the diagnostic tools, nslookup and dig, were misleading me.
When I use getent, I see that the names do resolve correctly and that the use-vc option in /etc/resolv.conf is honored:
[hadoop@ip-172-20-40-202 ~]$ getent ahosts kafka.default.svc.cluster.local
100.96.13.2     STREAM kafka.default.svc.cluster.local
100.96.13.2     DGRAM
100.96.13.2     RAW
100.96.14.2     STREAM
100.96.14.2     DGRAM
100.96.14.2     RAW
100.96.7.2      STREAM
100.96.7.2      DGRAM
100.96.7.2      RAW
[hadoop@ip-172-20-40-202 ~]$ getent hosts kafka.default.svc.cluster.local
100.96.13.2     kafka.default.svc.cluster.local
100.96.14.2     kafka.default.svc.cluster.local
100.96.7.2      kafka.default.svc.cluster.local
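getent goes through the C library's resolver, which is the path ordinary applications take, whereas nslookup and dig come from BIND and do their own lookups; that would explain the difference seen here. To check what an application actually gets, one can ask libc directly. A minimal sketch, assuming Python is available on the host; resolve_via_libc is a name invented for this example.

```python
import socket


def resolve_via_libc(name):
    """Return the sorted, de-duplicated addresses getaddrinfo() gives for name."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        # The lookup failed (for example NXDOMAIN or no reachable server).
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    for address in resolve_via_libc("kafka.default.svc.cluster.local"):
        print(address)
```

Running this with and without use-vc in /etc/resolv.conf should mirror the getent results rather than those of dig.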
If I remove the use-vc option from /etc/resolv.conf, getent borks as expected.
Who knew, right?