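I have been given a list of URLs that are generating 404 errors, as reported to us by Google.
I can test a single URL with curl (from the command line) like this: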
curl -k --user-agent "Googlebot/2.1 (+http://www.google.com/bot.html)" https://MYURLHERE
This works exactly as I expect. I would like to put it in a script so that I can run it over the whole list.
#!/usr/bin/bash
url=$1
curlcmd="curl -k --user-agent \"Googlebot/2.1 (+http://www.google.com/bot.html)\""
$curlcmd $url
But it does not work. I keep getting:
curl: (1) Protocol "(+http" not supported or disabled in libcurl
I don't know how to escape this so that it works. Any suggestions?
Put your variable $1 in quotes, or you could use something like this:
$ touch $$
$ echo 'http://www.google.com' >> $$
$ echo 'http://www.yahoo.com' >> $$
$ for url in $(cat $$); do curl -I $url ; done
HTTP/1.1 200 OK
Date: Wed, 22 Nov 2017 15:57:19 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2017-11-22-15; expires=Fri, 22-Dec-2017 15:57:19 GMT; path=/; domain=.google.com
Set-Cookie: NID=117=CaOUCOyr9TPjs64tqyz1MuqHsASzL_3eO5n-NE4ubqAikITGbs7QY0aegNByOWX1Vaf9SsUVQDJ1wdaIOZwXoiqfVZ9ISLtta7tvcDH6LFM52OGFKRH4J5Clde2EX8oG; expires=Thu, 24-May-2018 15:57:19 GMT; path=/; domain=.google.com; HttpOnly
Accept-Ranges: none
Vary: Accept-Encoding
Age: 0
Transfer-Encoding: chunked
Via: 1.1 localhost.localdomain

HTTP/1.1 200 OK
Date: Wed, 22 Nov 2017 15:57:19 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2017-11-22-15; expires=Fri, 22-Dec-2017 15:57:19 GMT; path=/; domain=.google.com
Set-Cookie: NID=117=VRrA0-bCESlSCoerEK0n1hxXfldwpQI4cisiKrEgnKVph9HkfQJu-tbur3ZBiLh3-RFKZ0kbWUWsBwJKzsi_aPUuJzztM1rCuDfljZLxqjaHanZxiCx7qch4P2WCoDDC; expires=Thu, 24-May-2018 15:57:19 GMT; path=/; domain=.google.com; HttpOnly
Accept-Ranges: none
Vary: Accept-Encoding
Age: 0
Transfer-Encoding: chunked
Via: 1.1 localhost.localdomain

HTTP/1.1 200 OK
Date: Wed, 22 Nov 2017 15:57:19 GMT
Via: http/1.1 media-router-fp56.prod.media.ne1.yahoo.com (ApacheTrafficServer [csf ]), 1.1 localhost.localdomain
Server: ATS
Cache-Control: no-store, no-cache, max-age=0, private
Content-Type: text/html
Content-Language: en
Expires: -1
X-Frame-Options: SAMEORIGIN
Content-Length: 12
Age: 0

$
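If the goal is only to see which URLs in the list return 404, a variation on the loop above could print just the status code for each one. This is a minimal sketch rather than a tested tool: the function name `check_urls` and the file name `urls.txt` in the usage line are made up for illustration, and it assumes the list holds one URL per line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<status> <url>" for every URL listed in a file.

# User agent taken from the question; kept in a plain variable and always quoted.
ua="Googlebot/2.1 (+http://www.google.com/bot.html)"

check_urls() {
    local file=$1 url status
    # Read one URL per line; IFS= and -r keep each line exactly as written.
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r url || [ -n "$url" ]; do
        [ -z "$url" ] && continue    # skip blank lines
        # -s: silent, -o /dev/null: discard the body,
        # -w '%{http_code}': print only the HTTP status code
        status=$(curl -k -s -o /dev/null --user-agent "$ua" -w '%{http_code}' "$url")
        printf '%s %s\n' "$status" "$url"
    done < "$file"
}

# Run only when a file name is passed on the command line.
if [ "$#" -gt 0 ]; then
    check_urls "$1"
fi
```

Saved as, say, `check.sh`, it could be run as `./check.sh urls.txt | grep '^404 '` to keep only the failing URLs. Reading with `while read` instead of `for url in $(cat ...)` avoids word splitting and glob expansion on the URLs themselves.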
You can modify it like this:
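#!/usr/bin/bash
url="$1"
# Keep the command in an array so the user-agent string stays a single argument;
# quotes stored inside a plain string are not re-parsed when the string is expanded.
curlcmd=(curl -k --user-agent "Googlebot/2.1 (+http://www.google.com/bot.html)")
"${curlcmd[@]}" "$url"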
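The message you are getting comes from the user-agent string being split on whitespace: curl receives (+http://www.google.com/bot.html)" as a separate argument, treats it as a URL, and does not recognize "(+http" as a protocol. Once the user agent is passed as a single argument, call the script with the full URL, scheme included: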
./test.sh https://www.somepage.com
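To see where the `(+http` in the error message comes from, it can help to print the arguments that an unquoted string expansion actually produces and compare them with an array expansion. The sketch below makes no network calls; `show_args`, `ua_string`, and `ua_array` are illustrative names, not part of the original script.

```shell
#!/usr/bin/env bash
# Print every argument on its own line, wrapped in brackets.
show_args() { printf '[%s]\n' "$@"; }

# The same options stored two ways: as one string and as an array.
ua_string='--user-agent "Googlebot/2.1 (+http://www.google.com/bot.html)"'
ua_array=(--user-agent "Googlebot/2.1 (+http://www.google.com/bot.html)")

echo "unquoted string expansion:"
show_args $ua_string
# [--user-agent]
# ["Googlebot/2.1]
# [(+http://www.google.com/bot.html)"]

echo "array expansion:"
show_args "${ua_array[@]}"
# [--user-agent]
# [Googlebot/2.1 (+http://www.google.com/bot.html)]
```

In the first case the embedded double quotes are ordinary characters, so the shell splits at the space and curl is handed the third word as if it were a URL. The array keeps the whole user-agent string as one argument.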