The "pidof" check on the host fails when Docker containers run the same application (e.g. apache2)

The apache2 init script runs a pidof check to detect whether Apache is already running.

    if pidof $DAEMON > /dev/null 2>&1 ; then
        if [ -e $PIDFILE ] && pidof $DAEMON | tr ' ' '\n' | grep -w $(cat $PIDFILE) > /dev/null 2>&1 ; then
            AP_RET=2
        else
            AP_RET=1
        fi
    ...
    elif [ $AP_RET = 1 ] ; then
        APACHE2_INIT_MESSAGE="There are processes named 'apache2' running which do not match your pid file which are left untouched in the name of safety, Please review the situation by hand".

(File on Ubuntu 16.04.3 LTS: /etc/init/apache2; truncated for brevity)

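On a Docker host, however, the VM containers may already have Apache running in them. In that case pidof returns a non-empty result even when no Apache is running on the host itself.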

    $ sudo service apache2 stop
    $ pidof apache2
    32742 32480 32379 32365 31295 31294 31293 31292 31291 31274 31270

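This means the init script only succeeds when every Docker container that has Apache in it is stopped (or not yet started). As a result, Apache on the host cannot be restarted.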

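How can this be solved so that the host's Apache can be restarted independently of the VMs? Is there a version of pidof that only detects PIDs owned directly by init?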

It's a pity that the init script only says `# can't use pidofproc from LSB here`, with no real explanation. I still think this apache2 script has a bug worth reporting.

TL;DR: as a workaround, replace `pidof apache2` with `pgrep --ns 1 ^apache2$` (or, if that doesn't work, `pgrep --ns 1 --nslist uts ^apache2$`).

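What follows is a rather long explanation about namespaces, with an example of how this can be done by hand, which I worked out before I found pgrep: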

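Once you have the "candidates" from pidof, here is a way to tell them apart: check their namespaces and compare them with the namespaces of PID 1 (init/systemd). The example uses LXC and an inetd process, but the method does not depend on the container technology or on the process name: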

    # lxc-start stretch-amd64
    # pidof inetd
    10285 3372
    # ls -l /proc/1/ns/
    total 0
    lrwxrwxrwx. 1 root root 0 nov.  9 19:49 cgroup -> cgroup:[4026531835]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:49 ipc -> ipc:[4026531839]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:49 mnt -> mnt:[4026531840]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:49 net -> net:[4026531993]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:49 pid -> pid:[4026531836]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:49 pid_for_children -> pid:[4026531836]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:49 user -> user:[4026531837]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:49 uts -> uts:[4026531838]
    # ls -l /proc/3372/ns/
    total 0
    lrwxrwxrwx. 1 root root 0 nov.  9 19:51 cgroup -> cgroup:[4026531835]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:51 ipc -> ipc:[4026531839]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:51 mnt -> mnt:[4026531840]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:51 net -> net:[4026531993]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:51 pid -> pid:[4026531836]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:51 pid_for_children -> pid:[4026531836]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:51 user -> user:[4026531837]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:51 uts -> uts:[4026531838]
    # ls -l /proc/10285/ns/
    total 0
    lrwxrwxrwx. 1 root root 0 nov.  9 19:50 cgroup -> cgroup:[4026532516]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:50 ipc -> ipc:[4026532415]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:50 mnt -> mnt:[4026532410]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:50 net -> net:[4026532418]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:50 pid -> pid:[4026532416]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:50 pid_for_children -> pid:[4026532416]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:50 user -> user:[4026531837]
    lrwxrwxrwx. 1 root root 0 nov.  9 19:50 uts -> uts:[4026532414]

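Here it is clear that PID 3372 shares all of PID 1's namespaces, so 3372 is running on the host. PID 10285 shares none of them (except user, which is the same because the container runs as root), so it is in a container. Some programs running on the host may occasionally have a few namespaces changed for some reason (usually security-related), but the uts (hostname) namespace should not be one of them. So here is a script using stat which, given a process name in "$1" (e.g. via `set -- inetd` or as the script's argument), prints only the processes in the same uts namespace as PID 1, which usually means the same host.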

    pid1uts="$(stat -c %N /proc/1/ns/uts|cut -d' ' -f3)"
    for i in $(pidof "$1"); do
        if [ "$pid1uts" = "$(stat -c %N /proc/$i/ns/uts|cut -d' ' -f3)" ]; then
            echo $i
        fi
    done | xargs -r

In my example, this returns 3372.
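As a variation on the script above, the same comparison can be done with readlink, which prints the link target directly and avoids parsing the quoted output of `stat -c %N`. This is only a sketch: the function name `same_uts_as_init` and the `PROC_ROOT` override (defaulting to /proc) are hypothetical names introduced here for illustration, and the PIDs are passed as arguments rather than looked up by name.

```shell
#!/bin/sh
# Print those PIDs (given as arguments) whose uts namespace matches PID 1's.
# PROC_ROOT is a hypothetical override for the proc mount point; it
# defaults to /proc and exists only to make the function easy to try out.
same_uts_as_init() {
    proc="${PROC_ROOT:-/proc}"
    # The link target looks like "uts:[4026531838]".
    init_uts="$(readlink "$proc/1/ns/uts")" || return 1
    for pid in "$@"; do
        if [ "$(readlink "$proc/$pid/ns/uts" 2>/dev/null)" = "$init_uts" ]; then
            echo "$pid"
        fi
    done
}
```

It could be called as `same_uts_as_init $(pidof inetd)`. Reading /proc/1/ns/uts normally requires root, as in the listings above.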

That explains how to do it by hand, but why reinvent the wheel when pgrep has options that handle this:

    # pgrep ^inetd$
    3372
    10285
    # pgrep --ns 1 --nslist uts ^inetd$
    3372

Or, for most cases:

    # pgrep --ns 1 ^inetd$
    3372
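Tying this back to the init script in the question, the check could be restructured roughly as follows. This is a sketch under assumptions, not the actual Debian/Ubuntu script: `classify_pids` is a hypothetical helper, the values 2 and 1 mirror the `AP_RET` values in the quoted excerpt, and the value 0 for "no matching process" is an assumption because the quoted script is truncated.

```shell
#!/bin/sh
# classify_pids PIDFILE [PID...]
# Prints 2 if PIDFILE exists and names one of the given PIDs,
#        1 if PIDs were given but none matches the pid file,
#        0 if no PIDs were given (assumed value; not in the excerpt).
classify_pids() {
    pidfile="$1"
    shift
    if [ "$#" -eq 0 ]; then
        echo 0
        return
    fi
    # -x matches whole lines and -F disables regex interpretation,
    # so PID 123 never matches 1234.
    if [ -e "$pidfile" ] &&
       printf '%s\n' "$@" | grep -qxF "$(cat "$pidfile")"; then
        echo 2
    else
        echo 1
    fi
}

# Possible use inside the init script, with host-only PIDs from pgrep:
#   AP_RET="$(classify_pids "$PIDFILE" $(pgrep --ns 1 '^apache2$'))"
```

Because only PIDs in PID 1's namespaces are passed in, Apache processes inside containers would no longer trigger the "do not match your pid file" branch.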

If your service on the host listens on port 80, you can find its process ID with netstat:

    # netstat -plan | grep :80

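Container processes should be bound to other port numbers on the host, and to port 80 only inside the container. So you can easily identify the host process and kill it.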
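One caveat is that `grep :80` also matches ports such as 8080 or 8000. A slightly stricter filter can be sketched with awk, assuming the usual Linux net-tools column layout (local address in column 4, state in column 6, PID/program in column 7). The function name `pids_listening_on` is hypothetical, and it reads netstat output on stdin.

```shell
#!/bin/sh
# pids_listening_on PORT
# Reads `netstat -plnt` style output on stdin and prints the PID of each
# TCP socket in LISTEN state whose local address ends in ":PORT".
pids_listening_on() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") {
            split($7, prog, "/")   # column 7 looks like "1234/apache2"
            print prog[1]
        }
    '
}

# Possible use (as root, so that the PID column is filled in):
#   netstat -plnt | pids_listening_on 80
```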