So, you think you have a PHP session problem?
The symptoms
You’re running an Apache + PHP stack, and every now and then Apache just stops. If you try curl’ing your host, you get something like this:
$ curl -Ivv api.something.com
* Rebuilt URL to: api.something.com/
* Trying xxx.xxx.xxx.xxx...
* Connected to api.something.com (xxx.xxx.xxx.xxx) port 80 (#0)
> HEAD / HTTP/1.1
> Host: api.something.com
> User-Agent: curl/7.43.0
> Accept: */*
>
…and nothing more.
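If you'd rather not stare at a hung terminal, you can give curl a deadline. The following is a minimal sketch, not part of the original investigation: `classify_curl_exit` and `probe` are hypothetical helper names, and the 10 second default is an arbitrary choice. It relies on curl's `--max-time` option and on curl exiting with status 28 when that limit is reached.

```shell
#!/usr/bin/env bash
# Sketch: turn "curl hangs forever" into a bounded check.
# classify_curl_exit and probe are hypothetical helper names.

# Map a curl exit status to a short verdict.
classify_curl_exit() {
  case "$1" in
    0)  echo "responded" ;;
    7)  echo "could not connect" ;;
    28) echo "timed out" ;;
    *)  echo "other failure (exit $1)" ;;
  esac
}

# Send a HEAD request and give up after a number of seconds (default 10).
probe() {
  local url="$1" timeout="${2:-10}" status=0
  curl -sS -o /dev/null -I --max-time "$timeout" "$url" || status=$?
  classify_curl_exit "$status"
}

# Example: probe api.something.com 5
```

With the symptoms above (connection accepted, request sent, no reply), this should report "timed out" rather than sitting there indefinitely.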
The diagnosis
First, let’s look at which Apache processes you have. Get a console on the server and run ps -ax | grep apache. You’ll get a response like this:
# ps -ax | grep apache
3015 ? Ss 0:00 /usr/sbin/apache2 -k start
3019 ? S 0:00 /usr/sbin/apache2 -k start
3020 ? S 0:00 /usr/sbin/apache2 -k start
3026 ? S 0:00 /usr/sbin/apache2 -k start
3028 ? S 0:00 /usr/sbin/apache2 -k start
3033 pts/0 S+ 0:00 grep --color=auto apache
The first five lines are the ones we’re interested in (the last one is the grep command itself). The left column is the process ID of each Apache process.
Next, let’s see what each of those processes is doing. In my case, I started with process 3015. Run the command strace -p 3015, substituting your first process ID for 3015.
When I did this, I saw the following:
# strace -p 3015
Process 3015 attached
select(0, NULL, NULL, NULL, {0, 515464}) = 0 (Timeout)
wait4(-1, 0x7fffd736a4e4, WNOHANG|WSTOPPED, NULL) = 0
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
wait4(-1, 0x7fffd736a4e4, WNOHANG|WSTOPPED, NULL) = 0
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
Those two lines were repeated over and over again. That looks fine; it’s definitely not hung. (To be honest, this is expected: 3015 is Apache’s parent process. But it’s a good illustration of what a healthy process looks like.) Hit Ctrl-C to exit.
Next up, 3019.
# strace -p 3019
Process 3019 attached
flock(29, LOCK_EX
Whoa. The output doesn’t change. This process looks hung. The flock is also a big clue: the process is trying to get an exclusive lock (LOCK_EX) on a file. The 29 is the file descriptor, which tells us which file it is. I hit Ctrl-C to exit.
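Attaching strace to each worker by hand gets tedious when there are dozens of them. As a rough alternative (not what I did here), Linux exposes the kernel function a sleeping process is waiting in through /proc/PID/wchan. The sketch below assumes Linux with /proc mounted and enough privileges to read other users' processes; `wait_channels` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: show where each process is sleeping in the kernel, without strace.
# Assumes Linux with /proc mounted; wait_channels is a hypothetical helper name.

# Print "PID<TAB>wait channel" for every PID given as an argument.
wait_channels() {
  local pid wchan
  for pid in "$@"; do
    # /proc/PID/wchan names the kernel function the process is blocked in;
    # fall back to "?" when it is missing or unreadable.
    wchan=$(cat "/proc/$pid/wchan" 2>/dev/null) || wchan="?"
    printf '%s\t%s\n' "$pid" "${wchan:-?}"
  done
}

# Example (as root): wait_channels $(pgrep apache2)
```

I'd expect workers stuck on the same lock to report the same lock-related symbol, though the exact name depends on the kernel version, so treat this as a quick triage step and confirm with strace.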
Let’s look for the files that are used by process 3019. You can use the command lsof -p 3019, and narrow down the results using grep 29, because we know (from the strace output) that we’re looking for file descriptor 29. I saw the following:
# lsof -p 3019 | grep 29
apache2 3019 www-data mem REG 252,0 295816 3414376 /usr/lib/x86_64-linux-gnu/libhx509.so.5.0.0
apache2 3019 www-data mem REG 252,0 109296 3414384 /usr/lib/x86_64-linux-gnu/libsasl2.so.2.0.25
apache2 3019 www-data mem REG 252,0 290520 3409846 /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2
apache2 3019 www-data mem REG 252,0 97296 22545808 /lib/x86_64-linux-gnu/libnsl-2.19.so
apache2 3019 www-data mem REG 252,0 108736 4982329 /usr/lib/apache2/modules/mod_proxy.so
apache2 3019 www-data mem REG 252,0 18440 4982296 /usr/lib/apache2/modules/mod_mime.so
apache2 3019 www-data mem REG 252,0 22544 4982295 /usr/lib/apache2/modules/mod_authz_core.so
apache2 3019 www-data 0r CHR 1,3 0t0 1029 /dev/null
apache2 3019 www-data 1w CHR 1,3 0t0 1029 /dev/null
apache2 3019 www-data 15w REG 252,0 72996896 14161104 /var/log/apache2/api.release.error.log
apache2 3019 www-data 29uW REG 252,0 50 14158656 /var/lib/php5/sess_g6evsj4nbuainfqi1bvlo08323
That last line, for file /var/lib/php5/sess_g6evsj4nbuainfqi1bvlo08323, is the PHP session file.
So, let’s see which processes are using that file. You can do this with the fuser command.
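The grep 29 filter is blunt: it matches any line containing "29", which is why library sizes and inode numbers show up in the output as well. A more direct route, sketched here under the assumption of a Linux system with /proc mounted, is to resolve the descriptor itself. `fd_target` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: resolve one file descriptor of one process to a path.
# Assumes Linux with /proc mounted; fd_target is a hypothetical helper name.

fd_target() {
  local pid="$1" fd="$2"
  # Each entry in /proc/PID/fd is a symlink to the open file.
  readlink "/proc/$pid/fd/$fd"
}

# Example (as root or the process owner): fd_target 3019 29
# Equivalent with lsof, where -a ANDs the -p and -d filters:
#   lsof -a -p 3019 -d 29
```

Either form should return only the entry for descriptor 29, which in my case was the session file.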
# fuser /var/lib/php5/sess_g6evsj4nbuainfqi1bvlo08323
/var/lib/php5/sess_g6evsj4nbuainfqi1bvlo08323: 3019 3020 3026 3028 3037 3041 3042 3043 3055 3056 3057 3059 3060 3061 3062 3063 3065 3066 3067 3103 3104 3105 3106 3107 3108 3109 3110 3111 3113 3114 3119 3120 3121 3126 3127 3128 3129 3131 3132 3133 3134 3135 3136 3138 3139 3140 3143 3144 3145 3146 3147 3148 3149 3150 3151 3152 3153 3154 3155 3157 3158 3159 3162 3163 3164 3166 3167 3168 3169 3170 3171 3172 3173 3174 3175 3178 3181 3183 3185 3186
In this case, we have a PHP session problem. Dozens of Apache processes have the same session file open, and PHP isn’t releasing the lock on it, so all of those requests are hanging.
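To see the mechanism in isolation: PHP's default file-based session handler takes an exclusive flock on the session file and holds it until the session is closed, so a second request for the same session has to wait. The sketch below imitates that with the flock utility from util-linux (assumed to be installed). The temporary file stands in for the session file, and `demo_session_lock` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch: reproduce "second request blocks on the session lock" with flock(1).
# Assumes the flock utility from util-linux; demo_session_lock is a hypothetical name.

demo_session_lock() {
  local session_file
  session_file=$(mktemp)   # stand-in for a PHP session file

  # "Request 1": open the file on descriptor 9 and take an exclusive lock.
  exec 9>"$session_file"
  flock -x 9

  # "Request 2": try to take the same lock without waiting (-n).
  if flock -n "$session_file" true; then echo "acquired"; else echo "would block"; fi

  # "Request 1" releases the lock; now the second attempt succeeds.
  flock -u 9
  if flock -n "$session_file" true; then echo "acquired"; else echo "would block"; fi

  exec 9>&-
  rm -f "$session_file"
}

# Example: demo_session_lock
```

Running `demo_session_lock` prints "would block" and then "acquired". Without the -n flag, the second attempt would simply wait, which is the state the Apache workers above are stuck in.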