Today I had to track down the cause of an issue on one of our servers: shortly after a restart, requests would start to hang and the number of Apache processes would grow large very quickly.
I started out using Apache’s mod_status to get some details about the state of each process.
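For a quick tally instead of eyeballing the status page, the machine-readable view of mod_status (the `?auto` variant) includes a `Scoreboard:` line with one character per worker slot, where `W` means "Sending Reply" and `_` means "Waiting for Connection". A minimal sketch, assuming that output is piped in on stdin; the function name and the URL in the usage comment are placeholders of mine, not taken from this server's configuration:

```shell
# Count worker slots in a given state from mod_status "?auto" output.
# Reads the status text on stdin; $1 is the single scoreboard character
# to count (for example W = sending reply, _ = waiting for connection).
count_scoreboard_state() {
    awk -v c="$1" '
        /^Scoreboard: / {
            # "Scoreboard: " is 12 characters long, so the slots start at 13.
            slots = substr($0, 13)
            for (i = 1; i <= length(slots); i++)
                if (substr(slots, i, 1) == c) n++
        }
        END { print n + 0 }
    '
}

# Hypothetical usage (the URL is an assumption about the local setup):
#   curl -s 'http://localhost/server-status?auto' | count_scoreboard_state W
```

Running this every few seconds after a restart should make it obvious whether the number of workers stuck in one state keeps climbing.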
I noticed that many of the processes ended up in the “W”, or “Sending Reply”, state. I chose a random Apache process and fired up ‘strace’ to try to get some more information:
server7:/root# strace -p 11574
Process 11574 attached - interrupt to quit
flock(26, LOCK_EX <unfinished ...>
This process was stuck waiting for an exclusive lock on whatever file was open as descriptor 26. I used ‘readlink’ to find out the name of the file in question:
server7:/root# readlink /proc/11574/fd/26
/mnt/Pages/xml/0/1/list1055.xml
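The same /proc lookup works for every descriptor a process holds, which helps when the strace output mentions more than one descriptor number. A rough sketch, assuming Linux's /proc layout and enough privileges to read the target process; `list_fd_paths` is a hypothetical helper name:

```shell
# Print "fd<TAB>path" for every open file descriptor of a process.
# $1 is the PID. Relies on Linux /proc and on permission to read the
# target's fd directory (root, or the same user as the process).
list_fd_paths() {
    pid=$1
    for fd_link in /proc/"$pid"/fd/*; do
        # If the glob did not expand (missing or unreadable directory), skip it.
        [ -L "$fd_link" ] || continue
        printf '%s\t%s\n' "${fd_link##*/}" "$(readlink "$fd_link")"
    done
}

# Hypothetical usage with the PID from the strace session:
#   list_fd_paths 11574
```

Sockets and pipes show up as names like `socket:[...]` rather than paths, so only regular files resolve to something you can hand to other tools.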
Once I had the name of the file I used ‘lsof’ to see if there were any other processes trying to access that file as well:
server7:/root# lsof | grep list1055.xml
httpd 11574 nobody 26w REG 0,31 4232 925874559 /mnt/Pages/xml/0/1/list1055.xml (storage1.npr.org:/files/data)
httpd 11579 nobody 26w REG 0,31 4232 925874559 /mnt/Pages/xml/0/1/list1055.xml (storage1.npr.org:/files/data)
httpd 11629 nobody 26w REG 0,31 4232 925874559 /mnt/Pages/xml/0/1/list1055.xml (storage1.npr.org:/files/data)
Here we have several other httpd processes with the same file open for writing, most likely waiting for an exclusive lock on it as well.
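To pull just the process IDs out of lsof output like the listing above, a small filter is enough. A rough sketch, assuming the default lsof column layout where the PID is the second field; `pids_with_file` is a hypothetical helper name:

```shell
# Print the unique PIDs from lsof output (read on stdin) for lines that
# mention the given path fragment ($1). Assumes the default lsof columns:
# COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME.
pids_with_file() {
    awk -v frag="$1" 'index($0, frag) { print $2 }' | sort -n -u
}

# Hypothetical usage:
#   lsof | pids_with_file list1055.xml
```

lsof also has a `-t` option that prints only PIDs, so when the full path is known, `lsof -t /path/to/file` can replace the filter. Each PID can then be handed to ‘strace -p’ to confirm it is blocked in the same flock() call.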
At this point it appears as though a recent code change may be the cause of this issue. However, a closer look at the recent source code commits will be required to know for sure.