UBB.Dev
Hi

UBB Threads was installed this weekend on a new dedicated server at http://www.koptalk.org for my busy site. I decided to dump my old site in favour of UBB Threads, and everything was working fine until today.

Allen installed the beta over the weekend until the final release came out and I had IIP running on the front page with most features enabled.

Today, the site went down. I spoke to Digital Princeton, who have been great with their support. This is their latest email:

Here is info you can send to your UBB person. Some script is recursively
calling to an httpd process:

httpd in free(): warning: recursive call
httpd in free(): warning: recursive call
httpd in free(): warning: recursive call
httpd in free(): warning: recursive call
httpd in free(): warning: recursive call
httpd in free(): warning: recursive call
httpd in free(): warning: recursive call
httpd in free(): warning: recursive call
httpd in free(): warning: recursive call
httpd in free(): warning: recursive call
httpd in free(): warning: recursive call
httpd in free(): warning: page is already free
httpd in free(): warning: page is already free
httpd in free(): warning: page is already free
httpd in free(): warning: page is already free
httpd in free(): warning: page is already free
httpd in free(): warning: page is already free
httpd in free(): warning: page is already free


Email me back and as soon as I get it, I'll turn PSA back on and you can go
in and shut down Who's Online and see if that helps. I'll watch the load
closely to see what's going on.

------

Any ideas what is happening? The guy thinks that the who's online feature could be causing it. I have emailed Allen but as I am in England I have no idea what the time difference is.

Thanks very much for any help :-)


Hi Duncan,

Welcome to ThreadsDev - even if you are a Liverpool supporter

I assume that you are on a Plesk account as he refers to PSA - the latest beta version runs fine on plesk - has anything been altered/customized on the site apart from IIP?

Is this the only application running on your site, or are there other scripts outside of threads?

Are there any visible errors when the site is up?
Hi - I know I'm unlucky to be a Red

Mike @ the hosting company said:

"The site is hosted on a dedicated box with Plesk 5.0.5. What needs to be customised as far as IIP goes? This is also the only script running on the server. No visible errors when the site is running; the server load increases over a span of about 10 min and the server crashes. There has to be some PHP script recursively calling to apache | httpd and not closing spawns. Any ideas?"

There have been no other mods as far as I am aware.

At one point they turned the server back on for me so I could quickly log in and disable the various IIP features. I just left the shoutbox, couple of custom boxes, couple of news items, hot topic and search options on. The site still went off again after just a few minutes.

Thanks - I appreciate your help.


Just received this from the hosts:

Good News,

You were not hacked. I have a feeling it's a UBB coding problem:

They don't seem to make use of some PHP commands that improve performance. You
can tell them about:

http://www.php.net/manual/en/ref.apache.php
and
http://www.php.net/manual/en/function.apache-child-terminate.php

Might be an issue they overlooked.

cust691506# ./chkrootkit
ROOTDIR is `/'
Checking `amd'... not infected
Checking `basename'... not infected
Checking `biff'... not infected
Checking `chfn'... not infected
Checking `chsh'... not infected
Checking `cron'... not infected
Checking `date'... not infected
Checking `du'... not infected
Checking `dirname'... not infected
Checking `echo'... not infected
Checking `egrep'... not infected
Checking `env'... not infected
Checking `find'... not infected
Checking `fingerd'... not infected
Checking `gpm'... not found
Checking `grep'... not infected
Checking `hdparm'... not found
Checking `su'... not infected
Checking `ifconfig'... not infected
Checking `inetd'... not infected
Checking `inetdconf'... not infected
Checking `identd'... not found
Checking `init'... not infected
Checking `killall'... not infected
Checking `ldsopreload'... not tested
Checking `login'... not infected
Checking `ls'... not infected
Checking `lsof'... not found
Checking `mail'... not infected
Checking `mingetty'... not found
Checking `netstat'... not infected
Checking `named'... not infected
Checking `passwd'... not infected
Checking `pidof'... not found
Checking `pop2'... not found
Checking `pop3'... not found
Checking `ps'... not infected
Checking `pstree'... not found
Checking `rpcinfo'... not infected
Checking `rlogind'... not infected
Checking `rshd'... not infected
Checking `slogin'... not infected
Checking `sendmail'... not infected
Checking `sshd'... not infected
Checking `syslogd'... not infected
Checking `tar'... not infected
Checking `tcpd'... not infected
Checking `tcpdump'... not infected
Checking `top'... not infected
Checking `telnetd'... not infected
Checking `timed'... not infected
Checking `traceroute'... not infected
Checking `w'... not infected
Checking `write'... not infected
Checking `aliens'... no suspect files
Searching for sniffer's logs, it may take a while... nothing found
Searching for HiDrootkit's default dir... nothing found
Searching for t0rn's default files and dirs... nothing found
Searching for t0rn's v8 defaults... nothing found
Searching for Lion Worm default files and dirs... nothing found
Searching for RSHA's default files and dir... nothing found
Searching for RH-Sharpe's default files... nothing found
Searching for Ambient's rootkit (ark) default files and dirs... nothing found
Searching for suspicious files and dirs, it may take a while... nothing found
Searching for LPD Worm files and dirs... nothing found
Searching for Ramen Worm files and dirs... nothing found
Searching for Maniac files and dirs... nothing found
Searching for RK17 files and dirs... nothing found
Searching for Ducoci rootkit... nothing found
Searching for Adore Worm... nothing found
Searching for ShitC Worm... nothing found
Searching for Omega Worm... nothing found
Searching for Sadmind/IIS Worm... nothing found
Searching for MonKit... nothing found
Searching for Showtee... nothing found
Searching for OpticKit... nothing found
Searching for T.R.K... nothing found
Searching for Mithra... nothing found
Searching for OBSD rk v1... nothing found
Searching for LOC rootkit ... nothing found
Searching for Romanian rootkit ... nothing found
Searching for anomalies in shell history files... nothing found
Checking `asp'... not infected
Checking `bindshell'... not infected
Checking `lkm'... Checking `rexedcs'... not found
Checking `sniffer'... not tested: can't exec ./ifpromisc
Checking `wted'... not tested: can't exec ./chkwtmp
Checking `scalper'... not infected
Checking `slapper'... not infected
Checking `z2'... not tested: can't exec ./chklastlog


----

"also found some reference to php/apache things I felt might be causing the problem, but only UBB experts can answer what the problem is. "

----

All of the above is from the hosts.
I enabled my PMs as you requested, but your post above has since been deleted. PMs are enabled anyway.
Can you provide some details about your server?
-PHP Version
-MySQL Version
-Apache Version
-OS
-RAM
-CPU

Some details about your ubb.threads installation:
-version
-size of the database
-maximum concurrent users online
-average users online
-are you using persistent connections or not?


Does the problem still occur if you disable IIP?
Do you cache your IIP or not? If not, enable caching.
Sent your q's to the host who replied:

P4 1.8 GHz
512 MB RAM
40 GB 7,200 rpm HD
Plesk Psa 5.0.5
the os type is freebsd

Could you get more info about this new IIP thing and what it does for me.

---

Allen Ayres will have all the answers to your questions. I'm not tech minded at all but was hoping there might be an easy answer that could have resolved this until I heard from him

Allen installed the latest beta of Threads as I wanted to get something on before the latest release. I think around 1200 people registered over the Bank Holiday weekend. So far, over the quiet period of the weekend, there were around 50 members and 170 guests online - this would have increased today I guess.

I don't know how to disable IIP or change anything on it other than the left/centre/right off options etc.

Threads is totally new to me and I have no experience of it having previously only used UBB Classic.

Threads was going to be used as my main news site and I had spent all weekend adding various categories, forums and content. What should have been my first day today for everyone to have a good look around has been a disaster.



Duncan.

Are you on a shared account, or on a dedicated server? I am assuming shared, otherwise your host would not have turned PSA off.
it's a dedicated server
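A recursive call is when a function calls itself, I believe.

function check($depth = 0) {
    // Base case: stop after a few levels. Without it the function
    // would keep calling itself until PHP ran out of memory.
    if ($depth >= 3) {
        return true;
    }

    // The recursive call: check() calls itself.
    if (check($depth + 1)) {
        echo "it's true";
    }

    return true;
}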


Or something like that. I can't think of anything in IIP that would cause this. I'm not 100% sure but I don't believe I have any specific functions in IIP. It uses the .threads functions and classes. But I'm not 100% sure that it's not IIP causing your problems.

You should remove IIP and see if that corrects the problem. Knowing the version of Apache and MySQL and PHP will help others when trying to figure this out.
I guess I'd best wait for Allen to email me.... hope he hasn't gone on holiday
Hosts:

"I'm 100% positive that it's a PHP function recursively calling httpd spawns. I've programmed in PHP/MySQL for over 4 years now.

PLESK PSA 5.0.5
Apache: 1.3.27
MySQL: 3.23.55

Some people are going to probably tell you that those are not the latest builds, but these are the latest builds that PLESK uses and they are all stable and secure."

----

Does this help at all?
Yes, wait for Allen. He knows what he has done, has access to your server and he is familiar with IIP and ubb.threads.
[]Astaran said:
Yes, wait for Allen. He knows what he has done, has access to your server and he is familiar with IIP and ubb.threads. [/]

Yes. I only called in here in case he was kicking about and in case it was just a simple problem that I could rectify somehow, even though I am no Threads expert

Thanks anyway guys...I've been a reader of the forums for some time.
heheh I just got back in early this am and am supposed to be working today

IIP was the only modification added, hmmmm... maybe rick might have an answer.
[]
"I'm 100% positive that it's a PHP function recursively calling httpd spawns. I've programmed in PHP/MySQL for over 4 years now. [/]

There are only 2 spots that do recursive function calls: postlist.php and showthreaded.php. But that is only to do post threading; they definitely don't create any new httpd processes. That would need to be done with a recursive call that maybe called a header("Location: ..."); or something of that nature, which definitely isn't anything that .threads does.
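
To illustrate the point about post threading, here is a minimal sketch, written in Python rather than the PHP that .threads uses, of how a recursive function can walk a tree of replies. The `Post` class and `render_thread` function are hypothetical names made up for this example; this is not the actual postlist.php or showthreaded.php code. The recursion runs entirely inside the one process handling the request and stops by itself when a post has no replies.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Post:
    """Hypothetical post record; not the real .threads schema."""
    subject: str
    replies: List["Post"] = field(default_factory=list)


def render_thread(post: Post, depth: int = 0) -> List[str]:
    """Return one indented line per post, visiting replies recursively."""
    lines = ["  " * depth + post.subject]
    for reply in post.replies:
        # The function calls itself once per reply. The recursion ends
        # naturally at posts without replies, and it never starts a new
        # process or issues a new HTTP request.
        lines.extend(render_thread(reply, depth + 1))
    return lines


thread = Post("Site down", [
    Post("Re: Site down", [Post("Re: Re: Site down")]),
    Post("Any errors?"),
])
print("\n".join(render_thread(thread)))
```

The depth of the recursion is bounded by how deeply replies are nested, so it cannot by itself multiply httpd children.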

I did a quick search on deja and found a lot of posts from users having these problems on specific OS's and php.
Found a couple more relating to FreeBSD and persistent mysql connections. Allen was going to make sure these were turned off.
[]Any ideas what is happening? The guy thinks that the who's online feature could be causing it.[/]

The who's online screen only refreshes once every 60 seconds by default. So unless this script went totally berserk and decided to refresh over and over with no delay, this wouldn't cause the problem.
Now I just have to wait until Allen gets home from work
home for a little longer, then another meeting for a couple hours... it looks stable right now. I'll be hanging around tonight to see how it goes
Try the domain www.koptalk.org and the entrance link, think they need changing back Allen? It will be stable now as there is no traffic here - it's 11.49pm

The minute I say the site is open I expect the same will happen again. The hosts rebooted the machine several times and it was fine for about ten minutes, then crashed each time. Because of the time difference it's probably hard to work out the peak times etc to compare with what you are used to.

The site was fine this time last night etc, it was just in the morning when the traffic picked up that it went pear shaped and crashed leaving me sat all day scratching my head

I wonder what's causing this?
We'll turn IIP options on gradually... for now make sure 'welcome newest member' and 'featured member' are turned off, I remember someone saying how those were hard on the server... we'll turn the debugging info on at the bottom and watch.
The problems are very similar to those a site on my server had. That had nothing to do with threads though; it was a badly written loop that went on indefinitely. And this code was run on almost every page on the site, which is very active. I spent ages trying to find the problem, asking the site owner several times if they had changed anything, and they promised me that they had done nothing. Had to get outside help eventually and he managed to find the problem script.
Thanks Gardener

This is the only script on the site right now... possibly it was a fluke table-lockup, we'll know tomorrow when there's 3-400 people online
Seems to be doing well with 145 people online right now.. will check back after some :zzz:
It's just gone down again

It was working fine with around 350 on at one point and most of the IIP disabled as you know Allen.

It was going fine until around 12.20pm UK time and then I just got the screen below. I have emailed support at the hosts again:

Warning: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (61) in /usr/local/psa/home/vhosts/koptalk.org/httpdocs/forum/mysql.inc.php on line 35

Warning: MySQL Connection Failed: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (61) in /usr/local/psa/home/vhosts/koptalk.org/httpdocs/forum/mysql.inc.php on line 35
SQL ERROR: Database error only visible to forum administrators

Warning: mysql_select_db(): supplied argument is not a valid MySQL-Link resource in /usr/local/psa/home/vhosts/koptalk.org/httpdocs/forum/mysql.inc.php on line 45

Warning: mysql_query(): supplied argument is not a valid MySQL-Link resource in /usr/local/psa/home/vhosts/koptalk.org/httpdocs/forum/mysql.inc.php on line 105
SQL ERROR: Database error only visible to forum administrators

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /usr/local/psa/home/vhosts/koptalk.org/httpdocs/forum/mysql.inc.php on line 131

Warning: mysql_query(): supplied argument is not a valid MySQL-Link resource in /usr/local/psa/home/vhosts/koptalk.org/httpdocs/forum/mysql.inc.php on line 105
SQL ERROR: Database error only visible to forum administrators

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /usr/local/psa/home/vhosts/koptalk.org/httpdocs/forum/mysql.inc.php on line 131

Warning: mysql_query(): supplied argument is not a valid MySQL-Link resource in /usr/local/psa/home/vhosts/koptalk.org/httpdocs/forum/mysql.inc.php on line 105
SQL ERROR: Database error only visible to forum administrators

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /usr/local/psa/home/vhosts/koptalk.org/httpdocs/forum/mysql.inc.php on line 131
It's working now with about 250 online.
Duncan - as this is a new server, you might like to ask your hosting company, why you are on a standard copy of PSA and not an RPM version. Plesk are dropping the standard version, and moving over entirely to RPM's. Also as your paths are different, it will give you headaches with regards to upgrades & patches.

Plesk 6 will be out soon, so you might have problems upgrading - worth enquiring now, about having the RPM version of 5.0.5.
Just checked in on the site. Chugging along fine with about 500 people online right now.
Hi guys

This is the latest email from Digital Princeton who have been great so far:

Dunk,

We run a custom kernel FreeBSD that is made to handle a very high server load. From what I see in the logs there are 2 problems. The first is that there is some PHP recursively calling itself and spawning new processes which don't get closed. This is what is taking up all the RAM and killing the box. The second is that there is no my.cnf for MySQL; this is something that you have to have Allen or whoever make. It's not really needed unless you like to tweak MySQL. I have other customers on 500 MHz boxes with 512 MB RAM running threads and they have 600+ users online at any given time.

As Mike said to you, there is definitely some script killing the box. He's the PHP master, and I would have Allen or Infopop look at that.

We can add another 512 MB RAM to your box. Mike also suggested that HD access times might be causing a little bit of an issue also, but not much.

We have some SCSI servers coming in Monday and could move you to one of them if you like. They are a little more pricey though. I would be able to cut you a deal on one of them if you wanted to try that.

I want to have your site up and flying, but I feel right now it's software more than hardware, and I could put you on a DUAL XEON and it would still die.

I just had the box rebooted again; the load goes to 5.0 after less than a minute on. When I show a process list, there are 1,000+ httpd processes running, using .05% CPU. There are way too many processes running because they are not closing; they are taking up the CPU and never giving it back to the system, so it runs out and then dies.

When you had the first page off yesterday the server ran just fine. Could you just run the threads and not all the other stuff for a day and see how that goes? Have the index page go right to the forum.

Steve Klenert
Could a news site that checks my site every 5 minutes for updates cause any of this?

When I update a news item I edit a page which can be found here:

http://www.koptalk.com/regulars/newsnow.shtml

Those items then appear at:

http://www.newsnow.co.uk/newsfeed/?name=Liverpool

It was just a thought.

Threads is great when it's working fine. I just wish I could solve this. I wonder what this spawning thing is? The majority of IIP features are turned off.

I don't mind upgrading the box if it will resolve the problem but as Steve says it wouldn't.

Could it be the beta?

I'd like to feed my kids next month, please help me out guys as this could destroy my business...Grrrr...I'll buy you a beer and treat you if I win the lottery

I wish there was something I could do to fix it but I'm so lame it's untrue
The 6.3 beta and the stable 6.2.3 version aren't too much different. At least not enough to cause these types of problems. To try and isolate this I'd do as he suggests. First, totally disable the IIP, just have the index be the forums. Then just to be safe you might want to disable this news checker you're referring to.

1000+ httpd connections is definitely not normal as far as I'm aware and, like I said earlier, there is nothing in threads that does recursive calls that would spawn additional httpd children.
Perhaps you should move to using UBB.threads 6.2.3 without IIP and see how things go. On the other machines your host mentions handling 600+ users with .threads what version are they running for software?

Just a thought...
I'll see what Allen says when he's online later today...the time difference is a killer for me

The news checker thing just checks that one page to see if I have provided any new updates (I manually add each item whenever a new news item has been created/published).

As I'm not tech minded I have no idea what 1000+ httpd connections are

Such is life eh
How does this news checker work? You said it checks every 5 minutes? Does it use Conditional Gets? Perhaps the script doing the fetching isn't closing the connection after each fetch?

Just more guessing...
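
For anyone unfamiliar with the term: with a conditional GET, the spider sends the timestamp of the copy it already holds in an `If-Modified-Since` header, and the server can answer `304 Not Modified` with an empty body when the page has not changed. Below is a minimal sketch of that server-side decision in Python. The function name `respond` and the placeholder body are hypothetical, and nothing here says how the NewsNow spider actually behaves.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple


def respond(last_modified: datetime, if_modified_since: Optional[str]) -> Tuple[int, str]:
    """Return (status, body) for a page last changed at last_modified (UTC)."""
    if if_modified_since:
        try:
            client_copy = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            client_copy = None  # unparseable header: send the full page
        if client_copy is not None:
            if client_copy.tzinfo is None:
                # Treat a date without a zone as UTC so the comparison is valid.
                client_copy = client_copy.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution.
            if last_modified.replace(microsecond=0) <= client_copy:
                return 304, ""
    return 200, "<full page body>"


changed = datetime(2003, 5, 30, 10, 9, tzinfo=timezone.utc)
header = format_datetime(changed, usegmt=True)
print(respond(changed, None))    # first fetch, no header sent
print(respond(changed, header))  # repeat fetch, page unchanged
```

A 304 still costs one request, so this reduces the bytes sent per check but not the number of connections the spider opens.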
To be honest I'm not sure. They just asked me to provide a hidden page which is the newsnow.shtml link and they spider it every 5 minutes or so. I've always used this system on my normal site and I may be way off here. I'm maybe clutching at straws.
[]Duncan said:
Hosts:
PLESK PSA 5.0.5
Apache: 1.3.27
MySQL: 3.23.55

Some people are going to probably tell you that those are not the latest builds, but these are the latest builds that PLESK uses and they are all stable and secure."
[/]

There's a whole page of bugfixes and security fixes for 3.23.56 released more than 2 months ago:
http://www.mysql.com/doc/en/News-3.23.56.html - including some issues with database corruption. Just because it's what's in the released version of plesk doesn't mean it's the best option for you... you are the one paying the bill.
11.39am in the UK and it's just gone again...this is doing my shed in
Re-booted (again) 500+ members online browsing.

last chat by IM with hosts:

spiders could be doing that and they create httpd requests and its not like a browser that someone closes
and 5 min could be to short of an interval and the process never closes on its own.
its taking much longer for the box to die now so its a slow thing
as far as the mysql goes, Plesk knows about the patches and they have released hot fixes for PSA
which we have applied.
no need to worry about that.
Have you thought of changing your apache configuration to limit the number of httpd processes and to change the timeout for http processes to a lower value?

Maybe logging the slow mysql queries would help to determine the bottleneck also.
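
One rough way to choose a cap on the number of httpd processes (the `MaxClients` directive in Apache) is to divide the RAM that can be spared for Apache by the memory one child needs. The sketch below is in Python and only does that arithmetic. The 512 MB figure comes from the host's spec in this thread; the 11.5 MB per child is read off the SIZE column of the host's top output and is an upper bound, since children share pages; the 128 MB held back for MySQL and the OS is purely an assumption for illustration.

```python
def estimate_max_clients(total_ram_mb: float, reserved_mb: float, per_child_mb: float) -> int:
    """Largest number of httpd children that fits in RAM without swapping."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    usable_mb = max(total_ram_mb - reserved_mb, 0)
    return int(usable_mb // per_child_mb)


# 512 MB is from the host's spec. The 128 MB reserve and the 11.5 MB per
# child are assumptions for illustration, not measured values.
print(estimate_max_clients(512, 128, 11.5))
```

With these assumed figures the estimate lands in the low tens, far below the 500 to 1,000 processes the host reported, which is the kind of gap that pushes a box into swap. Real sizing should be based on measured per-child memory on the server itself.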
Well, at least we know now it's not IIP causing the problem. Got to run into work, will check back later
[]Astaran said:
Have you thought of changing your apache configuration to limit the number of httpd processes and to change the timeout for http processes to a lower value?

Maybe logging the slow mysql queries would help to determine the bottleneck also. [/]

They mentioned this earlier but I'll pass your comments on
[]AllenAyres said:
[]Duncan said:
Hosts:
PLESK PSA 5.0.5
Apache: 1.3.27
MySQL: 3.23.55

Some people are going to probably tell you that those are not the latest builds, but these are the latest builds that PLESK uses and they are all stable and secure."
[/]

There's a whole page of bugfixes and security fixes for 3.23.56 released more than 2 months ago:
http://www.mysql.com/doc/en/News-3.23.56.html - including some issues with database corruption. Just because it's what's in the released version of plesk doesn't mean it's the best option for you... you are the one paying the bill. [/]

Duncan is on Standard version of Plesk, and plesk are not going to be supporting this much longer - at the moment they have not upgraded MySQL.

Duncan, I would strongly consider switching machines, and having an RPM version of Plesk - believe me you will not want a standard version.
Some snips of info from my hosts which might help people find what this problem is (I've edited out the bits that aren't of much use):

DPrincetonNOC [12:49]: i show the load of the server 15 min ago was 65 _
DPrincetonNOC [12:49]: 65 +
DPrincetonNOC [12:50]: last pid: 16163; load averages: 1.32, 11.55, 65.54442 up 0+19:55:30 11:50:31
384 processes: 80 running, 304 sleeping
CPU states: 1.9% user, 0.0% nice, 7.1% system, 0.4% interrupt, 90.7% idle
Mem: 154M Active, 10M Inact, 74M Wired, 788K Cache, 34M Buf, 644K Free
Swap: 480M Total, 458M Used, 23M Free, 95% Inuse, 616K In, 1336K Out

8498 root 29 0 2352K 792K RUN 0:53 3.63% 1.07% top
7821 root 29 0 2352K 792K RUN 0:56 3.30% 0.98% top
187 root 2 0 10040K 1536K select 0:22 0.00% 0.00% httpd
15306 apache -14 0 11616K 2128K inode 0:04 0.00% 0.00% httpd
15396 apache 28 0 11608K 2120K RUN 0:04 0.00% 0.00% httpd
15305 apache 28 0 11692K 1896K RUN 0:03 0.00% 0.00% httpd
15477 apache -14 0 12920K 2664K inode 0:03 0.00% 0.00% httpd
15278 apache -14 0 11540K 2096K inode 0:03 0.00% 0.00% httpd
15294 apache 28 0 11704K 2212K RUN 0:03 0.00% 0.00% httpd
15234 apache -14 0 12948K 2596K inode 0:03 0.00% 0.00% httpd
15517 apache -14 0 11440K 2160K inode 0:03 0.00% 0.00% httpd
15443 apache 28 0 11512K 1772K RUN 0:03 0.00% 0.00% httpd
15465 apache 28 0 11496K 1972K RUN 0:03 0.00% 0.00% httpd
15476 apache -14 0 11532K 1892K inode 0:03 0.00% 0.00% httpd
15480 apache -14 0 11444K 2096K inode 0:03 0.00% 0.00% httpd
15464 apache 28 0 11508K 1692K RUN 0:03 0.00% 0.00% httpd
172 root 2 0 4304K 256K select 0:03 0.00% 0.00% httpsd


DPrincetonNOC [12:51]: there are about 500 httpd processes running right now
DPrincetonNOC [12:51]: thats why it died again.
MrKopTalk [12:53]: hmmm....Could a news site that checks my site every 5 minutes for updates cause any of this? When I update a news item I edit a page which can be found here: http://www.koptalk.com/regulars/newsnow.shtml

Those items then appear at:

http://www.newsnow.co.uk/newsfeed/?name=Liverpool

It was just a thought.
DPrincetonNOC [12:53]: that might be whats doing it
DPrincetonNOC [12:54]: im rebooting the box again now
MrKopTalk [12:54]: I'll remove that page so it cant spider the site
DPrincetonNOC [12:54]: i was never told this before so i wasnt loooking for anything like that in the logs
MrKopTalk [12:54]: i didnt know it could be that...just a wild guess
DPrincetonNOC [12:55]: spiders could be doing that and they create httpd requests and its not like a browser that someone closes
DPrincetonNOC [12:55]: and 5 min could be to short of an interval and the process never closes on its own.
DPrincetonNOC [12:55]: its taking much longer for the box to die now so its a slow thing
DPrincetonNOC [12:56]: as far as the mysql goes, Plesk knows about the patches and they have released hot fixes for PSA
DPrincetonNOC [12:56]: which we have applied.
DPrincetonNOC [12:56]: no need to worry about that.
DPrincetonNOC [13:20]: its about every 10 hours that it goes
DPrincetonNOC [13:20]: there are still httpd spawns from the spider and they dont close so they just all add up
MrKopTalk [13:21]: so you think this spider thing every 5 mins could the prob? i can soon work around that as I dont have to use it
DPrincetonNOC [13:21]: we will be able to get it to work if you can get some stable code from them
MrKopTalk [13:21]: when a new headline appears on newsnow.co.uk from my site people click on it and they are taken to my site via a pop-up
DPrincetonNOC [13:22]: see if they can make it every 10 min or something
MrKopTalk [13:22]: koptalk is the 3rd busiest site on there
DPrincetonNOC [13:22]: then if the server goes down every 20 hours we know it was that.
MrKopTalk [13:22]: i'll let you guys know what they say - it might not even be that
DPrincetonNOC [13:23]: I can format and reinstall and write some extra code into the kernal to allow 5,000 httpd connections at any time
DPrincetonNOC [13:23]: right now our custom kernal is set for 2,500 which hasnt ever been a problem for any other customers

DPrincetonNOC [13:28]: how many were online last time you were on
MrKopTalk [13:28]: 500
MrKopTalk [13:29]: maybe my site is too busy for Threads even if this problem is fixed?
DPrincetonNOC [13:29]: i noticed you still had that first page up, can you make it just go to the forum
DPrincetonNOC [13:29]: i doubt it

DPrincetonNOC [13:31]: well let me get this think rebooted again and see what I can see and then steve will work on it

DPrincetonNOC [13:37]: ask allen if he can put a non beta version of threads on the box with the same index page, just create another DB with same content
DPrincetonNOC [13:38]: have the index point to the non beta version on the server and see if that still crashes the box.
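
The top snapshot above can be summarised mechanically, which may help when comparing crashes. Here is a small Python sketch that counts httpd processes per state and sums their resident memory. The `summarize` helper is a hypothetical name, the sample is only a subset of the lines pasted above, and the column order is assumed to be the usual PID USER PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND.

```python
from collections import Counter

# A few lines copied from the top snapshot above (a subset, for brevity).
SNAPSHOT = """\
8498 root 29 0 2352K 792K RUN 0:53 3.63% 1.07% top
187 root 2 0 10040K 1536K select 0:22 0.00% 0.00% httpd
15306 apache -14 0 11616K 2128K inode 0:04 0.00% 0.00% httpd
15396 apache 28 0 11608K 2120K RUN 0:04 0.00% 0.00% httpd
15477 apache -14 0 12920K 2664K inode 0:03 0.00% 0.00% httpd
172 root 2 0 4304K 256K select 0:03 0.00% 0.00% httpsd
"""


def summarize(snapshot: str, command: str = "httpd"):
    """Count processes named `command` per state and sum their RES in KB."""
    states = Counter()
    resident_kb = 0
    for line in snapshot.splitlines():
        fields = line.split()
        # Skip anything that is not an 11-column process line for `command`.
        if len(fields) != 11 or fields[10] != command:
            continue
        states[fields[6]] += 1
        resident_kb += int(fields[5].rstrip("K"))
    return states, resident_kb


states, resident_kb = summarize(SNAPSHOT)
print(dict(states), resident_kb, "KB resident")
```

Swapped-out processes that top shows in angle brackets, such as `<httpd>`, are not matched by this sketch. A large share of children sitting in states like `inode` rather than `RUN` would be consistent with them waiting on disk rather than doing work, though that is an interpretation, not something the snapshot proves.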
The news site checks every 5 minutes and it seems like your problems start growing after 5 mins. It seems like there is a connection there. Hmmmm.
The thing is though, only that one page, which is on another server at koptalk.com (away from Threads), should be checked for changes etc
Just been thinking again...that page is spidered every 5 mins 24 hours a day so you would expect the server to crash at night time too here in the UK yet it has been going down at around 11am/12noon UK time.
well, I'd think 500+ online might have a little to do with server problems
Yeah, if you've already got 500 open processes with that many users - and then the spidering starts -

I'd disable it at least - to see if you can rule it out.
Sometimes when I edit a post and re-post it, it says:

[censored]Houllier[censored] [censored]reject[censored] [censored]linked[censored] [censored]with[censored] [censored]Barcelona[censored]
#1580 - 30/05/2003 10:09 (80.194.222.99) Edit Reply Quote



[censored]Former[censored] [censored]Liverpool[censored] '[censored]keeper[censored] [censored]David[censored] [censored]James[censored] [censored]has[censored] [censored]become[censored] [censored]a[censored] [censored]shock[censored] [censored]target[censored] [censored]for[censored] [censored]Spanish[censored] [censored]giants[censored] [censored]Barcelona[censored] [censored]claim[censored] [censored]various[censored] [censored]reports[censored] [censored]today[censored]. <[censored]br[censored] /> <[censored]br[censored] />[censored]James[censored], [censored]who[censored] [censored]has[censored] [censored]established[censored] [censored]himself[censored] [censored]as[censored] [censored]England[censored]'[censored]s[censored] [censored]number[censored] [censored]1[censored] [censored]goalkeeper[censored] [censored]and[censored] [censored]received[censored] [censored]rave[censored] [censored]reviews[censored] [censored]since[censored] [censored]Houllier[censored] [censored]axed[censored] [censored]him[censored], [censored]is[censored] [censored]reportedly[censored] "[censored]flattered[censored]" [censored]by[censored] [censored]their[censored] [censored]interest[censored]. <[censored]br[censored] /> <[censored]br[censored] />[censored]Since[censored] [censored]leaving[censored] [censored]Anfield[censored] [censored]James[censored] [censored]has[censored] [censored]also[censored] [censored]been[censored] [censored]linked[censored] [censored]various[censored] [censored]times[censored] [censored]with[censored] [censored]Manchester[censored] [censored]United[censored]. 
<[censored]br[censored] /> <[censored]br[censored] />[censored]Ironically[censored] [censored]Brad[censored] [censored]Friedel[censored], [censored]another[censored] [censored]Liverpool[censored] [censored]reject[censored], [censored]was[censored] [censored]voted[censored] [censored]the[censored] [censored]Premiership[censored]'[censored]s[censored] [censored]best[censored] [censored]goalkeeper[censored] [censored]last[censored] [censored]season[censored] [censored]while[censored] [censored]Sander[censored] [censored]Westerveld[censored] [censored]has[censored] [censored]been[censored] [censored]playing[censored] [censored]out[censored] [censored]of[censored] [censored]his[censored] [censored]skin[censored] [censored]with[censored] [censored]Real[censored] [censored]Sociedad[censored]. <[censored]br[censored] /> <[censored]br[censored] />[censored]The[censored] [censored]sooner[censored] [censored]Liverpool[censored] [censored]axe[censored] [censored]Joe[censored] [censored]Corrigan[censored] [censored]the[censored] [censored]better[censored]. <[censored]br[censored] />


What's that about then (no censored words are in it)
Check your bad words filter - you may have a space in it. If not, resave the file, and make sure it is saved by adding another word to it
thanks...great fun this Threads

(420 users online and increasing as the dreaded time period approaches....will it crash again today )
The server was ok Friday, Saturday and Sunday which is when the site traffic is quiet. It crashed at 1pm approx UK time on Monday and today (Tuesday). Allen removed IIP and the beta so 6.2.3 is running. This is what my hosts said to me today when it crashed again:

Duncan,

Here is a screen shot of TOP from when the box died.

last pid: 26035; load averages: 127.01, 107.43, 64.99 up 0+22:45:40 12:28:22
398 processes: 196 running, 201 sleeping, 1 zombie
CPU states: 2.1% user, 0.0% nice, 3.7% system, 0.9% interrupt, 93.3% idle
Mem: 156M Active, 7228K Inact, 75M Wired, 924K Cache, 34M Buf, 644K Free
Swap: 544M Total, 514M Used, 30M Free, 94% Inuse, 318M In, 318M Out

PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
5745 root 28 0 2368K 636K RUN 1:25 1.03% 0.10% top
25893 apache 28 0 11772K 1712K RUN 0:00 0.05% 0.05% httpd
155 mysql -18 0 21532K 380K RUN 65:44 0.00% 0.00% mysqld
15173 admin 2 0 2296K 0K RUN 0:58 0.00% 0.00% <top>
188 root 2 0 10040K 1964K select 0:28 0.00% 0.00% httpd
5740 admin 2 0 5292K 0K select 0:07 0.00% 0.00% <sshd>
25039 apache 2 0 12684K 0K sbwait 0:04 0.00% 0.00% <httpd>
24991 apache -18 0 11452K 2108K RUN 0:04 0.00% 0.00% httpd
25073 apache -22 0 11512K 1984K swread 0:04 0.00% 0.00% httpd
25014 apache -14 0 11452K 1944K inode 0:04 0.00% 0.00% httpd
25076 apache -18 0 12692K 2604K RUN 0:04 0.00% 0.00% httpd
25226 apache -14 0 11332K 2436K inode 0:03 0.00% 0.00% httpd
25225 apache -22 0 11544K 2520K swread 0:03 0.00% 0.00% httpd
25248 apache -18 0 11356K 1756K RUN 0:03 0.00% 0.00% httpd
25199 apache -14 0 11092K 2176K inode 0:03 0.00% 0.00% httpd
173 root 2 0 4304K 244K select 0:03 0.00% 0.00% httpsd
25207 apache -18 0 11544K 1264K RUN 0:03 0.00% 0.00% httpd

I don't see any reason that the load should be so high, except that PHP is spawning httpd so much and the processes are not closing. Were there any hacks applied to your site?


---

Any ideas gents?


It's just gone again. Hosts say: "MySQL died. It tried to dump itself. Is that something that Threads tells it to do?"
"It's still doing httpd in free(): warning: recursive call"
Rebooted, and gone again within fifteen minutes or so:


last pid: 1208; load averages: 22.52, 16.12, 7.42 up 0+00:22:54 16:23:45
328 processes: 5 running, 323 sleeping
CPU states: 2.9% user, 0.0% nice, 9.0% system, 1.8% interrupt, 86.4% idle
Mem: 157M Active, 22M Inact, 58M Wired, 608K Cache, 34M Buf, 644K Free
Swap: 1504M Total, 301M Used, 1204M Free, 19% Inuse, 364K In, 3316K Out

PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
944 root 36 0 2288K 728K RUN 0:07 4.14% 1.37% top
155 mysql 28 0 20504K 1268K pfault 1:35 0.00% 0.00% mysqld
526 apache 28 0 11420K 2368K pfault 0:04 0.00% 0.00% httpd
328 apache 2 0 13000K 0K sbwait 0:04 0.00% 0.00% <httpd>
333 apache 2 0 11784K 0K sbwait 0:04 0.00% 0.00% <httpd>
341 apache 28 0 11680K 2248K pfault 0:04 0.00% 0.00% httpd
343 apache 28 0 11544K 2276K pfault 0:04 0.00% 0.00% httpd
304 apache 2 0 11764K 0K sbwait 0:04 0.00% 0.00% <httpd>
336 apache 2 0 11884K 0K sbwait 0:04 0.00% 0.00% <httpd>
347 apache 2 0 11544K 0K sbwait 0:04 0.00% 0.00% <httpd>
318 apache 2 0 11548K 0K sbwait 0:04 0.00% 0.00% <httpd>
325 apache 28 0 11324K 1112K pfault 0:04 0.00% 0.00% httpd
219 apache 28 0 11472K 1388K pfault 0:04 0.00% 0.00% httpd
351 apache 28 0 11544K 1408K pfault 0:04 0.00% 0.00% httpd
332 apache 28 0 11544K 1836K pfault 0:04 0.00% 0.00% httpd
323 apache 2 0 12756K 0K sbwait 0:03 0.00% 0.00% <httpd>


"something is spawning again"

--

Hosts report that this is a problem in Infopop's software and that they cannot debug the software for them. They also sent this by email:

[Tue Jun 3 16:34:25 2003] [notice] suEXEC mechanism enabled (wrapper: /usr/local/psa/apache/bin/suexec)
[Tue Jun 3 16:34:25 2003] [notice] Accept mutex: flock (Default: flock)
[Tue Jun 3 16:35:10 2003] [warn] child process 425 still did not exit, sending a SIGTERM
[Tue Jun 3 16:35:10 2003] [warn] child process 426 still did not exit, sending a SIGTERM
[Tue Jun 3 16:35:10 2003] [warn] child process 427 still did not exit, sending a SIGTERM
[Tue Jun 3 16:35:14 2003] [notice] caught SIGTERM, shutting down


We can point them towards Perl and flock/spawn handling. People without high
load would never have these problems, so they might not notice this stuff.
Well, threads is written in php and doesn't start child processes, so it can't apply to threads...

Have you checked your cron jobs to make sure the subscriptions job isn't being run every minute or something like that?

A good thing to do is to set up /server-status and /server-info so that you can see what pages all these httpd processes are stuck at. That's how I managed to track down a problem with similar symptoms (which didn't have anything to do with threads).
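
If /server-status is enabled, the full page shows the request each httpd process is handling (with ExtendedStatus on). As a rough first check, the machine-readable `/server-status?auto` view includes a `Scoreboard:` line that can be tallied to see whether processes are piling up in one state. Here is a minimal sketch in Python. The function name and the sample text are made up for illustration, the sample is not output from the server in this thread, and the state descriptions follow the key shown on Apache's status page.

```python
from collections import Counter

# Meaning of each scoreboard character, as listed on Apache's status page.
SCOREBOARD_KEY = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive (read)",
    "D": "DNS lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup of worker",
    ".": "open slot with no current process",
}


def tally_scoreboard(status_text):
    """Count process states from the 'Scoreboard:' line of a ?auto response."""
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Scoreboard":
            counts = Counter(value.strip())
            return {SCOREBOARD_KEY.get(ch, "unknown"): n for ch, n in counts.items()}
    return {}  # no scoreboard line found


# Hypothetical sample text, for illustration only.
SAMPLE = """Total Accesses: 100
Scoreboard: WWWW_K..
"""

if __name__ == "__main__":
    for state, count in sorted(tally_scoreboard(SAMPLE).items()):
        print(f"{count:4d}  {state}")
```

A large and growing count in a single state, for example "sending reply", would suggest requests that never finish, and the full status page can then show which URLs those are.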
The more you post, the less it sounds like a Threads-specific problem. I would almost guess it is a problem with MySQL, but I don't know enough to even make a good guess as to what. He said you were hacked, but why did he think you might be? Hmmmm.
Maybe Apache needs to be rebuilt?
There have been problems with earlier Apache versions segfaulting and creating stale httpd processes, very much like what you are experiencing. Although those problems should be long gone, it might be worth looking into.

Also, this could happen if you have hardware problems, like corrupt or improperly installed memory. When Apache tries to address memory that is broken, it will segfault and leave stale httpd processes. This is what I thought my problem was until I finally found the erroneous script with some much-needed help from a friend.

Rebuilding apache, php and mysql might be a good thing as well. If php is using the wrong mysql library things might go haywire, I've had that happen when I've forgotten to upgrade php along with mysql when the mysql library changed.
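
To judge whether stale httpd processes are piling up, it can help to tally them by state from a top listing like the ones the hosts sent. Here is a rough sketch in Python. The function name is made up, the column positions are assumed from the header line in the posted output (PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND), and the sample rows are copied from the first listing in this thread.

```python
from collections import Counter


def tally_states(top_rows, command="httpd"):
    """Count processes named `command` by the STATE column of top output."""
    counts = Counter()
    for row in top_rows.splitlines():
        fields = row.split()
        # Expect 11 columns with a numeric PID first; skip headers and blanks.
        if len(fields) != 11 or not fields[0].isdigit():
            continue
        # Some rows show the command in angle brackets (e.g. <httpd>); strip them.
        name = fields[10].strip("<>")
        if name == command:
            counts[fields[6]] += 1
    return counts


# A few rows copied from the first top listing posted in this thread.
SAMPLE = """PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
25893 apache 28 0 11772K 1712K RUN 0:00 0.05% 0.05% httpd
155 mysql -18 0 21532K 380K RUN 65:44 0.00% 0.00% mysqld
25039 apache 2 0 12684K 0K sbwait 0:04 0.00% 0.00% <httpd>
25073 apache -22 0 11512K 1984K swread 0:04 0.00% 0.00% httpd
25014 apache -14 0 11452K 1944K inode 0:04 0.00% 0.00% httpd
"""

if __name__ == "__main__":
    for state, count in tally_states(SAMPLE).most_common():
        print(f"{state:8s} {count}")
```

Running it on successive listings would show whether the number of httpd processes in states such as swread keeps growing, which would point at the box running out of memory rather than at any one script.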
I haven't seen anything about a hack?

Infopop are looking into this now so I will report back.

Thanks a lot guys.