|
Joined: Jun 2001
Posts: 2,849
Spotlight Winner
Maybe "warning" is a bit strong; let's call it a bit of advice to people who have added the spider hack to their boards.
In the text file with instructions for implementing the spider bait hack you are given a warning about the load it can place on your server while your boards are being crawled. This is a serious issue for large boards on shared servers. A second warning covers the impact on your boards if you don't have the accelerator enabled.
Please listen to these warnings. I've helped more than one site troubleshoot problems that ended up being caused by using this hack without enabling the accelerator and being on a shared server. UBB already has a reputation for being a resource hog and a lot of hosting companies no longer allow it as part of their less expensive plans.
We all know that UBB Classic is a great product. The spider bait hack is a must-have for anyone who wants their forum content to be searchable. Just remember that it's easier to set your site up properly than it is to troubleshoot problems later, or to find a new host because the old one booted you for using more than your share of the server resources.
I've also seen cases where the server crashes with a ton of perl processes.
|
|
|
|
Joined: Feb 2001
Posts: 817
Moderator / Kingpin
.classic isn't as much of a resource hog as it was in the earlier 6.x versions, but I do agree: you must have the accelerator enabled, and you also need to consider the potential problems when enabling the spider hack on an over-shared shared server.
Good post. I'm featuring this topic.
|
|
|
|
Joined: Jan 2000
Posts: 5,073
Admin Emeritus
Any script has the potential to be a resource hog when malicious spiders come crawling along.
Google, Inktomi, and others intentionally limit the crawling rate in order to prevent hitting the server too hard, even for static pages. Not all spiders are as nice. A single malicious spider can easily make a hundred requests a minute, which can promptly bring any server to its knees.
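To put a number on "a hundred requests a minute", here is a minimal Python sketch that counts requests per client per minute in an access log. It assumes the log is in Apache's Common or Combined Log Format; the function names and the sample threshold of 100 are only for illustration and are not part of UBB or the spider hack.

```python
import re
from collections import Counter

# Assumed format (Apache Common/Combined Log Format), for example:
# 10.0.0.1 - - [10/Oct/2003:13:55:36 -0700] "GET /ultimatebb.php HTTP/1.0" 200 1234
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')


def requests_per_minute(lines):
    """Count requests for each (client address, minute) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        client, timestamp = match.groups()
        # The first 17 characters are "dd/Mon/yyyy:HH:MM", i.e. minute resolution.
        counts[(client, timestamp[:17])] += 1
    return counts


def heavy_clients(lines, threshold=100):
    """Return the clients that reached `threshold` requests within any one minute."""
    counts = requests_per_minute(lines)
    return sorted({client for (client, _), n in counts.items() if n >= threshold})
```

Feeding it the lines of your access log (for example `heavy_clients(open("access_log"))`, where the file name is whatever your host uses) gives a list of addresses worth a closer look before you decide what to block.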
UBB.classic: Love it or hate it, it was mine.
|
|
|
|
Joined: Jan 2000
Posts: 5,834 Likes: 20
UBBDev Owner Time Lord
Agreed, I've seen some serious impact on boards running the spider hack when spiders come along. A lot of hosts don't like it when that happens nowadays, due to the spiderable URLs in the beta... Perhaps look at a new host before you enable spidering... HostNuke hosts both of my boards and Al's, and we both have spidering enabled... Works like a dream...
|
|
|
|
Joined: Jun 2003
Posts: 60
Member
What do you think of LunarPages in this regard?
I've heard (and seen a little) that their servers do bow to spiders at times.
REaMERE
|
|
|
|
Joined: Jan 2000
Posts: 5,834 Likes: 20
UBBDev Owner Time Lord
I never really liked LunarPages... The only host I like anymore is HostNuke; their systems support my board when 3 other providers turned me away...
|
|
|
|
Joined: Jun 2003
Posts: 60
Member
Hmmm, good to know, thanks. I use LunarPages (obviously) for all of my clients and I really have no complaints. Of course, I have been on worse providers. Oh...the stories I could tell!
But DrkKnight and I just moved our boards over to Lunar and are doing well. I do about 3,000 unique visits per month but he does well over 20,000 (crazy traffic). He's using UBB.threads now and I'm using Classic; again, no complaints.
But we do not have the spider hack installed. I was wondering if anyone had any experience with UBB and the Spider Hack on LunarPages as a host.
Thanks for the feedback
REaMERE
|
|
|
|
Joined: Jan 2000
Posts: 5,834 Likes: 20
UBBDev Owner Time Lord
We were using a spider mod when we were on there (see above). They didn't like it lol...
|
|
|
|
Joined: Jun 2003
Posts: 60
Member
Ahhh, I didn't realize you were using them. I suppose that either the server crashed or they told you that they were dropping the account? Either way, thanks, good to know, because we were both thinking of using the hack.
REaMERE
|
|
|
|
Joined: Jan 2003
Posts: 3,456 Likes: 2
Master Hacker
If you wait a week or two, the next version will have it, so no need to alter your boards.
As for DrkKnight, I believe that Threads 6.5 will also have it.
|
|
|
|
Joined: Jan 2000
Posts: 5,834 Likes: 20
UBBDev Owner Time Lord
Correct, as Al says, no need to use a hack when it's standard in a few weeks in 6.7. And I was running my board there, and they told me that I was receiving too much traffic and creating too much of a load on their systems. So I simply moved to HostNuke and never looked back...
|
|
|
|
Joined: Jan 2003
Posts: 3,456 Likes: 2
Master Hacker
OK, I'm now taking bets on how many times Gizzy is gonna mention Hostnuke in this thread.
|
|
|
|
Joined: Jan 2000
Posts: 5,834 Likes: 20
UBBDev Owner Time Lord
you counting my sig in this count? lol...
|
|
|
|
Joined: Jun 2003
Posts: 60
Member
Heheh. Now you say it's gonna be standard, but will it be the same hack? Meaning, will as many pages be spidered, or will it be kind of like a Spider Hack Light? My obvious concern is whether upgrading to 6.7 Classic and 6.5 Threads, respectively, will cause major problems with our current host. You know how much effort it is moving boards around (you know, to places like HOSTNUKE). Throw me your opinions and advice please!
REaMERE
|
|
|
|
Joined: Jan 2003
Posts: 3,456 Likes: 2
Master Hacker
The one in the base code is written by Charles Capps, who wrote the mod. And now that it's built in, it's much better, with .html extensions for spiders that require them.
|
|
|
|
Joined: Jun 2003
Posts: 60
Member
Nice. Thanks guys for all of the quick answers. So basically we should:
1.) Make sure the PHP Accelerator is on and functional
2.) Monitor the Apache servers during peak hours (among other times) to make sure they aren't getting overloaded
3.) Consume several Gin & Tonics
Am I missing anything??
REaMERE
|
|
|
|
Joined: Jan 2000
Posts: 5,834 Likes: 20
UBBDev Owner Time Lord
Don't forget the vodka; other than that, it's lookin good.
|
|
|
|
Joined: Jan 2003
Posts: 3,456 Likes: 2
Master Hacker
I found a website with an .htaccess that blocks most malicious spiders. Spaces inside a pattern have to be escaped with a backslash, otherwise Apache rejects the RewriteCond.
# source: http://www.webmasterworld.com/forum13/687-9-15.htm
# some of these are commented out, they are possibly legitimate download agents
# if you don't want anyone downloading your sites, uncomment them
RewriteEngine On
RewriteCond %{HTTP_REFERER} q=Guestbook [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
# address is obscured in this copy of the list, so this rule is left disabled
#RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:[email protected] [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GornKer [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Irvine [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} dloader\(NaverRobot\) [OR]
#RewriteCond %{HTTP_USER_AGENT} ^puf [NC,OR]
#RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SearchExpress [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebBandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
RewriteCond %{HTTP_USER_AGENT} ^ZyBorg
RewriteRule ^.* - [F,L]
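If you want to sanity-check a few of these patterns before putting them on a live board, a rough Python sketch like the one below can help. It copies four patterns from the list (with the space escaped, as mod_rewrite requires) and assumes that Python's `re` module treats such simple patterns the same way Apache's regex engine does. The `is_blocked` helper and the sample user-agent strings are made up for illustration.

```python
import re

# (pattern, uses_NC_flag) pairs taken from a handful of the RewriteCond lines.
BLOCK_PATTERNS = [
    (r"^BlackWidow", False),
    (r"^Download\ Demon", False),
    (r"HTTrack", True),   # [NC] in the rule: case-insensitive, not anchored
    (r"^Wget", False),
]

COMPILED = [
    re.compile(pattern, re.IGNORECASE if no_case else 0)
    for pattern, no_case in BLOCK_PATTERNS
]


def is_blocked(user_agent: str) -> bool:
    """True if any pattern matches, mirroring the [OR] chain of conditions."""
    return any(regex.search(user_agent) for regex in COMPILED)
```

Note what the `^` anchor implies: `is_blocked("Wget/1.9")` is true, but a user agent that merely contains "wget" somewhere in the middle is not caught, and a spider that lies about its user agent slips through entirely. The list stops the lazy offenders, not the determined ones.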
|
|
|
|
Joined: Jun 2003
Posts: 60
Member
Nice find. I'll be adding that information to mine. Excellent site also, lots of good info.
Question: Is there any way to block spiders from within any of the .cgi or .pl files?
Thanks in advance
REaMERE
|
|
|
|
Joined: Jan 2003
Posts: 3,456 Likes: 2
Master Hacker
just disallow access to the cgi-bin and the noncgi/templates folder
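One way to read "disallow" here is a robots.txt entry, which is an assumption on my part since the post doesn't say which mechanism is meant. The Python sketch below uses the standard library's `urllib.robotparser` to check what such a file would permit. The two paths are placeholders based on the folder names mentioned above; the real URLs depend on where your board is installed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; adjust the paths to match your own install.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /noncgi/templates/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Would a well-behaved crawler called `agent` be allowed to fetch `path`?"""
    return build_parser(ROBOTS_TXT).can_fetch(agent, path)
```

Keep in mind that robots.txt is only a request: well-behaved crawlers honor it, while the malicious spiders discussed earlier in the thread simply ignore it, so it complements the .htaccess rules rather than replacing them.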
|
|
|
|
Joined: Jun 2003
Posts: 60
Member
Thanks again Al. Can't vote for ya again, but you have my thanks!
REaMERE
|
|
|