UBB.Dev
Posted By: The Team SE indexing W3T - 03/29/2000 12:07 AM
Does anyone have a solution for getting your posts indexed by spiders? I know the problem is with the "?" in the URL, and the solution is probably quite involved.

Any takers? :)

Jay
http://wwlive.net
Online Magazine ala Controversial
Posted By: MattyJ Re: SE indexing W3T - 03/29/2000 4:47 AM
I'm not sure that W3T posts can be indexed like that, due to their dynamic nature. That is, they are stored in the database until the Perl script retrieves them for display in the user's browser. It's not like other systems that store the posts in HTML or flat-file format.

Sorry!

Matt

Posted By: ktorbeck Re: SE indexing W3T - 03/29/2000 8:28 PM
I would suggest creating a robots.txt file in your server root directory with a "Disallow" line pointing to your wwwthreads path.

# go away
User-agent: *
Disallow: /cgi-bin/wwwthreads # point to your wwwthreads directory.


This should work for most robots. I have tested it with our robot and it works, but sadly a few robots only honor META tags.
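
One way to sanity-check a rule like this before deploying it is to feed it to a robots.txt parser. The sketch below uses Python's standard urllib.robotparser; the host example.com, the script name showflat.pl, and the agent name ExampleBot are placeholders for illustration, not values from this thread.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post, copied as written.
ROBOTS_TXT = """\
# go away
User-agent: *
Disallow: /cgi-bin/wwwthreads # point to your wwwthreads directory.
"""


def is_allowed(url, user_agent="ExampleBot"):
    """Return True if the rules above let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URLs: one under the disallowed path, one outside it.
forum_ok = is_allowed("http://example.com/cgi-bin/wwwthreads/showflat.pl?Board=test")
home_ok = is_allowed("http://example.com/index.html")
print(forum_ok, home_ok)
```

A robot that follows the rule may not fetch anything under /cgi-bin/wwwthreads, while the rest of the site stays open to it. This only shows how one well-behaved parser reads the file; individual robots may differ.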

The problem is that META tags need to be in the page header. I do not think you can do this with wwwthreads without hacking the Perl code. The header in the includes directory comes after the HTML page's header, so META tags will not work in this file. Maybe this is a feature that can be added.

<META NAME="ROBOTS"
CONTENT="ALL | NONE | NOINDEX | NOFOLLOW">

default = empty = "ALL"
"NONE" = "NOINDEX, NOFOLLOW"

The filler is a comma separated list of terms:
ALL, NONE, INDEX, NOINDEX, FOLLOW, NOFOLLOW.

Discussion: This tag is meant for users who cannot control
the robots.txt file at their sites. It provides a last chance to
keep their content out of search services. It was decided not to
add syntax to allow robot specific permissions within the meta-tag.

INDEX means that robots are welcome to include this page in
search services.

FOLLOW means that robots are welcome to follow links from this
page to find other pages.

So a value of "NOINDEX" allows the subsidiary links to be explored,
even though the page is not indexed. A value of "NOFOLLOW" allows the
page to be indexed, but no links from the page are explored (this may
be useful if the page is a free entry point into pay-per-view content,
for example). A value of "NONE" tells the robot to ignore the page.
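
The rules above can be sketched as a small Python function that turns a CONTENT value into two permissions. The function name parse_robots_meta is made up for illustration, and letting the restrictive term win when terms conflict (for example "INDEX, NOINDEX") is an assumption; the text above does not say how a robot should resolve that case.

```python
def parse_robots_meta(content):
    """Return (index, follow) permissions for a ROBOTS META content value."""
    # Split the comma separated list and normalize case and whitespace.
    terms = {t.strip().upper() for t in content.split(",") if t.strip()}

    # Default = empty = "ALL": indexing and link following are both allowed.
    index, follow = True, True

    # "NONE" is shorthand for "NOINDEX, NOFOLLOW".
    if "NONE" in terms:
        index, follow = False, False
    # Assumption: a restrictive term overrides a permissive one.
    if "NOINDEX" in terms:
        index = False
    if "NOFOLLOW" in terms:
        follow = False
    return index, follow


print(parse_robots_meta("NOINDEX"))   # index denied, links may be followed
print(parse_robots_meta("NOFOLLOW"))  # index allowed, links not followed
```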

-Ken Torbeck WWW.INFOSITE.ORG Special Needs & disAbilities Info. Center
© UBB.Developers