Introduction
When writing scripts, it is extremely important to have the ability to transfer information from one script to another. A common way to do this is with the GET convention. Search engine Web spiders, however, tend to ignore pages whose URL contains GET method parameters. If you're not sure what a GET method parameter is, here's an example of a URL with GET method parameters:
http://www.zend.com/mypage.php?myval=1&yourvar=2
The example URL passes two parameters to the script mypage.php: "myval" and "yourvar" with the values 1 and 2 respectively. When a search engine spider encounters such a URL while indexing your pages, the spider will ignore the URL and not index that particular page.
This can have a fairly detrimental effect on how your pages are indexed -- especially if you use the GET convention in your hyperlinks often. Today I'll show you how to use your Web server to pass parameters to PHP scripts so that it fools search engines, and allows your page to be indexed when it would otherwise be ignored.
What's wrong with the GET method
The GET method of transferring parameters between Web pages is by far the simplest method. It is particularly useful for passing parameters from within HREF tags. For example, assume you have a set of articles on your Web site and a single script that displays the articles in the desired fashion.
If you wanted to provide a simple hyperlink using <A HREF> to a particular article, you would need to pass the script a parameter telling it which article you would like to view using the GET convention. Unfortunately, Web spiders generally ignore hyperlinks that include parameters in the URL. This means that the page which the hyperlink points to -- as well as all pages referenced by it -- will be ignored by the Web spider indexing your site.
A spider-friendly GET gimmick
Now that you have a better understanding of the problem, let's look at the solution. In order for a spider to traverse (and consequently index) a given page, the URL must be free of any appearance of parameters. But if a given page requires parameters to function properly, what can be done? The answer lies in the $PATH_INFO environment variable, which lets you convert a URL from...
http://www.zend.com/myscript.php?myvalue=Hello
...to a spider-friendly format:
http://www.zend.com/myscript.php/myvalue/Hello
Notice that the spider-friendly format contains no indication that there are any parameters being passed at all. Rather, it simply looks like we are trying to access the directory on the zend.com site /myscript.php/myvalue/Hello, and any search engine spider that accesses the page won't have any trouble following the URL. Yet in reality we are executing the script myscript.php.
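Links in this format have to be generated somewhere, so here is a minimal sketch of a helper that builds them from name/value pairs. The function name build_path_url() is hypothetical and not part of the original article; it assumes the /name/value/... layout shown above.

```php
<?php
// Hypothetical helper (not from the article): appends name/value pairs
// to a script URL in the /name/value/... path format.
function build_path_url(string $script, array $params): string
{
    $url = $script;
    foreach ($params as $name => $value) {
        // rawurlencode() escapes characters such as spaces that are not
        // valid inside a URL path segment.
        $url .= '/' . rawurlencode((string) $name) . '/' . rawurlencode((string) $value);
    }
    return $url;
}

echo build_path_url('http://www.zend.com/myscript.php', ['myvalue' => 'Hello']), "\n";
```

Keeping the link building in one place means the path layout can change later without hunting through every hyperlink on the site.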
But what happened to your parameters?
How to GET your hidden data
Now that you have successfully hidden your parameters within what appears to be a directory structure, how do you get them out? Whenever a PHP script is executed with extra path data appended to the end of the filename (as we did in the spider-friendly example above), the Web server creates an environment variable $PATH_INFO containing this information. You can then access this environment variable through PHP automatically, and parse it to retrieve our data. So our earlier URL...
http://www.zend.com/myscript.php/myvalue/Hello
...would populate the $PATH_INFO variable with:
/myvalue/Hello
...from which you can then parse and retrieve the passed information.
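A bare $PATH_INFO global is only created when PHP's register_globals setting is on. Where that setting is off or unavailable, the same value is exposed as $_SERVER['PATH_INFO']. The sketch below reads it from there; get_path_info() is a hypothetical name, and the server array is passed in as an argument only so the function can be tried without a Web server.

```php
<?php
// Hypothetical helper: returns the extra path data, or '' when the
// request carried none.
function get_path_info(array $server): string
{
    return isset($server['PATH_INFO']) ? (string) $server['PATH_INFO'] : '';
}

// In a real request you would pass $_SERVER itself.
echo get_path_info($_SERVER), "\n";
```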
Deciphering your data
Now that you know where your parameters are, the next step is to decipher them into a format that PHP can use. Although there is no required method for doing this, I'll assume that you have formatted your data in the following way:
/var_name/var_data/var2_name/var2_data/...
Using this method, all that is left is to:
* break the provided string every time we encounter a slash ('/')
* create variables to associate the given names (var_name, var2_name, etc.) with their respective values (var_data, var2_data, etc.)
With all of this in mind, let's look at some real code.
The script
As with many powerful techniques, the code required to create this ability in your scripts is not difficult to develop. The process consists of traversing an array based on the $PATH_INFO, and creating variables based on that data. In the end, the object is to take the URL...
http://www.zend.com/myscript.php/myvalue/Hello
...then use the data provided in the $PATH_INFO variable to construct corresponding variables:
$myvalue = "Hello"
Code flow
* Check for the existence of $PATH_INFO
* Split $PATH_INFO into an array
* If the array holds an even number of elements (meaning the last name has no value), add an extra empty element at the end to simplify the traversal in the next step
* Traverse array and create variables based on the $PATH_INFO data
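<?php
// With register_globals off, the server still exposes the path data here.
if (!isset($PATH_INFO) && isset($_SERVER['PATH_INFO'])) {
    $PATH_INFO = $_SERVER['PATH_INFO'];
}
if (isset($PATH_INFO)) {
    // "/myvalue/Hello" becomes array('', 'myvalue', 'Hello')
    $vardata = explode('/', $PATH_INFO);
    $num_param = count($vardata);
    // An even count means the last name has no value; pad with an empty one
    if ($num_param % 2 == 0) {
        $vardata[] = '';
        $num_param++;
    }
    // Index 0 is empty, so names sit at odd indexes and values follow them
    for ($i = 1; $i < $num_param; $i += 2) {
        ${$vardata[$i]} = $vardata[$i + 1];
    }
}
?>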
NOTE: If the $PATH_INFO variable does contain a value (if no parameters were passed it will not be set), the first element in the $vardata array will be empty (with the actual data starting at index 1). Therefore, it is important to take this into account when parsing and populating variables as we did in the above code.
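Because the script above takes variable names straight from the URL, a visitor can set any variable in your script, including ones it already uses. One possible variation, sketched below, collects the pairs into an array instead of creating variables. The name parse_path_pairs() is hypothetical, and trimming the slashes first is a choice made here, not the article's approach.

```php
<?php
// Hypothetical helper: parses "/name/value/name2/value2" into an array.
function parse_path_pairs(string $pathInfo): array
{
    // Drop leading and trailing slashes so explode() yields no empty
    // first element.
    $trimmed = trim($pathInfo, '/');
    if ($trimmed === '') {
        return [];
    }
    $parts = explode('/', $trimmed);
    $params = [];
    for ($i = 0; $i < count($parts); $i += 2) {
        // A name without a value gets an empty string, as in the script above.
        $params[$parts[$i]] = $parts[$i + 1] ?? '';
    }
    return $params;
}

print_r(parse_path_pairs('/first/John/last/Coggeshall'));
```

The script can then read only the keys it expects, for example $params['first'], and ignore anything else that was appended to the URL.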
A step further
In the above script, not only the values assigned to the variables but also the variable names themselves are taken from the $PATH_INFO of the script. This was done to show the parallels between this method of passing parameters and the GET method. However, in most cases the script already knows the names of the parameters being passed.
For example, say you would like to pass a first and last name to the script through our $PATH_INFO method. Using the code above, the URL would resemble the following...
http://www.mysite.com/myscript.php/first/John/last/Coggeshall
...to create the variables $first and $last and assign the values "John" and "Coggeshall" respectively. However, when using the $PATH_INFO method, you have more flexibility than with a GET method. The same URL could be written in the following fashion...
http://www.mysite.com/myscript.php/John/Coggeshall
...and then the script could use the following to retrieve the data:
list($dummy, $first, $last) = explode('/', $PATH_INFO);
This allows the script to statically define the variables it needs. Using this method, the variables $first and $last will always be created and set to the first and second values separated by a slash. Note that an extra variable, $dummy, must also be created to absorb the empty string that precedes the first slash in $PATH_INFO. This can be avoided in the following manner:
list($first, $last) = explode('/', substr($PATH_INFO,1));
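The list() calls above assume the URL always carries both values; with a shorter path, list() is handed fewer elements than it expects. A minimal sketch of one way to guard against that follows. The helper name path_values() is hypothetical, and padding with empty strings is an assumption about what the script wants as a default.

```php
<?php
// Hypothetical helper: returns exactly $count positional values from the
// path, padding with '' when the URL carries fewer segments.
function path_values(string $pathInfo, int $count): array
{
    // Skip the leading slash, as in the substr() example above.
    $parts = explode('/', (string) substr($pathInfo, 1));
    // Pad short paths, then cut off any extra segments.
    return array_slice(array_pad($parts, $count, ''), 0, $count);
}

list($first, $last) = path_values('/John/Coggeshall', 2);
echo $first, ' ', $last, "\n";
```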
Final notes
It is important to point out that we are expanding the parameter passing abilities of PHP, rather than changing them. You can use this script to hide parameters you pass to your script, and still pass parameters to it with the standard GET or POST methods as usual.
Because this script is so transparent, feel free to prepend it to any script, either through the auto_prepend_file directive or with a simple include() statement.
About John Coggeshall
John Coggeshall is a PHP consultant and author who started losing sleep over PHP around five years ago. Lately you'll find him losing sleep meeting deadlines for books or online columns on a wide range of PHP topics. You can find his work online at O'Reilly Network's onlamp.com and Zend Technologies, or at his website http://www.coggeshall.org/. John has also contributed to WROX Press' Professional PHP4 Programming and is currently in the process of writing the PHP Developer's Handbook published by Sams Publishing.
Readers' Comments
Original Article by John Coggeshall
So basically it will fail on your server because of the way Apache is configured. It doesn't allow you to use the GET method, and thus it will fail. There is nothing you can do at this point. You could try the mod_rewrite approach (do a search on the net), but I think that would also fail with your present configuration.