#205368 10/09/2000 9:35 AM
Joined: May 1999
Posts: 3,039
Guru
After adding the enhanced online hack, I got to wondering how big the language files have been getting; I haven't checked in a while. The average language file is approaching 30K and 475 lines in length. This might not seem like a lot, but when you consider that every script has to compile the language file before executing, I am beginning to wonder what kind of performance hit this is generating.

An option that I am considering is splitting the language files up into individual files needed by each script, along with a generic one used by w3t.pm itself. I do believe this would help in performance but wanted to get a bit of feedback first. The directory structure would look something like this:
/languages
    /english
        addpost.pl
        adduser.pl
        wwwthreads.pl
        ...
    /french
        addpost.pl
        adduser.pl
        wwwthreads.pl
        ...

So, each language would have its own directory and each script would only compile the language variables needed by that script.
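A minimal sketch of the per-script lookup described above, written in Python purely for illustration (WWWThreads itself is Perl, and its language files are Perl source). The KEY=VALUE file format, the function names and the sample strings are all assumptions; only the directory layout and the idea of a generic file plus one file per script come from the post.

```python
import tempfile
from pathlib import Path


def parse_strings(path):
    """Read KEY=VALUE lines from one language file (hypothetical format)."""
    strings = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        strings[key.strip()] = value.strip()
    return strings


def load_language(root, language, script):
    """Load generic.pl first, then only the file named after the script."""
    lang_dir = Path(root) / language
    strings = parse_strings(lang_dir / "generic.pl")
    script_file = lang_dir / script
    if script_file.exists():
        # Script-specific strings win if a key appears in both files.
        strings.update(parse_strings(script_file))
    return strings


def build_demo_tree(root):
    """Create a tiny languages/ tree with made-up strings for the demo."""
    files = {
        "english/generic.pl": "SUBMIT=Submit\n",
        "english/addpost.pl": "POST_ADDED=Your post has been added\n",
        "french/generic.pl": "SUBMIT=Envoyer\n",
        "french/addpost.pl": "POST_ADDED=Merci pour votre message\n",
    }
    for relative, text in files.items():
        target = Path(root) / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_tree(tmp)
        print(load_language(tmp, "french", "addpost.pl"))
```

Each request then parses only two small files instead of one 475-line file, which is the saving the proposal is after.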



UBB.threads Developer
Sally #205369 10/09/2000 9:43 AM
Joined: Jan 2000
Posts: 796
MTO Offline
Addict
Fine with me, as long as you are not saying that to view the forum in Spanish, for example, you have to visit:
wwwthreads/languages/spanish/wwwthreads.pl
If you don't mean that, it is perfectly fine with me. :)

By the way, if you want to release 5.1 with the languages already updated, you should get them to the translators beforehand, or maybe just post them now with the language resource files so we can get you the translations before you release 5.1. I translate the Spanish files.


Mateo Byler
CruceDeCaminos.com - http://crucedecaminos.com

Joined: May 1999
Posts: 3,039
Guru
Nope, as far as the end user is concerned there would be no changes. The only directory structure changes would be within the language directory, for internal purposes.

If I'm going to split the language files up I'll need to do that and make the necessary changes to the scripts before posting the language files for translation, because instead of being one large language file there will be multiple smaller files.


UBB.threads Developer
Sally #205371 10/09/2000 10:17 AM
Joined: Jan 2000
Posts: 796
MTO Offline
Addict
As far as translations go, it would be nice for languages other than English if you could distinguish between one post and several posts.
For example, "1 new" and "2 new" use the same word in English but not in Spanish: "1 new" becomes "1 nuevo" and "2 new" becomes "2 nuevos". Notice that it ends with "s" when there is more than one.
I believe this is true for most languages other than English: if it is plural (more than one), you say it differently.

So it would be nice to have something like:
if posts > 1 then $newposts
if posts = 1 then $newpost
or something like that, where you could specify how to say it in each language.
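The singular/plural switch suggested above could look roughly like this. It is a Python illustration only; the table layout and the key names NEW_ONE and NEW_MANY are hypothetical and do not come from the real language files.

```python
# Hypothetical per-language string tables. The key names are illustrative.
STRINGS = {
    "english": {"NEW_ONE": "new", "NEW_MANY": "new"},
    "spanish": {"NEW_ONE": "nuevo", "NEW_MANY": "nuevos"},
}


def new_posts_label(count, language):
    """Return e.g. '1 nuevo' or '2 nuevos' depending on the count."""
    table = STRINGS[language]
    # Exactly one post uses the singular string; every other count
    # (including zero, which is an assumption here) uses the plural.
    word = table["NEW_ONE"] if count == 1 else table["NEW_MANY"]
    return f"{count} {word}"


if __name__ == "__main__":
    for n in (1, 2):
        print(new_posts_label(n, "english"), "/", new_posts_label(n, "spanish"))
```

Two strings per phrase cover English and Spanish; languages with more than two plural forms would need a richer rule than this sketch provides.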

Mateo Byler
CruceDeCaminos.com - http://crucedecaminos.com

Sally #205372 10/09/2000 11:45 AM
Joined: Aug 2000
Posts: 3,590
Moderator
So basically, we're giving up space for speed? Is that right? We'll have individual versions of each .pl in a language directory? If this is the case, that's fine with me... I don't have much of a space consideration (yet) :)


Joined: May 1999
Posts: 3,039
Guru
I guess I didn't explain myself quite right. Let me try again.

Currently all language variables are stored in one large language file, 475 lines. Right now, each script runs and it has to compile this whole language file before executing. What I want to do is split up the language files so this isn't necessary. For example when approving a post, there are only 2 text strings that are printed out. What I want to do is pull those 2 strings out of the very large language file and create a smaller language file that is named the same as the script but all it contains is those 2 lines.

There will still be only 1 copy of each Perl script for the program. The only thing I will be altering is the language files themselves, just splitting them up into smaller files. I want to name them the same as the actual script names, so if you want to change a text string in showflat, you would go into the language directory and edit showflat.pl.

Hope that makes a little more sense.


UBB.threads Developer
Sally #205374 10/09/2000 12:11 PM
Joined: Aug 2000
Posts: 3,590
Moderator
Ah....beautiful. Sounds good to me!


Sally #205375 10/09/2000 1:06 PM
Joined: May 1999
Posts: 624
Master Hacker
That's a great idea. It will also make it much easier to find the bits we're looking for when we want to change something.


Sally #205376 10/09/2000 2:12 PM
Joined: May 1999
Posts: 78
Member
Sounds good. But if it slows down getting the 5.1 release out I say nooooooooo, please noooooooo! ;) Ah well, just excited I guess :P

Checking back constantly for 5.1,
Lee


Joined: May 1999
Posts: 3,039
Guru
Well, it won't slow it down any because I'm already done with it. :D I just have to spend a day or 2 testing new installs and upgrades to make sure this portion of 5.1 works ok.


UBB.threads Developer
Sally #205378 10/09/2000 3:44 PM
Joined: Aug 2000
Posts: 3,590
Moderator
Can you throw in an auto-refresh? Or is there one that's just really long?


Joined: May 1999
Posts: 3,039
Guru
There is a 60 second auto-refresh in right now.


UBB.threads Developer
Sally #205380 10/09/2000 8:31 PM
Joined: Jun 2000
Posts: 23
Journeyman
I don't know if this makes any sense or not... or maybe my logic is all wrong here. Let me throw it up against the wall anyways. I'm guessing that 80%+ of traffic on installations comes from the four or five main programs (forum, postlist, showflat, showthreaded... ).. I'm sure someone can analyze their log files to see what the real numbers are. But, I just took a look at showflat.pl and at least on my copy there are 28 references to the language files. Not an awful lot, certainly less than the 400+ lines in there.

What if there were some sort of option, especially for higher-traffic sites, to include these 28 lines in a configuration section at the end of the programs that draw the majority of use, and eliminate the call to the language file entirely for those scripts? You could keep these lines in the language file and probably even create a very simple tool for the admin area to synchronize the data in the language file to these high-use programs. So if I went in and changed languages/english.pl, I would then run this tool, which would simply substitute what is found in the language file in place of the old data at the bottom of the script... You could probably do this for all files, but if on 80% of your hits you can avoid compiling this second program, it seems to me that you would get a 40% reduction in load... The other 20% doesn't get that much traffic, so it isn't as big an issue.

Does that make any sense at all?
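The synchronizing tool floated above might work roughly as follows. This is a Python sketch under stated assumptions: the BEGIN/END marker comments, the KEY=VALUE rendering and the function names are all invented for illustration, and nothing like this is known to exist in WWWThreads.

```python
import re

# Hypothetical marker comments delimiting the embedded strings in a script.
BEGIN = "# BEGIN LANGUAGE STRINGS"
END = "# END LANGUAGE STRINGS"


def render_block(strings, keys):
    """Render only the strings one script needs, wrapped in the markers."""
    lines = [BEGIN]
    lines.extend(f"{key}={strings[key]}" for key in keys)
    lines.append(END)
    return "\n".join(lines)


def sync_script(script_text, strings, keys):
    """Replace the embedded block with fresh strings, or append it once."""
    block = render_block(strings, keys)
    pattern = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)
    if pattern.search(script_text):
        # A function replacement keeps backslashes in the strings literal.
        return pattern.sub(lambda _match: block, script_text, count=1)
    return script_text.rstrip("\n") + "\n\n" + block + "\n"


if __name__ == "__main__":
    language_file = {"REPLY": "Reply", "EDIT": "Edit", "UNUSED": "Not needed"}
    script = "print 'showflat';\n"
    print(sync_script(script, language_file, ["REPLY", "EDIT"]))
```

Running the tool again after editing the language file rewrites the same block rather than appending a second copy, so the master file stays the single place to edit.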



Joined: May 1999
Posts: 3,039
Guru
I'm not sure I entirely follow but I'll take a stab at putting my thoughts to words.

The way WWWThreads works with Perl is this. We use and require other scripts and modules. showflat.pl uses the 3 main w3t modules (w3t.pm, w3tvars.pm and w3ttheme.pm). At the top of w3t.pm you will see where it requires the language file. So, basically, all of these scripts are compiled when showflat is run. What I have done now is make it so we only compile the language strings needed for that particular file.

So, even if we move the language strings directly into showflat, this wouldn't be of any benefit now, as you would still be compiling the same amount of code. Also, if the language strings are put directly into each script, then we go back to only being able to support one language at a time.

Again, I'm not sure I followed your post, but I hope I made some sense as well. ;)


UBB.threads Developer
Sally #205382 10/10/2000 12:32 AM
Joined: Jun 2000
Posts: 23
Journeyman
I guess I misunderstood... The way I read the earlier post was that because the language files were getting compiled in, that perl was starting up again for the language file to compile. And my random thoughts were that if we could eliminate that file completely on the main scripts, then perl wouldn't fire up again and there would be a good boost in performance, plus it would benefit disk access by not having to open that file each time as well. Maybe that explains what I was thinking a little better. Supporting only one language is fine for me right now, since I dumped all the language selections from my copy because I added a bunch of stuff to the language file and translating it all is not something I have the ability to do right now.

On an entirely separate note, I have two other questions, unrelated to the language stuff. First, I was wondering if there is a threshold for having too many forums. Right now I have over 200 to start, and I haven't added the college stuff yet (I still have to figure out what to add and how to deal with sorting them), which could add another couple hundred pretty easily. Obviously this won't matter too much in a new forum install, but as everything grows, would I be better off setting up separate installs for the college stuff? Second, I notice some sites use things like boards.domain.com or bbs.domain.com for their forums. Obviously the URLs become a little harder to remember, but I'm assuming they are set up that way so that, either now or in the future, the forums can be run on their own separate server by having a separate domain record for them. Are there any advantages or disadvantages to setting things up this way?



Sally #205383 10/10/2000 5:13 AM
Joined: May 2000
Posts: 10
User
I don't know, but don't you create redundancies if you put the same text in different files (because a certain text is used in different modules)? That could mean a lot more work when making personal modifications.

But I have another idea. Why don't you put all the variables in the database? The key would be the ref-word and the language. At runtime you only have to read the necessary variables. And because database access is buffered, this method could be the quickest.

Maybe you could run a small test, Scream?

Tenovis GmbH & Co KG
Eddie Kreutz
Mail [email protected]


Joined: Jun 2000
Posts: 23
Journeyman
Absolutely, you do create redundancies. That's why I suggested some sort of tool that could be used after you edit the main file which would update the most used scripts in all these places.. I'm just thinking way ahead to having a lot of traffic on these...

I like your database idea, with adding another table that would contain all that information.. You will add one extra sql call to access that table, but it can easily be maintained via the administrator login and a forms interface... that sounds interesting. The only problem I see with the database idea is the structure.. Would you have 1 row for each language and then 400 some odd columns for all the different tags in the language file?

Taking my logic a little further... again keeping in mind that this probably only needs to be done for postlist, showflat and showthreaded (maybe forum and search, just the most-hit stuff), why couldn't you do the same not only with the language files but with w3ttheme, w3tvars and, hell, w3t? Looking through showflat, it looks like only 14 of the config variables are used, 13 of the theme variables and 11 subroutines from w3t (although admittedly I haven't searched through each of those yet to see what else they need); this was just a quick check... So if I wasn't concerned about language, I could potentially move all this stuff into the most-hit scripts, and instead of opening 4 files totalling (approximate sizes on my site) about 140k to run showflat, I would have only my showflat at 48k plus maybe another 5-10k from merging this stuff in. The only thing of any size was the send_header function... And of course either way I still have to deal with header.include and footer.include, which again I can punch into the main script and avoid opening two more files. I admit I don't know the real effect of some of these changes or how they would impact server load or the program's operation, but it seems like it could be a big advantage when dealing with higher traffic.



Joined: May 1999
Posts: 3,039
Guru
The heavily used text strings like username, submit, continue, etc are in a file called generic.pl. This file is compiled at script execution. It basically contains the text strings that are in w3t.pm and the ones that are found in most every script. There are a few redundant text strings in some of the other files but not very many.

Moving these strings into the database wouldn't provide much benefit. It's true that you have some file I/O overhead when you open a file, but if you move them into the database you have to run the query and then pull the strings into a hash, and how this is achieved is different for each database server. So a small file I/O operation to read in the necessary language strings is easier than executing queries.


UBB.threads Developer
Joined: May 1999
Posts: 3,039
Guru
If you are looking at a very large number of forums, you probably want to do separate installs. Because all posts are stored in one table, this table can get quite large if you try to run too much under one install.


UBB.threads Developer
Sally #205387 10/10/2000 2:55 PM
Joined: Jun 2000
Posts: 23
Journeyman
I know that the number of posts will be needed to determine this, but is there a recommended limit, say 200 forums, not to go over? Do you have any stats on what some of the larger sites are doing? I went to Sony Music and saw they have 153k registered users. I was trying to figure out the number of posts, and it looks like they are using a different table or db for each board: I tried punching in low post numbers for different boards and kept getting different posts. Interesting. From the index page, there appear to be over 500k posts. I bumped over to mycoupons.com, and they look like they have "Hundreds of Thousands of Happy Members", with about 20k posts on the main index page, and the post numbers appear to be up to almost 500k now. Internet Movie Database doesn't show members, but shows about 120k posts on the main index page, and the post numbers appear to be just over 350k now. If my math is right, mycoupons gets about 25 replies per post where IMDb gets about 3. Does that sound right? It's interesting. So much for trying to figure out average replies per post and getting total post numbers.

I'm assuming the last two are using just one database and Sony has tricked theirs out pretty well with either multiple tables or multiple databases for posts (I also noticed fansonly.com doing this with multiple tables or databases as well).

So is there a break point somewhere and do you know if sony and fansonly are using separate tables or separate database for their boards?
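The replies-per-post estimate in this post is a simple division of the highest post number by the count shown on the index page. The Python sketch below reproduces that arithmetic using the approximate figures quoted in the post; whether the two counts are really comparable is the poster's own open question, and the function name is made up for illustration.

```python
def posts_per_index_entry(highest_post_number, index_page_count):
    """Ratio of total posts to the count shown on the main index page."""
    return highest_post_number / index_page_count


# Approximate figures as quoted in the post, not measured values.
SITES = {
    "mycoupons.com": (500_000, 20_000),
    "Internet Movie Database": (350_000, 120_000),
}

if __name__ == "__main__":
    for name, (total, shown) in SITES.items():
        print(f"{name}: about {posts_per_index_entry(total, shown):.1f}")
```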



Joined: May 1999
Posts: 3,039
Guru
Sony and fansonly are currently using an old copy of WWWThreads. The database schema used to consist of a separate table for each forum. This turned out to be (1) insufficient and (2) bad practice. Last I heard from Sony, they were ordering a new server to handle their forums and were going to upgrade, but I haven't heard anything from them since.

I don't get much feedback from many of the larger sites so I'm not sure of a limit on forums/posts.


UBB.threads Developer
Joined: May 2000
Posts: 10
User
Hi Jpreeper, Hi Scream

Quote
Would you have 1 row for each language and then 400 some odd columns for all the different tags in the language file?


I mean that each variable in each language has one entry in the database. The structure would be:

language + var_name --> primary key
variable

This is the most flexible structure and stores the least data. It is always possible to add more languages and more variables without changing the database structure.
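The one-row-per-string structure described above can be sketched with a composite primary key. The example uses Python's built-in sqlite3 module purely for illustration (the thread is about MySQL and PostgreSQL); the table name language_strings, the helper names and the sample rows are assumptions, while the column names follow the post.

```python
import sqlite3


def create_schema(conn):
    """One row per (language, var_name) pair, as proposed in the post."""
    conn.execute(
        """
        CREATE TABLE language_strings (
            language TEXT NOT NULL,
            var_name TEXT NOT NULL,
            variable TEXT NOT NULL,
            PRIMARY KEY (language, var_name)
        )
        """
    )


def seed_demo_rows(conn):
    """Insert a few made-up strings for the demo."""
    rows = [
        ("english", "SUBMIT", "Submit"),
        ("english", "NEW_POSTS", "new"),
        ("spanish", "NEW_POST", "nuevo"),
        ("spanish", "NEW_POSTS", "nuevos"),
    ]
    conn.executemany("INSERT INTO language_strings VALUES (?, ?, ?)", rows)


def fetch_strings(conn, language, var_names):
    """Fetch only the variables one script needs, as a dict."""
    if not var_names:
        return {}
    placeholders = ",".join("?" for _ in var_names)
    cursor = conn.execute(
        "SELECT var_name, variable FROM language_strings "
        f"WHERE language = ? AND var_name IN ({placeholders})",
        [language, *var_names],
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_schema(connection)
    seed_demo_rows(connection)
    print(fetch_strings(connection, "spanish", ["NEW_POST", "NEW_POSTS"]))
```

Adding a language or a variable is just more rows, so the schema never changes. The cost is that each script has to name the variables it wants in its query.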

Quote
how this is achieved is different for each database server


But if you do a test on your machine you will see the difference between the method "textfiles" and the method "database"

Tenovis GmbH & Co KG
Eddie Kreutz
Mail [email protected]


Joined: May 1999
Posts: 3,039
Guru
I originally did have the languages in the database when I was first making the switch to dynamic language. The database method took more lines of code, even though at the time I was only supporting MySQL and PostgreSQL. There were 2 reasons for the extra code. Like I said, different database servers handle hash retrieval differently, so there are if statements. Also, you need to specify which variables you want to grab for the script, which means building a specific query for each script. So doing this is much more complex than doing file I/O. Also, timing the scripts to check their process time and load was a wash, meaning there was no real difference either way.


UBB.threads Developer
Sally #205391 10/11/2000 2:31 PM
Joined: Jun 2000
Posts: 23
Journeyman
Scream,
I know you're probably swamped right now, but if I do separate installations for the two, is there any way to sync up the user info... I'd rather not have to require the visitors to have two username/password combos and I would like to maintain all their preferences, marking new messages, private messages, etc.. across both because people interested in college football are probably interested in NFL, and they would be in two different databases.



Joined: May 1999
Posts: 3,039
Guru
This could be a slight problem, since if you do an install for each one there is no real way to tie them together. I guess the main thing would be how long you plan to keep old posts. MySQL can handle a very large number of records, so having one install might not be a problem unless you plan on keeping posts for a very long period of time. If you put a regular purging schedule on the forums and then run OPTIMIZE TABLE w3t_Posts once in a while, you might be ok.

But again, this is basically a guess as I don't have any hard facts to give you the best answer.


UBB.threads Developer
Sally #205393 10/12/2000 5:24 PM
Joined: Jun 2000
Posts: 23
Journeyman
Fair enough... Let me ask this then: do you know what the largest number of posts you have seen in an installation is? As I mentioned earlier, I've seen I think 400k to 500k. And some rough math tells me that if there is a 2GB limit on MySQL tables and an average post is, what, 2k, then we would be looking at roughly 1 million posts max, or 2 million if the average post is 1k. Does that sound about right?

Looks like I'm in for some major hacking to get the college stuff in :(
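The rough capacity math in this post can be written out as below. This is a Python sketch that takes the 2GB table limit and the average post sizes as the poster's assumptions, and ignores index and per-row overhead, so the real ceiling would be lower.

```python
# Assumed table size limit from the post: 2 GB, counted in binary units.
TABLE_LIMIT_BYTES = 2 * 1024**3


def max_posts(avg_post_bytes, limit_bytes=TABLE_LIMIT_BYTES):
    """Upper bound on rows if every post took exactly avg_post_bytes."""
    return limit_bytes // avg_post_bytes


if __name__ == "__main__":
    print(max_posts(2 * 1024))  # 2k average post
    print(max_posts(1024))      # 1k average post
```

With a 2k average this gives 1,048,576 posts and with a 1k average 2,097,152, which matches the "roughly 1 million or 2 million" estimate above.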



