Wednesday 30 October 2002

MT+MySQL·PHP 04: To .php or not to .php

I know that I’m supposed to be suggesting answers in this PHP/MySQL series but here’s an intriguing question instead: when I move my site to the new hosting service (upgrading to Movable Type 2.5 at the same time), should I change my files from *.html to *.php or not?

Here are my objectives:

My original plan was to:

If this seems utterly crazy, remind yourself that it’s a strategy devised by a non-programmer and that my point in writing this post is to solicit technically superior alternatives.

However, Shelley’s experience with the slashdotting of her post Parable of the Languages gave me pause about saving my files with .php extensions. Mark Pilgrim observed:

Shelley’s been Slashdotted for her “Parable of the Languages” story. Ouch. I hear PHP screaming right about now.

Shelley verified the truth of Mark’s prediction:

According to my statistics package, I had over 25,000 visits from Slashdot the last few days. I do know that my server felt the strain Friday night, but held together. I learned an important lesson from the experience — don’t save your pages as ‘php’, unless you really need the PHP functionality.

Don’t get me wrong. I have absolutely no expectation of being slashdotted, given the esoteric nature of my blog. But Web publishing is an ongoing Purgatory (as Loren Webster will attest) and I simply want to minimize the grief.

As a consequence of working on ASP and ColdFusion sites for years, I’d mistakenly assumed that a .php extension was essential if I wanted to use PHP’s functionality in my weblog (to display random images or alternate the background colors in my comments, in both cases without using JavaScript). But as Norm Jenson and Tommy Williams kindly pointed out, I can call PHP code in my .html files by adding the following statement to my .htaccess file:

AddType application/x-httpd-php .html

So I came up with an alternative strategy:

So, which is the better course of action? Staying with the .html files is simpler. But lots of people (for example, Phil Ringnalda, Brad Choate, Jeff Ward) use .php extensions and—although this is incredibly embarrassing—.php extensions seem way cooler to me than boring old .html extensions. I know that I’m not supposed to be attached to such worldly affectations, but I’m human.

And, here’s the real question: does PHP handle .html pages and .php pages differently (as Mark’s and Shelley’s remarks suggest)? Any and all suggestions will be gratefully received…


Comments

The problem with adding .html to the list of extensions that PHP will parse is that every HTML file that you have will be parsed, whether it contains markup or not. This adds overhead (though I don't know the exact numbers, I figure that any extra overhead, especially if being Slashdotted, is a bad thing).

What we did at work when faced with a similar problem was create a custom 404 page that, if reached by someone looking for a nonexistent .html page, would automatically bounce them to the equivalent .php page (so, /archive.html => /archive.php). If that .php page doesn't exist either, they land on the 404 page a second time and it shows the actual 404 message (this works because the second time around they are requesting the .php page that you bounced them to).

It adds a barely noticeable 'double-pop' if they come looking for a page that really isn't there, and makes a pretty good user experience if they just surf in on an outdated link.

You can also do some really cool stuff with .htaccess files and URLRewrite to accomplish the same thing in (probably) a more efficient and clean way, but I couldn't tell you how :)

Posted by: John on 30 October 2002 at 09:59 PM
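
A minimal sketch of the rewrite-based alternative John alludes to, for Apache with mod_rewrite. It assumes the .htaccess file lives in the document root and that the host permits rewrite rules there; it has not been tested against this site's setup. A request for a missing .html page is redirected to the .php page with the same base name, but only when that .php file actually exists:

```apache
# Assumption: mod_rewrite is enabled and this .htaccess sits in the document root.
RewriteEngine On

# Only act when the requested .html file does not exist...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and a .php file with the same base name does exist.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
# Redirect to the .php equivalent, e.g. /archive.html => /archive.php
RewriteRule ^(.+)\.html$ /$1.php [R=301,L]
```

Because the redirect fires only when the .php file is present, a page that really is gone gets the normal 404 straight away, with no 'double-pop'. Swapping the two extensions in the rule gives the reverse redirect, from .php to .html.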

Adding _any_ type of dynamic functionality to a page will slow it down, be it PHP, mod_perl or mod_python, server side includes, or what have you. Serving straight .html files is the fastest. Server side includes of static files (which IIRC is Mark Pilgrim's strategy) is second fastest. What you think is third fastest depends on your religion.

Whether you prefer static files or dynamic pages is a matter of religion as well. Static files are OK as long as you never change them. I switched away from MT because I could no longer rebuild my site at either of my hosters. B2 will scream if I'm ever slashdotted, but at least it works.

Posted by: PapaScott on 30 October 2002 at 10:34 PM

Jonathon, I originally had PHP pages because of dotcomments. However, when I went to MT comments, it was no longer needed on the individual page archives. Which means it wasn't needed for the slashdotted page.

I would suggest that you don't use any functionality on your individual and other archives -- other than comments -- and save these as .html. Then, save your main index page as PHP (index.php). With this, you can easily add PHP functionality into your main page (including the image manipulation). If people link, though, chances are they'll link to the individual item. And since that won't be PHP, your performance won't be impacted.

As for the comments, this is handled through a CGI application and PHP won't help you with the popup window. As for comments in the individual archives, you would lose the ability to alternate colors using PHP and would have to rely on JS.

(BTW, sorry I'm more brain dead than usual, but how were you going to merge PHP with MT comment functionality to get alternating color without JS?)

When I move my weblog, I will most likely use this approach, and will create an errorhandler to redirect weblog requests from a php file to its html equivalent.

I would strongly recommend against redefining your server to process all HTML pages as PHP.

Posted by: Shelley on 30 October 2002 at 10:55 PM
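
One way to avoid processing every HTML page as PHP, sketched here as an assumption rather than anything proposed in the thread, is to scope the PHP mapping to a single file with a <Files> block. This uses ForceType in place of the AddType line from the post, and assumes the same mod_php setup plus a host that allows the override in .htaccess. The filename index.html is only an example:

```apache
# Assumption: mod_php is installed and .htaccess overrides are permitted.
# Only the file named index.html is handed to PHP; every other .html
# file is still served as plain static HTML.
<Files index.html>
    ForceType application/x-httpd-php
</Files>
```

With this in place the index page keeps its .html name while the archive pages skip the PHP parser entirely.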

If you're really concerned about performance and want to use PHP, have a look at what Yahoo! is doing: http://public.yahoo.com/~radwin/talks/yahoo-phpcon2002.htm

They're switching their site over to PHP entirely.

However, they're using the ionCube PHP accelerator (http://www.php-accelerator.co.uk/). Not sure whether you would be able to install that, but it's a possibility if you truly, deeply want to run all your pages through PHP and you're worried about tremendous traffic.

Posted by: Tommy Williams on 31 October 2002 at 02:36 AM

.php extensions are only cool if you're non-technical. Once you reach Nerdvana, it's much cooler to use PHP but use your Nerd Ninja techniques to hide the fact. Then you can bask in the satisfaction of knowing that you're using PHP *and nobody can tell*. It's your secret.

Plus, if next year someone writes a better system than Movable Type, and it relies on Python and ZOPE, for example, and everyone is rushing to adopt it, you're not stuck with those silly .php extensions. You can just use your Nerd Ninja techniques to hide the fact that you're using Python and ZOPE rather than Movable Type and PHP.

Posted by: ralph on 31 October 2002 at 03:04 AM

Jonathon, what are you going to do about comments? Your old archived pages, such as this one, have a comment form in them, and an associated entry id. If you re-import a limited set of pages, the entry identifiers and associated comments will be different. So if a person wants to comment on an old archive page that hasn't been ported, and tries to post, the comment will get posted to the wrong page. Or possibly cause a failure.

I am following your export/import plan, but I'm thinking of removing the comment form from the individual page template and then re-generating them one more time. The existing comments would show, but new ones couldn't be added.

However, sure would be open to other ideas. Unfortunately, my clock is ticking so I've got to get my act together and start getting this stuff moved. Sigh.

Posted by: Shelley on 31 October 2002 at 06:38 AM

Oh, man. "Nerd Ninja." Gotta love that.

Posted by: Dorothea Salo on 31 October 2002 at 08:58 AM

Well... the consensus seems to be that I should stick with .html pages, which has the additional advantage that it's a lot less work. I like Shelley's idea of using .html for the archives and .php for the index page only. And thanks to Ralph for pointing out that using PHP but hiding the fact is cooler than displaying .php extensions (though I don't anticipate attaining Nerd Ninja status any time soon).

Shelley, thanks for uncovering the fundamental flaw in my "re-import a limited set of pages" strategy. I hadn't considered the duplicated post numbers. I'd still like people to be able to comment on the "hidden" archived posts so I think I'll re-import all the posts and use the SQL plug-in to filter out the unwanted posts from the "new" blog.

Shelley also asked: "BTW, sorry I'm more brain dead than usual, but how were you going to merge PHP with MT comment functionality to get alternating color without JS?"

I had this wacky idea that I could use .htaccess to have PHP process the (.cgi) comment files once MT was done with them. BTW, I'd advise you not to try to compete with me in the category of brain deadness!

Posted by: Jonathon Delacour on 31 October 2002 at 10:49 AM

Okay, you've piqued my interest. What will this new focus be?
This is a great post. It is answering many questions I've been thinking about. Thanks Jonathon and thanks to all those who have taken the time to comment.

Posted by: Norm Jenson on 31 October 2002 at 11:46 AM

>I had this wacky idea that I could use .htaccess to have PHP process the (.cgi) comment
> files once MT was done with them.

Sorry, you can't do that (easily or nicely). Apache (1.x) can only process the "output stage" of the request once. With Apache 2.x you could set up filters so the output of one handler will be processed by the next. Last I checked (some time in the summer), PHP was still not supporting it. (Filters, it is called in Apache lingo).

Your best bet is to make MT export the comments into files with a separate template and then have php process those files.

Don't worry about performance. It should be trivial for your server to handle much more than 5-10 requests per second with php. What makes "dynamic pages" slow is usually data access or brain dead processing. If you just use it for simple tasks it'll be hard to shoot yourself in the foot too hard.

Just 10 requests per second is 36000 per hour.

And I second the suggestion to keep using .html.
http://www.w3.org/Provider/Style/URI.html


- ask

Posted by: Ask Bjoern Hansen on 31 October 2002 at 02:14 PM

Ask, thanks for the PHP stats and the pointer to the W3C URI page -- there's tons of useful information there.

Norm, I'm thinking that I'll focus on posts about cats, the Dishmatique, and beer. Just kidding. You'll have to wait and see...

Posted by: Jonathon Delacour on 31 October 2002 at 09:49 PM

I'm a .html person that is parsing .php on my blog pages - and I've never noticed a server strain.

As for the move of the site, it's a challenge in MT to keep your posts in the same order. I would take them off of your old host, direct everyone to your new site and use a nice MT Search option to let them find the post they are looking for. However, if you go to the MT forums I'm sure there are other suggestions there on how to handle a move like yours.

Posted by: Christine on 1 November 2002 at 09:27 AM

FYI - I use my individual archive template in MT as my comment pop-up window. That way I could run PHP on the comments window so that I could "skin" it to coordinate with the rest of my site. Best of both worlds, it just uses a template that is scaled down rather than my full index template. (If that didn't make any sense and you want a more in-depth explanation, feel free to e-mail me.)

Posted by: Christine on 1 November 2002 at 09:30 AM

file extensions can be purely cosmetic. you needn't mix the two separate questions here: 1) should you use php to dynamically parse pages? 2) should you use .php extensions or .html extensions? the first question has been answered already, but that doesn't mean the second has. if you think .php extensions look cool, there's no reason you can't serve .php pages unparsed (i.e. with no additional delay) in the same way you would force parsing of .html pages in .htaccess files:

AddType text/html .php

you could also do the same with ".jonathon" files, or use whatever extensions you want with either parsed or unparsed files. if you want to see this in action, just look at blogger.com's .pyra pages. there is no "pyra" file type - that's the name of their company.

Posted by: scott reynen on 1 November 2002 at 02:53 PM

This discussion is now closed. My thanks to everyone who contributed.

© Copyright 2002-2003 Jonathon Delacour