Category Archives: PHP

WordPress Automatic Downloads and Plugin Updates Permissions

Going crazy following posts on the Internet trying to set permissions on your WordPress directory to get automatic plugin downloads and updates working?  So was I.  Several times on several sites.  Trying to keep security in mind I didn’t want to give too much.  After applying the DIRECT method to the wp-config.php file I had to do this to get plugins to download, create the directory and install.  Keep in mind all permissions were the default except for this:

chmod 775 wp-content/
chgrp nobody wp-content/
chgrp nobody -R wp-content/plugins/

‘Nobody’ should be the user that the apache process runs under. You do not need to set permissions to 777.

PHP APC config syntax causes [apc-error] apc_mmap: mmap failed: Invalid argument

After upgrading Ubuntu server from 9.10 to 10.04LTS PHP’s APC cache wasn’t functioning.  Apache wouldn’t start, it hung in the process list and printed this error to /var/log/apache2/error.log

[apc-error] apc_mmap: mmap failed: Invalid argument

The apache process would show up in the process like this:

apc@hostaname# ps aux | grep apache
www-data 6958 104 0.0 139044 3624 ? R 12:47 0:19 /usr/sbin/apache2 -k start

This process would then have to be killed, APC commented out, and then the web server restarted just to continue on without APC until a solution was found.

The PHP manual states this regarding MMAP support in APC:

http://php.net/manual/en/apc.configuration.php

When APC is compiled with mmap support (Memory Mapping), it will use only one memory segment, unlike when APC is built with SHM (SysV Shared Memory) support that uses multiple memory segments. MMAP does not have a maximum limit like SHM does in /proc/sys/kernel/shmmax. In general MMAP support is recommeded because it will reclaim the memory faster when the webserver is restarted and all in all reduces memory allocation impact at startup.

APC was made to run by commenting out all lines from the PHP config file except for:

extension=apc.so
apc.enabled = 1

This config can exist in a number of places. In 9.10 APC had been compiled by PECL so it was in our /etc/php5/apache2/php.ini file. However, in 10.04 APC is a package so we removed the PECL version, installed the version using apt-get install php-apc and moved the configuration to /etc/php5/conf.d/apc.ini for better consistency.

pear uninstall apc
apt-get install php-apc

As I began to uncomment lines one by one, it turned out the culprit was in the apc.shm_size directive. The default size is 30M, but as soon as the directive was uncommented it crashed Apache. I was unable to specify any value at all, even the same or lesser value. I even tried with quotes and removing quotes. That’s when I started thinking syntax may be a problem because it works when using the default value (shm_size commented out) but fails with an “invalid argument” error. That makes me think APC is sending an invalid argument to MMAP. In which case I find this post that confirms my suspicion.

http://stackoverflow.com/questions/6716929/apc-configuration-on-ubuntu-10-4-problem-with-apc-shm-size-apc-shm-segments-an

It turns out that the “M” for Megabytes cannot be specified in the shm_size directive for APC in Ubuntu server 10.04 because it is using APC version 3.1.3p1. However, on 9.10 APC wasn’t included as a package so it was installed with PEAR PECL which installed a more recent version of APC (3.1.9) which did allow specifying the “M” in the shm_size directive.

If you wish this to work in your config file, it should read like this in older versions of APC:

apc.shm_size = 100

This would specify 100M shared memory segments, and would be equivalent to this in newer versions:

apc.shm_size = 100M

And you can also put quotees around the “100M” if you like.

After these changes I had Apache up and running again, the APC cache helping PHP along, and some of the quickest loading pages I’ve seen in a while.

Changing phpBB User Control Panel Welcome Message (L_UCP_WELCOME)

Trying to integrate a forum into an existing site with its own login already has been a challenging task. I’m getting a good understanding of the layout of phpBB though.

One thing I didn’t find much information about was changing the welcome message in the User Control Panel. It reads like this:

Welcome to the User Control Panel. From here you can monitor, view and update your profile, preferences, subscribed forums and topics. You can also send messages to other users (if permitted). Please ensure you read any announcements before continuing.

One problem, I’m integrating with an existing system which has its own account management built in along with custom private messaging, so the user won’t be able to change their email, password, or message others. So the message needs to be changed.

In your theme templates folder, this is found in the file ucp_main_front.html. Near the top it looks like this:

{L_UCP_WELCOME}

But the definition is a long way from there. I find these definitions to be scattered all over, with many found in the functions.php file, but in this case the L_ prefix denotes it is a language define. That means you’ll find the definition in the language/en/ucp.php file. EN if you are using English language that is.

BUT, if you have some nice editing software that can find definitions throughout the whole project directory, you’ll be disappointed when you couldn’t find L_UCP_WELCOME. That’s because you’ll find it under UCP_WELCOME instead. Confusing? Open the file listed above and you’ll find it is ONE great big array bearing the name $lang. Somewhere down the list you’ll find:

	'UCP_WELCOME'					=> 'Welcome to the User Control Panel. From here you can monitor, view and update your profile, preferences, subscribed forums and topics. You can also send messages to other users (if permitted). Please ensure you read any announcements before continuing.',

Just change it to whatever you like.

Personally I don’t like having to update the language file manually after an update, but that’s the current level of ability with this software. Change UCP_WELCOME to something you like and it will show up in place of the L_UCP_WELCOME placeholder in the template when the page is loaded. The other alternative, if you are editing the style template files for customization already, is to just hard code it in HTML, as long as you don’t plan on using other languages.

Updating Drupal from 5 to 6 causes “The post could not be saved” – Postgres version fix

I recently upgraded several Drupal sites to 6.2 Some were 5.1, 5.2, and 6 previously. Upon completion of what I thought was a successul upgrade, they all suffered from the same common problem – trying to save a new post resulted in a “The post could not be saved” error. There were also various other database access errors in logs. On one site it was observed that after trying to make a post, Drupal would take it upon itself to take the site offline with the famous “Welcome to your new Drupal website” message. The problem there turned out to be a permission access problem, but only occurred after a failed post was attempted. To repair, go to admin/content/node-settings and select the “rebuild permissions” button. Once the permissions were repaired it was fine – until you made another post and then Drupal repeated the offline cycle and the beginning of page access SQL errors in the log again.

Fortunately there were some hints as to the problem, but the real problem is that they were mostly for MySQL, and I was using Postgres. Here’s my solution to the problem.

First, the most verbose error message I got was on this very site, where it looked like this after trying to post:

* warning: pg_query(): Query failed: ERROR: duplicate key value violates unique constraint “node_revisions_pkey” in /path/to/drupal-6.2/includes/database.pgsql.inc on line 139.
* user warning: query: INSERT INTO node_revisions (nid, uid, title, body, teaser, log, timestamp, format) VALUES (0, 1, ‘test’, ‘test test test test test test test test test test test vv’, ‘test test test test test test test test test test test test test vv’, ”, 1294612270, 4) in /path/to/drupal-6.2/includes/common.inc on line 3538.
* warning: pg_query(): Query failed: ERROR: duplicate key value violates unique constraint “node_vid_idx” in path/to/drupal-6.2/includes/database.pgsql.inc on line 139.
* user warning: query: INSERT INTO node (vid, type, language, title, uid, status, created, changed, comment, promote, moderate, sticky, tnid, translate) VALUES (0, ‘blog’, ”, ‘test’, 1, 1, 1294612270, 1294612270, 2, 1, 0, 0, 0, 0) in /path/to/drupal-6.2/includes/common.inc on line 3538.
* warning: pg_query(): Query failed: ERROR: duplicate key value violates unique constraint “node_comment_statistics_pkey” in /path/to/drupal-6.2/includes/database.pgsql.inc on line 139.
* user warning: query: INSERT INTO node_comment_statistics (nid, last_comment_timestamp, last_comment_name, last_comment_uid, comment_count) VALUES (0, 1294612270, NULL, 1, 0) in /path/to/drupal-6.2/modules/comment/comment.module on line 607.
* The post could not be saved.

If you are using postgres on the command line via the psql client, you can examine all sequences using the command

\ds

You’ll see that both node_nid_seq and node_revisions_vid_seq have been created. If you’ve tried creating a number of new pages or posts and failed, you’ll notice if you use nextval() like:

SELECT nextval('node_nid_seq');

It will be quite a bit advanced from your actual last successful post’s nid, which you can see like this:

SELECT max(nid) FROM node;

However, if you do this:

SELECT nextval('node_revisions_vid_seq');

Make a note of it. Then find your actual last node_revisions max vid like this:

SELECT max(vid) FROM node_revisions;

And you’ll see it is probably only one higher.

At this point, examine the node table like this:

\d node

                                  Table "public.node"
  Column   |          Type          |                     Modifiers                      
-----------+------------------------+----------------------------------------------------
 nid       | integer                | not null default nextval('node_nid_seq'::regclass)
 vid       | integer                | not null default 0
 type      | character varying(32)  | not null default ''::character varying
 uid       | integer                | not null default 0
 status    | integer                | not null default 1
 created   | integer                | not null default 0
 changed   | integer                | not null default 0
 comment   | integer                | not null default 0
 promote   | integer                | not null default 0
 moderate  | integer                | not null default 0
 sticky    | integer                | not null default 0
 mos_id    | integer                | 
 language  | character varying(12)  | not null default ''::character varying
 tnid      | integer                | not null default 0
 translate | integer                | not null default 0
 title     | character varying(255) | not null default ''::character varying

And you can correctly see under “Modifiers” that the default value for nid is the next value in the node_nid_seq, which is the equivalent of the more familiar MySQL AUTO_INCREMENT attribute.

Now try the same on node_revisions:

\d node_revisions


                       Table "public.node_revisions"
  Column   |          Type          |               Modifiers                
-----------+------------------------+----------------------------------------
 vid       | integer                | not null default 0
 uid       | integer                | not null default 0
 body      | text                   | not null
 teaser    | text                   | not null
 log       | text                   | not null
 timestamp | integer                | not null default 0
 format    | integer                | not null default 0
 nid       | integer                | not null default 0
 title     | character varying(255) | not null default ''::character varying

And you’ll see that the default value for vid is zero (0). Of course it isn’t auto-incrementing and so you are getting errors trying to write because you already have a vid with the value 0 in it:

SELECT min(vid) FROM node_revisions;
 min 
-----
   0
(1 row)

So what is the solution?

ALTER TABLE node_revisions ALTER COLUMN vid SET DEFAULT nextval('node_revisions_vid_seq'::regclass);

Double check to make sure it went through:

\d node_revisions
                                   Table "public.node_revisions"
  Column   |          Type          |                          Modifiers                           
-----------+------------------------+--------------------------------------------------------------
 vid       | integer                | not null default nextval('node_revisions_vid_seq'::regclass)
 uid       | integer                | not null default 0
 body      | text                   | not null
 teaser    | text                   | not null
 log       | text                   | not null
 timestamp | integer                | not null default 0
 format    | integer                | not null default 0
 nid       | integer                | not null default 0
 title     | character varying(255) | not null default ''::character varying

…And it looks good!

We almost switched to WordPress for one site because of this error, but thinking about converting 5 years and 1250 posts from PostgreSQL Drupal to MySQL WordPress didn’t sound like fun. Fortunately the fix was easy, but nailing down the problem took some time first. Hopefully this helped someone else.

Validating POST values from HTML in PHP revisited

In a recent article titled “Submitting POSTed HTML Forms and Registering Variables in PHP” we examined a better way to treat tainted POST variables passed to your PHP script from an HTML form. We decided to create an array called $FORM_FIELDS that would contain only the names of the elements that we wanted in our new array thereby cleaning up all fields in one statement (loop) and omitting unwanted fields from becoming part of memory.

In a recent article titled “Submitting POSTed HTML Forms and Registering Variables in PHP” we examined a better way to treat tainted POST variables passed to your PHP script from an HTML form. We decided to create an array called $FORM_FIELDS that would contain only the names of the elements that we wanted in our new array thereby cleaning up all fields in one statement (loop) and omitting unwanted fields from becoming part of memory or registered session variables.

The creators of PHP went one further in this step by making the $FORM_FIELDS array a multi-dimensional array. They added a couple fields in this array to represent the data type and an additional function to process the data.

The array may look like this:

$FIELDS = array(
'country_id' => array('type' => 'int', 'function' => 'addslashes')
);

Now when you process your loop as done in the previous article you would just add a line for the datatype:


if (isset($FIELDS[$name]['type']))
{
$tmp[$name] = settype($FIELDS[$name], $FIELDS[$name]['type']);
}

The reason this is important is incoming values from an HTML form all seem to be of type ‘string’. If you are inserting this information into a database or otherwise performing calculations on it then it is nice to know it is of the correct data type that your functions expect.

But there’s more. If someone is attacking your program by sending POST values different from those you expect or different from those in your form then this will help.

You simply set the data type before you perform validation. If you have a field where a user selects the country name from a drop down list box and the value sent to your form is the country_id then you would pick ‘integer’ as the data type. But what happens when a user sends a string? If you try inserting it to a database you get an error. If you try to compare it against a list of known country_id’s for validation purposes you will also get an error.

If you perform the settype() function in PHP before your validation you will either set it to the correct datatype or get a value of zero. If a malicious user sent a string of ‘Aruba’ for your country_id field and you cast it to an integer you would get a zero. Anything other than an INT would become a zero. You can then proceed with your validation which would presume that a value of zero is not valid.

You will want to take a look at the PHP manual section called “Type Juggling.” There are certainly some caveats and you may not expect some conversions to be logical. For example, if you had the string “10thingsIhate” and you cast it to an INT you would have int(10).

Brush up on the PHP types. They can save you some work in your validation of HTML form input in your PHP scripts.

PHP’s APC cache and how it relates to Apache Bloat

You just finished your install of Apache, PHP and APC cache on a VPS and you’re beat. You spent about a day going over every configuration and compile option trying to eek performance out of your VPS machine and web server. Then you take a look at the output of top or ps and notice, hey, my Apache process is 150 Megs! Well isn’t that grand. Compiling the Big Apache on a limited resource VPS can be a challenge for those of us who like to tinker. On the last VPS I got I scrapped the packaged stuff and started on compiling an Apache for the machine. The problem is trying to meet your needs with the limited memory you usually get on these machines. Although I like lighttpd, I’m not ready to give it a go on a production server that is being monitored by the media all the time. I need something as guaranteed as I can get. Although lighttpd may be such a beast, I’m not going to test it out on this site at this time!

I sat down and started to go over all configuration option for Apache, PHP and APC to get a nice small package to do the job. I was surprised how much junk PHP has compiled in by default. Compile it once and execute a phpinfo() and you’ll see. Just start with the “–disable” switches for anything you don’t want.

After about a day of thinning out the size of my stack I slipped into TOP and used the SHIFT-A toggle to see the alternate views. I noticed something odd…..

3  PID %MEM  VIRT SWAP  RES CODE DATA  SHR nFLT nDRT S  PR  NI %CPU COMMAND
 24140  0.1 78592  69m 7228  504 2124 2952    6    0 S  16   0    0 httpd
 11329  0.1 78408  69m 6908  504 1964 2736    2    0 S  16   0    0 httpd
 24141  0.1 78564  70m 6752  504 2120 2492    0    0 S  16   0    0 httpd
 24142  0.1 78564  70m 6732  504 2120 2476    0    0 S  16   0    0 httpd
 27813  0.1 78560  70m 6684  504 2116 2432    0    0 S  16   0    0 httpd

Doesn’t that seem like a lot of VIRT and SWAP for Apache2? Yes, it is. If you used the suggested 128 for apc.shm_size it will be a lot bigger too! I used 64M and you can see that at 78592 minus the 64M shared mem for the APC cache, the actual size is around 13M for apache with a 6.7M RES. That makes me feel better! I was so worried I had gotten something horribly wrong.

If you happen to run across this kind of thing in testing, simply set your apc.enabled in php.ini to ‘0’ to disable it, restart Apache and check ‘top’ again. Likely nothing to worry about.

The machine I am running is only hosting 4 sites so 64M is a good starting point, but what you should do is copy the file ‘apc.php’ from your APC source directory into a folder on your webserver and visit it frequently to see how it behaves. Depending on your other settings like the cache lifetime and garbage collection, if you have a large portion of FREE and a small USED then you can probably set the apc.shm_size lower. If it makes you feel better 🙂

Drupal multisite configuration problems

Drupal multisite setup configuration

I’ve adopted Drupal for a majority of my online activities in the past year. It has enough of the things I like and is much faster and more stable than previous CMS or forum software I’ve used. However, some documentation seems to be lacking. Multisite configuration with Apache using a single Drupal codebase is one area.

I finally decided to try a project in which I would use a common codebase for Drupal across all of my websites. That is, rather than having a directory for each website that has a Drupal installation in it, I thought I would take advantage of Drupal’s multisite functionality by having one drupal installation (codebase) in a central directory and have all the project websites point to it.

Drupal multisite advantages

The advantages of Drupal’s multisite feature are easy to spot.

First, a common codebase means only one codebase to change during an update – even though drupal requires you to update each websites database separately.
Second, using an intermediary cache like APC for PHP means that you can use less server memory caching files across multiple websites because the core files are the same.

Despite this great feature and its advantages, there is much confusion at the Drupal.org website from patrons as to what is the correct way to set this up. The Drupal install file goes into it pretty good, but leaves out one important detail – the web server configuration. Some purists say that is an Apache issue and we should leave that to the Apache mailing list but hey, PHP does a pretty good job of detailing the Apache configuration in their instructions so Drupal could too.

Most people who administer their own websites would follow a ritual in creating a new website. Create a directory for the new website, create the www subdirectory and then configure the Apache DirectoryRoot directive. With a single codebase used by multiple domains and websites you need a different approach.

The missing multisite ‘key’

The most important thing here is to make sure each website’s DocumentRoot in the Apache configuration points to the common drupal directory. This is an unusual configuration for most people but it works well.

The idea is that Drupal receives information from PHP (and Apache) to tell it which website it is supposed to serve. When it determines that, it will retrieve the correct configuration file (database passwords and URL base) from the Drupal ‘sites/’ subdirectory. That’s how it can determine which website to display.

Here is a quick breakdown.

Install Drupal in a common directory. Here I chose /var/lib/drupal-4.7.4

Caveats

Some people use a symbolic link to point Drupal to the correct distribution. I’ve done this but I suggest not doing it. Instead install your Drupal installation in a single directory titled according to it’s version number. That is, use the same directory structure and names that Drupal gave when you extracted the zip or tarball.

Why? If you have a couple sites and it comes time to upgrade you can run into trouble. It is easy to just recreate a symbolic link pointing ‘drupal/’ to ‘drupal-4.7.4’ but unfortunately that affects ALL your websites instantly. Not good on a production server. If you have 50 websites this means that you have 50 sites using the new codebase and the same 50 websites awaiting your hand to manually update their databases using Drupal’s supplied script. If any of those 50 are accessed during this wait period you’ll be in trouble.

The other disadvantage of the symbolic link comes if you are using third-party supplied modules. You forgot that 5 out of your 50 websites are using a module that isn’t being maintained on the same schedule as the Drupal project and guess what? It breaks your website. I’ve found that frequently a misbehaving module or even an errant ‘files’ path in the Drupal settings will disable all other Drupal modules. Best to avoid this altogether.

You can save yourself from both of these negative scenarios by simply putting Drupal in the supplied directory name and then adjusting your Apache DocumentRoot directive as you update them. It is an extra step but very easy.

The final advantage of not using a symbolic link is that you can hold a stable website. That is, you can have a couple different versions of Drupal on your server used by different websites. Several hosting providers do this with PHP and that is a very good example. If you upgrade to PHP 6 for one project you’ll find it doesn’t work with Drupal so you need to keep an old version of PHP installed for non-compliant websites. With Drupal I’ve found that once in a while a website gets working just right and I don’t want to update it. Or I quit monitoring it. Or it is an informational content only website with no subscriptions allowed. Or a million other reasons. Basically I will ‘freeze’ that website and not allow anymore code updates. Or, say you have a module that you use but the author abandoned it long ago. It won’t work with Drupal 6 for example. Simply freeze the website and only update your other websites with the new Drupal codebase. Keeping a couple distributions around can be handy but it means that you can’t point them to the codebase with a symbolic link.

Unzip the new codebase in parallel to the old.

# tar zxvf drupal-4.7.x.tar.gz

Backup the database for the website you are about to update.
Exact commands vary with which database you are using.

Edit Apache config file to change one website at a time.

# vi extra/httpd-vhosts.conf

Change:


DocumentRoot /var/www/example.com/www
ServerName www.example.com

To:


DocumentRoot /var/lib/drupal-4.7.4
ServerName www.example.com

Restart Apache so the config takes effect

# apachectl graceful

Visit the Drupal supplied database update URL for your website.

http://www.example.com/update.php

Watch for errors and check your logs. Visit the website and check the settings to make sure all modules still show up under the admin -> settings menu.

If successful, continue on with the next website on your server.

Maintaining Standards with PHP Code

In this article I want to talk a bit about standards. Not necessarily standards like you’d find with W3C, and not code formatting standards, but coming up with a consistent plan as you code and sticking with it. Staying close to the PHP manual and following developments with PHP so you don’t get bit in the butt later. I’ve been working with PHP for some years and have seen it go from version 2 through the current version and there have been many changes along the way.

In particular, I was thinking of two specific situations from my past which can bring down a site unexpectedly.

Shared Hosting

This is hard, but for most sites it is how they get online. You find a host, pay a monthly fee, and they take care of the rest. You just code and upload. This is fine, but the lack of control can be frustrating at times. Not only that, you never know who the person is playing SysAdmin on the other end of the wire.

In early 2000 I built a site for a company and it ran for nearly a year when I made a major international move. We were getting ready for another modification after I got settled. Amazingly enough, I was moving to one of the cities where this company had an office. When I got to my destination in early 2001 I had a horrible time getting an Internet connection. I think they were still in the dark ages.

Unfortunately, during my 6 weeks without an Internet connection (known now as my ‘blackout’ period), the shared host decided that PHP 4 was mature enough to upgrade. They sent out plenty of warnings that major changes were coming….but I didn’t get them without e-mail! One day they finally changed over and I had to go borrow a computer to find out why the site didn’t work. It turned out that PHP changed the format for notation for Apache directives. Very simple, very minute, probably not even necessary from my point of view. We got it fixed and we were back on the road and in a couple months we were working toward another redesign.

They say that waking up is hard to do

Have you experienced this? One day your site is down. You clear your schedule, visit the mailing lists and forums and spend your day (or two) finding out what went wrong. Waking up from this nightmare may leave you in a cold sweat. Keeping up with PHP or any software can be difficult. PHP isn’t bad, but you will notice some inconsistencies in the interface and over time things like that will be fixed, eventually breaking old code. What’s even worse is keeping up with shared hosts. Let’s take a look at scenario #2 from the hosting company’s point of view.

I worked for a typical Internet hosting company. We had many web sites on our machines, all with access to PHP. Although I can’t go into details, I would say their implementation of PHP was unusual. It was certainly not stable or secure.

It wasn’t unusual for the SysAdmin to come to work in the morning, read his e-mail, find out there was an upgrade for PHP released overnight and install it. That’s fine for your development server, but would you do that for the machines hosting many thousands of web sites?

Well, they did, they still do and so do many shared hosts. Not the good ones of course. The good ones I’ve been with test very thoroughly before making upgrades. Then they notify the clientele in advance of a change so they can monitor their site.

The problem is, you probably don’t know who is on the other end of your web site. What is their level of ability? Their work ethic? You probably don’t know.

The Standard

The ‘standard’ in this case simply reads “Keep up to date.” Read the PHP manual. Subscribe to the mailing lists. Get the RSS feed. Know what they are doing.

Scan through the manual frequently so you know what is going on. When you see that something is being phased out, like the HTTP_*_VARS, don’t keep developing with it!! Change immediately and start modifying your old code. That day will creep up when it’s gone and you won’t know when. Your site will go down even though you didn’t change anything. You’ll be on the phone with tech support blaming them. I’ve seen it. Worse yet, while you are on the phone with tech support, your former clients will be calling telling you their site is down.

If something is labelled as experimental in the PHP manual, be ready for changes. If you use it, don’t expect it to always work as it is now. You can even help the PHP team by submitting feedback if you use experimental features.

The Reward

The bonus reward for staying up to date is you are unlikely to be surprised. Not ever of course, things happen beyond our control. We can’t know everything, and we certainly can’t know the future. By knowing what’s coming as it is released by the PHP team you can at least get a head start, minimize the impact, and minimize the site outages, which could otherwise mean lost revenue.

If you get the opportunity to be your own SysAdmin, try it, at least on your development machines. PHP isn’t that hard to maintain, and you can see performance boosts by compiling only what you need. It is a big job to maintain a machine though, so if you aren’t up for it then make sure you know you’re hosting company well. Don’t just pick one. Peruse the online comments but don’t assume you are always getting a fair shake because hosting companies offer outrageous affiliate fees making people work hard to get referrals. Talk to friends, use tools at your disposal and make sure that the company you deal with only hires professionals in their field. If you made a bad mistake choosing a company don’t be afraid to move. You shouldn’t reward them for a bad job.

Taking these steps will make sure you are ready to go at each turn in the road. Neither rain, nor sleet, now snow can take down your site!

Now, are you ready for a slashdotting???

Bad PHP Code – Blogs and CMS

Lately people have been blaming PHP for being insecure. While most of them seem to blame little tiny issues that, when used incorrectly by the programmer, make their scripts insecure. Does that make sense? The programmer uses it insecurely, but the problem is in the language. Absurd. Guess what, you can do that with any language. You can make your house insecure too by failing to fasten the door locks. But more to the point, after examining some of the Blog and CMS software for PHP I’ve come to this conclusion – if that kind of programming is general across the board for these packages then we have a bigger problem.

Blame the Messenger

The first thing we need to know is that we are not all knowing.

Often times we get people learning programming from PHP because it is so easy. That’s what I did back in the days of PHP 2. PERL was just a little too hard for me, and in 1995 I had a real hard time scouring information on the web to figure out what CGI was. But I went from getting my feet wet in PHP to learning C to going to college, then going into full-time work. I was lucky.

When I think back to my early days of PHP I can see how we can be dangerous. We typically focus on getting something done, not really how to do it or the best way or even the most secure way. I was so proud when I built the unthinkable using PHP the first time.

The problem when we are that young in a skill is that we think, “OK, I’ve learned this skill, let’s move on.” What we are forgetting is that we know a language but we don’t know how to use it. We’ve figured out where to put the pieces to form sentences – that is, how to write code to not get errors. The problem is, we still don’t know how to have a conversation.

What I’m getting at here is that sometimes we forget that we need to consider why we write code. We need to understand security. We need to know how the web works. We need to know how a browser renders HTML or CSS or XML or whatever you are working with. We need to know the consequences of using a database system, the overhead caused by connects. We need to know how to bug track, profile and optimize for best efficiency. There is much more work to do once you your code “just works.”

Extending the Problem

That is what happens when we are young. Then we explore, then we work, then we forget. That’s me 🙂

As we move along and take on big projects we need to keep on moving upward. From what I’ve experienced, it is common for people to get interested in one area and focus on that. We know programmers who don’t know a thing about hardware. They don’t know what a stack or register is even. On the other side, we have the hardware pros who think they can program because they know the basics of a language. They don’t know what a stack or register is even 🙂

CMS and Blogs

How does this apply to Blog software? I really don’t know. I can’t tell what those individuals and development teams were thinking when they spec’ed out their project. IF they did that. I don’t know what they were thinking when they decided to put PURE PHP code inside template files made for HTML coders, graphics designers and common citizens. Seeing “DON’T EDIT THIS CODE” inside a template file is not acceptable in a project. I especially don’t know what they were thinking when they decided to make “index.php” a 10,000 line piece of code that did everything except the admin.

Obviously there are several CMS and Blog packages for PHP out there that are less than ideal. Some are very big and popular. I’ve seen absolutely useless database connects and queries inside a page that slow it down to a crawl. Major packages with 15+ queries on a blank template page that only returns navigation. You don’t want to install this kind of stuff on a shared host, and if you even get the kind of traffic your blog deserves you’ll be looking for a new package at a really inopportune time….and then figuring out how to migrate your data.

The End

I’m hoping soon we see the end of this kind of thing. The trouble is that PHP is such an easy thing to learn that we won’t. What we need is a crop of fantastic programmers to step up and build something we can be proud of. I have a feeling that won’t happen though. Most of those programmers are employed customizing the blog software that already exists out there for people who found the limitations 🙂

Why Dynamic Page Caching in PHP is a Good Idea

Just yesterday I talked at length about the caching benefits of the Smarty Template Engine for PHP. Today I was able to see an example of why caching was such a good idea.

In that article I used an example of an e-commerce store that had a page with three ‘Top 10’ lists. The 3 variations were for 3 different time periods, and the ‘all time best sellers’ list was a killer SQL query that had to look through 6 years worth of sales records to determine the top sellers. I used Smarty’s version of template and page caching for the script which caches the output in a PHP file. I then chose to regenerate the cached file once per day resulting in only one database connection and query PER DAY instead of one query every time someone viewed the page.
Just yesterday I talked at length about the caching benefits of the Smarty Template Engine for PHP. Today I was able to see another example of why caching was such a good idea.

In that article I used an example of an e-commerce store that had a page with three ‘Top 10’ lists. The 3 variations were for 3 different time periods, and the ‘all time best sellers’ list was a killer SQL query that had to look through 6 years worth of sales records to determine the top sellers. I used Smarty’s version of template and page caching for the script which caches the output in a PHP file. I then chose to regenerate the cached file once per day resulting in only one database connection and query PER DAY instead of one query every time someone viewed the page.

As we discussed, this large query resulted in an execution time around 4 seconds, which was intolerable from my point of view. With the implementation of Smarty, that dropped the execution time down to around 0.4 seconds with nearly no work on behalf of the hardware. Win-win.

However, today I got a first hand example of how NOT using Smarty, or any caching for that matter, is a bad idea.

Changing Programmers

When I left that company, I had a brief meeting with the incoming programmer who really wanted to delve into Smarty for some time. He was quite eager and wanted to do some work. As it turned out, the site was functioning so well on it’s own that the management put him to work on a new project. It seems they wanted to separate the store from the company site so they would have two sites.

This was a good idea conceptually, but for whatever reason the programmer decided to NOT use Smarty on the new store, even though the programming was already in place. They did leave Smarty in tact on the static pages corporate web site. Smarty on the ‘static pages’ was originally put in so the data punchers in the office could maintain the mundane details of the HTML files (templates) without having to see (and break) th PHP code that runs everything. Smarty does a great job of separating code from display.

At the time I left they moved the whole she-bang off the shared servers onto a dedicated machine and I peeked in to see that their site was one of the fastest functioning sites I’ve ever seen, and even Alexa rated it as ‘Fast’ – which is rarely seen in Alexa ranks.

Then it all went downhill…

Somewhere along the way that decision bit them. I peeked into their new store today to look at the “Top 10” page. To marvel at my earlier work, and to revel at how fast Smarty caching is.

When I got there I remembered, “Oh ya, new store, new site.” I clicked the link to the best sellers page and I waited. And waited. And after 40 seconds I got bored and did something else. When I came back the page had loaded.

40+ seconds to load a page???? I thought 4 seconds was intolerable!! What could have possibly gone wrong.

I need it yesterday

Don’t we all. I hated that phrase. I can hear the management saying that right now. “I know you just came to me with the idea today, and I didn’t even know it was possible before that, but it is such a good idea that I NEED IT YESTERDAY. I can’t live without it now!” Geez. I find that rather impulsive.

That’s sometimes the way it goes, and that could be what happened here. All we know for sure is that the management wanted to move the site away from the corporate identity, and with the move came some new features to try and increase revenue. The programmer probably didn’t get his chance to work with Smarty and chose to go without it and use what he was comfortable with. In the process, completely wasting the money they’d spent on the project to that point.

Despite a move to new hardware, you have to remember there has probably been an additional two years worth of sales records to sort through in that query, and more products in the store. Couple that together with new product distribution methods, new lines, and then likely a rewrite of that ‘all time best sellers’ query by someone who may not have been comfortable with SQL and you get a 40+ seconds page execution time. Something which makes the pages virtually inaccessible and strains the load on the hardware during peak times.

The moral

Sometimes it is important for us not to become emotional when faced with things we don’t understand – like defending a programming methodology. Managers need to know that it takes time to sort out the facts – nothing you do today can transcend space-time, i.e. “I need it yesterday.” Not everyone is at the same level of understanding. Information systems can frequently be complex and even web designers need to consider they may be building software that could be in use for some time. Plan for the future.

The bottom line is, get acquainted with a good tool like Smarty now, while you can. Don’t wait until you are faced with a hardware problem, an expensive query, working with a new graphic designer or management that is in a hurry before you decide to look for those tools. Get them in your library, get the skills under your belt and start to implement them right away in the projects you are doing today.

Trust me. You’ll appreciate that caching template system the first time your system gets slashdotted 🙂