MySQL drivers for PDO in PHP 5.3+ on Ubuntu

I needed to get PHP to talk to MySQL through PDO on an AWS Ubuntu instance.

Here was my initial error:

PHP Fatal error: Uncaught exception 'PDOException' with message 'could not find driver'

This means that PDO didn't have the driver it needed: in this case, the one for MySQL.

What version of PHP was I running?

$ php -v

PHP 5.3.2-1ubuntu4.9 with Suhosin-Patch (cli) (built: May 3 2011 00:45:52)

I tried to install PDO, just in case:

$ sudo pecl install pdo

[Some stuff excluded for brevity]
make: *** [pdo_dbh.lo] Error 1
ERROR: `make' failed

The build fails, but it doesn't matter: PDO itself is already baked into PHP 5.3 (it has shipped with PHP since 5.1), so we don't have to install PDO at all. OK, so let's install the pdo_mysql driver the old-fashioned way.

$ sudo pecl install pdo_mysql

[Some stuff excluded for brevity]
checking for PDO includes... checking for PDO includes...
configure: error: Cannot find php_pdo_driver.h.
ERROR: `/tmp/pear/temp/PDO_MYSQL/configure' failed

No love, and searching for the error didn't tell me what I needed to know: the pdo_mysql driver is included in the php5-mysql package on Ubuntu. That is not obvious from the package name.

$ sudo apt-get install php5-mysql

I also restarted Apache, even though it looked like apt had already done it for me. After that, no more errors.
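A quick way to confirm the fix from PHP itself is to ask PDO which drivers it has registered. This is a minimal sketch; the helper name has_pdo_driver is my own invention for illustration, while PDO::getAvailableDrivers() is the stock PDO call.

```php
<?php
// Hypothetical helper: report whether PDO is present and has the named driver.
function has_pdo_driver($name) {
    if (!class_exists('PDO')) {
        return false; // PDO itself is not compiled in or loaded
    }
    return in_array($name, PDO::getAvailableDrivers(), true);
}

echo has_pdo_driver('mysql')
    ? "pdo_mysql is available\n"
    : "pdo_mysql is missing\n";
```

If this prints "missing" from the command line but the web server works (or the other way around), the two are probably reading different php.ini files.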

A small fix to syntax highlighting of PHP comments in emacs

If you're like me, you use emacs to edit a lot of PHP, and if you're like me, you sometimes run across code like this:


/* This is a comment
$foo = 'this is some commented out code'; # Bla bla bla
*/

That's not so bad, really, but the php-mode that I had been using got confused and ended the comment at the newline after the hash mark. Argh. It took a little searching and a bit of trial and error, but eventually, I found some documentation of syntax tables within emacs. From what I can tell, emacs has a little lexer-like routine that goes character by character during syntax highlighting. The built-in functionality seems like it might have some shortcuts over what might be hidden inside your favorite compiler suite, but whatever.
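For reference, PHP's own tokenizer has no trouble with that snippet. The sketch below assumes the tokenizer extension is enabled (it usually is) and simply counts the comment tokens that token_get_all() finds in the example; the variable names are only for illustration.

```php
<?php
// The example from above as a string, so it can be fed to the tokenizer.
$source = "<?php\n/* This is a comment\n"
        . "\$foo = 'this is some commented out code'; # Bla bla bla\n"
        . "*/\n";

$comments = array();
foreach (token_get_all($source) as $token) {
    // Tokens are either single characters or array(id, text, line).
    if (is_array($token) && $token[0] === T_COMMENT) {
        $comments[] = $token[1];
    }
}

// The hash mark does not end the block comment: there is one comment token.
echo count($comments), " comment token(s)\n";
```

So the goal for the editor is the same: the whole block, hash mark included, should be highlighted as one comment.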

Let's start with what I added inside of php-mode.el. I put it right after the code described by the comment ";; Specify that cc-mode recognize Javadoc comment style", but I have a feeling it could live just about anywhere within that code block.


;; This works for me in GNU Emacs 21.4.1 and 22.2.1 on CentOS and Ubuntu, respectively.

(progn
  (modify-syntax-entry ?/ ". 124b" php-mode-syntax-table)
  (modify-syntax-entry ?* ". 23" php-mode-syntax-table)
  (modify-syntax-entry ?# "< b" php-mode-syntax-table)
  (modify-syntax-entry ?\n "> b" php-mode-syntax-table))

Okay, what does that mean?

(modify-syntax-entry ?/ ". 124b" php-mode-syntax-table)

modify-syntax-entry is a command that takes two or three arguments, as follows:

  1. ?/: This is the character whose lexing you are modifying

  2. ". 124b": This is how you are defining that character.

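    • . The first character of the descriptor is the syntax class. The dot means that the character, a slash, is treated as punctuation

    • " " The second character names a matching character, which only matters for paired delimiters such as parentheses. The single space means "no matching character". It is a placeholder, not a whitespace marker: the flags are only read starting at the third character, which is why things did not work unless I had the space in there.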

    • 1 The slash character can be the first character in a pair of opening comment characters: // or /*

    • 2 It can also be the second character in a pair of opening comment characters: //

    • 4 It can be the second character in a pair of closing comment characters: */

    • b This signifies that when the slash is used as the second character in a comment character pair, it is part of the "b" class of comments, so it must be closed by a "b" closer. We have defined \n as a "b" closer (the only one?), so there you go. // gets closed by \n. You can write this stuff as regular expressions and functions, but it won't be fun. There might be a time when you want another set of comment openers and closers, or any arbitrary number, really, but as far as I can tell, in these versions of emacs you only get "a" and "b".

  3. php-mode-syntax-table : This argument is optional, but presumably you are only doing this for the table with which you are currently working.

?* ". 23" The star gets used as the second character in a comment opening pair /*, and the first character in a comment closing pair */.

?# "< b" This sets the hash mark's syntax class to comment starter (<), with no matching character (the space again), and makes it part of the b-team. Roughly the same thing happens to the newline, which becomes a comment ender (>) for the b-team.

That's about it.

PHP, easy as APC

I moved my web services from Mediatemple to a box in a rack somewhere in Atlanta. I'm leasing the 1U from Chris Kelly, who is leasing several other Us, as well as power and bandwidth. My nerdier friends demand to know the specs (I ordered the box in parts from Newegg with "reasonable" as my only goal), but I'm not focused on such things -- merely being glad to be out of the Mediatemple grid server ghetto.

Okay, fine, it's an AMD 64X2 5400+ with 4G of RAM and a terabyte of storage. Another 4G of RAM is on its way. Mmm, RAM.

That is neither here nor there, since it's the personal (if physically distant) control that I wanted. Case in point: PHP opcode caching, in the form of APC. (I tried XCache, but things went haywire in ways that might not have been XCache's fault. When you are in a community like Gallery's, it's often useful to stick with what other people are doing rather than reinventing the wheel.)

I'd like to mention that XCache has an "isset" function and APC does not. Since a failed APC fetch also returns FALSE, this means that if you want to store a FALSE (presumably the result of a complicated but memoizable computation) you have to wrap it in something else. You probably have to wrap everything then, but that's something that can be worked out.
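Here is a minimal sketch of that wrapping idea. To keep it runnable without APC installed, it uses a hypothetical array-backed stand-in (ArrayCache) whose fetch() returns FALSE on a miss, the way apc_fetch does; cache_put and cache_get are also names I made up for illustration.

```php
<?php
// Hypothetical stand-in for the user cache: like apc_fetch(), fetch()
// returns false on a miss, so a stored false would be ambiguous.
class ArrayCache {
    private $data = array();

    public function store($key, $value) {
        $this->data[$key] = $value;
    }

    public function fetch($key) {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }
}

// Wrap every value in a one-element array before storing it.
function cache_put($cache, $key, $value) {
    $cache->store($key, array('v' => $value));
}

// Unwrap on the way out; $hit says whether the key was really there.
function cache_get($cache, $key, &$hit) {
    $box = $cache->fetch($key);
    $hit = is_array($box) && array_key_exists('v', $box);
    return $hit ? $box['v'] : null;
}

$cache = new ArrayCache();
cache_put($cache, 'expensive_result', false);
$value = cache_get($cache, 'expensive_result', $hit);
var_dump($hit, $value); // bool(true), bool(false): a cached FALSE, not a miss
```

The cost is one small array per entry, which seems a fair price for being able to tell "cached FALSE" from "not cached".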

So, Zach wanted to see benchmarks of APC versus not-APC. Enter our old friend ab.
I ran some tests from a neighboring box over gigabit ethernet because I wanted to get a real maximum requests number, including a bare minimum network overhead.

This information was common to each run:


This is ApacheBench, Version 2.0.40-dev Revision: 1.146 apache-2.0
Server Software: Apache/2.2.8
Server Hostname: gallery2.jpmullan.com
Server Port: 80
Concurrency Level: 1
Complete requests: 100
Failed requests: 0
Write errors: 0

APC On

$ ab -n 100 http://gallery2.jpmullan.com/

Time taken for tests: 13.41635 seconds
Total transferred: 694500 bytes
HTML transferred: 652400 bytes
Requests per second: 7.67 [#/sec] (mean)
Time per request: 130.416 [ms] (mean)
Time per request: 130.416 [ms] (mean, across all concurrent requests)
Transfer rate: 51.99 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 127 129 3.6 130 161
Waiting: 127 129 3.6 130 161
Total: 127 129 3.6 130 161

APC Off

$ ab -n 100 http://gallery2.jpmullan.com/

Time taken for tests: 33.743932 seconds
Total transferred: 694500 bytes
HTML transferred: 652400 bytes
Requests per second: 2.96 [#/sec] (mean)
Time per request: 337.439 [ms] (mean)
Time per request: 337.439 [ms] (mean, across all concurrent requests)
Transfer rate: 20.09 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 323 336 11.1 336 410
Waiting: 322 336 11.1 336 410
Total: 323 336 11.1 336 410

However, the front page only says so much. Let's view a large album inside of my gallery (incidentally full of pictures from my recent vacation in San Francisco).

APC On

$ ab -n 100 http://gallery2.jpmullan.com/v/scans/2008/04/20080420/

Time taken for tests: 26.385255 seconds
Total transferred: 1542900 bytes
HTML transferred: 1503000 bytes
Requests per second: 3.79 [#/sec] (mean)
Time per request: 263.853 [ms] (mean)
Time per request: 263.853 [ms] (mean, across all concurrent requests)
Transfer rate: 57.08 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 259 263 10.6 261 355
Waiting: 256 260 10.4 258 350
Total: 259 263 10.6 261 355

APC Off

$ ab -n 100 http://gallery2.jpmullan.com/v/scans/2008/04/20080420/

Time taken for tests: 52.541065 seconds
Total transferred: 1542900 bytes
HTML transferred: 1503000 bytes
Requests per second: 1.90 [#/sec] (mean)
Time per request: 525.411 [ms] (mean)
Time per request: 525.411 [ms] (mean, across all concurrent requests)
Transfer rate: 28.66 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 1.9 0 19
Processing: 517 524 5.1 525 548
Waiting: 504 510 4.7 510 533
Total: 517 524 6.3 525 567

Let's summarize: in both pairs of test runs, APC let my server return data roughly 2 to 2.6 times as fast. That seems like reason enough to keep it.

APC State    Requests per second    Speedup (On vs. Off)

Front page
On           7.67                   259%
Off          2.96

Large album
On           3.79                   199%
Off          1.90
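The speedup figures are just the ratio of the two requests-per-second numbers, expressed as a percentage. A small sketch of the arithmetic (speedup_percent is a name I made up for illustration):

```php
<?php
// Speedup as a percentage: requests/sec with APC divided by requests/sec without.
function speedup_percent($rpsOn, $rpsOff) {
    return (int) round($rpsOn / $rpsOff * 100);
}

echo speedup_percent(7.67, 2.96), "%\n"; // front page:  259%
echo speedup_percent(3.79, 1.90), "%\n"; // large album: 199%
```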

The whole point of loose typing

I don't know John Lim, but he linked to a Livejournal post where someone complained about their expectations being dashed when trying various comparisons.


<?php

$a = 0;
$b = "eggs";
$c = "spam";

print ($a == $b) ? "a == b\n" : "a != b\n";
print ($b == $c) ? "b == c\n" : "b != c\n";
print ($a == $c) ? "a == c\n" : "a != c\n";
print ($a == $d) ? "a == d\n" : "a != d\n";
print ($b == $d) ? "b == d\n" : "b != d\n";
print ($c == $d) ? "c == d\n" : "c != d\n";

?>

% php equality.php
a == b
b != c
a == c
a == d
b != d
c != d

They were pretty confused by the results, which do look odd at first glance. What is happening is that PHP does some type juggling in the background.

First, let's examine what happens when we cast various common values as booleans. There are two type specifiers because I specify each item as the type and the value, and then cast that.


/* Plain old true and false values */
(boolean) (boolean) true === (boolean) true
(boolean) (boolean) false === (boolean) false

/* A NULL, which is a special value */
(boolean) (NULL) NULL === (boolean) false

/* An empty string */
(boolean) (string) '' === (boolean) false

/* The numbers zero and one */
(boolean) (integer) 0 === (boolean) false
(boolean) (integer) 1 === (boolean) true

/* Other non-zero numbers (this behavior is just like C) */
(boolean) (integer) -1 === (boolean) true
(boolean) (integer) 100 === (boolean) true

/* The number zero in string form: the string zero */
(boolean) (string) '0' === (boolean) false

/* An empty array */
(boolean) (array) array() === (boolean) false

/* Non-empty arrays */
(boolean) (array) array(0) === (boolean) true
(boolean) (array) array(1) === (boolean) true
(boolean) (array) array(0, 1) === (boolean) true

/* A non-empty string */
(boolean) (string) 'in the hay' === (boolean) true

/* A tricky non-empty string */
(boolean) (string) 'false' === (boolean) true

/* Really tricky non-empty strings
* I might actually take issue with these,
* because they are really sneaky. Note that
* the strings are NOT converted into integers
* first (both would then be 0, and so false):
* only '' and the exact string '0' are false.
*/
(boolean) (string) '0 is awesome' === (boolean) true
(boolean) (string) ' 0' === (boolean) true

/* The string one is true, like every other string except '' and '0' */
(boolean) (string) '1' === (boolean) true

/* Fortunately, it is not completely ridiculous */
(boolean) (string) 'awesome is 0' === (boolean) true
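For strings, the whole list boils down to one rule: a string is false only if it is empty or is exactly '0'. Here is a small sketch that checks the rule against PHP's actual casts; string_is_truthy is a hypothetical helper name, not a built-in.

```php
<?php
// Predict (bool) $s for a string: only '' and the exact string '0' are false.
function string_is_truthy($s) {
    return $s !== '' && $s !== '0';
}

$samples = array('', '0', ' 0', '0 is awesome', '1', 'false', 'awesome is 0');
foreach ($samples as $s) {
    // Print the real cast next to the prediction; the two columns should agree.
    printf(
        "%-16s cast: %-5s rule: %s\n",
        var_export($s, true),
        var_export((bool) $s, true),
        var_export(string_is_truthy($s), true)
    );
}
```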

So what is going on in that "crazy" example? One of the things to point out is that non-empty non-numeric strings are being compared to the number zero. When a number meets a string, this version of PHP juggles the string into a number, and a string with no leading digits juggles to zero, so 0 == "eggs" is true. (PHP 8 changed this rule, so newer versions print a != b and a != c.) Two strings are simply compared as strings. The unset $d is a NULL, which compares equal to zero and behaves like an empty string when compared to a string.


<?php

$a = 0;
$b = "eggs";
$c = "spam";

/*
(int) $b is 0 (because "eggs" has no leading digits)
(int) $c is 0 (same reason)
(int) $d is 0 (because it is not set)
(string) $d is '' (because it is not set)
*/

print ($a == $b) ? "(int) a == (int) b\n" : "(int) a != (int) b\n";
print ($b == $c) ? "(string) b == (string) c\n" : "(string) b != (string) c\n";
print ($a == $c) ? "(int) a == (int) c\n" : "(int) a != (int) c\n";
print ($a == $d) ? "(int) a == (int) d\n" : "(int) a != (int) d\n";
print ($b == $d) ? "(string) b == (string) d\n" : "(string) b != (string) d\n";
print ($c == $d) ? "(string) c == (string) d\n" : "(string) c != (string) d\n";

?>

% php equality.php
(int) a == (int) b
(string) b != (string) c
(int) a == (int) c
(int) a == (int) d
(string) b != (string) d
(string) c != (string) d

I wouldn't mind a more strongly typed language sometimes, and I might have made different design decisions, but they were made for a reason.
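If the juggling gets in your way, the identity operator === compares type and value and never converts anything. A short sketch contrasting the two operators; the variable names are only for illustration, and the one version-dependent result is flagged in its comment.

```php
<?php
$zero = 0;
$eggs = "eggs";

// Loose comparison juggles types.
var_dump($zero == $eggs);    // true before PHP 8, false from PHP 8 on
var_dump("1" == "01");       // true: two numeric strings are compared as numbers
var_dump(null == false);     // true: NULL juggles to false

// Strict comparison does not juggle, on any PHP version.
var_dump($zero === $eggs);   // false: an integer is never identical to a string
var_dump("1" === "01");      // false: different strings
var_dump(null === false);    // false: different types
```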

Another site on the internet lamented the fact that strpos() returns a false on failure and a 0 if the string match happens at the zero index, complaining that the required code is not very clear.


if (false !== strpos('haystack', 'needle')) {
    echo 'Found it!';
}

Their comment was that this (the C syntax) would be oh-so-much-better:


if (-1 < strpos('haystack', 'needle')) {
    echo 'Found it!';
}

Wow. That's so much clearer! I suppose that the meat of their problem was that one has to use the !== operator instead of the != operator to check for false. Yes, strpos is kind of a hassle that way, but if it really bothers you, it is trivially easy to write a wrapper function.


/* Note that I am Elliot-Smithing the argument order */
function needle_in_the_hay($needle, $haystack) {
    return (false !== strpos($haystack, $needle));
}
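To see why the strict check matters, here is a minimal sketch of the pitfall and of a wrapper in use. The name contains_substring is hypothetical, and it keeps strpos's own argument order.

```php
<?php
// Hypothetical wrapper: true if $needle occurs anywhere in $haystack.
function contains_substring($haystack, $needle) {
    return false !== strpos($haystack, $needle);
}

$text = 'needle in the hay';

// The match is at index 0, and 0 is falsy, so the naive test misses it.
var_dump((bool) strpos($text, 'needle'));        // bool(false)

// The wrapper tells "found at 0" apart from "not found".
var_dump(contains_substring($text, 'needle'));   // bool(true)
var_dump(contains_substring($text, 'pitchfork')); // bool(false)
```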

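The real problem here is that the name looks like it came from C, but standard C has no strpos: it has strchr for finding a single character and strstr for finding a whole string, and both return a pointer (or NULL) rather than an index. PHP's strpos finds whole strings in other strings and returns an index or false. Better to pick a clearer name for that purpose. Even so, ultimately it's just an excuse to whine about PHP. It's not a perfect language, but it is what it is.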

You could really make the complaint that there should be a String class with methods and stuff like that. I would accept that complaint... and then ask why you aren't writing that class. You know you want to!

Oh, the PHP manual has a nice chart of comparisons.

http://us.php.net/manual/en/types.comparisons.php

Detecting Cycles in Tree Structures with PHP

My Gallery recently had some serious database issues. I had to write some code outside of the Gallery proper to look for database errors. When I went to look for cycles in the tree, I didn't find any handy algorithms out in the wild of the internet, so I wrote something.

That produces this output:

Found cycle in 1:2,2:3,3:1
Found cycle in 1:2,2:3,4:1,3:4
Found cycle in 1:2,2:1

I hope that helps whoever might be searching for this very algorithm. Please note that this does not detect duplicate children or missing parents.
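For anyone who wants a starting point, here is a minimal sketch of one way to do such a check. It is not the original script: it assumes the tree is held as an array mapping each child id to its parent id (my reading of the pairs in the output above), and has_cycle is a name chosen for illustration. It walks up from every node and reports a cycle if the walk revisits a node.

```php
<?php
/**
 * Return true if following parent links from any node loops back on itself.
 * $parentOf maps child id => parent id; a root has no entry (or a null parent).
 */
function has_cycle(array $parentOf) {
    $safe = array(); // nodes already known to lead to a root
    foreach (array_keys($parentOf) as $start) {
        $path = array(); // nodes visited on this walk
        $node = $start;
        while (isset($parentOf[$node]) && !isset($safe[$node])) {
            if (isset($path[$node])) {
                return true; // came back to a node on the current walk
            }
            $path[$node] = true;
            $node = $parentOf[$node];
        }
        // The walk ended at a root or a safe node, so everything on it is safe.
        foreach (array_keys($path) as $visited) {
            $safe[$visited] = true;
        }
    }
    return false;
}

var_dump(has_cycle(array(1 => 2, 2 => 3, 3 => 1))); // bool(true)
var_dump(has_cycle(array(2 => 1, 3 => 1, 4 => 2))); // bool(false)
```

Remembering the safe nodes means each node is walked once, so the check stays linear in the number of rows. Like the original, it says nothing about duplicate children or missing parents.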
