Monday, December 26, 2011

PHP: retrospective

Once in a while, someone on Reddit asks why everyone there hates PHP.  I never reply, because there's too much to list in a comment, but maybe I can write a definitive post here.

Most recently updated: 7 Feb 2014, for new features in PHP 5.6.0a1.


I've put the points in numbered lists for your convenience in referring to them individually, not because they're actually ordered.  I have also limited this list to things that have annoyed me in particular, so you won't find a simple list of commonly derided PHP warts here.  I knew about those early enough to dodge them.

Things that have made the language more clumsy to use:
  1. E_NOTICE: the language defines a default value (null) for missing array keys, object properties, and variable names, then complains at you if you rely on this fact.  Adding injury to insult, some versions of the engine fetch and format the entire notice message before passing it down to the layer that checks error_reporting to see if it needs to do anything.
  2. E_ALL: that is not actually "all error levels".  5.0.0 added E_STRICT without including it in E_ALL, and that insanity lasted until 5.4.0 folded E_STRICT back in.
  3. Write context: isset(some_func()) yields "Can't use function return value in write context" because, to suppress the E_NOTICE if it isn't set, isset generates opcodes to fetch the value for writing, so it has to be an lvalue.  This is also why it's a special operator: so it can use the write context to incidentally find out whether the value was already set. (That means every variable write gets to check and set that flag, in case you were calling isset().)  [Similar problems happened with empty() prior to 5.5.0, which accepts arbitrary expressions; isset() still does not.]
  4. Callbacks: the callback pseudo-type accepts any of: a string (name of a function); an array of 2 strings (static call to class and method); an array of 1 object and 1 string (method call on the object instance); or an actual closure in PHP 5.3+.  The first and last can be called using $callback("arg", "list"); but the OOP calls cannot [prior to 5.4.0].  You have to pass them through call_user_func instead.
  5. There's call_user_func_array because there was no *args style operator (until "..." was added in PHP 5.6).
  6. No keyword arguments: they have to be simulated with associative arrays.  (It doesn't make sense to use the Builder pattern, because it's not statically checked anyway.)
  7. Language operators disguised as functions: isset, list, and so on.  I think you'd put func_get_args() here too even though some limitations have been lifted in recent PHP versions.
  8. Iterators: many, many functions expect and only operate on arrays.  You can't pass anything that implements Traversable to http_build_query.  It fails on my machine (5.3.2-1ubuntu4.11 in Lucid) with "Corrupt member variable name", though in the past I do believe it would warn, then blithely do nothing.
  9. array_key_exists: since isset() returns false if a value is defined-but-null, there's an extra function for finding if an array key itself exists, which doesn't work on objects implementing ArrayAccess because that only handles the special language operators.  [And it takes the key argument first, unlike any other array operation ever.]
  10. Magic quotes: the first thing anyone does is write an "input normalizer" that disenchants everything if magic_quotes is on.  Even then, PHP isn't aware of character encoding, so addslashes() (that magic_quotes uses) once needed a fix because, in some character encodings, it caused the very SQL injection it was meant to block.
  11. Since short_open_tag and asp_tags are optional, environment-independent code always [prior to 5.4.0, which accepts <?= regardless] needs <?php echo ?> wrapped around variables, if it wants to pretend PHP is a "template" language.
  12. htmlspecialchars: the name is just too long, and the function is full of options.  me.inc.php therefore has function HE($s) { return htmlspecialchars($s, ENT_QUOTES, 'UTF-8'); }.  And, it's not the right thing for encoding URI components.  This is a Web language that provides no help in producing HTML.
  13. You barely have support for HTTP.  There's no shortcut for specifying "Let this cache for five minutes" nor "This expires on $DATE" nor offering a file download nor setting charset without having to specify the rest of the content type header.
    1. Your SAPI might require header("HTTP/1.0 410 Gone") even when you're handling an HTTP/1.1 request.  [Fixed in 5.4.0 since response codes may be set in a SAPI-independent manner with http_response_code(410).]
    2. There's no way to handle Expect: 100-continue since mod_php doesn't have a way to invoke code between receiving the headers and receiving the request-body.
    3. Likewise, the core team had to build upload progress monitoring themselves before you could do it portably.
  14. The date function defines meaning for individual alphabetical letters, so a format like "at 13:14 on Dec 12, '11" has to be specified with '\a\t' and '\o\n'.  In single quotes, so that "\t" and "\n" aren't tab and newline.
  15. Class constants: can't be named dynamically.
  16. Namespaces.
    1. Using backslash instead of ::, or if that was really so impossible, :::.
    2. The random restrictions.  You can only import namespaces and classes in 5.3-5.5, not functions nor variables.  (PHP 5.6 adds import of functions and constants by re-using old keywords in conjunction with use.)
    3. The resolution rules.  Inside a namespace, you can't just throw new Exception anymore, unless you also wrote use Exception; (or spelled it \Exception).
    4. Constant names cannot be defined dynamically.  You can define('MyNS\Foo') but you can only get it back using constant(), because those constants are an entirely separate thing.
  17. PDO.
    1. It caused a lot of segfaults in 5.1.x.
    2. It doesn't have any conveniences like DBI's selectcol_arrayref or selectall_hashref($sql, 'id').
    3. Using certain versions of PHP+libmysql, prepares could fail and return null, causing a fatal error when trying to call execute on it, and maybe segfaulting.
    4. mysqli's bound parameter interface is even worse.  [Also, mysqli_result wasn't Traversable prior to 5.4.0.]
  18. Tokenizer extension: gives you either array(T_TOKEN_TYPE, $literal_text) or a 1-character literal text string.
  19. There's no parser/AST extension, opcode dumper, or debugger bundled with the language.  Though we do have xdebug, at least.
  20. The PCRE extension names its functions beginning preg_.
  21. I'm always getting passthru/fpassthru/readfile, and shell_exec/exec/system mixed up.  Usually, I need readfile or proc_open respectively, and the latter is still a pain to work with.  (Especially if you want to launch something async, without invoking the shell, and also know if it actually died instead of launching OK.)
  22. [1 Mar 2012] PHP still doesn't have: unsigned ints, 64-bit ints of any sign on 32-bit platforms, or handling of 64-bit ints in pack() on any platform except via "I" (machine-dependent size and byte order).
  23. [7 Feb 2014, but should have been ages ago] finally wasn't available for exception handling until PHP 5.5.
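
A few of the points above are easier to see than to describe.  Here is a minimal sketch of items 4, 9, and 14, assuming PHP 5.3 or later; the Greeter class, the shout() function, and the $row data are made up purely for illustration.

```php
<?php
// Pin the timezone so the date() demo below is deterministic.
date_default_timezone_set('UTC');

// Item 9: isset() is false for a key that exists but holds null,
// so array_key_exists() is needed to tell "missing" from "null".
$row = array('id' => 7, 'deleted_at' => null);
$issetSays = isset($row['deleted_at']);             // false
$keyExists = array_key_exists('deleted_at', $row);  // true

// Item 4: the four shapes the callback pseudo-type accepts.
// Greeter and shout() are hypothetical names for this sketch.
class Greeter {
    public static function hello($name) { return "hello $name"; }
    public function bye($name) { return "bye $name"; }
}
function shout($name) { return strtoupper($name); }

$callbacks = array(
    'shout',                                       // function name
    array('Greeter', 'hello'),                     // static class method
    array(new Greeter(), 'bye'),                   // method on an instance
    function ($name) { return "closure $name"; },  // closure (5.3+)
);

// call_user_func_array handles every shape on every version,
// which is why it ends up being the lowest common denominator.
$results = array();
foreach ($callbacks as $cb) {
    $results[] = call_user_func_array($cb, array('bob'));
}

// Item 14: every letter that date() assigns a meaning to must be
// escaped, in single quotes so \t and \n stay literal backslash pairs.
$stamp = mktime(13, 14, 0, 12, 12, 2011);
$formatted = date('\a\t H:i \o\n M j, \'y', $stamp);
```

With that, $results holds one string per callback shape and $formatted reads at 13:14 on Dec 12, '11.  On 5.4.0 and later the loop body could be $cb('bob') instead, but call_user_func_array is what works everywhere.
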
Philosophical issues:
  1. Case-insensitivity: for function, class, and method names.  But not for variables, properties, or constants.  The Turkish problem was baked right in.  [PHP 5.5 hardcodes ASCII case rules and fixes the Turkish problem, though.]  For some versions of the engine, case wasn't preserved when reporting errors or returning class names from get_class().  Once it was, everyone's PHP4-compatible-instanceof checks needed to be peppered with strtolower everywhere.
  2. Constants: define() runs at runtime, and takes its first argument as a string.  PHP steals Perl's bareword semantic of "equals the constant name if undefined", which seems to be related to why you can get "unexpected token T_STRING" out of foo("data " CONSTANT . " blah");.  And why strings are T_CONSTANT_ENCAPSED_STRING.
  3. The "impossible" ability to parse $klass::foo(), also used as a justification against allowing "::" as the namespace separator, and which forced everyone to use call_user_func(array($klass, 'foo')), was implemented in PHP 5.3.
  4. addslashes and stripslashes: they're backslashes, not slashes.
  5. Parentheses: optional on 0-argument constructors, and nowhere else.
  6. Dynamic class names to the new operator: from new $klass(...) to new parent;.
  7. Constructor return values: I'm ashamed to admit I wrote something that had a use for $foo = parent::__construct().  (Of course, in the standard use of $f = new Foo(); the return value is discarded, because new returns the constructed instance.)
  8. Functions that do things functions can't do: compact and extract.  (Yes, these can be called through call_user_func.)
  9. Variables that do things variables can't do: superglobals.
  10. Until PHP 5.1.3, true/false/null were looked up at runtime using the standard constant mechanism.  Though they are case-insensitive in spite of this.
  11. It's almost shared-nothing, except for the file upload handlers and the session extension, which usually comes configured out-of-the-box to produce the longest ID possible.
  12. [29 May 2012] is_callable() returning true doesn't mean you can definitely use that variable in the variable-function syntax, due to practical issue #4.
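
Again, a short sketch may help, this time for philosophical items 1, 2, and 8.  The names makeLabel(), DEMO_LIMIT, and demoScope() are invented for the example; the behavior shown is what I'd expect on any PHP 5 or later.

```php
<?php
// Item 1: function names are case-insensitive, variable names are not.
function makeLabel($s) { return "[$s]"; }
$viaOddCase = MAKELABEL('x');  // resolves to makeLabel()
$value = 1;
$Value = 2;                    // a different variable from $value

// Item 2: define() runs at runtime and takes the name as a string,
// so the name can be computed and then read back via constant().
$suffix = 'LIMIT';
define('DEMO_' . $suffix, 10);
$limit = constant('DEMO_LIMIT');

// Item 8: compact() and extract() read and write the caller's local
// scope, which an ordinary user-defined function cannot do.
function demoScope() {
    $host = 'localhost';
    $port = 5432;
    $packed = compact('host', 'port');  // reads locals by name
    extract(array('port' => 6543));     // overwrites the local $port
    return array($packed, $port);
}
list($packed, $port) = demoScope();
```

After this runs, $packed still carries the original port 5432 while $port is 6543, because extract() quietly replaced the local after compact() had already copied it.
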

In other news, the things that PHP does exceptionally well:
  1. SimpleXML.
  2. The procedural PCRE interface.  It's not Perl or Ruby, but at least it's not Python.
  3. There is no three.
I think we're done here.


Loved this?  Hated it?  Blog about it and drop the link to @sapphirepaw_org on Twitter.