Saturday, July 9, 2016

A hat beside the “PHP7 vs HHVM” ring

I spent all my free time, all week, exploring the difference in performance between PHP 5.6.22, PHP 7.0.8, and HHVM 3.14.1.  (HHVM 3.14.2 came out later in the week, but I didn’t want to redo everything.)

In the end, it turned out to be kind of useless as a “real world” test.  The server doesn’t have L3 cache, and the client is a virtual machine.  I also didn’t run a large number of trials, nor tune the server stacks for optimum performance. In fact, the tunings I did try had, at best, no effect.

tl;dr conclusions:

  • HHVM proxygen is almost completely amazing.  If you have the option, it is probably what you want. It just crashes hard when concurrency exceeds its open file limit.
  • nginx+FastCGI hits a resource limit somewhere, and starts failing requests at higher concurrency, around 160.  (See the test sketch after this list.)
  • apache+FastCGI does the same… but the limit is higher, between 256 and 384. The price for the extra headroom is that it serves only 86% as many requests per second.
  • Providing more FastCGI workers makes the errors start sooner, but ramp up more slowly.
  • I’m really disappointed in nginx.  I expected more.  A lot more.
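
For context, the runs were concurrency sweeps along these lines (a sketch, not the exact harness; the target URL, request count, and concurrency steps here are illustrative):

# Sweep concurrency and watch for the knee where failures begin.
for c in 1 16 32 64 128 160 256 384; do
    ab -n 10000 -c $c -k http://server.example/ 2>/dev/null \
        | grep -E 'Requests per second|Failed requests'
done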

Monday, July 4, 2016

Weird Bug: sudden massive I/O

I was doing a performance test of my own, PHP 5.6 vs HHVM 3.14 vs PHP 7.0, when it went off the rails.

I ran some preliminary tests on localhost, but since ab was chewing up 16% CPU, I decided to move the client onto the other side of a gigabit network. Protip: VirtualBox NAT performance is not so great.  A bridged connection works much better.

Anyway, I got the HHVM and PHP 5 tests done, then moved on to PHP 7, when it started returning awful performance numbers.  Like single-digit requests/sec at concurrency 1 during the warmup phase, while the host machine was dumping large amounts of data to disk.  iotop pointed the finger at mysql, to the tune of 400 KB/sec.  (Spinning disk, so I assume random accesses.)

I was able to verify that PHP 5 hadn’t regressed, and that PHP 7 had also been afflicted on localhost, but I haven’t made much progress beyond that. Tweaking innodb_flush_log_at_trx_commit has improved performance, but I’m still seeing median response times of 22 ms per request, instead of the expected 5 ms each.  (The anonymous page cache and MySQL query cache should be hit for every part of these requests.)
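
The tweak in question, for reference.  A my.cnf sketch: value 2 trades a little crash durability for far fewer disk flushes per commit.

# my.cnf (sketch): flush the InnoDB log to disk about once per second,
# instead of at every transaction commit.
[mysqld]
innodb_flush_log_at_trx_commit = 2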

At this point, I’ve spent all day installing Drupal, building PHP 7 and getting opcache enabled (it’s a zend_extension, it doesn’t compile statically, and the default “PHP” layout for the configure script places php.ini in lib, not etc), and running the tests, so I’m going to quit for a while.
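
For anyone repeating this, a sketch of the relevant bits (the prefix and paths are illustrative):

# Point the config-file path at etc/ explicitly; the default layout uses lib/.
./configure --prefix=/opt/php7 --with-config-file-path=/opt/php7/etc
make && make install

# php.ini: opcache is a zend_extension, so it must be loaded explicitly.
zend_extension=opcache.so
opcache.enable=1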

But I’m still so confused.  PHP 5 hasn’t regressed, and it uses the same URL. None of the servers (hhvm, php5-fpm, php7-fpm) have write permissions to the code or settings on disk.  The only change to nginx is flipping which socket FastCGI requests are forwarded to.

Added 2016-07-09: clearing the Drupal cache fixed this. I still don't even.
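
For the record, assuming a Drupal 7 site with drush available, that’s a one-liner:

drush cc all    # short for: drush cache-clear all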

Saturday, July 2, 2016

Nonlocality

Three years ago, I was porting a Perl CGI-based system to FastCGI, one URL at a time, using mod_rewrite to do the dispatching.  (If the handler.pm exists where the FCGI dispatcher will look for it, invoke this request via FastCGI.)  A consequence of this is that the core library code needs to run on both paradigms, since both front-ends use it.  That runs straight into problems when the CGI access-checking functions blithely print and call exit when access is denied.
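
The dispatch side looked something like this (a hypothetical sketch; the real paths and patterns were different):

# If a ported handler module exists, hand the request to the FCGI
# dispatcher; otherwise, fall through to the legacy CGI script.
RewriteCond %{DOCUMENT_ROOT}/lib/Handler/$1.pm -f
RewriteRule ^/app/(\w+) /dispatch.fcgi [PT,L]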

Instead of updating 60+ scripts to read, e.g., $session = get_session() or Site::CGI->go_login; I decided to get hackier.


Wednesday, June 29, 2016

rumprun php7 experiment results

I forked rumprun-packages to build its php7 package with more features by default.  The original says it’s an “unmodified PHP 7.0.3” build, but it uses --disable-all to build with a minimal number of extensions by default, and that’s not really great for showing just what could work on rumprun.

I made a “mega” build which contains everything I could possibly include, even packaging up libjpeg-turbo and libpng so that this PHP can use gd.
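
The difference comes down to configure flags, roughly like so (illustrative flags and paths, not the exact contents of the fork):

# Upstream: a minimal build.
./configure --disable-all ...

# The “mega” build: turn individual extensions back on, e.g.
./configure --enable-mbstring --enable-json --with-gd \
    --with-jpeg-dir=/usr/local/libjpeg-turbo --with-png-dir=/usr/local ...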

I have been unable to get the memcached extension to work, though.  libmemcached-1.0 is written in C++ and depends on some exception-handling symbols.  PHP links with $(CC), which doesn’t pull in the C++ runtime, so those symbols go unresolved at link time.

I’d probably be more disappointed in this, except that I don’t know if rumprun PHP is all that useful.  Options are the single-threaded CLI HTTP server (not intended for production), or ye olde php-cgi SAPI… which is also not multithreaded: it expects to run as single-threaded child processes under a FastCGI process manager.  (php-fpm simply integrates the process manager, so that you don’t need to find and integrate a separate one; it’s also multi-process.)

And rumprun doesn’t have processes.  It’s a unikernel.

Tuesday, June 28, 2016

Scaling

How it feels when we try to procure stuff. Maybe we’re just a personal-sized business?

This is only getting worse, as each AWS hardware generation raises the machine size floor.

Friday, June 17, 2016

My Blogger Pipeline

For the past couple of years, I’ve been uncomfortable with the notion of Google watching my every thought as I type into a draft on the Blogger site. (Apparently ever since a post whose working title was “VPNs are hard.”)

As a coder, I can solve anything with the code-hammer, so I wrote 7 KB of PHP. Now, I can write locally in Markdown, then run markup-menu.php.  This is a stupidly simple program that takes a unique filename prefix, finds the matching *.md file, and passes it to the much larger markup.php.

That program takes a file, preprocesses the Markdown a bit for compatibility with Blogger’s “preserve line breaks” setting, and post-processes the resulting HTML to iron out a few more quirks.  The result can then be copypasta’d into Blogger.

Once there, it gets Blogger dressing like tags, then I click Preview.  If nothing’s wrong, it gets posted, as if it sprang fully formed from my head into Blogger.

Tuesday, June 14, 2016

Modern AJAX in jQuery

In the dark ages, I wrote my own wrappers for jQuery.getJSON because it had the function signature:

$.getJSON(url [, data] [, success])

And I wanted to provide an error handler.  So, our corporate library (represented by LIB) has a function like:

LIB.getJSON(url, data, success [, error])

I also didn’t know if I could rely on getting a response body from error events, so where possible, errors are returned as 200 OK with a body like {Error:true,ErrorMsg:"boom"}, and LIB.getJSON catches these and invokes the error callback instead of the success one.

(One more suboptimal design choice: like, almost the whole point of LIB.getJSON is to pass an error handler and let the user know “okay, we are not loading anymore.”  But notice that the error handler is still considered optional for some reason.)
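
Put together, the wrapper amounts to something like this (a sketch based on the description above, not the actual corporate code):

// Illustrative reconstruction of LIB.getJSON, not the real library.
LIB.getJSON = function (url, data, success, error) {
    error = error || function () {};   // still optional, as noted above
    return $.getJSON(url, data, function (resp, txt, xhr) {
        if (resp && resp.Error) {
            // A "successful" HTTP response carrying an error envelope.
            error(xhr, "app-error", resp.ErrorMsg);
        } else {
            success(resp, txt, xhr);
        }
    }).fail(function (xhr, txt, err) {
        error(xhr, txt, err);          // transport-level failure
    });
};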

If I were designing this from scratch today, the service would return errors as HTTP error responses, and I’d use the “new” (added in 1.5, after I started jQuery’ing) Deferred methods.

function success (data, txt, xhr) {
    // ...
}
function error (xhr, txt, err) {
    var r;
    try {
        r = $.parseJSON(xhr.responseText);
    } catch (e) {}
    // ...
}
$.getJSON(url, data)
    .then(success, error);

Result: a more RESTful design, with less glue.

I’d probably still have an “error adapter generator” function which converts all the possibilities for xhr/txt/err down to a single message string, and pass newErrorHandler(showErrorUI) as the error callback into .then().  But the point is, there’s no need for LIB.getJSON as a whole anymore, existing just to accept an error callback and to filter ‘successfully returned error’ values.
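
That adapter might look like this (a sketch; only newErrorHandler and showErrorUI come from the paragraph above, the rest is illustrative):

// Collapse jQuery's (xhr, txt, err) failure arguments into one message
// string, then hand it to a UI callback.
function newErrorHandler (showErrorUI) {
    return function (xhr, txt, err) {
        var msg, body;
        try {
            body = $.parseJSON(xhr.responseText);
        } catch (e) {}
        if (body && body.ErrorMsg) {
            msg = body.ErrorMsg;            // server-provided message
        } else if (err) {
            msg = xhr.status + " " + err;   // e.g. "500 Internal Server Error"
        } else {
            msg = txt || "request failed";  // "timeout", "abort", "parsererror"
        }
        showErrorUI(msg);
    };
}

$.getJSON(url, data)
    .then(success, newErrorHandler(showErrorUI));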