Wednesday, January 30, 2013

Minimal, Working Perl FastCGI Example, version 2

This is an update to a previous post.  File layout remains the same: "site" is a placeholder for the actual site name, and /home/site/web is the actual repository of the project.  Static files then appear under public, and Perl modules specific to the site in lib/Site (i.e. visible in Perl as Site::Modname when lib is put in @INC).  I am still using mod_fcgid as the FastCGI process manager.

The major improvement: This version handles FCGI-only scripts which have no corresponding CGI URL.  I discovered that limitation of the previous version when I tried to write some new code: Apache or mod_fcgid noticed that the CGI version didn't exist and returned a 404 instead of passing the request through the wrapper.  As a consequence of solving that problem, FcgidWrapper is no longer necessary, which gives the dispatch.fcgi code a much cleaner environment to work in.

Everything I liked about the previous version is preserved here: I can create Site/Entry/login.pm to transparently handle /login.pl as FastCGI, without requiring every other URL to be available in FastCGI form.  It also stacks properly with earlier RewriteRules that turn pretty URLs into ones ending in ".pl".

Apache configuration:
# Values set via SetEnv are passed with each request;
# to affect Perl startup, use FcgidInitialEnv instead
FcgidInitialEnv PERL5LIB /home/site/web/lib
RewriteCond /home/site/web/lib/Site/Entry/$1.pm -f
RewriteRule ^/+(.+)\.pl$ /home/site/web/dispatch.fcgi [L,QSA,H=fcgid-script,E=SITE_HANDLER:$1]
<Directory /home/site/web/fcgi>
    Options ExecCGI FollowSymLinks
    # ...
</Directory>
Again, the regular expression of the RewriteRule is matched before RewriteCond is evaluated, so the backreference $1 is available to test whether the file exists.  This time, I also use the environment flag of the RewriteRule to pass the handler to the dispatch.fcgi script.  Since I paid to capture it and strip the leading slashes and extension already, I may as well use it.
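
To make the order of evaluation concrete, here is a rough Perl model of what the RewriteCond/RewriteRule pair decides.  It is only a sketch: fcgi_handler_for is a hypothetical helper name, not part of the site's code, and real mod_rewrite does considerably more than this.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the RewriteCond/RewriteRule pair above.
# Returns the captured handler name, or undef when the request should
# fall through to Apache's normal handling.
sub fcgi_handler_for {
    my ($url_path, $lib_root) = @_;

    # The RewriteRule pattern is matched first, which captures $1...
    my ($handler) = $url_path =~ m{^/+(.+)\.pl$} or return;

    # ...and only then does the RewriteCond test that the module file exists.
    my $file = "$lib_root/Site/Entry/$handler.pm";
    return unless -f $file;

    # This is the value that E=SITE_HANDLER:$1 would hand to dispatch.fcgi.
    return $handler;
}

my $handler = fcgi_handler_for('/login.pl', '/home/site/web/lib');
print defined $handler ? "dispatch to $handler\n" : "fall through\n";
```

Requests that do not end in ".pl", or that have no matching module under lib/Site/Entry, come back undefined here, which corresponds to the rule not firing at all.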

That means the new dispatch.fcgi script doesn't have to do as much cleanup to produce the module name:
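#!/home/site/bin/perl
use warnings;
use strict;
use CGI::Fast;
use FindBin qw($Bin);
use Site::Response;
use Site::Preloader ();
while (my $q = CGI::Fast->new) {
    # SITE_HANDLER is set by the E= flag of the RewriteRule
    my $base = defined $ENV{SITE_HANDLER} ? $ENV{SITE_HANDLER} : '';
    $base =~ s#/+#::#g;      # path separators become package separators
    $base =~ s#[^\w:]##g;    # keep only word characters and colons
    $base ||= 'index';
    my $mod = "Site::Entry::$base";
    my $r = Site::Response->new($base, "$Bin/templates");
    eval {
        # die on a failed require so the outer eval keeps the error in $@
        eval "require $mod; 1" or die $@;
        $mod->invoke($q, $r);
        1;
    } or warn "$mod => $@";
    $r->send($q);
}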
I remembered to include the $r->send call this time.  I pass the CGI query object so the response can call $q->header.  That's not strictly necessary: FCGI children process one request at a time, and $q is copied to the default CGI object, so calling header alone should work fine.  I just didn't know that yet.

I also strip everything except word characters and colons from the inbound request for security, since my site uses URLs like /path/somereport.pl.  You may need to carefully adjust that for your site.
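
To see what that cleanup does to a few inputs, the same three substitutions can be pulled out into a function.  This is a sketch for experimenting, and handler_to_module is a hypothetical name used only here.

```perl
use strict;
use warnings;

# Same steps as the dispatch loop, wrapped in a function
# (handler_to_module is a hypothetical name, not part of the site's code).
sub handler_to_module {
    my ($base) = @_;
    $base = '' unless defined $base;
    $base =~ s#/+#::#g;      # path separators become package separators
    $base =~ s#[^\w:]##g;    # drop everything but word characters and colons
    $base ||= 'index';       # an empty request falls back to the index page
    return "Site::Entry::$base";
}

print handler_to_module($_), "\n"
    for 'login', 'path/somereport', 'my-report', '../../etc/passwd', '';
```

Note that the filter removes characters rather than rejecting the request, so "my-report" quietly becomes Site::Entry::myreport, and a traversal attempt collapses into a run of colons that names no real module.  Whether that is acceptable depends on the URLs your site actually uses.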

Site::Response starts out as a generic error, so if the module dies, the client still receives a complete error page.  Otherwise, the module selects the template and sets its data, so the send call ships the completed page instead.
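
The "start as an error, get overwritten on success" idea can be sketched in a few lines.  Site::Response itself isn't shown in this post, so the Sketch::Response class below, its fields, and its page and render methods are all assumptions made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for Site::Response.  The real class is not shown in this post,
# so every field and method name here is an assumption.
package Sketch::Response;

sub new {
    my ($class, $name, $template_dir) = @_;
    # Start out as a complete generic error page.
    return bless {
        name         => $name,
        template_dir => $template_dir,
        status       => '500 Internal Server Error',
        body         => "Something went wrong.\n",
    }, $class;
}

# A handler calls this only once it has a real page to show.
sub page {
    my ($self, $body) = @_;
    @$self{qw(status body)} = ('200 OK', $body);
    return $self;
}

# Returns the response text instead of printing it, to keep the sketch simple.
sub render {
    my ($self) = @_;
    return "Status: $self->{status}\n\n$self->{body}";
}

package main;

# One handler that succeeds and one that dies partway through.
for my $handler (sub { $_[0]->page("Welcome\n") }, sub { die "boom\n" }) {
    my $r = Sketch::Response->new('login', 'templates');
    eval { $handler->($r); 1 } or warn "handler failed: $@";
    print $r->render;    # always a complete response, error or not
}
```

The point of the design is that the dispatch loop never has to build an error page itself: whatever state the response object is in when the handler stops is already safe to send.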

The only thing left that I'd like to do is make this configuration more portable between web servers instead of dependent on Apache's mod_rewrite and mod_fcgid, but since Apache isn't killing us at work, it probably won't happen very soon.
