<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-3922219755971684412</id><updated>2012-01-26T21:00:50.966-05:00</updated><category term='templates'/><category term='mail'/><category term='value'/><category term='dom'/><category term='ipsec'/><category term='admin'/><category term='s3'/><category term='explanation'/><category term='ec2'/><category term='fp'/><category term='flexibility'/><category term='apple'/><category term='magic'/><category term='perl'/><category term='autostart'/><category term='hash'/><category term='lucid'/><category term='technique'/><category term='wacom'/><category term='http'/><category term='ambiguity'/><category term='gnome'/><category term='pointers'/><category term='character sets'/><category term='shell'/><category term='python'/><category term='unicode'/><category term='MANPATH'/><category term='vim'/><category term='aws'/><category term='prediction'/><category term='rant'/><category term='science'/><category term='customization'/><category term='future'/><category term='man'/><category term='recovery'/><category term='xml'/><category term='gdm'/><category term='dvorak'/><category term='rpc'/><category term='mysql'/><category term='wallpaper'/><category term='schedule'/><category term='php'/><category term='security'/><category term='programming'/><category term='scope'/><category term='cloudfront'/><category term='music'/><category term='oop'/><category term='lisp'/><category term='war story'/><category term='cross-platform'/><category term='pdf'/><category term='tip'/><category term='style'/><category term='c'/><category term='rest'/><category term='meta'/><category term='tcp'/><category 
term='pagelib'/><category term='ui'/><category term='economics'/><category term='dh'/><category term='dns'/><category term='first mover'/><category term='gconf'/><category term='languages'/><category term='xdg'/><category term='history'/><category term='dsl'/><category term='microsoft'/><category term='tradeoffs'/><category term='design'/><category term='net neutrality'/><category term='chariot'/><category term='references'/><category term='ubuntu'/><category term='markets'/><category term='data'/><category term='json'/><category term='compiler'/><title type='text'>Decoded Node</title><subtitle type='html'>Taking the ‘under’ out of ‘underdocumented’</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>50</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-6614481099704942788</id><published>2012-01-26T21:00:00.001-05:00</published><updated>2012-01-26T21:00:50.985-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='data'/><category scheme='http://www.blogger.com/atom/ns#' 
term='chariot'/><category scheme='http://www.blogger.com/atom/ns#' term='ambiguity'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='php'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><category scheme='http://www.blogger.com/atom/ns#' term='languages'/><title type='text'>Plain Old Data</title><content type='html'>I’m coming to the conclusion that there’s actually no such thing as “plain data;” it always has &lt;i&gt;some &lt;/i&gt;metadata attached.&amp;nbsp; If it doesn’t, it might be displayed incorrectly, and then a human needs to intervene to determine the &lt;i&gt;correct &lt;/i&gt;metadata to apply to fix the problem.&amp;nbsp; (Example: &lt;tt&gt;View → Character Encoding&lt;/tt&gt; in Firefox.)&amp;nbsp; Pushed to the extreme, even “just numbers” have metadata: they can be encoded as text, a binary integer/float (IEEE 754 or otherwise) of some size/endianness, or an ASN.1 encoding.&lt;br /&gt;&lt;br /&gt;Another conclusion I’m reaching is that HTTP conflates all kinds of metadata.&amp;nbsp; Coupled with the lack of self-contained metadata in file formats and filesystems, things start to accumulate hacks.&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;In my ramblings about &lt;a href="http://sapphirepaw.blogspot.com/2011/11/rest-and-rpc-not-actually-antonyms.html"&gt;REST and RPC&lt;/a&gt;, I mentioned that the &lt;i&gt;combination &lt;/i&gt;of ETag/Last-Modified (a point-in-time) with a URL (an identity) provides a specific &lt;i&gt;state &lt;/i&gt;of that identity.&amp;nbsp; If an HTTP client wants to save that state to disk, how does it proceed?&amp;nbsp; Typically, it saves the entity-body and &lt;b&gt;discards&lt;/b&gt; the headers.&amp;nbsp; Yet at least &lt;i&gt;some&lt;/i&gt; of those headers specified important data 
like the ETag, Cache-Control, Content-Location, and optionally for text/* types, character encoding.&amp;nbsp; (Other headers are specific to the transfer/message itself, such as Connection and the curiously named TE.&amp;nbsp; These are concerned with &lt;a href="http://sapphirepaw.blogspot.com/2012/01/layer-7-routing-http-ate-internet.html"&gt;layer 6&lt;/a&gt; rather than layer 7.)&lt;br /&gt;&lt;br /&gt;HTML, being in use world wide on the &lt;acronym title="World Wide Web"&gt;WWW&lt;/acronym&gt;, rapidly hit problems with  locale, so it has long had the capability to use &lt;code&gt;&amp;lt;meta http-equiv&amp;gt;&lt;/code&gt; tags for specifying the charset.&amp;nbsp; (Amusingly, HTML 4 seems to define it as &lt;a href="http://www.w3.org/TR/html4/struct/global.html#adef-http-equiv"&gt;something the HTTP server is supposed to parse&lt;/a&gt;, and add to the actual response headers.&amp;nbsp; If you need it after the transfer is done, it’s clearly not the transfer’s metadata, but the content’s.)&amp;nbsp; A lot of formats have gone Unicode only, like Java source code, or had Unicode hacked into them at a later date to solve the depressingly familiar issues of character set.&lt;br /&gt;&lt;br /&gt;The funny thing about specifying the character set in HTML is that you have to be able to &lt;a href="http://www.w3.org/International/questions/qa-html-encoding-declarations"&gt;interpret the characters in the first place&lt;/a&gt; in order to find the meta tag.&amp;nbsp; Thus, more hacking has ensued: if you have an ASCII-compatible encoding, your only requirement is to put the meta tag in the first 1K of the document; if it’s Unicode, you can start it with a BOM.&amp;nbsp; Otherwise, it must be specified through the HTTP header, or else it’s up to the user-agent to guess/use latin-1/do something else.&lt;br /&gt;&lt;br /&gt;Meanwhile on Unix, everything is “just” a stream of bytes, including &lt;a href="http://linux.die.net/man/2/ioctl"&gt;things&lt;/a&gt; that &lt;a 
href="http://rute.2038bug.com/node21.html.gz"&gt;aren’t&lt;/a&gt;, and things that are &lt;a href="http://en.wikipedia.org/wiki/Berkeley_sockets"&gt;more than that&lt;/a&gt;.&amp;nbsp; On one hand, this made Unix programs much more adaptable to handling multiple, and new, encodings; on the other, it was a massive effort to make every program handle new encodings, since there was never much support for it (especially for variable-length encodings) baked into C, and for compatibility, it had to be &lt;a href="http://en.wikipedia.org/wiki/Locale#POSIX-type_platforms"&gt;opt-in&lt;/a&gt;.&amp;nbsp; Also, when working on a remote system, the whole stack had to &lt;a href="http://www.linuxquestions.org/questions/linux-server-73/sendenv-and-acceptenv-in-ssh-session-683639/"&gt;agree&lt;/a&gt; on a character encoding.&lt;br /&gt;&lt;br /&gt;By pushing encoding issues up to the programs, Unix keeps knowledge of character encoding out of the kernel.&amp;nbsp; Yet it doesn’t provide much to allow applications to keep the encoding of text—or any other metadata, really—attached to the bytes.&amp;nbsp; Nobody relies on extended attributes (or file forks, or alternate streams), because they might be turned off, or a file might be transferred to a filesystem that doesn’t support them, such as the infamous FAT family.&amp;nbsp; To my knowledge, there’s no “metadata channel” attached to IPC, either.&amp;nbsp; In-band communication is the &lt;i&gt;only &lt;/i&gt;reliable form.&lt;br /&gt;&lt;br /&gt;This might be fairly sensible, though.&amp;nbsp; Some formats can have multiple kinds of data that they can store.&amp;nbsp; In particular, certain formats of ID3 tags come out on my pocket MP3 player as sequences of &lt;a href="http://hanzismatter.blogspot.com/"&gt;Hanzi&lt;/a&gt; and squares.&amp;nbsp; It seems that the encoding of the tags has changed between versions, and the poor old thing gets horribly confused when it doesn’t have its matching revision of tags.&lt;br /&gt;&lt;br 
/&gt;Still, most of the time when I’m dealing with “data” in a programming language, I’m keeping track of both the data and the format.&amp;nbsp; Python still has some warts, like the encoding of sys.stdout goes missing when stdout is a pipe instead of a terminal, but it still seems to be the right approach, and is thus doomed to failure at the hands of &lt;a href="http://dev.mysql.com/doc/refman/5.5/en/constraint-invalid-data.html"&gt;dirtier&lt;/a&gt; alternatives.&amp;nbsp; If you try to double-encode in Perl or PHP, that string is going to get double encoded, and you’ll have junk like Â€œ everywhere.&amp;nbsp; Python will throw an exception (“bytes object has no encode method” in 3.x, and “could not decode character” &lt;i&gt;when a non-ASCII byte was present in the string&lt;/i&gt; in 2.x).&amp;nbsp; On the flip side, this makes dealing with damaged data trickier, because this “don’t compound common problems” attitude &lt;a href="http://stackoverflow.com/a/1177542"&gt;protects you from a simple double-decoding&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The other interesting thing about PHP not understanding character sets is that addslashes/&lt;a href="http://php.net/manual/en/function.stripslashes.php"&gt;stripslashes&lt;/a&gt; can mangle &lt;a href="https://bugs.php.net/bug.php?id=15683"&gt;Big5&lt;/a&gt; and Shift-JIS text.&amp;nbsp; At one point, someone realized they could make magic_quotes &lt;i&gt;cause &lt;/i&gt;a SQL injection, by using a double-byte character ending in 0x27.&amp;nbsp; Adding the backslash changes the character, and moves the 0x27 out to become a real single-quote.&lt;br /&gt;&lt;br /&gt;On the other hand, since PHP is so completely unaware of encodings, when you print a string and it comes out weird, you need hardly more than those bytes to find your problem.&amp;nbsp; Perl’s backwards-compatible idea that the world is single-byte unless specified means that when you try to print Unicode data without configuring the proper IO encoding on a 
handle, it tries to convert to its legacy encoding and print that.&amp;nbsp; If code points are out of range, then it prints &lt;i&gt;all of them in UTF-8,&lt;/i&gt; with the lovely “wide character in print” warning.&amp;nbsp; So it may appear to work, in spite of being completely broken, if you happened to choose data that won’t break it and have warnings off.&lt;br /&gt;&lt;br /&gt;Perl’s functions work the same way; if you try to pack a UTF-8 string (with the utf8 flag on) into hex codes, hoping to see what the bytes are before they get to the filehandle, it doesn’t work—they go through the same encoding process.&amp;nbsp; To have a fully Unicode pipeline in Perl or Python, you need to set up the appropriate encodings for &lt;i&gt;every&lt;/i&gt; segment of the process &lt;i&gt;before&lt;/i&gt; you can actually see the data that you have, so you can understand whether it is correct in the first place.&lt;br /&gt;&lt;br /&gt;&amp;lt;/tangent&amp;gt;&lt;br /&gt;&lt;br /&gt;The point is, there’s no such thing as “just text,” or “just data.”&amp;nbsp; MP3 may contain different text encodings in ID3 tags, the same is true of &lt;a href="http://owl.phy.queensu.ca/%7Ephil/exiftool/faq.html#Q10"&gt;JPEG/EXIF comments&lt;/a&gt;, and MIME’s whole purpose in life is to stitch together randomly encoded things, including any of the above, into a parse-able whole.&amp;nbsp; “Text” based, of course.&lt;br /&gt;&lt;br /&gt;Looking at HTTP once more, there’s a whole stack of data to deal with: data about the message itself, where a &lt;tt&gt;204 No Content&lt;/tt&gt; status implies that there will be no response body; about management of the ephemeral connection, with Connection and TE; about the response body when present, as in Content-Type and Last-Modified; and about the server, the common examples being Server and X-Powered-By.&amp;nbsp; Some modern standards groups are rushing to add security and privacy as well, through the DNT (do not track) header and the various “do 
stuff across origins” specifications.&lt;br /&gt;&lt;br /&gt;And in the end, the response body might carry something that has its own metadata, like a MIME message with its parts, or HTML document with HEAD tags.&amp;nbsp; Formats which embraced “In-band communication is the &lt;i&gt;only &lt;/i&gt;reliable form”.&lt;br /&gt;&lt;br /&gt;Formats which don’t necessarily think all data has a self-evident format.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Irrelevant tl;dr: stalk me via &lt;a href="https://twitter.com/sapphirepaw_org"&gt;twitter&lt;/a&gt;, or ye olde &lt;a href="http://sapphirepaw.blogspot.com/feeds/posts/default"&gt;atom feed&lt;/a&gt;.&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-6614481099704942788?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/6614481099704942788/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=6614481099704942788&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/6614481099704942788'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/6614481099704942788'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2012/01/plain-old-data.html' title='Plain Old Data'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-3038647252206880870</id><published>2012-01-25T13:02:00.002-05:00</published><updated>2012-01-25T13:08:41.569-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>What If: Weak Memory Pages</title><content type='html'>Raymond Chen &lt;a href="http://blogs.msdn.com/b/oldnewthing/archive/2012/01/18/10257834.aspx"&gt;wrote about&lt;/a&gt; the "what if everybody did this?" problem of applications written to consume up to some threshold of memory and free some of it under pressure: if multiple applications have different thresholds that they're trying to maintain, then the one with the smallest-free threshold wins. &amp;nbsp;Of course, the extreme of this is a normal application that doesn't try to do anything fancy, which acts like it has a negative-infinity threshold. &amp;nbsp;If it never adjusts its allocations in response to free memory, then it always wins.&lt;br /&gt;&lt;br /&gt;Some of the solutions batted around in the comment thread involve using mmap() or other tricks to try to get the OS to manage the cache, but this brings up its own problems.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;The obvious problem with mmap is that, under pressure, the OS will still treat the mapped data as precious, and write it back to the file. &amp;nbsp;This is almost indistinguishable from swapping normal memory, but since we're using mmap(), this is a separate file from the actual swap partition. 
&amp;nbsp;That means it may live on top of all sorts of other layers: the filesystem itself, a volume manager, software RAID, a transparently encrypted and/or compressed block device, and/or a block device pointing to network-attached storage. &amp;nbsp;The final one is also a roundabout way of having a networked file system, while hiding the traditional signs of NFS from applications. &amp;nbsp;In any case, the network may be invoked by accident. &amp;nbsp;On traditional hard disks, separating the application cache's swap space from the actual swap partition may cost more in disk seeks when swapping actually happens. &amp;nbsp;On a solid-state drive, the user may not actually want the cache swapped to it at all, to prolong the drive's life.&lt;br /&gt;&lt;br /&gt;So that's a lot of points against mmap, which are all consequences of its swapping, since the application doesn't have&amp;nbsp;a way to tell the OS that "this data isn't that valuable—you can discard dirty pages." &amp;nbsp;What would happen if we added a system call for that, and extended the malloc family to request it?&lt;br /&gt;&lt;br /&gt;Well, the first question is, "How do you know when the OS discarded the page?" &amp;nbsp;If you just try to read the memory, you'll most likely generate a segfault, which you'll somehow have to recognize as coming from a weak page... and then do something about it because the machine can't just keep executing. &amp;nbsp;That's why it generated the fault in the first place! &amp;nbsp;Java can get away with an exception, because the language intrinsically provides them, and the necessary stack unwinding control. &amp;nbsp;C, on the other hand, tries to avoid having a stack visible in the language.&lt;br /&gt;&lt;br /&gt;Alternatively, if reading a discarded page always returns 0, then you can't store data there that's legitimately zero. &amp;nbsp;Otherwise, you can't distinguish a discarded page from your legitimate data. 
&amp;nbsp;Even a struct: if you read individual members into the CPU, then those individual members cannot be zero, or you're opening yourself to weird time-of-check/time-of-use scenarios where the struct validated, but you got invalid data back by the time you tried to read your "real" target.&lt;br /&gt;&lt;br /&gt;Not to mention, I'm unsure exactly how that zero read would be implemented--either "mapped-but-zero" has to be supported in the hardware's memory management, or the OS's fault handler has to go disassemble the instruction from the user IP and write a 0 into the appropriate place. &amp;nbsp;Trying to do anything else, like deliver a signal, seems to run into the exact same signal-handling problems: how does the application set itself up to abort when the signal handler returns? &amp;nbsp;Does it have to add checks before every write using data that was read from a weak page? &amp;nbsp;How does anyone keep &lt;i&gt;track &lt;/i&gt;of all that?&lt;br /&gt;&lt;br /&gt;This also adds concerns to the system's memory manager. &amp;nbsp;It has a new type of page to handle everywhere, along with a discarding policy. &amp;nbsp;If pages can be discarded out of the middle of the discardable space, then discard-fragmentation can happen. &amp;nbsp;Depending on how the application re-allocates weak space, the system may never be able to fill it again.&lt;br /&gt;&lt;br /&gt;I feel like I'm writing&amp;nbsp;&lt;a href="http://steve-yegge.blogspot.com/2009/04/have-you-ever-legalized-marijuana.html"&gt;Have You Ever Legalized Marijuana?&lt;/a&gt;&amp;nbsp;but shorter and focused on only one example.&lt;br /&gt;&lt;br /&gt;If all of the above are solved, there's still a pathological case where an application discovers it lost some weak pages, re-populates them (as weak pages), and then tries to use them—only to discover they're gone already because of continuing memory pressure. 
&amp;nbsp;This same problem occurs with ordinary weak data structures in Java, when memory is low. &amp;nbsp;The application needs to know what data is precious enough to keep in ordinary memory, and what can be moved to a weak area. &amp;nbsp;Moving data is easier in Java, because there aren't unrestricted pointers, so moving an object need not invoke copy constructors or invalidate any pointers.&lt;br /&gt;&lt;br /&gt;All this, to drag down the entire performance of every application on the system, in order to provide support for a few apps to not-swap? &amp;nbsp;It's probably not worth the effort.&lt;br /&gt;&lt;br /&gt;I briefly considered "what if the OS could send low-memory notifications to applications?" but that still divides apps into two categories: ones that react, and ones that ignore the information. &amp;nbsp;The latter ones will then win all the "excess" memory from the ones who play nicely.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-3038647252206880870?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/3038647252206880870/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=3038647252206880870&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/3038647252206880870'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/3038647252206880870'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2012/01/what-if-weak-memory-pages.html' title='What If: Weak Memory 
Pages'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-1641026755087855785</id><published>2012-01-18T10:12:00.000-05:00</published><updated>2012-01-19T09:43:37.583-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tip'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><category scheme='http://www.blogger.com/atom/ns#' term='character sets'/><title type='text'>Perl and Unicode in Brief</title><content type='html'>&lt;ol&gt;&lt;li&gt;If you want to handle Unicode and avoid &lt;a href="http://perldoc.perl.org/perlunicode.html#The-%22Unicode-Bug%22"&gt;The Unicode Bug&lt;/a&gt;, in which your strings &lt;i&gt;sometimes&lt;/i&gt;&amp;nbsp;act like they aren't actually Unicode: in perl 5.12+, &lt;code&gt;use feature 'unicode_strings';&lt;/code&gt;. &amp;nbsp;For older perl, see&amp;nbsp;&lt;a href="http://search.cpan.org/~juerd/Unicode-Semantics-1.02/lib/Unicode/Semantics.pm"&gt;Unicode::Semantics&lt;/a&gt;, or use &lt;a href="http://search.cpan.org/~flora/perl-5.14.2/lib/utf8.pm#Utility_functions"&gt;utf8::upgrade&lt;/a&gt; by hand.&lt;/li&gt;&lt;li&gt;If you want strings in your source text with non-ASCII: save it as a utf-8 encoded file and &lt;code&gt;use utf8;&lt;/code&gt;. &amp;nbsp;Or you can encode Unicode code points with hex-escapes, &lt;code&gt;\xae&lt;/code&gt; → ®, or &lt;code&gt;\x{30ab}&lt;/code&gt; → カ. &amp;nbsp;There are technically other options, which have additional drawbacks (utf-16 breaks the #! 
line; latin-1 is restricted to latin-1 unless you decode it yourself.)&lt;/li&gt;&lt;li&gt;If you want to print to a UTF-8 aware environment like your terminal emulator or CGI STDOUT after issuing a &lt;code&gt;Content-Type: text/html; charset=utf-8&lt;/code&gt; header: setting UTF-8 on the filehandle with&amp;nbsp;&lt;code&gt;binmode(STDOUT, ':utf8')&lt;/code&gt; is the minimum, but &lt;code&gt;:encoding(utf-8)&lt;/code&gt; instead of &lt;code&gt;:utf8&lt;/code&gt; makes stricter guarantees that real code points are coming out.&lt;/li&gt;&lt;li&gt;If you want to read a UTF-16 encoded document into a Unicode string: &lt;code&gt;open(FH, '&amp;lt; :encoding(utf-16)', $name)&lt;/code&gt;.&lt;/li&gt;&lt;li&gt;If you want to convert a Unicode string to a specific set of bytes for some encoding-unaware module to throw on the wire,&amp;nbsp;use the encode function from the Encode module: &lt;code&gt;use Encode; $message-&amp;gt;attr('content-type.charset', 'utf-16'); $message-&amp;gt;data(encode("UTF-16", $body));&lt;/code&gt;&lt;/li&gt;&lt;li&gt;If you want to read a file encoded with charset X, into a string encoded with charset Y, I've found no instant way to do this. &amp;nbsp;It's probably best to pass the input-encoding along as the output-encoding if at all possible.&amp;nbsp;&amp;nbsp;But you might find&amp;nbsp;&lt;code&gt;from_to()&lt;/code&gt; or string-IO as in &lt;code&gt;IO::File-&amp;gt;new(\$out, '&amp;gt;:')&lt;/code&gt; or maybe a whole PerlIO filter as in &lt;a href="http://search.cpan.org/~gfuji/PerlIO-code-0.03/lib/PerlIO/code.pm"&gt;PerlIO::code&lt;/a&gt;&amp;nbsp;helpful if you can't.&lt;/li&gt;&lt;li&gt;If you see "Wide character in ..." 
warnings, then you passed a string with code points &amp;gt;=0x100 to something that expected a byte string of some sort: either really latin-1, or an encoded string.&lt;/li&gt;&lt;li&gt;If you see longer strings of gibberish where you expected sensible non-ASCII characters, then you have probably double-encoded, either literally, or by printing an encoded string to a filehandle which does encoding.&lt;/li&gt;&lt;li&gt;If you see the Unicode replacement character in a stream that should be UTF-8, you haven't encoded at all, for example by printing a byte string on a raw filehandle in an environment expecting UTF-8. &amp;nbsp;Most likely, the filehandle should have an encoding set on it, per point #3 above, though that may cause #8 on other strings you've printed.&lt;/li&gt;&lt;/ol&gt;The most difficult thing for me to come to terms with was that Perl doesn't have any notion of "the string's encoding" &lt;i&gt;despite being Unicode-aware.&lt;/i&gt; &amp;nbsp;It may be a Unicode string, which you could reasonably call decoded.&amp;nbsp; Or, it's an encoded string, sometimes called a byte string--which happens to mean "a string of Unicode code points with values &amp;lt;= 0xFF." 
&amp;nbsp;This gets especially complicated by the UTF-8 flag, which you are not supposed to care about, and shouldn't have to care about if you understand all of the above.&lt;br /&gt;&lt;br /&gt;I would try to explain, but I've basically given up on trying to understand the UTF-8 flag for now.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-1641026755087855785?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/1641026755087855785/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=1641026755087855785&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1641026755087855785'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1641026755087855785'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2012/01/perl-and-unicode-in-brief.html' title='Perl and Unicode in Brief'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-7110435553034672057</id><published>2012-01-13T15:18:00.001-05:00</published><updated>2012-01-14T15:26:22.188-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='vim'/><category scheme='http://www.blogger.com/atom/ns#' term='tip'/><category scheme='http://www.blogger.com/atom/ns#' 
term='perl'/><title type='text'>A nice vim highlighting hack</title><content type='html'>I wanted to highlight places where control flow could be redirected in my perl code, so I hacked up my personal colorscheme file to highlight Exceptions specifically:&lt;br /&gt;&lt;br /&gt;&lt;pre style="margin-left:2%;"&gt;hi Exception ctermfg=white ctermbg=blue&lt;/pre&gt;&lt;br /&gt;Now, I just needed to define the things I wanted highlighted as Exception*. &amp;nbsp;Thus, the newly added &lt;code&gt;~/.vim/after/syntax/perl.vim&lt;/code&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre style="margin-left:2%;width:98%;overflow:auto;"&gt;" flow control highlighting&lt;br /&gt;syn keyword perlStatementCtlExit return die croak confess last next redo&lt;br /&gt;syn keyword perlStatementWarn    warn carp cluck&lt;br /&gt;hi link perlStatementCtlExit Exception&lt;br /&gt;hi link perlStatementWarn    Statement&lt;br /&gt;&lt;br /&gt;" and i'm tired of everything being yellow&lt;br /&gt;hi link perlStatementStorage Define&lt;/pre&gt;&lt;br /&gt;The last line isn't related to the above, but it recolors my/local/our in Preprocessor Blue instead of Statement Yellow. &amp;nbsp;They do, after all, affect the state of the compiler at parse time.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;* This means that I'm going to open a non-Perl file sometime and weird things will have Exception highlighting. 
&amp;nbsp;Nobody notices the subtle differences when it's all Statement colored by default.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-7110435553034672057?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/7110435553034672057/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=7110435553034672057&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7110435553034672057'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7110435553034672057'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2012/01/nice-vim-highlighting-hack.html' title='A nice vim highlighting hack'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-5926312319503857084</id><published>2012-01-11T22:58:00.000-05:00</published><updated>2012-01-11T22:58:13.350-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='http'/><category scheme='http://www.blogger.com/atom/ns#' term='chariot'/><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='xml'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' 
term='rpc'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><title type='text'>Layer 7 Routing: HTTP Ate the Internet</title><content type='html'>In the beginning was TCP/IP, and the predominant model was that servers would listen for clients using a pre-established port number.&amp;nbsp; Then came &lt;a href="http://en.wikipedia.org/wiki/Open_Network_Computing_Remote_Procedure_Call"&gt;Sun RPC&lt;/a&gt;, in which RPC servers were established dynamically, and listened on semi-random ports (still, one port per service provided); the problem was solved by baking the &lt;a href="http://en.wikipedia.org/wiki/Portmap"&gt;port mapper&lt;/a&gt; into the protocol.&amp;nbsp; The mapper listens on a pre-established port, and the client first connects there to inquire, "On what port shall I find service X?"&lt;br /&gt;&lt;br /&gt;Then came HTTP, the &lt;a href="http://en.wikipedia.org/wiki/OSI_model#Layer_6:_presentation_layer"&gt;layer 6&lt;/a&gt; protocol masquerading as layer 7.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;h3&gt;Doing One Thing&lt;/h3&gt;Shortly after I started this post, I ran across &lt;a href="http://cowbelljs.blogspot.com/2011/12/lost-art-of-telnet.html"&gt;The Lost Art of Telnet&lt;/a&gt;, which explains how the Internet used to be a bunch of stuff, listening on different ports, each running their own simple text protocol at layer 7.&amp;nbsp; gopher, telnet, MUDs and their friends, and IRC: all text based.&lt;br /&gt;&lt;br /&gt;That was the way HTTP started, as well.&amp;nbsp; To this day, you can often make a successful request with nothing more than the request line and an appropriate Host header.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Layer Six&lt;/h3&gt;The funny thing is that in all my classes that ever touched on the OSI Model, there was never any example for anything at layer 6.&amp;nbsp; Of course, the world was IP by then, so we weren't staring at the implementation the model was designed for, but I 
think TLS (née SSL) is the obvious candidate here.&lt;br /&gt;&lt;br /&gt;And today, I'd extend HTTP down there as well, so that it would be spanning layers 6-7.&lt;br /&gt;&lt;br /&gt;Wikipedia tells us that layer 6 included "capabilities such as converting an EBCDIC-coded text file to an ASCII-coded file...."&amp;nbsp; This would be irrelevant, except that HTTP explicitly permits servers to translate texts, which I learned &lt;a href="http://stackoverflow.com/a/4101763"&gt;while looking for the correct JavaScript MIME type&lt;/a&gt;.&amp;nbsp; In fact, Wikipedia lists MIME itself as a layer 6 protocol in the TCP/IP table.&lt;br /&gt;&lt;br /&gt;HTTP's various encoding options, such as chunked and/or compressed encodings, also clearly pertain to layer 6.&lt;br /&gt;&lt;br /&gt;Layer 7 is still clearly involved, though, since that's where entity bodies fall, as well as some of the metadata relating to them like Last-Modified.&lt;br /&gt;&lt;br /&gt;I'm actually not sure what layer HTTP redirection would match up with.&amp;nbsp; If the application allows transparent redirection, then it would appear as part of layer 6, but most libraries allow the application to be informed and specify further action when a redirect code is received.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Routing and REST&lt;/h3&gt;Something happened to HTTP over time.&amp;nbsp; The request-line itself provides almost no semantics, which means you can add your own: either through new verbs, the WebDAV/DeltaV way, or by tunneling your application through some other verb, Siri ACE style.&amp;nbsp; (Or POST with _method=put for all you web browsers out there.)&amp;nbsp; Although the spec says some operations are idempotent (have the same result if they are repeated multiple times), you don't have to respect that.&amp;nbsp; You can make state-changing GET requests, for instance those &lt;code&gt;/cgi-bin/counter.cgi&lt;/code&gt; images that were all the rage a decade ago.&lt;br /&gt;&lt;br /&gt;I think 
I first noticed the extension of semantics to URLs  in the days of Rails 1.1, when URLs were mapped into method calls, such as &lt;code&gt;POST /bread/slice/3&lt;/code&gt; which was handled by BreadController#slice.&amp;nbsp; What happened was that the URL space was &lt;i&gt;fully &lt;/i&gt;virtual, not just some filesystem view, and it could have application-level semantics associated with it.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;User Space Networking&lt;/h3&gt;HTTP had become, essentially, user-space networking; in contrast to TCP/IP and the data link layers, which were all in the kernel and had comparatively limited configuration available.&amp;nbsp; You just can't slap Perl into the TCP processing in the way you can put a PerlAuthenHandler into Apache with mod_perl.&lt;br /&gt;&lt;br /&gt;The ability to perform any action on any URL and have this translated into some real effect was actually envisioned by the designers of HTTP itself.&amp;nbsp; Many of the response codes, such as &lt;a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.3"&gt;202 Accepted&lt;/a&gt;, were created for this "HTTP as an interface to other systems" vision.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Firewall Evasion&lt;/h3&gt;There's another factor involved in the rise of HTTP as a near-universal transport, which is the predominance of the old paradigm of stateless firewalls filtering by TCP port.&amp;nbsp; You could block ports &lt;span title="Internet Relay Chat"&gt;6667&lt;/span&gt; and &lt;span title="Usenet / NNTP"&gt;119&lt;/span&gt; and nobody would be able to waste time.&amp;nbsp; Well, that was the theory.&amp;nbsp; In those times, processing power and available memory didn't allow for much more advanced filtering than port numbers anyway.&lt;br /&gt;&lt;br /&gt;In any case, the web turned out to be so important that it was usually unfiltered by these firewalls, so HTTP worked "everywhere" from the perspective of a client.&amp;nbsp; Thus, RTMP, SOAP, and probably many 
other protocols have defined standards for communicating over HTTP.&lt;br /&gt;&lt;br /&gt;The obvious result of allowing only "secure" communication in old times has been to &lt;i&gt;push "insecure" communications onto "secure" channels.&lt;/i&gt;&amp;nbsp; Therefore, it now takes deep packet inspection tools to sort out whether a given HTTP message is genuine HTTP or some tunneled protocol.&lt;br /&gt;&lt;br /&gt;You could carry nearly any layer inside HTTP if you wanted.&amp;nbsp; As a thought experiment, I just designed SOCKS over HTTP, which is amusingly absurd, but not much worse than an &lt;a href="http://en.wikipedia.org/wiki/Layer_2_Tunneling_Protocol"&gt;L2TP&lt;/a&gt; VPN.&amp;nbsp; And if you wanted to cheat, there's always the &lt;a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.42"&gt;Upgrade&lt;/a&gt; functionality.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Dumb Pipes are Still Dumb on a Smart Network&lt;/h3&gt;One last interesting bit of HTTP is the caching architecture.&amp;nbsp; If you obey the standards for idempotent verbs and provide caching headers, then intermediaries can perform their own processing on your requests.&amp;nbsp; The network is no longer a truly dumb pipe, shuttling data mindlessly between origin server and client.&lt;br /&gt;&lt;br /&gt;However, the pipes between the nodes remain somewhat dumb.&amp;nbsp; All they deal with is packets.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;OSI Goes Fractal&lt;/h3&gt;Wrapping up, HTTP bears characteristics of layers 5 (through cookies), 6, and 7, and virtually any other layer except 1 could, in theory, be tunneled over HTTP.&amp;nbsp; There are a &lt;i&gt;lot&lt;/i&gt; of tunnels defined already for various layers, even IP-in-IP, and the obvious cases of VPNs.&lt;br /&gt;&lt;br /&gt;Therefore HTTP is not "just" layer 7, and anything being carried at layer 7 may not be "just" application data: HTTP or SMTP may be acting as layer 6 for a SOAP message.&amp;nbsp; 
You could &lt;i&gt;technically &lt;/i&gt;be tunneling your Ethernet (and the TCP/IP it's carrying) inside those SOAP messages.&amp;nbsp; There's really no end to the amount of insanity you can come up with by wrapping tunnels inside each other.&lt;br /&gt;&lt;br /&gt;The disadvantage to such tunneling is that the ultimate application is affected by the performance characteristics of the tunnel, such as overhead.&amp;nbsp; Or in the case of TCP, tunneling it over a reliable transport interferes with its heuristics and yields &lt;a href="https://github.com/apenwarr/sshuttle#readme"&gt;unreliable performance&lt;/a&gt;.</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/5926312319503857084/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=5926312319503857084&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/5926312319503857084'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/5926312319503857084'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2012/01/layer-7-routing-http-ate-internet.html' title='Layer 7 Routing: HTTP Ate the Internet'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-7554249967620208411</id><published>2011-12-26T16:00:00.000-05:00</published><updated>2012-01-12T11:58:30.217-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rant'/><category scheme='http://www.blogger.com/atom/ns#' term='php'/><category scheme='http://www.blogger.com/atom/ns#' term='languages'/><title type='text'>PHP: retrospective</title><content type='html'>Once in a while, someone on Reddit asks for a justification of why everyone there hates PHP.&amp;nbsp; I never reply, because there's too much to list in a comment, but maybe I can write a definitive post here.&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;I've put my complaints in an ordered list more for your convenience in referring to the individual points, not because they're actually ordered.&amp;nbsp; &lt;b&gt;I have also limited this list to things that have annoyed &lt;i&gt;me &lt;/i&gt;in particular,&lt;/b&gt; so you won't find a simple list of commonly derided PHP warts here.&amp;nbsp; I knew about &lt;i&gt;them&lt;/i&gt; early enough to dodge them.&lt;br /&gt;&lt;br /&gt;Things that have made the language more clumsy to use:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;E_NOTICE: the language defines a default value (&lt;code&gt;null&lt;/code&gt;) for missing array keys, object properties, and variable names, then complains at you if you rely on this fact.&amp;nbsp; Adding injury to insult, some versions of the engine fetch and format the entire error message before passing it down to the layer that checks error_reporting to see if it needs to do anything.&lt;/li&gt;&lt;li&gt;E_ALL: it is not actually "all error levels".&amp;nbsp; E_STRICT was left out of it when 5.0.0 introduced that level, and it is only folded back in as of 5.4.0.&lt;/li&gt;&lt;li&gt;Write 
context: &lt;code&gt;isset(some_func())&lt;/code&gt; yields "Can't use function return value in write context" because, to suppress the E_NOTICE if it isn't set, &lt;code&gt;isset&lt;/code&gt; generates opcodes to fetch the value for writing, so it has to be an &lt;a href="http://en.wikipedia.org/wiki/Value_%28computer_science%29"&gt;lvalue&lt;/a&gt;.&amp;nbsp; This is also why it's a special operator: so it can use the write context to incidentally find out whether the value was already set.&lt;/li&gt;&lt;li&gt;Callbacks: the callback pseudo-type accepts any of: a string (name  of a function); an array of 2 strings (static call to class and method);  an array of 1 object and 1 string (method call on the object instance);  or an actual closure in PHP 5.3+.&amp;nbsp; The first and last can be called  using &lt;code&gt;$callback("arg", "list");&lt;/code&gt; but the OOP  calls cannot.&amp;nbsp; You have to pass them through &lt;code&gt;call_user_func&lt;/code&gt; instead.&lt;/li&gt;&lt;li&gt;There's &lt;code&gt;call_user_func_array&lt;/code&gt; because there's no &lt;code&gt;*args&lt;/code&gt; style operator.&lt;/li&gt;&lt;li&gt;No  keyword arguments: they have to be simulated with associative arrays.&amp;nbsp;  (It doesn't make sense to use the Builder pattern, because it's not  statically checked anyway.)&lt;/li&gt;&lt;li&gt;Language operators disguised as functions: &lt;code&gt;isset&lt;/code&gt;, &lt;code&gt;list&lt;/code&gt;, and so on.&amp;nbsp; I think you'd put &lt;code&gt;func_get_args()&lt;/code&gt; here too even though some limitations have been lifted in recent PHP versions.&lt;/li&gt;&lt;li&gt;Iterators: many, many functions expect and only operate on  arrays.&amp;nbsp; You can't pass anything that implements  &lt;code&gt;Traversable&lt;/code&gt; to  &lt;code&gt;http_build_query&lt;/code&gt;.&amp;nbsp; It fails on my machine  (5.3.2-1ubuntu4.11 in Lucid) with  "Corrupt member variable name", though in the past I do believe it would warn, then blithely do 
nothing.&lt;/li&gt;&lt;li&gt;&lt;code&gt;array_key_exists&lt;/code&gt;: since  &lt;code&gt;isset()&lt;/code&gt; returns &lt;code&gt;false&lt;/code&gt;  if a value is defined-but-null, there's an extra function for finding if an array key itself exists, which doesn't work on objects implementing  &lt;code&gt;ArrayAccess&lt;/code&gt; because that only handles the special language operators.&lt;/li&gt;&lt;li&gt;Magic quotes: the first thing anyone does is write an "input normalizer" that disenchants everything if magic_quotes is on.&amp;nbsp; Even then, PHP isn't aware of character encoding, so &lt;code&gt;addslashes()&lt;/code&gt; (which magic_quotes uses) once had a fix issued for &lt;i&gt;causing&lt;/i&gt; the SQL injection it was meant to block in some character encodings.&lt;/li&gt;&lt;li&gt;Since &lt;code&gt;short_open_tag&lt;/code&gt; and &lt;code&gt;asp_tags&lt;/code&gt; are optional, environment-independent code always needs &lt;code&gt;&amp;lt;?php echo ?&amp;gt;&lt;/code&gt; wrapped around variables, if it wants to pretend PHP is a "template" language.&lt;/li&gt;&lt;li&gt;htmlspecialchars: is just too long and full of options.&amp;nbsp; &lt;code&gt;me.inc.php&lt;/code&gt; therefore has &lt;code&gt;function HE($s) { return htmlspecialchars($s, ENT_QUOTES, 'UTF-8'); }&lt;/code&gt;.&amp;nbsp; And, it's not the right thing for encoding URI components.&amp;nbsp; This is a Web language that provides no help in producing HTML.&lt;/li&gt;&lt;li&gt;You barely have support for HTTP.&amp;nbsp; There's no shortcut for specifying "Let this cache for five minutes" nor "This expires on $DATE" nor offering a file download nor setting charset without having to specify the rest of the content type header.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;There's no obvious way to specify &lt;code&gt;100 Continue&lt;/code&gt; headers.&amp;nbsp; Your SAPI might require &lt;code&gt;header("HTTP/1.0 410 Gone")&lt;/code&gt; &lt;i&gt;even when you're handling an HTTP/1.1 
request.&lt;/i&gt;&lt;/li&gt;&lt;li&gt;There's no way to handle &lt;code&gt;Expect: 100-continue&lt;/code&gt; since mod_php doesn't have a way to invoke code between receiving the headers and receiving the request-body.&lt;/li&gt;&lt;li&gt;Likewise, the core team had to build upload progress monitoring themselves before you could do it portably.&lt;/li&gt;&lt;/ol&gt;&lt;/li&gt;&lt;li&gt;The date function defines meaning for individual alphabetical letters, so a format like "at 13:14 on Dec 12, '11" has to be specified with '\a\t' and '\o\n'.&amp;nbsp; In single quotes, so that "\t" and "\n" aren't tab and newline.&lt;/li&gt;&lt;li&gt;Class constants: can't be named dynamically.&lt;/li&gt;&lt;li&gt;Namespaces.&lt;ol&gt;&lt;li&gt;Using backslash instead of ::, or if that was &lt;i&gt;really&lt;/i&gt; so impossible, :::.&lt;/li&gt;&lt;li&gt;The random restrictions.&amp;nbsp; You can't actually import functions or variables, only namespaces and classes.&lt;/li&gt;&lt;li&gt;The resolution rules.&amp;nbsp; You can't just &lt;code&gt;throw new Exception&lt;/code&gt; anymore, unless you did &lt;code&gt;use \Exception&lt;/code&gt;.&lt;/li&gt;&lt;li&gt;Constant names cannot be defined dynamically.&amp;nbsp; You can&amp;nbsp;&lt;code&gt;define('MyNS\Foo')&lt;/code&gt; but you can only get it back using &lt;code&gt;constant()&lt;/code&gt;, because those constants are an entirely separate thing.&lt;/li&gt;&lt;/ol&gt;&lt;/li&gt;&lt;li&gt;PDO.&lt;ol&gt;&lt;li&gt;It caused a lot of segfaults in 5.1.x.&lt;/li&gt;&lt;li&gt;It doesn't have any conveniences like DBI's &lt;code&gt;selectcol_arrayref&lt;/code&gt; or &lt;code&gt;selectall_hashref($sql, 'id')&lt;/code&gt;.&lt;/li&gt;&lt;li&gt;Using certain versions of PHP+libmysql, prepares could fail and return &lt;code&gt;null&lt;/code&gt;, causing a fatal error when trying to call execute on it, and maybe segfaulting.&lt;/li&gt;&lt;li&gt;mysqli's bound parameter interface is even 
worse.&lt;/li&gt;&lt;/ol&gt;&lt;/li&gt;&lt;li&gt;Tokenizer extension: gives you either &lt;code&gt;array(T_TOKEN_TYPE, $literal_text)&lt;/code&gt; or a 1-character literal text string.&lt;/li&gt;&lt;li&gt;There's no parser/AST extension,  opcode dumper, or debugger bundled with the language.&amp;nbsp; Though we do have xdebug, at least.&lt;/li&gt;&lt;li&gt;The P&lt;b&gt;CRE&lt;/b&gt; extension names its functions beginning &lt;code&gt;p&lt;b&gt;reg&lt;/b&gt;_&lt;/code&gt;.&lt;/li&gt;&lt;li&gt;I'm always getting passthru/fpassthru/readfile, and shell_exec/exec/system mixed up.&amp;nbsp; Usually, I need readfile or proc_open respectively, and the latter is still a pain to work with.&amp;nbsp; (Especially if you want to launch something async, without invoking the shell, and also know if it actually died instead of launching OK.)&lt;/li&gt;&lt;/ol&gt;Philosophical issues:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Case-insensitivity: for function, class, and method names.&amp;nbsp; But not for variables, properties, or constants.&amp;nbsp; &lt;a href="http://www.i18nguy.com/unicode/turkish-i18n.html"&gt;The Turkish problem&lt;/a&gt;  was baked right in.&amp;nbsp; For some versions of the engine, case wasn't  preserved when reporting errors or returning class names from &lt;code&gt;get_class()&lt;/code&gt;.&amp;nbsp; Once it was, everyone's PHP4-compatible-&lt;code&gt;instanceof&lt;/code&gt; checks needed to be peppered with &lt;code&gt;strtolower&lt;/code&gt; everywhere.&lt;/li&gt;&lt;li&gt;Constants: &lt;code&gt;define()&lt;/code&gt; runs at runtime, and takes its  first argument as a string.&amp;nbsp; PHP steals Perl's bareword semantic of "equals the  constant name if undefined", which seems to be related to why  you can get "unexpected token T_STRING" out of &lt;code&gt;foo("data " CONSTANT . 
" blah");&lt;/code&gt;.&amp;nbsp; And why strings are T_CONSTANT_ENCAPSED_STRING.&lt;/li&gt;&lt;li&gt;The "impossible" ability to parse &lt;code&gt;$klass::foo()&lt;/code&gt;, also used as a justification against allowing "::" as the namespace separator, and which forced everyone to use &lt;code&gt;call_user_func(array($klass, 'foo'))&lt;/code&gt;, was implemented in PHP 5.3.&lt;/li&gt;&lt;li&gt;addslashes and stripslashes: they're &lt;b&gt;backslashes&lt;/b&gt;, not slashes.&lt;/li&gt;&lt;li&gt;Parentheses: optional on 0-argument constructors, and nowhere else.&lt;/li&gt;&lt;li&gt;Dynamic class names to the new operator: from &lt;code&gt;new $klass(...)&lt;/code&gt; to &lt;code&gt;new parent;&lt;/code&gt;.&lt;/li&gt;&lt;li&gt;Constructor return values: I'm ashamed to admit I wrote something that had a use for &lt;code&gt;$foo = parent::__construct()&lt;/code&gt;.&amp;nbsp; (Of course, in the standard use of &lt;code&gt;$f = new Foo();&lt;/code&gt; the return value is discarded, because &lt;code&gt;new&lt;/code&gt; returns the constructed instance.)&lt;/li&gt;&lt;li&gt;Functions that do things functions can't do: &lt;code&gt;compact&lt;/code&gt; and &lt;code&gt;extract&lt;/code&gt;.&amp;nbsp; (Yes, these can be called through &lt;code&gt;call_user_func&lt;/code&gt;.)&lt;/li&gt;&lt;li&gt;Variables that do things variables can't do: superglobals.&lt;/li&gt;&lt;li&gt;Until PHP 5.1.3, true/false/null were looked up at runtime using the  standard constant mechanism.&amp;nbsp; Though they are case-insensitive in spite  of this.&lt;/li&gt;&lt;li&gt;It's almost shared-nothing, &lt;i&gt;except &lt;/i&gt;for the file upload handlers and session extension, which usually comes configured out-of-the-box to produce the longest ID possible.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;In other news, the things that PHP does exceptionally well:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;SimpleXML.&lt;/li&gt;&lt;li&gt;The procedural PCRE interface.&amp;nbsp; It's not Perl or Ruby, but at least it's not 
Python.&lt;/li&gt;&lt;li&gt;There is no three.&lt;/li&gt;&lt;/ol&gt;I think we're &lt;a href="http://sapphirepaw.blogspot.com/2011/11/programming-languages-to-learn.html"&gt;done&lt;/a&gt; here.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Loved this?&amp;nbsp; Hated it?&amp;nbsp; Blog about it and drop the link to @&lt;a href="https://twitter.com/sapphirepaw_org"&gt;sapphirepaw_org&lt;/a&gt; on Twitter.&lt;/i&gt;&lt;br /&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7554249967620208411'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7554249967620208411'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/12/php-retrospective.html' title='PHP: retrospective'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-7291381344451792</id><published>2011-12-16T23:04:00.002-05:00</published><updated>2011-12-16T23:14:53.366-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='war story'/><title type='text'>War Story: The Stock List</title><content type='html'>&lt;b&gt;A day like any other:&lt;/b&gt; In order to test that all the categories of products are behaving correctly on the website, I spend an hour writing a page to display a table of in-stock (further subdivided) and out-of-stock items.&lt;br /&gt;&lt;br 
/&gt;&lt;b&gt;About 6 business days after finishing,&lt;/b&gt; while waiting for review: instead of reading the entire history of every single Planet MySQL blog, I spend another half hour fancying up the CSS of my page.&amp;nbsp; My boss catches me, asks what the page is about, rejects the hypothesis that testing is important, and lectures me.&amp;nbsp; We are not making enough money to pay you your pathetic rate; do not do extra work.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Several&lt;/b&gt; business days later: the system is finally approved and live.&amp;nbsp; Nobody in the office is trained on it when an order comes in.&amp;nbsp; The order is for an out-of-stock item.&amp;nbsp; The Big Boss is rather angry, and demands to know whether there is some way to find out "what the site thinks it has in stock."&amp;nbsp; My boss answers "No."&amp;nbsp; I am silent.&amp;nbsp; I'm already looking for a new job.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Business Day 88&lt;/b&gt; (about four months into the 90-day evaluation period): after 2 days and 2 emails, I finally get a meeting with the Big Boss to announce that I'm going to terminate my at-will employment after Day 89 to start my next job, 45 miles closer to home, at $pay * 1.38 + $benefits * 1.25.&amp;nbsp; (I ultimately decide to tell him the exact offered salary, though I can't tell if he's BS'ing me on whether it's an acceptable/common  question to ask, because I figure he won't match it.&amp;nbsp; He doesn't even try to come up with a counteroffer.)&amp;nbsp; He threatens that I might need to stay 2 weeks because he doesn't know if I can leave.&amp;nbsp; The last project was finished somewhere around Day 76, and has been waiting for review.&amp;nbsp; Every time I pinged my boss on a review, ever, including this final task, the answer was: "later today."&lt;br /&gt;&lt;br /&gt;Day 89 was thankfully uneventful.</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/7291381344451792/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=7291381344451792&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7291381344451792'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7291381344451792'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/12/war-story-stock-list.html' title='War Story: The Stock List'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-2397088596209287015</id><published>2011-12-13T19:36:00.001-05:00</published><updated>2011-12-13T19:36:15.761-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='dns'/><category scheme='http://www.blogger.com/atom/ns#' term='admin'/><category scheme='http://www.blogger.com/atom/ns#' term='tip'/><category scheme='http://www.blogger.com/atom/ns#' term='mail'/><title type='text'>Observations on SPF: Sender Policy Framework</title><content type='html'>Recently at work, I updated our SPF policy to something accurate.&amp;nbsp; Along the way, to understand the policy I was deploying and what the previous version actually meant, I had to understand the various rules and types 
involved.&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;&lt;h3&gt;  What's the difference between Sender ID and SPF?&lt;/h3&gt;As far as I can tell, Sender ID (&lt;a href="http://www.microsoft.com/mscorp/safety/technologies/senderid/default.mspx"&gt;Microsoft&lt;/a&gt;; &lt;a href="http://en.wikipedia.org/wiki/Sender_ID"&gt;Wikipedia&lt;/a&gt;) supports more ways to identify the sender: not just the SMTP MAIL command (what I am calling the "From address" below), but also message headers.&amp;nbsp; They refer to the address that they decide to use as the Purported Responsible Address, PRA.&amp;nbsp; Other than that, the overall design seems practically equivalent to SPF.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;  What are the differences among 'a', 'ptr', and 'mx' in the SPF record?&lt;/h3&gt;These are all SPF "mechanisms", also called "types", to avoid confusion with the DNS records they're named after.&lt;br /&gt;&lt;br /&gt;A bare "a" indicates that a forward (DNS A record) lookup for the hostname in the From address must be done.&amp;nbsp; If any resulting IP matches the IP of the sending machine, then the mail is permitted.&amp;nbsp; The form of "a:mail.example.com" means the same thing, but the A record to be looked up is "mail.example.com" instead, not the From address.&lt;br /&gt;&lt;br /&gt;The "mx" type is similar, except that the mail exchanger (MX) record of the From address is looked up, with the resulting domain name used. &amp;nbsp;That is, a plain "mx" is equivalent to "a:mx.example.com" if the MX lookup returned mx.example.com. &amp;nbsp;The form "mx:example.net" is similar, but instead of using the From address domain, it is the MX record of example.net that is looked up in the first step. 
&amp;nbsp;I don't think the mx type has that much use in the real world; SPF is concerned with sending outbound email, while the MX records are for receiving inbound email.&lt;br /&gt;&lt;br /&gt;The "ptr" type indicates that the sender's IP address should be looked up with a reverse (PTR) lookup. &amp;nbsp;(To protect against spoofing, the PTR hostname is then looked up forward (A), and if the original IP doesn't come back out, the "ptr" mechanism is ignored.) &amp;nbsp;Then, if the PTR hostname has the same suffix as the From address hostname, the mail is permitted.&amp;nbsp; Again, the From address hostname can be overridden by giving a hostname to the ptr mechanism, as in "ptr:example.com".&amp;nbsp; This lets any machine with a reverse-address such as mail4.b.stl.example.com send mail (as *.example.com matches example.com).&lt;br /&gt;&lt;br /&gt;Technically, when I said the mail is permitted, I mean "the mechanism matched"; since I didn't supply an action (a "qualifier" in SPF terms) like "~a", the default is "+a", so the action taken because the mechanism matched is acceptance, more often called SPF Pass.&amp;nbsp; (This post just reflects how I think about it inside my head.)&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;You said "A record," which is IPv4.&amp;nbsp; This is the future: what about IPv6?&lt;/h3&gt;It turns out that SPF is defined intelligently; if the sender's connection is over &lt;a href="http://en.wikipedia.org/wiki/IPv6"&gt;IPv6&lt;/a&gt;, then all "A lookups" I mentioned above &lt;a href="http://www.gossamer-threads.com/lists/spf/help/35598"&gt;become AAAA lookups&lt;/a&gt;, and of course the PTR lookup also operates on the IPv6-specific domain (ip6.arpa).&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;An SPF policy can declare ANY unrelated domain to be a permitted sender?&lt;/h3&gt;Yes. 
&amp;nbsp;The designation happens &lt;i&gt;in the SPF record,&lt;/i&gt; which is &lt;b&gt;initially&lt;/b&gt; looked up through the From address.&amp;nbsp; Therefore, there's nothing that example.com can (legally) do to be specified as responsible for example.org's mail without cooperation, since they can't alter example.org's SPF record directly.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;  Can't spammers use SPF too?&amp;nbsp; They already do fancy DNS tricks for their websites.&lt;/h3&gt;Yes, and I've seen it in the wild (a spam message with SPF permitting it.)&lt;br /&gt;&lt;br /&gt;Logically, then, SPF's role must be to &lt;i&gt;authenticate the sender of a message.&lt;/i&gt;&amp;nbsp; Once the sender can be trusted (SPF accepts the message), the domain name can be used to look up its reputation.&amp;nbsp; In older times, a spammer could send mail claiming to be From example.com, trading on example.com's reputation, and there was no way to uncover the deception. &amp;nbsp;SPF should solve that last problem, but authentication alone doesn't guarantee non-spam email.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt; So what does your SPF record look like, sapphirepaw?&lt;/h3&gt;At work, we send out mail from one local server; from our hosted Exchange; and soon, from Amazon &lt;a href="http://aws.amazon.com/ses"&gt;SES&lt;/a&gt;. &amp;nbsp;The policy is therefore a simple chain of rules:&lt;br /&gt;&lt;pre style="overflow-x: auto; overflow-y: hidden; overflow: auto; width: 100%;"&gt;v=spf1 a:notifier.example.com ptr:xchg.example.net include:amazonses.com -all&lt;/pre&gt;The "a" type specifies our outbound server; the ptr is specified by our Exchange hosting company; the include is specified by Amazon SES; and we forbid the rest. 
&amp;nbsp;For the road warrior, native Exchange, smtp+imap access, and &lt;a href="http://en.wikipedia.org/wiki/Outlook_Web_App"&gt;OWA&lt;/a&gt; will all send via xchg.example.net, so we didn't need to make any special provisions there.&lt;br /&gt;&lt;br /&gt;Names have been changed, except for Amazon SES, which is a public service with publicly available documentation. &amp;nbsp;There didn't seem to be much point in hiding that.&lt;br /&gt;&lt;br /&gt;At home, sapphirepaw.org has just one&amp;nbsp;server, so it's dead simple:&lt;br /&gt;&lt;pre&gt;v=spf1 a ~all&lt;/pre&gt;I haven't gotten around to thoroughly testing it yet, so it still soft-fails, but that's how it is so far.&amp;nbsp; Both work and home records are published as TXT since the respective DNS managers don't support SPF types yet.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;You can find more  posts like this under the &lt;a href="http://sapphirepaw.blogspot.com/search/label/tip"&gt;tip&lt;/a&gt;&amp;nbsp;tag; or check&amp;nbsp;&lt;a href="http://sapphirepaw.blogspot.com/search/label/explanation"&gt;explanation&lt;/a&gt;&amp;nbsp;for longer, in-depth technical articles. &amp;nbsp;You can also &lt;a href="https://twitter.com/sapphirepaw_org"&gt;follow me&lt;/a&gt; on twitter. 
&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-2397088596209287015?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/2397088596209287015/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=2397088596209287015&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2397088596209287015'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2397088596209287015'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/12/observations-on-spf-sender-policy.html' title='Observations on SPF: Sender Policy Framework'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-1675284986810120490</id><published>2011-12-08T20:54:00.001-05:00</published><updated>2011-12-16T23:05:09.215-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='war story'/><category scheme='http://www.blogger.com/atom/ns#' term='admin'/><category scheme='http://www.blogger.com/atom/ns#' term='security'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><title type='text'>War Story: power.sh</title><content type='html'>When I attended community college, the computers in the labs were running Windows 95, pretty much in a state of 
constant hilarity.&amp;nbsp; I'll get to that some other time, though; today's wacky hijinks are about &lt;b&gt;The Server&lt;/b&gt;: the most secure machine on all of the campus, since it was the master authentication source.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The Server was running &lt;a href="http://en.wikipedia.org/wiki/SunOS"&gt;SunOS&lt;/a&gt;, if I do remember correctly.&amp;nbsp; Though that makes it a tad more obsolete than the Windows 95 machines.&lt;br /&gt;&lt;br /&gt;In any case, when logged into the Sun via good olde Telnet, we puny students were running in the shell's &lt;a href="http://www.wlug.org.nz/rbash%281%29"&gt;restricted&lt;/a&gt; mode.&amp;nbsp; One of the goals of this process was to prevent us from having the Full Power of &lt;strike&gt;UNIX&lt;/strike&gt; SunOS at our fingertips.&amp;nbsp; I have long since forgotten the allowed commands, but there were only a handful of them, such that they used less than two lines when listed out.&amp;nbsp; The only one of these that was &lt;i&gt;technically &lt;/i&gt;needed in my time at that college was &lt;a href="http://linux.die.net/man/5/passwd"&gt;passwd&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Though it prevented writes to $PATH so that I couldn't &lt;i&gt;add &lt;/i&gt;"allowed" commands, rbash didn't prevent &lt;i&gt;reading &lt;/i&gt;from it.&amp;nbsp; I noticed that &lt;code&gt;~/bin&lt;/code&gt; was part of the list, and of course we had write access to our home directories through some GUI on the Windows machines.&amp;nbsp; (Whatever provided that access also allowed for setting permission bits.)&amp;nbsp; I can't really remember how that all worked, but I quickly studied the Internet and put together a shell script in Notepad that would simply run an &lt;i&gt;unrestricted &lt;/i&gt;shell.&amp;nbsp; This was the ever-so-subtly named &lt;b&gt;power.sh&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;Then I discovered that Notepad writes "CRLF" &lt;a 
href="http://en.wikipedia.org/wiki/Newline"&gt;line-endings&lt;/a&gt;, and the extra "CR" characters were preventing the script from running.&amp;nbsp; I searched the Internet again; clearly, I needed &lt;code&gt;dos2unix&lt;/code&gt;.&amp;nbsp; But the code I found was implemented as &lt;i&gt;another shell script&lt;/i&gt; that just called &lt;code&gt;tr -d '\r'&lt;/code&gt;.&amp;nbsp; Being rather ignorant, I went home and rewrote power.sh on my Linux machine there.&amp;nbsp; I tested it at home, but it was still torture to wait for the next day in the lab to try it out.&lt;br /&gt;&lt;br /&gt;It worked.&amp;nbsp; I felt like the biggest genius on the whole campus, having clearly outsmarted the &lt;i&gt;Guys Who Were Paid to Secure the Server&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;But other than that, it was actually kind of anticlimactic.&amp;nbsp; Being so ignorant, I didn't really have a use for the Full Power of &lt;strike&gt;UNIX&lt;/strike&gt; SunOS myself.&amp;nbsp; I probably would have explored it harder if I hadn't had Linux at home, though.&lt;br /&gt;&lt;br /&gt;I also didn't bother to tell the administrators.&amp;nbsp; According to the policy, merely &lt;b&gt;attempting&lt;/b&gt; to circumvent &lt;i&gt;any &lt;/i&gt;restriction for &lt;i&gt;any &lt;/i&gt;purpose, including but not limited to reporting the vulnerability, was grounds for having your access revoked.&amp;nbsp; And as a computer science major, that was just a senseless risk to take.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Need more pointless rambling?&amp;nbsp; Follow me &lt;a href="https://twitter.com/sapphirepaw_org"&gt;on twitter&lt;/a&gt;.&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-1675284986810120490?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://sapphirepaw.blogspot.com/feeds/1675284986810120490/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=1675284986810120490&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1675284986810120490'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1675284986810120490'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/12/war-story-powersh.html' title='War Story: power.sh'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-7576319946976209014</id><published>2011-12-02T21:49:00.000-05:00</published><updated>2011-12-12T21:38:52.562-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='chariot'/><category scheme='http://www.blogger.com/atom/ns#' term='tradeoffs'/><category scheme='http://www.blogger.com/atom/ns#' term='xml'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='json'/><category scheme='http://www.blogger.com/atom/ns#' term='languages'/><title type='text'>The Chariot</title><content type='html'>I read "&lt;a href="http://www.mnot.net/blog/2011/11/25/linking_in_json"&gt;Linking in JSON&lt;/a&gt;" the other day.&amp;nbsp; I knew someone had already gotten started on &lt;a href="http://json-schema.org/"&gt;JSON 
Schema&lt;/a&gt;.&amp;nbsp; (A quick search shows JSON namespace ideas floating around.)&amp;nbsp; JSON as the lightweight alternative to XML being turned into XML?&amp;nbsp; This is beginning to sound familiar.&lt;br /&gt;&lt;br /&gt;With lightweight formats, we tend to get a &lt;a href="http://www.libpng.org/"&gt;proliferation&lt;/a&gt; of &lt;a href="http://en.wikipedia.org/wiki/JPEG_Network_Graphics"&gt;variants&lt;/a&gt; for &lt;a href="http://en.wikipedia.org/wiki/APNG"&gt;different&lt;/a&gt; &lt;a href="http://en.wikipedia.org/wiki/Multiple-image_Network_Graphics"&gt;uses&lt;/a&gt;.&amp;nbsp; (&lt;a href="http://en.wikipedia.org/wiki/INI_file"&gt;Not&lt;/a&gt; &lt;a href="http://www.yaml.org/"&gt;just&lt;/a&gt; &lt;a href="http://search.cpan.org/search?query=Storable&amp;amp;mode=module"&gt;images&lt;/a&gt;, naturally, and CPAN manages to use &lt;i&gt;more than one.&lt;/i&gt;)&amp;nbsp; &lt;a href="http://en.wikipedia.org/wiki/Tagged_Image_File_Format"&gt;Heavier&lt;/a&gt; formats tend to have a problem I'm going to call "&lt;b&gt;Accessories Not Included&lt;/b&gt;": they get sufficiently &lt;a href="http://en.wikipedia.org/wiki/Portable_Document_Format"&gt;large&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/Advanced_Audio_Coding"&gt;complex&lt;/a&gt; that not all readers support all format options.&amp;nbsp; If the growth is arrested early enough, you end up with a handful of profiles; if it gets out of hand, you have &lt;a href="http://en.wikipedia.org/wiki/H.264/MPEG-4_AVC#Profiles"&gt;over ten of them&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I never expected XML to be &lt;a href="http://xmlrpc.scripting.com/"&gt;so&lt;/a&gt; &lt;a href="http://en.wikipedia.org/wiki/SOAP"&gt;widely&lt;/a&gt; &lt;a href="http://consolibyte.com/wiki/doku.php/quickbooks_example_qbxml"&gt;used&lt;/a&gt; for &lt;a href="http://en.wikipedia.org/wiki/RSS"&gt;so&lt;/a&gt; &lt;a href="http://bitworking.org/projects/atom/rfc5023.html"&gt;much&lt;/a&gt; &lt;a 
href="https://www.usps.com/business/webtools.htm"&gt;stuff&lt;/a&gt;, or spawn &lt;a href="http://www.w3schools.com/xquery/default.asp"&gt;so&lt;/a&gt; &lt;a href="http://www.w3schools.com/xpath/"&gt;many&lt;/a&gt; &lt;a href="http://www.w3.org/TR/xptr-framework/"&gt;related&lt;/a&gt; &lt;a href="http://en.wikipedia.org/wiki/XInclude"&gt;specs&lt;/a&gt;.&amp;nbsp; After all, it was verbose!&amp;nbsp; And you could make up &lt;i&gt;just any tag name you wanted!&lt;/i&gt;&amp;nbsp; But it turns out to scale well from a not-quite-&lt;a href="http://us2.php.net/manual/en/book.simplexml.php"&gt;simple&lt;/a&gt; &lt;b&gt;tree data structure&lt;/b&gt;, with &lt;a href="http://clhs.lisp.se/Body/f_symb_4.htm"&gt;annotated&lt;/a&gt; nodes, all the way up to unfashionable Enterprisey uses.&amp;nbsp; But scalability bothers people, because who knows what &lt;a href="http://sites.google.com/site/steveyegge2/art-of-the-witch-hunt"&gt;wacky thing&lt;/a&gt; someone else on the team is going to foist on you, so more restricted alternatives rise to popularity.&amp;nbsp; If this is true, the rise of Java should correlate to (non-game) companies getting burned enough times on C++.&amp;nbsp; (If you think about &lt;a href="http://www.paulgraham.com/icad.html"&gt;Safety&lt;/a&gt;, it makes sense.&amp;nbsp; &lt;i&gt;He wants you to use Java, because he can't hack reversing the polarity of the &lt;a href="http://tvtropes.org/pmwiki/pmwiki.php/Main/TechnoBabble"&gt;template stack flow&lt;/a&gt;.&lt;/i&gt;)&lt;br /&gt;&lt;br /&gt;This sort of thing is practically destined to &lt;a href="http://www.retrologic.com/jargon/W/wheel-of-reincarnation.html"&gt;keep happening&lt;/a&gt;.&amp;nbsp; More features generally cost more memory and processing time, or some other inconvenience like a &lt;a href="http://www.cheetahtemplate.org/docs/users_guide_html_multipage/howWorks.cheetah-compile.html"&gt;compilation&lt;/a&gt; step, which is against the &lt;a 
href="http://www.retrologic.com/jargon/H/holy-wars.html"&gt;religion&lt;/a&gt; of some developers.&amp;nbsp; Thus, &lt;a href="http://www.zeromq.org/"&gt;lightweight&lt;/a&gt; &lt;a href="http://www.restms.org/"&gt;versions&lt;/a&gt; of things spring up in opposition to &lt;a href="http://activemq.apache.org/amqp.html"&gt;whatever&lt;/a&gt; is perceived to be too heavy.&amp;nbsp; Sometimes compilation &lt;i&gt;is &lt;/i&gt;considered the lightweight alternative, since it's not done on &lt;a href="http://stackoverflow.com/questions/95635/what-does-a-just-in-time-jit-compiler-do"&gt;every&lt;/a&gt; &lt;a href="http://www.php.net/"&gt;request&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Though sometimes many similar projects proliferate because they just aren't that hard.&amp;nbsp; It's easier to write a  web framework  than to learn one, so there are &lt;a href="http://en.wikipedia.org/wiki/Comparison_of_web_application_frameworks"&gt;a lot of them&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: smaller; font-style: italic;"&gt;Made it out of the link forest and need something more to kill the time?&amp;nbsp; Maybe you want to subscribe to my &lt;a href="http://sapphirepaw.blogspot.com/feeds/posts/default"&gt;feed&lt;/a&gt;, or &lt;a href="https://twitter.com/sapphirepaw_org"&gt;follow me&lt;/a&gt; on twitter.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-7576319946976209014?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/7576319946976209014/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=7576319946976209014&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7576319946976209014'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7576319946976209014'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/12/chariot.html' title='The Chariot'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-8113091696933202437</id><published>2011-11-28T22:17:00.000-05:00</published><updated>2011-11-28T22:17:06.629-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tradeoffs'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='ui'/><title type='text'>Resolution Dependence</title><content type='html'>Why are device pixels so &lt;i&gt;meaningful &lt;/i&gt;that we get stuck designing around pixels, even though we "know" we should design for device-independent units?&lt;br /&gt;&lt;br /&gt;The main characteristic of a pixel is that it is &lt;b&gt;crisp.&lt;/b&gt;&amp;nbsp; When rendered on a display with 50% more &lt;a href="http://en.wikipedia.org/wiki/Pixel_density"&gt;PPI&lt;/a&gt;, a 1px line will be either thinner (physical size reduced) or antialiased (blurred).&amp;nbsp; On the other hand, doubling the PPI lets those 1px lines render &lt;i&gt;exactly &lt;/i&gt;as crisply, on precisely 2 physical pixels.&amp;nbsp; (More crispness &lt;a href="http://en.wikipedia.org/wiki/Pixel_density"&gt;is possible&lt;/a&gt;, but Apple's version doesn't  alter any art.)&lt;br /&gt;&lt;br /&gt;If a user wants to zoom so that features are 
physically 50% larger, then the same problems of rendering 1-pixel features on 1.5px areas occur, but this time we know we can't tweak physical size.&amp;nbsp; Antialiasing happens instead, resulting in a zoomed but blurry UI.&amp;nbsp; Worse, subpixel rendering adds noise when not rendering precisely onto the intended subpixels, but the font rendering is done by the time the zooming layer gets to see it on Linux.&lt;br /&gt;&lt;br /&gt;Unless &lt;i&gt;everything&lt;/i&gt; is lovingly hinted and/or provided at multiple PPI steps, there's basically no solution to the problem.&amp;nbsp; I'm willing to bet that people will skip properly handling multiple PPI settings if it's any more complicated than supporting power-of-2 sizes.&amp;nbsp; As long as pixels matter, which they will up to 600 PPI or more, people are going to design for pixels.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-8113091696933202437?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/8113091696933202437/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=8113091696933202437&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8113091696933202437'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8113091696933202437'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/11/resolution-dependence.html' title='Resolution Dependence'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' 
width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-6891050485553565960</id><published>2011-11-23T21:47:00.000-05:00</published><updated>2011-11-23T21:47:02.956-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rest'/><category scheme='http://www.blogger.com/atom/ns#' term='tradeoffs'/><category scheme='http://www.blogger.com/atom/ns#' term='technique'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='rpc'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>REST and RPC: Not Actually Antonyms</title><content type='html'>Last month found me writing &lt;a href="http://sapphirepaw.blogspot.com/2011/10/trouble-with-rest.html"&gt;a rant&lt;/a&gt; about REST and the shortcomings of interpreting it as "REST =  HTTP+HATEOAS".&amp;nbsp; I submerged myself into some writings of Fielding, and took some time for reflection, and I've found one of the sources of my problems with "REST".&amp;nbsp; (&lt;a href="http://blogs.msdn.com/b/oldnewthing/archive/2011/11/15/10237051.aspx"&gt;Another.&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;This problem is that there's too much writing on the web that attacks "RPC systems" as the logical opposite of REST, and I took this assumption unquestioned.&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;&lt;h3&gt;State Transfer&lt;/h3&gt;The very foundation of REST, the part that puts "ST" in "REST" and relegates the "RE" to a mere Latinate adjective, is &lt;i&gt;state transfer.&lt;/i&gt;&amp;nbsp; This is, as far as I can tell, the basic property that makes REST distinct from other styles, and also interesting.&amp;nbsp; The whole thing about state management in REST is: &lt;b&gt;do not keep hidden state; &lt;i&gt;transfer it 
all.&lt;/i&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;You know, "shared nothing."&amp;nbsp; Like &lt;a href="http://en.wikipedia.org/wiki/Network_File_System_%28protocol%29"&gt;NFS&lt;/a&gt;v2 or "modern" scalable architecture.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;RPC Isn't About State&lt;/h3&gt;Wikipedia focuses on the definition of "RPC" as a means of invoking a service on some server in a manner indistinguishable at point-of-use from some local function call.  It just happens to be a special function (the RPC stub) that packs up its arguments and sends them over the network.&lt;br /&gt;&lt;br /&gt;Since the server doesn't have access to the process-local memory, the RPC call must also be a &lt;i&gt;pure function:&lt;/i&gt; all of the state it needs to perform its action is carried in its arguments, and your program is not affected beyond receiving the return value.&amp;nbsp; This is state transfer.&amp;nbsp; Can it be called REST?&amp;nbsp; It probably can, if the function is duck-typed—when any representation can conform to the interface and be accepted.&lt;br /&gt;&lt;br /&gt;What about HATEOAS with our pure-functional RPC?&amp;nbsp; As I understand it, the rationale for HATEOAS is that the server should be in control of its namespace.&amp;nbsp; The systems that I am familiar with and labeled RPC are available on the network at a single URL, just like a "proper" RESTful server.&amp;nbsp; Depending on the exact protocol, there may be a separate file specifying what methods, parameters, and return types are available. 
Providing a definition file allows the server to retain control of its namespace, as long as clients don't cache a stale definition for too long.&lt;br /&gt;&lt;br /&gt;Of course, nothing prevents an RPC system from implementing its own definition method, or proxying individual methods to other servers on the backend.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;But the Internet Says a "Method" Parameter is Definitely Not RESTful?&lt;/h3&gt;I know, but I don't think that RPC &lt;i&gt;prevents &lt;/i&gt;a RESTful design from occurring.&amp;nbsp; If the server can specify what methods it has, and also use that specification in results (such as listCustomers() being able to indicate "customer id 1, details at getCustomer(id=1)") to form a hyperlink, then it can be RESTful.&lt;br /&gt;&lt;br /&gt;The implementation probably doesn't work out very well with RPC stubs in languages like C, where the function names to be called have to be fully known at compile time, well &lt;i&gt;before&lt;/i&gt; talking to the server.&amp;nbsp; But a language that allows for running code when a function name isn't found is perfectly capable of the trick.&lt;br /&gt;&lt;br /&gt;However, most RPC systems probably don't have the hyperlink aspect to them, in which case they would indeed not be RESTful.&amp;nbsp; This is what makes for the above rule: a "method" parameter generally indicates RPC, which is generally not RESTfully designed, therefore the method parameter is strongly correlated with non-REST architecture.&amp;nbsp; That correlation is simply not a &lt;i&gt;guarantee.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Batch Operations&lt;/h3&gt;Most of the RESTful API design advice I can find seems to be centered around basic CRUD, using the familiar GET, POST, PUT, and DELETE verbs to act on a single object in the data store at a time: &lt;code&gt;DELETE /users/1 If-Match: SomeETagValue&lt;/code&gt; and so forth.&lt;br /&gt;&lt;br /&gt;Actually, I lied.&amp;nbsp; Most of the design advice doesn't 
even cover If-Match.&amp;nbsp; It also ignores searching, as well as anything that you would approach with a transaction in the average non-RESTful system.&amp;nbsp; I ran across a comment (#22 at &lt;a href="http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven"&gt;this page&lt;/a&gt;) by Fielding on the batch operation matter:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;People perceive a need for batch operations because they don’t understand the scope of resources. &lt;br /&gt;&lt;br /&gt;Resources are not storage items (or, at least, they aren’t always  equivalent to some storage item on the back-end).  The same resource  state can be overlayed by multiple resources, just as an XML document  can be represented as a sequence of bytes or a tree of individually  addressable nodes. &lt;i&gt;Likewise, a single resource can be the equivalent of a  database stored procedure, with the power to abstract state changes  over any number of storage items.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;If you find yourself in need of a batch operation, then most likely you just haven’t defined enough resources.&lt;/blockquote&gt;Emphasis mine.&lt;br /&gt;&lt;br /&gt;It seems that being equal to a stored procedure is simply another implication of state transfer: &lt;i&gt;all the necessary state&lt;/i&gt; to perform the operation is sent in a single request.&amp;nbsp; HTTP features like &lt;code&gt;If-Match&lt;/code&gt; offer the ability to use the server in a &lt;a href="http://en.wikipedia.org/wiki/Non-blocking_algorithm"&gt;lock-free&lt;/a&gt; style.&amp;nbsp; Request some bits, do work, and send updated bits with some sort of version indication so that the server can tell whether they're out of date.&amp;nbsp; If they are, the server rejects the write (a failed &lt;code&gt;If-Match&lt;/code&gt; yields 412 Precondition Failed; some APIs report 409 Conflict instead) and the client tries again.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;URLs as Identities&lt;/h3&gt;If you're not familiar with Clojure: identities are a series of states (actual stored data) over time.&amp;nbsp; These are 
very similar to variables, but Clojure offers identities as specific &lt;i&gt;kinds &lt;/i&gt;of variables, each with unique semantics for updating them.&amp;nbsp; A value is simply the state of an identity at some point in time.&amp;nbsp; Conceptually, it's a copy, so that once you have the value, it doesn't change, even if the state of the identity is updated.&lt;br /&gt;&lt;br /&gt;If we access /users/3 in our RESTful API, we expect to get the most recent version of user 3.&amp;nbsp; Since the data at this URL can change over time, it's clearly not a value, so the URL must be an identity instead.&amp;nbsp; The URL+ETag together form a &lt;i&gt;state, &lt;/i&gt;and the associated data is the value.&lt;br /&gt;&lt;br /&gt;The lock-free style must identify a specific &lt;i&gt;state, &lt;/i&gt;so that a conflict error can be generated if the state operated on doesn't match the current state.&amp;nbsp; That's what atomic compare-and-swap instructions offer in client-side lock-free programming, and what &lt;code&gt;If-Match&lt;/code&gt; provides when we PUT /users/3; but how does it carry over to batch operations, where the resource may need to identify several entities' states?&lt;br /&gt;&lt;br /&gt;Apparently, the URLs need to include the ETag to indicate the &lt;i&gt;state &lt;/i&gt;rather than the &lt;i&gt;identity.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Refining REST&lt;/h3&gt;Now that I've defined state as the value of the entity at a URL at some point in time, we can use that definition to get a deeper understanding of state transfer. &amp;nbsp;In its basic form, when we PUT /users/3, we have both an identity (the request URL) and a value; the request itself is asking to associate the two, as of the current time. 
&amp;nbsp;Once time is involved, we can say there's a state: the PUT request is defining a new state of the identity.&lt;br /&gt;&lt;br /&gt;Therefore, "state transfer" means that we're sending a complete state along the wire, either literally or by reference through URIs to other states. &amp;nbsp;This is how we can have idempotent requests and caching: if a request is conditional upon a state, has no effect when the condition does not hold, and changes the condition when successful, then duplicate requests will fail if any others succeed. &amp;nbsp;An RPC-based system may or may not make this guarantee. &amp;nbsp;Systems implemented over HTTP must follow the guarantees that HTTP defines for its methods, or else use POST for everything, which is an anti-pattern in its own right.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;SOAP and WSDL Don't Guarantee RPC or a Non-RESTful Architecture&lt;/h3&gt;I want to wrap this up with an example. &amp;nbsp;QuickBooks offers an SDK, which allows for scripting the QuickBooks GUI. &amp;nbsp;(Literally so: you cannot interact with company X if the GUI is logged into company Y.) &amp;nbsp;They also provide a QuickBooks Web Connector (QBWC), which connects to a server from the client machine and makes requests. &amp;nbsp;The server (eventually) responds with chunks of qbXML that represent QuickBooks operations, which the web connector passes to QuickBooks on the server's behalf. &amp;nbsp;Finally, QBWC gets another qbXML document from QuickBooks in response, which it returns to the server.&lt;br /&gt;&lt;br /&gt;This server is actually a web service, available over SOAP, using the WSDL that Intuit provides with the SDK. &amp;nbsp;However, HTTP and SOAP are simply building a tunnel for the exchange of qbXML, and the design of those files is RESTful.&lt;br /&gt;&lt;br /&gt;To build an audit log and/or prevent conflicting changes in multi-user mode, QuickBooks can only allow the most-recent state to be updated. 
&amp;nbsp;Therefore, every object in QuickBooks has both a unique identity and state: ListID and EditSequence. &amp;nbsp;Writes MUST provide an EditSequence, and the EditSequence value MUST match; otherwise, a conflict has occurred, and the write is denied.&amp;nbsp; This rule applies to all calls, user-initiated and SDK-generated, and therefore naturally applies to QBWC as well.&lt;br /&gt;&lt;br /&gt;Since the server's only point of contact with QuickBooks is qbXML, and it doesn't support constructing a change over the course of multiple calls, all of the data to complete a request must also be included in a single qbXML request. &amp;nbsp;Once again, this is a signature of REST.&lt;br /&gt;&lt;br /&gt;As far as I can tell, the  stack involves SOAP and WSDL to provide a guide to the interface between the client and server.&amp;nbsp; (Even so, the WSDL indicates certain methods are optional, but QBWC will not proceed if they are missing.&amp;nbsp; I think there's a grumpy comment in our source about that, regarding clientVersion and serverVersion.)&amp;nbsp; In other words, the stack tunnels over SOAP and WSDL, and uses XML as its resource format, because things like WSDL-to-code generators and XSD schema validators exist.&lt;br /&gt;&lt;br /&gt;Using SOAP and WSDL means that Intuit didn't have to create any sort of WSDL equivalent for services exposed directly on HTTP, nor do people implementing their SOAP service have to worry about URL routing.&amp;nbsp; Implementors can put their service at any URL, and that's all that the tunnel needs to work.&amp;nbsp; Using this to build the layer that RESTful qbXML travels on butters the bread on both sides: Intuit gets the benefit of WS-* and XML tooling for setting up the requests, and also the benefits of REST when processing them.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Closing Remarks&lt;/h3&gt;Fielding has also remarked that it's not possible to determine whether an application is using an interface in the REST style by 
scanning its network traffic.&amp;nbsp; If the server provides URLs, but the client ignores them and requests hard-coded URLs that happened to be in the response, then the network traffic is identical, but the decision hasn't been made by actually chasing the hypertext pointers.&lt;br /&gt;&lt;br /&gt;It's important to remember that REST is a &lt;i&gt;style.&amp;nbsp; &lt;/i&gt;It's not about integrating with HTTP.&amp;nbsp; It's not about "not being RPC" or "not being SOAP".&amp;nbsp; If a service were implemented as &lt;a href="http://en.wikipedia.org/wiki/Efficient_XML_Interchange"&gt;EXI&lt;/a&gt; sent over &lt;a href="http://en.wikipedia.org/wiki/Stream_Control_Transmission_Protocol"&gt;SCTP&lt;/a&gt; just to be exotic, the result could still be RESTful.&lt;br /&gt;&lt;br /&gt;I am now of the opinion that similar things befell the terms "object oriented" and "REST", at least if we take Smalltalk as 'defining' OO: some important, useful, and somewhat mind-bending concept went mainstream in the programming world, and the difficult parts were lost in translation.&amp;nbsp; This makes it rather annoying to come along later and learn about it, because it seems silly or half-baked instead of deeply insightful, and I begin to wonder why everyone is so enamored with it.&amp;nbsp; At least until I find the True Meaning.™&lt;br /&gt;&lt;br /&gt;Now get out there and &lt;i&gt;rock.&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-6891050485553565960?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/6891050485553565960/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=6891050485553565960&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/3922219755971684412/posts/default/6891050485553565960'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/6891050485553565960'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/11/rest-and-rpc-not-actually-antonyms.html' title='REST and RPC: Not Actually Antonyms'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-7421950973752883716</id><published>2011-11-18T22:58:00.000-05:00</published><updated>2011-11-18T23:00:41.336-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='value'/><category scheme='http://www.blogger.com/atom/ns#' term='tradeoffs'/><category scheme='http://www.blogger.com/atom/ns#' term='prediction'/><category scheme='http://www.blogger.com/atom/ns#' term='apple'/><title type='text'>Smart TV and Split Attention</title><content type='html'>I don't want to waste much effort on trivialities, but regarding Gruber's &lt;a href="http://daringfireball.net/2011/10/apps_are_the_new_channels"&gt;semi-recent post&lt;/a&gt;:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Imagine watching a baseball game on a TV where ESPN is a smart app, not a  dumb channel. When you’re watching a game, you could tell the TV to  show you the career statistics for the current batter. You could ask the  HBO app which other movies this actress has been in. 
Point is: it’d be better for both viewers and the networks if a TV “channel” were an interactive app rather than a mere single stream of video.&lt;/blockquote&gt;This is not actually a universal good for viewers.&amp;nbsp; They'll probably like it and want it, but if there's one thing I have learned about myself, it is simply:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: Georgia,&amp;quot;Times New Roman&amp;quot;,serif; font-size: large; text-align: center;"&gt;&lt;b&gt;Splitting my attention between things means I don't remember &lt;i&gt;either&lt;/i&gt; thing.&lt;/b&gt;&lt;/div&gt;&lt;br /&gt;Worse, the things that a smart channel offers me in Gruber's vision—the things actually &lt;i&gt;related to &lt;/i&gt;the show I'm watching—are useless trivialities.&amp;nbsp; If I had smart TV and lacked the discipline to avoid these side quests, then I wouldn't gain &lt;i&gt;anything &lt;/i&gt;out of my screen time.&amp;nbsp; I'd forget the answers to the fleeting distractions, and also not be able to remember what I was watching in the first place.&lt;br /&gt;&lt;br /&gt;I can say all this because I already know what the price of distraction is.&amp;nbsp; I refuse to pick up my iPod while watching things, no matter how interesting it seems at the time, because I'd rather focus on &lt;i&gt;the show or movie.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;What makes me happy?&amp;nbsp; It's not the Internet; it's not TV; it's not apps; it won't be all three of them rolled together into smart TV.&amp;nbsp; However, a smart TV done well will still be a success in the market.&amp;nbsp; We'll find out sooner or later whether Apple did it well.&amp;nbsp; It's almost certain that they'll try.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-7421950973752883716?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://sapphirepaw.blogspot.com/feeds/7421950973752883716/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=7421950973752883716&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7421950973752883716'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7421950973752883716'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/11/smart-tv-and-split-attention.html' title='Smart TV and Split Attention'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-3356684169626795511</id><published>2011-11-16T18:25:00.001-05:00</published><updated>2011-11-16T19:50:09.291-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lisp'/><category scheme='http://www.blogger.com/atom/ns#' term='pointers'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='languages'/><title type='text'>Programming Languages to Learn</title><content type='html'>Many languages these days are fairly &lt;a href="http://www.paulgraham.com/diff.html"&gt;Lispy&lt;/a&gt;, except for being &lt;a href="http://c2.com/cgi/wiki?HomoiconicLanguages"&gt;homoiconic&lt;/a&gt; and thus having a full-strength macro system instead of C's token pasting or many other languages' 
&lt;i&gt;nothing.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;But which ones are absolutely vital to learn, and which ones are "just different languages"?&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;&lt;h2&gt;Ruby&lt;/h2&gt;&lt;a href="http://www.ruby-lang.org/"&gt;Ruby&lt;/a&gt; is an incredibly thoughtful language, and using it, you can't help but think about the decisions Matz made, and the ones you're making.&amp;nbsp; Coming to Ruby after having seen Perl, the inspiration is clear, but Ruby has been &lt;i&gt;distilled.&lt;/i&gt;&amp;nbsp; You can probably make Perl do everything Ruby does, but in Perl you have to work for it.&amp;nbsp; In Ruby, it's frictionless.&lt;br /&gt;&lt;br /&gt;Ruby also has regular expression syntax in the language.&amp;nbsp; This power will corrupt you, and you'll wish for it every time you need to run regular expressions in any other language (except Perl, which also has it).&amp;nbsp; But it's also worth it.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Clojure&lt;/h2&gt;It's just as thoughtful of a language as Ruby, but it's also on the JVM, and its Java integration is so good that it's a better Java than the Java language itself.&lt;br /&gt;&lt;br /&gt;It's also a Lisp that &lt;a href="http://steve-yegge.blogspot.com/2006/04/lisp-is-not-acceptable-lisp.html"&gt;isn't Common Lisp or Scheme&lt;/a&gt;.&amp;nbsp; This frees it from a lot of historical decisions, which makes the language much more focused on being a &lt;i&gt;modern &lt;/i&gt;functional language.&amp;nbsp; It keeps all the benefits of Lisp, but also brings in immutability and leverages that for its concurrency models.&amp;nbsp; (Because you don't always want the same kind of concurrency.)&lt;br /&gt;&lt;br /&gt;In short, &lt;a href="http://www.clojure.org/"&gt;Clojure&lt;/a&gt; &lt;i&gt;should &lt;/i&gt;expand your mind just as much as any other Lisp, and then some, all without the archaic names and relatively poor suitability for modern work.&amp;nbsp; You don't need a 
well-stocked &lt;a href="http://common-lisp.net/project/asdf/"&gt;asdf&lt;/a&gt; to have a basic, portable environment set up.&amp;nbsp; Since Clojure is, at the moment, just one canonical implementation, you don't even have to worry that hard about portability, unless you want to write for ClojureScript.&amp;nbsp; (But for learning Clojure itself, you probably don't.)&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;C&lt;/h2&gt;I debated putting this one on the list, but I think it's still vital simply for the fact that it offers raw, unfettered pointers.&amp;nbsp; Once you grasp pointers vs. values, you have the foundation to understand pointers in any other language.&amp;nbsp; Though they won't be called pointers, since those got a bad rap for being "hard" because C crashes when you make a mistake.&amp;nbsp; Which you do a lot, when you're learning.&lt;br /&gt;&lt;br /&gt;C style syntax also forms the basis of several other languages, and it's still a sort of &lt;i&gt;lingua franca,&lt;/i&gt; though that role has been moving to Java.&amp;nbsp; (Which tells you something: Java is the &lt;i&gt;least &lt;/i&gt;mind-bending language around, if it's believed to be easiest to understand for the most people.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Beyond the Gate&lt;/h2&gt;There are probably other &lt;a href="http://prog21.dadgum.com/38.html"&gt;interesting&lt;/a&gt; languages to learn, but I would have a hard time calling any of them out as &lt;i&gt;vital.&lt;/i&gt;&amp;nbsp; Clojure and Ruby pack a lot of attention to design in their languages, and that shows through in everyday programming.&amp;nbsp; At the same time, though, I feel like there's something deep that I'm missing about the &lt;a href="http://www.complang.tuwien.ac.at/forth/gforth/Docs-html/index.html#Top"&gt;Forth&lt;/a&gt; &lt;a href="http://concatenative.org/wiki/view/Front%20Page"&gt;family&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Therefore, I don't think this post can be the definitive list for all time; it's 
incomplete because I haven't learned every language, and it's not a static list because languages and the practice of programming itself will change over time.&amp;nbsp; But for now, today, this is my list.&lt;br /&gt;&lt;br /&gt;Go forth and program.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-3356684169626795511?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/3356684169626795511/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=3356684169626795511&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/3356684169626795511'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/3356684169626795511'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/11/programming-languages-to-learn.html' title='Programming Languages to Learn'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-6016894098529232643</id><published>2011-11-15T22:00:00.000-05:00</published><updated>2011-11-16T10:11:40.764-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='explanation'/><category scheme='http://www.blogger.com/atom/ns#' term='cloudfront'/><category scheme='http://www.blogger.com/atom/ns#' term='s3'/><category 
scheme='http://www.blogger.com/atom/ns#' term='aws'/><title type='text'>Private Streaming with CloudFront: A Guide</title><content type='html'>I'll just assume you're aware of the &lt;acronym title="Infrastructure as a Service"&gt;IaaS&lt;/acronym&gt; offering known as &lt;a href="http://aws.amazon.com/"&gt;Amazon Web Services&lt;/a&gt;, AWS.&amp;nbsp; CloudFront is a &lt;acronym title="Content Delivery Network"&gt;CDN&lt;/acronym&gt; in the AWS micropayments-as-you-go style, which offers the ability to serve non-public content stored in S3.&amp;nbsp; This is a compendium of the things I learned setting up a &lt;i&gt;private streaming&lt;/i&gt; distribution for use with PHP.&lt;br /&gt;&lt;br /&gt;This is going to be fairly low-level, since I like to drink deeply of the systems I'm working with.&amp;nbsp; I don't think AWS works smoothly enough yet that you can put the API on the "&lt;a href="http://sites.google.com/site/steveyegge2/practical-magic"&gt;it's magic&lt;/a&gt;" side of the line. 
&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;&lt;h2&gt;Don't touch that yet: Making S3 CloudFront friendly&lt;/h2&gt;CloudFront will request content from S3 using the virtual-hosting style, your-bucket-name.s3.amazonaws.com, so you can only create a CloudFront distribution &lt;b&gt;for a bucket that has a "DNS safe" name.&lt;/b&gt;&amp;nbsp; For me, that meant going back and &lt;a href="http://sapphirepaw.blogspot.com/2011/11/renaming-buckets-on-s3.html"&gt;renaming the bucket&lt;/a&gt; since I didn't want to upload another 1 GB of files.&lt;br /&gt;&lt;br /&gt;If you're going to serve your content with TLS (née SSL) enabled, &lt;b&gt;over https, you need an SSL-safe name: one without dots.&lt;/b&gt;&amp;nbsp; Amazon has a wildcard certificate for &lt;tt&gt;*.s3.amazonaws.com&lt;/tt&gt; which will not match &lt;tt&gt;bucket.example.com.s3.amazonaws.com&lt;/tt&gt;, since the * in a certificate only matches one level of name.&amp;nbsp; Replacing the dots with dashes, &lt;tt&gt;bucket-example-com&lt;/tt&gt; would work fine.&lt;br /&gt;&lt;br /&gt;Once you have a suitable bucket name ready for CloudFront, you can proceed with creating your distribution.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Creating or Configuring a Distribution&lt;/h2&gt;The CloudFront web console (the AWS management console's CloudFront tab) is particularly &lt;b&gt;lacking in support for private distributions at the moment.&lt;/b&gt;&amp;nbsp; If I were doing things again, I'd probably skip creating the distribution in the console, and hack up a script to do everything instead.&amp;nbsp; The lack of support means that you end up having to call the API somehow, regardless.&lt;br /&gt;&lt;br /&gt;At any rate, whether you're creating the distribution from whole cloth in a script or re-configuring one created via the console, there are three important settings to engage, along with one dependency.&amp;nbsp; That dependency is to create an origin access identity (OAI).&amp;nbsp; Then, you can 
associate the OAI to the account and to the origin, and set yourself as trusted signer.&amp;nbsp; At this point, the distribution is fully private, and you can grant access to the OAI to your S3 objects, and move on to signing your URLs.&lt;br /&gt;&lt;br /&gt;I'm working in PHP as usual, so I grabbed the AWS SDK for PHP via Amazon's pear channel: &lt;b style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;pear channel-discover pear.amazonwebservices.com ; pear install pear.amazonwebservices.com/sdk&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This lets you &lt;b style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;include('AWSSDKforPHP/sdk.php');&lt;/b&gt; in your PHP script, and start using calls &lt;a href="http://docs.amazonwebservices.com/AWSSDKforPHP/latest/"&gt;as documented here&lt;/a&gt;.&amp;nbsp; The first thing we need to call is &lt;a href="http://docs.amazonwebservices.com/AWSSDKforPHP/latest/#m=AmazonCloudFront/create_oai"&gt;create_oai&lt;/a&gt; which will be one of our dependencies.&amp;nbsp; &lt;b&gt;Save yourself some time, and print every response you get.&lt;/b&gt;&amp;nbsp; In particular, you'll need the Id and S3CanonicalUser from the create_oai response later.&amp;nbsp; Otherwise, you'll have to call get_oai_list (which returns &lt;i&gt;array &lt;/i&gt;rather than CFResponse) and get_oai to find what you missed.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Creating a Distribution from Scratch (Recommended)&lt;/h3&gt;If you have created a configuration through &lt;a href="http://docs.amazonwebservices.com/AWSSDKforPHP/latest/index.html#m=AmazonCloudFront/generate_config_xml"&gt;generate_config_xml&lt;/a&gt;, or a distribution through &lt;a href="http://docs.amazonwebservices.com/AWSSDKforPHP/latest/index.html#m=AmazonCloudFront/create_distribution"&gt;create_distribution&lt;/a&gt; which internally calls generate_config_xml, then S3Origin will properly receive the OriginAccessIdentity child, and you can skip the following 
section. &amp;nbsp;All you need to do is pass&amp;nbsp;&lt;b style="font-family: 'Courier New', Courier, monospace;"&gt;'OriginAccessIdentity' =&amp;gt; $oai_id, 'TrustedSigners' =&amp;gt; array('Self'), 'Streaming' =&amp;gt; true&lt;/b&gt;&amp;nbsp;in the options, and everything will work.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Updating the Config of an Existing Distribution&lt;/h3&gt;If you've created the distribution already, you can get its ID&amp;nbsp;by looking at its properties in the console, or through &lt;a href="http://docs.amazonwebservices.com/AWSSDKforPHP/latest/index.html#m=AmazonCloudFront/list_distributions"&gt;list_distributions&lt;/a&gt; with the Streaming option set.&amp;nbsp; Next, call &lt;a href="http://docs.amazonwebservices.com/AWSSDKforPHP/latest/#m=AmazonCloudFront/get_distribution_config"&gt;get_distribution_config&lt;/a&gt; with the Streaming option to fetch the distribution's configuration.&amp;nbsp; You can pass this response straight to &lt;a href="http://docs.amazonwebservices.com/AWSSDKforPHP/latest/#m=AmazonCloudFront/update_config_xml"&gt;update_config_xml&lt;/a&gt;; again, with the Streaming option set, but also &lt;b style="font-family: 'Courier New', Courier, monospace;"&gt;'OriginAccessIdentity' =&amp;gt; $oai_id,  'TrustedSigners' =&amp;gt; array('Self')&lt;/b&gt;.  
Now here's the part that tripped me up for almost three days: &lt;b&gt;the XML that comes back does not include an OriginAccessIdentity element as a child of S3Origin.&amp;nbsp; This seems to apply to 1.4.7 as well as the 1.4.6 version of the PHP SDK I was using.&lt;/b&gt;&amp;nbsp; The distribution is private (will require signed URLs) because the OriginAccessIdentity is present in the distribution config, but you need to add the OriginAccessIdentity under S3Origin by hand.&lt;br /&gt;&lt;br /&gt;For me, it was something like this:&lt;br /&gt;&lt;br /&gt;&lt;b style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$sxml = simplexml_load_string($new_cfg);&lt;br /&gt;$sxml-&amp;gt;registerXPathNamespace("cf",&lt;/b&gt;&lt;br /&gt;&lt;b style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; "http://cloudfront.amazonaws.com/doc/2010-11-01/");&lt;br /&gt;foreach ($sxml-&amp;gt;xpath('//cf:S3Origin') as $node) {&lt;/b&gt;&lt;br /&gt;&lt;b style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; $node-&amp;gt;addChild('OriginAccessIdentity',&lt;/b&gt;&lt;br /&gt;&lt;b style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "origin-access-identity/cloudfront/$oai_id");&lt;br /&gt;}&lt;br /&gt;$new_cfg = $sxml-&amp;gt;asXML();&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The call to registerXPathNamespace tells SimpleXML that we want to access items in Amazon's namespace with a prefix of "cf", then we use that in the XPath expression to find the S3Origin node, to which we add our OAI.&amp;nbsp; Without this, CloudFront won't try to actually use the OAI to access S3 content, and will therefore be limited to public files.&lt;br /&gt;&lt;br /&gt;At this point, you have something suitable for &lt;a 
href="http://docs.amazonwebservices.com/AWSSDKforPHP/latest/#m=AmazonCloudFront/set_distribution_config"&gt;set_distribution_config&lt;/a&gt;, with the Streaming option, of course.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Letting CloudFront Access S3&lt;/h2&gt;I didn't want to set every video as accessible to the CloudFront OAI individually, so I set up a &lt;i&gt;bucket policy&lt;/i&gt;&amp;nbsp;instead. &amp;nbsp;Bucket policies let you grant/revoke permissions for all objects in the bucket at once, without editing each object's ACL.&lt;br /&gt;&lt;br /&gt;However, they are not terribly easy to write. &amp;nbsp;I started from the example in the &lt;a href="http://docs.amazonwebservices.com/AmazonS3/latest/dev/"&gt;S3 documentation&lt;/a&gt;,&amp;nbsp;under Access Control → Using Bucket Policies → Example Cases.... The CanonicalUser to use can be found via get_oai's S3CanonicalUser element, if you didn't record it after calling create_oai.&lt;br /&gt;&lt;br /&gt;After you substitute your S3CanonicalUser and bucket name into the appropriate spots in the policy, set it as the actual bucket policy. &amp;nbsp;To do this at the console: click the bucket, choose Actions → Properties from the button above, and look for the "Edit bucket policy" button on the Permissions tab.  Paste and save your policy, and you should be set.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Signing URLs&lt;/h2&gt;CloudFront authenticates requests differently from other AWS services, so you need to create an RSA keypair for it in your AWS account.&amp;nbsp; From the console, click your name way up in the upper right, above the tabs, and choose Security Credentials.&amp;nbsp; Log in again, and scroll down a bit, and you'll see a tab for Key Pairs.&amp;nbsp; This is where you generate your signing key. &amp;nbsp;You'll need to save the private key and keep it on your server for creating the signatures.&lt;br /&gt;&lt;br /&gt;If you search the web hard enough, you can find some URL signing code from RyanP@AWS. 
&amp;nbsp;I played with this for a while, but for streaming distributions it gives you a certainly-wrong URL, and I never did test whether it actually worked once I understood how to sign URLs. &amp;nbsp;(I suspect it doesn't.) &amp;nbsp;&lt;b&gt;The URL signer that was actually helpful&lt;/b&gt; was the one that is included in the &lt;a href="http://docs.amazonwebservices.com/AmazonCloudFront/latest/DeveloperGuide/"&gt;CloudFront Developer's Guide&lt;/a&gt;&amp;nbsp;under "Using a Signed URL.... → Signature Code, Examples, and Tools → Create ... using PHP".&lt;br /&gt;&lt;br /&gt;Whatever you're using to sign your URLs with, &lt;b&gt;test it with the private key and example policies&lt;/b&gt; from the developer's guide to make sure you get the same output. &amp;nbsp;I had to go back and undo some of my "improvements" to my signing script. &amp;nbsp;For one, the canned policy is not &lt;i&gt;necessarily&amp;nbsp;&lt;/i&gt;the JSON which would be output by a json encoder: slashes in the resource URL are not backslash-escaped. &amp;nbsp;Secondly, I assembled some parameters through http_build_query, which turned the '~' characters in the Signature field into %7E, and CloudFront didn't care for that either.&lt;br /&gt;&lt;br /&gt;Since we're streaming, using RTMP, there are two URLs involved in the process. &amp;nbsp;There's a streamer at syourhostname.cloudfront.net/cfx/st which is the "streamer" in jwPlayer parlance, or "netConnectionURL" in flowplayer. &amp;nbsp;This is where Flash connects to access the stream; it is like a server in its own right, that takes another "file" or "URL" (again, in jwPlayer and flowplayer terms, respectively) to determine what content should be streamed over that connection.&lt;br /&gt;&lt;br /&gt;The streamer &lt;b&gt;does not participate&lt;/b&gt; in the URL signing process at all. &amp;nbsp;It is only the content to be streamed that is signed. 
&amp;nbsp;Therefore, the URL that actually gets signed is &lt;b&gt;the full path of your file, with the extension, within your S3 bucket.&lt;/b&gt; &amp;nbsp;If you have video-example-com as your bucket, then &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;video-example-com.s3.amazonaws.com/usa/managing-payments.flv&lt;/span&gt; will use &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;usa/managing-payments.flv&lt;/span&gt;&amp;nbsp;as the &lt;i&gt;resource&lt;/i&gt;&amp;nbsp;to sign.&lt;br /&gt;&lt;br /&gt;The resulting signature then gets attached to the stream name &lt;b&gt;as Flash prefers it,&lt;/b&gt; &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;flv:usa/managing-payments?Expires=...&lt;/span&gt;. &amp;nbsp;jwPlayer seems to work with the URL as-is, when invoking it through its jwplayer JavaScript function, but flowplayer needs the signed URL to be URL-encoded, except for the colon. &amp;nbsp;That is, my final URL-building for flowplayer takes the shape: &lt;b style="font-family: 'Courier New', Courier, monospace;"&gt;$file_url = 'flv:' . urlencode("folder/file?$sign_params");&lt;/b&gt;&amp;nbsp; I did not need any more encoding of slashes or spaces in the folder/file names.&lt;br /&gt;&lt;br /&gt;I can't really say more about jwPlayer here, because once I got the OriginAccessIdentity associated to the S3Origin and could  access my files in either player, I went back to using flowplayer.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Troubleshooting&lt;/h2&gt;If you've just made changes, wait a bit and try again. 
&amp;nbsp;S3 ACLs don't seem to apply instantly, so bucket policies probably don't either, and Amazon notes somewhere that it can take up to 15 minutes for a CloudFront distribution to be updated after an edit is made.&lt;br /&gt;&lt;br /&gt;If you're still having problems after waiting it out,&amp;nbsp;&lt;b&gt;turn on access logging&lt;/b&gt;&amp;nbsp;for both CloudFront and the S3 bucket. &amp;nbsp;That's what ultimately directed me toward my problem, that CloudFront was not using its OAI to access the files in S3.&lt;br /&gt;&lt;br /&gt;Access logging of an S3 bucket uses the "prefix" parameter as-is: if you set it to 's3-videos-' you'll get files in the root of the access logging bucket, with names beginning 's3-videos-'. &amp;nbsp;The CloudFront console's Edit Distribution screen &lt;i&gt;seems&lt;/i&gt; to suggest the same, but it actually uses it &lt;b&gt;as a folder name.&lt;/b&gt; &amp;nbsp;I gave it a prefix of 'cf-' and all my CloudFront logs ended up in 'cf-/'.&lt;br /&gt;&lt;br /&gt;The access log files aren't available instantly, but in time, many very short files will show up. &amp;nbsp;This is where a tool like s3sync or Cloudberry Explorer comes in handy, to fetch them so you can see them without a three-click download process between each one.&amp;nbsp; CloudFront's logs are gzip-compressed, but S3's are just text.&lt;br /&gt;&lt;br /&gt;Incidentally, you &lt;b&gt;can&lt;/b&gt; edit a distribution in CloudFront without affecting its private status. 
&amp;nbsp;The console only changes the properties that it supports.&lt;br /&gt;&lt;br /&gt;I can't really endorse any particular app for accessing or managing CloudFront distributions, because of the ones that understood streaming distributions, they all seemed blissfully unaware of my misconfiguration.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Conclusion&lt;/h2&gt;To compress all of the above into an "If I were doing this again from scratch" set of steps:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Create a DNS- and possibly SSL-safe S3 bucket.&lt;/li&gt;&lt;li&gt;Upload your videos into it.&lt;/li&gt;&lt;li&gt;Create a CloudFront Origin Access Identity, &lt;a href="http://docs.amazonwebservices.com/AWSSDKforPHP/latest/#m=AmazonCloudFront/create_oai"&gt;through the API&lt;/a&gt;. &amp;nbsp;Record the OAI's Id and S3CanonicalUser values.&lt;/li&gt;&lt;li&gt;Create a CloudFront Distribution &lt;a href="http://docs.amazonwebservices.com/AWSSDKforPHP/latest/#m=AmazonCloudFront/create_distribution"&gt;through the API&lt;/a&gt;, using the OAI Id from the previous step, TrustedSigners=[Self], and Streaming=true to make it a private streaming distribution.&lt;/li&gt;&lt;li&gt;Create a &lt;a href="http://docs.amazonwebservices.com/AmazonS3/latest/dev/"&gt;bucket policy&lt;/a&gt; on the S3 bucket, using the S3CanonicalUser from the OAI, granting it GetObject on all items in the bucket.&lt;/li&gt;&lt;li&gt;Create an RSA keypair in the Security Credentials meta-console. &amp;nbsp;(For lack of a better term.)&lt;/li&gt;&lt;li&gt;Save the private key to your server.&lt;/li&gt;&lt;li&gt;Fetch the example code from the &lt;a href="http://docs.amazonwebservices.com/AmazonCloudFront/latest/DeveloperGuide/"&gt;CloudFront Developer's Guide&lt;/a&gt; to sign your URLs.&lt;/li&gt;&lt;li&gt;Use the signing code to sign URLs for "path/file.flv".&lt;/li&gt;&lt;li&gt;Use "flv:path/file?Expires=...&amp;amp;Signature=...&amp;amp;Key-Pair-Id=..." 
as the file URL in your player.&lt;/li&gt;&lt;li&gt;Use the http or https://syourhostname.cloudfront.net/cfx/st URL as the streamer URL in your player.&lt;/li&gt;&lt;/ol&gt;At this point, everything should be working. &amp;nbsp;If not, Amazon allows themselves 15 minutes for changes to CloudFront to propagate, so take a little break and try again.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-6016894098529232643?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/6016894098529232643/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=6016894098529232643&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/6016894098529232643'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/6016894098529232643'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/11/private-streaming-with-cloudfront-guide.html' title='Private Streaming with CloudFront: A Guide'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-6479659353094942378</id><published>2011-11-05T18:55:00.001-04:00</published><updated>2011-11-05T18:55:29.636-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tip'/><category 
scheme='http://www.blogger.com/atom/ns#' term='hash'/><category scheme='http://www.blogger.com/atom/ns#' term='php'/><title type='text'>PHP's hash(): how tiger192,3 and tiger192,4 differ</title><content type='html'>PHP lists a handful of hash algorithms for &lt;a href="http://www.cs.technion.ac.il/%7Ebiham/Reports/Tiger/"&gt;Tiger&lt;/a&gt;:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;tiger128,3 tiger160,3&amp;nbsp;tiger192,3&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;tiger128,4&amp;nbsp;tiger160,4&amp;nbsp;tiger192,4&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;What's the difference? &amp;nbsp;Which one is standard? &amp;nbsp;Is one harder to break? &amp;nbsp;Why don't &lt;i&gt;any&lt;/i&gt; of the outputs match &lt;i&gt;either&lt;/i&gt; of Wikipedia's examples?&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The first number—128, 160, or 192—is easy. &amp;nbsp;It's &lt;b&gt;the number of bits of output,&lt;/b&gt; so the 128/160/192 variants will return 16/20/24 bytes of output. &amp;nbsp;By default, it's in ASCII format, so the actual strings are 32/40/48 characters. &amp;nbsp;The shorter versions are simply the first bytes of the longer output, so any of the tiger___,3 functions will output a hash starting "24f0130c63ac93321616" when hashing the empty string.&lt;br /&gt;&lt;br /&gt;I had to dip into the &lt;a href="http://svn.php.net/viewvc/php/php-src/branches/PHP_5_3_8/ext/hash/hash_tiger.c?view=markup"&gt;source code&lt;/a&gt;&amp;nbsp;to understand the second number. &amp;nbsp;It is &lt;b&gt;the number of &lt;/b&gt;&lt;i&gt;&lt;b&gt;passes&lt;/b&gt; &lt;/i&gt;that PHP performs when computing the hash. 
&amp;nbsp;The compression function (the compress macro) is set up to perform either 3 or 4 passes, with a key schedule step between each pass.&amp;nbsp; The code is slightly tricky because it stores the distinction between 3 and 4 passes in 1 bit of the hash context, as the number of &lt;i&gt;additional&lt;/i&gt; passes to perform.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The algorithms ending in ",3" are the standard Tiger algorithm;&lt;/b&gt;&amp;nbsp;their output for the empty string matches the empty string's hash in the &lt;a href="http://www.cs.technion.ac.il/%7Ebiham/Reports/Tiger/testresults.html"&gt;testtiger&lt;/a&gt; code from the Tiger website. &amp;nbsp;But the output is not in MSB-first order; see the note about Wikipedia below.&amp;nbsp; (The bug has been filed with PHP.)&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The algorithms ending in ",4" are &lt;i&gt;non-standard,&lt;/i&gt;&lt;/b&gt; and should run roughly 33% slower because there's 33% more work being done. &amp;nbsp;As I understand it, this extends Tiger to 32 rounds (from 24), which should make it significantly harder to break. &amp;nbsp;&lt;b&gt;Disclaimer: I'm &lt;i&gt;not&lt;/i&gt; a crypto expert,&lt;/b&gt; so you should consult with experts if you want accurate, up-to-date information on the security of your specific application. 
&amp;nbsp;Ideally, you'd be using a well-vetted library that already made a secure choice.&lt;br /&gt;&lt;br /&gt;For the last question, &lt;b&gt;Wikipedia prints the output MSB-first, but testtiger and PHP don't.&lt;/b&gt;&amp;nbsp; The testtiger code prints each 64-bit chunk in byte-swapped order: so&amp;nbsp;3293...f024 in the output is printed as&amp;nbsp;24f0...9332.&amp;nbsp; Therefore, although PHP's 3-pass algorithm is standard Tiger, its hashes won't match any software that chose the MSB-first representation instead.&amp;nbsp; (For instance, the &lt;a href="http://search.cpan.org/%7Eclintdw/Digest-Tiger-0.03/Tiger.pm#NOTE"&gt;Digest::Tiger NOTE&lt;/a&gt;&amp;nbsp;for the Perl implementation explains that it returns hashes MSB-first.)&amp;nbsp; In that case, one of them needs to be reordered before the match is attempted.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-6479659353094942378?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/6479659353094942378/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=6479659353094942378&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/6479659353094942378'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/6479659353094942378'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/11/phps-hash-how-tiger1923-and-tiger1924.html' title='PHP&apos;s hash(): how tiger192,3 and tiger192,4 differ'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-1491849378308355099</id><published>2011-11-01T21:36:00.000-04:00</published><updated>2011-11-01T21:38:05.689-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tip'/><category scheme='http://www.blogger.com/atom/ns#' term='s3'/><category scheme='http://www.blogger.com/atom/ns#' term='aws'/><title type='text'>Renaming buckets on S3</title><content type='html'>A technical note, since the search engines of the internet don't seem to have noticed: &lt;b&gt;Amazon's S3 management console lets you cut and paste files now&lt;/b&gt; (including whole folders).&amp;nbsp; So the process to "rename" an S3 bucket is simply:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Create the new bucket with the desired name.&lt;/li&gt;&lt;li&gt;Go to the old bucket and select all files: click the first and then shift+click the last.&lt;/li&gt;&lt;li&gt;Above the file listing, in the button row, is one marked "Actions", which opens a menu that includes "Cut" and "Copy".&amp;nbsp; Pick one.&lt;/li&gt;&lt;li&gt;Go to the new bucket, click Actions, and Paste your files.&lt;/li&gt;&lt;/ol&gt;Done.&amp;nbsp; No 3rd-party software required.&lt;br /&gt;&lt;br /&gt;Why would anyone want to rename a bucket?&amp;nbsp; In our case, we created a StudlyCapsStyle bucket, which can't be used with CloudFront's dns-compatible-style.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: x-small;"&gt;In double-checking this post for accuracy, I noticed that Cut/Copy are available on the right-click menu for a &lt;i&gt;single &lt;/i&gt;selection, but not the multi-select.&amp;nbsp; Weird.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/3922219755971684412-1491849378308355099?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/1491849378308355099/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=1491849378308355099&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1491849378308355099'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1491849378308355099'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/11/renaming-buckets-on-s3.html' title='Renaming buckets on S3'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-1852914993863749428</id><published>2011-10-25T20:59:00.000-04:00</published><updated>2011-10-25T21:00:55.148-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='data'/><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><category scheme='http://www.blogger.com/atom/ns#' term='tip'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><category scheme='http://www.blogger.com/atom/ns#' term='character sets'/><title type='text'>Character Sets: Get PHP, Perl, MySQL, and Unicode to Play Together</title><content type='html'>This is an extended remix of &lt;a 
href="http://sapphirepaw.blogspot.com/2011/10/character-sets-encodings-mysql-and-your.html"&gt;my recent post&lt;/a&gt; on the subject, only less of a rambling story and more focused.&amp;nbsp; Again, I'll start with some background definitions.&lt;br /&gt;&lt;br /&gt;I'll also &lt;b&gt;assume that you're going to make everything UTF-8,&lt;/b&gt; because as a US-centric American who has the luxury of using English, that's what makes the most sense for my systems.&amp;nbsp; However, if you understand everything I wrote, it should not be difficult to make everything UTF-16 or any other encoding you desire.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;&lt;h3&gt;Terminology and History&lt;/h3&gt;First, &lt;b&gt;character sets.&lt;/b&gt;&amp;nbsp; A character set defines a set of characters and numbers for representing them.&amp;nbsp; For instance, ASCII defines 0-127, including the Latin alphabet necessary for US English.&amp;nbsp; So, "A" is 65 and "a" is 97 in ASCII.&amp;nbsp; After this was defined, the 8-bit byte was standardized and computers went overseas.&amp;nbsp; ASCII wasn't suitable, so various locales took advantage of the 8th bit to add 128 more characters for their languages.&amp;nbsp; The most common of these in the Western world are the &lt;a href="http://en.wikipedia.org/wiki/ISO/IEC_8859"&gt;ISO-8859&lt;/a&gt; family, which gave us latin-1, latin-7, and so forth.&amp;nbsp; These sorts of character sets are known as "8-bit" because they define 8 bits worth of space.&lt;br /&gt;&lt;br /&gt;Meanwhile in Asia, they had &lt;b&gt;character encodings&lt;/b&gt; like &lt;a href="http://en.wikipedia.org/wiki/Big5"&gt;Big5&lt;/a&gt; that let them use far more than 256 codes for their scripts that had far more than 256 symbols, and also embed ASCII for quoting English text.&amp;nbsp; However, Big5 did not fully specify what the embedded character set was.&lt;br /&gt;&lt;br /&gt;Looking at this mess, &lt;b&gt;Unicode&lt;/b&gt; was 
born, to define a single character set, once and for all.&amp;nbsp; Ideally, everything would be Unicode everywhere, and any user could see any text and be relieved of figuring out whether the document was (for example) CP1251 or KOI8-R.&amp;nbsp; The first encoding of Unicode, UCS-2, used 16 bits but was otherwise largely identical to other systems in that the character codes (what Unicode called &lt;b&gt;code points&lt;/b&gt;) and output bytes were identical.&amp;nbsp; UCS-2 proved to be too small, yet was considered bloated by US and Western European users who had mostly-ASCII text, since UCS-2 took twice the space for most of their characters.&lt;br /&gt;&lt;br /&gt;UTF-8 solved this problem by dividing bytes into segments that contained information about the byte stream (such as "Part 1 of 3" or "Subsequent part") and "character data".&amp;nbsp; With a bit of cleverness I'll skip here, ASCII remains as-is, and higher characters are represented with a multi-byte sequence.&amp;nbsp; Thus, 0xceb1 is the specific UTF-8 &lt;i&gt;encoding &lt;/i&gt;of the character 0x03b1 (greek small letter alpha).&lt;br /&gt;&lt;br /&gt;In the US, we're lazy and still call UTF-8 a character &lt;i&gt;set.&lt;/i&gt;&amp;nbsp; This is just a historical artifact of us not using Big5 or some other encoding that can have different character sets embedded in it, and it doesn't end up being &lt;i&gt;too &lt;/i&gt;confusing since UTF-8 can contain only one character set, Unicode.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Symptoms of Character Encoding Problems&lt;/h3&gt;There are two obvious symptoms when trying to output UTF-8.&amp;nbsp; First is the double encoding problem, where you see things like "â€™" where there should be only one special character.&amp;nbsp; This typically indicates that UTF-8 encoded data was interpreted as if it were an 8-bit encoding, and converted to UTF-8.&amp;nbsp; Each UTF-8 byte has become a whole UTF-8 character.&lt;br /&gt;&lt;br /&gt;The opposite symptom is a 
text littered with �, the "replacement character", instead of your special characters.&amp;nbsp; This usually means some 8-bit encoding is actually being interpreted as if it were UTF-8, and the bytes not following the rules of the latter encoding are replaced with the Dreaded Diamond Question Mark (or possibly Dreaded Box for some Windows machines).&lt;br /&gt;&lt;br /&gt;A more subtle symptom is when everything appears to work, but the labels are not entirely accurate: a Web page declares its character set to be UTF-8, but the database is labeled as "latin1".&amp;nbsp; Yet if the script that serves the Web page tells the database to send results in UTF-8 mode, then suddenly the Web page appears double-encoded (our first symptom from above).&amp;nbsp; In this case, the database actually contains UTF-8 data, but the connection and column (or table or database/schema) are both labeled as latin1; thus &lt;i&gt;no &lt;/i&gt;conversion is happening and sending the result in a UTF-8 page allows the client to display it correctly.&lt;br /&gt;&lt;br /&gt;Another symptom I have seen in the wild is that strings in UTF-8 containing only some particular 8-bit encoding's characters (such as Latin-1) get littered with �, while other strings that happen to contain any characters outside that range get printed in correct UTF-8—including any characters that break in other strings!&amp;nbsp; This one is Perl with a non-UTF-8 output encoding.&amp;nbsp; If you have warnings on, and check the appropriate error log, you may find "Wide char in print" warnings generated for the strings that seemed to print correctly.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Fixing Things&lt;/h3&gt;As mentioned in my previous post, to fix the database mislabeling with MySQL, I could dump the data from mysql and restore it.&amp;nbsp; I created two dumps: one with &lt;tt&gt;--no-data&lt;/tt&gt; that I piped through &lt;tt&gt;sed -e s/latin1/utf8/g&lt;/tt&gt; and one with &lt;tt&gt;--no-create-info --no-create-db 
--default-character-set=latin1&lt;/tt&gt; in which I changed the &lt;tt&gt;SET NAMES&lt;/tt&gt; line to &lt;tt&gt;utf8&lt;/tt&gt; using my favorite editor.&amp;nbsp; This gave me a database structure labeled UTF-8, which I then populated with UTF-8 data (as it was) also labeled as UTF-8 (because of the new SET NAMES).&lt;br /&gt;&lt;br /&gt;I also took a moment to edit &lt;tt&gt;/etc/my.cnf&lt;/tt&gt; to add a &lt;tt&gt;[client]&lt;/tt&gt; section with &lt;tt&gt;default-character-set=utf8&lt;/tt&gt;.&amp;nbsp; I then set the server's variables to utf8: character_set_X, where X is client, connection, database, results, and server.&lt;br /&gt;&lt;br /&gt;The PHP website was already issuing the appropriate header for UTF-8 output, so it displayed replacement characters until I changed the connection character set.&amp;nbsp; With PHP 5.3.6 and up with PDO, adding &lt;tt&gt;charset=utf8&lt;/tt&gt; to &lt;a href="http://www.php.net/manual/en/ref.pdo-mysql.connection.php"&gt;the DSN&lt;/a&gt; works; otherwise, I simply issued &lt;tt&gt;SET NAMES utf8&lt;/tt&gt; after connecting.&amp;nbsp; If I were using mysqli, I would use &lt;a href="http://www.php.net/manual/en/mysqli.set-charset.php"&gt;set_charset&lt;/a&gt; after connecting, so that the library would also know about the new encoding in use.&lt;br /&gt;&lt;br /&gt;Perl also displayed replacement characters after I added &lt;tt&gt;mysql_enable_utf8 =&amp;gt; 1&lt;/tt&gt; to the DBI connection options.&amp;nbsp; To avoid breaking older programs, Perl considers everything to be in a "machine" encoding unless it's told otherwise.&amp;nbsp; I'm not entirely sure where it comes from; it's certainly not locale, because that was UTF-8.&amp;nbsp; However, when faced with a UTF-8 string being printed on some filehandle that is not UTF-8, Perl tries to convert the string to the filehandle encoding.&amp;nbsp; If it succeeds, it writes that &lt;i&gt;8-bit encoding &lt;/i&gt;out.&amp;nbsp; If it fails, it writes out &lt;i&gt;the UTF-8 
bytes &lt;/i&gt;and generates the "Wide char in print" warning.&lt;br /&gt;&lt;br /&gt;The fix for Perl was to set the filehandle to UTF-8 output with &lt;tt&gt;binmode(STDOUT, ':encoding(UTF-8)');&lt;/tt&gt; before writing non-ASCII data to it.&amp;nbsp; If I had any in my source code, I would also need &lt;tt&gt;use utf8;&lt;/tt&gt; to prevent Perl from double-encoding UTF-8 sequences in the program source (particularly, in strings).&amp;nbsp; And if I were using Perl 5.14, I'd probably also &lt;tt&gt;use feature 'unicode_strings';&lt;/tt&gt; as well, as mentioned in &lt;a href="http://perldoc.perl.org/perlunicode.html#The-%22Unicode-Bug%22"&gt;The "Unicode Bug"&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;I've Earned It, Now&lt;/h3&gt;I need to buy one of the I � Unicode items from &lt;a href="http://www.cafepress.com/nucleartacos/317769"&gt;this shop&lt;/a&gt;.&amp;nbsp; And if you want more perspective on this topic, Jeff Atwood collected a few links in his &lt;a href="http://www.codinghorror.com/blog/2008/03/i-entity-unicode.html"&gt;I {entity} Unicode post&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-1852914993863749428?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/1852914993863749428/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=1852914993863749428&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1852914993863749428'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1852914993863749428'/><link rel='alternate' type='text/html' 
href='http://sapphirepaw.blogspot.com/2011/10/character-sets-get-php-perl-mysql-and.html' title='Character Sets: Get PHP, Perl, MySQL, and Unicode to Play Together'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-7360363610865153054</id><published>2011-10-21T20:52:00.001-04:00</published><updated>2011-10-21T20:57:22.634-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='rest'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='rpc'/><category scheme='http://www.blogger.com/atom/ns#' term='magic'/><title type='text'>The Trouble with REST</title><content type='html'>REST is easy to describe.&amp;nbsp; It goes a little something like this: "You have some representation, and you send (or receive) the whole thing to read it or make changes."&amp;nbsp; People coming from Clojure would understand it as &lt;i&gt;REST sends values.&lt;/i&gt;&amp;nbsp; I can GET an object, receive the value, manipulate it, and PUT the new value.&amp;nbsp; It's so easy because it just uses HTTP!&lt;br /&gt;&lt;br /&gt;Right?&amp;nbsp; Maybe not.&amp;nbsp; If REST is so easy, why is there &lt;a href="http://en.wikipedia.org/wiki/HATEOAS"&gt;HATEOAS&lt;/a&gt;*?&amp;nbsp; Shouldn't that have been obvious?&amp;nbsp; Why do we have arguments about versioning and parameters and formats and headers on Reddit?&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;REST is not easy.&amp;nbsp; It's quick to explain in a limited manner, but the basic explanation leaves out important details, like how one actually goes about designing 
a proper REST API, and what makes an API RESTful to begin with.&lt;br /&gt;&lt;br /&gt;Let's design our first API.&amp;nbsp; With the CRUD operations mapped to HTTP verbs (POST, GET, PUT, DELETE), we start with a URL space for users. /users/21 for a specific user, or I can POST to /users to create a new user.&amp;nbsp; All well and good, but now I want to search for a user, which doesn't have an obvious, direct mapping onto HTTP verbs.&lt;br /&gt;&lt;br /&gt;I can define a special URL, /search/user?name=bob or /users/search/bob.&amp;nbsp; I can define a new verb to SEARCH /users?name=bob.&amp;nbsp; I can (though this heads straight out of REST territory) define a parameter acting as a sub-method: /users?search=bob.&amp;nbsp; But then what if I want to find a particular Bob who is younger than 26 and has a GMail address?&lt;br /&gt;&lt;br /&gt;The "simple" explanation just blew up in my face because there's a gaping hole in the map.&amp;nbsp; Nobody even labels it "Here be Dragons" because we're all optimists.&lt;br /&gt;&lt;br /&gt;The post on Reddit today was focused on versioning.&amp;nbsp; There are those who think a "REST API" should never contain a version embedded in the URL, and apparently I just have to hope that nobody anywhere is running an old, incompatible client when that shiny new version is rolled out on the same URL.&amp;nbsp; If I changed the URL, that's like putting a version &lt;i&gt;in the URL,&lt;/i&gt; which is forbidden.&lt;br /&gt;&lt;br /&gt;This discussion also went off into whether it was better to version by asking for "Accept: application/vnd.my-api.user+v3" or something like that, instead of text/xml.&lt;br /&gt;&lt;br /&gt;Speaking of formats, when an API provides both XML and JSON (and probably &lt;a href="http://sapphirepaw.blogspot.com/2010/08/on-xml-and-data-formats.html"&gt;does one badly&lt;/a&gt;), that invokes more argument around format specification.&amp;nbsp; Accept header?&amp;nbsp; format=json query parameter?&amp;nbsp; 
And of course there's the (old?) Rails way of /users/get/31.json.&amp;nbsp; And hey, that "get" looks like a verb in the URL, which isn't RESTful.&lt;br /&gt;&lt;br /&gt;And then there's the "use http" ethos.&amp;nbsp; If GET /users/superman is invoked with authorization (HTTP auth or session cookie (speaking of which... that's &lt;i&gt;state&lt;/i&gt; and that's &lt;i&gt;bad,&lt;/i&gt; mkay?)) and the user is not allowed to see that particular user, what code do you return... and how is it differentiated from "your session has timed out"?&amp;nbsp; Do you start adding random custom headers like X-Session: timed-out?&lt;br /&gt;&lt;br /&gt;Let me not forget conflicting edits.&amp;nbsp; Most systems solve this by returning some sort of version identifier for an item, then requiring that identifier to be sent back on save.&amp;nbsp; If another client has sent in an edit, then the identifier doesn't match what's stored on the server, and the edit is refused.&amp;nbsp; Isn't RPC roundly criticized for relying on server state?&amp;nbsp; Isn't the version tag inherently shared state?&amp;nbsp; Sure, REST may call for sending the entire state around in every request/response, necessary or not, but I really don't see a difference between sending an EditSequence to QBWC over SOAP, which is as RPC as you can get (until the world decided "let's use all document-literals all the time"), and sending a version tag back to bugzilla so it can detect mid-air collisions.&amp;nbsp; Well, at least we have HTTP 409!&lt;br /&gt;&lt;br /&gt;&lt;b&gt;If this is all supposed to be so simple, why do we have all these arguments?&lt;/b&gt;&amp;nbsp; And if Roy Fielding knows everything about REST, maybe he should drop by reddit and enlighten us once and for all.&lt;br /&gt;&lt;br /&gt;Actually, for five minutes.&amp;nbsp; This &lt;i&gt;is &lt;/i&gt;the Internet.&lt;br /&gt;&lt;br /&gt;My answer to the question I boldly asked is that this stuff &lt;i&gt;isn't &lt;/i&gt;easy or 
clear-cut.&amp;nbsp; Programming is about wrestling the dragon on the path of implementation, and &lt;i&gt;all &lt;/i&gt;of the paths have a dragon of some sort.&amp;nbsp; Pretty much anything can be "made to work," which is why most seasoned programmers hate everyone else's code—working code is not necessarily easy to understand, and nothing interesting is universally easy (since &lt;a href="http://www.infoq.com/presentations/Simple-Made-Easy"&gt;easy is subjective&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;I think I must be reading the wrong part of the Internet again, because I'm wishing for less talk of "what makes something RESTful", and more focus on real-world API design considerations.&amp;nbsp; Especially the disadvantages of each approach.&amp;nbsp; Nobody wants to talk weakness; it gets left to be rediscovered over and over by those that walk the same road.&lt;br /&gt;&lt;br /&gt;I think, in essence, the trouble with &lt;i&gt;X&lt;/i&gt; is that nobody talks about the disadvantages of &lt;i&gt;X.&lt;/i&gt;&amp;nbsp; Today, REST just happens to be filling in for X.**&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;* BTW, when did Service Oriented Architecture start meaning WS-* and J2EE?&amp;nbsp; I guess I'd better start calling it API Oriented.&lt;br /&gt;&lt;br /&gt;** There's no conclusion here; this is a rant.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-7360363610865153054?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/7360363610865153054/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=7360363610865153054&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7360363610865153054'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7360363610865153054'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/10/trouble-with-rest.html' title='The Trouble with REST'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-7365106896728960298</id><published>2011-10-19T15:36:00.000-04:00</published><updated>2011-10-19T15:40:26.247-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='admin'/><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><category scheme='http://www.blogger.com/atom/ns#' term='tip'/><title type='text'>Notes on using mysqlbinlog for copying updates</title><content type='html'>I commented &lt;a href="http://geehwan.posterous.com/moving-a-production-mysql-database-to-amazon"&gt;on this post&lt;/a&gt;, but for posterity:&lt;br /&gt;&lt;blockquote&gt;It seems by sheer luck  that I stumbled over a way to take care of everything.  I save a copy of  the interpreted binlog as it files through the pipe:&lt;br /&gt;&lt;br /&gt;mysqlbinlog ... | tee binlog-play.sql | mysql ...&lt;br /&gt;&lt;br /&gt;Then  if I get an error message, mysql will tell me e.g. "Error ... at line  42100". Running "vim +42100 binlog-play.sql" lets me inspect the stream  to see what went wrong in detail.&lt;br /&gt;&lt;br /&gt;Inside binlog-play.sql, the "#at  112294949" comments can be used in e.g. 
"--start-position=112294949" to  the next mysqlbinlog command, to retry the statement after I fix the  problem.  (Alternatively end_pos seems to tell the position of the next  command, if I need to skip the one which failed, e.g. I was testing out  CREATE FUNCTION and it was logged as "CREATE DEFINER=... FUNCTION" which  RDS refuses.)&lt;br /&gt;&lt;br /&gt;The final piece of the puzzle is that executing  "FLUSH LOGS;" or "mysqladmin flush-logs" will push mysqld on to the next  binlog file, so you can safely play out the one you want.  Once you've  finished processing a file through mysqlbinlog, you can just remember  the file boundary, and flush mysql's logs if you want to process the one  it's presently writing to.&lt;/blockquote&gt;This is in regards to piping mysqlbinlog output from one mysql server into the mysql client to execute on another; the post I linked above discusses doing so for switching to Amazon RDS. &amp;nbsp;The basic strategy is to minimize downtime by loading a database dump from the source on the destination, then use mysqlbinlog on the source and the mysql client to feed updates from the source to the destination. &amp;nbsp;The updates can be faster to load than a new dump; and when it's time to switch servers, it's a matter of stopping database clients, turning off the source mysqld, sending the final binlog updates, pointing the clients to the destination server, and turning the clients back on. 
&amp;nbsp;As opposed to waiting for a whole dump to load while the clients are off.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-7365106896728960298?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/7365106896728960298/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=7365106896728960298&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7365106896728960298'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7365106896728960298'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/10/notes-on-using-mysqlbinlog-for-copying.html' title='Notes on using mysqlbinlog for copying updates'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-14365001899959205</id><published>2011-10-18T19:38:00.000-04:00</published><updated>2011-10-19T15:37:49.585-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='explanation'/><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><category scheme='http://www.blogger.com/atom/ns#' term='unicode'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><category scheme='http://www.blogger.com/atom/ns#' term='character sets'/><title 
type='text'>Character Sets, Encodings, MySQL, and your data</title><content type='html'>I'm currently moving data from a (relatively old now) MySQL 5.0 server into Amazon &lt;acronym title="Relational Database Service"&gt;RDS&lt;/acronym&gt;.&amp;nbsp; I've been here before, when I was moving data from MySQL 4.x into 5.0 and mangling character sets.&amp;nbsp; This time, I want to make 100% sure everything comes across with maximum fidelity, and also get the character encoding as stored to be labeled correctly in MySQL.&lt;br /&gt;&lt;br /&gt;First, a quick definition or two:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Character Set: a specific table to translate between characters and numbers. &amp;nbsp;Example: ASCII defines characters for numbers 0-127; "A" is 65.&amp;nbsp; This can also be described as "a set of characters, and their corresponding representation inside the computer."&lt;/li&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Character_encoding"&gt;Character Encoding&lt;/a&gt;: a means of "packing" numbers from the character set into a container. &amp;nbsp;Example: UTF-8. &amp;nbsp;The Unicode character 0x2013 becomes 0xE2,80,93. The "E" signifies "Part 1 of 3", and part of each remaining byte simply indicates "Continued"; the 0x2013 is then divided up to fit in the parts of the bytes that aren't indicating their "Part 1" or "Continued" status.&amp;nbsp; In the specific case of UTF-8, the encoding is designed so that the ASCII range 0-127 (0x00-7F) is encoded without change: a leading 0-7 means "Part 1 of 1".&lt;/li&gt;&lt;li&gt;8-bit character encoding: In older, simpler days, character sets defined only as many characters as could fit in 8 bits, and defined the encoding as simply the numbers. 
&amp;nbsp;Character number 181 would encode as a byte (8 bits) with value 181.&lt;/li&gt;&lt;li&gt;A character encoding implies the associated character set, because the encoding defines how numbers in its character set become individual bytes.&amp;nbsp; How characters in other sets would be encoded is left undefined and basically impossible.&lt;/li&gt;&lt;/ul&gt;This last point is why MySQL lets you set "character sets" to UTF-8, though the latter is an encoding.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;Now, I'm staring at a database where the actual bits that make up the data are encoded in &lt;a href="http://en.wikipedia.org/wiki/UTF-8"&gt;UTF-8&lt;/a&gt;, but MySQL thinks they are in &lt;a href="http://en.wikipedia.org/wiki/ISO/IEC_8859-1"&gt;Latin-1&lt;/a&gt;. &amp;nbsp;Whenever anything connects to the database, the character set is always Latin-1, so MySQL does no translation.&amp;nbsp; Nor does the code serving our web page, preferring instead to send it as-is to the user.&amp;nbsp; In the end, the HTML page is labeled as UTF-8, so the untranslated UTF-8 bytes display correctly in the browser.&lt;br /&gt;&lt;br /&gt;The problem is, every database backup that I have taken might be secretly broken: mysqldump is Unicode-aware, and left to its own devices (no --default-character-set given), it sets the connection into UTF-8 mode and reads out the data. &amp;nbsp;This means the bits that end up in the file are double-encoded: when coming across a left-curly-quote, the server says, "Hmm, the client is asking for UTF-8 and I have Latin-1. &amp;nbsp;I'll need to translate these 0xE2,80,9C bytes here into UTF-8 and send it that way."&lt;br /&gt;&lt;br /&gt;What comes out on disk?&amp;nbsp; I checked a hex dump, and it's 0xC3,A2,E2,82,AC,C5,93.&amp;nbsp; Viewed as UTF-8, that byte stream is A-circumflex, Euro, and lowercase OE ligature. 
&amp;nbsp;This is clearly impossible, since Latin-1 predated the creation of the Euro, and it's &lt;a href="http://en.wikipedia.org/wiki/ISO/IEC_8859-15"&gt;Latin-9 (ISO 8859-15)&lt;/a&gt; that contains it; it's &lt;b&gt;double&lt;/b&gt; impossible because the Euro character in Latin-9 is 0xA4, which has nothing to do with any of the bytes we were trying to encode.&lt;br /&gt;&lt;br /&gt;However, in &lt;a href="http://en.wikipedia.org/wiki/Windows-1252"&gt;Windows-1252&lt;/a&gt;, the 0x80 byte is the Euro symbol, which can only mean: &lt;b&gt;when MySQL says "latin1", it is actually interpreting the data as Windows-1252.&lt;/b&gt;&amp;nbsp; But it's not so bad, because the entire transformation is completely reversible (as long as latin1 continues to mean the same thing)... but what about the right-curly-quote, which ends in 0x9D, whose value is not defined in Windows-1252? &amp;nbsp;The "9D" value is just assumed to map to Unicode character 9D, so where the left-curly-quote's output ended in 0xC5,93, the right-curly-quote's ends in 0xC2,9D.&lt;br /&gt;&lt;br /&gt;What character is that, anyway?&amp;nbsp; It's OSC, the Operating System Command, part of the C1 control set, which introduces a control string.&amp;nbsp; &lt;i&gt;Headdesk!&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I think this still works okay for me, because the exact reverse happens going back into Windows-1252.&amp;nbsp; At least, on the same server version and whatever Unicode libraries it's using.&amp;nbsp; Nobody tries to interpret the OSC anyway, but it still looks odd to find a "?" 
showing where there actually is a valid UTF-8 sequence in the source.&lt;br /&gt;&lt;br /&gt;Now that I knew all this, I could get a database dump out in an unconverted format, by including &lt;b&gt;--default-character-set=latin1&lt;/b&gt; on the&amp;nbsp; mysqldump command line.&amp;nbsp; The only problem was, this would give me a file which was exactly like the database: it contained UTF-8 data labeled as Latin-1.&amp;nbsp; Loading it into a MySQL server would recreate the same situation as before.&lt;br /&gt;&lt;br /&gt;My solution, found nowhere on the Internet, and which would probably make a lot of purists scream at me, was to take two dumps: one with &lt;b&gt;--no-data&lt;/b&gt; that I piped through &lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;sed -e s/latin1/utf8/g&lt;/span&gt; to change the labeling, and one with &lt;b&gt;--no-create-info --no-create-db&lt;/b&gt; in which I edited the &lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;SET NAMES latin1&lt;/span&gt; by hand (to utf8, of course), to avoid editing the "latin1" in actual data when I was talking about the problem in the bug database.&lt;br /&gt;&lt;br /&gt;Then I did some additional sleuthing and fixed the non-binary non-UTF-8 data which was lying around in the dump, and it all imported into RDS fine.&lt;br /&gt;&lt;br /&gt;As a possible point of interest, it took about 10 minutes and 11 seconds to load the 351 MB data-only dump into a small DB instance.&amp;nbsp; (Work has a symmetrical 8 Mbps connection, so it wasn't necessarily network-bound.)&amp;nbsp; It did take a noticeable amount of time to load the 1 MB of schema, but unfortunately I didn't expect that to be difficult, so I don't have timing for it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-14365001899959205?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' 
type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/14365001899959205/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=14365001899959205&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/14365001899959205'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/14365001899959205'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/10/character-sets-encodings-mysql-and-your.html' title='Character Sets, Encodings, MySQL, and your data'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-3303276770207215284</id><published>2011-10-11T09:30:00.001-04:00</published><updated>2011-10-25T21:00:06.666-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='future'/><category scheme='http://www.blogger.com/atom/ns#' term='meta'/><title type='text'>iPad vs. 
Tablet PC</title><content type='html'>One of them succumbed to &lt;a href="http://headrush.typepad.com/creating_passionate_users/2006/01/death_by_riskav.html"&gt;death by risk-aversion&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;One of them couldn't let go of the tether and fly.&lt;br /&gt;&lt;br /&gt;I think Linus said the same of svn: paraphrased, "If you're trying to make 'a better CVS' then you have already lost, because CVS is too broken to fix."&lt;br /&gt;&lt;br /&gt;Hey, sapphirepaw: make sure what you do is good on its own, not "an X only &lt;a href="http://www.dartlang.org/"&gt;different&lt;/a&gt;".&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-3303276770207215284?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/3303276770207215284/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=3303276770207215284&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/3303276770207215284'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/3303276770207215284'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/10/ipad-vs-tablet-pc.html' title='iPad vs. 
Tablet PC'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-3368606970703334500</id><published>2011-10-08T10:12:00.002-04:00</published><updated>2011-10-08T10:13:18.506-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='future'/><category scheme='http://www.blogger.com/atom/ns#' term='meta'/><category scheme='http://www.blogger.com/atom/ns#' term='apple'/><title type='text'>Steve Jobs</title><content type='html'>I'm getting old: if I were to pass on at the same age Jobs did, my life would be more than half over already.&lt;br /&gt;&lt;br /&gt;What separates me from Jobs? &amp;nbsp;There's the matter of leverage, where he could take his vision and coordinate the prototyping and development of it, into the iPod, the iPhone, the Macbook Air, the iPad. &amp;nbsp;There's also the matter of &lt;i&gt;having vision.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;In 2006 or so, I beheld my first iPod in real life, an old (FireWire based) model with a physical click-wheel. &amp;nbsp;In 2008 I picked up a different, small MP3 player and for the first time, &lt;i&gt;immediately noticed &lt;/i&gt;the limitations of digital control. &amp;nbsp;Without having handled the iPod and getting a feel for the analog response of the wheel, I probably wouldn't have given the buttons a second thought. &amp;nbsp;Do you want to scroll on the generic? &amp;nbsp;Click-click-click-click. 
&amp;nbsp;Or click-and-hold, guess at how long you need to go (since the screen is slow enough to be unreadable at this scrolling speed, and they don't slow updates to compensate), and release.&lt;br /&gt;&lt;br /&gt;The point here is, Jobs saw humans as inherently analog, and adapted all of his machines to analog control. &amp;nbsp;It's a simple thing, but Jobs was apparently &lt;i&gt;devoted&lt;/i&gt; to HCI. &amp;nbsp;The "vision" simply falls out of that.&lt;br /&gt;&lt;br /&gt;It's not like the limitations of digital control weren't apparent in the 1980s. &amp;nbsp;Compare Rad Racer to a real car's steering wheel. &amp;nbsp;Anyone focused on "how it feels" could have been Jobs back then, inventing 2010 in the 16-bit era instead of carrying 8-bit paradigms through the 1990s. &lt;br /&gt;&lt;br /&gt;In contrast, I seem to lack vision because I'm busy implementing arbitrarily complex business rules at work, and staying away from the bleeding edge of gadgetry. &amp;nbsp;I'm not in the consumer space; I'm not taking any research toward the consumer space; and I'm not thinking about what's next for it, either (at least, not beyond what turns out to actually be the next thing*.)&amp;nbsp; But, I'm also having little impact on the wider world, writing code that never leaves the house.&amp;nbsp; It's &lt;i&gt;important, &lt;/i&gt;but after I am gone, will these be the best years of my life?&amp;nbsp; Will &lt;i&gt;I&lt;/i&gt; think college was the best time of my life, forever?&lt;br /&gt;&lt;br /&gt;I think it's time to put my free time to better use and &lt;i&gt;do something&lt;/i&gt; instead of watching the world slowly develop towards Jobs' vision on its own.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;* I have a dead draft  which discusses the crazy idea of "having a set-top box inside the remote" in 2006 or so.&amp;nbsp; It then points out that h.264-over-wifi ought to handle the bandwidth to do exactly that from your iPhone now.&amp;nbsp; It starts fleshing out what 
would be necessary to make it happen, then abruptly ends with a note: "Two days after I started writing this, Apple announced AirPlay."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-3368606970703334500?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/3368606970703334500/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=3368606970703334500&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/3368606970703334500'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/3368606970703334500'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/10/steve-jobs.html' title='Steve Jobs'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-2399319367837593055</id><published>2011-10-06T12:39:00.000-04:00</published><updated>2011-10-06T14:19:42.306-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='admin'/><category scheme='http://www.blogger.com/atom/ns#' term='recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='ec2'/><title type='text'>Setting Everything on Fire</title><content type='html'>I created a new user, gave them wheel group, and in case I needed another admin user, added %wheel to sudoers 
through visudo. &amp;nbsp;Then, I was trying to do more stuff, and...&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;[sudo] password for ec2-user: _&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Wait. &amp;nbsp;What? &amp;nbsp;Not only does ec2-user &lt;i&gt;have no password,&lt;/i&gt; but I didn't change its NOPASSWD line in sudoers.&lt;br /&gt;&lt;br /&gt;It turns out that ec2-user is also in group wheel, and when confronted with the two permission sets, sudo did what I didn't mean: applied the %wheel rule and started requiring passwords for ec2-user. &amp;nbsp;Of course su was no help either: root likewise has no password set, because you have sudo as ec2-user....&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;Thus began the adventure. &amp;nbsp;I hacked around a little bit, but setgroups(2) is a privileged call, and thus can't be used to drop supplemental groups if you're not root. &amp;nbsp;My hope was that I could use it to shed the wheel group to make sudo skip the problematic %wheel rule, but the reality was not to be.&lt;br /&gt;&lt;br /&gt;I did some &lt;a href="http://www.vlent.nl/weblog/2010/09/06/locked-myself-out-root-account-ec2-ubuntu-instance/"&gt;reading&lt;/a&gt; on how to recover this the hard way, and worked my way through it. &amp;nbsp;Apparently &lt;b&gt;by a massive amount of concentrated luck, I attached the volume with the busted sudoers &lt;i&gt;after &lt;/i&gt;booting the new instance&lt;/b&gt; (a second one, in the matching availability zone: the volume in 1b didn't offer the choice of attaching to the instance in 1a), and fixed sudoers.&lt;br /&gt;&lt;br /&gt;Next, I stopped the new instance and reattached the EBS volume to my old one.&lt;br /&gt;&lt;br /&gt;Then the old instance wouldn't boot. &amp;nbsp;The AWS management console claimed it was running, but it was dead to the world. &amp;nbsp;The system log was empty (a handful of spaces). 
&amp;nbsp;So I detached its volume, put it back on the new instance, started it up, and things got really confusing: the system log from the old volume had events from the new system on it. &amp;nbsp;Eventually, I sorted this out: someone (AWS or the kernel) sees a disk labeled "/" and boots off of it, or mounts root there; since "LABEL=/" is listed as the root device in whatever /etc/fstab is in use, and both volumes have that label, it's possible that the kernel just picks one to resolve the conflict.&lt;br /&gt;&lt;br /&gt;So apparently the old instance wouldn't boot, yet the new instance &lt;i&gt;would&lt;/i&gt; boot with the old instance's disk at &lt;b&gt;sdf1&lt;/b&gt; mounted as root, though its boot device was &lt;b&gt;sda1&lt;/b&gt;; even after some hackery, when sda1's fstab pointed specifically at sda1, the new instance ended up with the old volume visible as the root filesystem. &amp;nbsp;Confusing as this was, it was at least &lt;i&gt;after &lt;/i&gt;I had fixed sudoers, so I could poke around the system as root, since the old volume's sudoers file was in effect by this point.&lt;br /&gt;&lt;br /&gt;I never did solve the mystery of why the old instance wouldn't boot. 
&amp;nbsp;The new instance booted with its new volume detached and the old volume mounted at sda1, so I ended up deleting the new volume and old instance and calling it good enough.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-2399319367837593055?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/2399319367837593055/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=2399319367837593055&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2399319367837593055'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2399319367837593055'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/10/setting-everything-on-fire.html' title='Setting Everything on Fire'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-5442179753898933447</id><published>2011-09-09T20:47:00.001-04:00</published><updated>2011-12-16T23:05:34.492-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='war story'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><category scheme='http://www.blogger.com/atom/ns#' term='php'/><category 
scheme='http://www.blogger.com/atom/ns#' term='history'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><title type='text'>War Story: The Training Jump</title><content type='html'>One of the first things I did at my current job was to rewrite a Perl/CGI (the module, and the actual cgi-script execution model) site into PHP.&amp;nbsp; Part of this site implemented a single-signon (SSO) system for a partner site that hosts our training videos.&amp;nbsp; Clicking the link led to the innocuously-named "training_jump.pl" CGI script.&lt;br /&gt;&lt;br /&gt;The goal in life of the training_jump is to redirect a user to the partner site, passing along the username and email address.&amp;nbsp; The partner site creates the user if necessary, starts a session for them on its server, and ultimately displays the actual training content.&lt;br /&gt;&lt;br /&gt;Inside training_jump is an innocuous-looking "use OAuth::Lite;" line.&amp;nbsp; I didn't know what OAuth was at the time, so of course I went and looked it up: OAuth is designed to let a site like ExampleMashup authenticate someone as "twitter user chris" without needing to ask chris directly for their twitter password.&amp;nbsp; Of course, this makes no sense, because in our case, &lt;i&gt;we possess the account&lt;/i&gt;, not our partner.&amp;nbsp; Likewise, once the login is complete, the user should &lt;i&gt;end at our partner's site&lt;/i&gt; rather than our own.&amp;nbsp; We have nothing to use the oauth token for, because we don't perform any operation at the partner site aside from the login.&lt;br /&gt;&lt;br /&gt;Yet here inside training_jump was OAuth.&amp;nbsp; The user hit training_jump; we redirected to the partner by IP address (!) 
with the OAuth request token, all the necessary user data, a callback URL (training_jump again); they duly redirected; we collected the response token and redirected the user back to the partner with that token as the parameter.&amp;nbsp; The end result is still kind of fragile, in that AFAIK, it only works in the first browser you sign up with.&amp;nbsp; If you log in with Firefox, then try it in Chrome, the latter gives you an error somewhere along the line instead of videos.&lt;br /&gt;&lt;br /&gt;IIRC, research at the time indicated that there was no good PHP OAuth library, and/or the suitable libraries didn't implement the exact flavor of OAuth API that was being used by the Perl code.&amp;nbsp; I'm absolutely certain I considered replacing the Perl entirely, but I don't remember why I rejected PHP OAuth as a solution.&lt;br /&gt;&lt;br /&gt;I couldn't simply continue using training_jump as-is, because the CGI module and PHP store their session data in different locations, in different formats.&amp;nbsp; The username in the PHP session wouldn't be accessible to pass through the authentication dance, and it was clearly inadvisable to modify training_jump to accept a username as a URL parameter.&lt;br /&gt;&lt;br /&gt;Nowadays, training_jump has been succeeded by the cleverly named training_jump2, which actually reads request variables on stdin and produces an answer on stdout.&amp;nbsp; (The format of this text is much like LiveJournal's ancient API, from back when I had a LiveJournal client.&amp;nbsp; There was no convenient interchange format, as the Perl code didn't have JSON installed at the time and PHP didn't have XML.&amp;nbsp; "Lightweight" eats you again.)&amp;nbsp; The PHP training_jump manages the connection between server environment and training_jump2, and training_jump2 simply had its server environment replaced with communicating over pipes.&lt;br /&gt;&lt;br /&gt;We're in the negotiation phase of moving to the provider's newer platform, which 
has a proper, encrypted SSO system.&amp;nbsp; training_jump2 is slated to become irrelevant, eventually.&amp;nbsp; In the meantime, it's the only bit of Perl CGI that never made the jump to mod_php.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-5442179753898933447?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/5442179753898933447/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=5442179753898933447&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/5442179753898933447'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/5442179753898933447'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/09/war-story-training-jump.html' title='War Story: The Training Jump'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-1791622731336676821</id><published>2011-09-04T10:35:00.000-04:00</published><updated>2011-09-04T18:32:39.056-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='explanation'/><category scheme='http://www.blogger.com/atom/ns#' term='references'/><category scheme='http://www.blogger.com/atom/ns#' term='pointers'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category 
scheme='http://www.blogger.com/atom/ns#' term='php'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><category scheme='http://www.blogger.com/atom/ns#' term='c'/><title type='text'>sapphirepaw's Introduction to Pointers, version 2</title><content type='html'>I programmed in assembly for some time, using pointers without understanding what they were, or that they were called pointers.&amp;nbsp; When I finally got to learning C, the pointer syntax was downright inscrutable, but when I got it, suddenly &lt;i&gt;all of C &lt;b&gt;and &lt;/b&gt;all of assembler &lt;/i&gt;laid clear before me, all at once.&amp;nbsp; It was a beautiful thing.&lt;br /&gt;&lt;br /&gt;I was reminded about this while reading &lt;a href="http://dave.fayr.am/posts/2011-08-19-lets-go-shopping.html"&gt;this post&lt;/a&gt; from HN.&amp;nbsp; It inspired me to try explaining pointers from the opposite direction.&amp;nbsp; Instead of trying to teach pointers via C syntax, let me try to start with pointers outside of programming, then discuss them in relation to C and PHP.&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;&lt;h3&gt;Pointers IRL: use and mention&lt;/h3&gt;A distinction can be made between &lt;i&gt;using &lt;/i&gt;a word and &lt;i&gt;mentioning&lt;/i&gt;&amp;nbsp;it. &amp;nbsp;Consider the sentence, "The password is incorrect." &amp;nbsp;If it is saying that the wrong password has been given, then 'incorrect' is being used; if it is saying that the word 'incorrect' is the password, then incorrect is being mentioned. &amp;nbsp;When mentioned, the word is not used for its meaning, but for the word itself.&lt;br /&gt;&lt;br /&gt;In effect, a word is a pointer to its definition. &amp;nbsp;Normally, a word is used, so there is no written convention to signify use. &amp;nbsp;We just write the words. &amp;nbsp;When mentioned, the word is typically quoted or italicized, to indicate the mention and separate it from a mundane use. 
&amp;nbsp;The example sentence would then become, "The password is 'incorrect'."&amp;nbsp; In speech we don't have quotation marks, except for the much-maligned air-quotes, so inflection and timing are used instead: "The password &lt;b&gt;is... &lt;/b&gt;incorrect."&lt;br /&gt;&lt;br /&gt;Steve Yegge has a talk somewhere on branding, and how brands are also pointers. &amp;nbsp;They're like any other word, except that they have to be distinctive names, and the company associates this name to their specific product. &amp;nbsp;Thus, when anyone says 'Dr. Pepper,' there is a single product that comes to mind. &amp;nbsp;Yet there's nothing specific to branding that makes a brand into a pointer, without applying to other words. &amp;nbsp;The only difference is that a brand's word—its pointer—is invented so that the owner can have exclusive control over the definition, instead of the multiple definitions that ordinary words commonly have.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Variables in Memory&lt;/h3&gt;Our next stop on this trip is to examine the structure of the average C program. &amp;nbsp;Everything the program is working on has to be held in memory. &amp;nbsp;This consists of the actual program instructions, an area called the heap for large or long-lived data that the program needs to handle, and a dual-purpose area known as the stack. &amp;nbsp;The stack's primary purpose is for holding return addresses, so that when a function is called, the place to return to is stored on the stack, and is read when the function returns in order to access the next instruction in the caller. &amp;nbsp;The nature of the stack also makes it possible for individual functions to store small, temporary data there.&lt;br /&gt;&lt;br /&gt;In order to catch runaway programs, the stack often has a limited size. &amp;nbsp;It may only be possible to store 8 MB there, while the heap may hold 1,000 MB or more.&lt;br /&gt;&lt;br /&gt;To use the value stored in memory, a variable has an address. 
&amp;nbsp;At some level, every variable is a pointer, pointing to its associated storage. &amp;nbsp;Normal use of a variable, as in "&lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;c = a+b&lt;/span&gt;", reads or modifies the contents of memory at the variable's address, like normal uses of words rely on their definitions.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Function Calls&lt;/h3&gt;What happens when a variable is passed to a function? &amp;nbsp;It's &lt;i&gt;used &lt;/i&gt;in the call, so its contents are retrieved, then given to the function, which reserves its own storage for its parameters, then places the value it received into its respective new address. &amp;nbsp;Any use of the variable within the function uses this new address, and the caller's copy is unaffected because it's stored at a different address.&lt;br /&gt;&lt;br /&gt;If the stack is a small pile of index cards, which can be taken, written on in pencil, erased, and then returned in order, then calling a function takes an index card for each parameter, copies the value from the given parameter onto the card, and gives the card to the function.&lt;br /&gt;&lt;br /&gt;This works fairly well for small data, like numbers, but what if we loaded a copy of &lt;i&gt;War and Peace&lt;/i&gt;&amp;nbsp;into the heap, and wanted to find the length of it? &amp;nbsp;It's long enough to take a lot of time to copy onto a series of index cards, and we may not even have enough cards available to do so. &amp;nbsp;After all, the stack is limited in size. &amp;nbsp;It would be ideal if we could simply tell the function where to find the text, instead of trying to send it a copy of the text itself.&lt;br /&gt;&lt;br /&gt;If we could attach a cord to the text, and the other end to an index card, then the function could follow that cord to get to the text, without us having to copy it anywhere. 
&amp;nbsp;And, it would only use one of our cards, instead of consuming a huge amount of stack space.&lt;br /&gt;&lt;br /&gt;This is essentially what a pointer is. &amp;nbsp;If we pass the &lt;i&gt;address&lt;/i&gt;&amp;nbsp;of the start of the text, then the called function (the callee) can go directly to that address and start reading the text to find its length. &amp;nbsp;In this case, we are now &lt;i&gt;mentioning &lt;/i&gt;where the text is located, instead of &lt;i&gt;using&lt;/i&gt;&amp;nbsp;the whole text itself.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;char *: Your First Pointer&lt;/h3&gt;This is exactly how it works in C. &amp;nbsp;In fact, C is famous for not having a specific "string" type, because strings are treated as arrays of characters, with an array being a continuous series of addresses all holding the same type. &amp;nbsp;A string begins at some starting address, then continues until some address whose content is 0, also known as the '\0' character, which is simply pre-defined to represent the end of a string.&lt;br /&gt;&lt;br /&gt;C's string is represented to the programmer as an array of characters, but pointers and arrays are practically equivalent in C. &amp;nbsp;Thus, &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;char[]&lt;/span&gt; and &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;char *&lt;/span&gt;&amp;nbsp;more or less represent the same type: a variable which holds the address of a character. 
&amp;nbsp;When code operates on a string, it assumes that more characters follow, according to the convention above.&lt;br /&gt;&lt;br /&gt;Therefore, when we call &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;strlen(war_and_peace)&lt;/span&gt;, we give it the address of &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;war_and_peace&lt;/span&gt; rather than the text itself (because as a string, &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;war_and_peace&lt;/span&gt; already is a pointer to the text, the address of the start of the text); and for its part, strlen expects this, and &lt;i&gt;dereferences&lt;/i&gt;&amp;nbsp;the pointer it receives to work with the data. &amp;nbsp;It actively converts the mention into a use.&lt;br /&gt;&lt;br /&gt;Yet, there's no reason that pointers &lt;i&gt;must&lt;/i&gt; point to primitive values, like "&lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;char&lt;/span&gt;" in "&lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;char *&lt;/span&gt;".&amp;nbsp; A pointer's contents can be another pointer. &amp;nbsp;If you have a pointer-to-char, you have an array of char, which is a string; it follows that pointer-to-pointer-to-char is pointer-to-string, which is an array of string. &amp;nbsp;Notably, this is how command-line arguments appear to a program. 
&amp;nbsp;(Although it's still easier to reason about as &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;char*[]&lt;/span&gt; than &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;char**&lt;/span&gt; for me, even though the effect is equivalent.)&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Use and Mention in C&lt;/h3&gt;There are two pointer-related functions of the &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;*&lt;/span&gt; operator in C. &amp;nbsp;In a declaration, it tells how many levels of pointers are needed to reach the final variable contents. &amp;nbsp;Outside of a declaration, which I'll refer to as "as an instruction", it accesses the content of a pointer, yielding the pointed-to type. &amp;nbsp;Both of these may be stacked, as we saw with &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;char**&lt;/span&gt; for declarations.&amp;nbsp; For instructions, a function with a signature of &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;int strlen(char *s)&lt;/span&gt; would access the character at &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;s&lt;/span&gt; with &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;*s&lt;/span&gt;; the first letter of the first string in&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;char **argv&lt;/span&gt; could be accessed with &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;**argv&lt;/span&gt;. 
&amp;nbsp;Likewise, &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;*argv&lt;/span&gt; would yield the entire first string (conventionally the program's name) as a string (since it would move from &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;char**&lt;/span&gt; to &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;char*&lt;/span&gt; as a type), for instance if you wanted to find &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;strlen(*argv)&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;There is one more pointer-related operator, &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;&amp;amp;&lt;/span&gt;. &amp;nbsp;Given a variable, instead of using its value, we can mention it (retrieving its address) with the ampersand. &amp;nbsp;Consider a function defined as &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;int get_config_int(char* name, int* out)&lt;/span&gt;, which returns a success or error code as its return value, and on success writes the value associated with &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;name&lt;/span&gt; into &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;*out&lt;/span&gt;. 
&amp;nbsp;If the caller has some variable defined as &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;int mem_limit&lt;/span&gt;&amp;nbsp;which it would like to use, it can use the &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;&amp;amp;&lt;/span&gt; operator to get the address of &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;mem_limit&lt;/span&gt; to pass to &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;get_config_int&lt;/span&gt;, as in: &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;ok = get_config_int("max_mem", &amp;amp;mem_limit);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Unlike &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;*&lt;/span&gt;, &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;&amp;amp;&lt;/span&gt; can't be repeated. &amp;nbsp;A normal variable has an address; but the address does not have its own place to be stored. &amp;nbsp;It can be stored into some other variable of the appropriate type, and that variable can have its address taken, but this is simply returning the address of the second variable, not the address of an address.&lt;br /&gt;&lt;br /&gt;I've been using the terms already, but to keep things as clear as possible, in a declaration, &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;char **argv&lt;/span&gt; reads literally as "char-pointer-pointer", though I usually swap the order to get "pointer-to-pointer-to-char". &amp;nbsp;In either case, this can be translated to "string array" as well, but I think of that as a translation rather than how it appears in the code. 
&amp;nbsp;In an instruction, &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;*out&lt;/span&gt; might be read as "value-at-out", though I'm used to C enough to just read it as "star-out" and know what it means. &amp;nbsp;Finally, &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;&amp;amp;mem_limit&lt;/span&gt;&amp;nbsp;is conveniently said as "address-of-mem_limit".&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Thinking in Pictures&lt;/h3&gt;It helps, especially when building larger data structures, to draw out the relationship between variables and pointers in a diagram. &amp;nbsp;The usual way is a box-and-arrow diagram, in which the&amp;nbsp;boxes represent storage space. &amp;nbsp;Individual boxes may be named, if they are a variable. &amp;nbsp;The boxes contain either a primitive value or a pointer, the latter of which is represented by an arrow leading to the box it points to.&lt;br /&gt;&lt;br /&gt;Here's an example with a hypothetical &lt;span style="font-family: 'Courier New',Courier,monospace;"&gt;char **argv&lt;/span&gt;, for a program named "hello" run with one argument of "there".&amp;nbsp; I've additionally chosen to list types beneath the boxes to help demonstrate how a star is consumed when a pointer is followed.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-RjZWuLw8p6Y/TlBYAqV0uAI/AAAAAAAAAKA/gYWFLyNoMLE/s1600/08-20-argv-pointers.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="261" src="http://2.bp.blogspot.com/-RjZWuLw8p6Y/TlBYAqV0uAI/AAAAAAAAAKA/gYWFLyNoMLE/s400/08-20-argv-pointers.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Note that the two actual strings, and the intermediate array, don't actually have names: they're accessed using the argv variable 
and some combination of dereferencing (another technical name for following the pointer, converting its mention to a use). &amp;nbsp;Simply using the &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;*&lt;/span&gt; operator to follow the pointer doesn't let us reach "sideways" into the arrays, though—for that, the natural way is to use e.g. &lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;argv[1]&lt;/span&gt; to access the string "there".&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Not Explaining C&lt;/h3&gt;There are a number of details I've forced myself to leave out, since I'm trying to make this more about pointers and less about C. &amp;nbsp;This includes pointer arithmetic, a long discussion of equivalence between pointer arithmetic and array syntax, when and why you use pointers and pointers-to-pointers, and so forth. &amp;nbsp;These things are simply beyond the scope of this article.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Pointers Meet PHP&lt;/h3&gt;In PHP, pointers don't exist as a first-class concept. &amp;nbsp;However, a few things are built on the same ideas, which tends to make the manual complicated when it doesn't want to use the word "pointer" for any of them.&lt;br /&gt;&lt;br /&gt;Let us first consider the case of a humble variable. &amp;nbsp;A PHP variable name is a string pointing to a value. 
&amp;nbsp;In contrast to C, where a variable has a type, it is the value which carries the type in PHP.&amp;nbsp; (This is what lets PHP be dynamically typed: a name may point to different values over time, and each value may be of any type.)&amp;nbsp;  Another difference to C is that PHP always uses values through a name, so it doesn't provide a way to access the address of a value.&amp;nbsp; If you want to use it, you need one of its names.&lt;br /&gt;&lt;br /&gt;References in PHP are simply pointing a second name to the same value.&amp;nbsp; When you write &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$y =&amp;amp; $x&lt;/span&gt;, then you are doing a pointer copy, similar to &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;y = x;&lt;/span&gt; in C when both are of type &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;int*&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Variable-variables, which let you do &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$x=42; $y="x"; echo $$y;&lt;/span&gt; to print 42, are just a chain of ordinary variable lookups.&amp;nbsp; &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$y&lt;/span&gt; is not directly pointing to the value of &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$x&lt;/span&gt;, but only to a string value containing the name "x", which is then immediately used to look for the value that &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$x&lt;/span&gt; is pointing to.&lt;br /&gt;&lt;br /&gt;PHP's manual also has a long explanation regarding &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;unset()&lt;/span&gt;, which becomes fairly simple to describe with pointers: &lt;span style="font-family: &amp;quot;Courier 
New&amp;quot;,Courier,monospace;"&gt;unset()&lt;/span&gt; removes the name and its link to its value, leaving the value unaffected (unless this was the &lt;i&gt;last &lt;/i&gt;name for the value, at which point the value is garbage collected).&amp;nbsp; &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$y =&amp;amp; $x; unset($y);&lt;/span&gt; does not result in observable change to &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$x&lt;/span&gt;.&amp;nbsp; The equivalent C is &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;y = x; y = NULL;&lt;/span&gt; and this also does not affect x.&lt;br /&gt;&lt;br /&gt;The other major caution regarding &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;unset()&lt;/span&gt; is that the &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;global&lt;/span&gt; keyword creates a reference, so that &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;global $x; unset($x);&lt;/span&gt; has no effect, as with any other reference: you delete the current &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$x&lt;/span&gt;, which points to the same value as the global &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$x&lt;/span&gt;, without affecting the pointed-to value.&lt;br /&gt;&lt;br /&gt;PHP variables also act like they're copied by value when used, except for object instances as of PHP 5.0.&amp;nbsp; This works by secretly including another pointer: an object-type value holds not the actual object, but an identifier for the instance, also known as an "object handle".&amp;nbsp; When you use such a variable, PHP (under the hood) notices that it's an object, and fetches the real object based on the identifier in the value.&amp;nbsp; This way, the value can be copied like 
any other, yet still references "the same object" which is nearly always what you want.&lt;br /&gt;&lt;br /&gt;You can generate a new instance by using  &lt;a href="http://us2.php.net/clone"&gt;clone&lt;/a&gt; on the value, which makes PHP actually create a new instance, with a new identifier, and return a new value which contains that new identifier.&amp;nbsp; I think I used this precisely once in my career, back when I was less skilled as a system designer.&lt;br /&gt;&lt;br /&gt;Actually, this value-is-a-pointer pattern is not new with OO: resources work the same way.&amp;nbsp; When you get a file pointer with &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;fopen()&lt;/span&gt;, assigning it does not open the file again.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;But what about assembler?&lt;/h3&gt;It's pretty much irrelevant by now.&amp;nbsp; But just for fun, here's how things looked in AssemPro on our Amiga, running a 68000:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;move.l #42, d0&lt;/span&gt;&amp;nbsp; ; load a constant, aka immediate value, into register d0 &lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;lea intbase, a0&lt;/span&gt;&amp;nbsp; ; load &amp;amp;intbase into a0  (PC-relative)&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;/span&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;move.l #intbase, a6&lt;/span&gt;&amp;nbsp; ; load &amp;amp;intbase as a constant into a6 (not PC-relative)&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;move.l intbase, a6&lt;/span&gt;&amp;nbsp; ; load  *(&amp;amp;intbase) into a6&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;move.l 0(a6), d0&lt;/span&gt;&amp;nbsp; ; load *(a6) into d0: a6 must hold 
some pointer aka address&lt;br /&gt;&lt;br /&gt;The third example loads the value of &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;amp;intbase&lt;/span&gt; at the time of assembly.&amp;nbsp; This may not be the actual address of the &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;intbase&lt;/span&gt; label once the code is loaded to run, which is why relocation was invented.&amp;nbsp; (Though I am entirely clueless about how AmigaOS actually handled relocations.)&lt;br /&gt;&lt;br /&gt;I can't remember now if the fourth example is PC-relative or not.&amp;nbsp; When I was doing this stuff in the 1990s, I didn't yet understand why you would write PC-relative code, and it was "more restrictive", so I usually didn't bother.&amp;nbsp; PC-relative code uses only offsets relative to the current instruction as addresses, so it can be loaded at any position in memory without having to be relocated or to use an offset table.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Feedback Still Encouraged&lt;/h3&gt;This post has been largely rewritten for quality, but suggestions and questions are still welcome in the comments.&amp;nbsp; In spite of my efforts to pare down the excess, there remain many digressions and hints of deeper layers throughout.&amp;nbsp; I simply have too much knowledge to judge what's useful and what's extraneous for the topic of pointers specifically.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-1791622731336676821?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/1791622731336676821/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=1791622731336676821&amp;isPopup=true' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1791622731336676821'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1791622731336676821'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/08/sapphirepaws-introduction-to-pointers.html' title='sapphirepaw&apos;s Introduction to Pointers, version 2'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-RjZWuLw8p6Y/TlBYAqV0uAI/AAAAAAAAAKA/gYWFLyNoMLE/s72-c/08-20-argv-pointers.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-2309930747517364642</id><published>2011-08-08T19:46:00.002-04:00</published><updated>2011-08-08T19:48:04.538-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pdf'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='ui'/><category scheme='http://www.blogger.com/atom/ns#' term='gnome'/><title type='text'>What Would "Better" Be? 
PDF Reader Edition</title><content type='html'>I opened up a PDF in the default Gnome PDF reader last night, and it was once again a terrible experience.&amp;nbsp; It opened with the zoom set to "fit page width", and the scrolling set to continuous.&amp;nbsp; There's no concept of a persistent user preference, or user preferences that override the document preferences.&lt;br /&gt;&lt;br /&gt;Then I got to considering the underlying reasons why I didn't like the default display.&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;I read a wide range of PDFs from a variety of sources.&amp;nbsp; Some have columns, some don't, and my zoom preference varies based on which it is.&amp;nbsp; If there are columns, I typically make the viewer as tall as possible, set the zoom to "Fit Page", and put up with relatively tiny fonts because then scrolling doesn't keep switching direction to go back up the page when switching columns, then down to the next page (which never aligns with a clean PageDown keystroke).&lt;br /&gt;&lt;br /&gt;Without columns, Fit Page Width and continuous scrolling are almost what I want.&amp;nbsp; I really wouldn't mind fitting the &lt;i&gt;content &lt;/i&gt;width instead, like OpenOffice's "optimal" zoom, but the page width is typically close enough in spite of losing 10% to the margins.&amp;nbsp; With optimal zoom, I could have the same text size in a smaller window.&lt;br /&gt;&lt;br /&gt;But underneath it all, it seems like there's another problem with the interface, carried forward from the days of yore: all scrolling is vertical.&amp;nbsp; On the keyboard, there are no PageLeft or PageRight keys.&amp;nbsp; The mouse wheel offers the vertical axis as a nice wheel, but the horizontal one is reduced to tilting that glorious wheel, which thus becomes as ineffective as the left and right arrow keys for large scrolling.&lt;br /&gt;&lt;br /&gt;The vertical bias to scrolling means there's no support for horizontal scrolling.&amp;nbsp; I 
have a beautiful widescreen display, but I can't use the width except to make a single page &lt;i&gt;extremely&lt;/i&gt; wide.&lt;br /&gt;&lt;br /&gt;Ultimately, it would be nice to have a PDF renderer that was aware of structure, so that it could destructure the PDF and present it in a screen-friendly way—that is, without columns—on the screen.&amp;nbsp; Presumably, since I can select from a single column at a time, the text is stored linearly inside the PDF (although it also carries glyphs and kerning info, to get that perfectly exact rendition on any interpreter).&amp;nbsp; I'm also guessing that since the text is selectable, there's not a dumb bitmap coming back from libpoppler.&amp;nbsp; And in the event where a PDF turns to mush when trying to linearize it, the faithful representation would be at hand as a fallback, albeit with reduced user happiness.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-2309930747517364642?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/2309930747517364642/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=2309930747517364642&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2309930747517364642'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2309930747517364642'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/08/what-would-better-be-pdf-reader-edition.html' title='What Would &quot;Better&quot; Be? 
PDF Reader Edition'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-162036320175969068</id><published>2011-07-14T12:29:00.004-04:00</published><updated>2011-07-14T16:35:19.568-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='economics'/><category scheme='http://www.blogger.com/atom/ns#' term='net neutrality'/><category scheme='http://www.blogger.com/atom/ns#' term='future'/><category scheme='http://www.blogger.com/atom/ns#' term='markets'/><title type='text'>The Facets of Net Neutrality</title><content type='html'>In its original conception, "Network Neutrality" as I understood it was about a lack of privilege amongst &lt;b&gt;competing traffic sources:&lt;/b&gt; that Google, Viacom, the atheism reddit, the Anglican Council, and the Time Cube site would all be subject to equal traffic slowdowns in the face of congestion.&amp;nbsp; A bit of thought would suggest that treating &lt;b&gt;individual packets &lt;/b&gt;equally was not, in fact, desirable: you probably don't want your VOIP call and each individual P2P connection to be subject to the same rules, really.&amp;nbsp; You'd rather the call got through even at the expense of delaying a few packets of your (or your neighbor's) download.&lt;br /&gt;&lt;br /&gt;Certain large ISPs have been trying to twist it to mean &lt;b&gt;they can charge on both sides,&lt;/b&gt; for content providers to be allowed to send data to "their" customers, though the customers are already paying (quite profitably for the companies) for their own access.&amp;nbsp; They would be charging everyone for access, so it's "neutral," right?&amp;nbsp; This 
is an anti-neutrality stance trying to co-opt the word so that it sounds like a good thing.&lt;br /&gt;&lt;br /&gt;Pro-neutrality forces (in the first sense) argue that requiring content providers to pay for carriage, or for "premium" speeds, would completely destroy the internet as we know it. &amp;nbsp;Also, many of them believe they are preserving &lt;i&gt;existing &lt;/i&gt;neutrality, but this turns out to be incorrect. &amp;nbsp;A &lt;b&gt;content delivery network&lt;/b&gt; (CDN)&amp;nbsp;essentially is an implementation of pay-for-speed, because the content provider pays for their content to be stored closer to end-users, which reduces load time for those users. &amp;nbsp;Although the end-user's ISP doesn't receive payment directly, the content provider's payment to the CDN also funds the overall system by paying for the CDN's own connectivity at the ends, and infrastructure in the middle.&lt;br /&gt;&lt;br /&gt;I think the value of the Internet is in two things:&amp;nbsp;&lt;b&gt;uniformity of access&lt;/b&gt; for end-users, and &lt;b&gt;fair division of capacity&lt;/b&gt;. &amp;nbsp;Uniformity of access is simply that &lt;i&gt;any &lt;/i&gt;connection should be able to carry packets from &lt;i&gt;any &lt;/i&gt;content&amp;nbsp;provider, so that the view of "the Internet" from any one ISP is the same view as from any other. &amp;nbsp;Otherwise, "the Internet" would cease to have meaning, as it reverted to the days of online services like CompuServe, Prodigy, and AOL.&lt;br /&gt;&lt;br /&gt;Fair division of capacity is exactly what it says on the tin, that speeds and latencies should be balanced among customers of an ISP. 
&amp;nbsp;I shouldn't be able to start a download and prevent Netflix from delivering video to my neighbor, and a bunch of people on 6Mbps connections shouldn't be able to deny service to 1.5Mbps subscribers.&lt;br /&gt;&lt;br /&gt;The real emotional punch that gets brought into neutrality discussions seems to come from the&amp;nbsp;&lt;a href="http://tvtropes.org/pmwiki/pmwiki.php/Main/LeonineContract"&gt;leonine&lt;/a&gt;&amp;nbsp;terms the ISPs would like to apply: around &lt;b&gt;one-tenth&lt;/b&gt; of the current (often secret) usage limits, for as low as six-tenths of the price, as in Time-Warner's experiment last year. &amp;nbsp;Yet the current arrangement is apparently profitable, and growing more so over time: the cost of carriage is falling faster than inflation is diluting revenues. &amp;nbsp;The fear is that ISPs will establish these terms "in order to build out next-generation networks" and then not follow through on that investment, &lt;b&gt;artificially limiting &lt;/b&gt;their service and collecting inflated payments that do nothing but lift the artificial restriction, merely to offer what is on the market today.&lt;br /&gt;&lt;br /&gt;Promises, after all, are cheap.&lt;br /&gt;&lt;br /&gt;This fear is only exacerbated by the incumbent ISPs' wars against municipal broadband. &amp;nbsp;City-owned networks are being opposed in many states as 'unfair' competition. &amp;nbsp;In at least one case, the city in question embarked on its network building course because the ISP claimed they would &lt;i&gt;never &lt;/i&gt;offer higher speed. 
&amp;nbsp;Yet as soon as the city decided to offer higher speed itself if nobody else was going to, the ISP frantically began upgrading their infrastructure, hurrying to complete it before the city's project was finished, so they could argue that the city network was 'unnecessary' due to the ISP offering its (new) high-speed service.&lt;br /&gt;&lt;br /&gt;This fear is further exacerbated by the regular broadband reports showing that countries with more competition amongst ISPs, regardless of&amp;nbsp;&lt;a href="http://en.wikipedia.org/wiki/File:Urban_population_in_2005_world_map.PNG"&gt;urbanization&lt;/a&gt;, have the fastest speeds and highest limits on data transferred, where applicable. &amp;nbsp;If larger companies truly did have more efficiency and more benefit to the customer as they claim, then the average US broadband connection should meet—or exceed—the average connection in Japan. &amp;nbsp;Instead, large companies' performance suggests they are the major &lt;b&gt;impediment&lt;/b&gt; to improved service.&lt;br /&gt;&lt;br /&gt;For the Internet to continue its course of innovation and convenience for the American consumer, protection of uniformity of access and fair division of capacity are sorely needed. &amp;nbsp;Placing these responsibilities into the hands of existing large ISPs who have been actively demonstrating their complete lack of commitment to the principles, or their customers, except when threatened &lt;i&gt;en masse &lt;/i&gt;with an alternative network, is clearly the wrong course of action to ensure the result. 
&amp;nbsp;It is putting the fox with feathers stuck in its teeth in charge of the hen house.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-162036320175969068?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/162036320175969068/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=162036320175969068&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/162036320175969068'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/162036320175969068'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/07/facets-of-net-neutrality.html' title='The Facets of Net Neutrality'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-7467858612171375883</id><published>2011-07-11T18:47:00.001-04:00</published><updated>2011-07-12T13:45:33.540-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tcp'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><title type='text'>TCP: Conflicting Goals</title><content type='html'>David Singleton writes in "&lt;a href="http://blog.davidsingleton.org/mobiletcp"&gt;Why mobile apps suck when you're mobile (TCP over 3G)&lt;/a&gt;":&lt;br /&gt;&lt;blockquote&gt;TCP assumes that 
the connection has a more or less constant RTT and assumes delays are losses due to congestion somewhere on the path from A to B.&lt;/blockquote&gt;This struck a special chord with me, because I had just recently read about TCP algorithms that had been designed to combat "buffer bloat": instead of scaling strictly based on packet loss, they assume increases in latency are due to &lt;i&gt;buffering &lt;/i&gt;on the path.&amp;nbsp; Then they back off to avoid both packet loss and longer latency, which is &lt;b&gt;measured by RTT.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Since 3G attempts to implement reliable delivery itself, TCP-in-3G bears performance characteristics similar to the TCP-in-TCP problem explained in &lt;a href="https://github.com/apenwarr/sshuttle"&gt;Avery Pennarun's sshuttle README&lt;/a&gt;.&amp;nbsp; (sshuttle takes care to extract data from the one TCP connection and copy it to a technically distinct connection, instead of wrapping it, in order to avoid the problem.)&amp;nbsp; And actually, I see that Singleton linked to another source going into more detail, which I skipped reading the first time around.&lt;br /&gt;&lt;br /&gt;So not only is 3G a bad transport for that reason, but the variable RTT its delivery mechanism introduces also sinks TCP algorithms which try to use increased RTT to avoid queueing in buffers.&amp;nbsp; The buffer-avoidance aspect &lt;i&gt;can't distinguish &lt;/i&gt;between "bad" buffers like those in a cheap home router that take huge chunks of data off the Ethernet at 100 Mbps, then dribble it out at 0.6 Mbps to the Internet at large; and "good" buffers like those in the 3G system that are &lt;i&gt;unclogging &lt;/i&gt;the spectrum rather than &lt;i&gt;crowding &lt;/i&gt;other users of the tubes. 
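&lt;br /&gt;&lt;br /&gt;To make the delay-based idea concrete, here is a minimal Python sketch, loosely in the style of delay-based schemes such as TCP Vegas.&amp;nbsp; It is not taken from any real TCP stack: the function name &lt;code&gt;next_window&lt;/code&gt; and the &lt;code&gt;alpha&lt;/code&gt; and &lt;code&gt;beta&lt;/code&gt; thresholds are assumptions made up for illustration.&amp;nbsp; It estimates how many packets are sitting in queues from the gap between the lowest RTT seen and the latest RTT, and backs off when that estimate grows.

```python
def next_window(cwnd, base_rtt, rtt, alpha=2.0, beta=4.0):
    """Return the next congestion window, in packets, from one RTT sample.

    Hypothetical helper for illustration only.
    cwnd     -- current congestion window, in packets
    base_rtt -- lowest RTT observed so far, taken as the unqueued path delay
    rtt      -- most recent RTT sample, in the same units as base_rtt
    alpha    -- assumed threshold: grow while fewer packets than this are queued
    beta     -- assumed threshold: shrink once more packets than this are queued
    """
    # Estimated number of our packets waiting in buffers along the path.
    queued = cwnd * (1.0 - base_rtt / rtt)
    if queued > beta:
        # RTT has inflated: assume buffering and back off, but keep one packet.
        return max(cwnd - 1.0, 1.0)
    if alpha > queued:
        # Little or no queueing detected: probe for more bandwidth.
        return cwnd + 1.0
    return cwnd


# RTT equal to the baseline: the window grows.
print(next_window(20, 0.100, 0.100))   # 21.0
# RTT doubled: the window shrinks, whatever the cause of the delay was.
print(next_window(20, 0.100, 0.200))   # 19.0
```

The second call is the whole problem in miniature: the function sees only an RTT sample, so a 3G link-layer retransmission that doubles the RTT triggers the same back-off as a bloated router buffer would.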
&lt;br /&gt;&lt;br /&gt;Singleton proposes some mitigations for app developers; I'd rather try to "fix" TCP so that it gracefully handles variable RTT.&amp;nbsp; It may violate the perfect conceptual segregation of the OSI Seven Layer Model, but simply having the phone's TCP stack aware of the wireless interface itself would go a long way toward mitigating the problem.&amp;nbsp; Perhaps if the 3G hardware could indicate "link restored" and "backlog cleared", TCP could skip using the RTT of packets received between those events in its congestion avoidance.&lt;br /&gt;&lt;br /&gt;It seems like WiFi would need some mitigations as well.&amp;nbsp; It is particularly prone to alternating between periods of "solid" packet loss, occasionally even destroying the beacon signal and thus kicking everyone off, and periods of fairly reliable reception.&amp;nbsp; However, when you do get reception back, the data pours in without significant degradation in speed, so the underlying issue is a bit different.&amp;nbsp; Even so, a connection always seems to be particularly slow if it has the bad luck of being started during a period of loss.&lt;br /&gt;&lt;br /&gt;In the end, the problems seem to come from allowing endpoints, but not the network, to specify receive windows.&amp;nbsp; TCP views the network as a dumb thing that it can draw conclusions about based on end-to-end behavior.&amp;nbsp; Yet the increasing prevalence of wireless, and of sending TCP over wireless links, seems to indicate that "the network" should be able to add metadata to the packets (probably at the IP level, since the network is conceptually unable to peek inside of IP data) to indicate that the delivery of the packet was delayed for reliability.&amp;nbsp; Unfortunately, rogue devices could set that bit for their buffer-bloated packets, so it's about as practical as the &lt;a href="http://en.wikipedia.org/wiki/Evil_bit"&gt;Evil Bit&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/3922219755971684412-7467858612171375883?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/7467858612171375883/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=7467858612171375883&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7467858612171375883'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7467858612171375883'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/07/tcp-conflicting-goals.html' title='TCP: Conflicting Goals'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-4021517108695322249</id><published>2011-06-05T08:22:00.000-04:00</published><updated>2011-06-05T08:22:03.351-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><title type='text'>Python's sum()</title><content type='html'>In Python, the &lt;code&gt;sum()&lt;/code&gt; builtin gives you the ability to take a list, say &lt;code&gt;[1, 2, 10]&lt;/code&gt; and find the sum of it as if you had written out &lt;code&gt;1 + 2 + 10&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;The &lt;code&gt;+&lt;/code&gt; operator is also defined for lists, where if you write out &lt;code&gt;[1] + [2] + 
[10]&lt;/code&gt; you'll get a list back: &lt;code&gt;[1, 2, 10]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;What happens if we put these two observations together? Can we &lt;code&gt;sum()&lt;/code&gt; a list of lists to get one flattened list?&lt;br /&gt;&lt;pre&gt;Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) &lt;br /&gt;[GCC 4.4.3] on linux2&lt;br /&gt;Type "help", "copyright", "credits" or "license" for more information.&lt;br /&gt;&amp;gt;&amp;gt;&amp;gt; print sum([[1],[2],[10]])&lt;br /&gt;Traceback (most recent call last):&lt;br /&gt;  File "&amp;lt;stdin&amp;gt;", line 1, in &amp;lt;module&amp;gt;&lt;br /&gt;TypeError: unsupported operand type(s) for +: 'int' and 'list'&lt;br /&gt;&amp;gt;&amp;gt;&amp;gt; &lt;br /&gt;&lt;/pre&gt;Nope.  By default, sum() internally starts with "0 + (first element of sequence)", so without a start value you can only pass things that can be added to integers.  Passing an empty list as the optional second argument, as in &lt;code&gt;sum([[1],[2],[10]], [])&lt;/code&gt;, does give back &lt;code&gt;[1, 2, 10]&lt;/code&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-4021517108695322249?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/4021517108695322249/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=4021517108695322249&amp;isPopup=true' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/4021517108695322249'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/4021517108695322249'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/06/pythons-sum.html' title='Python&apos;s sum()'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-1176797284382812066</id><published>2011-06-03T21:02:00.001-04:00</published><updated>2011-06-03T21:04:37.699-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='php'/><category scheme='http://www.blogger.com/atom/ns#' term='compiler'/><title type='text'>The First Step of a Long Journey</title><content type='html'>Over the past couple of weeks, I have assembled a reader in PHP, such that it understands code of the form &lt;code&gt;(print (== (+ 4 4 6) (- 30 15 1)))&lt;/code&gt; and will be able to create PHP source that ultimately prints out "1".&amp;nbsp; It's kind of brokenly stupid in other ways, but it's the bare-bones skeleton of a working compiler.&amp;nbsp; That is something I had never been able to build before this attempt, largely because I wanted to tokenize something superficially like PHP, and I always got bored of defining all the stupid tokens.&amp;nbsp; Going with s-expressions made for only a handful of token types so that I could get on with the interesting bits instead of grinding out pages of &lt;code&gt;/&amp;amp;&amp;amp;|\|\|/&lt;/code&gt; crud.&amp;nbsp; Because almost anything can go in an identifier, I can treat everything as identifiers for now.&lt;br /&gt;&lt;br /&gt;There are a few obvious things it needs next: string types.&amp;nbsp; Variables.&amp;nbsp; defun.&amp;nbsp; defmacro.&amp;nbsp; Separate namespaces for functions and variables, defined by context, so you can say &lt;code&gt;(array_map htmlspecialchars row)&lt;/code&gt; and it will know that the first argument passed is a callable and the second is an expression, so that they can compile to 'htmlspecialchars' and $row, 
respectively.&amp;nbsp; And to serve its original purpose as an "enhanced PHP"-to-PHP compiler, it needs to  read that source language rather than s-expressions.&amp;nbsp; Of course, with a non-sexp-based language, macros might not work out so well, but I do want to be able to run code to rewrite the AST (or the whole tokenizer: aka reader macros) at compile-time.&lt;br /&gt;&lt;br /&gt;There's a bunch of features I want to add, too.&amp;nbsp; Proper named arguments.&amp;nbsp; Multiple-value return.&amp;nbsp; Ubiquitous lexical scope, so obviously &lt;code&gt;let&lt;/code&gt; and its function equivalent (&lt;code&gt;flet&lt;/code&gt; perhaps?). Something else that I'm forgetting at the moment.&lt;br /&gt;&lt;br /&gt;In the long run, I also want to do some optimizations; ideally, I could turn &lt;code&gt;$efoo = array_map('htmlspecialchars', $foo);&lt;/code&gt; into &lt;code&gt;$efoo=array(); foreach ($foo as $k=&amp;gt;$v) $efoo[$k]=htmlspecialchars($v);&lt;/code&gt; as well as doing simple optimizations like &lt;code&gt;i++;&lt;/code&gt; to &lt;code&gt;++i;&lt;/code&gt;.&amp;nbsp; I'd also love to be able to compile some 5.3 code like &lt;code&gt;$foo::bar("baz")&lt;/code&gt;, &lt;code&gt;?:&lt;/code&gt;, and "nowdoc" syntax into 5.2-compatible renditions (answer to the first: &lt;code&gt;call_user_func(array($foo, 'bar'), "baz")&lt;/code&gt; though my accumulated wisdom now considers such things to be a code smell).&lt;br /&gt;&lt;br /&gt;The weird thing about this is that if I succeed, I'll be doing what Rasmus did to create PHP—riffing on an existing system in the domain to come up with something a little better.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-1176797284382812066?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/1176797284382812066/comments/default' 
title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=1176797284382812066&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1176797284382812066'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1176797284382812066'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/06/first-step-of-long-journey.html' title='The First Step of a Long Journey'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-4668945375247683665</id><published>2011-05-19T22:31:00.001-04:00</published><updated>2011-05-19T22:47:38.165-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lisp'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='php'/><title type='text'>Accidental Lisp</title><content type='html'>It began with a simple bit of laziness: I wanted a preprocessor so that I could write as if PHP had multiple return values.&amp;nbsp; I'd write "return $x, $y;" in the callee, and "$a, $b = fn();" in the caller, and the preprocessor would rewrite it to valid PHP (throwing array() and list() around the appropriate expressions).&lt;br /&gt;&lt;br /&gt;But I'm even too lazy for that.&amp;nbsp; To do this right, I'd need to &lt;i&gt;fully parse &lt;/i&gt;the PHP, so I could understand more complicated return expressions like method calls.&amp;nbsp; So instead of that, 
I slapped together a lexer for s-expressions.&amp;nbsp; They're a lot less hairy, and this is just some twisted experiment.&lt;br /&gt;&lt;br /&gt;I was halfway through putting together a parser this evening for the lexer output, when I realized: a few years ago, I ported the metacircular evaluator from the SICP lectures into Ruby... then discovered I would need to write an s-expression parser, which you get for free with Lisp.&amp;nbsp; (That project then died.)&amp;nbsp; But if I finish an s-expression parser... I can port the metacircular evaluator to it and have the world's stupidest Lisp-1 implementation, i.e. it'll be done in PHP.* &lt;br /&gt;&lt;br /&gt;Alternatively, I can define a package in SBCL that emits PHP, and have the reader and macros for free.&amp;nbsp; Then my head exploded.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;* Because this tool was intended for PHP shops, the compiler would have to  be written in and emit PHP so there's no Scary Foreign Language  involved, other than the compiler's input.&amp;nbsp; And originally, that input  language was going to be almost PHP.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-4668945375247683665?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/4668945375247683665/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=4668945375247683665&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/4668945375247683665'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/4668945375247683665'/><link rel='alternate' type='text/html' 
href='http://sapphirepaw.blogspot.com/2011/05/accidental-lisp.html' title='Accidental Lisp'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-537409972998988082</id><published>2011-05-10T22:31:00.000-04:00</published><updated>2011-05-10T22:31:06.917-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tip'/><category scheme='http://www.blogger.com/atom/ns#' term='dh'/><category scheme='http://www.blogger.com/atom/ns#' term='security'/><category scheme='http://www.blogger.com/atom/ns#' term='ipsec'/><category scheme='http://www.blogger.com/atom/ns#' term='ubuntu'/><title type='text'>Quickie: Diffie-Hellman Groups</title><content type='html'>Relying on others' suggested magic numbers for crypto is probably a Bad Idea, so recently I studied Diffie-Hellman a while to understand what the "DH Group" parameter was in my IPSEC setup, and my PuTTY settings.&lt;br /&gt;&lt;br /&gt;DH turns out to be a lot like RSA, so bit lengths are comparable between the two and neither is directly comparable to symmetric ciphers like AES.&amp;nbsp; A specific Diffie-Hellman exchange happens using some parameters: a generator for the base, and a prime to use as modulus.&amp;nbsp; (An exponent remains secret.)&amp;nbsp; DH Groups refer to specific, pre-chosen prime-and-generator pairs so that, for example, SSH can negotiate "group 14" instead of transferring the complete parameters themselves.&lt;br /&gt;&lt;br /&gt;These groups have been standardized in &lt;a href="https://datatracker.ietf.org/doc/rfc2409/"&gt;RFC 2409&lt;/a&gt;, with additional groups defined in &lt;a 
href="https://datatracker.ietf.org/doc/rfc3526/"&gt;RFC 3526&lt;/a&gt;.&amp;nbsp; The latter RFC defines the bit lengths of the groups explicitly, stating that group 5 is 1536 bits, group 14 is 2048, and group 16 is 4096 bits.&amp;nbsp; As far as I can tell, groups 1 and 2 defined in the earlier RFC are only 768 and 1024 bits, respectively.&lt;br /&gt;&lt;br /&gt;Note well: &lt;b&gt;I believe this means DH groups 1 and 2 are &lt;a href="http://en.wikipedia.org/wiki/Cryptographic_key_length#Asymmetric_algorithm_key_lengths"&gt;dangerously short&lt;/a&gt;&lt;/b&gt; and should not be used to set up an IPSEC VPN today.&amp;nbsp; Likewise, PuTTY should really be configured out-of-the-box to warn about the use of anything less than DH group 14.&amp;nbsp; However, before I take my own advice, I need to do some experiments to determine whether the IPSEC client in iOS actually handles DH groups other than 2.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-537409972998988082?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/537409972998988082/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=537409972998988082&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/537409972998988082'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/537409972998988082'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/05/quickie-diffie-hellman-groups.html' title='Quickie: Diffie-Hellman 
Groups'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-7696825317710223567</id><published>2011-05-04T20:07:00.000-04:00</published><updated>2011-05-04T20:07:15.734-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='dns'/><category scheme='http://www.blogger.com/atom/ns#' term='tip'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><title type='text'>Quickie: The Necessity of Whimsical Names</title><content type='html'>Rackspace recently announced that they'd like to discontinue Slicehost at some point, migrate everyone to the EC2-like Rackspace Cloud, and make people worry per GB about the bandwidth they're consuming.&amp;nbsp; So I'm preparing a move to Linode for more of everything*, and in the planning, I've come across a new argument in favor of whimsical names for servers.&lt;br /&gt;&lt;br /&gt;If I give each server a whimsical name, like alice.example.com and bob.example.com, I can always refer to the old and new IP addresses as "alice" and "bob", while the change of IP of "www" propagates through the DNS.&amp;nbsp; Between the time where the new address is set and the old one is expired (and note that there's no way to &lt;i&gt;force &lt;/i&gt;an ISP's resolver to honor the TTL if they choose to assume "no TTLs will be shorter than an hour") the name being transitioned points to a more-or-less random server.&lt;br /&gt;&lt;br /&gt;Basically, the whimsical name is like a server ID, and the service-based names are just conveniences.&amp;nbsp; Though a program is three lines long, someday it must be maintained; though a server hosts one service, someday 
it will have to be replaced.&amp;nbsp; When an organization gets big enough that it can't generate whimsy as fast as it needs servers, then it should go with something more regular for the server name, but each server should still have a unique, non-service-based name.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;* Except bandwidth, but the 11% difference is smaller than my current monthly consumption, so it turns out not to matter much.&amp;nbsp; Even if it did matter, that much transfer on The Cloud (insert angelic chord here) would be expensive, so Linode still wins.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-7696825317710223567?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/7696825317710223567/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=7696825317710223567&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7696825317710223567'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7696825317710223567'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/05/quickie-necessity-of-whimsical-names.html' title='Quickie: The Necessity of Whimsical Names'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-5905344423948509222</id><published>2011-04-21T22:45:00.002-04:00</published><updated>2011-04-21T22:54:52.018-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='history'/><category scheme='http://www.blogger.com/atom/ns#' term='microsoft'/><category scheme='http://www.blogger.com/atom/ns#' term='markets'/><category scheme='http://www.blogger.com/atom/ns#' term='apple'/><title type='text'>Perception</title><content type='html'>They say you don't get a second chance to make a first impression, but that depends on who you are.&amp;nbsp; Apple seems to have managed a couple of major architecture transitions and their own Vista without too much ill will, yet Microsoft was practically crucified for Vista with no architecture transitions.&lt;br /&gt;&lt;br /&gt;Fair warning: many links in this post lead to tvtropes. 
&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;To be sure, Apple had the advantage on their architecture transitions that they were moving between chips that had significantly more performance, which greatly reduced the penalty of emulating the old chips; a 100 MHz PowerPC (that, at the time, Amiga Computing* boasted would run like a 200 MHz Pentium, or something like that) was rather adept at &lt;a href="http://en.wikipedia.org/wiki/Mac_68K_emulator"&gt;pretending&lt;/a&gt; to be a 40 MHz 68040.&lt;br /&gt;&lt;br /&gt;Vista, on the other hand, was widely panned in the tech press, and subject to the same problems regarding driver availability as Mac OS X 10.0, if anyone remembers that.&amp;nbsp; "zOMG there's no drivers!"&amp;nbsp; And it was &lt;i&gt;so bloated.&lt;/i&gt;&amp;nbsp; Well, on the machines of the day.&amp;nbsp; I don't really fault Microsoft for believing that PCs would continue &lt;a href="http://tvtropes.org/pmwiki/pmwiki.php/Main/BeyondTheImpossible"&gt;getting faster&lt;/a&gt; and everyone would keep buying new ones for Vista, considering the market had been working that way for something like 15 years.&amp;nbsp; Multicore just threw them off their old, proven, and profitable game.&lt;br /&gt;&lt;br /&gt;Anyway, in 2009, I got a new machine provisioned for me at my new job, the first one that wasn't cobbled together in-house out of spare parts, or handed down from the previous developer.&amp;nbsp; Unfortunately my boss forgot to check the "Vista Home Premium" upgrade, so it came with Home Basic and was thus ineligible for the free Windows 7 upgrade program.&amp;nbsp; So I've been using Vista  for a year and a half, and it sucks.&amp;nbsp; Totally, it's... uh... 
er,&amp;nbsp; yeah.&amp;nbsp; There's actually nothing wrong with it other than the fact that it doesn't have full Aero so I don't get semi-transparent blurring titlebars, which actually means the title text is readable.&amp;nbsp; Yep.&lt;br /&gt;&lt;br /&gt;Compare to my days of being a Tech Press Believer back in 2007, when I was worried (near to the point of tears, on one occasion) about building a computer for my then-fiancée 'soon enough' that I could still get XP for it, because Vista was so indescribably awful.&lt;br /&gt;&lt;br /&gt;I didn't know then that Vista was actually a truly modernized OS, with a sane security model, and all the horrors of UAC were because 90% of Windows software was completely broken, and Windows of yore just didn't enforce proper coding at all.&lt;br /&gt;&lt;br /&gt;Yet the tech press buried Vista under a negative wave of publicity.&amp;nbsp; It became Microsoft's New Coke.&amp;nbsp; The refrain of Windows 7, like Coke Classic, was essentially the message of "We listened."&amp;nbsp; Now in the se7en era, the tech-o-sphere passes around the Official Belief that "7 is what Vista should have been, and I can prove it because it's even &lt;i&gt;actually &lt;/i&gt;called 6.1", somewhat oblivious to the fact that 7 is largely what Vista is.&amp;nbsp; Had 7 been released without Vista in between, how would it have been received?&lt;br /&gt;&lt;br /&gt;Probably like KDE 4, Gnome 2, or Vista.&amp;nbsp; "There's no drivers and they changed &lt;b&gt;everything!&lt;/b&gt; Whyyyyyyyyy!"&amp;nbsp; Because without the intervening release of Vista, many drivers at the launch of seven would have been in a similar situation as the launch of Vista: companies don't want to move on things until it's proven that they must, because preparing for futures that don't arrive is almost completely wasted effort.&lt;br /&gt;&lt;br /&gt;An interesting point of divergence here between Apple's and Microsoft's approaches to OS X and Vista, respectively, is that OS 
X included an emulator for Mac OS Classic.&amp;nbsp; There was much more of a break between the environments, with the new not making any pretense of compatibility with the old beyond a sandbox.&lt;br /&gt;&lt;br /&gt;With Vista, however, Microsoft faced a &lt;a href="http://tvtropes.org/pmwiki/pmwiki.php/Main/ptitlelewwvnvy"&gt;bunch of bad choices&lt;/a&gt;: either they could continue in XP's footsteps, releasing NT 5.(x+1), and continue taking heat for security**; they could try an OS X or NT 3.x-like break with history, and get flamed for lack of compatibility or performance of an XP Mode if they tried emulating it; or they could launch Vista as we saw it, breaking compatibility with drivers and bad (but widespread) coding practices to put necessary pressure on just about everyone to fix their broken stuff.&lt;br /&gt;&lt;br /&gt;No matter what they did, someone would complain.&amp;nbsp; "XP mode" would inevitably be slower than running XP on the bare metal, and if Microsoft put in a lot of effort to paravirtualize it,  doing so would make it possible for others to mimic that layer under it and run it on unauthorized systems.&amp;nbsp; On the security front, they could continue the "anyone can break the system completely" model of security inherited from their single-user days, they could keep silently failing like limited accounts on XP tend to, or they could add elevation prompts and face complaints about the necessary inconvenience with legacy programs.&lt;br /&gt;&lt;br /&gt;People forgave Apple for OS X*** 10.0, but they didn't cut Microsoft any slack for Vista.&amp;nbsp; At least, the tech press and its devoted fans didn't.&amp;nbsp; The Mojave Experiment brings out another interesting similarity to New Coke: supposedly, when people were using a &lt;a href="http://en.wikipedia.org/wiki/Mojave_Experiment"&gt;rebranded version of Vista&lt;/a&gt;, they rated it much more highly than Vista itself.&amp;nbsp; Likewise, Coke only went through with New Coke 
because &lt;i&gt;testing showed that people liked it better.&lt;/i&gt;&amp;nbsp; But take the blinds off, and New Coke?&amp;nbsp; &lt;a href="http://tvtropes.org/pmwiki/pmwiki.php/Main/TheyChangedItNowItSucks"&gt;Eww&lt;/a&gt;!&lt;br /&gt;&lt;br /&gt;In the end, 'success' (when defined as widespread usage of a product) is a result of perception of that product.&amp;nbsp; Substantially similar products may end up at very different outcomes simply because of a difference in perception between the two that gives one an advantage relative to the other.&amp;nbsp; Much-maligned Vista, a major and necessary improvement to Windows, underperformed because of widespread perception that it was Bad, among people who decided what to buy for their Fortune 1000 Company; whereas Windows 7 adoption is moving full steam ahead, such that it overtook cumulative Vista share quite a while ago now.&lt;br /&gt;&lt;br /&gt;It makes me wonder how the scenario would have been played out if the tech press and their followers had not worked so hard to squelch Vista.&amp;nbsp; Would Microsoft have deployed some of the efficiency improvements found in Windows 7 as a service pack or "R2" to Vista, bringing the memory efficiency improvements into their older OS?&amp;nbsp; I doubt they would have overhauled the Taskbar so drastically, but as it was, the need to escape the "Vista = Awful" conception drove them to roll out a new brand for the OS as soon as they could. 
&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;* For a while after the end of Commodore, the Amiga IP staggered around, zombie-fashion, among companies that didn't really take much advantage of it.&amp;nbsp; One of the possible, wonderful futures of those days called for an Amiga resurrection around the PowerPC platform; formerly, it also ran on the 68000 series, like the Mac, so it's not as irrelevant or crazy as it sounds.&amp;nbsp; Then again, if they wanted to do anything like switch from bitplane to pixel-based graphics, it would have required &lt;i&gt;seriously hardcore &lt;/i&gt;emulation compared to the Macs.&amp;nbsp; AFAIK.&lt;br /&gt;&lt;br /&gt;** In reality, as the biggest and most valuable target, Microsoft needs a much greater level of security in order to result in similar exploit counts, which is how people like to simplify 'security.'&amp;nbsp; OSS proponents used to whine when someone totaled MS Advisories vs. Red Hat advisories for a year and concluded that Linux was less secure because RH issued more bulletins, since they issued them on a much wider range of software. 
However, even if you counted only "equivalent" vulnerabilities, you would find that bulletins-per-installation is higher for Linux because Microsoft has such a huge divisor on that metric.&lt;br /&gt;&lt;br /&gt;*** I pronounce this "Oh Ess Ex," by the way, because "OS Ten Ten Point Five" sounds completely dumb, and nobody ever writes "OS X.5".&amp;nbsp; Though I'd call it "Oh Ess Ex Five" if they did, because nobody uses Roman Decimals either.&amp;nbsp; Maybe I should make a set of Perl modules so we can `use V::X::III` instead of 5.010_03 or 5.10.3, but then, Damian Conway probably beat me to it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-5905344423948509222?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/5905344423948509222/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=5905344423948509222&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/5905344423948509222'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/5905344423948509222'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/04/perception.html' title='Perception'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-8593363652889371578</id><published>2011-03-04T07:00:00.001-05:00</published><updated>2011-03-04T07:00:13.487-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ambiguity'/><category scheme='http://www.blogger.com/atom/ns#' term='customization'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='ui'/><title type='text'>The Authority of the User</title><content type='html'>I used to believe that my computer was mine, and no program had any authority to do anything without my consent.&amp;nbsp; (This can probably be traced back to my days on Slashdot, a decade ago; if I didn't get the opinion from there, they certainly reinforced it.)&amp;nbsp; I believed I was sufficiently smart to manage my own software, without everyone's updater constantly nagging me to do so.&amp;nbsp; I especially didn't want the updater to do it on its own; this often led to problems, especially when Firefox got updated behind the scenes while I was using it.&amp;nbsp; However, I liked automatic security updates on Linux, so I got rather used to restarting Firefox when links mysteriously failed to be followed, or menus and tabs couldn't be opened (these being the days before the "Firefox has been updated and needs to be restarted" notification).&lt;br /&gt;&lt;br /&gt;Then, everything changed. &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;It started with Chrome. Following a shift to pragmatism over idealism, my work PC running Windows started to collect useful, if bloated, software like Safari, Flash, Java, Adobe Reader, and the like. 
Then I had a bunch of update reminders popping up at random, because they all live in their own little universe and can't be coordinated. (And many of these use their update notification to try to sneak in the non-Google-search-toolbar-&lt;i&gt;du-jour&lt;/i&gt; or worse, iTunes, if you're not paying attention.)&amp;nbsp; Many of these updates also needed a restart of the system afterwards.&lt;br /&gt;&lt;br /&gt;Chrome, on the other hand, has neither nag screen, nor reboot requirement, nor sneaky-ware.&amp;nbsp; Which, as a user, is immensely convenient, in comparison to manual updates or the alternative "automatic" update systems that I have seen.&amp;nbsp; Once again, it turns out that users don't know what they actually want.&lt;br /&gt;&lt;br /&gt;However, I've heard on the internet (and therefore it &lt;i&gt;must be true&lt;/i&gt;) that Chrome doesn't offer any control to an overworked IT administrator in a large company.&amp;nbsp; It doesn't integrate with Group Policy, it frequently writes to the user profile and stores a lot of data there, and it doesn't offer any control over the update process.&amp;nbsp; All of which combine to make it quite a thorn in the side of someone responsible for (say) 1,000 desktops.&lt;br /&gt;&lt;br /&gt;Taking these points into consideration, the authority of The User over Their Machine is not an absolute thing.&amp;nbsp; For one, the machine may not be technically "theirs," and for another, they may not be truly qualified or involved enough to make the decision themselves.&amp;nbsp; These days, I would prefer "don't annoy me" over that feeling of "don't do anything without my permission" that I so fervently believed in before.&lt;br /&gt;&lt;br /&gt;The nature of the software being updated also comes into play.&amp;nbsp; If Chrome breaks, there's typically IE, Safari, or Firefox co-installed.&amp;nbsp; Failure of Chrome is not as important as a failure in Windows, Photoshop, or a line-of-business application, all of which 
may be vital to a worker in a company.&amp;nbsp; When Chrome fails, it's extremely likely that there's an alternative already installed on the machine which can be used with no downtime.&amp;nbsp; Moreover, Linux distributions relying on Chromium (Chrome's parent project) as a primary browser do not take updates as they are released, but use a version from the channel that has been further vetted for a week or two.&lt;br /&gt;&lt;br /&gt;Another consideration with the software is that the updates to Chrome have been relatively minor, from an interface perspective.&amp;nbsp; It's not comparable to the interface jump between major versions of Firefox, for instance.&amp;nbsp; I don't think many users would take it well if they opened up Photoshop one day and it looked entirely different from the day before, without any warning or any way to get the comfortable UI back.&amp;nbsp; In comparison, even if you jumped from Chrome 1.0 to 9.0, about the only thing to contend with is a unified menubar, and if you installed any, the Apps area of the New Tab page.&lt;br /&gt;&lt;br /&gt;Compared to my old whiny-egotistical-child view of "IT'S MINE," my current perception of the amount of authority a user should have over the computer is, "It depends."&amp;nbsp; (That's the ultimate cop-out answer to everything, it seems.)&amp;nbsp; The amount of warning and veto power a user needs depends on their skill, the software and its rate of change, what they're using the software for, and what environment the computer is in (work vs. home vs. 
mobile broadband connection).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-8593363652889371578?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/8593363652889371578/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=8593363652889371578&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8593363652889371578'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8593363652889371578'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/03/authority-of-user.html' title='The Authority of the User'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-2907295697385023644</id><published>2011-03-02T10:35:00.001-05:00</published><updated>2011-03-02T10:35:55.039-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='man'/><category scheme='http://www.blogger.com/atom/ns#' term='tip'/><category scheme='http://www.blogger.com/atom/ns#' term='MANPATH'/><category scheme='http://www.blogger.com/atom/ns#' term='shell'/><title type='text'>Quick tip: extending the man search path without $MANPATH</title><content type='html'>If you've ever tried to add a directory to man's search path, you've undoubtedly noticed that the 
&lt;code&gt;MANPATH&lt;/code&gt; environment variable &lt;i&gt;replaces &lt;/i&gt;rather than &lt;i&gt;extends&lt;/i&gt;&amp;nbsp;man's built-in search path. &amp;nbsp;Today, I rediscovered a clever little setup on a machine at work.&lt;ol&gt;&lt;li&gt;Copy /etc/man.config to somewhere in your home dir. &amp;nbsp;Mine seems to be at ~/.config/man/man.config for optimal redundant redundancy. &amp;nbsp;(I will say that keeping the "man.config" name of the file makes vim highlight it without additional fuss.)&lt;/li&gt;&lt;li style="margin-top: 0.7em;"&gt;Add your desired &lt;code&gt;MANPATH&lt;/code&gt; lines to this file at whatever position you wish. &amp;nbsp;Don't forget to curse the lack of an include mechanism at this point, which prevents you from automatically getting changes to /etc/man.config. &amp;nbsp;Cheer up, because there probably won't be any.&lt;/li&gt;&lt;li style="margin-top: 0.7em;"&gt;Add an alias to your shell. &amp;nbsp;For bash, you would put something like &lt;code&gt;alias man='man -C ~/.config/man/man.config'&lt;/code&gt; (which obviously includes the name of the file chosen in step 1) into &lt;code&gt;~/.bashrc&lt;/code&gt;. &amp;nbsp;Remember to &lt;code&gt;source ~/.bashrc&lt;/code&gt; to make it take effect in the current session.&lt;/li&gt;&lt;/ol&gt;That's all! 
&amp;nbsp;Now when you run &lt;code&gt;man&lt;/code&gt;, your personal manpages will be searched as well.&lt;br /&gt;&lt;br /&gt;The documentation for &lt;code&gt;man&lt;/code&gt; on the system in question claims that it will use $PATH to guess at additional man page locations, but this does not actually work for me.&amp;nbsp; Having a command&amp;nbsp;in &lt;code&gt;~/.install/bin&lt;/code&gt; does not allow &lt;code&gt;man&lt;/code&gt; to find the manpage in &lt;code&gt;~/.install/share/man&lt;/code&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-2907295697385023644?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/2907295697385023644/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=2907295697385023644&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2907295697385023644'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2907295697385023644'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/03/quick-tip-extending-man-search-path.html' title='Quick tip: extending the man search path without $MANPATH'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-2085070703300852692</id><published>2011-02-28T22:29:00.002-05:00</published><updated>2011-08-20T15:23:09.030-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='explanation'/><category scheme='http://www.blogger.com/atom/ns#' term='fp'/><category scheme='http://www.blogger.com/atom/ns#' term='scope'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><category scheme='http://www.blogger.com/atom/ns#' term='perl'/><title type='text'>Variable scope, require, and use</title><content type='html'>I ran into some interesting problems in Perl, which prompted me to learn more about the require/use mechanisms and how constants are interpreted.&amp;nbsp; In this post, I'll lay out some general terms about variable scoping, such as lexical scope, dynamic scope, the differences between them, and how they all interact in Perl.&amp;nbsp; Then I'll cover require and use with that foundation in place.&lt;br /&gt;&lt;br /&gt;If you've been wondering about lexicals or closures, this is your post.&amp;nbsp; I've tried to lay things out more or less from basic principles, despite the verbosity of the approach, because this has taken me years to understand.&amp;nbsp; I started programming with Perl in 2000 and &lt;i&gt;still&lt;/i&gt; learned a bit more about it &lt;i&gt;today.&lt;/i&gt;&amp;nbsp; Yes, it's 2011 now.&amp;nbsp; Hopefully, you can read this and get it in less time.&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;&lt;h3&gt;Lexical Scope&lt;/h3&gt;With lexical scope, the variables visible to a certain block of code depend on the physical layout of the source code.&amp;nbsp; A lexical variable can be used by code in the same block scope; typically in Perl, this is a package, a subroutine, an if/else structure, or 
a loop, and "a package" is often synonymous with an entire file.&lt;br /&gt;&lt;br /&gt;An example is helpful here: note that lexical variables (variables to which lexical scoping rules are applied) are defined in Perl by the &lt;code&gt;my&lt;/code&gt; keyword.&amp;nbsp; The standard use of lexicals in Perl is for variables local to a subroutine:&lt;br /&gt;&lt;blockquote style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$x = 10;&lt;br /&gt;sub foo { my $x = 13; print "In foo, it's $x\n"; }&lt;br /&gt;print "At first, x is $x.\n";&lt;br /&gt;foo();&lt;br /&gt;print "Back outside, it's $x.\n";&lt;/blockquote&gt;This will inform you that x is 10, 13, and 10.&amp;nbsp; foo's declaration of &lt;code&gt;my $x&lt;/code&gt; restricts that variable to the body of foo.&amp;nbsp; Instead, suppose the file declares $x with &lt;code&gt;my&lt;/code&gt;, and the sub does not use &lt;code&gt;my&lt;/code&gt;:&lt;br /&gt;&lt;blockquote style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;my $x = 10;&lt;br /&gt;sub foo { $x = 13; print "Of course, foo sees $x.\n"; }&lt;br /&gt;print "At first, x is still $x.\n";&lt;br /&gt;foo();&lt;br /&gt;print "Afterward, it is $x.\n";&lt;/blockquote&gt;In this case, you'll see that foo was able to change the variable declared outside of it.&amp;nbsp; How?&amp;nbsp; The sub foo is defined within the scope where $x was declared lexical, in this case the file.&amp;nbsp; If you move the sub foo into a separate file called foo.pl:&lt;br /&gt;&lt;blockquote style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;sub foo { $x = 13; print "Of course, foo sees $x.\n"; }&lt;br /&gt;1;&lt;/blockquote&gt;And then change the original to require foo.pl:&lt;br /&gt;&lt;blockquote style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;my $x = 10;&lt;br /&gt;require 'foo.pl'; # THIS IS BAD CODE&lt;br /&gt;foo();&lt;br /&gt;print "With require, x is $x 
after calling foo.\n";&lt;/blockquote&gt;Now you'll see the final value as 10 again.&amp;nbsp; The required file has its own lexical scope, so foo can no longer see the &lt;code&gt;my $x&lt;/code&gt; declaration in the main script.&amp;nbsp; Note the bad code comment; this is not a good way to use require, and I'll explain why later in this post.&amp;nbsp; I'm using it here simply to demonstrate the point.&lt;br /&gt;&lt;br /&gt;Since &lt;code&gt;foo.pl&lt;/code&gt; isn't modifying the lexical, what variable gets changed?&amp;nbsp; It's actually the global $x, since variables in Perl are global by default.&amp;nbsp; If you were to print the value of &lt;code&gt;$main::x&lt;/code&gt; before the &lt;code&gt;require&lt;/code&gt; line and again after the call to &lt;code&gt;foo()&lt;/code&gt;, the values would be &lt;code&gt;undef&lt;/code&gt; and 13, respectively. &lt;br /&gt;&lt;br /&gt;Where does 'closure' come into lexical scope, you ask?&amp;nbsp; This is your program:&lt;br /&gt;&lt;blockquote style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;sub series {&lt;br /&gt;&amp;nbsp; my $x = 0;&lt;br /&gt;&amp;nbsp; return sub { return ++$x; };&lt;br /&gt;}&lt;br /&gt;my $seq = series();&lt;br /&gt;for my $i (20..24) { print $seq-&amp;gt;(), "\n"; }&lt;/blockquote&gt;Well, this uses a lot more of Perl than the previous examples, but the output is pretty simple: 1 through 5.&amp;nbsp; What's going on?&amp;nbsp; The call to &lt;code&gt;series()&lt;/code&gt; returns an anonymous sub, which we store in &lt;code&gt;$seq&lt;/code&gt; as a code ref; then, we can call it with &lt;code&gt;$seq-&amp;gt;()&lt;/code&gt; where the parentheses are a standard argument list, which is empty in this case.&amp;nbsp; The print just outputs the value returned by that call, which is the next value of $x.&amp;nbsp; The loop variable runs from 20 to 24 just so you can see that the loop variable is not involved in the printing at all.&lt;br /&gt;&lt;br /&gt;The anonymous sub is allowed to see the 
&lt;code&gt;my $x&lt;/code&gt; that &lt;code&gt;series&lt;/code&gt; defined, and it's always allowed to see it—even when the sub is passed out to code that &lt;i&gt;cannot&lt;/i&gt; see it.&amp;nbsp; This is what lexical scope is all about: &lt;b&gt;the variables visible to you, based on your location in the source, remain visible to you, and nobody from outside the scope can affect them.&lt;/b&gt;&amp;nbsp; (Unless your variable holds a reference to some data that they can modify.)&lt;br /&gt;&lt;br /&gt;To differentiate variables like the last example's $x which are used in, but not defined within, a specific scope, the inner scope (the anonymous sub stored in &lt;code&gt;$seq&lt;/code&gt;) is said to have "closed over $x", and $seq itself "is a closure" because it has closed over something.&amp;nbsp; The notion of closure separates purely local variables which are re-initialized on each call (as in the first example) from variables that are defined outside of a function, which can be consistent from invocation to invocation.&amp;nbsp; The variable doesn't live &lt;i&gt;within&lt;/i&gt; the function's definition, so it is neither destroyed nor reset when the function returns or is called again.&lt;br /&gt;&lt;br /&gt;There are two last Perl-specific things I want to point out before continuing.&amp;nbsp; For one, even when a local variable is defined, the global is still accessible under the fully-qualified name:&lt;br /&gt;&lt;blockquote style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$x = 10;&lt;br /&gt;sub foo { my $x = 13; print "my x is $x but yours is $main::x\n"; }&lt;br /&gt;foo();&lt;/blockquote&gt;This will show the values 13 and 10.&lt;br /&gt;&lt;br /&gt;Also, Perl offers the &lt;code&gt;our&lt;/code&gt; keyword to "undo" the effects of a &lt;code&gt;my&lt;/code&gt;:&lt;br /&gt;&lt;blockquote style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$x = 10;&lt;br /&gt;{&lt;br /&gt;&amp;nbsp; my $x = 
13;&lt;br /&gt;&amp;nbsp; print "my x is $x\n";&lt;br /&gt;&amp;nbsp; {&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; our $x;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; print "our x is $x\n";&lt;br /&gt;&amp;nbsp; &amp;nbsp; $x = 42; &lt;br /&gt;&amp;nbsp; }&lt;br /&gt;&amp;nbsp; print "x is $x, I swear\n"; &lt;br /&gt;}&lt;br /&gt;print "x is really $x\n"; &lt;/blockquote&gt;This example will show values of 13, 10, 13, and 42.&amp;nbsp; I've used unnamed blocks here because they're more convenient for this example than trying to nest subs for no reason.&amp;nbsp; &lt;code&gt;our&lt;/code&gt; establishes that, &lt;b&gt;for its lexical container,&lt;/b&gt; the variable should reference the &lt;b&gt;global variable and value.&lt;/b&gt;&lt;i&gt;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Perl 5.10 added a couple of features: one is the ability to use new features added to Perl, and another is the specific inclusion of the 'state' feature.&amp;nbsp; (See &lt;code&gt;perldoc feature&lt;/code&gt; for more in-depth information on features and their usage.)&amp;nbsp; &lt;code&gt;state&lt;/code&gt; variables are similar to &lt;code&gt;my&lt;/code&gt; variables that were declared in a separate scope just outside the function.&amp;nbsp; Only the current function can access it, and its value doesn't get reset when the function is called again.&amp;nbsp; Our example from above with &lt;code&gt;series&lt;/code&gt; and &lt;code&gt;seq&lt;/code&gt; could be written as:&lt;br /&gt;&lt;blockquote style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;use 5.010; # activate all 5.10 features&lt;br /&gt;sub serial {&lt;br /&gt;&amp;nbsp; state $x = 0;&lt;br /&gt;&amp;nbsp; return ++$x;&lt;br /&gt;}&lt;br /&gt;for my $i (20..24) { say serial(); }&lt;/blockquote&gt;There are important differences.&amp;nbsp; In this example, there can only be one counter: you wouldn't be able to establish multiple independent sequences with multiple calls into the &lt;code&gt;sub series&lt;/code&gt; 
that the other example featured.&amp;nbsp; Another important difference is that state variables &lt;a href="http://www.perlmonks.org/?node_id=659342"&gt;can be initialized later&lt;/a&gt; than their traditional lexical counterparts.&lt;br /&gt;&lt;br /&gt;Finally, &lt;code&gt;use 5.010;&lt;/code&gt; is not a typo.&amp;nbsp; It's just the old version number format with 3 digits for the minor version (what perl calls "version" when reporting revision/version/subversion.)&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Dynamic scope&lt;/h3&gt;In contrast to lexical scope, dynamic scope cares nothing for source code layout.&amp;nbsp; Instead, it's affected by the run time.&amp;nbsp; Perl's "global" scope is technically dynamic, and actual dynamic scoping is created by the &lt;code&gt;local&lt;/code&gt; keyword.&amp;nbsp; How does it work?&amp;nbsp; Let's look at an example:&lt;br /&gt;&lt;blockquote style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$x = 10;&lt;br /&gt;sub foo { local $x = 42; bar(); }&lt;br /&gt;sub bar { print "x seems to be $x\n"; ++$x; }&lt;br /&gt;bar();&lt;br /&gt;bar();&lt;br /&gt;foo();&lt;br /&gt;bar();&lt;/blockquote&gt;This shows the values 10, 11, 42, and... 
12!&amp;nbsp; Each time it's called, &lt;code&gt;bar&lt;/code&gt; increments the "global" variable.&amp;nbsp; However, &lt;code&gt;foo&lt;/code&gt; establishes a new value for the variable, for itself and any function it calls, and this temporary value disappears again when foo returns (technically, when the block containing the &lt;code&gt;local&lt;/code&gt; exits).&amp;nbsp; So &lt;code&gt;bar&lt;/code&gt; updated it to 43, but then it and &lt;code&gt;foo&lt;/code&gt; returned, and the original variable (whose value was then 12) came back into effect.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;foo&lt;/code&gt; would not have been able to affect any variables declared inside bar as lexicals with &lt;code&gt;my&lt;/code&gt;, but by using &lt;code&gt;local&lt;/code&gt; to change the dynamic scope, &lt;code&gt;foo&lt;/code&gt; can affect &lt;code&gt;bar&lt;/code&gt;'s view of the global variables. This would happen regardless of whether &lt;code&gt;bar&lt;/code&gt; is in a different file. It's also possible to affect another package, if &lt;code&gt;bar&lt;/code&gt; explicitly references the global in &lt;code&gt;foo&lt;/code&gt;'s package, or if the latter calls local on a punctuation variable like &lt;code&gt;$/&lt;/code&gt;; these are always forced into a canonical package (&lt;code&gt;main&lt;/code&gt; if I'm not mistaken).&lt;br /&gt;&lt;br /&gt;Dynamic scope is "dynamic" because it can change from call to call, depending on whether any callers in the current call chain have used &lt;code&gt;local&lt;/code&gt; or not.&amp;nbsp; There's nothing stopping a variable from being &lt;code&gt;local&lt;/code&gt;ized several times in the chain, either.&amp;nbsp; &lt;b&gt;Under dynamic scope, the visible variables can be redefined by actions outside your location in the source.&lt;/b&gt;&amp;nbsp; When bar() updates $x, it is not guaranteed to be updating the $x defined at the top of the file.&lt;br /&gt;&lt;br /&gt;Perl's man pages note (or used to note) that if you're unsure, use 
&lt;code&gt;my&lt;/code&gt; to define a local variable for your subroutine, not &lt;code&gt;local&lt;/code&gt;.&amp;nbsp; Read charitably, this is an instruction to use lexical scope.&amp;nbsp; Dynamic scope turns out to be bad for larger systems: not only do your functions communicate through globals, but the "real" variable can be hidden and updated with a "fake" value.&amp;nbsp; Alternatively, forgetting to set (or not knowing that you need to set) a particular "fake" value before calling some subroutine can lead to surprisingly different results from a function call that appears to be the same: the argument lists may be the same, but all of the state the function relies on is not.&lt;br /&gt;&lt;br /&gt;Thus, you most often see &lt;code&gt;local&lt;/code&gt; used in Perl code to redefine one of Perl's special global variables, like &lt;code&gt;$/&lt;/code&gt;, for the duration of a block in order to set it to a known value without disturbing the view of that value from the rest of the program.&amp;nbsp; Such code rarely calls down into other functions, since it needs to have the variable set for its own work.&amp;nbsp; Also, Perl only allows lexicals with alphanumeric names, so &lt;code&gt;local&lt;/code&gt; is the &lt;b&gt;only&lt;/b&gt; way to apply a temporary value to one of the special variables.&lt;br /&gt;&lt;br /&gt;The last thing to note about globals is that they are actually &lt;i&gt;package &lt;/i&gt;global variables.&amp;nbsp; Take the following code:&lt;br /&gt;&lt;blockquote style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$x = 12;&lt;br /&gt;package Foo;&lt;br /&gt;$x = 16;&lt;br /&gt;sub foo { print "Foo's global x is $x\n"; }&lt;br /&gt;package main; &lt;br /&gt;sub print_x { print "main's x is $x\n"; } &lt;br /&gt;Foo::foo();&lt;br /&gt;print_x();&lt;/blockquote&gt;This prints 16 for Foo's x and 12 for main's x, because each package has its own $x.&amp;nbsp; Note that for the sake of illustration, I've crammed everything into one file.&amp;nbsp; You wouldn't do this in real code.&amp;nbsp; An interesting 
side effect I discovered while testing this example is that the effect of &lt;code&gt;our&lt;/code&gt; actually lasts to the end of file (or the next declaration of the same variable, as discussed below) on my Perl, so that if I define &lt;code&gt;our $x = 16;&lt;/code&gt; inside package Foo, then the print_x subroutine uses it instead of &lt;code&gt;$main::x&lt;/code&gt;!&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Perl scope resolution&lt;/h3&gt;All of this has been implied already, but I want to bring it together here.&amp;nbsp; When an unqualified &lt;code&gt;$x&lt;/code&gt; is used, how does Perl decide which variable that actually means?&amp;nbsp; It simply travels up the lexical scope chain, looking for a &lt;code&gt;my&lt;/code&gt; or &lt;code&gt;our&lt;/code&gt; declaration that applies for that variable.&amp;nbsp; If it is a lexical (declared with &lt;code&gt;my&lt;/code&gt; or &lt;code&gt;state&lt;/code&gt;), then it's done—Perl just uses that variable.&lt;br /&gt;&lt;br /&gt;If there is another lexical of the same name earlier in the current scope, or at a more outer scope, then that earlier/outer variable becomes inaccessible from the later declaration forward.&amp;nbsp; This is known as the later variable shadowing the former.&lt;br /&gt;&lt;br /&gt;If the variable is found and was declared as global with &lt;code&gt;our&lt;/code&gt;, or not found in any lexical scope, then it is considered a global, which is taken from the dynamic scope.&lt;br /&gt;&lt;br /&gt;Using &lt;code&gt;local&lt;/code&gt; does not affect the resolution of a variable.&amp;nbsp; It must be global where the &lt;code&gt;local&lt;/code&gt; is invoked, or else you'll see an error at compile time, "Can't localize lexical variable $x".&lt;br /&gt;&lt;br /&gt;In other words, &lt;code&gt;my&lt;/code&gt; and &lt;code&gt;our&lt;/code&gt; lexically control whether a variable is lexical or global, respectively; &lt;code&gt;local&lt;/code&gt; provides a mechanism to shadow global variables.&lt;br 
/&gt;&lt;br /&gt;&lt;h3&gt;Why (and how) "use strict" complains so much&lt;/h3&gt;One goal of &lt;code&gt;use strict&lt;/code&gt; (and the only goal of &lt;code&gt;use strict 'vars'&lt;/code&gt;) is to prevent you from unintentionally using a global variable when you meant to access a lexical one. When strict is in effect, traversing the entire lexical scope chain without finding a declaration of the variable with &lt;code&gt;my&lt;/code&gt; or &lt;code&gt;our&lt;/code&gt; triggers the error instead of falling back to global scope.&amp;nbsp; Fully-qualified variables aren't affected, because the full qualification implicitly means they are globals.&amp;nbsp; Lexical variables do not have a fully-qualified name.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Doing it wrong: "require $target" and "require 'file.pl'"&lt;/h3&gt;In the days before the package and module systems were invented for Perl 5, the way to load a library was through giving the path to the filename as a string to &lt;code&gt;require&lt;/code&gt;.&amp;nbsp; This still works today, but is widely considered wrong, because of several subtle problems that it causes.&amp;nbsp; These problems are probably what led to its replacement with the modern module system.&lt;br /&gt;&lt;br /&gt;The first problem is one of scope.&amp;nbsp; Using &lt;code&gt;require&lt;/code&gt; this way will effectively import &lt;i&gt;everything &lt;/i&gt;in the required file into your own scope, whether you wanted it or not—but only for global variables!&amp;nbsp; Lexicals remain local to their respective files.&amp;nbsp; It's easy to miss the distinction, or worse, have multiple files including the code, &lt;i&gt;some &lt;/i&gt;of which have a global variable defined as lexical.&lt;br /&gt;&lt;br /&gt;The second problem appears to be one of scope as well, but isn't.&amp;nbsp; Constants defined inside a file that has been loaded with &lt;code&gt;require&lt;/code&gt; may not be visible to the file that performed the 
&lt;code&gt;require&lt;/code&gt;.&amp;nbsp; Perl's constants are &lt;i&gt;really &lt;/i&gt;constant, because they're determined at compile time.&amp;nbsp; If a constant wasn't defined &lt;b&gt;at compile time&lt;/b&gt;, then its name is treated as a bareword, which may end up as a string holding the "constant" &lt;i&gt;name.&lt;/i&gt;&amp;nbsp; This distinction matters because &lt;code&gt;require&lt;/code&gt; is not normally executed &lt;b&gt;until run time.&lt;/b&gt;&amp;nbsp; By then, the file doing the &lt;code&gt;require&lt;/code&gt; has already been compiled &lt;i&gt;without &lt;/i&gt;the constants set, and does not get recompiled after the &lt;code&gt;require&lt;/code&gt; completes.&amp;nbsp; One quick patch is to use &lt;code&gt;BEGIN { require "foo.pl"; }&lt;/code&gt; which forces the &lt;code&gt;require&lt;/code&gt; to occur at compile time, so that the constants are defined in time for them to actually be useful.&lt;br /&gt;&lt;br /&gt;A third problem is one of paths.&amp;nbsp; To take a real-world example, if a Web server is set up to serve &lt;code&gt;*.pl&lt;/code&gt; as CGI scripts, and this Web server changes directory to that of the running script, then the scripts loaded with code like &lt;code&gt;require '../lib/site.pl';&lt;/code&gt; cannot readily find the path to require more code from the lib directory themselves.&lt;br /&gt;&lt;br /&gt;A fourth problem is one of Perl's magic: &lt;code&gt;require&lt;/code&gt; records an entry in &lt;code&gt;%INC&lt;/code&gt; after loading the file, which means that the same file required twice will not be loaded the second time.&amp;nbsp; If you &lt;i&gt;do &lt;/i&gt;try to load the same file twice using the same path name, this is almost certainly not what you want to happen.&amp;nbsp; Likewise, if you happen to load the same file twice by using &lt;i&gt;different &lt;/i&gt;path names, when you expected to be able to load it only once, this can also result in undesirable effects, depending on how the loaded file is 
written.&lt;br /&gt;&lt;br /&gt;There are probably more problems, but these are the ones I know of.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Doing it properly: "require Module" and "use Module"&lt;/h3&gt;The &lt;code&gt;use SomeModule&lt;/code&gt; form imposes some restrictions on your code: the name you give it must be a bareword (an unquoted identifier, more or less); the file must be findable in the include path, &lt;code&gt;@INC&lt;/code&gt;, after converting double-colons to path separators; it must be named with a &lt;code&gt;.pm&lt;/code&gt; extension; and it should define the package SomeModule with an &lt;code&gt;import&lt;/code&gt; subroutine if &lt;code&gt;use&lt;/code&gt; is to import anything into the caller.&amp;nbsp; (&lt;code&gt;use&lt;/code&gt; quietly skips the import call when the package has no &lt;code&gt;import&lt;/code&gt; method.)&amp;nbsp; For &lt;code&gt;require&lt;/code&gt;, which never calls &lt;code&gt;import&lt;/code&gt;, the sub isn't strictly necessary, but you'd be crazy not to include it.&lt;br /&gt;&lt;br /&gt;All these restrictions combine to make the code-inclusion mechanism more robust.&amp;nbsp; Modules have a canonical name, so they can't be loaded twice under different paths. 
This name resolves to the same path in the filesystem, independently of where the inclusion is initiated, which allows included files to include more files without worrying about the current directory or the path to the files being included.&amp;nbsp; The definition of a package name also creates a unique entry in the global namespace for the code, so that it doesn't need to be loaded more than once.&lt;br /&gt;&lt;br /&gt;The &lt;code&gt;import&lt;/code&gt; subroutine allows for a controlled inclusion of symbols into the caller's namespace, instead of dumping everything non-lexical (including all subroutines) into it.&amp;nbsp; &lt;code&gt;use&lt;/code&gt; has the added benefit of implicitly wrapping BEGIN around the inclusion, so that constants defined in the included module are available for use in the caller without them needing to remember to write a BEGIN of their own.&lt;br /&gt;&lt;br /&gt;Incidentally, you now know why constants are defined with &lt;code&gt;use constant&lt;/code&gt;: the &lt;code&gt;constant&lt;/code&gt; pragma wouldn't be able to do its job if it wasn't running at compile time.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Managing the Conversion from "require 'file.pl'" to "use SomeModule"&lt;/h3&gt;However, extolling the virtues of this system doesn't help you much if you have a codebase that relies on &lt;code&gt;require 'foo.pl';&lt;/code&gt; and its implicit export.&amp;nbsp; But, it's possible to write a file which can be easily converted from wrong to right.&amp;nbsp; Start by  organizing the code into a package (copying the file to a '.pm' if necessary), then add some bridge code at the beginning to 'export' subs that simply call into the package:&lt;br /&gt;&lt;blockquote style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;sub frobnicate { &amp;amp;SomeModule::frobnicate; }&lt;br /&gt;package SomeModule;&lt;br /&gt;sub frobnicate {&lt;br /&gt;&amp;nbsp; die "zomg, you didn't code this function";&lt;br 
/&gt;}&lt;/blockquote&gt;Now, loading this code via &lt;code&gt;require "../path/to/SomeModule.pm";&lt;/code&gt; will create a frobnicate function that simply dies. The &lt;code&gt;&amp;amp;Function;&lt;/code&gt; syntax with the ampersand and without the parentheses (both details are important!) will call the function with the caller's current @_ passed straight through, so the callee sees the very same array; this is usually not recommended, but we're doing dastardly things here, and this is just a shim that's not running any other code.&amp;nbsp; If SomeModule::frobnicate messes up @_, it doesn't affect the unqualified frobnicate, because it never uses the value itself.&amp;nbsp; It just returns the result back out, using Perl's implicit return feature.&lt;br /&gt;&lt;br /&gt;When it comes time to make SomeModule available for &lt;code&gt;use&lt;/code&gt;, the unqualified frobnicate gets deleted, and appropriate Exporter code gets added to the package.&amp;nbsp; Doing it in two phases like this lets you test that the simple reorganization into a package didn't break any of the callers, and that you have accounted for all the symbols that will need to be exported.&lt;br /&gt;&lt;br /&gt;Then again, this whole section may just be pointless, since you could go straight to using &lt;code&gt;use&lt;/code&gt; and Exporter.&amp;nbsp; It's not like they're all that different from this hack.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;This must be why Steve Yegge hates everything&lt;/h3&gt;Perl has a rich history, starting as a throwaway language with convenient features for the problems Larry Wall was working on at the time.&amp;nbsp; Because the Perl team has put a lot of work into backwards compatibility and convenience, the language has accumulated a lot of subtlety and minor traps.&amp;nbsp; You have to learn a &lt;i&gt;lot&lt;/i&gt; about Perl before everything makes sense, and you're not just sprinkling &lt;code&gt;my&lt;/code&gt; in front of every variable and mumbling, "WTF, why don't I just turn off use strict?&amp;nbsp; This doesn't seem helpful."&lt;br 
/&gt;&lt;br /&gt;Likewise, the other languages he's shared his hate for have evolved quite a bit over time into their current chimerical forms: Javascript and Python both started with global variables and hacked lexical scope in later.&amp;nbsp; PHP probably gets the same complaint, but they added lexical scoping so recently that he hasn't had either time or inclination to blog about them yet, and we know the story anyway.&amp;nbsp; Lastly, C++ was an experiment in grafting a specific view of OOP into a language that was never meant to have it.&amp;nbsp; (For other views, consider &lt;a href="http://www.paulgraham.com/reesoo.html"&gt;Rees Re: OO&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Lexical scope in other languages&lt;/h3&gt;Variables in Python began as members of one of three scopes: local to the current function, global to the current module, or built-in to Python.&amp;nbsp; The module global scope could be accessed using the &lt;code&gt;global&lt;/code&gt; keyword, which is similar in spirit to Perl's &lt;code&gt;our&lt;/code&gt; keyword.&amp;nbsp; Python added something like lexical variables as the &lt;code&gt;nested_scopes&lt;/code&gt; feature, which allowed for a function nested within some other function to see variables of the outer function that weren't in global scope.&amp;nbsp; However, all variable writing was assumed to be local, so those outer variables were effectively read-only.&amp;nbsp; Some hacks arose around this, like pointing a variable in the outer function to a writable object, allowing for communication via object updates, but in Python 3, the &lt;code&gt;nonlocal&lt;/code&gt; keyword has been introduced to allow for declaring that you want to use a name from the outer (but still not global) scope instead of local scope.&lt;br /&gt;&lt;br /&gt;PHP began with two scope levels.&amp;nbsp; Variables were either global or local to a function, with globals accessible from a function using the &lt;code&gt;global&lt;/code&gt; keyword, or at 
some point after that, through the &lt;code&gt;$GLOBALS&lt;/code&gt; superglobal.&amp;nbsp; (Superglobals' main difference from globals is that they're pre-defined to be always global, so a function can use them without having to declare them with &lt;code&gt;global&lt;/code&gt; first.)&amp;nbsp; This all changed with PHP 5.3.0, which introduced anonymous functions and the ability to use (close over) variables outside the function.&lt;br /&gt;&lt;br /&gt;Unlike many other languages, however, PHP went its own way on two points: only anonymous functions may close over variables, and the variables to be closed over must also be listed in the function declaration.&amp;nbsp; For example, write &lt;code&gt;$onUpdate = function  ($x) use (&amp;amp;$y) { ... };&lt;/code&gt; to allow the function to read and write the value of $y from the outer scope.&amp;nbsp; (The differences are probably justified: regular functions which need state should be objects, since PHP offers class-based OOP, and explicit naming of variables fits the philosophy of "it shouldn't be possible to have bugs because something was in an unexpected scope" that drove PHP's original scope design.)&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;In Conclusion&lt;/h3&gt;Lexical variables are visible based on their position in the source code; in Perl, this is from their declaration forward to the end of their block or file.&amp;nbsp; Lexical variables can be skipped over in favor of a global (package) variable, using &lt;code&gt;our&lt;/code&gt;, which has the same area of effect as defining a lexical with &lt;code&gt;my&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;When a subroutine references a lexical variable in an outer scope, the subroutine is said to be "a closure", and it "closes over the variable".&amp;nbsp; This bears no relation to closing doors, emotional closure, or set theory (where, for instance, integers are closed under addition: add any two integers, and the result is still an integer.&amp;nbsp; This is not true of 
division.)&lt;br /&gt;&lt;br /&gt;As of Perl 5.10, &lt;code&gt;state&lt;/code&gt; variables are also possible, and are also private and lexically scoped; however, their value is saved between calls into the scope.&amp;nbsp; &lt;code&gt;my&lt;/code&gt; variables, by contrast, are re-initialized each time their declaration executes. &lt;br /&gt;&lt;br /&gt;The values of dynamically scoped variables are controlled by the call stack; in Perl, this is accomplished by temporarily reassigning a global variable with &lt;code&gt;local&lt;/code&gt;.&amp;nbsp; If you're not using &lt;code&gt;local&lt;/code&gt; on Perl's special interpreter-control variables, then you are (or the code you're trying to affect is) most likely doing it wrong.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;use strict&lt;/code&gt; prevents the fallback from lexicals to globals.&amp;nbsp; If the desired scope isn't selected lexically with &lt;code&gt;my&lt;/code&gt;, &lt;code&gt;our&lt;/code&gt;, or &lt;code&gt;state&lt;/code&gt;, nor fully qualified by a package name (which implies a global variable), then an error is generated.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;require $target&lt;/code&gt; and &lt;code&gt;require "file.pl"&lt;/code&gt; are legacy features that have some unexpected interactions with modern Perl, and some limitations in creating truly reusable modules.&amp;nbsp; Instead, it is much preferred to create a package that can be imported with &lt;code&gt;use&lt;/code&gt;; besides giving the caller control over the imported symbols, this also makes for a more robust module system that doesn't depend on the current working directory to find the modules.&amp;nbsp; (Unless you have &lt;code&gt;use lib '.'&lt;/code&gt;, but that seems inadvisable at best.)&lt;br /&gt;&lt;br /&gt;Python and PHP both have &lt;code&gt;global&lt;/code&gt; keywords, which are similar in spirit to &lt;code&gt;our&lt;/code&gt;, and as of Python 3, the &lt;code&gt;nonlocal&lt;/code&gt; keyword allows for a lexical variable to be written to (instead of 
creating a local variable of the same name).&amp;nbsp; PHP gained closures in version 5.3, but only for anonymous functions, through the optional &lt;code&gt;use (...)&lt;/code&gt; clause on the function declaration.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-2085070703300852692?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/2085070703300852692/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=2085070703300852692&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2085070703300852692'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2085070703300852692'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/02/variable-scope-require-and-use.html' title='Variable scope, require, and use'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-1726494321418715932</id><published>2011-02-19T19:01:00.000-05:00</published><updated>2011-02-19T19:01:43.077-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='wacom'/><category scheme='http://www.blogger.com/atom/ns#' term='customization'/><category scheme='http://www.blogger.com/atom/ns#' term='lucid'/><category scheme='http://www.blogger.com/atom/ns#' 
term='ubuntu'/><title type='text'>Changing a Tablet's Active Area in Ubuntu Lucid</title><content type='html'>The following information applies to Ubuntu 10.04 LTS, Lucid Lynx, with xserver-xorg-input-wacom installed to provide xsetwacom.&amp;nbsp; This is about fine-tuning your tablet; if your tablet isn't working at all, you probably need &lt;a href="https://bugs.launchpad.net/ubuntu/+source/linux/+bug/568064"&gt;bug #568064.&lt;/a&gt; &lt;br /&gt;&lt;br /&gt;There used to be a wacomcpl program to graphically configure a Wacom tablet; this quit working with changes to the upstream project and/or the Tcl dependency, so it hasn't been working for me for some time.&amp;nbsp; Before it quit working, I set up a script to call the xsetwacom command-line program with the desired settings, so the loss didn't affect me.&amp;nbsp; Mainly, I had adjusted the active area so that tracing a circle on the tablet would result in a circular shape on the monitor.&lt;br /&gt;&lt;br /&gt;With a new monitor came a new need to reconfigure the tablet, without using wacomcpl this time.&amp;nbsp; I ultimately created a couple of formulas to make a strip of the tablet inactive.&amp;nbsp; Without further ado, these are the formulas:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;$x_offset = $w - ($h * $aspect) # narrower monitor&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;$y_offset = $h - ($w / $aspect) # wider monitor&lt;/div&gt;&lt;br /&gt;$aspect is the aspect ratio of the monitor, obtained by dividing the number before the colon by the number after it.&amp;nbsp; For example, 16:10 = 16/10 = 1.6.&amp;nbsp; Alternatively, you can divide the width in pixels by the height, so a 2560x1600 display has an aspect of 2560/1600 = 1.6.&amp;nbsp; (This assumes square pixels, which practically everyone has because they're so convenient.)&amp;nbsp; The monitor being narrower or wider refers to 
whether the monitor's aspect is lower or higher than the tablet's, respectively.&amp;nbsp; You can calculate the tablet's aspect by dividing $w by $h; obtaining them is the subject of the next section.&lt;br /&gt;&lt;br /&gt;$w and $h come from the actual tablet, which you can find easily enough.&amp;nbsp; In these commands, $T represents your tablet's name, which you can get from `xsetwacom list  dev`.&amp;nbsp; In my case, there's a tool name attached, so it prints "Wacom  Bamboo 4x5 Pen STYLUS" (among other things) but only the "Wacom Bamboo  4x5 Pen" portion is the actual device name.&amp;nbsp; The first command simply resets the coordinates to cover the full tablet, just in case they have been changed. &lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;xsetwacom set "$T" xyDefault 1&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;xsetwacom get "$T" BottomX&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;xsetwacom get "$T" TopX&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;xsetwacom get "$T" BottomY&lt;/div&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;xsetwacom get "$T" TopY&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;$w is BottomX-TopX, and $h is BottomY-TopY.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Armed with this information, you should now choose the correct formula from above, and substitute all the numbers.&amp;nbsp; In my case, the top coordinates are both 0, so BottomX=$w=14720, and BottomY=$h=9200.&lt;br /&gt;&lt;br /&gt;My old monitor was much narrower (at 1280/1024=1.25) than the tablet (at 14720/9200=1.6), so I used the first formula, thus:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;$x_offset = 14720 - 
(9200*1.25) = 3220&lt;/div&gt;&lt;br /&gt;And to set that value:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;xsetwacom set "$T" TopX 3220&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;My new monitor runs at 1920x1080, which yields 1.7778 for aspect.&amp;nbsp; The monitor is wider than the tablet, so now I need the second formula:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;$y_offset = 9200 - (14720/1.7778) = 920&lt;/div&gt;&lt;br /&gt;Now that the offset is known, it's a simple matter to set up.&amp;nbsp; I just add it to the original TopY value (zero for me, so no different) and set that as the new TopY:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;xsetwacom set "$T" TopY 920&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Altering TopX or TopY means that the inactive portion of the tablet runs down the left or across the top.&amp;nbsp; I don't really care where the dead zone ends up, so I chose the method that results in the fewest calculations needed.&amp;nbsp; You could just as easily set BottomX to &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;BottomX-$x_offset&lt;/span&gt; to move the dead zone to the right side of the tablet, or adjust both TopX and BottomX by half of the $x_offset to keep the active area centered.&amp;nbsp;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-1726494321418715932?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/1726494321418715932/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=1726494321418715932&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1726494321418715932'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/1726494321418715932'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2011/02/changing-tablets-active-area-in-ubuntu.html' title='Changing a Tablet&apos;s Active Area in Ubuntu Lucid'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-4449440297446393621</id><published>2010-12-01T19:59:00.002-05:00</published><updated>2010-12-01T20:09:08.871-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='wallpaper'/><category scheme='http://www.blogger.com/atom/ns#' term='gdm'/><category scheme='http://www.blogger.com/atom/ns#' term='customization'/><category scheme='http://www.blogger.com/atom/ns#' term='gconf'/><category scheme='http://www.blogger.com/atom/ns#' term='lucid'/><category scheme='http://www.blogger.com/atom/ns#' term='ubuntu'/><title type='text'>Customizing Everything I Touch: The Lucid Login Screen</title><content type='html'>This post is purposefully dense, due to my lack of time/typing budget.&amp;nbsp; If you need clarification, please leave a comment.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Directions (shortcut):&lt;/b&gt;&lt;br /&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp; you$ sudo 
-u gdm dbus-launch gnome-appearance-properties -p background&lt;/span&gt;&lt;/span&gt;&lt;b&gt; &lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Directions (original): &lt;/b&gt;&lt;br /&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp; you$ sudo -u gdm dbus-launch bash&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;span style="font-size: x-small;"&gt;&amp;nbsp; gdm$ cp wallpaper.png /var/lib/gdm&lt;/span&gt;&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;span style="font-size: x-small;"&gt;&amp;nbsp; gdm$ gconf-editor&lt;/span&gt;&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;span style="font-size: x-small;"&gt;&amp;nbsp; (set /desktop/gnome/background/picture_filename to /var/lib/gdm/wallpaper.png)&lt;/span&gt;&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;span style="font-size: x-small;"&gt;&amp;nbsp; gdm$ exit&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;The changes will take effect once all users log out (or possibly, the user on :0.)&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Commentary:&lt;/b&gt;&lt;br /&gt;The directions above copy the wallpaper to /var/lib/gdm (which is the home directory of the gdm user) to protect it against  corruption.&amp;nbsp; If you don't care, you can run the single command `&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;sudo -u gdm dbus-launch gconf-editor&lt;/span&gt;` and set picture_filename to any file on the system.&lt;br /&gt;&lt;br /&gt;In either case, dbus-launch is necessary so that there's a session bus running as the gdm user, which gconf needs to successfully run.&amp;nbsp; Otherwise, AFAICT, it tries to connect to your own session bus, which the gdm user isn't allowed to do.&lt;br /&gt;&lt;br 
/&gt;This edits the keys in /var/lib/gdm/.gconf, a config source set by /var/lib/gdm/gconf.path, which is wedged into the regular paths by inclusion from /etc/gconf/2/path.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Validity:&lt;/b&gt;&lt;br /&gt;Tested on Ubuntu 10.04 LTS, Lucid Lynx; most likely, this   applies to 9.10/Karmic and later Gnome 2.x-based Ubuntu releases.&amp;nbsp; (IIRC, Karmic is when the new, non-themable GDM landed in Ubuntu.)&amp;nbsp; The general philosophy probably applies to other contemporary Gnome environments, but the details may differ as there's a lot of Debian-ness in the GConf paths.&lt;b&gt; &lt;/b&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-4449440297446393621?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/4449440297446393621/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=4449440297446393621&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/4449440297446393621'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/4449440297446393621'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2010/12/customizing-everything-i-touch-lucid.html' title='Customizing Everything I Touch: The Lucid Login Screen'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-2812880238560523679</id><published>2010-10-07T21:01:00.008-04:00</published><updated>2010-10-10T10:59:35.149-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='autostart'/><category scheme='http://www.blogger.com/atom/ns#' term='gnome'/><category scheme='http://www.blogger.com/atom/ns#' term='xdg'/><title type='text'>Autostart in Gnome: the missing docs [Updated 10/10]</title><content type='html'>I've been trying to understand the autostart mechanism in Gnome 2 for a small program I'm working on. &amp;nbsp;This may continue to be a supported system in gnome-3, since it seems to be &lt;a href="http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html"&gt;standardized by freedesktop.org&lt;/a&gt; and not the Gnome "Let's just rewrite around our bugs and drop features" team.&lt;br /&gt;&lt;br /&gt;Without further ado or bitterness, here's a brief but technical dive into the modern autostart system on Gnome 2, as observed on a Lucid Lynx system (originally installed as Intrepid Ibex, I believe.) &amp;nbsp;[&lt;b&gt;Updated:&lt;/b&gt;&amp;nbsp;autostart itself &lt;a href="http://standards.freedesktop.org/autostart-spec/autostart-spec-latest.html"&gt;is also a freedesktop standard.&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;The autostart system is built on the aforementioned XDG Base Directory specification. &amp;nbsp;Inside the config dirs are directories named "autostart", and these directories are filled with .desktop files. &amp;nbsp;By default, a desktop file in one of these directories represents an application to be auto-started. 
&amp;nbsp;The file may override this by including the line, "&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;X-GNOME-Autostart-enabled=false&lt;/span&gt;". &amp;nbsp;[10/10: Actually, according to the autostart spec, it should use a line "&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Hidden=true&lt;/span&gt;" to accomplish this. &amp;nbsp;Furthermore, an application will not be autostarted by Gnome if the desktop entry contains an &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;OnlyShowIn&lt;/span&gt; key without &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;GNOME&lt;/span&gt; in the value, or a &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;NotShowIn&lt;/span&gt; key which includes &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;GNOME&lt;/span&gt;. &amp;nbsp;This lets KDE add autostarts for its services without disturbing Gnome, because &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;OnlyShowIn=KDE&lt;/span&gt; is set.]&lt;br /&gt;&lt;br /&gt;So, there are places like /usr/share/gnome/autostart, /usr/share/autostart, and /etc/xdg/autostart that contain these apps for the system. &amp;nbsp;Then, the ~/.config/autostart directory (where ~/.config is the default value for $XDG_CONFIG_HOME) contains any modifications to the system files made in the Startup Applications preferences window (on the System &amp;gt; Preferences menu).&lt;br /&gt;&lt;br /&gt;I suspect that if your only modification in the preference pane is to turn off autostart for a command, then the system adds a .desktop file for the command to your home dir that disables the autostart. 
&amp;nbsp;If you then turn the command back on, the system must be &lt;i&gt;deleting your personal .desktop file&lt;span class="Apple-style-span" style="font-style: normal;"&gt;&amp;nbsp;to revert to the system-wide one.&lt;/span&gt;&lt;/i&gt;&amp;nbsp;&amp;nbsp;This accounts for reports of "~/.config/autostart isn't right, it only contains things that don't autostart." &amp;nbsp;I'm pretty sure I looked into this system in the past, briefly, and thought the same thing.&lt;br /&gt;&lt;br /&gt;If you add a custom item like freshwall to the Startup Applications preferences, then it creates a .desktop file in ~/.config/autostart with the configured command, but no disable-autostart line. &amp;nbsp;So this autostart directory must be the correct one. &amp;nbsp;I see that &lt;a href="http://www.dropbox.com/referrals/NTYyOTk1MDE5"&gt;Dropbox&lt;/a&gt; also leaves a .desktop file in there. &amp;nbsp;(And if you sign up through that link, you get an extra quarter gig of space.)&lt;br /&gt;&lt;br /&gt;Obviously, the autostart directories are searched in a specific order, and the first matching filename wins, with the user-specific directories scanned before the system-wide ones. &amp;nbsp;I don't know if all the autostart services run in any defined order, or in parallel.&lt;br /&gt;&lt;br /&gt;Lastly, these autostart entries are hooked into the Gnome session manager. &amp;nbsp;The session manager sets the DESKTOP_AUTOSTART_ID environment variable when launching the desktop, and a session-aware program should use this ID as the client ID when connecting to the session manager. &amp;nbsp;Note that this is not XSMP, but a Gnome-specific service exported via D-Bus on the session bus under the name org.gnome.SessionManager. 
&amp;nbsp;(There is another post planned on this session manager interface; but if you want to dig into it yourself, I recommend &lt;a href="http://live.gnome.org/DFeet/"&gt;D-Feet&lt;/a&gt;, which is &lt;a href="http://packages.ubuntu.com/lucid/d-feet"&gt;packaged for ubuntu&lt;/a&gt;. &amp;nbsp;[10/10: Ubuntu also documents this interface at /usr/share/doc/gnome-session/dbus/gnome-session.html, but I can't find it anywhere on the Web to provide a link.])&lt;br /&gt;&lt;br /&gt;To summarize: autostart relies on the XDG Base Directory specification; items in ~/.config/autostart override items in the system directories, in accordance with the spec; and a .desktop file containing the&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;X-GNOME-Autostart-enabled=false&lt;/span&gt;&amp;nbsp;line represents a command that will not be autostarted by Gnome. &amp;nbsp;An app that's being autostarted by the Gnome session manager, which wants to connect to it, should use the client ID stored in $DESKTOP_AUTOSTART_ID.&lt;br /&gt;&lt;br /&gt;References for this post:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html"&gt;standards.freedesktop.org/basedir-spec/basedir-spec-latest.html&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://standards.freedesktop.org/autostart-spec/autostart-spec-latest.html"&gt;standards.freedesktop.org/autostart-spec/autostart-spec-latest.html&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://live.gnome.org/SessionManagement/GnomeSession"&gt;live.gnome.org/SessionManagement/GnomeSession&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-2812880238560523679?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://sapphirepaw.blogspot.com/feeds/2812880238560523679/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=2812880238560523679&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2812880238560523679'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/2812880238560523679'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2010/10/autostart-in-gnome-missing-docs.html' title='Autostart in Gnome: the missing docs [Updated 10/10]'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-9086825264365048746</id><published>2010-10-03T20:59:00.005-04:00</published><updated>2010-10-04T20:29:49.755-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='meta'/><title type='text'>Groupthink</title><content type='html'>I don't fit in; I never have. &amp;nbsp;Some of it is by my choice, but in the tech communities that I've found, I &lt;i&gt;still &lt;/i&gt;don't fit in, and I've often wondered why. &amp;nbsp;It seems the short answer is right there in my blog description: "Frequently contrarian and occasionally cynical."&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;I tend to find communities once they get large enough to have formed opinions on things, and then I come in to disagree. 
&amp;nbsp;&lt;a href="http://sapphirepaw.blogspot.com/2010/08/on-xml-and-data-formats.html"&gt;On XML and Data Formats&lt;/a&gt;&amp;nbsp;was essentially written to disagree with Reddit's opinion that JSON is Super Awesome. &amp;nbsp;Which isn't too bad, in and of itself, because such an extreme opinion on Reddit's part probably &lt;i&gt;is &lt;/i&gt;wrong, even if I have all the wrong reasons why. &amp;nbsp;But over time, especially if I want to post frequently, I run the risk of disagreeing just to appear different and unique, even if by chance Reddit or HN happened to be right this time. &amp;nbsp;(These communities tend to produce pressure in that direction anyway, since they hate "Me too" posts.)&lt;br /&gt;&lt;br /&gt;I've also wondered about people's tendency to disagree with me even when what I've said is provably correct. &amp;nbsp;Someone will ask a question, and get three people answering with (also provably correct) method A. &amp;nbsp;So I come in and mention that they might prefer method B. &amp;nbsp;Finally, someone replies to me that instead of that, they could use method A. &amp;nbsp;&lt;i&gt;Well, obviously; it was mentioned three times already.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;But, I think I've figured out that pattern, too. &amp;nbsp;When the group has reached consensus on one thing, anyone offering an alternative (which is me, since I speak up when I have something Different and Awesome to say) threatens the group's cohesion, and the outsider must either be driven off or accept the group's opinion.&lt;br /&gt;&lt;br /&gt;I noticed this happening one day to someone else when I dropped by&amp;nbsp;&lt;a href="http://www.subaruforester.org/vbulletin/f151/when-will-forester-cvt-showrooms-68281/"&gt;a prominent Subaru Forester forum&lt;/a&gt;; except for a Subaru salesman who said that people who drove it loved the CVT, the whole forum piled on to argue that &lt;i&gt;anyone &lt;/i&gt;buying a CVT is a complete and utter moron. 
&amp;nbsp;Of course, the regular forum members are &lt;b&gt;enthusiasts.&lt;/b&gt;&amp;nbsp;&amp;nbsp;To them, anything but a manual is evil, because they mess up your starting, drifting, and anything else you're doing out there in the mud; meanwhile, everyday drivers are snapping up CVTs off the sales lot. &amp;nbsp;It's not that one point of view is invalid, or that the forum members don't realize they're talking about different scenarios—it's the existing group establishing solidarity against non-enthusiast intruders.&lt;br /&gt;&lt;br /&gt;Experience is funny like that. &amp;nbsp;When it's not personal, it makes the situation so much clearer. &amp;nbsp;And once again, as I gain more experience, I come to appreciate its value more.&lt;br /&gt;&lt;br /&gt;In order to avoid wasting time with them, and to pursue Happiness at the expense of Strife, I have blocked the tech news sites I frequent on my computer. &amp;nbsp;And when I blog in the future, I want it to be something actually interesting and perhaps in-depth, like that post on&amp;nbsp;&lt;a href="http://www.sapphirepaw.org/blog/index.php?/archives/31-The-Future-Toolkit.html"&gt;the future toolkit&lt;/a&gt;&amp;nbsp;at my old blog. 
&amp;nbsp;Not anti-reddit stuff that they're not going to see, read, or agree with anyway.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-9086825264365048746?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/9086825264365048746/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=9086825264365048746&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/9086825264365048746'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/9086825264365048746'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2010/10/groupthink.html' title='Groupthink'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-5527835956694264579</id><published>2010-09-18T20:52:00.000-04:00</published><updated>2010-09-18T20:52:18.674-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='schedule'/><title type='text'>Post Schedule Reduction</title><content type='html'>I have decided to post high-quality work when it is ready. 
&amp;nbsp;Writing weekly is making me rush out junk, which conflicts with my primary goal of improving my writing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-5527835956694264579?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/5527835956694264579/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=5527835956694264579&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/5527835956694264579'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/5527835956694264579'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2010/09/post-schedule-reduction.html' title='Post Schedule Reduction'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-8317419552599923036</id><published>2010-09-14T19:31:00.000-04:00</published><updated>2010-09-14T19:31:29.196-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='economics'/><category scheme='http://www.blogger.com/atom/ns#' term='ambiguity'/><category scheme='http://www.blogger.com/atom/ns#' term='music'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='ui'/><category 
scheme='http://www.blogger.com/atom/ns#' term='apple'/><title type='text'>Dealing with Ambiguity</title><content type='html'>Apple and their fans tend to view their products as the top of the market, with price and attitude to match (and this is helped by their competitors trying to undercut them with unrefined but cheap ripoffs). &amp;nbsp;Yet they clearly market heavily, which suggests according to&amp;nbsp;&lt;a href="http://www.sapphirepaw.org/blog/index.php?/archives/3-The-Advertising-Curve.html"&gt;the Advertising Curve&lt;/a&gt;&amp;nbsp;that there's room above Apple for a better product, for even more money.&lt;br /&gt;&lt;br /&gt;Which naturally leads to the question: considering an Apple product like the iPod Touch, what would be better, and in particular, enough better that you could actually sell them?&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;h4&gt;My Ideal Player&lt;/h4&gt;The ideal music player would be able to connect wirelessly to access the music store, in order to buy or retrieve new music. &amp;nbsp;It would support backing up and restoring music. &amp;nbsp;To provide the ultimate choice in music, it would allow me to download a plugin to allow access to any music store, like Amazon or Magnatune. &amp;nbsp;It would be able to be charged by induction. &amp;nbsp;Finally, if it had Bluetooth to support any of these features, it would also support playing audio over Bluetooth.&lt;br /&gt;&lt;br /&gt;In other words, my ideal player hardware is... an iPod Touch with an induction power input. &amp;nbsp;It has all the wireless capabilities, and the rest is software.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;The Problem with Software&lt;/h4&gt;It's always terrible. &amp;nbsp;Today, Exaile is showing me two separate "Unknown" albums in the album list—one which sorts in with U in the normal way, and one which sorts as a separate letter after Z. 
&amp;nbsp;Also, there has never been a way to roll all the DDR-related albums ("DDR 5th Mix", "DDR PSX Exclusives", etc.) into a single, virtual DDR album.&lt;br /&gt;&lt;br /&gt;But that's complicated stuff. &amp;nbsp;Before settling on Exaile on Linux and Foobar2000 on Windows, I have run into plenty of players that couldn't understand multi-disc collections, either sorting by disc number alone (tracks randomized), or track number alone (alternating discs). &amp;nbsp;Again with media players, the ability to stop playing at the end of the current track is a rare feature. &amp;nbsp;With RSS readers, I quit reading feeds entirely when I quit using KDE, because &lt;i&gt;nothing &lt;/i&gt;else offered a stable, convenient reader with a "Next Unread" key.&lt;br /&gt;&lt;br /&gt;Even the hallowed iTunes from the Temple of Apple falls short of letting a user do what they need to do. &amp;nbsp;If I make a podcast, but don't author it so that iTunes &lt;i&gt;knows &lt;/i&gt;it's supposed to be a podcast, there's no way to convince it to show as a podcast in iTunes or on the Nano. &amp;nbsp;Don't worry; we are smarter than you.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;The Problem with Customers&lt;/h4&gt;People love cheap.&lt;br /&gt;&lt;br /&gt;Even if the iPod Touch is my ideal hardware, I'm still uncertain about the price tag. &amp;nbsp;I could settle for a non-Apple player for that kind of money, and have enough left over for a Kindle. &amp;nbsp;It's sufficiently hard to estimate the value of these things, without access to any of them, that it makes it quite difficult to decide whether I would, in fact, be satisfied with owning one for that price. &amp;nbsp;This estimation of value is not helped by the apparent inability to find what apps are even &lt;i&gt;available &lt;/i&gt;in the app store without already owning the device. 
&amp;nbsp;It may have an app for what I want (direct purchase from Amazon), but this information is hidden.&lt;br /&gt;&lt;br /&gt;Nor is the value enhanced by the fact that I wouldn't normally be able to run iTunes on my Linux machine, and Apple has probably changed the interface again to break compatibility with Linux iPod software. &amp;nbsp;Assuming, of course, that any free software even &lt;i&gt;supported &lt;/i&gt;the previous iPod Touch models.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;The Sum of the Vectors&lt;/h4&gt;Since people don't like to pay for software, and stable, functioning-as-designed software is expensive enough as it is, it's rare to find anyone putting more effort into their software. &amp;nbsp;When faced with the messy vagueness of the Real World, it's so much easier to file it under Unknown than to work at all on designing or implementing a better solution.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-8317419552599923036?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/8317419552599923036/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=8317419552599923036&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8317419552599923036'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8317419552599923036'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2010/09/dealing-with-ambiguity.html' title='Dealing with Ambiguity'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-353000558077988673</id><published>2010-09-05T20:57:00.002-04:00</published><updated>2010-09-05T20:57:00.285-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='first mover'/><category scheme='http://www.blogger.com/atom/ns#' term='science'/><category scheme='http://www.blogger.com/atom/ns#' term='markets'/><category scheme='http://www.blogger.com/atom/ns#' term='dvorak'/><title type='text'>Bias</title><content type='html'>The topic of alternative keyboard layouts inevitably comes up from time to time on programmers' forums: is Dvorak really better than Qwerty?&amp;nbsp; Is it objectively proven?&amp;nbsp; If nobody has &lt;i&gt;proven &lt;/i&gt;it, why would anyone switch?&amp;nbsp; Also if it's unproven, why aren't there any satisfactory studies?&lt;br /&gt;&lt;br /&gt;It is this latter point that fascinates me.&amp;nbsp; We have studies by August Dvorak, inventor of the Dvorak keyboard layout, which purportedly show that it's faster than Qwerty.&amp;nbsp; We also have studies by Strong, commissioned by the GSA in 1956, which claim to show the opposite; supporters of the Dvorak keyboard claim that Strong was biased in favor of Qwerty, and most accept that Dvorak may have been biased in favor of his own creation.&lt;br /&gt;&lt;br /&gt;How does one design and execute an &lt;i&gt;unbiased &lt;/i&gt;study in a world where &lt;i&gt;only the biased even &lt;b&gt;care &lt;/b&gt;&lt;/i&gt;about the outcome?&amp;nbsp; If such a study were performed, would the 'losing' side accept the results?&amp;nbsp; Or would the study get ignored (or contested) for all time?&amp;nbsp; If an alternative keyboard were, in fact, proven to be better than the existing choice, how many people 
would actually switch?&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;h4&gt;Other things to consider besides raw metrics&lt;/h4&gt;Right now, Qwerty enjoys an enormous network effect.&amp;nbsp; It's easiest to be proficient on only one keyboard layout at a given time, and  you can &lt;b&gt;effortlessly &lt;/b&gt;type on everyone else's computer that has that particular layout set.&amp;nbsp; (Qwerty on computers enjoys the additional advantage that if you &lt;i&gt;don't &lt;/i&gt;touch-type, you can still type in the letters you want, because they match the letters printed on the keycaps.)&amp;nbsp; The vast majority of "other keyboards" are Qwerty: not only are most people not typing at the computer frequently enough to switch, but the majority of the ones who &lt;i&gt;are &lt;/i&gt;were trained in Qwerty, and continue to use it.&lt;br /&gt;&lt;br /&gt;Also, the majority of other people can only handle Qwerty keyboards.&amp;nbsp; If I loan my computer to someone, I have to remember to set it up as Qwerty for them, or they'll be really annoyed with me. 
&lt;br /&gt;&lt;br /&gt;To back up these wild assertions, I can think of only three other people IRL who actually use Dvorak, and one who tried it until it killed his Qwerty skill, which he needed to repair everyone else's computers.&amp;nbsp; I would consider them all techies, although one of them is more aligned with art and cooking than with machines.&amp;nbsp; This is why I think it's safe to say that only a minority-of-a-minority uses Dvorak.&amp;nbsp; (And it's the popular alternative: I'm the only one I know IRL who has even &lt;i&gt;heard &lt;/i&gt;of &lt;a href="http://en.wikipedia.org/wiki/Maltron_keyboard#Layouts"&gt;Maltron&lt;/a&gt;, &lt;a href="http://mkweb.bcgsc.ca/carpalx/?full_optimization"&gt;QGMLWB&lt;/a&gt;, or &lt;a href="http://www.colemak.com/"&gt;Colemak&lt;/a&gt;.&amp;nbsp; If you don't count me telling Eric about Colemak.)&lt;br /&gt;&lt;br /&gt;Another interesting thing a modern study of Dvorak could consider would be coding: in Dvorak, compared to Qwerty or Colemak, the punctuation is rearranged, and programming text relies a lot more heavily on it than English prose does.&amp;nbsp; I find Dvorak's choices a lot more suitable for typing arrows in PHP and Perl, since they alternate hands, and don't require a same-hand jump from the number row to the bottom row. 
&amp;nbsp;Although that means other frequently-used punctuation, like brackets and braces, moves up into the number row, those characters don't seem to run into the same slowdowns as the arrows did.&lt;br /&gt;&lt;br /&gt;Finally, the actual keyboard in use may affect the results.&amp;nbsp; My wife has difficulty typing on my MS Natural 4000.&amp;nbsp; (I had difficulty typing on it with the Ergonomist Approved Reverse Slope board installed, which also made my wrists hurt worse than they ever had before.&amp;nbsp; Quite possibly, I needed a drafting chair to keep it from pushing my wrists into the sky.)&amp;nbsp; Any modern test of keyboarding prowess is going to have to let people use their own keyboards, or give them time to train exclusively on the standardized keyboard hardware used in the test.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;The market argument&lt;/h4&gt;Another reason Dvorak vs. Qwerty seems to come up a lot is that it is the primary example of a technology that is considered technically superior yet has failed in the marketplace.&amp;nbsp; It's the quintessential example of the first-mover advantage leading to the entrenchment of Qwerty, and of the fact that technical superiority doesn't matter.&lt;br /&gt;&lt;br /&gt;Unfortunately, neither of these theories is a law of the universe.&amp;nbsp; There are plenty of examples of the first mover failing.&amp;nbsp; Just ask Sega.&amp;nbsp; Or perhaps Apple: the company that brought WIMP environments to the masses, only to be out-executed by the Amiga for a while, and then the PC.&amp;nbsp; (Apple got the last laugh vs. 
the Amiga, though: they &lt;i&gt;successfully &lt;/i&gt;switched to PowerPC, to OS X, and once &lt;i&gt;again&lt;/i&gt; to Intel.&amp;nbsp; They also are managing/have managed the 64-bit transition quite smoothly, at least from my vantage point outside the system.)&amp;nbsp; For all their prowess, though, the only Apple product leading the market years after introduction is the iPod—it's too soon to tell for the iPhone, although their marketshare is uncertain in the face of Android.&amp;nbsp; The iPod was in the unique position of being both desirable and DRM-locked, so again we see other factors at play besides "first" or "best".&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;A prediction&lt;/h4&gt;I doubt that any sound study of Dvorak is going to show such incredible gains in speed as Dvorak's studies did, but on the other hand, I don't think that they're going to show incredible losses, either. &amp;nbsp;Such a study would also need to take self-reported comfort into account, in case Dvorak typists are self-limiting in speed in order to self-limit pain.&amp;nbsp; Life is rarely perfect and simple.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-353000558077988673?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/353000558077988673/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=353000558077988673&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/353000558077988673'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/353000558077988673'/><link rel='alternate' type='text/html' 
href='http://sapphirepaw.blogspot.com/2010/09/bias.html' title='Bias'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-814804856635554404</id><published>2010-08-29T20:59:00.001-04:00</published><updated>2011-11-14T16:13:26.848-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='templates'/><category scheme='http://www.blogger.com/atom/ns#' term='php'/><category scheme='http://www.blogger.com/atom/ns#' term='pagelib'/><category scheme='http://www.blogger.com/atom/ns#' term='magic'/><category scheme='http://www.blogger.com/atom/ns#' term='history'/><title type='text'>Pagelib Retrospective</title><content type='html'>I have always been in love with magic: one of the first languages I taught myself once I was old enough to understand what I was doing was Perl, and it fit me like a skin-tight sci-fi spacesuit that we thought we were all going to wear in the future, back in the 1980's.&amp;nbsp; Steve Yegge's &lt;a href="http://sites.google.com/site/steveyegge2/ancient-languages-perl"&gt;rant about Perl&lt;/a&gt; was entertaining, but I can't say as I identified with it much, since references weren't hard for me, nor do I have his problem of forgetting everything when I don't use the language for a while.&amp;nbsp; Maybe I don't drink enough.&lt;br /&gt;&lt;br /&gt;Still, being in love with magic and &lt;acronym title="Do What I Mean"&gt;DWIM&lt;/acronym&gt; doesn't mean that there aren't any hazards.&amp;nbsp; It's rather like fairy-tale magic, which can result in nasty, surprising, unexpected consequences if used 
without the utmost care.&amp;nbsp; This is a story about me exercising a &lt;i&gt;complete and utter lack&lt;/i&gt; of care, utmost or otherwise.&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;Pagelib was a templating system I devised, which used straight PHP in the templates, as well as being embedded in a larger PHP system.&amp;nbsp; One of pagelib's design goals was to give the template programmer (also myself, and sometimes Brian) &lt;b&gt;nigh ultimate power!&lt;/b&gt; to create without limits.&amp;nbsp; Because as everyone knows, limitations are absolutely horrible.&lt;br /&gt;&lt;br /&gt;The principle of operation of the system was simple: the caller set up some variables, and maybe a callback if the template being invoked needed one, and then launched the template.&amp;nbsp; Of course, this meant that the caller had to know everything the template could possibly want, as well as whether it would ask for another template to be embedded.&amp;nbsp; If it did, then the callback would need to launch another template with another set of variables.&amp;nbsp; Theoretically, it could also set up another callback to continue to darker depths, but I'm not sure this capability was actually used.&lt;br /&gt;&lt;br /&gt;Meanwhile, I didn't really understand proper template design, so instead of passing in a domain object and letting the template pull data from it, I extracted and computed all the property values that the template would need, and assigned them all to individual variables.&amp;nbsp; Neither "&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;$user-&amp;gt;first $user-&amp;gt;last&lt;/span&gt;" nor "&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;$user-&amp;gt;getFullname()&lt;/span&gt;" would sully &lt;i&gt;my &lt;/i&gt;code!&amp;nbsp; "&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;$user_fullname&lt;/span&gt;" all the way!&amp;nbsp; An unfortunate consequence of this decision was that the 
code naturally wanted to be "&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;$data['user_fullname']&lt;/span&gt;" because the variables were given to the page as an associative array in &lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;$data&lt;/span&gt;.&amp;nbsp; But I didn't like typing &lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;$data['']&lt;/span&gt; all the time, so the templates all called &lt;a href="http://us2.php.net/extract" style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;extract&lt;/a&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;($data)&lt;/span&gt; at the beginning.&lt;br /&gt;&lt;br /&gt;Pagelib was also designed as a strictly one-pass system, since I hadn't yet convinced myself of the need for multiple passes to allow fully self-contained modules. As a result, the uppermost caller needed to set up any CSS or script inclusions that any template would need, totally breaking the self-contained modularity that the lower templates were supposed to provide.&amp;nbsp; Which is also (partly) why all the CSS for individual pages ended up in one giant CSS file. 
&lt;br /&gt;&lt;br /&gt;Finally, I'm pretty sure that pagelib took some measures to work around the fact that calling include() in PHP runs in the context of the code calling include, which means it will have access to &lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;$this&lt;/span&gt; or &lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;self::&lt;/span&gt; if called from inside a class method.&amp;nbsp; This is why it only offered &lt;i&gt;nigh &lt;/i&gt;ultimate power, rather than the real thing.&lt;br /&gt;&lt;br /&gt;Everything about pagelib makes me cringe today.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Ur Doin it Wrong&lt;/h4&gt;The variables being passed to a pagelib template were effectively hidden: only the template to be called next was being constructed at any time, so variables to nested templates were completely invisible.&amp;nbsp; Even from the point of view of a template, only the variables passed by the given execution path were visible.&amp;nbsp; Of course there was no documentation, because we could always &lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;grep&lt;/span&gt; the code... 
right?&lt;br /&gt;&lt;br /&gt;The power exposed by the &lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;doCallback()&lt;/span&gt; system also turned out to be basically useless.&amp;nbsp; In almost every case, the callback was no more complicated than a simple "include page-specific template" instruction, but since this happened in the Ultimately Powerful Callback, the code was essentially six lines of boilerplate everywhere.&amp;nbsp; The template could even pass arguments into doCallback, which would be passed along to the actual callback function, but that turned out to be even more useless.&amp;nbsp; The upper level page knew everything anyway, since it had to set up the callback, so why bother passing a message from it to itself via the template?&amp;nbsp; Yo dawg....&lt;br /&gt;&lt;br /&gt;I'm not entirely convinced that PHP was the best choice of template language, either.&amp;nbsp; Walking that path either requires short tags to be on, or litters the template with &lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;lt;?php echo ?&amp;gt;&lt;/span&gt; every time a variable needs to be printed.&amp;nbsp; And even with short tags, the bracket-question combinations are somewhat difficult to type so frequently.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Pagelib's Legacy&lt;/h4&gt;Ultimately, Pagelib and its larger system became a victim of the &lt;a href="http://en.wikipedia.org/wiki/Software_Peter_principle"&gt;software Peter principle&lt;/a&gt;.&amp;nbsp; Quite scorched, and with newfound humility, I designed a much simpler system (unimaginatively named "Output") for my next and final project with that employer.&amp;nbsp; Output was still a single-pass system, so the caller of a template still needed to know that it had a slot for including another template.&lt;br /&gt;&lt;br /&gt;But this time, the association was done by name, not implicitly in the control flow.&amp;nbsp; Also, Output's variable system was designed so that each template 
had its own set of local variables, and inherited its parents' as well.&amp;nbsp; So when the top-level code assigned variables to a template, it was clear where they would be visible, and that they were being made available.&lt;br /&gt;&lt;br /&gt;Output's simplicity and clarity were the seed of my acceptance of the Zen of Python.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-814804856635554404?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/814804856635554404/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=814804856635554404&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/814804856635554404'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/814804856635554404'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2010/08/pagelib-retrospective.html' title='Pagelib Retrospective'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-7041763679979495171</id><published>2010-08-22T20:56:00.044-04:00</published><updated>2010-08-22T20:56:00.589-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cross-platform'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><title 
type='text'>Quickie: The Cross Platform Quandry</title><content type='html'>When an app runs on multiple platforms, it seems like there are two basic designs that it can choose to follow.&amp;nbsp; Either it tries to integrate with each platform, or it tries to appear the same on each platform.&lt;br /&gt;&lt;br /&gt;It's not clear to me that either of these ways is correct.&amp;nbsp; Obviously, in the case where the differences between the platforms are hidden to the greatest degree possible, the app doesn't match the look and feel of platform-native applications.&amp;nbsp; This is the situation with old Mozilla builds that drew virtually all the UI themselves: the chosen GTK+ theme had no effect on Linux.&amp;nbsp; An app that looks "the same everywhere" can look like it belongs with at most one platform.&lt;br /&gt;&lt;br /&gt;Meanwhile, since Mac users in particular are vocal about their dislike for using non-Mac-like apps, most modern cross-platform apps have switched to trying to integrate with each platform.&amp;nbsp; Firefox has a different theme and adjusts its button order on the various platforms; likewise, GIMP for Windows switches its button order to match the Windows custom of putting Cancel in the corner instead of OK.&lt;br /&gt;&lt;br /&gt;As a habitual user of Firefox, GIMP, and Pidgin on both Linux and Windows, I find this second approach less than ideal for people who frequently switch platforms.&amp;nbsp; In this case, the "same" app behaves differently on the different platforms, and I have to remember which platform I'm on so that I can hit the correct buttons.&amp;nbsp; Of course, I don't always remember, so frequently in Windows GIMP, I spend a couple of minutes selecting the perfect color, only to hit Cancel by mistake.&amp;nbsp; Another frequent source of mistakes is the way that Pidgin's tray icon toggles the buddy list with a double-click on Windows, but a single-click on Linux.&amp;nbsp; I'm frequently 
opening-then-closing Linux Pidgin with a double-click.&lt;br /&gt;&lt;br /&gt;One last problem with writing a cross-platform app for Linux/BSD specifically is that these are not all-in-one platforms.&amp;nbsp; An app like Firefox that's written in GTK and follows the Gnome HIG is &lt;i&gt;still &lt;/i&gt;going to fail to integrate with the KDE desktop on the points where their respective conventions differ.&lt;br /&gt;&lt;br /&gt;Given these difficulties, it's not surprising that most apps aren't cross-platform.&amp;nbsp; The ones which are face the choice of whether they should integrate with platforms and appear different in the details to people who work on multiple platforms, or whether they should look the same everywhere and look alien or second-class on each platform.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-7041763679979495171?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/7041763679979495171/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=7041763679979495171&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7041763679979495171'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/7041763679979495171'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2010/08/quickie-cross-platform-quandry.html' title='Quickie: The Cross Platform Quandry'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-8343307714643687487</id><published>2010-08-15T20:58:00.001-04:00</published><updated>2010-08-15T20:58:00.199-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lisp'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='flexibility'/><category scheme='http://www.blogger.com/atom/ns#' term='dsl'/><title type='text'>Perfectly Flexible</title><content type='html'>Sometimes, a stakeholder doesn't like the idea of requiring a Programmer to make lengthy changes to a project's code if the requirements were to change at some future date. The proposed solution is always simple: The System just needs to be more flexible.&lt;br /&gt;&lt;br /&gt;Having been the unsuspecting Programmer on a couple of these projects, I can report that the simple solution does not work.&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;The typical pattern was for the interested stakeholder to think up a few possible directions that they would like to take their business/software in the next version or two.&amp;nbsp; These would be delivered as kind of soft 'keep this in mind' requirements.&amp;nbsp; Doing so necessitated additional flexibility above what the base system would normally require, and the complexity added by this flexibility increased the amount of code to design and write.&amp;nbsp; In effect, a down payment was made on a future version or two, but only the initial version was actually delivered.&lt;br /&gt;&lt;br /&gt;This situation would have been fine, except that when the future arrives, it has the unpleasant habit of taking a completely unanticipated path.&amp;nbsp; Frequently, this would be one that would violate some key assumption that was &lt;i&gt;never &lt;/i&gt;supposed to be 
possible in the original system. &amp;nbsp;Which, of course, meant that the assumption was coded right into the system with no thought for flexibility at all. &amp;nbsp;The down payments on the other features went to waste, and no time was ultimately saved.&lt;br /&gt;&lt;br /&gt;Of course, the  stakeholder seemed to see this as a failure to be sufficiently flexible.&amp;nbsp; It made me think, though: how much flexibility is required in order to avoid future surprises from causing a major re-engineering effort?&amp;nbsp; What is the ultimate limit that flexibility can be pushed to?&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Lisp again&lt;/h4&gt;It seemed like the best way to make changing the system as easy as possible would be to split the application into layers.&amp;nbsp; In a shopping-cart system, one layer would provide things like invoices, line-items, shipping cost estimation, and so forth; and the next layer up would link these facilities together into a cohesive whole.&amp;nbsp; What would that upper layer consist of?&lt;br /&gt;&lt;br /&gt;Why, that would almost be like &lt;i&gt;source code &lt;/i&gt;to the base-level &lt;i&gt;interpreter, &lt;/i&gt;and all those lovingly-crafted objects like the invoice would appear as something akin to &lt;i&gt;host objects &lt;/i&gt;in Javascript: pre-populated values that provided access to an otherwise unreachable set of functionality.&amp;nbsp; As a speed hack, both layers could hypothetically be in the same language, but then it would take some discipline to keep them cleanly separated.&amp;nbsp; Well, it would've taken my 24-year-old self some discipline, anyway.&lt;br /&gt;&lt;br /&gt;Thinking of these layers as engine and script, I realized I had actually seen this before.&amp;nbsp; These were domain-specific languages, and in a shocking parallel realization, UnrealScript was &lt;i&gt;also &lt;/i&gt;a domain-specific language.&amp;nbsp; That old, oft-repeated advantage about creating mini-languages in Lisp suddenly began 
to take on real meaning, once I could see that these languages were so useful that they were implemented outside of Lisp, too.&lt;br /&gt;&lt;br /&gt;Worse, the Java+XML world that I was always making fun of, being way too smart to bother with such obviously silly nonsense, began to make sense.&amp;nbsp; In the Kingdom of Nouns, Java classes implemented domain-specific semantics in order to execute an XML script.&amp;nbsp; Java's XML facilities provided a standard, runtime-controllable reader for languages written in XML syntax, just like Lisp provided runtime control for its reader of S-expressions.&amp;nbsp; Stupid Java, not being dumb after all.&lt;br /&gt;&lt;br /&gt;After those revelations, it seemed that the ultimate level of flexibility is also a domain-specific language.&amp;nbsp; As long as the necessary primitives are in place, anything computable can be computed in the language.&amp;nbsp; (Although this is sometimes a uselessly academic distinction, if the operations are too hard or cumbersome to be usable.)&lt;br /&gt;&lt;br /&gt;I didn't get to try out my newly-discovered ideas at the time, and I haven't since; back then, I preferred the devil I knew, and in most of the places I have worked, the management has not been nearly so interested in technical awesomeness as simple, maintainable code.&amp;nbsp; I wonder how things would go if I tried to create a DSL today.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-8343307714643687487?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/8343307714643687487/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=8343307714643687487&amp;isPopup=true' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8343307714643687487'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8343307714643687487'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2010/08/perfectly-flexible.html' title='Perfectly Flexible'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-8156985605319766404</id><published>2010-08-08T20:55:00.010-04:00</published><updated>2010-08-08T20:55:00.517-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='data'/><category scheme='http://www.blogger.com/atom/ns#' term='dom'/><category scheme='http://www.blogger.com/atom/ns#' term='xml'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='json'/><title type='text'>On XML and data formats</title><content type='html'>In many discussions of XML, there seems to be a faction of programmers who are completely dead-set against XML. &amp;nbsp;They'll insist on JSON, or YAML, or any other cool technology that isn't  supported in that 5-year-old version of whatever language your company is &lt;i&gt;still running&lt;/i&gt;&lt;i&gt;.&lt;/i&gt;&amp;nbsp;&amp;nbsp;The usual complaints leveled against XML-based formats are verbosity and the complexity of the DOM. 
&amp;nbsp;(Sometimes, leading or trailing whitespace on element contents in pretty-printed XML will bite a project, but this never seems to come up in internet flamewars.)&lt;br /&gt;&lt;br /&gt;The really young, or maybe just incurably naive, programmers will even chime in that anything that can be done in XML can be done, or even done better, in JSON. &amp;nbsp;I even thought this once, until I tried to write a generic XML-to-JSON converter, which showed me how wrong I was.&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;Ultimately, I learned that XML, based on SGML, perhaps with some lessons learned from HTML, is at its heart a &lt;i&gt;document&lt;/i&gt;&amp;nbsp;format, and it makes the most sense when used to mark up &lt;i&gt;document text.&lt;/i&gt;&amp;nbsp;&amp;nbsp;XML tags and attributes contain all sorts of useful metadata, and the angle brackets isolate it from the core text so well that something intelligible may still come out if you strip all the tags and examine nothing but the element content.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Corollary:&lt;/b&gt; if something useful doesn't result after the XML tags are stripped, then XML was not the optimal choice of format. &amp;nbsp;Things like XML-RPC come to mind for this. &amp;nbsp;In those cases, XML was chosen for familiarity more than anything else.&lt;br /&gt;&lt;br /&gt;When I was attempting to write my converter, I started simple. I figured out how to handle &amp;lt;tag attribute="value"&amp;gt;text content&amp;lt;/tag&amp;gt; without collision. In the upper hash table, "tag" would point to another hash table with keys for each attribute, plus a special non-colliding one for the element's content. &amp;nbsp;Then, I asked myself, "What happens if I need to encode '&amp;lt;strong&amp;gt;some &amp;lt;em&amp;gt;great&amp;lt;/em&amp;gt; markup&amp;lt;/strong&amp;gt;'?"&lt;br /&gt;&lt;br /&gt;Answer: My JSON structure would be a reimplementation of the DOM tree. 
&amp;nbsp;To faithfully represent the XML, I would have to know how many children an element had, what those child types were, and what order they belonged in. &amp;nbsp;To extract a given node's text content, I'd have to visit all the children of that node and join their text content together, in order, just like XML. &amp;nbsp;(If the XML API has an innerText call, it's just a convenient way to ask the API to do the exact same task.)&lt;br /&gt;&lt;br /&gt;I was forced to conclude that, if there was no way to convert XML to JSON other than to create a structure that describes every nuance of the XML source, then &lt;b&gt;XML must be strictly more expressive than JSON.&lt;/b&gt;&amp;nbsp; The fact that you &lt;i&gt;can &lt;/i&gt;represent XML in JSON is only interesting in an academic manner, similar to how you &lt;i&gt;can &lt;/i&gt;write programs with a few fundamental operations—but a high-level language with a broad standard library is generally more productive.&lt;br /&gt;&lt;br /&gt;Once the difference between XML and JSON is understood, then it becomes much simpler to assess a problem and determine the correct approach.&amp;nbsp; Are you sending data structures, such as function names and parameter lists, back and forth?&amp;nbsp; JSON-RPC is the more natural choice.&amp;nbsp; Do you need to mark portions of a text with links, formatting, attribution, or other information that spans arbitrary ranges of the text?&amp;nbsp; XML will probably be easier to handle.&lt;br /&gt;&lt;br /&gt;Occasionally, the best solution may be passing a JSON data structure, with one or more values &lt;i&gt;containing &lt;/i&gt;XML as string data, to be parsed separately from the JSON at the destination.&amp;nbsp; Sometimes the cry for "Only one format!" 
for simplicity's sake actually makes the single-format solution more complex overall.&amp;nbsp; Maybe there's a cost to loading both XML and JSON libraries, but in general, costs should be cut only when they have been proven to be excessive; otherwise, it is too easy to be bit wise and word foolish.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;&lt;b&gt;Author's Note:&lt;/b&gt; The term "DOM tree" in this post refers specifically to the DOM as a conceptual representation of an XML structure, not the Java-oriented API of the same name.&amp;nbsp; The DOM API is a terrible way to work with the DOM tree in any other language, and the author's mild fondness for the tree should not, in any circumstance, be mistaken for condoning the API.&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-8156985605319766404?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/8156985605319766404/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=8156985605319766404&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8156985605319766404'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8156985605319766404'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2010/08/on-xml-and-data-formats.html' title='On XML and data formats'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3922219755971684412.post-8768979768666813649</id><published>2010-08-01T20:59:00.006-04:00</published><updated>2010-08-02T20:01:39.367-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lisp'/><category scheme='http://www.blogger.com/atom/ns#' term='fp'/><category scheme='http://www.blogger.com/atom/ns#' term='style'/><category scheme='http://www.blogger.com/atom/ns#' term='oop'/><title type='text'>Functional vs. Object Oriented</title><content type='html'>By the time I reached my first programming job, I had learned a number of languages well enough to program in, and thought of myself as a pretty good programmer.&amp;nbsp; Then I met Brian, who frequently talked about Lisp, but had trouble articulating the advantages himself.&amp;nbsp; (This turns out to be a common problem.)&amp;nbsp; He introduced me to the &lt;acronym title="Structure and Interpretation of Computer Programs"&gt;SICP&lt;/acronym&gt; videos, and I found &lt;a href="http://www.defmacro.org/ramblings/lisp.html"&gt;Slava Akhmechet's "The Nature of Lisp" essay&lt;/a&gt; on defmacro.org at about the same time.&amp;nbsp; Certain things about Perl and Javascript suddenly made sense, filling me with newfound wonder.&lt;br /&gt;&lt;br /&gt;What I thought I knew about &lt;acronym title="Object Oriented Programming"&gt;OOP&lt;/acronym&gt; as taught in college—Encapsulation, Polymorphism, and Inheritance—seemed like a sham.&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;I looked harder at Javascript and came across &lt;a href="http://yuiblog.com/blog/2007/06/12/module-pattern/"&gt;Douglas Crockford's module pattern&lt;/a&gt;, which essentially uses lexical scoping to hide data, instead of an object system.&amp;nbsp; Inheritance, too, turns all weird in Javascript, since it's based on prototypes 
instead of classes, to say nothing of Perl's build-your-own system with the conventional approach of using a hash to store all the fields.&amp;nbsp; Polymorphism seemed to be nothing different than passing first-class functions.&lt;br /&gt;&lt;br /&gt;Perhaps I should go into more detail about that.&amp;nbsp; With polymorphism, the code actually called when a particular object method is invoked depends on the type of object.&amp;nbsp; Different objects that implement the same method signature are interchangeable, although in statically checked languages, they may need to share a base class or interface.&amp;nbsp; But, what is different about this as compared to passing a &lt;i&gt;function &lt;/i&gt;directly?&amp;nbsp; Both cases store different code-to-be-called into the same variable, and both cases have to agree on the call's signature.&amp;nbsp; Objects can have internal state, but passed-in functions can achieve the same by closing over lexical variables.&lt;br /&gt;&lt;br /&gt;The only thing objects really add above a function is the ability to call several different methods on the object.&amp;nbsp; Even so, this feature is easily replicated by passing a hash table instead of a function, where the keys are strings forming method names, and the values are the corresponding function.&amp;nbsp; Again, some languages provide more guarantees about the structure of an object than the structure of data pretending to be an object, but in others, the two are indistinguishable.&amp;nbsp; You wouldn't be warned until runtime that an object method could not be found.&lt;br /&gt;&lt;br /&gt;Functional and OO programming seemed to be equivalent.&amp;nbsp; I demonstrated this by &lt;a href="http://www.sapphirepaw.org/code/"&gt;producing&lt;/a&gt;&lt;span id="goog_1290357521"&gt;  a primitive object system using only Perl's functional programming  constructs, along with a class whose objects reasonably represented  closures in PHP 5.&amp;nbsp; (This was long before PHP 5.3, 
which added actual closures, and also an __invoke method to allow objects to pretend to be functions.)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I eventually discovered &lt;a href="http://paulgraham.com/reesoo.html"&gt;Jonathan Rees' thoughts&lt;/a&gt; on the meaning of object orientation, which justified my appraisal of OOP-as-it-was-taught-to-me as the Emperor's new clothes.&amp;nbsp; It agreed that &lt;i&gt;that &lt;/i&gt;kind of OOP was not the only possible manifestation, nor was it necessarily the One True Way.&lt;br /&gt;&lt;br /&gt;FP clearly had a different cadence to it than the OOP it was equivalent to, but that was more of a natural consequence of reducing friction between naming and using functions.&amp;nbsp; Things like map can be written in OO style, but it's cumbersome enough there that nobody does so unless backed into a corner.&lt;br /&gt;&lt;br /&gt;Recently, I've been reading about game coding, and how there is very little in the way of OOP, because of the need to keep caches well-utilized.&amp;nbsp; Iterating through an array of x-values, where each index represents a different game object, ends up being faster than iterating through an array of objects, since there are many object properties that need to be skipped over.&amp;nbsp; (Also, if objects are of differing length, the accesses are harder for hardware to predict.)&amp;nbsp; Unused properties in a loop represent memory that was fetched and paid for, but wasted.&lt;br /&gt;&lt;br /&gt;Thus, game coding really strikes me as data-oriented programming.&amp;nbsp; There may be an "object design", but memory is laid out for maximum speed and parallelism of &lt;i&gt;data &lt;/i&gt;access.&lt;br /&gt;&lt;br /&gt;Where does "procedural programming" fall?&amp;nbsp; I don't think it's a useful distinction anymore, because I don't think there are any purely procedural languages left in wide use.&amp;nbsp; (Niche use, perhaps, like legacy COBOL systems, or physicists writing FORTRAN.&amp;nbsp; But not for 
anything in my slice of the programming world, certainly.)&amp;nbsp; Even lowly C offers function pointers, which allow for both OOP and FP idioms, as the case may warrant. Pretty much any GTK+ program uses both styles—FP for event callbacks, and OOP for the widgets (and possibly data).&amp;nbsp; C can't, AFAIK, pass pointers to lexical variables outside of the lexical container, so GTK+ has a userdata parameter threaded through every event call to simulate closure, but other than that, the illusion is fairly complete.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3922219755971684412-8768979768666813649?l=sapphirepaw.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://sapphirepaw.blogspot.com/feeds/8768979768666813649/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3922219755971684412&amp;postID=8768979768666813649&amp;isPopup=true' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8768979768666813649'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3922219755971684412/posts/default/8768979768666813649'/><link rel='alternate' type='text/html' href='http://sapphirepaw.blogspot.com/2010/08/functional-vs-object-oriented.html' title='Functional vs. Object Oriented'/><author><name>sapphirepaw</name><uri>http://www.blogger.com/profile/08959423651720108923</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='http://1.bp.blogspot.com/_-u1Kfz2-_48/TFYtgTTfzjI/AAAAAAAAAAM/kCiXSsRWxQM/S220/deviantID-haircut.jpg'/></author><thr:total>1</thr:total></entry></feed>
