Reverse Proxy with Mojolicious
Over the last month I worked on a project for a customer who wanted a reverse proxy so that he could modify his web pages before delivery.
This posed a very interesting problem, which I chose to solve with good old Perl.
Why Perl? Because it’s absolutely the right tool when you want to manipulate text!
It’s very fast, very stable, easy on resources, and it has everything needed to build a proxy without additional software such as Apache, Nginx or Squid.
Mojolicious to the Rescue
It gets even easier when you have an amazing framework like Mojolicious, which I stumbled upon a few months ago!
It calls itself “A next generation web framework for the Perl programming language”. It has no (I repeat: NO!) mandatory module dependencies, features a full HTTP/1.1 web server (even non-blocking if you like) and an HTML5/XML client complete with a DOM parser that uses jQuery-style CSS3 selectors for accessing elements.
Additionally, it has a preforking non-blocking I/O server featuring hot deployment, called “Hypnotoad”. And a heck of a lot of Futurama jokes in the source code comments.
One of its primary goals is to bring the fun back into web development in Perl. And boy, did I have fun!
no Moose
Mojolicious has a great base class called Mojo::Base. It has all the syntactic sugar one wants: it turns on all the good switches (warnings, strict, etc.) and provides a default constructor and a “has” accessor generator. Good riddance, Moose, you always bugged me with your weight!
So everything needed for this little project was already there. I just had to put the bricks together. And it went smoothly!
The whole project took under 10 man-days to finish!
Lessons Learned
Configuration Files
One always needs configuration files, at least to store the database credentials somewhere. This application was no different.
Catalyst has a configuration file by default, which can be in any format one could possibly think of, thanks to Config::Any.
Mojolicious doesn’t load a configuration file by default; one has to write one’s own code for it. At first I built a class using YAML::Any, because I like YAML for configuration files very much.
But after working with Mojolicious for a while, one’s mindset changes: why introduce this absolutely unnecessary dependency? Well, with Catalyst, one more on top of the hundreds doesn’t hurt. But Mojolicious is different: why not use the provided Mojo::JSON instead?
So I quickly put together a class that reads the configuration from a JSON file. While it’s not as flexible as YAML, JSON works well enough for the task. Absolute plus: no extra dependency introduced. Smooth.
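To illustrate the idea, a loader along these lines might look as follows. This is a minimal sketch, not the original class: the function name load_config is made up, and the fallback to the core module JSON::PP is an addition so the example also runs where Mojolicious is not installed.

```perl
use strict;
use warnings;

# Prefer Mojo::JSON when it offers a decode_json function; otherwise fall
# back to the core JSON::PP module (the fallback is an assumption added for
# illustration, not part of the original project).
my $decode_json = eval { require Mojo::JSON; Mojo::JSON->can('decode_json') }
    || do { require JSON::PP; \&JSON::PP::decode_json };

# Hypothetical helper: read a JSON file and return the decoded hash ref.
sub load_config {
    my ($file) = @_;

    # Read raw bytes; both decoders expect UTF-8 encoded input.
    open my $fh, '<:raw', $file or die "Cannot open config '$file': $!\n";
    my $bytes = do { local $/; <$fh> };
    close $fh;

    my $config = $decode_json->($bytes);
    die "Config '$file' must contain a JSON object\n"
        unless ref $config eq 'HASH';
    return $config;
}
```

A call such as load_config('proxy.json') would then return a plain hash reference, so database credentials could be read with something like $config->{dsn} (the file name and key are again only examples).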
Charset Guessing
While the DOM parser Mojo::DOM is very complete, I missed one thing: automatic charset guessing. Mojo tries to keep its fingers off the encoding as much as possible, so it’s easy to just proxy a file unmodified. But if you want to modify the content, you should make sure you know the encoding, so you don’t mess it up.
While it’s not a problem to build one’s own charset guesser, I would nevertheless like to see it in the framework, because that’s where I think it belongs. (If it’s there already, please point me to it, because I couldn’t find it!)
I built the following guesser purely by instinct, and it can most probably be improved:
    use Mojo::DOM;

    sub guess_charset {
        my ($self, $content_type, $body) = @_;

        # Avoid undef warnings
        $content_type ||= '';
        $body         ||= '';

        # HTTP Content-Type header (stop at quotes, whitespace or ';'
        # so that trailing parameters are not captured)
        if ($content_type =~ /charset\s*=\s*["']?([^\s;"']+)/i) {
            return $1;
        }

        # XML prolog
        if ($body =~ /<\?xml\s[^>]*?encoding=["']([^"']+)["']/i) {
            return $1;
        }

        my $dom = Mojo::DOM->new($body);

        # <meta http-equiv="content-type">
        # (attr() is the current name; older Mojolicious called it attrs())
        my $meta = $dom->at('meta[content*="charset"]');
        if ($meta && $meta->attr('content') =~ /charset\s*=\s*["']?([^\s;"']+)/i) {
            return $1;
        }

        # HTML5 <meta charset="">
        $meta = $dom->at('meta[charset]');
        if ($meta) {
            return $meta->attr('charset');
        }

        # Nothing found: let the caller pick a default
        return;
    }
Database Access
I came to love DBIx::Class as an ORM for database access. You need some patience to get acquainted with it, but then it’s great to work with, because it’s so complete. No comparison to the poor excuse of an ORM called ActiveRecord used in Rails.
The downside is, again, that it introduces so many dependencies that you make your customer cry when trying to deploy your application.
Since my customers weren’t Perl guys, using DBIx::Class was out of the question. Instead I wrote a small model class for DB access, which allows named, pre-prepared statements and polishes all the rough edges off DBI.
I hate having raw SQL statements between program code, because I so often see how it leads to bad style, SQL injections and inefficient database access. Therefore I put the statements in the config file and pre-prepare them on model construction. I know there are issues with this, but I keep them in mind. Promise.
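The idea of named, pre-prepared statements could be sketched roughly like this. It is an illustration under assumptions, not the original model class: the package name, the constructor arguments and the query method are all hypothetical. It expects an already connected DBI handle and uses only the standard DBI calls prepare, execute and fetchall_arrayref.

```perl
use strict;
use warnings;

# Hypothetical model class; every name here is made up for illustration.
package Proxy::Model::Sketch {

    # dbh:        a connected DBI database handle
    # statements: hash ref mapping names to SQL text, e.g. a section of
    #             the decoded config file
    sub new {
        my ($class, %args) = @_;
        my $self = bless { dbh => $args{dbh}, sth => {} }, $class;

        for my $name (sort keys %{ $args{statements} || {} }) {
            # Prepare each statement once, at construction time.
            $self->{sth}{$name}
                = $self->{dbh}->prepare($args{statements}{$name});
        }
        return $self;
    }

    # Execute a named SELECT with bind values; returns an array ref of
    # hash refs, one per row.
    sub query {
        my ($self, $name, @bind) = @_;
        my $sth = $self->{sth}{$name}
            or die "Unknown statement '$name'\n";
        $sth->execute(@bind);
        return $sth->fetchall_arrayref({});
    }
}
```

Because values are only ever passed as bind parameters, the calling code never builds SQL strings. One of the issues with this approach is that prepared statements belong to their connection, so they would have to be prepared again whenever the handle is replaced.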
I’ve done stuff like this so many times, and it seems goofy to reinvent the wheel once again. But there are so many ways to talk to your database that nobody seems able to agree on a standard somewhere between a full-fledged ORM and the absolute lowest level.
Anyway, the one issue I had with Mojolicious in this regard is this:
The Hypnotoad server does preforking, which destroys all database connections made at startup, so one needs to connect only after forking.
I came across this wiki page where it is explained, and at first I opted for the DBIx::Connector solution, which doesn’t introduce too many dependencies and sounds like the fire-and-forget solution to all database connection problems.
It did solve the problem, but we observed another issue: the connections dropped after some hours of operation, typically after a few hours of traffic followed by a few hours without any.
So I got rid of DBIx::Connector again and now handle dropped connections manually in the model class. Ironically, this is far more stable, and another dependency saved!
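One way to handle this manually is to connect lazily and to check the handle with DBI’s ping method before handing it out. The following is a minimal sketch under assumptions, not the original code: the package name and the connect callback are hypothetical, and the process ID check is an extra safety net added here for the preforking case.

```perl
use strict;
use warnings;

# Hypothetical wrapper; every name here is made up for illustration.
package Proxy::DBHandle::Sketch {

    # connect: code ref that returns a fresh handle, for example
    #   sub { DBI->connect($dsn, $user, $pass, { RaiseError => 1 }) }
    sub new {
        my ($class, %args) = @_;
        return bless { connect => $args{connect}, dbh => undef, pid => 0 },
            $class;
    }

    # Return a usable handle, (re)connecting when necessary.
    sub dbh {
        my ($self) = @_;
        my $forked = $self->{pid} != $$;

        # A handle inherited from the parent process must not close the
        # parent's connection when it is thrown away.
        $self->{dbh}{InactiveDestroy} = 1 if $forked && $self->{dbh};

        # Connect on first use, after a fork, or when ping() fails.
        if ($forked || !$self->{dbh} || !$self->{dbh}->ping) {
            $self->{dbh} = $self->{connect}->();
            $self->{pid} = $$;
        }
        return $self->{dbh};
    }
}
```

Since nothing connects until dbh is first called, the connection is only opened inside the worker process. The price is one ping per handle request.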
Static Routes
When building a reverse proxy, Mojolicious::Static gets in your way. As nice as it is to have some default output, so you have something to start from, you definitely don’t want to be served the default Mojolicious favicon.
To get around this, I had to build a custom static file serving class, which actually does nothing.
    package Proxy::NoStatic;
    use Mojo::Base 'Mojolicious::Static';

    # Never serve a static file
    sub dispatch { }

    1;
A little tedious, and maybe this could be achieved more simply. However, this was the quickest solution I came up with.
Bottom Line
Mojolicious is by far the best framework for server-side web development I have come across in a long time. And I have seen a few: Spring 3, Rails, Catalyst, CakePHP, Yii, FuelPHP, Zend, CodeIgniter, not to mention the various home-made ones I had the “pleasure” to work with…
I enjoyed it so much and I can’t wait for the next project to use it!
Thank you Sebastian Riedel for giving it to us!
And, btw., Sebastian: Mission accomplished! 🙂
Ha, that’s funny. A little against the spirit of working as dependency-free as possible, but, why not? 🙂
But I guess I’d rather use Mojolicious::Plugin::YamlConfig, since JSON seems sub-optimal for config files, especially because of the impossibility of doing line breaks.
Anyway, in the meantime it seems to me that the most Mojolicious-y way of doing configs is using plain Perl files. At least, that’s what I once saw Sebastian do. (If I remember correctly, that is…)
There actually is a plugin called Mojolicious::Plugin::JSONConfig in the base distribution that lets you use a JSON config file.
Well, I resorted to checking the connection with DBI’s ping method before returning a database handle. If the ping fails, the connection is re-established (hopefully…). The handle is fetched from this method on every query.
Doing a ping before every query definitely isn’t the most performant thing to do, but it is more stable.
And so far, this hasn’t proven to be a bottleneck.
Thanks for the nice article!
May I ask how you coded the dropped-connection handling in the model class?
I have been successfully using DBIx::Connector so far.