Thursday, April 20, 2006

HTTP compression

Another tool I use regularly at work is wget. To work on compression, I needed a tool that could recursively spider a web site, had the features I rely on, and also supported HTTP compression. At the time nothing out there did all of that, so about a year ago I finished a patch that adds compression support to wget.

You can check out the latest version of wget using the Subversion repository outlined here. Apply this patch from the top directory and, as long as your OS has zlib support, you should be able to use the '-z' switch the patch adds to request compressed files from a web server.


Here's an example of how a 39K download turns into a 10K download with compression, and finishes in about half the time.

$ ./wget
Connecting to connected.
HTTP request sent, awaiting response... 200 OK
Length: 39599 (39K) [text/html]
Saving to: `index.html'

100%[=======================================>] 39,599 --.-K/s in 0.005s

13:22:32 (8.39 MB/s) - `index.html' saved [39599/39599]

$ ./wget -z
Connecting to connected.
HTTP request sent, awaiting response... 200 OK
Length: 10546 (10K) [text/html]
Saving to: `index.html.1'

100%[=======================================>] 10,546 --.-K/s in 0.002s

13:22:39 (5.62 MB/s) - `index.html.1' saved [10546/10546]

$ diff index.html index.html.1
$ echo $?
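
For readers who want to see the idea in isolation: with HTTP compression the client advertises what it can decode in an Accept-Encoding request header, and a willing server sends the body compressed and labels it with Content-Encoding. The sketch below is not the patch's code. It is a minimal Python illustration that assumes gzip as the encoding and uses a made-up HTML body (sample_html is hypothetical). It shows that fewer bytes would cross the wire and that the restored body is byte-identical, which is the same thing the diff above checks.

```python
import gzip


def compression_round_trip(body: bytes):
    """Compress a response body as a server might, then restore it as a client would.

    Returns (original size, compressed size, whether the restored body matches).
    """
    compressed = gzip.compress(body)          # what would travel over the wire
    restored = gzip.decompress(compressed)    # what the client writes to disk
    return len(body), len(compressed), restored == body


# Hypothetical, highly repetitive HTML body; real pages compress less dramatically.
sample_html = (
    b"<html><body>\n"
    + b"<p>Hello, compressed world!</p>\n" * 1000
    + b"</body></html>\n"
)

original_size, wire_size, identical = compression_round_trip(sample_html)
print("original bytes:  ", original_size)
print("compressed bytes:", wire_size)
print("identical after decompression:", identical)
```

Markup is repetitive, which is why text/html responses tend to shrink well. Already-compressed content such as JPEG images gains little.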

Wednesday, April 19, 2006


In my job I deal with a lot of TCP traffic, especially HTTP, and with IPv6.

For a while I've been using both bozohttpd and Apache.
I used bozohttpd because it's simple, lightweight, command-line driven and easy to hack on. The big win for bozohttpd was that you could drop it into inetd and let inetd take care of the IPv6 side. However, bozohttpd lacks several useful features and in many cases falls short on standards compliance, so in those cases I used Apache. Everyone here tests with Apache, but I absolutely despise Apache's convoluted "do everything" configuration and setup. It can take me hours to remember, research and set up even simple changes (especially if one requires a missing module!). Compiling Apache can be a royal PITA... Basically, it's too flexible.

Recently I've taken a liking to lighttpd. It's very fast, easily configurable, and restricted enough in its feature set to allow easy module configuration. It has only one problem: you can use it for IPv6 or for IPv4, but not for both at once. It's a common mistake, really. People never take the time and effort to use the sockaddr_storage structure and write a proper dual-stack server; they try to 'hack' their IPv4 server into v6 with #ifdefs, etc. Bad bad bad. I digress.

Adjusting lighttpd to work on v4 and v6 in the same process was easy. Easy, that is, if you're using FreeBSD.

sysctl net.inet6.ip6.v6only=0

Then set up lighttpd to serve IPv6, and you're set. This basically enables IPv4-mapped IPv6 addressing like ::ffff:, so all their #ifdef'd IPv6-only code still chomps on the numbers just fine, and listening on :: still gets you v4 traffic.
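
The same effect can be requested per socket instead of system-wide. The sketch below is a rough Python illustration, not lighttpd's code. It opens one IPv6 listening socket, clears the IPV6_V6ONLY option on it, binds to ::, and then connects to it over plain IPv4 on loopback. The function name mapped_peer_demo is made up for this example, and it returns None on systems where IPv6 or mapped addresses are unavailable.

```python
import socket


def mapped_peer_demo():
    """Accept an IPv4 connection on an IPv6 socket.

    Returns the peer address as the server sees it, or None if the
    platform does not support IPv6 or IPv4-mapped addresses.
    """
    try:
        server = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        return None  # no IPv6 support at all
    try:
        # Per-socket equivalent of the sysctl: allow IPv4 on this IPv6 socket.
        server.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        server.settimeout(5)
        server.bind(("::", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]

        # Connect as an ordinary IPv4 client.
        client = socket.create_connection(("127.0.0.1", port), timeout=5)
        try:
            conn, peer = server.accept()
            conn.close()
        finally:
            client.close()
        return peer[0]  # the IPv4 client shows up as an IPv6 address
    except OSError:
        return None  # mapped addresses refused or unsupported here
    finally:
        server.close()


if __name__ == "__main__":
    print("server saw peer:", mapped_peer_demo())
```

On a dual-stack host the server should report the IPv4 client in mapped form, which is why code written only for IPv6 addresses keeps working.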