Brigades and HTTP 1.1 in an Apache module

Jerry Stratton, June 22, 2008

Still taking baby steps on my journey to an Apache module. This version will add two things, one easy and one hard: it will switch from requesting via an IP address to requesting via a hostname, and it will switch to HTTP 1.1 instead of HTTP 1.0. Why? Currently, the module uses the IP address of the authentication server. That’s okay if the authentication server is a standalone server with only one hostname (or where the hostname that the IP address refers to is the default hostname for the server) but it won’t work with virtual hosts, nor will it work with servers whose IP address is dynamic.

Given the application, that may not be a problem (you wouldn’t want to run an authentication server on a shared host) but if the module is being used more generally to provide an XML response, not being able to query virtual hosts is a big drawback.

Hostname instead of IP address

The current version of the module uses the IP address in one place. The new version will use the hostname in more than one place.

Go ahead and grab “3 mod_external_auth.c” from the source archive and save it as mod_external_auth.c.

First, using the hostname instead of the IP address to connect to the remote server is easy: just replace it. Apache’s APR functions will do a DNS lookup automatically if we give apr_sockaddr_info_get a hostname instead of an IP address. Modern servers cache DNS lookups locally, so there should be very little performance hit doing a DNS lookup instead of hard-coding to the current IP address (and requiring a recompile every time the IP address changes).

At the top of the source file, add “char *authHost = "www.hoboes.com";” (or use your own test server, preferably, since there’s no telling what kind of response you’ll get from my test server).

Then, replace your IP address (216.92.252.156 in the example) with authHost. The new apr_sockaddr_info_get should be:

if ((status = apr_sockaddr_info_get(&sockaddr, authHost, APR_INET, 80, 0, request->pool)) != APR_SUCCESS) {

Recompile, and the module should continue working exactly as before. (If you’ve been playing around with the authentication response on the other end, make sure you have it set to “let you in” now.)

HTTP 1.1

Because the module doesn’t currently specify HTTP 1.1, the server assumes HTTP 1.0. This makes programming a whole lot easier: we get the server’s response as one big chunk and we don’t need to bother with HTTP headers. Unfortunately, if we want to be able to handle virtual hosts, the module needs to be an HTTP 1.1 client. The server knows which host should respond to the module’s query only if the module provides the HTTP 1.1 header “Host: ”.

That means the module must provide a couple of headers in its request, and it means the module must handle chunked responses collected over more than one apr_bucket_read.

HTTP 1.1 headers

To send a Host header, the request needs to specify HTTP 1.1 on the request line and include three headers: Host, Content-Length, and Connection (we just want the connection to close and be done with; we don’t want a persistent connection). The request is going to look like this:

GET /authorization.php?ip=xxx.xxx.xxx.xxx&page=/wherever/somepage.html HTTP/1.1
Host: www.hoboes.com
Content-Length: 0
Connection: close

There are visible and invisible parts to this request. The visible parts are the request line and the headers; the invisible parts are the line endings between them. Each line must end with a carriage return and a newline. The last line needs to have two sets of those, which marks the end of the headers. And when the module reads the response, it will need to look for that same doubled line ending. Apache defines CRLF for us, and we can concatenate two of them to create the header ender:

#define HEADEREND CRLF CRLF

The line that creates the authRequest gets a lot bigger:

authRequest = apr_pstrcat(request->pool, "GET /authorization.php?ip=", remote_ip, "&page=", uri, " HTTP/1.1", CRLF, "Host: ", authHost, CRLF, "Content-Length: 0", CRLF, "Connection: close", HEADEREND, NULL);
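To see exactly which bytes that call is meant to produce, here is a minimal standalone sketch in plain C that builds the same request with snprintf instead of apr_pstrcat. The function name build_auth_request and the sample values are made up for illustration; inside the module CRLF comes from Apache’s headers, but I define it here so the sketch compiles on its own.

```c
#include <stdio.h>
#include <stddef.h>

/* Defined locally so the sketch stands alone; the module gets CRLF from Apache. */
#define CRLF "\r\n"
#define HEADEREND CRLF CRLF

/* Hypothetical helper: writes the HTTP 1.1 request into buf.
   Returns the length of the request, or -1 if buf is too small. */
static int build_auth_request(char *buf, size_t size, const char *host,
                              const char *remote_ip, const char *uri) {
    int n = snprintf(buf, size,
                     "GET /authorization.php?ip=%s&page=%s HTTP/1.1" CRLF
                     "Host: %s" CRLF
                     "Content-Length: 0" CRLF
                     "Connection: close" HEADEREND,
                     remote_ip, uri, host);
    if (n < 0 || (size_t)n >= size) {
        return -1;
    }
    return n;
}
```

Every line ends in CRLF, and the final header is followed by the doubled CRLF that tells the server the headers are finished.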

Looking at variables

This is still going to fail. It will fail for one to three reasons: first, it might fail because of a bug in the code; second, it will fail because the new HTTP 1.1 response includes headers, and they won’t parse as XML; finally, it might fail if the response is a chunked response, because the chunk sizes are not going to parse as XML either.

The module needs a way of showing off variables and their values, so that we can debug it more easily. Underneath the logError function, add a logNotice function:

static void logNotice(char *title, char *message, request_rec *request) {
    ap_log_rerror(APLOG_MARK, APLOG_NOTICE, 0, request, "%s: %s", title, message);
}

If readResponse does not return APR_SUCCESS, put a notification of the value of authRequest in Apache’s error log:

if ((status = readResponse(request, sock, xml_response, xml_length)) != APR_SUCCESS) {
    logNotice("authRequest", authRequest, request);
    return logError(status, "receiving XML response", request);
}

If the problem was due to an error in the HTTP 1.1 request, this will log the value of authRequest to Apache’s error log.

Also, in the parseAuthentication function, if it’s going to return an error log the value of the xml first:

} else {
    logNotice("xml", (char *)xml, request);
    return logError(status, "feeding xml", request);
}

Before going any further, you’ll want to debug the module until the error occurs during feeding and the XML response that is logged to the error_log is what you’re expecting (including HTTP headers and possibly with a set of numbers put in between chunks).

Getting past the headers

There are two steps to parsing an HTTP 1.1 response. The first is to get past the headers and into the body; and the second is to (if necessary) remove the chunking sizes from the body. Unfortunately, my experience is that the remote server’s chunks don’t correspond to the apr_bucket_read iterations.

Getting past the headers is pretty easy. Just look for the first doubled CRLF. Here’s a parseResponse that only gets past the headers (so it will not work with chunked responses).

//unchunk an HTTP 1.1 response if necessary (this version only skips the headers)
static apr_status_t parseResponse(const char **xml_response, apr_size_t *xml_length, request_rec *request) {
    apr_status_t status = APR_SUCCESS;
    //get past the headers
    const char *headerEnd = strstr(*xml_response, HEADEREND);
    if (headerEnd == NULL) {
        //no doubled CRLF: the response has no complete set of headers
        return APR_EGENERAL;
    }
    *xml_response = headerEnd + strlen(HEADEREND);
    *xml_length = strlen(*xml_response);
    return status;
}

I’m assuming, possibly incorrectly, that apr_brigade_flatten terminates strings.
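If that assumption turns out to be wrong, strstr and strlen can read past the end of the buffer. Here is a minimal sketch in plain C of finding the body using only the known length of the response instead. The name find_body is hypothetical, not something from APR or from the module.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the first byte after the
   header-ending CRLF CRLF, or NULL if that sequence does not occur within
   the first len bytes. On success, *body_len is set to the number of
   bytes that follow the headers. Never reads past response + len. */
static const char *find_body(const char *response, size_t len, size_t *body_len) {
    static const char header_end[] = "\r\n\r\n";
    const size_t end_len = sizeof(header_end) - 1;
    size_t i;

    for (i = 0; i + end_len <= len; i++) {
        if (memcmp(response + i, header_end, end_len) == 0) {
            *body_len = len - (i + end_len);
            return response + i + end_len;
        }
    }
    return NULL;
}
```

Because it returns the body length as well as the pointer, the caller does not need strlen either.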

In getAuthentication, add a parseResponse call after the readResponse call:

if ((status = parseResponse(xml_response, xml_length, request)) != APR_SUCCESS) {
    return logError(status, "unchunking XML response", request);
}

At this point, if you are not getting chunked responses from your server, it will be working again. You can tell if you’re getting chunked responses by the presence of the header “Transfer-Encoding” and the value “chunked”.
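That check can be sketched using nothing but standard C, which avoids depending on case-insensitive search extensions that not every platform provides. The names equal_nocase and is_chunked are hypothetical, and the sketch assumes exactly one space after the colon and no other transfer codings on the line, which is a simplification.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive comparison of the first n bytes of a and b. */
static int equal_nocase(const char *a, const char *b, size_t n) {
    size_t i;
    for (i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i])) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical helper: returns 1 if the first headers_len bytes contain a
   "Transfer-Encoding: chunked" header line in any mix of case, else 0. */
static int is_chunked(const char *headers, size_t headers_len) {
    static const char needle[] = "\r\ntransfer-encoding: chunked\r\n";
    const size_t needle_len = sizeof(needle) - 1;
    size_t i;

    for (i = 0; i + needle_len <= headers_len; i++) {
        if (equal_nocase(headers + i, needle, needle_len)) {
            return 1;
        }
    }
    return 0;
}
```

Passing only the length of the header portion keeps the search from matching text that happens to appear in the body.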

Parsing HTTP 1.1 chunks

When an HTTP 1.1 server decides it needs to chunk its response, it will add a Transfer-Encoding header and set it to chunked. It will send the headers intact, and then send the body as a series of chunks. Each chunk consists of a hexadecimal number on its own line, and the chunk of text. The number is the length of the chunk. The last number is always zero: a zero means that there are no more chunks.
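As an illustration of that format, here is a made-up chunked body written out as a C string, along with a small sketch of reading one size line using the standard strtoul function. The sample data and the name parse_chunk_size are hypothetical. The sketch also accepts a semicolon after the number, since HTTP 1.1 allows chunk extensions there.

```c
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>

/* A made-up chunked body: a 6-byte chunk, a 10-byte chunk (hex a),
   and the terminating zero-length chunk. */
static const char sample_body[] =
    "6\r\n" "<auth>\r\n"
    "a\r\n" "yes</auth>\r\n"
    "0\r\n" "\r\n";

/* Hypothetical helper: parses the hexadecimal chunk size at the start of
   line, which must be NUL-terminated. Returns the size, or -1 if the line
   does not start with a hex digit or the digits are not followed by a
   carriage return or a semicolon. On success *after points just past the
   digits. Note that strtoul also tolerates a leading "0x", which a strict
   parser would reject. */
static long parse_chunk_size(const char *line, const char **after) {
    char *end;
    unsigned long size;

    if (!isxdigit((unsigned char)*line)) {
        return -1;
    }
    size = strtoul(line, &end, 16);
    if ((*end != '\r' && *end != ';') || size > (unsigned long)LONG_MAX) {
        return -1;
    }
    *after = end;
    return (long)size;
}
```

Stitching the two chunks of the sample together gives the 16-byte body `<auth>yes</auth>`.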

You should be able to see this format in the error log, since the XML parser won’t be able to parse these chunks and will call logNotice before exiting. If you are not getting chunked responses, you’re going to want to force the issue: whatever you’re using as a response, make it bigger until the beginning of the response is a number. For example, I put the Gettysburg address into one of the XML fields and duplicated it seven times.

First, in order to convert hex to integer, the pow() function is useful. So include math.h at the top of the file:

#include <math.h>

Then, replace the parseResponse function with one that checks for chunking:

//unchunk an HTTP 1.1 response if necessary
static int parseResponse(const char **xml_response, apr_size_t *xml_length, request_rec *request) {
    //remember the start of headers
    const char *headerStart = *xml_response;
    apr_status_t status = APR_SUCCESS;
    //get past the headers to the beginning of the body (and the first chunk size)
    char *chunkSizeStart = strstr(*xml_response, HEADEREND);
    if (chunkSizeStart == NULL) {
        //no end of headers: this is not a complete HTTP response
        return HTTP_SERVICE_UNAVAILABLE;
    }
    chunkSizeStart += strlen(HEADEREND);
    //look for the presence of chunking
    //this only works because we expect short bodies with well-formed XML
    //HTTP headers are not case sensitive
    if (strcasestr(headerStart, CRLF "transfer-encoding: chunked" CRLF)) {
        char *chunkStart;
        int chunkLength;
        int chunkLengthCounter;
        char *chunkLengthPointer;
        int chunkLengthPart;
        //the good data will be less than the size of headers+body+chunksizes
        //allocate from the request pool so that Apache frees it with the request
        char *goodResponse = apr_palloc(request->pool, *xml_length);
        int fullLength = 0;
        do {
            //get the chunk length: the chunk size line ends with CRLF
            chunkStart = strstr(chunkSizeStart, CRLF);
            if (chunkStart == NULL) {
                return HTTP_SERVICE_UNAVAILABLE;
            }
            //work the chunk size hex string backwards to convert from hex to integer
            chunkLengthPointer = chunkStart-1;
            chunkLength = 0;
            chunkLengthCounter = 0;
            while (chunkLengthPointer >= chunkSizeStart) {
                chunkLengthPart = *chunkLengthPointer;
                if (chunkLengthPart >= 'a' && chunkLengthPart <= 'f') {
                    //letter a - f
                    chunkLengthPart = chunkLengthPart - 'a' + 10;
                } else if (chunkLengthPart >= 'A' && chunkLengthPart <= 'F') {
                    //letter A - F
                    chunkLengthPart = chunkLengthPart - 'A' + 10;
                } else if (chunkLengthPart >= '0' && chunkLengthPart <= '9') {
                    //number 0 - 9
                    chunkLengthPart -= '0';
                } else {
                    logInteger("bad hex character", chunkLengthPart, request);
                    return HTTP_SERVICE_UNAVAILABLE;
                }
                chunkLength += chunkLengthPart*pow(16, chunkLengthCounter);
                chunkLengthCounter++;
                chunkLengthPointer -= 1;
            }
            //copy the chunk to the good data
            if (chunkLength > 0) {
                //a chunk that claims to be bigger than the whole response is bad data
                if ((apr_size_t)(fullLength + chunkLength) > *xml_length) {
                    return HTTP_SERVICE_UNAVAILABLE;
                }
                //logInteger("chunk length", chunkLength, request);
                chunkStart += strlen(CRLF);
                //logNotice("chunk", chunkStart, request);
                memcpy(goodResponse+fullLength, chunkStart, chunkLength);
                fullLength += chunkLength;
                //the next chunk size string starts after the CRLF that ends this chunk
                chunkSizeStart = chunkStart + chunkLength + strlen(CRLF);
            }
        } while (chunkLength > 0);
        *xml_length = fullLength;
        *xml_response = goodResponse;
        //logInteger("unchunked length", fullLength, request);
    } else {
        *xml_response = chunkSizeStart;
        *xml_length = strlen(*xml_response);
        //logInteger("no need to unchunk length", *xml_length, request);
    }
    return status;
}

That should be it. If there is chunking involved, this function will find each chunk, get its size from the hexadecimal string preceding it, and append it to a clean string (goodResponse).
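For comparison, here is a standalone sketch of the same unchunking idea in plain C, without any Apache types, that can be compiled and tested outside the module. It is not the module’s code: the name dechunk is hypothetical, it reads the hex digits left to right rather than backwards, and it checks every step against the known length of the body rather than trusting the data to be well-formed. It stops at the zero-length chunk and ignores any trailer lines after it.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the data of every chunk in body (body_len
   bytes) into out, which must hold at least body_len bytes. Returns the
   decoded length, or -1 if the chunking is malformed or truncated. */
static long dechunk(const char *body, size_t body_len, char *out) {
    const char *p = body;
    const char *end = body + body_len;
    size_t total = 0;

    for (;;) {
        unsigned long size = 0;
        int digits = 0;

        /* read the hexadecimal chunk size, most significant digit first */
        while (p < end && isxdigit((unsigned char)*p)) {
            int c = tolower((unsigned char)*p);
            size = size * 16 + (unsigned long)(c <= '9' ? c - '0' : c - 'a' + 10);
            digits++;
            p++;
        }
        /* no digits is malformed; more than 7 is larger than we will accept */
        if (digits == 0 || digits > 7) {
            return -1;
        }
        /* skip anything else on the size line, up to its CRLF */
        while (p + 1 < end && !(p[0] == '\r' && p[1] == '\n')) {
            p++;
        }
        if (p + 1 >= end) {
            return -1;
        }
        p += 2;
        /* a zero-length chunk marks the end of the body */
        if (size == 0) {
            break;
        }
        /* the chunk data and the CRLF after it must fit in what is left */
        if ((size_t)(end - p) < size + 2) {
            return -1;
        }
        memcpy(out + total, p, size);
        total += size;
        p += size;
        if (p[0] != '\r' || p[1] != '\n') {
            return -1;
        }
        p += 2;
    }
    return (long)total;
}
```

Given the body "6\r\n<auth>\r\na\r\nyes</auth>\r\n0\r\n\r\n", this writes the 16 bytes of `<auth>yes</auth>` to out and returns 16.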

At this point the module should work with any size response and with virtual hosts. There are a couple of places where it trusts the server not to try to inject bad data (by holding the connection open forever, for example). If you create a module that can’t trust the remote server, you’ll want to be very careful to avoid responses monopolizing your local server’s http processes.

Look for this version of the module in “4 mod_external_auth.c” in the archive.

August 15, 2008: Bug in parseResponse fixed: There was a bug in the way that the parseResponse function counted through chunks; under some circumstances it would count wrong. In the process of fixing it, I also made parseResponse return a status code when it runs into problems. It’s always a good idea to not trust even your own servers. Even if they aren’t subject to hacking attempts, error checking helps keep bugs on one server from cascading down through other servers.

Since it was taking me too much time to get around to writing about the change, I just replaced the code in the parent article and in the archive with the new, good code.


Apache modules tutorial source files (Zip file, 9.4 KB): All of the final example modules are in this archive.
Introduction to Buckets and Brigades: “A bucket is a container for data. Buckets can contain any type of data. Although the most common case is a block of memory, a bucket may instead contain a file on disc, or even be fed a data stream from a dynamic source such as a separate program.”

More Writing an Apache module

Creating a C Apache module on Mac OS X Leopard: From start to finish, how to use apxs to create and install an Apache module written in C that adds a dynamic header to the http headers of all pages on your site.
Calling an external server from an Apache module: This installment in my ongoing Apache module saga will call an external web server, and allow or deny access depending on what that external server replies.
XML and Buckets in Apache Modules: Apache has its own functions for dealing with XML and for drawing down responses over network connections. This module will parse a remote XML response, authorize according to that response, and provide the elements of the response as Apache environment variables.

Comments?

##

Thanks for your examples! It really helped me out.

To make it really pretty, you should use apr_palloc instead of malloc (occurred twice, I guess). And strcasestr is not standard compliant, so it doesn't have to be available on every OS.

Cheers

Basti at 10:13 a.m. April 28th, 2011


Contents of Negative Space™ as a whole Copyright © 1994-2024 Jerry Stratton. Individual copyrights remain held by their respective authors unless they specify otherwise. Site titles, such as Negative Space, Strange Bedfellows, Biblyon Broadsheet, Highland Games, and FireBlade Coffeehouse are trademarks of Jerry Stratton.

Code and code snippets, to the extent that they are copyrightable, may be re-distributed under the terms of the GNU General Public License 3.

Brigades and HTTP 1.1 in an Apache module last modified April 12th, 2011.
