Mimsy Were the Borogoves

Hacks: Articles about programming in Python, Perl, PHP, and whatever else I happen to feel like hacking at.

XML and Buckets in Apache Modules

Jerry Stratton, March 23, 2008

My foray into writing an Apache module is moving along. This installment will deal with buckets and with XML.

Buckets

In the previous example, I grabbed the authentication response directly from the socket. If you experimented with that example, you probably realized that it didn’t really work. For yes and no responses it mostly works, but every once in a while a “yes” will come back as a “ye”. Switch to a larger response—such as an XML response with lots of information—and a single call to apr_socket_recv will never pull the entire response.
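To see why one receive call isn’t enough, here is a minimal sketch in plain C. It is not the module’s code: the recv_fn callback, read_all, and the trickle sender are all names I made up for illustration. The idea is only that a reader has to keep asking until the sender is finished.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical receive callback: copies up to `cap` bytes into `buf` and
   returns how many it copied; 0 means the sender has nothing more. */
typedef size_t (*recv_fn)(void *ctx, char *buf, size_t cap);

/* Keep calling `recv_some` until the sender is done or the buffer is full.
   Returns the number of bytes stored; the result is NUL-terminated. */
static size_t read_all(recv_fn recv_some, void *ctx, char *buf, size_t cap) {
    size_t total = 0;
    if (cap == 0) {
        return 0;
    }
    while (total < cap - 1) {
        size_t got = recv_some(ctx, buf + total, cap - 1 - total);
        if (got == 0) {
            break;  /* end of stream */
        }
        total += got;
    }
    buf[total] = '\0';
    return total;
}

/* A fake sender that hands over at most two bytes per call, the way a
   socket may deliver "yes" as "ye" followed by "s". */
struct trickle {
    const char *data;
    size_t sent;
};

static size_t trickle_recv(void *ctx, char *buf, size_t cap) {
    struct trickle *source = ctx;
    size_t remaining = strlen(source->data) - source->sent;
    size_t chunk = remaining < 2 ? remaining : 2;
    if (chunk > cap) {
        chunk = cap;
    }
    memcpy(buf, source->data + source->sent, chunk);
    source->sent += chunk;
    return chunk;
}
```

A single call to trickle_recv plays the part of the lone apr_socket_recv from the previous example and comes back short; read_all keeps going until the sender reports nothing left.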

Apache’s solution to that works a little differently than the libraries in most scripting languages. Instead of a function for pulling in, say, a web page, Apache has a general set of functions for drawing in data and passing data from place to place. It calls this a bucket brigade, although I’m only going to need a single bucket.

Go back to mod_external_auth.c (you can get it from the archive if you want to have the exact version I’m starting from). Take the lines in the “authenticate” function that perform the apr_socket_recv:

  • item_size = sizeof(response);
  • if ((status = apr_socket_recv(sock, response, &item_size)) != APR_SUCCESS) {
    • return logError(status, "receiving response", request);
  • }
  • apr_socket_close(sock);
  • /* make sure it ends, and chop off carriage return */
  • response[item_size] = '\0';
  • if (item_size > 0) {
    • if (response[item_size-1] == '\n') {
      • response[item_size-1] = '\0';
    • }
  • }

And replace them with:

  • xmlBucket = apr_bucket_socket_create(sock, request->connection->bucket_alloc);
  • if ((status = apr_bucket_read(xmlBucket, xml_response, xml_length, APR_BLOCK_READ)) != APR_SUCCESS) {
    • return logError(status, "receiving response", request);
  • }
  • apr_socket_close(sock);
  • apr_bucket_destroy(xmlBucket);

Also, add:

  • apr_bucket* xmlBucket;

to the list of variables at the top of the function.

Finally, apr_bucket_read appears to require a “const char *” pointer to store the response in, along with a length. So, in the externalAuthorizer function, change:

  • char response[5000];

to:

  • const char *xml_response;
  • apr_size_t xml_length;

It isn’t an XML response yet, but it will be.

Change the “authenticate” function and parse section to refer to the new variable:

  • //make call to authentication server
  • if ((status =
    • return status;
  • }
  • //parse authentication response
  • if (strcmp("No",
    • return HTTP_FORBIDDEN;
  • } else if (strcmp("Yes",
    • return OK;
  • } else {
    • return HTTP_SERVICE_UNAVAILABLE;
  • }

And change the authenticate function’s opening line to:

  • static int

Recompile it, remembering that if you’re using a G5 or Intel on Mac OS X you may need to add some extra options:

  • sudo apxs -c -i -a mod_external_auth.c
  • sudo apachectl configtest
  • sudo apachectl graceful

Verify that it works by using “curl --head” and your test server’s URL. When you change the response from “yes” to “no”, the web server should change from “OK” to “Forbidden” as it did in the previous example.

Or at least, it should do so most of the time:

xml_length?

I need to track the length as well as the string. In higher-level languages, a string of text automatically knows where it ends. C doesn’t really have a “string” type. A char * is a pointer to the beginning of a string, but it contains no information about where the string ends. For most C functions, a NUL (0) character marks the ending. But buffers are often not given an ending in C, and in this case the XML response can have extraneous data on the end of it.

In the previous example, I just set the NUL myself, but data behind a const pointer can’t be written to. So I need to track the length of the response as well as the response itself. I’ll be using it in the next step, so if the module mostly works for you right now but occasionally returns a 503 error for no reason, the cause is probably extraneous data on the end.
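As a rough illustration of working with a pointer and a length instead of a terminated string, here is a small sketch in plain C. The function name response_equals is my own invention, not part of the module or of Apache. It compares only the bytes inside the tracked length and never writes to the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Compare the first `length` bytes of `data` with a NUL-terminated
   `expected` string, without reading past `length` or writing to `data`.
   Trailing carriage returns and newlines in `data` are ignored. */
static int response_equals(const char *data, size_t length, const char *expected) {
    while (length > 0 && (data[length - 1] == '\n' || data[length - 1] == '\r')) {
        length--;
    }
    return length == strlen(expected) && memcmp(data, expected, length) == 0;
}
```

Given a buffer that holds “Yes”, a newline, and then leftover bytes, a plain strcmp against “Yes” fails, while response_equals with a length of 4 matches.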

Parsing XML

One of the easiest ways of passing complex information across systems is through XML. Apache has its own XML routines to help with that. Go ahead and change your PHP script to provide an XML authorization. Besides Yes or No, provide a department, a job title, and a first and last name. Whatever you want your web pages to know about your logged-in visitors. It should look something like this:

  • <?xml version="1.0" encoding="us-ascii" ?>
  • <!DOCTYPE Authentication SYSTEM "http://www.hoboes.com/authorization.dtd">
  • <Authentication>
    • <authorization>No</authorization>
    • <user>cpurcell</user>
    • <title>editor</title>
    • <employeeid>875805132</employeeid>
    • <department>Walkerville Weekly Reader</department>
    • <firstname>Carolyn</firstname>
    • <lastname>Purcell</lastname>
  • </Authentication>

Take the “parse authentication response” section of externalAuthorizer and replace it with:

  • //parse authentication response
  • if ((status = parseAuthentication(request, xml_response, xml_length, &userInfo)) != APR_SUCCESS) {
    • return status;
  • }
  • //check authentication response
  • return checkAuthentication(request, userInfo);

Notice that there’s a new variable. I’m using userInfo to store the XML response as an Apache table (Apache uses tables for things like environment variables and response headers, as seen in the first Apache Modules article). Add the new variable’s definition to the list of variables used by externalAuthorizer:

  • apr_table_t *userInfo;

The parseAuthentication function should look like this:

  • static int parseAuthentication(request_rec *request, const char *xml, apr_size_t xml_length, apr_table_t **userInfo) {
    • apr_xml_parser *parser;
    • apr_status_t status;
    • apr_xml_doc *parsedDocument;
    • const char *elementValue;
    • apr_size_t elementSize;
    • apr_xml_elem *element;
    • parser = apr_xml_parser_create(request->pool);
    • if ((status = apr_xml_parser_feed(parser, xml, xml_length)) == APR_SUCCESS) {
      • if ((status = apr_xml_parser_done(parser, &parsedDocument)) == APR_SUCCESS) {
        • *userInfo = apr_table_make(request->pool, 0);
        • for (element = parsedDocument->root->first_child; element != NULL; element = element->next) {
          • apr_xml_to_text(request->pool, element, APR_XML_X2T_INNER, NULL, NULL, &elementValue, &elementSize);
          • apr_table_set(*userInfo, element->name, elementValue);
        • }
        • return APR_SUCCESS;
      • } else {
        • return logError(status, "parsing xml", request);
      • }
    • } else {
      • return logError(status, "feeding xml", request);
    • }
  • }

This creates an XML parser object, feeds the xml response into the parser, reads each element, and stores the element and its value into a table (userInfo).
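The element loop is the part that does the real work, so here is a stripped-down model of it in plain C. Everything in it is a stand-in I made up: struct elem plays the role of a parsed element with a next-sibling pointer, and struct table is a tiny fixed-size key/value store. Like apr_table_set, the table_set here replaces a value that is already stored under the same key. APR’s own tables compare keys without regard to case, which this sketch does not attempt.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16

/* Hypothetical stand-in for a parsed element. */
struct elem {
    const char *name;
    const char *text;
    struct elem *next;   /* next sibling, or NULL at the end */
};

/* Hypothetical stand-in for a key/value table. */
struct table {
    const char *keys[MAX_ENTRIES];
    const char *values[MAX_ENTRIES];
    size_t count;
};

/* Store `value` under `key`, replacing any value already stored there.
   Returns 0 on success, -1 if the table is full. */
static int table_set(struct table *t, const char *key, const char *value) {
    size_t i;
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->keys[i], key) == 0) {
            t->values[i] = value;
            return 0;
        }
    }
    if (t->count == MAX_ENTRIES) {
        return -1;
    }
    t->keys[t->count] = key;
    t->values[t->count] = value;
    t->count++;
    return 0;
}

/* Return the value stored under `key`, or NULL if there is none. */
static const char *table_get(const struct table *t, const char *key) {
    size_t i;
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->keys[i], key) == 0) {
            return t->values[i];
        }
    }
    return NULL;
}

/* Walk the sibling chain and record each element's name and text. */
static int collect(const struct elem *first_child, struct table *t) {
    const struct elem *e;
    for (e = first_child; e != NULL; e = e->next) {
        if (table_set(t, e->name, e->text) != 0) {
            return -1;
        }
    }
    return 0;
}
```

Chaining three elements (authorization, firstname, lastname) and passing the first to collect leaves three entries in the table, and a lookup for a name that was never in the chain returns NULL.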

The checkAuthentication function doesn’t have to do anything special. It just checks the userInfo table for the presence and value of “authorization”. It’s basically just a slightly more complex version of the “if else” that checked for a yes or no response.

  • static int checkAuthentication(request_rec *request, apr_table_t *userInfo) {
    • const char *authorization;
    • if ((authorization = apr_table_get(userInfo, "authorization")) != NULL) {
      • if (strcmp(authorization, "Yes") == 0) {
        • return OK;
      • } else if (strcmp(authorization, "No") == 0) {
        • return HTTP_FORBIDDEN;
      • } else {
        • return logError(0, "authorization not valid", request);
      • }
    • } else {
      • return logError(0, "no authorization found", request);
    • }
  • }

And in order to use the XML routines, include the xml library at the top of the file:

  • #include "apr_xml.h"

Once you recompile and reload the module, it should allow access based on the value of the “authorization” element in the XML file. Obviously you could program the response to provide either Yes or No based on various conditions. As in the previous example, remember that I currently have this set up assuming that it will eventually be asking for a username and password.

Set the environment

Now that the external server is providing more information than just a Yes or No, why not provide this information to web pages? PHP can access the server environment through its predefined variables (such as $_SERVER), and Server-Side Include (SSI/.shtml) files can access it through the var= option of the echo directive.

The environment is just a table in Apache. Just above “return OK” in checkAuthentication, add:

  • apr_table_overlap(request->subprocess_env, userInfo, APR_OVERLAP_TABLES_MERGE);

APR_OVERLAP_TABLES_MERGE tells Apache that if an item in the new table (userInfo) matches an item in the existing table (subprocess_env), it should merge them, separating the two values with a comma and a space. If you would rather have the new item replace the old item, use APR_OVERLAP_TABLES_SET.
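The difference between the two flags is easier to see on a single value. This is only a model of the behavior described above, written in plain C; overlap_value and the OVERLAP_SET and OVERLAP_MERGE names are hypothetical and are not how APR implements it.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the two APR overlap flags. */
enum overlap_mode { OVERLAP_SET, OVERLAP_MERGE };

/* Write into `out` the value that results when `incoming` meets `existing`
   under the given mode. `existing` may be NULL if the key was not set.
   Returns 0 on success, -1 if `out` is too small. */
static int overlap_value(const char *existing, const char *incoming,
                         enum overlap_mode mode, char *out, size_t out_size) {
    int written;
    if (existing == NULL || mode == OVERLAP_SET) {
        /* Nothing to merge with, or the new value simply wins. */
        written = snprintf(out, out_size, "%s", incoming);
    } else {
        /* Merge: old value, a comma and a space, then the new value. */
        written = snprintf(out, out_size, "%s, %s", existing, incoming);
    }
    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

With an existing value of “staff” and an incoming value of “editor”, the merge mode yields “staff, editor” and the set mode yields “editor”. This matters if one of your element names collides with a variable that Apache or another module already puts in the environment.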

Now that the XML response has been added to the environment, PHP and SSI files can access its values. Each of these examples (given the XML response above, with the authorization set to Yes so that the page is actually served) should say “Hello, Carolyn Purcell!”.

PHP

  • <?php
    • $name = $_SERVER['firstname'] . ' ' . $_SERVER['lastname'];
    • echo "Hello, $name!\n";
  • ?>

SSI

  • Hello, <!--#echo var="firstname" --> <!--#echo var="lastname" -->!

Hiding elements

Whatever gets put into the XML will be available by element name to dynamic web pages on the test server. If there are elements you don’t want to make available, you can use apr_table_unset to remove them. For example, if I had an element named “key” that I needed the module to look at but did not want pages to access, I could have added this line before the apr_table_overlap:

  • apr_table_unset(userInfo, "key");