Converting HTML lists to text on the fly

Jerry Stratton, June 25, 2009

About four years ago, I had the epiphany that since programming code is a list of commands, it makes sense to display that list as an HTML list. Before that, I generally presented code using the PRE and/or CODE tags, which was always problematic. Any long lines tended to go off the screen (see Regroup vs. IfChanged in Django templates for an example). Since my purpose in displaying these snippets of code was to show how I was solving some problem, having the code disappear was counterproductive. Switching to HTML lists made the code readable.

The main problem with using HTML lists, however, is that browsers don’t copy lists as indented text. They tend to store the formatted version as indented lists, but the text version is just a jumbled mass of text—ironic, since I moved away from the PRE and CODE tags to avoid displaying a jumbled mass of text.

After a few years, I realized I might be able to solve this with a JavaScript function, but I knew I would never go back and add the function to all of my previous code snippets. And I kept expecting browsers to start copying hierarchical lists hierarchically, even as text.

Now that I have a code template tag in Django, it’s easy for me to add new functionality to old code snippets. I’ve written a “simple” JavaScript function to add a “toggle code” option to the code snippets on this site. The function went through three iterations.

Converting a list to indented plain text

After my experience with excerpting HTML, it seemed like this would be pretty easy, and it was.

function copyCode(codeDIV) {
    var codeList = codeDIV.getElementsByTagName('ul')[0];
    var code = getCode(codeList.childNodes, 0, '');
    window.alert(code);
}
//take a list-oriented HTML display and return indented plain text
function getCode(codeElements, codeDepth, leadingTabs) {
    var subCode = '';
    //increase leading tabs if necessary
    if ((codeDepth-1)/2 > leadingTabs.length) {
        leadingTabs = leadingTabs + "\t";
    }
    //loop through each element
    for (var child=0;child<codeElements.length;child++) {
        //declare with var so that recursive calls don't share a global
        var element = codeElements[child];
        if (element.nodeType == element.TEXT_NODE) {
            //strip white space from text
            var elementText = element.data.replace(/^\s+|\s+$/g,'');
            if (elementText) {
                subCode = subCode + leadingTabs + elementText + "\n";
            }
        } else if (element.tagName != 'SPAN') {
            if (element.className == 'section') {
                subCode = subCode + "\n";
            }
            subCode = subCode + getCode(element.childNodes, codeDepth+1, leadingTabs);
        }
    }
    return subCode;
}

This requires a non-UL element in the same DIV as the code. Since I’m using Django, my Django template for displaying code looks like:

<div class="code">
    <p class="codecopier" onclick="copyCode(this.parentNode)">[copy code]</p>
    {{ code }}
</div>

The only tricky bit is that, at least the way I’ve designed it here, two levels deep in the HTML is one level of indentation: one level for the nested UL, and one level for the LI whose text node holds the line of code.
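The depth arithmetic is easier to follow when it runs outside a browser. The following is only a sketch: `textNode` and `elementNode` are hypothetical helpers that build plain objects with just the properties `getCode` reads, and the tiny `greet` snippet is made up for illustration.

```javascript
// Hypothetical stand-ins for DOM nodes, so the sketch runs without a browser.
var TEXT_NODE = 3;
function textNode(data) {
    return { nodeType: TEXT_NODE, TEXT_NODE: TEXT_NODE, data: data };
}
function elementNode(tagName, childNodes, className) {
    return {
        nodeType: 1,
        TEXT_NODE: TEXT_NODE,
        tagName: tagName,
        className: className || '',
        childNodes: childNodes
    };
}

// Same logic as getCode above.
function getCode(codeElements, codeDepth, leadingTabs) {
    var subCode = '';
    // a tab is added at depth 2, 4, 6, ... which is each nested UL
    if ((codeDepth - 1) / 2 > leadingTabs.length) {
        leadingTabs = leadingTabs + "\t";
    }
    for (var child = 0; child < codeElements.length; child++) {
        var element = codeElements[child];
        if (element.nodeType == element.TEXT_NODE) {
            var elementText = element.data.replace(/^\s+|\s+$/g, '');
            if (elementText) {
                subCode = subCode + leadingTabs + elementText + "\n";
            }
        } else if (element.tagName != 'SPAN') {
            if (element.className == 'section') {
                subCode = subCode + "\n";
            }
            subCode = subCode + getCode(element.childNodes, codeDepth + 1, leadingTabs);
        }
    }
    return subCode;
}

// Stands in for:
// <ul><li>function greet() {<ul><li>return 'hi';</li></ul></li><li>}</li></ul>
var codeList = elementNode('UL', [
    elementNode('LI', [
        textNode('function greet() {'),
        elementNode('UL', [
            elementNode('LI', [textNode("return 'hi';")])
        ])
    ]),
    elementNode('LI', [textNode('}')])
]);

var code = getCode(codeList.childNodes, 0, '');
// JSON.stringify makes the tabs and newlines visible
console.log(JSON.stringify(code));
```

The nested line sits at depth 3 (outer LI, nested UL, inner LI), and it picks up its single tab when the recursion reaches depth 2.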

Present the code in a plain text window?

My first thought was to keep it simple: open up a text/plain document and display the code there for copying. The Document object has an “open” method that takes the MIME type of the new document. By specifying text/plain, there’s no problem displaying character entities or less-than symbols.

function copyCode(codeDIV, pageTitle) {
    var codeList = codeDIV.getElementsByTagName('ul')[0];
    var code = getCode(codeList.childNodes, 0, '');
    var codeWindow = window.open('', 'hobo.codeWindow', 'location=no, status=no, toolbar=no');
    var codeDocument = codeWindow.document.open('text/plain');
    codeDocument.title = 'Plaintext code';
    if (pageTitle) {
        codeDocument.title += ' for ' + pageTitle;
    }
    codeDocument.write(code);
    codeDocument.close();
    codeWindow.focus();
}

And the HTML is:

<div class="code">
    <p class="codecopier" onclick="copyCode(this.parentNode, '{{ page.title }}')">[copy code]</p>
    {{ code }}
</div>

This works great. It opens up a new window that can be both copied and saved, and if it is saved the page title is automatically in the new document. But it only works in Firefox. IE specifically forbids any MIME type other than text/html; it will error out if any other MIME type is requested. Safari (and presumably WebKit in general) just ignores the MIME type parameter altogether in favor of text/html.

Toggle between readable and copyable display?

Next, I thought about replacing the UL’s outer HTML with the new plain text. Save the original UL list as a property on the DIV object, convert the UL to plain text and set the outerHTML property to be the new plain text (wrapped inside of a PRE). By looking for the saved list property, I can tell whether I need to display the viewable list or the copyable code.

Because the code is being displayed in HTML, however, I also need to escape any ampersands and less-than symbols, or the browser will interpret them as HTML.

function copyCode(codeDIV) {
    //if savedList exists, set outerHTML from the PRE to the UL
    if (codeDIV.savedList) {
        var codePRE = codeDIV.getElementsByTagName('pre')[0];
        codePRE.outerHTML = codeDIV.savedList.outerHTML;
        codeDIV.savedList = 0;
    } else {
        //there should only be one list under this parent.
        var codeList = codeDIV.getElementsByTagName('ul')[0];
        var code = getCode(codeList.childNodes, 0, '');
        if (code) {
            codeDIV.savedList = codeList;
            //a string pattern only replaces the first < or &, so use /g regular expressions.
            //escape ampersands first so the entities added afterward are left alone.
            var escapedCode = code.replace(/&/g, '&amp;');
            escapedCode = escapedCode.replace(/</g, '&lt;');
            escapedCode = escapedCode.replace(/>/g, '&gt;');
            codeList.outerHTML = '<pre>' + escapedCode + '</pre>';
        }
    }
}

And for the HTML:

<div class="code">
    <p class="codecopier" onclick="copyCode(this.parentNode)">[toggle code]</p>
    {{ code }}
</div>

The tricky part here, at least for me, is that the description of JavaScript’s string replace method is confusing. It looks like it takes a string to look for and a string to replace with, but given a string to look for, it replaces only the first match, just as a regular expression without the global flag does. I was almost ready to give up and have this feature work only for Firefox/Gecko users before I realized what the issue was. To replace every match, I need to search with a regular expression and append /g to make it global.
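The difference is easy to see on a single line of code. This is a minimal sketch: `escapeForPre` is a hypothetical helper name (the function above does the same three replacements inline), and the sample line is made up.

```javascript
// Hypothetical helper name; it bundles the three replacements used in copyCode.
function escapeForPre(code) {
    return code
        .replace(/&/g, '&amp;')  // ampersands first, so the entities added below are left alone
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

var line = 'if (a < b && b < c) {';

// A string pattern stops after the first match.
var firstOnly = line.replace('<', '&lt;');

// A regular expression with the g flag replaces every match.
var everyMatch = line.replace(/</g, '&lt;');

console.log(firstOnly);          // if (a &lt; b && b < c) {
console.log(everyMatch);         // if (a &lt; b && b &lt; c) {
console.log(escapeForPre(line)); // if (a &lt; b &amp;&amp; b &lt; c) {
```

Escaping the ampersand first matters: done last, it would also rewrite the ampersands in the entities that the other two replacements had just added.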

But wait!

Now it works in Safari, but it doesn’t work in Firefox. Why? Because outerHTML is not part of the JavaScript standard. I don’t even know where I remembered it from. Maybe I just made it up based on remembering innerHTML, and unluckily it worked. Regardless, if I’m going to have it only work in one browser, it’s going to be the standards-based one, so unless I can find another solution it’s back to document.open("text/plain").

So what about innerHTML? I can’t replace the innerHTML of a UL with a PRE, but I can replace it with an LI that contains a PRE.

function copyCode(codeDIV) {
    //there should only be one list under this parent.
    var codeList = codeDIV.getElementsByTagName('ul')[0];
    //if savedList exists, set innerHTML from the PRE version to the list item version
    if (codeDIV.savedList) {
        codeList.innerHTML = codeDIV.savedList;
        codeDIV.savedList = 0;
    } else {
        var code = getCode(codeList.childNodes, 0, '');
        if (code) {
            codeDIV.savedList = codeList.innerHTML;
            //a string pattern only replaces the first < or &, so use /g regular expressions.
            //escape ampersands first so the entities added afterward are left alone.
            code = code.replace(/&/g, '&amp;');
            code = code.replace(/</g, '&lt;');
            code = code.replace(/>/g, '&gt;');
            codeList.innerHTML = '<li class="copyable"><pre>' + code + '</pre></li>';
        }
    }
}

And for restoration purposes, instead of saving the entire UL, it saves only the innerHTML. This works in Safari and Firefox, so presumably in all WebKit and Gecko browsers.
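The save-and-restore round trip can be tried without a browser. In this sketch the UL and DIV are plain objects, `toggleCode` is a hypothetical name, and the plain text is passed in as a parameter rather than produced by getCode, so it only illustrates the toggle, not the real function.

```javascript
// Hypothetical stand-ins: plain objects with only the properties the toggle uses.
var codeList = { innerHTML: '<li>alert("1 &lt; 2");</li>' };
var codeDIV = { savedList: 0 };

// plainText stands in for what getCode would return for this list.
function toggleCode(codeDIV, codeList, plainText) {
    if (codeDIV.savedList) {
        // second click: put the original list items back
        codeList.innerHTML = codeDIV.savedList;
        codeDIV.savedList = 0;
    } else {
        // first click: remember the list items, then show the copyable version
        codeDIV.savedList = codeList.innerHTML;
        var escaped = plainText
            .replace(/&/g, '&amp;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;');
        codeList.innerHTML = '<li class="copyable"><pre>' + escaped + '</pre></li>';
    }
}

var original = codeList.innerHTML;

toggleCode(codeDIV, codeList, 'alert("1 < 2");\n');
var copyable = codeList.innerHTML;

toggleCode(codeDIV, codeList, 'alert("1 < 2");\n');
var restored = codeList.innerHTML;

console.log(copyable);
console.log(restored === original);
```

Because the saved value is just a string, restoring the list is a single assignment, and a saved value of 0 doubles as the flag for which version is currently showing.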

Only include code.js once

If you view source on this page, you’ll see that the copyCode and getCode functions are included via a file called code.js. That file is included in front of the first code snippet on the page, but not in front of other code snippets.

I’m doing that by setting a codeSnippetCount property on the page itself in the code templatetag:

page = context['page']
if hasattr(page, 'codeSnippetCount'):
    page.codeSnippetCount = page.codeSnippetCount+1
else:
    page.codeSnippetCount = 1

Then, in the template, I check for “ifequal page.codeSnippetCount 1”, and only display the SCRIPT tag if that’s true.

{% ifequal page.codeSnippetCount 1 %}
    <script type="text/javascript" src="{{ centralstart }}/library/scripts/code.js"></script>
{% endifequal %}

This was a meandering journey through writing a fairly simple JavaScript function, wasn’t it?

Code formatting Django tag: xml.dom.minidom makes it easy to format code snippets into lists.
Regroup vs. IfChanged in Django templates: Django’s regroup template tag can filter the “by” portion before regrouping to, for example, provide an alphabetical headline.



Contents of Negative Space™ as a whole Copyright © 1994-2024 Jerry Stratton. Individual copyrights remain held by their respective authors unless they specify otherwise. Site titles, such as Negative Space, Strange Bedfellows, Biblyon Broadsheet, Highland Games, and FireBlade Coffeehouse are trademarks of Jerry Stratton.

Code and code snippets, to the extent that they are copyrightable, may be re-distributed under the terms of the GNU General Public License 3.

Converting HTML lists to text on the fly last modified October 5th, 2009.
