Mimsy Were the Borogoves

Hacks: Articles about programming in Python, Perl, Swift, BASIC, and whatever else I happen to feel like hacking at.

Regenerate multiple files

Jerry Stratton, October 6, 2021

Bit pipes: Piping letters through a filter.; scripts; programming

This is sort of a minor codicil to a handful of much more interesting posts about automating the generation of Daredevils RPG character sheets over on the Biblyon Broadsheet.

At a role-playing game convention recently, I had a whole bunch of pregenerated characters, and regularly needed to recreate their character sheets from source files after making changes to either the characters or to how I interpreted the character generation rules. I used a Daredevils pregen calculator to do this. But while the calculator could take multiple inputs and produce a single file with every character in it, I ended up choosing to have a separate output file for each character. It was easier to find any particular character that way.

This meant having to rerun the script for multiple characters every time I made changes. That’s obviously a target for automation. What I needed was a script to check each source file against its generated character sheet (and against the ever-changing daredevils script) and run the script on that source file if the source is newer than the sheet or if the sheet is older than the script that generated it.

The “find” command could probably handle this for me. But it is an arcane command which is not amenable to obviousness. It produces long command lines with obscure switches that require, for me at least, reading through the man page every time I need to alter it. I need more obviousness in my automations.1

So I wrote a script that takes files on the command line, a directory for the output files, and a script to run on each file.

The basic requirements are:

  1. a bunch of input files, such as my character stats files for Daredevils;
  2. the same number of output files in a separate but common directory, with the same name (sans any extension) and the extension “.txt”;2
  3. a script that runs on the input files and produces text to standard output; the standard output for an input file is piped to the corresponding output .txt file.

Before running, the script checks whether the input file is newer than the output file or whether the output file is older than the script. In the former case, the data has changed; in the latter, the logic has changed. Either way, the output needs to be recreated.

I haven’t needed a --force option yet, but it’s an obvious potential add.
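The two timestamp checks, plus a possible --force override, can be sketched as a single function. This is only an illustration: the name needs_regeneration and the $force parameter are hypothetical and not part of the script below, and it compares modification times from stat rather than the ages that -M reports.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of pipewalk: returns true if $output
# should be regenerated from $source by the script at $scriptPath.
sub needs_regeneration {
    my ($source, $output, $scriptPath, $force) = @_;

    # an assumed --force option would skip every timestamp check
    return 1 if $force;

    # a missing output file always needs to be created
    return 1 unless -f $output;

    # modification times, in seconds since the epoch
    my $sourceTime = (stat $source)[9];
    my $outputTime = (stat $output)[9];
    my $scriptTime = (stat $scriptPath)[9];

    # the data changed after the output was written
    return 1 if $sourceTime > $outputTime;

    # the logic changed after the output was written
    return 1 if $scriptTime > $outputTime;

    return 0;
}
```

With something like this, the main loop would skip a file whenever needs_regeneration returns false, and a --force switch would only need to set one more flag in the option loop.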

  • #!/usr/bin/perl
  • # run a script on every file on the command line, piping it to a different folder and the same name.txt
  • # if the script or the file has changed since the last run
  • # Jerry Stratton astoundingscripts.com
  • use File::Basename;
  • while ($option = shift) {
    • if ($option eq '--help') {
      • help();
    • } elsif ($option eq '--commit') {
      • $commit = 1;
    • } elsif (-d $option) {
      • help("Two directories specified ($outputDirectory:$option)") if $outputDirectory;
      • $outputDirectory = $option;
    • } elsif (-f $option) {
      • $files[$#files+1] = $option;
    • } elsif ($scriptPath = `which $option`) {
      • help("Two scripts specified ($script:$option)") if $script;
      • chomp $scriptPath;
      • $script = $option;
    • } else {
      • help("Unknown option $option");
    • }
  • }
  • help('No output directory specified') if !$outputDirectory;
  • help('No script specified') if !$script;
  • help('No files specified') if !@files;
  • foreach $source (@files) {
    • ($sourceName, $folder, $extension) = fileparse($source, qr/\.[^.]+/);
    • $outputPath = "$outputDirectory/$sourceName.txt";
    • next if -f $outputPath && -M $source >= -M $outputPath && -M $outputPath <= -M $scriptPath;
    • $source = quotemeta($source);
    • $outputPath = quotemeta($outputPath);
    • $command = "$script $source > $outputPath";
    • if ($commit) {
      • print `$command`;
    • } else {
      • print "$command\n";
    • }
  • }
  • sub help {
    • my $message = shift;
    • print "$0 [--help] [--commit] <outputDir> <script> <filename> [filenames]\n";
    • print "Run <script> on every <filename>, piping output to <outputDir> as a .txt file.\n";
    • print "\t--help:\tthis help text.\n";
    • print "\t--commit:\tperform script.\n";
    • print "\toutputDir:\tthe directory to pipe output to, using the source file's filename.txt.\n";
    • print "\tscript:\tthe script to run on each file, whose output is piped.\n";
    • print "\tfilenames:\tthe filenames to run the script on, and whose name produces the output filename with .txt as the extension.\n";
    • print "\n$message.\n" if $message;
    • exit;
  • }

If the input filename is X.ext, the output filename is X.txt. Since the entire purpose of this script is to regenerate the output files, any existing file in the output directory will be overwritten if it is older than the input file with the same name, or older than the script. Take that as a warning: run this script carefully. That’s why I have the --commit switch; the first time I run the script, I want to see what it’s going to do. Only then do I actually recreate the destination files.
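The X.ext to X.txt mapping can be tried out on its own. This is a minimal sketch that reuses the same fileparse call as the script; the helper name output_path_for and the sample paths are made up for illustration.

```perl
use strict;
use warnings;
use File::Basename;

# Hypothetical helper: map a source path to its .txt path in $outputDirectory.
sub output_path_for {
    my ($source, $outputDirectory) = @_;
    # fileparse returns (name, directory, suffix); only the last
    # extension is stripped, so a.b.c keeps its "a.b"
    my ($name) = fileparse($source, qr/\.[^.]+/);
    return "$outputDirectory/$name.txt";
}

print output_path_for('characters/X.ext', 'data'), "\n"; # prints data/X.txt
```

One consequence of this mapping is that two inputs differing only by extension, such as X.ext and X.other, would both be written to the same X.txt.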

Here’s an example of how I used it for the Kolchak game:

  • pipewalk characters/*.txt data daredevils --commit

The character source files were all in a folder called “characters”. I sent the character sheets to a folder called “data” (because they were meant for pulling in to Scribus to create character sheet PDFs). And the script to be run on each source file is daredevils.

The script does some sanity checking. It makes sure that the files exist. It makes sure that the output directory exists. It does not create the output directory if it doesn’t exist, since the entire point of this script is regenerating existing files. In the main use case so far, the Kolchak game, I’d already created individual files one by one as I built up my stable of pregenerated characters.

And, finally, it checks that there is a script with the specified name somewhere on your path. So you can use this for built-in commands as well. Suppose you’re keeping a collection of text files reversed. You might use this command for it:

  • pipewalk *.txt redrum rev

The script detects that the rev command exists; it detects that the redrum folder exists; and it runs every file ending in .txt in the current directory through rev and outputs the results to redrum.

The script doesn’t care about command-line switches or full paths; as long as which can find the part before the space, it will count as a script.

  • pipewalk *.txt yells "~/bin/case --upper"

This runs every .txt file in the current directory through a custom script called case, with the switch --upper.
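For anyone who would rather not shell out to which, the "part before the space" check could be done in Perl itself. The sketch below is a swapped-in technique, not what pipewalk does: the helper name find_command is hypothetical, it searches the directories returned by File::Spec->path(), and unlike the shell it does not expand a leading ~.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the path of the command named by the first
# word of $commandLine, or nothing if no executable file can be found.
sub find_command {
    my ($commandLine) = @_;

    # only the part before the first whitespace names the command
    my ($command) = split ' ', $commandLine;
    return unless defined $command;

    # a name containing a slash is a path, so check it directly
    # (assumption: no ~ expansion is attempted here)
    if ($command =~ m{/}) {
        return $command if -f $command && -x _;
        return;
    }

    # otherwise look in each directory on the PATH, in order
    foreach my $directory (File::Spec->path()) {
        my $candidate = File::Spec->catfile($directory, $command);
        return $candidate if -f $candidate && -x _;
    }
    return;
}
```

A call such as find_command('rev') would return the full path to rev if it is on the path, and switches after the command name are ignored.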

Each of those examples needs --commit to actually regenerate the files.

  1. I have the same problem with awk. For me it’s usually easier in the long run to write a simple Perl script to handle the cases awk is great at.

  2. Obviously the latter could be altered using a command-line switch if it became necessary.
