Perls Before Swine: Creating files

Unix-like operating systems provide an easy means of creating files from any program that has an output. Often, you won’t even need to worry about creating files; you’ll just redirect the output to a file and let the operating system handle it for you.

./show --exact --artist foreigner --format raw songs.txt > foreigner.txt

Because you can pipe directly from one program to another on the command line, you sometimes won’t even need to create files to store temporary data. If you want to count up how many songs Foreigner has in songs.txt, you can:

./show --exact --artist foreigner songs.txt | wc -l

Or, one of my favorites,

./show --exact --artist foreigner songs.txt | rev

But sometimes we do need to create our own files, and Perl makes this easy. Suppose we wanted to be able to create multiple files, perhaps one for each album, or one for each artist?

We can add a switch for this easily enough.

} elsif ($switch eq "export") {
    $exportField = shift;
    if (!grep(/^$exportField$/, @validFields)) {
        print "\nI can only export by $validFields.\n\n";
        help();
        exit;
    }

This switch is exactly like our sort switch. It accepts a valid field; if the user tries to export by something other than a valid field, the script will warn them and exit.
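Here is a minimal sketch of that check on its own. The anchored pattern in the switch treats whatever the user typed as a regular expression, so a plain string comparison is a slightly stricter alternative. The field names and the isValidField helper below are made up for illustration; the real script has its own @validFields.

```perl
use strict;
use warnings;

# Hypothetical field list; the real script defines its own @validFields.
my @validFields = ('song', 'artist', 'album');

# Return a true value if $field exactly equals one of the valid field names.
sub isValidField {
    my ($field) = @_;
    return 0 unless defined $field;
    return scalar(grep { $_ eq $field } @validFields);
}

foreach my $candidate ('album', 'alb', 'year') {
    my $verdict = isValidField($candidate) ? 'accepted' : 'rejected';
    print "$candidate: $verdict\n";
}
```

Only “album” should be accepted here; a partial name such as “alb” is rejected because eq compares the whole string.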

If the data is being sorted, we are going to have to wait until the end to export the files. So to make it easier, we’ll simply always wait until the end to export the files. This lets us re-use some of the code for sorting. Change:

if ($sortby) {
    $matches[$#matches+1]{'text'} = $text;
    $matches[$#matches]{'sort'} = $$sortby;
} else {

to:

if ($sortby || $exportField) {
    $matches[$#matches+1]{'text'} = $text;
    $matches[$#matches]{'sort'} = $$sortby if $sortby;
    $matches[$#matches]{'file'} = $$exportField if $exportField;
} else {

The script will now remember the matches if either $sortby or $exportField has something in it. We only store the ‘sort’ association if $sortby has something in it, and we only store the ‘file’ association if $exportField has something in it. If $exportField is “album” and $album is “Head Games”, ‘file’ will associate with “Head Games” for this record.
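One way to picture what ends up in @matches is as a list of small associative arrays, one per matching record. The sketch below builds that structure from two sample records. It is not the script's own code: it looks each field up in a hash instead of going through a variable named by $sortby or $exportField, so that it can run under “use strict”, and the @records list is an assumption made for the example.

```perl
use strict;
use warnings;

my @matches;

# Sample records standing in for lines read from songs.txt.
my @records = (
    { song => 'Dirty White Boy', artist => 'Foreigner', album => 'Head Games' },
    { song => 'Cold As Ice',     artist => 'Foreigner', album => 'Foreigner'  },
);

my $sortby      = '';        # as if no sort switch was given
my $exportField = 'album';   # as if --export album was given

foreach my $record (@records) {
    my %match = (text => "$record->{song}\n");
    # Store each association only when its switch was actually used.
    $match{sort} = $record->{$sortby}      if $sortby;
    $match{file} = $record->{$exportField} if $exportField;
    push @matches, \%match;
}

foreach my $match (@matches) {
    print "$match->{file}: $match->{text}";
}
```

Because $sortby is empty, none of the remembered matches gets a ‘sort’ association, but each one has a ‘file’ association holding its album name.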

So now we need to change the code that deals with @matches. Change this:

} elsif (@matches) {
    @matches = sort byCustom @matches;
    foreach $match (@matches) {
        print $$match{'text'};
    }
}

to:

} elsif (@matches) {
    @matches = sort byCustom @matches if $sortby;
    foreach $match (@matches) {
        if ($exportField) {
            $filename = $$match{'file'};
            #open the file if we haven't already
            if (!$files{$filename}) {
                if (!open($files{$filename}, ">$filename")) {
                    print "Unable to open $filename: $!\n";
                    exit;
                }
            }
            $filehandle = $files{$filename};
            print $filehandle $$match{'text'};
        } else {
            print $$match{'text'};
        }
    }

    #close all open files
    foreach $filehandle (values %files) {
        close($filehandle);
    }
}

Note that in the second line we now only sort if $sortby has something in it. Otherwise, there’s nothing to sort on.

We’ve added a new section for “if ($exportField)”, so that if $exportField has something in it we will print to a file instead of to the “standard output” (usually the screen).

Before writing to a file, the file has to be “opened”. We need to get a “handle” on the file. Since we need to have a number of files open at once, it makes sense to store the file handles in an array. This script stores them in an associative array called %files, associating each handle with its filename.

Before opening the file with that filename, the script checks to see if there is already a handle associated with that filename in %files. The script only opens the file if there is not an existing handle associated with that filename.

If a file needs to be opened, the script opens it within an if, so that if there’s an error opening the file it can print an error and exit. Perl stores the most recent system error in a special variable called “$!”. So, if there’s a problem opening $filename, we have the script print “Unable to open $filename” and then “$!”. The error message is often very useful. For example, if you don’t have permission to open a file, the error message will say so.
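To see “$!” in action without touching any real files, here is a small sketch that tries to open a file inside a directory that does not exist. The path is invented for the example, and the exact wording of the message depends on your operating system and language settings.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A throwaway directory that is removed when the script ends.
my $dir = tempdir(CLEANUP => 1);

# A path inside a subdirectory that was never created, so open must fail.
my $filename = "$dir/no-such-directory/album.txt";

my $error = '';
if (!open(my $handle, ">$filename")) {
    $error = "$!";   # copy it right away; later system calls may change $!
    print "Unable to open $filename: $error\n";
}
```

Copying “$!” into a variable of your own is a useful habit if you plan to do anything else before printing it.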

The important new part is “open($files{$filename}, ">$filename")”. Used this way, the open function takes two parameters. The first is the variable where we want to store the handle to the file. The second is the name of (or path to) the file we want to open. If we want to be able to write to the file, we need to prepend a greater-than symbol to the filename. (We can also append to files by prepending two greater-than symbols to the filename.)
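The difference between one greater-than symbol and two is easy to try out. The following is only a sketch: writeLine and contents are helper names invented here, and the file lives in a temporary directory so that nothing of yours is overwritten.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/demo.txt";   # hypothetical file name

# Open $path with the given prefix (">" or ">>"), write one line, close.
sub writeLine {
    my ($mode, $text) = @_;
    open(my $handle, "$mode$path") or die "Unable to open $path: $!\n";
    print $handle $text;
    close($handle);
}

# Return the file's current contents as a single string.
sub contents {
    open(my $handle, "<$path") or die "Unable to open $path: $!\n";
    local $/;                 # read the whole file at once
    my $text = <$handle>;
    close($handle);
    return $text;
}

writeLine('>',  "one\n");
writeLine('>',  "two\n");     # ">" starts the file over, so "one" is gone
writeLine('>>', "three\n");   # ">>" adds to the end, after "two"

print contents();
```

This should print “two” and “three” on separate lines. Perl also has a three-parameter form, open($handle, '>', $filename), which keeps the mode separate from the filename; that is worth considering when filenames come from data you don’t control.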

So, if the script can successfully open the file, we now have a handle to it in $files{$filename}. All that remains is to get it (with “$filehandle = $files{$filename}”) and print to it.
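Pulled out of the script, the open-once-and-remember pattern looks something like this. It is only a sketch: handleFor and the sample lines are invented for the example, it dies on failure rather than printing and exiting, and it writes into a temporary directory so that nothing in your current directory gets overwritten.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write into a throwaway directory so the sketch cannot clobber real files.
my $dir = tempdir(CLEANUP => 1);

my %files;   # filename => open file handle

# Return the handle for $filename, opening the file on first use only.
sub handleFor {
    my ($filename) = @_;
    if (!$files{$filename}) {
        open($files{$filename}, ">$dir/$filename")
            or die "Unable to open $filename: $!\n";
    }
    return $files{$filename};
}

# Hypothetical album/text pairs standing in for the remembered matches.
my @lines = (
    ['Head Games', "first line\n"],
    ['Foreigner',  "second line\n"],
    ['Head Games', "third line\n"],
);

foreach my $line (@lines) {
    my ($album, $text) = @$line;
    my $filehandle = handleFor($album);
    print $filehandle $text;
}

# Close every handle we opened.
foreach my $filehandle (values %files) {
    close($filehandle);
}

print scalar(keys %files), " files written\n";
```

Three lines go in but only two files are opened, because the second “Head Games” line reuses the handle that is already stored in %files.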

If you look at some of the previous print commands, they have multiple variables or multiple pieces of text, separated by commas. Print can accept any number of pieces of text, separated by commas. However, if the first variable is not separated from the rest of the variables or text by a comma, print assumes that this is a handle to a file, and redirects its output to that file handle.

That’s why there is only a space between $filehandle and $$match{'text'} in “print $filehandle $$match{'text'}”.
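A short sketch makes the comma rule visible. It prints to an in-memory “file” (a Perl feature that lets a handle write into a string) so we can inspect what arrived; the variable names are invented for the example. Note that print only accepts a simple variable in the handle position, which is why the script copies $files{$filename} into $filehandle first; wrapping the handle in braces is the other way around that restriction.

```perl
use strict;
use warnings;

# An in-memory "file" makes it easy to see what was written to the handle.
my $buffer = '';
open(my $filehandle, '>', \$buffer) or die "Unable to open buffer: $!\n";

my $text = "Head Games\n";

# No comma after $filehandle: the text goes to the handle, not the screen.
print $filehandle $text;

# Braces around the handle mean the same thing and are easier to spot.
print {$filehandle} "Double Vision\n";

close($filehandle);

# Commas everywhere: both pieces of text go to standard output.
print "buffer holds: ", $buffer;
```

Only the last print writes to the screen; the first two end up in $buffer.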

Finally, after looping through all matches, we grab every value out of %files (each of which is a file handle) and close that file. The phrase “values %files” is the same as “keys %files” except that it gets a simple array of the values in %files, rather than a simple array of its keys.
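If the difference between keys and values is hard to picture, this small sketch may help. The hash here is a stand-in for %files that holds plain strings instead of real file handles, purely so the result can be printed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for %files: plain strings instead of real handles.
my %files = (
    'Head Games'    => 'handle 1',
    'Double Vision' => 'handle 2',
);

# Perl returns hash entries in no particular order, so sort before printing.
my @names   = sort keys %files;
my @handles = sort values %files;

print "keys:   @names\n";
print "values: @handles\n";
```

The order does not matter for closing files, which is why the script can loop over “values %files” directly.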

Perl will close files for us automatically as soon as the script ends or exits. But I like to close them explicitly as soon as they are no longer needed. Otherwise they hang around, open, until the script ends. Here that’s not a big deal, but later on we might alter this script and add more functionality at the end. If that functionality involves opening files too, we might run up against the operating system’s limit: most operating systems limit the number of files any one program can have open at once.

Having done all of this, we can now grab, say, all albums by Foreigner and create a separate file for each one:

./show --exact --artist foreigner --export album songs.txt

Of course, you’re going to want to make sure that no album has the same name as a file you don’t want to erase: every time Perl opens a file for writing with a single greater-than symbol, it will happily erase an existing file with the same name. We’ll see if we can do something about that in the next section.

And, of course, add this to the help subroutine:

print "\t--export <$validFields>: export to files named after the specified field\n";

You probably don’t want to play around too much making export files. It will be very easy to create hundreds of files in your current directory. We’ll fix this next.
