From: R. Russell Allbery <[r--a] at [CS.Stanford.EDU]> Newsgroups: rec.arts.comics.info Subject: Comics database Date: 6 Feb 1994 18:27:42 GMT Okay, here you go. Sorry about the delay. First is a description of my database, and then I attached all the conversation on rac.misc about building a database, as well as some information I got from comp.databases. -------- A * indicates that a field is a key field. N are numeric fields, A are alphanumeric fields of size give by the number after the A, D are date fields, and $ are money fields. Comics Table ------------ Code & N* This table is linked to Stories Title & A40 by the Code field and the Volume & N Book Code field in Stories. Number & A30 Version & A10 Printing & N Market & A6 Publisher & A25 Date & D Format & A20 Color & A5 Number of Pages & N Cover Price & $ Condition & A4 Book Value & $ Continuity & A40 Appearances & A60 Notes & A60 Stories Table ------------- Book Code & N* The Book Code and Story Code Story Code & N* fields link this to the Jobs Story Type & A15 table. Page Number & N Story Grade & A2 Art Grade & A2 Jobs Table ---------- Book Code & N* The Creator Code links this Story Code & N* to the Creators table. Creator Code & A8* Job & A10* Creators Table -------------- Name Code & A8* Full Name & A40 --------- From: [s--g] at [rigel.statoil.no] (Stig Tollefsen) Newsgroups: rec.arts.comics.misc Subject: Comic Databases Date: 24 Aug 93 14:29:22 GMT This is the revised and updated version of my contribution. This document includes tables for characters and continuity, and takes into account some of the suggestions made by other good people. My apologies for not posting it earlier, but I've been frightfully busy! Be sure to post suggestions and criticism. Just a few notes on the comics database tables: Primary key: Title Volume Issue No Version [Distribution suggested by Scott McMahan - put it in if you care about the distinction.] Print No Title is the exact title as written in the indicia. Volume is the series number - Hawkworld v1 is the Truman mini prestige while Hawkworld v2 is the 32 issue ongoing. There is a problem with such a construct though - how to number successive runs of titles published by different companies, like the numerous The Shadow titles. Also, some people collect comics from more than one country - how do you deal with say the American Aliens series vs. the UK one? Thoughts on this would be appreciated. Issue No. needs to be a string because of peculiar issue numbers like double numbering (Cerebus 112/113) and the notorious Limited Collectors' Edition from DC with numbering starting with C-. Version is for issues with variant covers etc. The string can be an explanation like 'gold cover' or an established convention as with X-Men v2 #1A-E. Print No has to go in the key for value reasons; note that a reprint does not have to be identical to the first print; it does however need to have the same title/number in the indicia. Thus neither Marvel's Milestone editions nor DC's Silver Age Classics series qualify as reprints; they are separate titles. With the latter, convenience would force us to treat 'DC Silver Age Classics' as the title with say 'House of Secrets 92' as the Issue No. There is a drawback with using a relational design; it does not easily enable us to follow a series through name changes (Moon Girl/Moon Girl and the Prince/A Moon, A Girl, A Romance/Weird Fantasy/Weird Science-Fantasy/Incredible Science-Fiction is a good example ;-)). Neither does it let us model the branching or merging of titles (Weird Science and Weird Fantasy becoming Weird Science-Fantasy is an example of a merger, All-Star Comics becoming All-Star Western and much later becoming All-Star Comics again, resuming the old All-Star Comics numbering, is an example of a split). I have done an object-oriented model of this, if anyone's interested I'll post it. In addition to this, the main table could contain: No. of pages Cover price Format Size Date Year of Publication I know values are a big deal for you speculators out there - a separate table can be connected with the key of the previous as a foreign key, giving: [key] Number of issues Grade Current value Purchase price This would require one entry for a purchase of 10,000 Youngblood 1s in NM, with Current value steadily decreasing :-P, but would require several if these were spread over different grades. Alternatively, have a value breakdown table showing the percentwise value of each grade (M 120%, NM 100% VF 70% etc.). These need to be tagged, because this breakdown is different for different classes of comics (a NM $10 silver age comic in FN is worth less than a current NM $10 comic in FN, for some reason). This table could be keyed with a string saying 'current', 'silver age' etc. The above table would then be broken into two: [key] Number of issues Grade Purchase price Grade breakdown [key] Current NM value If you're collecting comics from more than one country, you obviously need a currency conversion table as well, which is kind of complicated, because you would need to keep each occurence in the database if you want to calculate profits! For you Britons out there, this means that your purchase of Spawn 1 at GBP 1.25 would show a profit as US values rise, but you don't want that profit to waste away if the pound ever gets stronger against the dollar. It's the exchange rate at purchase that counts. And yes, I know you have your own guide but there are other nationalities out there. Now to publishers - this could be put in the main table, but if you want to be fair to publisher collaborations like 'Marvel and DC Present Featuring the Uncanny X-Men and the New Teen Titans' (wheee!!), you need a separate table: [key] publisher This combined would constitute this table's key. It is a bit cumbersome, and one of the reasons I'm not too fond of a relational design... If you want to get really advanced, there is an additional problem: Vertigo is part of DC Comics which is part of Time-Warner. All three publish comics (at least I _think_ Time-Warner has published a few mass-market paperback editions, but you get the point). Do we want to distinguish Vertigo from DC, but all the same keep the information that any Vertigo book is also a DC publication? If we do, then we need some way to model it. For my o-o design I did a recursive container-class construct, for a relational design it has to look something like this: publisher sub-publisher (with NULL as default) Any query involving publisher would then have to be cross-checked against this table to pick up sub-publishers as well. Date: Dates have to be defined in a separate table or as an abstract type, if your package supports it. We need support for: Weekly comics Bi-Weekly comics Monthly comics Bi-Monthly comics Quarterly comics Annuals At least. In some cases, it is desireable to sort comics by time of publication. This necessitates the following information to be kept: Key: Frequency (Monthly, Weekly...) Max_in_year (12 for monthly, 53 for weekly...) Sequence (5 for May, 18 for Late September...) Name ('January', 'Early August'...) Name would not be used for weekly comics, so give a NULL for those. Additionally, a Semi-Monthly type could be added, meaning 24 issues a year, as opposed to Bi-Weekly which really means 26 issues a year. Quarterly comics are a bit vague; each season should correspond to three calendar months, but there is no consensus on which ones, and subsequently this often varies. Note that this table is mostly constant, and would better be implemented as an abstract type. Year of publication should go in the main table. Format: It's a matter of taste whether one wants size to be part of Format or not. I'll say 'Standard Format Magazine' is valid and therefore put Size on its own, like this: Format Key: Name Paper Quality Binding Method Cover Stock Size Key: Name Width Height It is not appropriate to put colour in either the main table or the Format table; format is independent of colour, you can have b&w prestige format books as well as colour ones. Also, part of a book can be b&w and part colour, so it's best to put this in with the story information. Contents: A comic does not always contain only one story with only one each of writer, artist, editor, what have you. Sometimes a comic contains art pages (pin-ups), text pages, editorial material and (shudder) ads. Most comics also have a cover. In addition, individual stories are often reprinted. You don't want to duplicate story information, so that rules out a direct link (in database terms, what we have here is a many-to-many relationship). Thus: Contains: [main table key] [contents key] These two together are the key for this table. Contents: [main table key] gives comic where first published Type (Story, Pin-Up, TextPage, Cover...) Sequence Name Number of pages Comment for special information. Ex: 'Superman dies in this issue'. The first three are the key. Type and Sequence are supposed to convey information like 3rd story, 2nd cover etc. This is needed because stories and other material cannot be keyed by title - titles are often reused, and some stories are untitled (I don't think there is a single titled story in LSH v4, well 'The Quiet Darkness' maybe). Give default of 'Untitled' to Name. Of course, stories are often subdivided. A storyline can span several issues, and a story can have several chapters within one issue. These can be entered as usual in the Contents table, but left out of the Contains table. In this way, the Watchmen storyline can be shown to include all 12 (comic-book story) chapters/sub-stories plus the text pages. A separate table must be used to link these: Storyline: [contents key for storyline] [contents key for chapter] Queries must be written to use the contains table when traversing comic->contents, but search storyline table when traversing contents->comic. The possibilities are many and varied. Creators: [contents key] Job Start page default 1 End page default number of pages (If you are really hot on this idea, you can have start/end panels also here) Name Job would be penciller, inker, writer, plotter, painter, colourist, whatever. If you want to do searches on creators by surname, an abstract type with Surname, First Name, Intitial or something of the sort is required. There are several points to make here. I have used a lot of mostly undefined abstract types; these are trivial with a good package but may cause problems. Also, what to do with aliases? Graham Ingels and Ghastly are the same person, when you do a search on one, do you automatically want the other? Then you need a separate table which cross-references all aliases. What about Louise Jones/Louise Simonson? A good comics database should handle this. Finally, keys become quite large here. If your package can generate unique aliases, use them. Another desireable feature is the ability to create sort algorithms for the abstract types. Ideally this model needs an object-oriented DBMS; I have implemented some of it using ONTOS and C++, it really does the job much better. I guess my point here is that with all things taken into consideration, the average user just can't build a satisfactory solution himself. I'll say a commercial product would be beneficial. Comments? Characters: Modelling characters is similar to modelling creators. For example, Dick Grayson, Jason Todd and Tim Drake are different characters, but they are all (or have been) Robin. Similarly, Iron Man, Captain Marvel, Thor, Doctor Fate, Batman even - have all had different people behind the mask. When we ask for Batman, we want all Batman appearances - no matter who's behind the disguise, but asking for Bruce Wayne should exclude all of AzBat's. To do this, an equivalence table can be constructed: Alias: character alias start [contents key] end [contents key] This will have to be tied up with a continuity table (see later). Without the start and end attributes, there would be no way to distinguish Tim Drake's appearances before he became Robin from those after. We need to link a character to a story: [contents key] appearance type character Continuity: Few things make fans more angry than bad continuity. With the wonders of database technology, you too can become a continuity wizard! A few points must be made: Even though two connected stories may have continuity conflicts, this doesn't necessarily mean that continuity's screwed, only that it has branched into different paths. For our purposes though, we have to assume that when one story makes a reference to another, they belong to the same continuity. The destruction of New Genesis in the Hunger Dogs was later retconned away. This is valid; continuity before Hunger Dogs branches into two different continuities: the Hunger Dogs one and the current one. Cosmic Odyssey belongs to the Hunger Dogs continuity since it mentions the destruction of New Genesis. The inconsistency happens when current continuity makes references to Cosmic Odyssey! You can't have your cake and eat it too. Continuity cannot be dictated from the editorial regime; once the stories are written, they define continuity by themselves. By writing down all outside references in each story, a continuity tree can be constructed. Note that if stories A and B both refer to story C, and story A also refers to story C, the last reference is redundant, and should be thrown out. Note also that by nature references are made to stories previous in (virtual story) time, while we need to traverse continuity in the opposite direction, from start to finish. This must be done in the database package using some sort of rule. If story A refers to story B, then (part of) story B takes place before (part of) story A. The key of story A is then inserted into story B's continuity table. There are of course some difficulties with this. A story X using flashbacks could have the framing sequence referring to story W, while the flashback sequence occurs way back before story A (assuming linearity A-X). You then get a closed loop, and this has to be resolved by the means of database rules. Time travel stories also make life interesting (remember Mr.Z meeting Superman *again* in Superman 53(?), while Supes never saw him before? Later on, in the 'Time and Time Again' storyline, Supes meets Mr.Z again in Germany during WWII, and this is Mr.Z's first meeting with Supes? Closed loop, folks). The table itself is simple: Continuity: [story key] Remember that you don't have to put in tables you'll never use, and if you do, they won't take up disk space unless you fill them in. Other fancy stuff can easily be added, like picture files for covers (assuming you have access to a scanner), but this will inevitably eat up your hard drive. I can't help thinking that CD-ROMS full of already keyed-in data would be a good thing for both us and the publishers (done correctly, they could avoid all continuity glitches). Comments are appreciated, and criticism too; I'm not perfect. Stig Tollefsen E-mail: [s--g] at [vega.data.st.statoil.no] "It's a long hard road, and a full hard drive" - "Networking", Warren Zevon =============================================================================== From: [s m d] at [floyd.brooks.af.mil] (Sten Drescher) Newsgroups: rec.arts.comics.misc Subject: Re: Comics Databases Date: 6 Aug 93 05:16:04 GMT Scott> Stig Tollefsen ([s--g] at [rigel.statoil.no]) wrote: Scott> And a newsstand/direct field. This, as well as printing #, can all be covered by a single version field. Scott> The emphasis has been on speculating so far, but we need Scott> to include some creator data. Which, of late, has been real motivation for speculation. I don't want value data in the database for speculation, I need it so I know how much to insure my collection for. Scott> Personnel table Scott> [key] Scott> Writer: Scott> Penciler: Scott> Inker: Scott> Colorist: Scott> Letterer: Scott> Editor: Already suggested. However, a fixed list like this won't work - what do you do with plotters, scripters, 'breakdowns', co-writers, multiple stories, 'anniversary' issues where the creative team changes every 4 pages, etc. Scott> Instead of that huge key, it might be better to implement a Scott> book table: Scott> Title: Scott> Publisher: Scott> Unique Code: Scott> Series type: ongoing, mini-, one-shot, special, graphic novel etc. Scott> And use Unique Code, Vol, Issue as the key. Normalize it a little. Scott> And you could add notes to the book table about the book in general. Scott> Ex: Scott> Title: Avengers Scott> Publisher: Marvel Scott> Uniq Code: AVE Scott> Series: ongoing Well, the unique key should be automatically generated. A good hash function should be able to handle it, but I would prefer the table to be more like: Title: The Legion of Super-Heroes Volume: 3 Series Type: ongoing Publisher: DC Sort Key (never seen, generated): legionofsuperheroes_3_ongoing Hash Key (never seen, generated): 13642 The primary key for the file would be the hash key (I'm assuming a 64K limit on titles ;-), with secondary keys based on the sort key, and the publisher and sort key. The sort key would be generated by stripping leading any a/an/the, whitespace, punctuation, and converting Roman and Arabic numerals to the appropriate sort value (will have to prompt about this - more in a sec) in the title, and concatinating that with the volume number and the series type, separated by underscores. The number conversion would need to be manual, because of the widely differing situations: Generation X: no conversion Robin II: II -> 2 Robin 3000: 3000 -> threethousand etc. Scott> Maybe even an equivalent and/or became table... Scott> BECAME table Scott> West Coast Avengers : Avengers West Coast Scott> Classic X-Men : X-Men Classic Scott> Coke : Coke Classic The became table is good, but it needs to handle splits and merges (Dr Strange + Cloak & Dagger => Strange Tales => Dr Strange, Sorcerer Supreme + Mutant Misadventurtes of Cloak & Dagger). It will, if you don't insist on unique keys. Scott> So that only one record for a title would be in the book table -- Scott> and use the current name for the book... Bad move! How would you have dealt with the situation above? Just because the numbering isn't carried on, doesn't mean that the storyline isn't continued, and that's the primary purpose of the become table. Also, I believe that several of the early Marvel superhero titles started sharing a pre-hero title, then split into multiple titles, each picking up the numbering of the former title. I know the Hulk did, and I think Iron Man, Captain America, and Thor did the same thing. -- Sten Drescher [s m d] at [floyd.brooks.af.mil] #include =============================================================================== From: IN%"[m c tajdi] at [vaxd.dct.ac.uk]" 17-AUG-1993 08:44:11.53 Subj: Paradox: Need help implementing a database design >Here's the problem: Each comic book can have one or more "stories," where a >story can be anything from the entire book to a single page gag. Each of >these stories are done by creators (usually writer, penciler, inker, letterer, >and colorist, but this can change), and the creators can change from story to >story. I want to be able to keep track of all of the creators who worked on a >given comic, and also keep track of which story they worked on. There are so >many different possible divisions of labor, as well as the possibility of four >or five pencilers in a single story, that I can't use Writer, Penciler, etc. >fields. Instead, I would like to have a name field and a job field so I can >enter the name of the job. Essentially, I think there would be a table of >stories for each comic entry, and a table for each story listing all of the >creators. I have no idea how to do this. > >Worst case scenarios: Around six stories in a single comic, each with a >different set of creators. Up to forty creators on a single story with an >anthology story (20 or so writers and 20 or so artists). These extremes are >what makes this difficult. Something like this perhaps........ TABLE FIELDS TYPE comment Comic Comic# A5* - unique identifier (allows letters). Title A30 \ _ /and any other info you keep Date D / \ for each comic. Story Story# A5* - unique identifier. Title A30 - and any other info.... Comic# A5 - link to 'Comic' table. Person Person# A5* - unique identifier. Name A30 - and any other info..... Occupation A20 - eg writer Job Job# A5* - unique identifier. Person# A5 - link to 'Person' table. Story# A5 - link to 'Story' table. This is just off the top of my head, no guarantees..... The links enable you to join all these tables when doing queries. Let me know if you have any problems. Alex =============================================================================== From: IN%"[T--M] at [SRC.SBS.utah.edu]" "Tim Ma" 17-AUG-1993 18:13:40.46 Subj: RE: Paradox: Need help implementing a database design Interesting idea. Just off the top of my head, but a hierarchical design seems to be a possible solution. Work from the large entity to the smaller entities. So, from COMIC BOOK --> STORY --> CREATORS (3 tables) Table1: COMICS - this table should describe the comic book as a whole Fields: [ID] - an arbitrary number for ID purposes (unique) [Publisher] - Example: DC Comics [Title] - Example: Superman [Number] - Example: 241 [Date] - Date released or something I'm not quite sure how comic books are identified, so you could add whatever fields you think are necessary to help you identify the book as a whole. Perhaps [Series] = Superman and [Title] = "Death of Superman" or whatever. Do comics have actual titles like that? You will want to index the [ID] field in this table because: (1) ensure it is unique (2) speedier lookups Table2: STORIES - identifies stories within each book Fields: [ID] - should correspond with ID of the whole book [Story #] - (see notes below) [Story Title] - (see notes below) [Type] - (see notes below) The [Story #] is just another arbitrary number that identifies the story within each comic. The logical choice would be just starting from 1 and working your way up. The number should be unique *within* the book, but may be duplicated across comic books. Example: [ID] [Story #] ---- --------- 1001 1 first story in comic 1001 1001 2 second story in comic 1001 1001 3 etc ... 1002 1 first story in comic 1002 1002 2 etc ... [Story Title] and [Type] are just identifiers. Again, you know more about the field that I so you could probably think of better fields to identify stories with. Maybe page numbers (?). I put [Type] because you referred to "single page gags". Perhaps there are other descriptors. You will want to index both [ID] and [Story #] in this table. Table3: CREATORS Fields: [ID] - corresponds with both of the above tables [Story #] - corresponds with STORY table [Name] - name of creator [Title] - writer, inker, penciller, whatever ... You may want to index the first 3 fields here. This will ensure that you have a unique comic-story-creator value. Unless of course, the same person does the writing, inking, pencilling, etc... You have some leeway with this table. It just depends on how you want to break it down and identify the creators. So you can see, CREATORS duplicates the [ID] and [Story #] fields in STORY, but it doesn't duplicate the story title, story type, etc ... With these 3 tables, you should be able to find a comic book using any of the above fields (fields from all 3 tables). Suppose you wanted to see a list of the comic books (title and individual stories) "Bob Jones" had a hand in, but only if they were published by DC Comics. In Paradox, you do the ASK command to query the tables. Pull all three tables up using ASK and do the following: COMIC --- ID ---- Publisher ----- Title ----- Number --- | | | | | | a | DC | X | X | STORY --- ID ---- Story # --- Story Title -- | | | | | a | | X | CREATORS --- ID ---- Story # ------ Name ------- Title -------- | | | | | | a | | Bob Jones | | ** I'll use an 'X' instead of a check mark since the check mark is an extended ASCII char and probably won't make it thru the mail gateway. [F6] generates a checkmark in Pdox. ** The 'a' is a query-by-example character. The character itself is arbitrary as long as it's the same for each table, just hit an [F5] and type out a character. It should show it highlighted. Pdox manual has more. So for your output you would get a table in the following format: ANSWER --- Title ---- Number --- Story Title ------------ | | | | | Batman | 102 | Tales of ... | | Batman | 105 | Batman eats Robin ... | These would be all the comic books published by DC where "Bob Jones" had a hand in creating some of the stories. kinda rough, but like I said, it's just off the top of my head. You'd have to work out a way of getting data in there by creating a multi-table form, but that's not too bad. hope this helps, tim ma associate programmer university of utah email: [t--m] at [src.sbs.utah.edu]