Encoding Peanuts


As I’ve told you, this class got its start with Amy Schulz Johnson asking my colleagues in ODH to help her do some analysis of her dad’s work. Those colleagues created a corpus of Peanuts by harvesting data from GoComics.com. The GoComics team has created transcripts of what happens in the comics, which are tremendously helpful. But…it turns out they’re not always exactly correct. Just take a look at Schulz’s comic for 23 January 1952 and GoComics’s corresponding data transcript:

Image of 23 January 1952 Peanuts strip

Schroeder sits at the piano playing. Charlie Brown comes up to him and says, “Say, that’s sensational, Schroeder…what is it?” He looks to the reader confused as Schroeder replies, “Grasse Sonata fur das Hommer-Klavier” As Schroeder resumes his playing, Charlie Brown walks away. The latter says, “Sometimes I feel like I don’t belong around here!”

So it’s close…but it’s maybe not close enough. There’s a lot of information that isn’t captured in this transcript: what music Schroeder is playing and all its attendant information; what happens in which panel; the question mark that Charlie Brown “speaks”; the McCloudian transitions (see chapter 3) across the gutters; and how Schulz signs his name. And that’s just the first five things I thought to type! What’s more, there are typos! It’s very clear in the comic that Schroeder says “Grosse Sonata für das Hammer-Klavier” and not “Grasse Sonata fur das Hommer-Klavier”! Apparently whoever transcribed this comic wasn’t used to reading Fraktur.

We can do better than this! Creating marked-up editions of Peanuts strips will give us the chance to represent the strip more accurately, as well as to enrich it in the specific ways that we care about: settings, place names, weather, activities, the Bechdel test, and pretty much anything else you can imagine. Creating such editions is time-consuming and requires serious attention to detail. But the opportunities for research that we gain with this process make it worth the investment. To quote John A. Walsh, encoding Peanuts will “facilitate both close and distant reading strategies” (par. 17).

Nitty Gritty

Each of us will mark up one month of strips from each of the volumes we’re reading for class. That’s 61 strips for each of you, when all is said and done. (I get more because I’m special.) And then you’ll write a reflection on the project…because of course you will.

This assignment has three due dates. The first month of encoded strips is due on 18 March. The second month of encoded strips is due on 8 April. The reflection on the whole assignment is due on 11 April.

Before you read further, it’s worth saying that this assignment looks…overwhelming. We will be covering all of this in class and will have some time in class where we are working together to tag things to get you used to what you’re doing. It’s true that this is an assignment that will take you a significant amount of time, especially at first. But you can do it, as long as you don’t leave it to the last minute.


  • Bills: January 1961 (31), March 1981 (30)
  • Harper: March 1961 (30), January 1981 (31)
  • Jeanfreau: April 1961 (30), May 1981 (31)
  • McFarland: May 1961 (31), April 1981 (30)
  • Pearce: June 1961 (30), July 1981 (31)
  • Croxall: February 1961 (28), February 1981 (28), June 1981 (30)

Get Oxygen

Download Oxygen XML Editor and sign up for a free, 30-day trial. You will receive a license key via email and you will need to use it with your individual copy.

Create the strip file

In Oxygen, open your personalized strip template from our repo and copy all of its text (Ctrl/Cmd-A, then Ctrl/Cmd-C). Then open a new document in Oxygen and paste the template (Ctrl/Cmd-V) into it. Do NOT start editing the template itself.

Name the file as follows: peanuts-[four-digit year]-[two-digit month]-[two-digit day].xml. For example, the first strip, which is from 2 October 1950, would have the following filename: peanuts_1950_10_02.xml.

Save the file in the encoded-strips folder in your local copy of the GitHub repo.

Encode the strip

Encode the strip as follows. If you run into questions, be sure to refer to our “Peanuts Encoding Editorial Decisions” document. Or watch the following (coming-soon) video!

Don’t be alarmed at the size of this list. For most of the strips we encounter, many of these possibilities simply won’t occur.


Since we will each have our own personalized template to work from, we won’t have to make too many changes to the header. However, you will need to update your template when we move from the 1961-1962 volume to the 1981-1982 volume (specifically <title> and <date> on line 40).

For each strip, however, you still need to make the following changes in the header:

  1. <title> (line 11): Change the title to have the correct date in YYYY-MM-DD format
  2. <idno> (line 35): Set the date to the proper YYYY-MM-DD format. Please make sure you don’t put a space between Peanuts and the year.
  3. <title level="a"> (line 40): Change the date of the strip to the proper date in YYYY-MM-DD format.
  4. <change> (line 68): For the value of @who, use your r-name from the taxonomy. For the value of @when, record the date you are doing the edits using YYYY-MM-DD format. You’ll note that the description of the change is “Initial encoding.” If what you’re doing does not match that description, replace that phrase with a short description of what you’re doing.
  5. <graphic> (line 74): update the value of @url to the proper date for your comic. Open the URL in a browser (hold Ctrl/Cmd while clicking on it) to make sure it resolves properly.

N.B. Please note that you can accomplish these first three by using Find/Replace (Ctrl/Cmd-F) and replacing YYYY-MM-DD with the proper date and then clicking Replace All.
GIF showing find and replace
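
As a rough sketch of step 4, a filled-in <change> element might look something like the following. The r-name (#r-yourname) and the date are placeholders I’ve made up for illustration, and I’m assuming r-names are referenced with a leading # the way the character shortcodes are. Use your own r-name from the taxonomy and the day you actually do the work.

```xml
<!-- Hypothetical values: swap in your own r-name and the date of your edits -->
<change who="#r-yourname" when="2023-03-01">Initial encoding.</change>
```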


Most of the work that you will do for each strip is in the <body> portion of the document. I suggest working on one aspect of the strip at a time (e.g., speech balloons or transitions) rather than moving through the strip like you are reading it.

Do the following 13 (!?!) things for each strip:


1. <head> (line 80): Change the date within the element to (D)D [full month name] YYYY format.

Weather, Activities, Settings, Holidays, Visual Aspects, Bechdel Test, and Story Arcs

2. <div type="panelGrp"> (line 82): For each strip, tag weather, activities, settings, holidays, visual aspects, story arcs, and the Bechdel test using the @ana attribute. For a full list of the different options, see our Editorial Decisions Document.
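
Here is a minimal sketch of what the @ana attribute can look like once it holds several tags. The three values are invented placeholders, not real entries from our taxonomy; the only thing this is meant to show is that multiple values sit in one attribute, separated by spaces.

```xml
<!-- Placeholder values only: take the real ones from the Editorial Decisions Document -->
<div type="panelGrp" ana="#placeholder-weather #placeholder-setting #placeholder-activity">
  <!-- the strip's panels go here -->
</div>
```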

Number of Panels

3. The template has a default of 4 panels. If there are more or fewer panels, add or remove <cbml:panel> elements and number them properly using the @n attribute.

Characters in Panels

4. <cbml:panel> (lines 84, 91, 96, and 101): For each panel, set the value of @characters to the taxonomy shortcodes of the characters who appear in that panel. Separate multiple shortcodes with a space, e.g. characters="#c-cb #c-lucy".
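
To make steps 3 and 4 concrete, here is a sketch of the tail end of a hypothetical five-panel strip. It assumes you copied an existing panel to make the fifth one and then renumbered; other attributes (such as the transition in @ana) are left out to keep it short.

```xml
<!-- Sketch of a five-panel strip: panels 1-3 omitted -->
<cbml:panel n="4" characters="#c-cb #c-lucy">
  <!-- note, balloons, etc. -->
</cbml:panel>
<cbml:panel n="5" characters="#c-cb">
  <!-- note, balloons, etc. -->
</cbml:panel>
```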

Description of the Panel’s Action

5. <note> (lines 85, 92, 97, and 102): For each panel, set the value of @resp to your r-name from the taxonomy. Within the element, write a brief description of what happens in the panel. See the 1950 and 1982 examples in the xml-test folder within the repo for ideas.
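
A panel description for the opening of the 23 January 1952 strip could be as short as this. The r-name is a made-up placeholder, and the wording of the description is just one possibility.

```xml
<!-- #r-yourname is a placeholder for your own r-name from the taxonomy -->
<note resp="#r-yourname">Schroeder sits at the piano playing.</note>
```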

Speech and Sound

6. <cbml:balloon>: Use this element to transcribe each speech act. Change the value of the @type attribute to another choice if it’s not a normal speech balloon. Set the value of the @who attribute to the appropriate character name from the taxonomy.

  • By default, the template is set up for a single speech balloon per panel. If there are multiple speech balloons in a panel, add additional <cbml:balloon> elements.
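
As a sketch, a panel with two balloons might be encoded along these lines. I’m assuming here that Schroeder’s reply and Charlie Brown’s question mark fall in the same panel of the 23 January 1952 strip, that the template’s default balloon type is "speech", and that Schroeder’s shortcode is #c-schroeder; check all three against the strip, the template, and the taxonomy.

```xml
<!-- One <cbml:balloon> per speech act; type and #c-schroeder are assumptions -->
<cbml:balloon type="speech" who="#c-schroeder">Grosse Sonata für das Hammer-Klavier</cbml:balloon>
<cbml:balloon type="speech" who="#c-cb">?</cbml:balloon>
```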

7. If characters are referred to by name or by nickname (e.g., “Chuck”) in a speech bubble, wrap the name in a <persName> element and use a @ref attribute with the value set to the taxonomy reference; for example, <persName ref="#c-cb">.
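
Charlie Brown’s first line in the 23 January 1952 strip gives us a handy illustration. In this sketch, #c-schroeder and type="speech" are my guesses at the taxonomy shortcode and the template default, so confirm them before copying.

```xml
<!-- The name inside the balloon is wrapped in <persName>; #c-schroeder is assumed -->
<cbml:balloon type="speech" who="#c-cb">Say, that’s sensational, <persName ref="#c-schroeder">Schroeder</persName>…what is it?</cbml:balloon>
```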

8. If real people are referred to by name (e.g., Beethoven, Peggy Fleming), tag them using a <persName> element and use a @ref attribute with the value set to a URI from an authority file like VIAF. For example, <persName ref="https://viaf.org/viaf/32182557/">Beethoven</persName>. You can often find such URIs under the Authority Control section of a person’s Wikipedia entry.

9. If real places are referred to by name (e.g., New York City), tag them using a <placeName> element and use a @ref attribute with the value set to a URI from an authority file like GeoNames.
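
A place name works the same way as a real person’s name. The line of dialogue below is invented for illustration (it is not from an actual strip), and while I believe 5128581 is the GeoNames identifier for New York City, look the place up yourself and confirm the URI before using it.

```xml
<!-- Invented dialogue; verify the GeoNames URI before relying on it -->
<cbml:balloon type="speech" who="#c-cb">I hear it’s snowing in <placeName ref="https://www.geonames.org/5128581/">New York City</placeName>.</cbml:balloon>
```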

10. <sound>: Use this element to encode sounds that are not contained in a speech balloon (see this example). This element should go on a new line within the <cbml:panel> and not within the <cbml:balloon>. This element should also be used to record onomatopoeic sounds within speech balloons.
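
Here is a rough sketch of both uses of <sound>. The sound effects, the panel contents, #c-schroeder, and #r-yourname are all invented for illustration.

```xml
<!-- Case 1: a free-floating sound sits on its own line inside the panel -->
<cbml:panel n="3" characters="#c-schroeder">
  <note resp="#r-yourname">Schroeder resumes playing.</note>
  <sound>PLINK PLINK</sound>
</cbml:panel>

<!-- Case 2: an onomatopoeic sound inside a speech balloon -->
<cbml:balloon type="speech" who="#c-cb"><sound>SIGH</sound></cbml:balloon>
```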

Diegetic Text

11. Text that would be visible to characters within the strip, such as words on signs, should be tagged in the panel with a <floatingText> element according to the guidelines in our editorial decisions document.
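
As a bare-bones sketch, the sign on Lucy’s psychiatric booth might be encoded like this. TEI requires a <body> inside <floatingText>, so that much is safe; any additional attributes or structure our editorial decisions document asks for are not shown here and take precedence over this example.

```xml
<!-- Minimal valid shape for diegetic text; follow the editorial decisions document for details -->
<floatingText>
  <body>
    <p>THE DOCTOR IS IN</p>
  </body>
</floatingText>
```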


12. <cbml:panel> (lines 91, 96, and 101): For each panel except for the first, label the transition according to McCloud’s definitions by setting the value of @ana to the tr-name from the taxonomy. Refer to the editorial decisions document for specific rules that we have adopted for making these decisions. If you are uncertain about a transition, bring it to class and we will discuss it together.
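
McCloud’s six transition types are moment-to-moment, action-to-action, subject-to-subject, scene-to-scene, aspect-to-aspect, and non-sequitur. A second panel labeled with one of them might look like the sketch below; #tr-placeholder stands in for whatever the real tr-name is in our taxonomy.

```xml
<!-- @ana on every panel except the first; #tr-placeholder is not a real taxonomy value -->
<cbml:panel n="2" characters="#c-cb" ana="#tr-placeholder">
  <!-- note, balloons, etc. -->
</cbml:panel>
```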

Signature and Date

13. Look for Schulz’s signature and date and move the <docAuthor> and <docDate> elements (in panel 4 of the template) to the proper panel. If Schulz writes his name differently, update it in <docAuthor>. If he renders it in a different font or style, encode this with <hi rend="[description]"> using the guidelines from our editorial decisions document. For the date, write it exactly as Schulz does (e.g., 1-15 or 1/15).
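
Suppose, purely for illustration, that Schulz signed and dated a strip in panel 3 and styled his signature unusually. The moved elements might then look like this sketch; the panel number, #tr-placeholder, and the bracketed rend value are placeholders to be replaced with what you actually see and what the editorial decisions document prescribes.

```xml
<!-- Signature and date moved from panel 4 to the panel where they appear -->
<cbml:panel n="3" characters="#c-cb" ana="#tr-placeholder">
  <!-- note, balloons, etc. -->
  <docAuthor><hi rend="[description]">Schulz</hi></docAuthor>
  <docDate>1-15</docDate>
</cbml:panel>
```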

Anything Else

We will collectively decide if there’s anything else we would like to tag in the strips and how we will do this. We should not let ourselves be governed by the narrow-mindedness of the 2020, 2021, or 2022 classes and their boneheaded professor!

Commit and push the strip

After you’ve finished your work on one or more strips, commit and push the changes you have made to the repo using the GitHub Desktop client. You might want to create a different commit for each strip as you work. Before you can push your changes, you might need to fetch origin to get changes that others have made.

Reflect on the Assignment

After you have completed your two months of strips, you will write a three-page reflection on the assignment. Some questions to consider in your reflection are:

  • John Walsh describes encoding as “a form of discovery, or prospecting” and calls it both “a form of reading and writing.” Do I agree with him?
  • What did I learn about digital humanities from this assignment?
  • What did I learn about Peanuts from this assignment?
  • What did I learn about research from this assignment?
  • What would I change about this assignment to make it more relevant, informative, enjoyable, challenging, or interesting?

The reflection must be submitted as a PDF on Learning Suite by 11:59pm on Wednesday, 13 April. Please name your file [lastname]_encoding_reflection.pdf (e.g., croxall_encoding_reflection.pdf).


This assignment is worth 250 points. Each month of encoded strips will be worth up to 110 points. For each month, I will randomly select four of the daily strips you encoded and one of the Sundays. Each of these five strips will be worth up to 22 points distributed as follows:

  • 5 points: I will check the strip to see if it is well-formed (per XML standards) and if it validates (against the TEI schema). Basically, if it gets a green square in Oxygen, you will be good to go.
  • 2 points: I will check to see if your TEI header contains all the required information.
  • 2 points: Using your URL from the <facsimile>, I will check the encoded strip’s text against the original text to make sure it is all correct.
  • 13 points: I will check the <text> to see if you have correctly tagged everything listed above, as well as anything else we determine as a class as we are working.

The remaining 30 points for the assignment will be for the reflection paper. You will (primarily) be graded on your discussion of what you’ve learned, as well as (secondarily) grammar, mechanics, and all that jazz. Please note that I’m not grading you based on what you’ve learned but instead on your discussion of it. This means that the point of this reflection paper isn’t for you to tell me how great the assignment is.


I designed the first version of this assignment in 2018 after discussions with Elli Mylonas, TEI aficionado and colleague extraordinaire, and I made some important revisions in 2019. It was updated considerably in 2020 for Peanuts, drawing particularly on the work of John A. Walsh and his Comic Book Markup Language. Working from Walsh’s article in Digital Humanities Quarterly, Elli created the basic strip template and taxonomy, and she has helped me refine it further. In 2021, 2022, and 2023, I made a number of updates based on what we have learned, trying (believe it or not) to make this all easier to understand. Big thanks go to Ashlin Holbrook for re-formatting and organizing the editorial guidelines into a more usable form!