White Paper

The Evolution of Gender in DC Comics 1935-2013

Comic books are deeply ingrained in the fabric of American culture, creating household names like Superman and Wonder Woman for over 80 years. For my final project, I decided to visualize DC Comics character data. I chose the books because they laid the foundation for the characters as we know them now in TV and movies. I wanted to take a deeper look at the origins of what we are consuming as a culture. In the beginning, I had a few questions in mind. How have DC characters and universes changed over time? Do comics reflect or comment on our society? Have they historically? My initial assumption was that yes, comics, though based in other worlds, are reflective of the one we live in. I also wondered why I thought that to begin with. Did I truly see comics as being super diverse, or was it simply good marketing? This was when I had to make an important distinction for myself. I was looking at the comic books themselves, the origin stories, not the cartoons I grew up watching. I enjoyed Batman: The Animated Series and Batman Beyond as a kid. Both shows featured many female characters and black characters. I had to remind myself that these assumptions were based on my experience of the cartoons, not my experience with the texts I was analyzing.

I chose DC Comics because my favorite superhero/villain combo (Batman and the Joker) is part of the DC universe. My dataset comes from a FiveThirtyEight article on gender in comic books. FiveThirtyEight used the Wikia fandom databases for DC and Marvel to create the datasets for the article. The data included the month and year of the first appearance of each character, the number of times the character appeared in comic books, the gender of each character, the sexual orientation of characters if it was available, and much more. The dataset also included the URLs to the Wikia fandom database ID page for each character.

I started cleaning my data in Google Sheets by removing the ID numbers and URLs. In Tableau, I cleaned the character names and separated the number of appearances by gender. Some names included the universe the character was from, and others included the public identity of the character in parentheses. I split the field to get the superhero name alone. I also created a few calculated fields to separate the male appearances from the female appearances. Early on I had to decide if I wanted to use the hair color and eye color data in my visualizations. Since I was looking at representation, I thought race could be another area to look at. I quickly dismissed that idea, since many of these characters live on other planets and range from humans with purple eyes and purple hair to robots and aliens.
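The name split and the per-gender calculated fields were done in Tableau, but the same idea can be sketched in Python. This is only an illustration under assumptions: the keys `name`, `sex`, and `appearances`, the gender labels, and the second row are placeholders, not the dataset's actual column names or values. Only Batman's 3093 appearances come from the project.

```python
import re

def hero_name(raw_name):
    """Drop a trailing parenthetical such as a public identity or universe."""
    return re.sub(r"\s*\(.*\)\s*$", "", raw_name).strip()

def appearances_for(row, gender):
    """Mimic a per-gender calculated field: the count if the row matches, else 0."""
    return row["appearances"] if row["sex"] == gender else 0

# Hypothetical rows; the second one is a made-up placeholder.
rows = [
    {"name": "Batman (Bruce Wayne)", "sex": "Male", "appearances": 3093},
    {"name": "Placeholder Heroine (Jane Doe)", "sex": "Female", "appearances": 120},
]

cleaned = [
    {
        "hero": hero_name(row["name"]),
        "male_appearances": appearances_for(row, "Male"),
        "female_appearances": appearances_for(row, "Female"),
    }
    for row in rows
]

for record in cleaned:
    print(record)
```

Keeping one column per gender, with zeros where the row does not match, is what lets a chart total male and female appearances separately.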


I created four visualizations. The first shows the names of female characters based on the number of times they appear in DC Comics. I limited the number of names, so only women who appeared fifty times or more between 1935 and 2013 show up in the word cloud. The size and color of the names are related to how many times the character appeared. At first, I tried to show this information as a bar chart where the length of each bar showed the number of appearances for all female characters and the color represented each female character. It was the prettiest graph that I made that week, but it had some major flaws. The issue with the bar chart was that there were too many women and not enough colors. Because of this, characters were missing from the bar chart. I realized this during the pin up when we couldn’t find Wonder Woman. There were too many women to make this meaningful. It was also difficult to find the names of specific characters because my chart did not have a search bar. I did not want to cut the idea of showing the female characters visually, so a word cloud seemed to be a good alternative. Due to the high number of female characters, I had to limit the word cloud to 50 appearances or more in order for the visual to have meaning. Otherwise, the female character names were all about the same size and color.
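The word cloud's cutoff and sizing can be approximated outside Tableau. The sketch below is a rough Python illustration, not the actual workbook logic: the row keys, the linear scaling, the font-size range, and every row in the sample are assumptions made for the example.

```python
MIN_APPEARANCES = 50  # cutoff described in the text

def word_cloud_entries(rows, minimum=MIN_APPEARANCES):
    """Keep female characters at or above the cutoff, most frequent first."""
    kept = [
        r for r in rows
        if r["sex"] == "Female" and r["appearances"] >= minimum
    ]
    return sorted(kept, key=lambda r: r["appearances"], reverse=True)

def scale_sizes(entries, min_size=10, max_size=48):
    """Map appearance counts linearly onto a font-size range (an assumed scheme)."""
    if not entries:
        return {}
    counts = [e["appearances"] for e in entries]
    low, high = min(counts), max(counts)
    span = (high - low) or 1  # avoid dividing by zero when all counts match
    return {
        e["name"]: min_size + (e["appearances"] - low) * (max_size - min_size) / span
        for e in entries
    }

# Made-up placeholder rows, not values from the dataset.
sample = [
    {"name": "Heroine A", "sex": "Female", "appearances": 250},
    {"name": "Heroine B", "sex": "Female", "appearances": 50},
    {"name": "Heroine C", "sex": "Female", "appearances": 49},
    {"name": "Hero D", "sex": "Male", "appearances": 300},
]

sizes = scale_sizes(word_cloud_entries(sample))
print(sizes)
```

Without the cutoff, the many characters with only a handful of appearances would crowd the low end of the scale, which matches the problem of names looking about the same size.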

The second visualization looks at appearances over time in two ways: a Gantt chart and a line chart. The Gantt chart at the top shows each year in which there were new characters. The colors are linked to gender. My tooltips show the character that debuted that year and has the most appearances between 1935 and 2013. For example, Batman first appears in 1939, and between 1939 and 2013 he appears 3093 times in the comic books. The line chart shows the number of appearances of characters in superhero comics between 1935 and 1954. This chart steered the project and broadened my view of the story I wanted to tell. My first iteration of this chart showed all the years from 1935 through 2013. When I went back to revise my visualizations, the first thing I learned was the importance of understanding the aggregation tool you use. I had created a line chart that showed the number of appearances by gender over time between 1935 and 2013. I thought that I had created a graph of each character’s popularity over time, when in fact I had created a graph that shows the number of appearances of all characters of that gender in a given year. This meant that the annotations in my explanation of the first version of the graph were incorrect. During the pin up in class, one of the questions that came up was what happened in the 1950s to cause such a big dip in character appearances. The Gantt chart showed a similar dip. There were no new female characters for a few years in the late 1940s and early 1950s. I then decided to limit my line chart to the span in which comic book appearances reach an all-time high during World War II and fall to their lowest point in the late 1940s. This choice led me down a rabbit hole of Wikipedia pages, DC Comics fan pages, and official DC pages to look into the history of why there was such a dramatic drop in character appearances.
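The aggregation point is easier to see in a small example. The Python sketch below assumes one row per character with `year`, `sex`, and `appearances` keys (hypothetical names and made-up values) and sums appearances per year and gender, which is roughly what a summed measure on a line chart does.

```python
from collections import defaultdict

def total_appearances_by_year_and_gender(rows):
    """Sum appearances over every character sharing a year and gender."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["year"], row["sex"])] += row["appearances"]
    return dict(totals)

# Made-up placeholder rows, not values from the dataset.
sample = [
    {"name": "Hero A", "year": 1940, "sex": "Male", "appearances": 100},
    {"name": "Hero B", "year": 1940, "sex": "Male", "appearances": 40},
    {"name": "Heroine C", "year": 1940, "sex": "Female", "appearances": 25},
    {"name": "Hero D", "year": 1941, "sex": "Male", "appearances": 10},
]

totals = total_appearances_by_year_and_gender(sample)
print(totals[(1940, "Male")])  # one point on the line: Hero A and Hero B combined
```

Each point on the resulting line is a group total, so it cannot be read as the popularity of any single character.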

The decrease in superhero appearances was due to oversaturation of the market. In the late 1940s, people had lost their interest in superhero comics. Comic book publishers focused instead on genre comics like sci-fi and horror. In the early 1950s, superhero comics received media criticism for violence and sexual imagery. In addition to government pressure, Fredric Wertham, a well-respected child psychiatrist, wrote Seduction of the Innocent, a book condemning the violence, drug use, and sexual imagery in comic books. In 1954 the US Senate Subcommittee on Juvenile Delinquency subpoenaed comic book publishers for public hearings. Wertham testified at these hearings, citing specific comics as a major cause of juvenile delinquency. Wertham’s book, along with his testimony during the hearings, delivered a blow to superhero comic book popularity in the mid-1950s. Many publishers would go under.

For fear of government censorship, the Comics Code Authority (CCA) was created as a way for the comic book industry to self-regulate the content of comic books. The CCA lasted well into the 80s, but comics became disconnected from societal issues. A counterculture of “underground comix” rose in popularity during the 1960s and 70s that included violence, drug use, and sex, topics that were no longer present in mainstream comic books. Many underground comix were self-published and featured social commentary on issues like feminism, environmentalism, and gay liberation.

By the late 70s, mainstream comic book interest was waning and the way comic books were sold went through a major change. Direct market distribution allowed for the proliferation of underground comix. In this new system, comic book store owners could buy comic books directly from the publisher at a discounted rate and get books to customers faster than through traditional distribution routes. This new system provided an opportunity for new comic book publishers to enter the business, and allowed for the growth of independent publishers and self-publishers by creating a system that targeted its retail audience. Underground comix fizzled out in the mid-1970s, but their impact was lasting. In the 1980s, the CCA codes changed to allow some violence and drug use. That interesting history sparked the inspiration for my third visualization. I created a bar chart that showed the number of new female characters that appeared each year from 1935 to 2013.

For my third visualization, I used the count of first appearances and the year of first appearance to create a bar chart. I then used color to show the sexuality of the characters. For this chart, the missing data told the story. I left the nulls in because they were important. They told the story of female characters and showed sexuality trends in comic books. My bar graph shows the first appearances of homosexual and bisexual female characters in 1985. I believe the popularity of underground comix, laxer codes, and other social movements of the time correlate with the rise in new female characters and sexual minority characters in the mid-to-late 1980s.
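Counting first appearances while keeping the nulls can be sketched in Python as follows. The keys `year`, `sex`, and `gsm`, the label strings, and all of the rows are assumptions for illustration; a missing sexuality value is represented here as `None` and counted under its own "Null" label rather than dropped.

```python
from collections import Counter

def new_female_characters(rows):
    """Count female first appearances per (year, sexuality), keeping nulls."""
    counts = Counter()
    for row in rows:
        if row["sex"] != "Female":
            continue
        # Treat a missing value as its own category instead of filtering it out.
        label = row["gsm"] if row["gsm"] is not None else "Null"
        counts[(row["year"], label)] += 1
    return counts

# Made-up placeholder rows, not values from the dataset.
sample = [
    {"name": "Heroine A", "year": 1984, "sex": "Female", "gsm": None},
    {"name": "Heroine B", "year": 1985, "sex": "Female", "gsm": None},
    {"name": "Heroine C", "year": 1985, "sex": "Female", "gsm": "Bisexual"},
    {"name": "Heroine D", "year": 1985, "sex": "Female", "gsm": "Homosexual"},
    {"name": "Hero E", "year": 1985, "sex": "Male", "gsm": None},
]

counts = new_female_characters(sample)
for (year, label), n in sorted(counts.items()):
    print(year, label, n)
```

If the null rows were filtered out, the bars would show only the few characters with a recorded sexuality and would hide the overall count of new female characters per year.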


My fourth and final visualization is interactive. I chose to make a bar chart that showed the top three female and male DC characters. I defined the top three as the characters that had the most appearances. Below that chart, people can type in a character and see how often they showed up in the comics. The tooltips include the character’s alignment (good or bad), first appearance, and whether they had a secret identity.
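The "top three by appearances" definition and the name lookup can be expressed in a few lines of Python. This is a hedged sketch with hypothetical keys and made-up rows; the case-insensitive substring match is an assumption about how a search box might behave, not a description of the Tableau filter.

```python
def top_n_by_gender(rows, gender, n=3):
    """Return the n characters of one gender with the most appearances."""
    matching = [r for r in rows if r["sex"] == gender]
    return sorted(matching, key=lambda r: r["appearances"], reverse=True)[:n]

def search(rows, text):
    """Case-insensitive substring match on the character name."""
    needle = text.strip().lower()
    return [r for r in rows if needle in r["name"].lower()]

# Made-up placeholder rows, not values from the dataset.
sample = [
    {"name": "Hero A", "sex": "Male", "appearances": 900},
    {"name": "Hero B", "sex": "Male", "appearances": 700},
    {"name": "Hero C", "sex": "Male", "appearances": 500},
    {"name": "Hero D", "sex": "Male", "appearances": 300},
    {"name": "Heroine E", "sex": "Female", "appearances": 800},
    {"name": "Heroine F", "sex": "Female", "appearances": 200},
]

print([r["name"] for r in top_n_by_gender(sample, "Male")])
print([r["name"] for r in search(sample, "heroine")])
```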

Once I knew all this information, the annotations became difficult. There was so much I wanted to say and so little space. I kept going back to Tufte’s rules about labels. I had to drill down to the most important pieces and leave the rest for my blog post. I chose the government committee and the CCA because they were direct causes of change in comic book content. The Wertham book, though influential, had a more indirect effect on the comic book industry. I did the same for the bar chart. I chose to highlight the underground comix counterculture over direct market distribution because the ideas and creativity that mainstream comics incorporated came from the change in content on the market, not the change in distribution.

My dataset did have its limitations. I would have liked to show how many years each character was in comics, and maybe see where some characters overlapped with others, but I did not have that information. When cleaning my data, I removed the URLs for the ID pages of the characters. In making my last two visualizations, I realized I had missed an opportunity to allow my audience to do their own research on their favorite DC character or one they had never heard of before. FiveThirtyEight described the appearances column of the data as “how many times the character appeared in comic books”. Did this mean that a character appeared several times in one issue of a comic book? Or did it mean only once in each book? Characters were categorized as good, bad, or neutral. How was that decided? It reminded me of the class we did where we had to categorize celebrities. I would not be surprised to find out that one good deed could make a bad character come up as neutral in my data. Lastly, my data stopped in 2013. There may be new female, genderless, transgender, bisexual, or homosexual characters that are not accounted for in my dataset. It could have been interesting to see if a momentous event like the legalization of same-sex marriage in 2015 had an impact on new characters in DC Comics.

The process of creating these data visualizations was quite the journey. I initially chose this topic because it was easily recognizable and lighthearted. As I continued to work with my dataset, my project became so much richer than I could have imagined. I began by looking at gender over time in comic books and ended up looking at the history of censorship in America, capitalism and conglomerates, and the role counterculture has in progressing mainstream culture. I got a lot of the story through research, in part because my data was limited, but by the same token those limitations in the data set me on the perfect path.


Works Cited

“Batman Publication History.” DC Database, dc.fandom.com/wiki/Batman_Publication_History.

“Comics Code Authority.” Wikipedia, Wikimedia Foundation, 31 May 2019, en.wikipedia.org/wiki/Comics_Code_Authority#1960s%E2%80%931970s.

“DC Comics (Publisher).” Comic Vine, comicvine.gamespot.com/dc-comics/4010-10/.

“Direct Market.” Wikipedia, Wikimedia Foundation, 1 Dec. 2018, en.wikipedia.org/wiki/Direct_market.

FiveThirtyEight. “FiveThirtyEight Comic Characters Dataset.” Kaggle, 26 Apr. 2019, www.kaggle.com/fivethirtyeight/fivethirtyeight-comic-characters-dataset.

“Golden Age of Comic Books.” Wikipedia, Wikimedia Foundation, 27 June 2019, en.wikipedia.org/wiki/Golden_Age_of_Comic_Books.

“Underground Comix.” Wikipedia, Wikimedia Foundation, 14 May 2019, en.wikipedia.org/wiki/Underground_comix#Recognition_and_controversy_(1972%E2%80%931982).
