Blog Post 2 Redux

I chose to redo this blog post since I worked on it during a week when I had NO time. I kept the visualizations the same for the most part and spent more time cleaning the data than anything else. I used the entire music library that is on my laptop. I make that distinction because my iTunes library is split pretty equally between an old iMac and my laptop. I got my first laptop in 2011. I got this laptop in 2017 when my old one got wet and fried (R.I.P.!) and started it with a backup of that computer. My new laptop did not have the same amount of space as the old one, so I ended up with two iTunes libraries. The dataset I used included 2349 songs.

Originally I wanted to answer the question “how diverse is the music I listen to?” I thought of diversity geographically and wanted to map out the artists in my library by birthplace. I was interested in where my music was coming from and how music and culture can cross international borders. That topic was too big for the time constraints, but my tiny initial inquiry proved to be fruitful in answering that question. A dance/electronic song in Japanese that I enjoy was made by a young woman in Brooklyn, and a few Afrobeats artists I like are London born and bred.

I kept this in mind while cleaning my data. I spent the most time working on the genres. Many songs shared a genre that was misspelled on some tracks, or written in all caps on some and sentence case on others. I spent a lot of time going through and making sure the genres were uniform. I decided to only make changes to songs that already had a genre. If a song did not have a genre I left it blank because, as the first version of this project showed, the null values told the story. I think the nulls show the change in technology over time and changes in the ways that we acquire music. In the mid to late 2000s most of my music was imported manually, whereas now much of my new music is added at the click of a button and arrives with the title, artist, genre, and much more already filled in.
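The genre cleanup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not what I actually ran: the lookup table, its entries, and the function name `normalize_genre` are all made up for the example. The one rule it tries to capture is that a song with no genre stays blank.

```python
# Hypothetical lookup of variant spellings -> one uniform label (example entries only).
CANONICAL_GENRES = {
    "hip hop/rap": "Hip-Hop/Rap",
    "hip-hop/rap": "Hip-Hop/Rap",
    "r&b/soul": "R&B/Soul",
    "rnb/soul": "R&B/Soul",
}


def normalize_genre(raw):
    """Return a uniform genre label, or None when the song has no genre."""
    if raw is None or not raw.strip():
        return None  # missing genres are left blank on purpose
    # Lowercase, collapse repeated whitespace, and drop spaces around slashes.
    key = " ".join(raw.lower().split())
    key = key.replace(" / ", "/").replace("/ ", "/").replace(" /", "/")
    # Genres not in the table are kept as typed (trimmed) rather than guessed at.
    return CANONICAL_GENRES.get(key, raw.strip())


print(normalize_genre("HIP-HOP / RAP"))
print(normalize_genre("R&B/ Soul"))
print(normalize_genre(""))
```

Keeping the blank values as `None` instead of filling them in is what lets the nulls show up as their own group in a chart later.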

 

A similar feeling happened when I was working on the genres. First was my conundrum on how to divide rap, hip-hop, and R&B. These genres are often used together to describe something. I also listen to a lot of alternative R&B/Soul music that might be a mixture of all three of those things. I decided to separate hip hop and R&B because I felt that having a combined hip hop and R&B category in addition to Hip-Hop/Rap and R&B/Soul would be redundant. I then had to decide if I was placing a song in a certain genre because of the artist or because of the song itself. Different artists dabble in different genres and try new things, which made it a bit difficult to choose who goes where. I decided to go by the song, which then allowed the genre to give some context back to the artist. Some artists came up in more than one genre and I left them in those different genres. For example, Goldlink is a rapper/soul artist, so one of his songs may show up in the R&B/Soul genre while another song shows up in the Hip-Hop/Rap category.

For the bubble chart below I decided to show the total number of plays per genre. I deliberately left the null values in the chart and used color and size to make them stand out.

 

The second thing I spent a lot of time cleaning was the artist field. I wanted to make sure that the artist name in each field was the main person on the track, not including featured artists. As I was doing this I could feel myself stripping away the depth of some of the songs. Collaborations between artists of different genres became just the artist who came first on the list.
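For anyone who wants to try the same artist cleanup in code, here is a rough Python sketch. It assumes featured artists are marked with "feat.", "ft." or "featuring" (optionally inside parentheses); the pattern and the name `main_artist` are my own invention for illustration, and the guest names are placeholders. It does not try to split names joined by "&" or commas, because that would break band names that contain them.

```python
import re

# Assumed convention: a featured credit starts with feat / feat. / ft / ft. / featuring,
# optionally opened by "(" or "[", and runs to the end of the artist field.
FEATURE_PATTERN = re.compile(
    r"\s*[\(\[]?\s*\b(?:feat\.?|ft\.?|featuring)\s.*$",
    re.IGNORECASE,
)


def main_artist(artist_field):
    """Return the artist field with any trailing featured credit removed."""
    return FEATURE_PATTERN.sub("", artist_field).strip()


print(main_artist("Goldlink feat. Guest Artist"))
print(main_artist("Main Artist (ft. Guest Artist)"))
print(main_artist("Daft Punk"))
```

A sketch like this also makes the trade-off visible: everything after the marker is thrown away, which is exactly the depth that gets lost.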

 

The third thing I wanted to show was the change in my taste over time. That was one of the charts I wanted to make, but struggled with in my first try. To do this I used a dispersion plot to show when certain genres were added to my iTunes library and when they stopped being added. The genres are ordered by the total number of plays. In my last try, Easy Listening came up as a shocking top genre. Here it’s a bit lower on the scale. That being said, Easy Listening is made up of one Andrea Bocelli album. That album is the most played album in my entire library! I think this shows my habit of listening to him as a way to calm down before bed at night or to relax during my morning commute. I left the null values in the chart because again I felt they told the story. The nulls are at the top of the list. In the tooltips, I included the number of plays of that genre from the time it was added to my iTunes library to the day I downloaded the sheet (June 9, 2019). It was interesting to look at the dramatic drop in the null values. To better show it I made a line chart of just the null values.

 

In 2013, songs without a genre had over 6,000 plays and by 2018 songs without a genre had only three plays. I think this speaks to what I mentioned earlier about the changes in how we consume music. I thought that using streaming services to add music to my library, as opposed to adding things manually, was a factor in the decrease of nulls. I tried checking my Apple account to find out when I first signed up for Apple Music. Apple doesn’t allow you to see when you started the subscription, but the first Apple Music receipt I found in my email was sent in 2017. By then most of my new music was coming in with genres, so I don’t think Apple Music truly correlates with what I’m seeing in the line chart.

 

White Paper

The Evolution of Gender in DC Comics 1935-2013

Comic books are deeply ingrained in the fabric of American culture, creating household names like Superman and Wonder Woman for over 80 years. For my final project, I decided to visualize DC Comics character data. I chose the books because they laid the foundation for the characters as we know them now in TV and movies. I wanted to take a deeper look at the origins of what we are consuming as a culture. In the beginning, I had a few questions in mind. How have DC characters and universes changed over time? Do comics reflect or comment on our society? Have they historically? My initial assumption was that yes, comics, though based in other worlds, are reflective of the one we live in. I also wondered why I thought that to begin with. Did I truly see comics as being super diverse or was it simply good marketing? This was when I had to make an important distinction for myself. I was looking at the comic books themselves: the origin stories, not the cartoons I grew up watching. I enjoyed Batman: The Animated Series and Batman Beyond as a kid. Both shows featured many female characters and black characters. I had to remind myself that these assumptions are based on my experience of the cartoons, not my experience with the texts I was analyzing.

I chose DC comics because my favorite superhero/villain combo (Batman and Joker) is part of the DC universe. My dataset comes from a FiveThirtyEight article on gender in comic books. They used the wikia fandom databases for DC and Marvel to create the datasets they used in their article. The data included the month and year of the first appearance of each character, the number of times the character appeared in comic books, the gender of each character, the sexual orientation of characters if it was available, and much more. The dataset also included the urls to the wikia fandom database ID page for each character in the dataset.

I started cleaning my data in Google Sheets by removing the ID numbers and URLs. In Tableau, I cleaned the character names and did calculations to separate the number of appearances by gender. Some names included the universe they were from and others included the public identity of the characters in parentheses. I split the field to get the superhero name alone. I also created a few calculated fields to separate the male appearances from female appearances. Early on I had to decide if I wanted to use the hair color and eye color data in my visualizations. Since I was looking at representation, I thought race could be another area to look at. I quickly dismissed that idea since many of these characters live on other planets and range from humans with purple eyes and purple hair to robots and aliens.
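I did these steps with Tableau's split and calculated fields, but the same logic can be sketched in Python for readers who do not use Tableau. The function names and the gender labels "Female" and "Male" below are assumptions for the example, not the exact values in the dataset.

```python
def hero_name(full_name):
    """Keep only the text before the first parenthesis, e.g. the superhero name."""
    return full_name.split("(", 1)[0].strip()


def appearances_if(gender, row_gender, appearances):
    """Mimic a calculated field: count a row's appearances only for one gender."""
    return appearances if row_gender == gender else 0


print(hero_name("Batman (Bruce Wayne)"))
print(appearances_if("Female", "Male", 3093))
print(appearances_if("Male", "Male", 3093))
```

Splitting on the first parenthesis removes both kinds of suffix at once, whether it is a universe or a public identity, as long as it was written in parentheses.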


I created four visualizations. The first shows the names of female characters based on the number of times they appear in DC Comics. I limited the number of names, so only women who appeared fifty times or more between 1935 and 2013 show up in the word cloud. The size and color of the names are related to how many times the character appeared. At first, I tried to show this information as a bar chart where the length of each bar showed the number of appearances for all female characters and the color represented each female character. It was the prettiest graph that I made that week but it had some major flaws. The issue with the bar chart was that there were too many women and not enough colors. Because of this, characters were missing from the bar chart. I realized this during the pin up when we couldn’t find Wonder Woman. There were too many women to make this meaningful. It was also difficult to find the names of specific characters because my chart did not have a search bar. I did not want to cut the idea of showing the female characters visually, so a word cloud seemed to be a good alternative. Due to the high number of female characters I had to limit the word cloud to 50 appearances or more in order for the visual to have meaning. Otherwise, all the female character names were about the same size and color.

The second visualization looks at appearances over time in two ways: a Gantt chart and a line chart. The Gantt chart at the top shows each year that there were new characters. The colors are linked to gender. My tooltips show the character that debuted that year and has the most appearances between 1935 and 2013. For example, Batman first appears in 1939 and between 1939 and 2013 he appears 3093 times in the comic books. The line chart shows the number of appearances of characters in superhero comics between 1935 and 1954. This chart steered the project and broadened my view on the story I wanted to tell. My first iteration of this chart showed all the years from 1935 through 2013. When I went back to revise my visualizations, the first thing I learned was the importance of understanding the aggregation tool you use. I had created a line chart that showed the number of appearances by gender over time between 1935 and 2013. I thought that I had created a graph of each character’s popularity over time, when in fact I had created a graph that shows the number of appearances of all characters of that gender in that given year. This meant that the annotations in my explanation of the first version of the graph were incorrect. During the pin up in class one of the questions that came up was what happened in the 1950s to cause such a big dip in character appearances. The Gantt chart showed a similar dip. There were no new female characters for a few years in the late 1940s and early 1950s. I then decided to limit my line chart to the points where comic book appearances reach an all-time high during World War II and fall to their lowest point in the late 1940s. This choice led me down a rabbit hole of Wikipedia pages, DC Comics fan pages, and official DC pages to look into the history of why there’s such a dramatic drop in character appearances.
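The aggregation mix-up is easier to see with a tiny example. The sketch below assumes each row holds one character, a gender, one year, and one appearance count; only the Batman figure comes from my data, and the other rows and all the names are made up. Summing by year and gender gives one number per line per year, which is a total across characters and not any single character's popularity.

```python
from collections import defaultdict

rows = [
    # (character, gender, year, appearances)
    ("Batman", "Male", 1939, 3093),        # figure quoted above
    ("Character B", "Male", 1939, 100),    # made-up row
    ("Character C", "Female", 1939, 40),   # made-up row
    ("Character D", "Female", 1941, 250),  # made-up row
]


def total_appearances_by_year_and_gender(rows):
    """Sum appearances over every character sharing a (year, gender) pair."""
    totals = defaultdict(int)
    for _name, gender, year, appearances in rows:
        totals[(year, gender)] += appearances
    return dict(totals)


totals = total_appearances_by_year_and_gender(rows)
print(totals[(1939, "Male")])  # Batman and Character B combined
```

This is the kind of sum a default aggregation produces, which is why the first annotations described the wrong thing.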

The decrease in superhero appearances was due to oversaturation of the market. In the late 1940s, people had lost their interest in superhero comics. Comic book publishers focused instead on genre comics like sci-fi and horror. In the early 1950s superhero comics received media criticism for violence and sexual imagery. In addition to government pressure, Fredric Wertham, a well-respected child psychiatrist, wrote Seduction of the Innocent, a book condemning the violence, drug use, and sexual imagery in comic books. In 1954 the US Senate Subcommittee on Juvenile Delinquency subpoenaed comic book publishers for public hearings. Wertham testified at these hearings, citing specific comics as a major cause of juvenile delinquency. Wertham’s book along with his testimony during the hearings delivered a blow to superhero comic book popularity in the mid-1950s. Many publishers would go under. For fear of government censorship, the Comics Code Authority (CCA) was created as a way for the comic book industry to self-regulate the content of comic books. The CCA lasted well into the 80s, but comics became disconnected from societal issues. A counterculture of “underground comix” rose in popularity during the 1960s and 70s that included violence, drug use, and sex, topics that were no longer present in mainstream comic books. Many underground comix were self-published and featured social commentary on issues like feminism, environmentalism, and gay liberation.

By the late 70s mainstream comic book interest was waning and the way comic books were sold went through a major change. Direct market distribution allowed for the proliferation of underground comix. In this new system comic book store owners could buy comic books directly from the publisher at a discounted rate, and get books to customers faster than through traditional distribution routes. This new system provided an opportunity for new comic book publishers to enter the business, and allowed for the growth of independent publishers and self-publishers by creating a system that targeted its retail audience. Underground comix fizzled out in the mid 1970s but their impact was lasting. In the 1980s, the CCA codes changed and allowed some violence and drug use. That history inspired my third visualization, a bar chart that shows the number of new female characters that appeared each year from 1935-2013.

For my third visualization, I used the count of first appearance and year of first appearance to create a bar chart. I then used color to show the sexuality of the characters. For this chart the missing data told the story. I left the nulls in because they were important. They told the story of female characters and showed sexuality trends in comic books. My bar graph shows the first appearances of homosexual and bisexual female characters in 1985. I believe the popularity of underground comix, laxer codes, and other social movements of the time correlate with the rise in new female characters and sexual minority characters in the mid-to-late 1980s.


My fourth and final visualization is interactive. I chose to make a bar chart that showed the top three female and male DC characters. I defined top three as the characters that had the most appearances. Below that chart people can type in a character and see how often they showed up in the comics. The tooltips include the character’s alignment (good or bad), first appearance, and whether they had a secret identity.

Once I knew all this information, the annotations became difficult. There was so much I wanted to say and so little space. I kept going back to Tufte’s rules about labels. I had to drill down to the most important pieces and leave the rest for my blog post. I chose the government committee and CCA because they were direct causes of change in comic book content. The Wertham book, though influential, had a more indirect effect on the comic book industry. I did the same for the bar chart. I chose to highlight the underground comix counterculture over direct market distribution because the ideas and creativity that mainstream comics incorporated came from the change in content on the market, not the change in distribution.

My dataset did have its limitations. I would have liked to show how many years each character was in comics, and maybe see where some characters overlapped with others, but I did not have that information. When cleaning my data, I removed the URLs for the ID pages of the characters. In making my last two visualizations I realized I had missed an opportunity to allow my audience to do their own research on their favorite DC character or one they had never heard of before. FiveThirtyEight described the appearances column of the data as “how many times the character appeared in comic books”. Did this mean that a character could be counted several times in one issue of a comic book? Or did it mean only once in each book? Characters were categorized as good, bad or neutral. How was that decided? It reminded me of the class we did where we had to categorize celebrities. I would not be surprised to find out that one good deed could make a bad character come up as neutral in my data. Lastly, my data stopped in 2013. There may be new female, genderless, transgender, bisexual, or homosexual characters that are not accounted for in my dataset. It could have been interesting to see if a momentous event like the legalization of same-sex marriage in 2015 had an impact on new characters in DC comics.

The process behind these data visualizations was quite the journey. I initially chose this topic because it was easily recognizable and lighthearted. As I continued to work with my dataset my project became so much richer than I could have imagined. I began by looking at gender over time in comic books and ended up looking at the history of censorship in America, capitalism and conglomerates, and the role counterculture has in progressing mainstream culture. I got a lot of the story through research, in part because my data was limited, but by the same token those limitations in the data set me on the perfect path.


Works Cited

“Batman Publication History.” DC Database, dc.fandom.com/wiki/Batman_Publication_History.

“Comics Code Authority.” Wikipedia, Wikimedia Foundation, 31 May 2019, en.wikipedia.org/wiki/Comics_Code_Authority#1960s%E2%80%931970s.

“DC Comics (Publisher).” Comic Vine, comicvine.gamespot.com/dc-comics/4010-10/.

“Direct Market.” Wikipedia, Wikimedia Foundation, 1 Dec. 2018, en.wikipedia.org/wiki/Direct_market.

FiveThirtyEight. “FiveThirtyEight Comic Characters Dataset.” Kaggle, 26 Apr. 2019, www.kaggle.com/fivethirtyeight/fivethirtyeight-comic-characters-dataset.

“Golden Age of Comic Books.” Wikipedia, Wikimedia Foundation, 27 June 2019, en.wikipedia.org/wiki/Golden_Age_of_Comic_Books.

“Underground Comix.” Wikipedia, Wikimedia Foundation, 14 May 2019, en.wikipedia.org/wiki/Underground_comix#Recognition_and_controversy_(1972%E2%80%931982).

Trends in DC Comics 1935-2013

I started this project last week as a look at the way gender has played out over time in DC comics. As I continued working with my dataset I found it showed me much more than just trends in character gender. It told the story of a publisher adapting to the interests of its readership, it showed me social trends in America over the last 80 years, and it showed me that one seemingly small question can lead to many others that you could never have thought of when the process began.

I chose DC comics because my favorite superhero/villain combo (Batman and Joker) is part of the DC universe. My dataset is part of FiveThirtyEight’s article on gender in comic books. They used the wikia fandom databases for DC and Marvel to create the datasets they used in their article. The data covers 1935-2013.

I think this project would be interesting to everyone from hardcore comic book fans to social scientists. Superheroes are part of the fabric of American culture and of many parts of our lives, consciously and subconsciously. I think of everything from the staple summer blockbuster superhero film to phrases used every day like “captain-save-a-hoe”.

Seeing as these books laid the foundation for the other mediums (TV and film), I wanted to take a deeper look at the origins of what we are consuming as a culture. There are many iterations of these characters and universes. How have they changed over time? Do they reflect or comment on our society? My assumption initially was that the answer to both questions was yes. I also wondered why I thought that to begin with. Did I truly see comics as being super diverse? Was it good marketing? This was when I had to make an important distinction for myself. I was looking at the comic books themselves: the origin stories, not the cartoons I grew up watching. I enjoyed Batman: The Animated Series and Batman Beyond, which ran in the 90s. Both featured many female characters and black characters. I had to remind myself that these assumptions are based on my experience of the cartoons, not my experience with the text I was analyzing.

The visualization below shows the names of female characters based on the number of times they appear in DC Comics. I limited the number of names, so only women who appeared 50 times or more between 1935 and 2013 show up in this word cloud.

 

Dark Days for Comic Books

The second visualization looks at appearances over time. The Gantt chart at the top shows each year that there were new characters, broken out by gender. The tooltips show the character that debuted that year and has the most appearances between 1935 and 2013. For example, Batman first appears in 1939 and between 1939 and 2013 he appears 3093 times in the comic books. The second chart shows the number of appearances of characters in superhero comics between 1935 and 1954. This time frame is the Golden Age of DC Comics. This chart demonstrates the rise and fall of the Golden Age, where comic book appearances reach an all-time high during World War II and quickly fall in the late 1940s. This was due to oversaturation of the market. In the late 1940s people had lost their interest in superhero comics. Comic book publishers focused instead on genre comics like sci-fi and horror.

In the early 1950s superhero comics received media criticism for their violence and sexual imagery. In addition to government pressure, Fredric Wertham, a child psychiatrist, wrote Seduction of the Innocent, a book condemning the violence, drug use, and sexual imagery in comic books. In 1954 the US Senate Subcommittee on Juvenile Delinquency subpoenaed comic book publishers for public hearings. The popularity of the book along with the hearings delivered a blow to superhero comic book popularity in the mid-1950s. Many of these publishers would go under. For fear of government censorship, the Comics Code Authority (CCA) was created as a way for the comic book industry to self-regulate the content of comic books. The CCA lasted well into the 80s, but comics became disconnected from societal issues. A counterculture of “underground comix” rose in popularity during the 1960s and 70s that included violence, drug use, and sex, topics that were no longer present in mainstream comic books.

 

By the late 70s mainstream comic book interest is waning and the way comic books are sold goes through a major change. Direct market distribution allows for the proliferation of underground comix. In this new system comic book store owners can buy comic books directly from the publisher at a discounted rate, and get them to customers faster than through traditional distribution routes. Selling this way got around the rules of the CCA. The codes are also changing. By the 1980s the CCA allowed some violence and drug use. The combination of laxer codes and a counterculture that grew and fizzled out allowed mainstream comics to adopt and integrate some of these concepts. I think this correlates with the rise in new female characters in the mid-to-late 1980s.

 

 

Your Turn! Look up some of your favorite characters! You might find something unexpected.

The last visual is for my viewers. Hi! Please feel free to look up your favorite DC characters and see how popular they were. In addition, the tooltips include their first appearance, whether or not they’re alive, and how they align.

 

Blog Post 3: DC comics

The visualizations below take a look at the way gender has played out over time in DC comics. I originally wanted to compare DC and Marvel comics but found it would probably be best to start with one franchise, create the visualizations I wanted, and replicate those for the other later on if I still feel it’s relevant to the project. I chose DC comics to start because my favorite superhero/villain combo (Batman and Joker) is part of the DC universe. My dataset is one part of FiveThirtyEight’s article on gender in comic books. They used the wikia fandom pages for DC and Marvel to create the datasets they used in their article.

I think this project would be interesting to everyone from hardcore comic book fans to women’s and gender studies students. Superheroes are ingrained in American culture, so much so that every new movie release seems to break box office records. I am not particularly a fan of superhero films but somehow always find myself watching one in a theater because several of my friends “NEEDED” to see it.

Seeing as these books lay the foundation for the other mediums (TV and film), I thought it could be good to take a deeper look at the framework for what we as a culture are consuming. Although housed in different universes, comics often have something to say about current events. That contributes to why Adolf Hitler is a comic book character. This led me to ask whether this tradition has continued. How are characters in these stories changing to represent a larger spectrum of readers? It made sense to analyze gender and sexual preference rather than other factors like race and age, since we are not dealing with our concepts of time and space. My first visualization, however, is set in our ideas of time and space.

The visualization below shows the first appearance of characters by gender from 1935-2015. It shows that women have been featured in comic books for about the same span of time as men. It also shows an emergence of genderless characters and one transgender character.

 

 

The second visualization looks at appearances over time. I tried to answer the question “How many times did male characters appear in the books? Female?” This line chart shows a dramatic spike in appearances around 1987. In the late 80s, starting in 1986, the “New Earth” universe in DC comics began. It included crossovers, which is probably why the overall appearances of characters increased. The New Earth universe is the mainstream one that most of us know. The New Earth era of comic books ran from about 1986-2011.

The third visualization looks at the appearance of female characters over time. I colored these by name of character. I thought it was interesting to look at how often each character appeared. By doing it this way there are two levels of visuals: the first is what you automatically see in the bar chart, and the second is the layered information given by the colors and tooltips.

Quantified Self: Me and My iTunes Library

This blog post is a story of time: too little time and too much time. This week we were working with quantified self data. I enjoy listening to music and have spent a lot of time adding new music to my library and creating playlists. For this project I decided to analyze the data from my iTunes library to get a sense of what genres of music I listen to the most and which artists.

Originally I wanted to answer the question “how diverse is the music I listen to?” I thought of diversity geographically and wanted to map out the artists in my library by birthplace. I was interested in where my music was coming from and how music and culture can cross international borders. I quickly found that although it was a cool project (that I will probably work on anyway), given the time constraints for this week I would have to focus on something else. Although I did not create the mapping visualization, my tiny initial inquiry proved to be fruitful in answering that question. A dance/electronic song in Japanese that I enjoy was made by a young woman in Brooklyn, and a few Afrobeats artists I like are London born and bred.

Switching gears, I began thinking about my taste over time. Based on the data, I’ve had iTunes since 2010. I’m sure my taste in music has changed over the last 9 years. iTunes has a wealth of information on my listening habits. It tracks everything from the date a song was added to the number of times it’s been skipped, the bit rate of each song, and much more. That wealth of information threw me a bit off course since I could see so many possibilities and had so little time. Then began my time issues.

iTunes has information down to the minute about when you added music, modified it, and last played it. Tableau returned every one of those timestamps down to the minute and I had trouble figuring out how to shift that to hours. I was interested in the time of day that I added music to my iTunes. I know I’m a night owl and wanted to see if there was a correlation between the time of day and the number of songs added to iTunes.
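I did not get this working in Tableau that week, but here is a small Python sketch of the idea: drop the minutes and count songs by the hour of day they were added. The timestamp format and the sample values are assumptions made up for the example, so the format string would need to match whatever the real export looks like.

```python
from collections import Counter
from datetime import datetime


def songs_added_per_hour(date_added_values, fmt="%Y-%m-%d %H:%M"):
    """Count songs by the hour of day (0-23) in their 'date added' timestamp."""
    hours = (datetime.strptime(value, fmt).hour for value in date_added_values)
    return Counter(hours)


# Made-up timestamps in the assumed format.
sample = ["2013-05-04 23:41", "2013-05-05 23:02", "2014-01-01 09:15"]
per_hour = songs_added_per_hour(sample)
print(per_hour[23], per_hour[9])
```

With counts per hour in hand, a simple bar chart of the 24 hours would show whether the late-night hours really dominate.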

In the end I worked on visualizations using the genre and number of plays. I felt it could help with the answer to what I listen to often. That was when I realized a big hole in my data. Many of the songs in my library had null values for genre. I think this is a commentary on me as a user. I often look for music by song title or artist so those fields are pretty up to date. Genre however seems to be the one I was least worried about. Possibly because the song title and artist would dictate to me what the genre is.

 

 

 

The second visualization

 

 

 

Viz 2

https://public.tableau.com/shared/W9TKSQXRH?:display_count=yes&:origin=viz_share_link 

 

Viz 3

 

 

Drinking Water Complaints in Queens 2014-2018

Growing up in New York City, I remember teachers talking about how good the water quality was here. I had no real concept of why this was important or why the adults around me seemed so excited about it. I thought that clean water was an expectation. It was something that the government is supposed to provide. Today as I look around the world at the many places that don’t have clean drinking water, both abroad and at home, I have a better understanding of what the adults then were talking about. Many in this country are still fighting for the right to clean drinking water: activists in Flint, Michigan, indigenous people in South Dakota, and teachers documenting the lack of access to drinking water in New York City public schools.

Using the 311 data I set out to analyze which Queens neighborhoods had the most complaints about drinking water between 2014 and 2018. I also wanted to know if drinking water complaints were increasing or decreasing over time. I was interested in researching neighborhood-specific records for a few reasons. This data visualization could be useful for community board members, local politicians, parents and families, people looking to move to Queens, public health services agencies in NYC, and people already living in Queens. I felt the best way to illuminate any possible issues would be through specificity. If, for example, I am someone currently living in Queens who wants to move to a different part of Queens, it would be more beneficial to know statistics for the new neighborhood I’m looking at than to know the statistics for Queens as a borough. Queens was the easiest borough to show by neighborhood because each neighborhood is labeled as its own city in the 311 data.

I have created several visualizations. The first is a bar graph showing the number of complaints each neighborhood made over a four-year period, 2014 through 2018. I felt a bar graph was the best way to show comparison between neighborhoods. In the first visualization, I removed any neighborhood that had two or fewer incidents of contaminated water. I felt that two occurrences over four years did not show a possible chronic issue. This also allowed me to show all the neighborhoods at once with no scrolling.
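The filter behind the first bar graph can be sketched in Python as well. This assumes a list with one city value per complaint, written in capitals as in the sample below; the sample values, the counts, and the function name are made up for illustration and are not my actual results.

```python
from collections import Counter


def neighborhoods_over_threshold(complaint_cities, minimum=3):
    """Count complaints per neighborhood and keep those with at least `minimum`."""
    counts = Counter(
        city.strip().title() for city in complaint_cities if city and city.strip()
    )
    kept = {name: n for name, n in counts.items() if n >= minimum}
    # Largest first, which matches how a sorted bar graph would read.
    return sorted(kept.items(), key=lambda item: item[1], reverse=True)


# Made-up complaint rows: one city value per complaint.
sample = ["FLUSHING"] * 4 + ["JAMAICA"] * 3 + ["ASTORIA"] * 2
result = neighborhoods_over_threshold(sample)
print(result)
```

Setting `minimum=3` is the same rule as dropping neighborhoods with two or fewer incidents.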

The second visualization shows the number of complaints made each year by each neighborhood.

The third visualization shows the trend of complaints about drinking water over time. For this visualization I used a line graph to make a timeline showing changes in the number of complaints each year from 2014-2018. It shows that overall the number of drinking water complaints spiked in 2016 and has been declining since.

The last visualization shows the trend in drinking water complaints by month. This line graph shows the total number of complaints made each month during the 4 year time frame. June has the highest number of complaints. Maybe that is due to construction starting up in the summer months.

Looking to the future of this project, I’d be interested to see if there is any relation between drinking water complaints and median income or neighborhood demographics. I noticed that Flushing and Jamaica, Queens have the highest instances of illness due to drinking water. Flushing has a large Asian immigrant population. I wonder if that has any effect on my findings. The complaints do not give much context. Maybe there were factors happening in the community, like a broken pipe or construction, that contributed to the complaints. There is no way to know that information based on my current data.