Tuesday, April 23, 2013

Capturing and Characterizing Instagram

My previous posts have talked about my fascination with both Twitter and Instagram. 

But how does one analyze the behaviors of a community consisting of 90 million users who post 40 million images each day?

Last February, I tried a little experiment to see if there was a way to use Twitter to "capture" the nature of Instagram images. For about 30 minutes, I simply watched what got Tweeted with the #instagram hashtag. Inspired by what I was seeing, I collected some of the coolest images so I could go back and view them later. In order to group the images I liked, I retweeted those Instagrammers' Tweets that caught my eye, adding my own hashtag #15minutesofinterestinginstafindsontweetdeckfeb192013 to each of my retweets. I used an app called Tweetdeck, which allows me to see the Tweets of several different hashtag groups on one screen, each group in its own column. With Tweetdeck and my ridiculously long hashtag, I now had a collection of my favored Instagram images that had been shared on Twitter! Later I went back to my hashtag collection, nicely grouped together in a Tweetdeck column, and looked at everything I had gathered. By clicking on any Instagram image appearing in the Tweeted (and now retweeted) message, I could see the image full size, displayed on a webpage (an Instagram page). I shared some of my findings about Instagram in my Feb 17, 2013 blogpost and in one of my presentations at the Annual conference of the NAEA in March 2013. My snapshot of Instagram suited my purposes at that time. I selected what interested me and shared my findings.

What if someone wanted a more definitive "snapshot" of Instagram? How might one collect a more representative sample of Instagram images (images that were posted to Twitter)? And what procedures might one employ to understand the kinds of images one finds on Instagram? Answering these questions beyond the obvious is no easy task, since by recent estimates nearly 40 million images are posted to Instagram each day.

Today's post poses one possibility. I'm calling this procedure An Instagram Snapshot in Time (on Twitter). This procedure relies on the fact that Instagrammers appear to like Twitter, which is to say, they like having an audience. I've had to modify my strategy since last February, since Instagram images no longer show in Twitter streams. 

Here are my recommendations.

Data Gathering: Image selection
  1. Establish a set time span for collecting images tweeted with the #instagram hashtag. For example, 6:00 a.m. until 6:20 a.m. on a certain day (I used Feb 20 in my example).
  2. Retweet each #instagram Tweeted post, giving your retweet your own hashtag.  I recommend a descriptive hashtag with a date such as #IG_feb20_6am
  3. Work fast. Instagrammers are uploading a lot of images and Tweeting about them. The Twitter stream will move rapidly. If you miss retweeting an #instagram Tweet, grab the next one. 
  4. Once your timed session has concluded, you can see all of your Retweets together in a Tweetdeck column or on your regular Twitter web page. Learn to find and display tweets with a specific hashtag to see your retweets. Twitter and Tweetdeck will display your specifically hash-tagged retweet, filtering out everything else in that display. Retweets will display this way only for about a week.
  5. After about a week, to see your specifically hash-tagged retweets, you will have to go back to all of your tweets, and look for those that you retweeted on a specific day and time. 
  6. Next - capture the URL for every image that was tweeted with the #instagram hashtag during the timespan you retweeted them. Follow the link from each original #instagram Twitter post (that you had retweeted) to each URL. The URL will most likely be a web link to the original Instagram image. Copy and paste these URLs (from Twitter) onto a list you create in MS Word. Using your list, you can now go back and look at the images that are found at these URLs. Your Browser History will also have these URLs.
  7. Repeat the procedure at noon, 6:00 p.m., and midnight so that the Tweeted Instagram images are more likely to have come from IGers living in different time zones (IGers are from all over the world).
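
The timing and hashtag conventions in the steps above can be sketched in a few lines of Python. This is only a planning aid under stated assumptions, not part of the procedure I actually ran: the function names, the 20-minute session length (taken from the 6:00 to 6:20 a.m. example), and the choice to label the midnight session with the collection day's date are all illustrative.

```python
from datetime import datetime, timedelta

# Lowercase month abbreviations, matching the #IG_feb20_6am example.
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")

# Assumed session length, based on the 6:00 to 6:20 a.m. example.
SESSION_MINUTES = 20

# Hours after the start of the collection day: 6 a.m., noon, 6 p.m., midnight.
# Midnight is written as 24 so that it falls at the end of the day.
START_OFFSETS_HOURS = (6, 12, 18, 24)


def hour_label(offset_hours):
    """Return a short label such as '6am', 'noon', '6pm', or 'midnight'."""
    hour = offset_hours % 24
    if hour == 0:
        return "midnight"
    if hour == 12:
        return "noon"
    suffix = "am" if hour < 12 else "pm"
    return f"{hour % 12}{suffix}"


def session_plan(year, month, day):
    """Build a list of (hashtag, start, end) tuples for one collection day."""
    base = datetime(year, month, day)
    plan = []
    for offset in START_OFFSETS_HOURS:
        start = base + timedelta(hours=offset)
        end = start + timedelta(minutes=SESSION_MINUTES)
        # Assumption: every session is tagged with the collection day's date,
        # even the midnight session that starts on the next calendar day.
        tag = f"#IG_{MONTHS[month - 1]}{day}_{hour_label(offset)}"
        plan.append((tag, start, end))
    return plan


if __name__ == "__main__":
    for tag, start, end in session_plan(2013, 2, 20):
        print(tag, start.strftime("%Y-%m-%d %H:%M"), "to", end.strftime("%H:%M"))
```

Deciding on the hashtags before the first session starts helps, because the Twitter stream moves too fast to be inventing labels while retweeting.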
Organization of Data: Create a Data Archive 
  1. Go to each URL (web page containing the Instagram image) given in the Tweets during your time span (URLs that you now have either in your browser history or on the list that you created). 
  2. Capture or download the image and accompanying text appearing on each Instagram web page. 
  3. Create a labeling system for each image as you capture and archive it, for later analysis. For example, relabel each image file with an easy-to-understand label such as IG-1_feb20_6am.jpg, IG-2_feb20_6am.jpg, etc. The underscore symbol makes the file label easier to read. 
  4. Create a list of the Instagrammers whose images you have archived, identifying who created IG-1_feb20_6am.jpg, who created IG-2_feb20_6am.jpg, etc.  Use their IG names.
  5. Create a short profile for each Instagrammer, following their Tweets and Instagram uploads to their profiles, and then to their websites or other social media accounts. Obtain as much information as possible by reading their profiles and following their links.  Who are these folks?
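
The labeling system and the list of creators described above could be kept together in one small manifest. The sketch below is a hedged illustration rather than a tool I used: the function names and the three column names are assumptions, and the URLs and IG names in the usage example are placeholders, not real accounts.

```python
import csv
import io


def archive_label(index, session, extension="jpg"):
    """Return a file label such as 'IG-1_feb20_6am.jpg'."""
    return f"IG-{index}_{session}.{extension}"


def build_manifest(entries, session):
    """Pair each captured image with its label and creator.

    entries: list of (url, ig_name) tuples, in the order the images
    were captured during one session.
    """
    rows = []
    for index, (url, ig_name) in enumerate(entries, start=1):
        rows.append({
            "label": archive_label(index, session),
            "ig_name": ig_name,
            "url": url,
        })
    return rows


def manifest_to_csv(rows):
    """Render the manifest as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["label", "ig_name", "url"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder data for illustration only.
    captured = [
        ("http://example.com/image-one", "placeholder_iger_1"),
        ("http://example.com/image-two", "placeholder_iger_2"),
    ]
    print(manifest_to_csv(build_manifest(captured, "feb20_6am")))
```

Numbering the labels in capture order keeps the file names, the creator list, and the original URLs lined up, which matters later when each image is described and coded.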
Creating an Image Archive through Online Curation

Once Instagram images have been identified for inclusion in a study, it would be great to be able to see the entire set all together in one place. There are many ways to curate a collection of images found on the Internet these days.  One of my favorite tools for curating collections is Pinterest.

To create a collection online of the Instagram images that had been tweeted with the #instagram hashtag, I spent about 20 minutes retweeting these #instagram tagged tweets so I could go back to my Twitter account later to see what all I had. Then, on my Twitter page of my own Tweets, I followed each of my retweets to each IGer's instagram image. Once I was on the site where their Instagram image was displayed, I pinned each image to a Pinterest board. Under each pinned image, I also included the number of likes and comments each image had generated within about a month since the IGer first posted the image. Instagram is dynamic. Anyone that sees an Instagram image may continue to like and/or comment on the image long after the image is originally uploaded to Instagram. As a result, Instagram likes and comments are not a static number, and the number of likes and comments I provided under each image in my Pinterest collection may change over time. Therefore, I recommend doing this procedure a week later in order to capture likes and comments. After about a week, there will likely be few additional likes or comments for a specific image.

Below is a screen capture of part of an Instagram image archive I created in Pinterest from Instagram images that had been tweeted with the Twitter hashtag #instagram on Feb. 20, 2013.  I called my Pinterest collection (aka "pinboard") "An Instagram Snapshot in Time, February 20, 2013". The use of the term "snapshot in time" is intended to convey that this was a strategy that captured Instagram posts to Twitter at a particular time. My entire Feb. 20, 2013 collection is now available in one Pinterest pinboard at http://pinterest.com/edelacruz/instagram-snapshot-in-time-feb-20-2013/

Figure 1: An Instagram Snapshot in Time, February 20, 2013

Data Analysis: What can we say about Instagram imagery and their creators? 

Instagram images are anything goes! They will be fun to describe, code, and sort. The process is inductive, but systematic. There are numerous writings in research methods discourse that explain how to analyze qualitative data in order to make sense of it (Guba & Lincoln, Patton, Maxwell, etc). Commonalities across those writings about data analysis methods include closely reading the material gathered, describing the material, coding it, categorizing and grouping similar content together, identifying patterns that emerge, and explaining what's there based on these aforementioned procedures. 

At the outset of any study of Instagram, I'd be curious to know what these images are about and what aesthetic qualities they have. What is their subject matter? What are their aesthetic qualities? How many Instagram images are totally abstract, partly abstracted, or realistic? How many are "selfies"? How have they been manipulated using apps and filters? How many are closeups, mid-range, or far away shots? 

In attempting to answer these kinds of questions I would use procedures that come from art criticism and art history. Art critics and art historians often talk about subject matter, meanings, visual qualities, processes, conventions of style, the context in which the art form was produced, biographic information about the artists, and intended audiences.  Researchers in engineering and empirical aesthetics have also attempted to describe images and (using statistical methods popular in social science research) explain how they are understood or appreciated by audiences, but their methods are not particularly robust. Art history and criticism use empirical methods that are qualitative, narrative, and more authentic and convincing: description, image analysis, identifying conventions of style, and finding out about the creator of the image. Using art criticism strategies, I recommend describing the Instagram images with short phrases or single words that identify subject matter, compositional strategies, and aesthetic qualities in each image. Each image will have multiple descriptors (which can later function as codes for analysis purposes). A table of images and their descriptions is a must. The table would include the image label and identify its creator. Possible areas for description and analysis might include but are not limited to the following:
  • subject matter
  • composition, framing strategies, and depth of field
  • formal qualities such as color, pattern, lighting, or texture
  • degrees of realism or abstraction
  • kinds of transformations made to the photo through apps and filters
  • locations where the pics were taken (if discernable or described)
  • the content of the text that gets tweeted with the image
As in any qualitative research, content that seems important enough to become a descriptive category should be shaped by the data itself (the images). The descriptors heading the rows in Table 1 below are just a starting place.
Table 1. Describing Characteristics or Attributes of Instagram Images

Data Analysis: Coding of Instagram Images

The above descriptions of Instagram images should be coded for later analysis. Each Instagram image should be coded with multiple characteristics, including a code for each of the major categories that you establish as important (my initial codes are derived from my Feb. 15, 2013 informal study, and are shown in the shaded column below). Using my coding, a cityscape could be coded as SM-A to indicate: Subject Matter = Cityscape. Using my coding, a “selfie” (self-portrait) could be coded SM-E(self) to indicate Subject Matter = Portrait(Selfie).
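
To show how codes of this shape might be handled once they are typed into a file, here is a minimal Python sketch. It assumes each code follows the pattern category, hyphen, letter, and an optional qualifier in parentheses. Only the two codes named in this post (SM-A and SM-E(self)) are filled in; the lookup dictionaries and function name are hypothetical, and the rest of the coding scheme would have to be entered by hand.

```python
import re

# Only the codes named in this post are listed here. Every other entry
# in the coding scheme would need to be added by the researcher.
CATEGORIES = {"SM": "Subject Matter"}
VALUES = {("SM", "A"): "Cityscape", ("SM", "E"): "Portrait"}
QUALIFIERS = {"self": "Selfie"}

# Assumed shape of a code: CATEGORY-LETTER with an optional (qualifier).
CODE_PATTERN = re.compile(r"([A-Z]+)-([A-Z])(?:\((\w+)\))?")


def describe_code(code):
    """Expand a code such as 'SM-E(self)' into a readable description."""
    match = CODE_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"Unrecognized code format: {code!r}")
    category, letter, qualifier = match.groups()
    if (category, letter) not in VALUES:
        raise ValueError(f"Code not in the lookup table: {code!r}")
    text = f"{CATEGORIES[category]} = {VALUES[(category, letter)]}"
    if qualifier:
        # Fall back to the raw qualifier if it has no readable name yet.
        text += f"({QUALIFIERS.get(qualifier, qualifier)})"
    return text


if __name__ == "__main__":
    for code in ("SM-A", "SM-E(self)"):
        print(code, "->", describe_code(code))
```

Rejecting codes that are not in the lookup table is a deliberate choice: a mistyped code raises an error right away instead of quietly distorting the counts later.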

Table 2. Coding Attributes of Instagram Images and Posts

Data Analysis: Frequencies of Attributes

Once attributes of individual images are described and coded using the symbols given in Table 2, a frequency table for each attribute (such as subject matter or compositional strategy) may be constructed to find out how common an attribute is across multiple images.  For example, how common are cityscapes or sporting events? How common are dynamic diagonally composed images? How common are color images vs. black and white images?  How common are filters and apps used to modify images?  Using MS Excel, a frequency table would need to be created for each attribute that seems important. (Table 3 shown below is hypothetical, and was created in MS Word.)
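
For anyone who would rather not build each frequency table by hand, the counting step can be sketched in Python with the standard library's Counter. This is an illustration under assumptions, not the method used for this post: the function name and data layout are hypothetical, and the four coded images in the usage example are invented sample data, not findings.

```python
from collections import Counter


def frequency_table(coded_images, category):
    """Count how many images carry each code within one category.

    coded_images: dict mapping an image label to its list of codes.
    category: code prefix such as 'SM' for subject matter.
    Returns (code, count, percent_of_images) rows, most common first.
    """
    total = len(coded_images)
    if total == 0:
        return []
    counts = Counter()
    for codes in coded_images.values():
        # set() ensures an image is counted once per code,
        # even if the same code was entered twice by mistake.
        for code in set(codes):
            if code.startswith(category + "-"):
                counts[code] += 1
    # Sort by count (descending), then by code, so ties have a stable order.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(code, n, round(100 * n / total, 1)) for code, n in ordered]


if __name__ == "__main__":
    # Invented sample data for illustration only.
    sample = {
        "IG-1_feb20_6am.jpg": ["SM-A"],
        "IG-2_feb20_6am.jpg": ["SM-E(self)"],
        "IG-3_feb20_6am.jpg": ["SM-A"],
        "IG-4_feb20_6am.jpg": ["SM-A"],
    }
    for code, count, percent in frequency_table(sample, "SM"):
        print(f"{code:12} {count:3} {percent:5.1f}%")
```

The percentages are computed against the number of images, not the number of codes, so an image that carries two subject-matter codes contributes to both rows.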

Table 3. Instagram Subject Matter Frequency Table
Data Reduction

Based on analysis of about 50 Instagram images that I had collected from Tweets during a specific time on Feb 20, 2013 (see my Pinboard or Figure 1 above), I found some trends that I have also seen in my other Instagram forays. Commonly posted subject matter included exterior and interior scenes, friends and self-portraits, food and clothing, events or sports, pets and favorite personal objects, signage, closeups, and interesting abstractions. Figure 2 below gives examples of commonly found subject matter from my Feb. 20 collection. Clearly, at the time of this snapshot, Instagram images conveyed the experiences, values, and stories of their creators through various themes and contents.  I have never been a universalist in my understanding of art forms, and I always consider art in light of its cultural context and artists' intentions. But there's just something about these individual Instagram images that transcends their creators.

Figure 2: Subject Matter Trends from February 20 Instagram Archive

Who are all these people anyway?

Any characterization of the Instagram community has to include descriptions of the Instagrammers themselves. Who are these individuals? Where are they from? How many are male/female? What other demographic data can one attribute to these Instagrammers? In my Feb. 15 study, I was able to find out quite a bit of information about the Instagrammers by following their tweets back to their Twitter and IG accounts and looking at their profiles. In some cases this also included following links to their Flickr accounts, and even to their webpages. I talked about this strategy briefly in my post about my NAEA presentation.

Some additional digging (following links back to their websites) would help in development of a separate table of characteristics of IGers.  Such a Table (see Table 4 below) would provide a "snapshot description" of these folks, and allow for coding of attributes given in each of the columns for later analysis. Additional attributes would emerge and be added in new columns, as study and analysis of these IGers commences.

Table 4. Characteristics of Instagrammers 

Neither image analysis nor describing the characteristics of those Instagrammers whose images are selected would be a fast process. But using systematic procedures such as this "snapshot in time" approach gives substance and breadth to the researcher's explanations.


Limitations of findings from this approach include the fact that we don't really know what percentage of IGers (Instagrammers) also Tweet their images to Twitter, but I can say with confidence, it's a whole bunch. Other limitations include obvious questions about "which" moment in time best represents the images shared in the Instagram community, and whether or not 20 minutes is sufficient time to gather images.