Colors of Street Photography from Reddit
Color composition analysis with PRAW and ColorThief
Inspired by a recent article, “Average Jeans Color by State, 2020” by Khyatee Desai, I wanted to create a color-based analysis of photographs. More specifically, I wanted to use the ColorThief library to analyze the colors of street photography. My goal was simple: collect street photographs and generate color palettes from them to quickly analyze and draw insights about the colors of the photos. I decided to gather my data from one of my favorite subreddits, r/streetphotography, using Reddit’s PRAW API.
Methods
First, to access Reddit through PRAW you will need API access from Reddit. You can check out the wiki here to register and get more information. After you have your API keys, initializing the PRAW library is very easy.
import praw
reddit = praw.Reddit(client_id=api_id, # your API ID
client_secret=api_secret, # your API secret
user_agent=agent) # unique name for app
Accessing an individual subreddit is done through the subreddit method. The resulting subreddit object offers many search/sort methods; I’m using the top method for my search.
subreddit = "streetphotography" #subreddit name after r/ in url
time = "week" # time frame to complete the search
limit = 10 # number of posts to grab
top_posts = reddit.subreddit(subreddit).top(time_filter=time, limit=limit)
Each submission returned by top has a few attributes relevant to this project, like title and url. Unfortunately, the ColorThief library cannot process online images directly, so the image at the url has to be converted into readable local data. I tackled this by reading the image bytes from the url into an in-memory binary stream using the io and urllib libraries. Below is a brief example of the conversion and of obtaining the color palette.
from urllib.request import urlopen #gets url data
import io #converts to binary
from colorthief import ColorThief #gets palettes

rgblist = [] #to store all rgb data
for post in top_posts: #iterate through top posts
    post_url = urlopen(post.url) #gets url data
    post_bin = io.BytesIO(post_url.read()) #converts to binary
    ct = ColorThief(post_bin) #initialize ColorThief with binary
    rgb = ct.get_palette(color_count=10, quality=10) #get palette
    rgblist.append(rgb) #append and store rgb data
Last, to plot the RGB colors in matplotlib, it’s easier to convert the RGB data to hex strings. I used the webcolors library’s rgb_to_hex method.
import webcolors #library for conversion

#rgblist holds one palette (a list of RGB tuples) per post
hexcolors = [[webcolors.rgb_to_hex(color) for color in palette]
             for palette in rgblist]
I visualized the hex colors through matplotlib and found some very interesting results.
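The article doesn’t include the plotting code itself, but a minimal sketch of rendering one palette as a strip of swatches might look like this (the plot_palette helper and the sample hex values are my own illustration, not the original analysis):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs in scripts
import matplotlib.pyplot as plt

def plot_palette(hexcolors):
    """Render each hex color as one square swatch in a horizontal strip."""
    fig, ax = plt.subplots(figsize=(len(hexcolors), 1))
    for i, color in enumerate(hexcolors):
        # one unit-square patch per color, left to right
        ax.add_patch(plt.Rectangle((i, 0), 1, 1, color=color))
    ax.set_xlim(0, len(hexcolors))
    ax.set_ylim(0, 1)
    ax.axis("off")  # swatches only, no axes
    return fig, ax

# hypothetical palette values for illustration
fig, ax = plot_palette(["#1a1a1a", "#8c6f4e", "#d9c7a0", "#f2e9d8"])
fig.savefig("palette.png", bbox_inches="tight")
```

Ordering the swatches left to right preserves ColorThief’s dominance ranking, which matters for the observations below.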
Monochromes
My first instinct was to check for monochromatic photos. As you might guess, it’s very easy to spot a monochromatic photo by looking at its color palette.
A color palette can also reveal a monochromatic-esque photo. The palette to the left shows that the photo is composed mostly of warm beige-brown tones. Although it’s technically a color photo, it reads like a sepia-toned monochromatic photo.
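One simple heuristic for the greyscale case (my own sketch, not the article’s code): a palette reads as monochrome when every color’s RGB channels are nearly equal, since equal channels mean grey. The palette values below are hypothetical.

```python
def is_monochrome(palette, tolerance=10):
    """True when every color in the palette is near-grey, i.e. the spread
    between its RGB channels stays within the given tolerance."""
    return all(max(color) - min(color) <= tolerance for color in palette)

# hypothetical palettes for illustration
grey_palette = [(20, 20, 22), (128, 130, 127), (240, 238, 241)]
sepia_palette = [(112, 85, 60), (180, 150, 110), (230, 210, 180)]
print(is_monochrome(grey_palette))   # True
print(is_monochrome(sepia_palette))  # False
```

Note the sepia palette fails this check even though the photo feels monochromatic; catching that case would require comparing hues rather than channel spread.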
Negative Space
ColorThief orders colors by dominance, most dominant first. Therefore, seeing white or black as the first tone in the color palette translates well to the negative space of the photograph.
Of course, this is not a perfect analysis, since it requires the photo to contain significant and uniform negative space. Nonetheless, it was very interesting to be able to pick out the negative space using the palette as a guide.
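This check can also be automated. A rough sketch (my own, with a hypothetical brightness threshold): flag a palette when its first, most dominant color sits near pure black or pure white.

```python
def dominant_is_negative_space(palette, threshold=40):
    """True when the most dominant palette color is near black or near
    white -- a rough proxy for large uniform negative space."""
    r, g, b = palette[0]  # ColorThief lists the dominant color first
    brightness = (r + g + b) / 3
    return brightness <= threshold or brightness >= 255 - threshold

# hypothetical palettes for illustration
print(dominant_is_negative_space([(245, 248, 250), (30, 40, 60)]))  # True
print(dominant_is_negative_space([(180, 60, 50), (90, 90, 90)]))    # False
```

Averaging the channels is a crude brightness measure; a perceptual luminance formula would be more faithful, but this is enough to sort the obvious cases.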
Computers and humans don’t see eye to eye
This part of the analysis had me scratching my head for a while. Often, the color palette contains colors that seem to be nonexistent or present only in tiny details. I realized this is due to the difference in how a computer and a person interpret an image. We perceive images as a whole and can analyze details like color within the context of the art form. A computer, however, must do this pixel by pixel. This is actually one of the more challenging aspects of image classification in machine learning. Take the image below as an example.
From the photograph below, could you pick out the last three colors (grey-green, grey, and orange) of the palette?
The grey-green and grey are mixed in with the asphalt, which is very hard to perceive until you zoom in closely. The orange comes from the detectable warning pavers (the bumpy pavers for the visually impaired) and from fading old paint. We humans really wouldn’t think beyond the asphalt being black (or shades of grey) and the pavers being yellow.
Let’s take a look at another example. When you look at the photo below, what seems to be the dominant color? I would argue that it’s red, accented by warm tones like yellow and beige, and perhaps even green.
The code, however, says the dominant colors come from the buildings: black, beige, and red-brown. Red is ranked second to last in this palette.
Overall, I hope you learned something from this unconventional data analysis. I certainly had a lot of fun developing this method and learned a lot about an interest of mine along the way. If you would like to try out Reddit’s PRAW API or ColorThief, feel free to check out my GitHub repo for reference. Last, but not least, please check out the embedded photograph-credit links to see more from the photographers!