2011 01 22
Posting frequency


Posted by in: Odds and ends

Wow, no posts since November! You all know intuitively that we post a lot less frequently these days. What you’ve hitherto lacked, however, is a chart setting it out for you:

Yeah, it’s not the world’s greatest-looking chart, but you get the idea. (I’ve never used matplotlib before. I’m guessing a log scale on the y axis might have helped.) By the way, that little spike in early 2004 is misleading. Somehow a bunch of posts from that period went missing, and I haven’t yet tracked down where they went.

To make the chart, I just exported our published posts since April 2004 from WordPress and then ran this script on the XML file, passing its path as the only argument. You need to install matplotlib first.


from xml.etree import ElementTree as ET
import datetime
import sys

from matplotlib import pyplot as plt
import pylab


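def get_dates(filepath):
    """Return the publication date of every post in a WordPress export file."""
    dates = []
    # Let ElementTree open the file itself so the XML encoding declaration is respected.
    root = ET.parse(filepath).getroot()
    for pubDate in root.findall('channel/item/pubDate'):
        # Strip the " +0000" UTC offset so the string matches the strptime format below.
        date_string = pubDate.text.replace(" +0000", "")
        date = datetime.datetime.strptime(date_string, "%a, %d %b %Y %H:%M:%S")
        dates.append(date)
    return dates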


def get_intervals(dates):
    """Pair each post's date with the number of whole days since the previous post."""
    intervals = []
    for previous, date in zip(dates, dates[1:]):
        delta = date - previous
        intervals.append((date, delta.days))
    return intervals
                         

def plot_intervals(intervals):
    """Plot the gap before each post against the post's date."""
    dates = [date for date, value in intervals]
    values = [value for date, value in intervals]

    plt.plot_date(pylab.date2num(dates), values, linestyle='-')
    plt.title("Time Away From You Over Time")
    plt.xlabel("Date")
    plt.ylabel("Days Between Posts")
    # grid is a function: assigning True to it would overwrite it and draw no grid.
    plt.grid(True)
    plt.show()


def main():
    dates = get_dates(sys.argv[1])
    intervals = get_intervals(dates)
    plot_intervals(intervals)


if __name__ == '__main__':
    main()
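
For anyone who wants to try the log scale I was guessing about, here is a rough sketch of how it might look. It is not what produced the chart above. The function names `intervals_in_days` and `plot_intervals_log` are made up for this example, and I am assuming two things the original script doesn't do: the dates get sorted first, and gaps are measured in fractional days so that two posts on the same day don't produce a zero, which a log axis can't show.

```python
import datetime


def intervals_in_days(dates):
    """Return (date, gap) pairs, with gap in fractional days since the previous post.

    Assumption: dates are sorted here, so the order of the export doesn't matter.
    """
    ordered = sorted(dates)
    return [
        (later, (later - earlier).total_seconds() / 86400.0)
        for earlier, later in zip(ordered, ordered[1:])
    ]


def plot_intervals_log(intervals):
    """Plot the gaps against their dates with a logarithmic y axis."""
    # Imported here so the helper above can be used without matplotlib installed.
    from matplotlib import pyplot as plt

    # A gap of exactly zero (identical timestamps) cannot be drawn on a log axis.
    points = [(date, gap) for date, gap in intervals if gap > 0]

    fig, ax = plt.subplots()
    ax.plot([date for date, _ in points], [gap for _, gap in points],
            linestyle='-', marker='o')
    ax.set_yscale('log')
    ax.set_title("Time Away From You Over Time")
    ax.set_xlabel("Date")
    ax.set_ylabel("Days Between Posts (log scale)")
    ax.grid(True)
    plt.show()
```

To use it, swap the last two lines of `main()` for `plot_intervals_log(intervals_in_days(dates))`. Whether the result is actually easier to read than the linear version is something I haven't checked against our data.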
