I had an idea for a basic web app that pulls two bands from a list and has users compare them. For a while my old roommate and I would do this when we got burritos. Asking which is worse can sometimes be a more interesting question than asking which is better, even when both serve the same overall purpose of ranking things.

Here is the link to the live site and here is the link to the GitHub repo.

This was my first experience with MongoDB and with NoSQL databases in general. MongoDB has drawn a lot of criticism, and earlier versions had some bugs that could cause data loss in certain situations. Since this was just for my own experimentation I wasn't worried about potential data loss, and the flexible document structure made it much simpler to start storing my data than setting up relational database tables would have been.

In the getartists.py script I used Last.fm's API to populate my database with some basic information for the 1,000 most popular bands. At the beginning of the script I use Requests to assemble the API parameters in the payload variable and send the GET request.

import requests
import pymongo
from keys import keys

API_KEY = keys['lastfm_key']
API_SECRET = keys['lastfm_secret']

payload = {'method':'chart.gettopartists', 'limit':'1000', 'api_key':API_KEY, 'format':'json'}
r = requests.get('http://ws.audioscrobbler.com/2.0/', params=payload)

The script then goes through the returned list, removes some of the irrelevant information, and creates a list of dictionaries containing the information for each artist.

artists = []

for item in (r.json()['artists']['artist']):
	del item['streamable']
	del item['mbid']
	item['votes'] = 0
	for picture in item['image'][:]:
		if picture['size'] == 'extralarge':
			item['pic'] = picture['#text']
	del item['image']
	artists.append(item)
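
The cleanup step can also be written as a small function, which makes it easier to try out without calling the API. The sketch below is a hypothetical refactoring, not the code in getartists.py: `clean_artist` and the sample record are made up for illustration, and it uses `pop` with a default so that a record missing `streamable` or `mbid` does not raise a `KeyError`.

```python
def clean_artist(item):
    """Return a trimmed copy of one Last.fm artist record (hypothetical helper)."""
    artist = dict(item)                # shallow copy so the API response is left untouched
    artist.pop('streamable', None)     # pop with a default tolerates missing keys
    artist.pop('mbid', None)
    artist['votes'] = 0
    # Keep only the URL of the 'extralarge' image, if the record has one.
    for picture in artist.pop('image', []):
        if picture.get('size') == 'extralarge':
            artist['pic'] = picture.get('#text')
    return artist


# Made-up sample record with the fields the script above touches.
sample = {
    'name': 'Example Band',
    'url': 'http://www.last.fm/music/Example+Band',
    'mbid': '',
    'streamable': '0',
    'image': [
        {'#text': 'http://example.com/small.png', 'size': 'small'},
        {'#text': 'http://example.com/xl.png', 'size': 'extralarge'},
    ],
}

cleaned = clean_artist(sample)
print(cleaned)
```

Working on a copy means the original response can still be inspected afterwards, which is handy when the API adds or drops a field.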

Finally, the script connects to MongoDB and, since nothing exists there yet, sets up the database and collection and then inserts the information. MongoDB's JSON-like document structure made it really simple to submit the information as Python dictionaries.

connection = pymongo.Connection()
db = connection['worstbandever']
collection = db['artists']

posts = db.artists

posts.insert(artists)

Once the database has been built, there is no further need for the getartists.py script. Similarly, resetvotes.py is a short script I wrote to clear the voting information from the database. I used it mostly during development, and it generally doesn't need to run. As the name suggests, the script connects to the database and resets the votes to 0.

import pymongo

connection = pymongo.Connection()
db = connection['worstbandever']
collection = db['artists']

posts = db.artists
posts.update({}, {'$set': {'votes': 0}}, multi=True)
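
The scripts here use `pymongo.Connection` and the `insert`/`update` collection methods, which have since been removed from PyMongo (`Connection` in 3.0, `insert` and `update` in 4.0). On a current release the reset script would look roughly like the sketch below. `get_collection` and `reset_votes` are names invented for this example, and the connection assumes MongoDB on the default localhost port.

```python
def get_collection():
    """Open the artists collection (assumes MongoDB on localhost:27017)."""
    import pymongo  # imported here so reset_votes can be tried without the driver
    client = pymongo.MongoClient()
    return client['worstbandever']['artists']


def reset_votes(collection):
    """Set votes to 0 on every document and return how many were modified."""
    result = collection.update_many({}, {'$set': {'votes': 0}})
    return result.modified_count


# Usage, with a MongoDB server running:
#     reset_votes(get_collection())
```

The same change would apply to getartists.py, where `posts.insert(artists)` corresponds to `insert_many(artists)` in newer releases.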

The API keys and other secrets are stored in the keys dictionary in keys.py (which is listed in .gitignore) and can be imported by the other scripts as needed. keys_example.py is a blank template that can be filled out and renamed.

keys = dict(
	consumer_key = '', # Copy information from Twitter API
	consumer_secret = '',
	access_token = '',
	access_token_secret = '',
	flask = '', # Fill in with random string
	lastfm_key = '', # Copy information from Last.fm API
	lastfm_secret = ''
)

Among other data, the site displays the number of unique listeners for each artist on Last.fm and the number of times the artist's songs have been played. updatestats.py runs as a cron job to keep this information relatively up to date without pulling it each time the artist is loaded in the app.

The two functions, get_info and update_database, build and submit the GET request to the Last.fm API and then update the stats in the MongoDB database. The script builds its list of artists from the existing database rather than requesting the top 1,000 artists from Last.fm again, so that the statistics being updated are for the same artists the site uses.

Since separate API calls are being made for each artist, they are spaced out with a 5 second wait in between to distribute the load.

import requests
import pymongo
import time
from keys import keys

API_KEY = keys['lastfm_key']

connection = pymongo.Connection()
db = connection['worstbandever']
collection = db['artists']

def get_info(artist):
	payload = {'method':'artist.getinfo', 'artist':artist, 'api_key':API_KEY, 'format':'json'}
	r = requests.get('http://ws.audioscrobbler.com/2.0/', params=payload)

	# json is a method in current versions of Requests
	listeners = r.json()['artist']['stats']['listeners']
	playcount = r.json()['artist']['stats']['playcount']

	return listeners, playcount

def update_database(artist, listeners, playcount):
	# Set both fields in a single update
	posts.update({'name': artist}, {'$set': {'listeners': listeners, 'playcount': playcount}})

posts = db.artists
artists = []

for post in posts.find():
	artists.append(post['name'])

for artist in artists:
	listeners, playcount = get_info(artist)
	update_database(artist, listeners, playcount)
	time.sleep(5)
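
To make the shape of the response that get_info reads more concrete, the sketch below pulls the two numbers out of an already-decoded response. It converts them to integers on the assumption that Last.fm returns them as strings. The sample payload is invented, and `parse_stats` is a hypothetical helper rather than part of updatestats.py.

```python
def parse_stats(payload):
    """Extract (listeners, playcount) as integers from an artist.getinfo response."""
    stats = payload['artist']['stats']
    return int(stats['listeners']), int(stats['playcount'])


# Invented payload with the same nesting the script reads.
sample = {
    'artist': {
        'name': 'Example Band',
        'stats': {'listeners': '12345', 'playcount': '678910'},
    }
}

listeners, playcount = parse_stats(sample)
print(listeners, playcount)
```

Storing integers rather than strings would matter mainly if the site ever sorted or compared artists on these fields.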

tweet.py is another short script, called by the main script when a new band takes the lead in votes. The Twitter API information is loaded from keys.py and the tweet is sent using the Tweepy module.

import tweepy
from keys import keys

consumer_key = keys['consumer_key']
consumer_secret = keys['consumer_secret']
access_token = keys['access_token']
access_secret = keys['access_token_secret']

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

def send_tweet(text):
	api.update_status(text)

worstband.py is the main script for Flask, which is served by Gunicorn behind an Nginx proxy. After loading the required modules and defining the connection to the database, the script fills the artists list with all the artists from the database.

import flask
import pymongo
import tweet
from random import choice
from bson.objectid import ObjectId
from os import urandom
from keys import keys

connection = pymongo.Connection()
db = connection['worstbandever']
collection = db['artists']

posts = db.artists

artists = []

for post in posts.find():
	artists.append(post)

Functions are then defined for selecting two different artists from the full list (get_artists), reading the current play and listener counts for an artist from our local database in case updatestats.py has changed them (update_stats), finding the artist with the most votes in the database (check_top), and sending a tweet announcement using tweet.py (send_tweet).

def get_artists():
	artist1 = choice(artists)
	artist2 = choice(artists)
	while artist2 == artist1:
		artist2 = choice(artists)

	return (artist1, artist2)

def update_stats(artist_id):
	artist = posts.find_one({'_id':artist_id})
	return (artist['playcount'], artist['listeners'])

def check_top():
	for post in posts.find().sort('votes', -1).limit(1):
		top = post
	
	return top

def send_tweet(artist):
	tweet.send_tweet(artist[:116] + ' is now the worst band!')
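
The retry loop in get_artists works, but the standard library can also pick two distinct entries in one call. This is an alternative sketch rather than what the site runs. Here get_artists takes the list as a parameter, and the three-item list stands in for the documents loaded from MongoDB.

```python
from random import sample


def get_artists(artists):
    """Return two different artists chosen at random from the list."""
    # sample() picks distinct positions, so the same entry is never returned twice
    artist1, artist2 = sample(artists, 2)
    return artist1, artist2


# Stand-in data; the real list holds the documents loaded from the database.
artists = [{'name': 'Band A'}, {'name': 'Band B'}, {'name': 'Band C'}]

artist1, artist2 = get_artists(artists)
print(artist1['name'], 'vs', artist2['name'])
```

Note that `sample` raises a `ValueError` if the list has fewer than two entries, whereas the original loop would spin forever with a single artist.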

Flask is then initialized and the route for the main page is defined. A GET request to the main page generates a new page with two artists for comparison. It also checks whether a session id exists and, if not, adds a cookie with a randomly generated session id.

The site has a vote button for each artist which, when clicked, sends a POST with the voted artist's BSON ObjectId. If the request carries a session id, the vote is recorded in the database. Either way, the response is a 303 redirect to the main page, which then generates a new page with two new artists.

app = flask.Flask(__name__)
app.secret_key = keys['flask']

@app.route('/', methods=['GET','POST'])
def main():
	if flask.request.method == 'POST':
		if 'session_id' in flask.session:
			artist = flask.request.form['button']
			full_id = ObjectId(artist)
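			previous_top = check_top()
			posts.update({'_id': full_id}, {'$inc':{'votes': 1}})
			new_top = check_top()
			# Tweet only when this vote changes which band is in the lead
			if new_top['_id'] != previous_top['_id']:
				send_tweet(new_top['name'])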

		return flask.redirect('/', code=303)

	artist1, artist2 = get_artists()
	playcount1, listeners1 = update_stats(artist1['_id'])
	playcount2, listeners2 = update_stats(artist2['_id'])

	if 'session_id' not in flask.session:
		flask.session['session_id'] = urandom(24)

	return flask.render_template('main.html', artist1=artist1, artist2=artist2, playcount1=playcount1, playcount2=playcount2, listeners1=listeners1, listeners2=listeners2)
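
The session id only needs to be an unguessable value that can be stored in the cookie. `os.urandom(24)` returns raw bytes; a hex string carries the same amount of randomness and is plain text. A small sketch, assuming Python 3.6 or later for the `secrets` module, with `new_session_id` as a hypothetical helper name:

```python
import secrets


def new_session_id():
    """Return a random 48-character hex string built from 24 random bytes."""
    return secrets.token_hex(24)


session_id = new_session_id()
print(session_id)
```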

A GET request sent to /top returns a list of the 10 artists with the most votes.

@app.route('/top')
def top():
	top = []
	for post in posts.find().sort('votes', -1).limit(10):
		top.append(post)

	return flask.render_template('top.html', top=top)

if __name__ == '__main__':
	app.run()

Sessions and the object id are used to make fraudulent voting a little more difficult. Any POST request with a valid session id and a correct object id will still be counted. More rigorous anti-fraud measures would require registering and logging in users before they could vote and rate limiting the number of requests from each user. These seemed like overkill for the subject matter and would make the site unnecessarily burdensome for the user.
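
To give a sense of what per-user rate limiting might involve, here is a minimal in-memory sketch. It is not part of the site: `VoteLimiter`, the limits used, and the session id are all invented for illustration. A deployment with several Gunicorn workers would need shared storage instead of a per-process dictionary.

```python
import time
from collections import defaultdict, deque


class VoteLimiter:
    """Allow at most max_votes per window (in seconds) for each session id."""

    def __init__(self, max_votes=10, window=60.0, clock=time.monotonic):
        self.max_votes = max_votes
        self.window = window
        self.clock = clock                  # injectable so the limiter can be tested
        self.history = defaultdict(deque)   # session id -> timestamps of recent votes

    def allow(self, session_id):
        """Return True and record the vote if the session is under its limit."""
        now = self.clock()
        recent = self.history[session_id]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_votes:
            return False
        recent.append(now)
        return True


limiter = VoteLimiter(max_votes=2, window=60.0)
print([limiter.allow('abc') for _ in range(3)])
```

In the vote handler, a check like `limiter.allow(session_id)` would sit just before the database update, with rejected votes falling through to the same redirect.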

templates/main.html and templates/top.html are used by worstband.py to render the pages. The Jinja2 templates fill in the variables passed from worstband.py.

templates/main.html displays the information for each artist and has a voting button which generates the POST with the artist's BSON ObjectId to send to the site.

<!doctype html>
<html lang="en">
	<head>
		<meta charset="utf-8">
		<meta http-equiv="X-UA-Compatible" content="IE=edge">
		<meta name="viewport" content="width=device-width, initial-scale=1">
		<title>Worst Band Ever</title>
		<link href="{{url_for('static', filename='css/bootstrap.min.css')}}" rel="stylesheet">
	</head>
	<body>
		<div class="container">
			<div class="row">
				<div class="col-lg-3">
				</div>
				<div class="col-lg-9">
					<h1>Who Sucks More?</h1>
				</div>
			</div>
			<div class="row">
				<div class="col-lg-6">
					<h2> {{ artist1.name }}</h2>
					<img src="{{ artist1.pic }}" class="img-responsive">
					<h5><a href="{{ artist1.url }}">{{ artist1.url }}</a></h5>
					<h5>{{ playcount1 }} plays by {{ listeners1 }} wieners</h5>
					<form method="post" action="/">
<button value="{{ artist1._id }}" name="button" class="btn btn-default" type="submit">Vote</button></form>
				</div>
			<div class="col-lg-6">
				<h2> {{ artist2.name }}</h2>
				<img src="{{ artist2.pic }}" class="img-responsive">
				<h5><a href="{{ artist2.url }}">{{ artist2.url }}</a></h5>
				<h5>{{ playcount2 }} plays by {{ listeners2 }} wieners</h5>
				<form method="post" action="/">
<button value="{{ artist2._id }}" name="button" class="btn btn-default" type="submit">Vote</button></form>
			</div>
		</div>
		<br>
		<br>
		<div class="row">
			<div class="col-lg-3">
			</div>
			<div class="col-lg-9">
				<h4><a href="top">Current Top Worst Bands</a></h4>
			</div>
		</div>
		<br>
		<br>
		<br>
		<br>
		<br>
		<br>
		<div class="row">
			<div class="col-lg-12">
				<a href="https://twitter.com/worstbandever_" class="twitter-follow-button" data-show-count="true" data-size="small">Follow @worstbandever_</a>
				<br>
				<h6>Contact: worstbandevernet at gmail</h6>
				<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script>
				</div>
			</div>
			<br>
			<br>
			<div class="row">
				<div class="col-lg-12">
					<h6>Powered by: Ubuntu MongoDB Nginx Python Gunicorn Flask</h6>
				</div>
			</div>
		</div>
	</body>
</html>

templates/top.html shows a page of the artists with the most votes. {% for artist in top %} iterates through each of the artists in the top 10 list, building their row in the table.

<!doctype html>
<html lang="en">
	<head>
		<meta charset="utf-8">
		<meta http-equiv="X-UA-Compatible" content="IE=edge">
		<meta name="viewport" content="width=device-width, initial-scale=1">
		<title>Worst Band Ever</title>
		<link href="{{url_for('static', filename='css/bootstrap.min.css')}}" rel="stylesheet">
	</head>

	<body>
	<center>
	<h1>Worst Bands</h1>
		<table class="table table-striped">
			{% for artist in top %}
			<tr>
				<td>
					<center><h4><a href="{{ artist.url }}">{{ artist.name }}</a></h4></center>
				</td>
			</tr>
			{% endfor %}
		</table>
	</center>
	</body>
</html>

The biggest area for improvement would be the page design and the HTML in the templates. Bootstrap was used for theming, but the implementation could be improved, the column layout in particular.

Beyond the aesthetics of the site, more metrics could be implemented than just a simple vote. The database could be expanded to keep track of the votes for each comparison separately. Added metrics would also justify expanded pages for viewing the data and charting the comparisons.
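
As a rough idea of what tracking each comparison separately could look like, the sketch below tallies results per pair in memory. Nothing here exists in the current code: `record_vote`, `head_to_head`, the tuple key, and the band names are all hypothetical. In MongoDB the same data would more likely live in its own collection.

```python
from collections import Counter

# (worse band, other band) -> number of times that outcome was chosen
pair_votes = Counter()


def record_vote(worse, other):
    """Record that `worse` was voted worse than `other` in a head-to-head."""
    pair_votes[(worse, other)] += 1


def head_to_head(band_a, band_b):
    """Return (votes against band_a, votes against band_b) for this pairing."""
    return pair_votes[(band_a, band_b)], pair_votes[(band_b, band_a)]


record_vote('Band A', 'Band B')
record_vote('Band A', 'Band B')
record_vote('Band B', 'Band A')
print(head_to_head('Band A', 'Band B'))
```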