1 of 18

Medium Rare

A Scalable Publishing Platform

Chris Giuffrida & Thomas Krill

2 of 18

Project Idea

3 of 18

Idea

  • Build a lightweight version of the popular self-publishing site Medium
  • Benchmark the performance of an unscaled version of the site
  • Use cloud technology, namely AWS, to scale the site to 100x the initial benchmark

4 of 18

What is Medium Rare?

5 of 18

Functionality Overview


6 of 18

Architecture

7 of 18

Unscaled Architecture

8 of 18

Flask Application Structure

  • routes.py
    • Based on URL, sends user to correct page generated from appropriate template
  • database.py
    • Abstracts away the interaction with the MongoDB database; will later use DynamoDB
  • forms.py
    • Assists in collecting information from forms and manages form verification
  • models.py
    • Defines the “User” class for controlling access across the application
  • /templates
    • Contains webpage templates that are filled in with information from the database
  • /static
    • Stores resources needed by the website, e.g., JS scripts, CSS files
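
The role of database.py can be illustrated with a minimal sketch, assuming the routes only ever call a small set of methods. The method names get_article and increase_article_views come from the routing function in this deck; the class names ArticleStore and InMemoryArticleStore, and the add_article helper, are hypothetical and stand in for the real MongoDB (and later DynamoDB) backends.

```python
from abc import ABC, abstractmethod
from typing import Optional


class ArticleStore(ABC):
    """Interface the routes depend on (hypothetical name)."""

    @abstractmethod
    def get_article(self, article_id: str) -> Optional[dict]:
        ...

    @abstractmethod
    def increase_article_views(self, article_id: str, amount: int) -> bool:
        ...


class InMemoryArticleStore(ArticleStore):
    """Illustrative backend; a MongoDB or DynamoDB class would
    implement the same two methods against a real database."""

    def __init__(self):
        self._articles = {}

    def add_article(self, article_id, article):
        # Hypothetical helper used only to seed the example.
        self._articles[article_id] = dict(article, reads=article.get("reads", 0))

    def get_article(self, article_id):
        article = self._articles.get(article_id)
        return dict(article) if article else None

    def increase_article_views(self, article_id, amount):
        if article_id not in self._articles:
            return False
        self._articles[article_id]["reads"] += amount
        return True


db = InMemoryArticleStore()
db.add_article("a1", {"title": "Article Title", "text": "Article Text"})
db.increase_article_views("a1", 1)
print(db.get_article("a1")["reads"])
```

Because the routes would only see the ArticleStore interface, swapping the backend should not require changes to routes.py under this design.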

9 of 18

Example Routing Function

from flask import redirect, render_template, session, url_for

# app, db, and login_required are defined elsewhere in the application.
@app.route('/article/<article_id>', methods=['POST', 'GET'])
@login_required
def blog_post(article_id):
    try:
        article = db.get_article(article_id)
        suggested_articles = db.get_suggested_articles(article_id)
        result = db.increase_article_views(article_id, 1)
    except Exception as ex:
        print(ex)
    else:
        if article:
            # Format the stored datetime for display.
            article["date"] = "{:%A, %B %d, %Y, %I:%M %p}".format(article["date"])
            # Blank lines in the submitted text separate paragraphs.
            paragraphs = article["text"].split("\r\n\r\n")
            article["paragraphs"] = paragraphs
            session["current_article"] = article_id
            return render_template('blog-post.html', article=article, suggested_articles=suggested_articles)
    # On a database error or a missing article, fall back to the index page.
    return redirect(url_for("index", page="1"))
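
The two transformations applied to the article before rendering can be tried in isolation. The sketch below is only an illustration: prepare_article is a hypothetical helper and the sample article is made up, but the format string and the paragraph separator are the ones used in the routing function.

```python
from datetime import datetime


def prepare_article(article):
    """Return a copy of the article with display-ready fields."""
    prepared = dict(article)
    # datetime supports strftime-style codes inside str.format().
    prepared["date"] = "{:%A, %B %d, %Y, %I:%M %p}".format(article["date"])
    # Form submissions use \r\n line endings, so a blank line is \r\n\r\n.
    prepared["paragraphs"] = article["text"].split("\r\n\r\n")
    return prepared


sample = {
    "date": datetime(2017, 4, 3, 14, 5),
    "text": "First paragraph.\r\n\r\nSecond paragraph.",
}
view = prepare_article(sample)
print(view["date"])
print(view["paragraphs"])
```

Day and month names depend on the process locale; with Python's default locale the date prints in English.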

10 of 18

MongoDB Article and User Collection Structure

Article:

{
    "_id": ObjectId("an-auto-generated-hash"),
    "title": "Article Title",
    "text": "Article Text",
    "path": "/static/images/image-name.jpg",
    "date": ISODate("YYYY-MM-DDTHH:MM:SS.XXXX"),
    "reads": X,
    "upvotes": X,
    "downvotes": X,
    "voters": {"Voter Username": true},
    "author": "Author Username"
}

User:

{
    "_id": "Username",
    "username": "Username",
    "email": "username@example.com",
    "password": "pbkdf2:sha256:a-secure-hash"
}
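
The "voters" map presumably exists so that each user can vote on an article only once. A minimal sketch of how such a vote could be expressed as a MongoDB update follows, assuming that interpretation. The function build_vote_update is hypothetical; `$inc`, `$set`, and `$exists` are standard MongoDB operators, and the resulting pair could be passed to a call such as pymongo's update_one.

```python
def build_vote_update(article_id, username, upvote):
    """Build a (query, update) pair that records one vote per user.

    Assumes usernames contain no dots, since MongoDB treats a dot
    in a field path as nesting.
    """
    counter = "upvotes" if upvote else "downvotes"
    voter_field = "voters." + username
    # Match the article only if this user has not voted yet.
    query = {"_id": article_id, voter_field: {"$exists": False}}
    # Bump the counter and remember the voter in one atomic update.
    update = {"$inc": {counter: 1}, "$set": {voter_field: True}}
    return query, update


query, update = build_vote_update("an-article-id", "alice", upvote=True)
print(query)
print(update)
```

Putting the "has not voted" check in the query rather than in application code means a repeated vote simply matches no document, which avoids a read-then-write race between concurrent requests.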

11 of 18

Scaled Architecture

12 of 18

Benchmarking

13 of 18

Challenge: Picking a Testing Framework

CasperJS
  • Needs a Node.js environment to execute
  • Has many dependencies
  • Not well suited to running on a pool of machines (Condor)

PhantomJS
  • Can run on Condor by sending over the PhantomJS executable
  • Difficult to write sequential test programs with asynchronous JavaScript

Selenium
  • Can be compiled into a single JAR and run on any Condor machine with Java
  • Intuitive for writing test programs that require sequential page navigation

14 of 18

Selenium Tests

  • Increase the number of concurrent users until performance dips
  • Each concurrent user starts executing the Selenium program at the same time
  • Each concurrent user runs for the same amount of time, currently 10 s
  • Metrics
    • Throughput (requests / second) = requestsCompleted / timeTaken
    • Latency (seconds / request) = (timeTaken * numUsers) / requestsCompleted
  • Orchestrated using Condor and a Python Work Queue script
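
The two metrics above can be worked through with a small sketch. The function names and the sample numbers (10 users, 10 s, 250 completed requests) are made up for illustration and are not measured results.

```python
def throughput(requests_completed, time_taken):
    """Requests per second across all concurrent users."""
    return requests_completed / time_taken


def latency(requests_completed, time_taken, num_users):
    """Average seconds per request as seen by a single user.

    Each user is busy for time_taken seconds, so the total user-time
    is time_taken * num_users, spread over all completed requests.
    """
    return (time_taken * num_users) / requests_completed


# Hypothetical run: 10 users, 10 seconds each, 250 requests in total.
num_users, time_taken, requests_completed = 10, 10.0, 250
tput = throughput(requests_completed, time_taken)
lat = latency(requests_completed, time_taken, num_users)
print(tput, lat)
```

Note that the latency formula is equivalent to numUsers / throughput, so for a fixed number of users the two metrics move in opposite directions.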

15 of 18

Initial Results

16 of 18

What’s Left?

17 of 18

Work Left to be Completed

  • Finish benchmarking the application with the custom test suite
  • Create Selenium programs that simulate a larger set of application workflows
  • Implement the scaled version of the application

18 of 18

Thank you!