Kenneth Friedman
Can we design a website to last 15 years?

In the Computer Science department at MIT, many classes have their own custom-built website. Each class has its own quirks and intricacies that necessitate a special website with particular features.

One of these classes is 6.UAT, the MIT class on technical public speaking. I have been a teaching assistant for this class four times in the past five semesters. MIT classes usually have a number to represent the classes: 6.002, 6.034, etc. But some classes have letters as well. Strangely, the 'UAT' name is either a random three letter string, or an acronym that has been lost to the mists of time. Even Anne Hunter doesn't know why it's called "UAT." At MIT EECS, if Anne Hunter doesn't know, you can be sure no one else does either. It's a class in teaching computer scientists how to present their work to a range of audiences with different levels of experience. And, like those many other courses in MIT EECS, 6.UAT has its own custom website.

But the current 6.UAT website is unique in one way: it was made 15 years ago. In the sea of ever changing frameworks and full website redesigns, the 6.UAT website has been chugging along with only minor tweaks each semester since 2004.

But it's finally time for a change. This semester, I re-wrote the website from scratch. While designing the new site, I had one key goal in mind: can we design a website that will last another 15 years without needing a full re-write?

Class management software is surprisingly complex. The site needs student logins, staff logins, attendance management, grades management, and assignment uploading, just to name a few.

Can we build a website that will last until 2034?

In the post below, I will explain the major design decisions for the website, with a focus on longevity and long-term maintainability.

Note: the absurdity of thinking hard about writing code to last a mere 15 years is not lost on me. The ancient Roman engineers were able to build bridges that still exist, thousands of years later. But the work of software "engineers" can barely last a decade. It's not an impossible task, and luckily there are some good ideas about how to fix it.

The Stack

The first and most important decision in designing a website is the "tech stack," the set of preexisting software, tools, and hardware needed to run and maintain the site.

I took inspiration from Maciej Ceglowski — of Pinboard fame — who describes his tech stack:

There is absolutely nothing interesting about the Pinboard architecture or implementation; I consider that a feature!

Programming Language

To maintain the website for the next 15 years, future teaching assistants (TAs) need to be able to easily edit the site. The primary role of the TAs in this class is to help run recitations — not to code, so we can't assume future TAs will have website development experience.

The best chance of future TAs being able to modify the site is to use the most common programming language. The beginner programming courses at MIT currently teach Python. So to maximize the chances that future TAs will be able to edit the site, we wrote the back end in Python.

Python 2 will not be maintained past 2020, so of course we wrote it in Python 3.

The two most common back end Python frameworks appear to be Django and Flask. Because Flask is 'lighter' (in the sense that it's easier to get started with and easy to understand), we chose Flask.

Database

For a database, we decided to use SQLite. SQL-style databases are very common, and the 6.UAT site won't need anything more complicated (if the site is written well!). We chose SQLite as the flavor of SQL because it is very easy to set up, Python has built-in support for it, and Flask does not care about database choices. As a particularly nice touch, the SQLite developers have a note about longevity on their About page:

The future is always hard to predict, but the intent of the developers is to support SQLite through the year 2050

If SQLite is designed and motivated to make it to 2050, it should definitely get us through 2034.

Hardware / Servers

On the topic of hardware and the physical server where the website will live, our options were very limited. The finance-minded parts of the MIT administration prefer one-time purchases over recurring costs. So when it came to deciding between cloud storage and our own physical server... our decision was made for us.

I'm still not sure if there are strong arguments either way. What is more likely to make it to 2034: any individual cloud service provider, or a reliable, SSD-based Unix box?

But Will It Scale?

That appears to be the programmer's favorite question. To consider whether the technical resources will work, we have to consider the constraints of the number of people who might be using the site.

Because 6.UAT is highly interactive, each recitation section is capped at 8 students and is staffed by a TA and a faculty member. So there are obvious restrictions on the student count based directly on the number of TAs. But we can relax that restriction to consider a generous upper bound on the number of users. There are about 1,100 MIT undergrads per class year. Even assuming every undergrad majors in computer science and all four class years take the class in the same year, that's about 4,400 students, still fewer than 5,000. With some back-of-the-envelope math, it's very clear that Flask and SQLite could handle a user base of 5,000 users.

Looking at more realistic numbers, there were about 200 students that took the class last semester. With a mere 200 students, scaling is a non-issue for 2019 hardware.
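The back-of-the-envelope numbers can be checked in a few lines of Python. The enrollment figures come from the text; the requests-per-student figure is purely an assumption for illustration, not something measured on the site.

```python
# Figures taken from the text.
STUDENTS_PER_CLASS_YEAR = 1_100   # approximate MIT undergrads per class year
CLASS_YEARS = 4                   # four undergraduate class years
REALISTIC_ENROLLMENT = 200        # students who took the class last semester

# Assumption for illustration only (not measured on the real site).
REQUESTS_PER_STUDENT_PER_DAY = 50
SECONDS_PER_DAY = 24 * 60 * 60


def worst_case_enrollment() -> int:
    """Every undergrad on campus takes the class in the same year."""
    return STUDENTS_PER_CLASS_YEAR * CLASS_YEARS


def average_requests_per_second(students: int) -> float:
    """Average load if each student makes the assumed number of requests."""
    return students * REQUESTS_PER_STUDENT_PER_DAY / SECONDS_PER_DAY


worst = worst_case_enrollment()
print("worst-case students:", worst)
print("worst-case avg req/s:", round(average_requests_per_second(worst), 2))
print("realistic avg req/s:", round(average_requests_per_second(REALISTIC_ENROLLMENT), 2))
```

Even under the absurd worst case, the average load works out to a few requests per second under these assumptions.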

Front End

The front end is very simple to describe: no dependencies!

Instead of having to teach future TAs a front end web framework, we simply use raw HTML and CSS.

During the first iteration, using raw HTML/CSS is somewhat frustrating. It would have been much faster on the first pass to make the front end look nice with a modern front end framework. But this will save time in the future: TAs will not need to learn a new framework, they won't have to deal with odd bugs in the frameworks, and, most importantly, there is no chance that the chosen framework becomes outdated.

Note: There is a secondary effect here: not using a front end framework forced us to think of the simplest possible way to represent any given page. There is no room for complexity-creep because it would be too hard to implement.

The Syllabus Page

Here's a snapshot of the Syllabus page — I think it holds up, no frameworks needed.

A missing word from the description of the front end so far: Javascript (JS). There is almost no Javascript on the site. A few pages, such as the MiniCMS described below, require JS. Most pages have no Javascript at all. The HTML loads with pre-populated data from the Flask server, and users make changes (such as a TA changing attendance) purely through HTML buttons that send POST requests back to the Flask server.
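To make the "HTML buttons that send POST requests" pattern concrete, here is a minimal sketch of how such a page could work in Flask. It is not the actual 6.UAT code: the route, the form field names, and the in-memory ATTENDANCE dictionary are all hypothetical stand-ins, and the staff login check is left out for brevity.

```python
from flask import Flask, redirect, render_template_string, request

app = Flask(__name__)

# Hypothetical in-memory stand-in for the attendance data in the database.
ATTENDANCE = {("jsmith", "L1"): 0}

# Plain HTML: each row is a form whose button POSTs the toggled status.
PAGE = """
<h1>Attendance</h1>
{% for student, lecture, status in rows %}
  <form method="post" action="/attendance/">
    {{ student }} / {{ lecture }}: {{ "present" if status else "absent" }}
    <input type="hidden" name="student" value="{{ student }}">
    <input type="hidden" name="lecture" value="{{ lecture }}">
    <input type="hidden" name="status" value="{{ 0 if status else 1 }}">
    <button type="submit">Toggle</button>
  </form>
{% endfor %}
"""


@app.route("/attendance/", methods=["GET"])
def show_attendance():
    # The page is rendered with the data already in the HTML; no JS needed.
    rows = [(s, lec, status) for (s, lec), status in sorted(ATTENDANCE.items())]
    return render_template_string(PAGE, rows=rows)


@app.route("/attendance/", methods=["POST"])
def update_attendance():
    key = (request.form["student"], request.form["lecture"])
    if key not in ATTENDANCE:
        return "Unknown attendance record", 404
    ATTENDANCE[key] = 1 if request.form["status"] == "1" else 0
    # Redirect back so the browser reloads the page with the new data.
    return redirect("/attendance/")
```

The browser does all the work: the button submits the form, Flask updates the record, and the redirect reloads the page with fresh data.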

The primary reason for not using Javascript is, once again, simplicity. It would be very easy to quickly add tons of unnecessarily complicated code to have fancy features. But the vast majority of the time, all you need is a page that loads with data already in the HTML, and HTML buttons and text fields to make modifications to send back to the Flask server. A secondary reason is that we are at MIT, and if anyone is going to follow the GNU/FSF/Stallman philosophy, it will have to start here.

Note: If you haven't noticed, this very site you are reading also does not require JavaScript, although you get some cool interactivity if you are okay with JS.

A Mini CMS

Many sites these days use some sort of Content Management System (CMS), which lets "admin" users control and modify pages without having to SSH into the actual server to modify files.

The old 6.UAT site had no CMS. It was written in PHP, and every change to the site required a Terminal window.

In the new site, there is a very simple CMS for dealing with static pages.

Pages that load user data, such as the attendance and grades pages, don't use this system. Major changes to those pages still require modifying the Flask server. However, there are many pages that are just static text (the course Syllabus, etc). These pages should be editable by the staff, but static to the users.

To achieve this, static information pages are not saved as HTML templates (as most Flask pages would be), but are instead stored as a string of text in the SQLite database. When a student loads the page, Flask simply loads the string of HTML content. If a staff member loads the page, the same string of HTML is loaded from the SQLite database, but a small "Edit" button is also present at the top. If the staff member clicks Edit, they get a very minimal editor: simply an editable text box of the page's contents with a Save button. If a staff member changes the content in the text box and hits Save, a POST request is sent to the Flask server, and the Flask server replaces the SQLite data for the page with the newly modified content.
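The storage side of this idea fits in a few lines using Python's built-in sqlite3 module. This is only a sketch under assumptions: the table name, column names, and function names below are hypothetical, not the site's actual schema, and the staff check is reduced to a boolean argument.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and make sure the (hypothetical) pages table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " name TEXT PRIMARY KEY,"
        " body TEXT NOT NULL)"
    )
    return conn


def get_page_body(conn, name):
    """Return the stored HTML string for a page, or None if it is missing."""
    row = conn.execute(
        "SELECT body FROM pages WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


def save_page_body(conn, name, body, is_staff):
    """Replace a page's HTML; this is what the Save button's POST would call."""
    if not is_staff:
        raise PermissionError("only staff may edit pages")
    conn.execute(
        "INSERT OR REPLACE INTO pages (name, body) VALUES (?, ?)",
        (name, body),
    )
    conn.commit()
```

For example, after save_page_body(conn, "syllabus", "<h1>Syllabus</h1>", is_staff=True), a later get_page_body(conn, "syllabus") returns the same string, ready to be dropped into the page.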

Compared to bulky CMSes you may have used in the past, this method is shockingly snappy. In fact, I actually had to go into the database and double check it was working the first time, because it loaded so quickly that I didn't believe it actually worked.

Here's a demo of that system:

Back End

I've already established that the back end is served through Flask, written in Python 3 with an SQLite database... but how does the back end actually work?

The main idea on the back end is to keep things flexible and loosely coupled. Here are three examples of those motivations: how the data is represented, how files are separated, and how users are organized in the database.

Data Representation

One of the two main tables of data needed is the attendance table. There are a bunch of students and a bunch of different lectures and recitations (L1, L2, L3, etc). So at first I thought the students should be the rows of the table, and the lectures should be the columns. The piece of data at a given row and column is a 0 or 1 for whether that student attended the class.

But what would happen if a lecture had to be added mid-semester, or a snow-day canceled one of the classes? Careful database work would have been necessary to add or remove columns. Instead, our database schema has only three columns: the student, the name of the lecture, and the status (0 or 1 for whether or not the given student attended the given lecture).

Then, the Flask server requests all of the attendance records for a given student, and displays the results based on a Python list of classes it knows about. If a class needs to be added or removed, the database does not need to be touched. A staff member just has to change the Python list to add or remove a class. If the class already took place (and therefore already has data), but now the class doesn't need to be listed for attendance, the class string is just removed from the list. All of the data can stay in the database; the site simply won't display it.
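A small sketch of this three-column layout with Python's built-in sqlite3 module shows why the schema never has to change. The table name, function names, the composite primary key, and the choice to show a missing record as 0 are my assumptions for illustration, not the site's actual code.

```python
import sqlite3

# The classes the site displays. Edit this list, not the database schema.
CLASSES = ["L1", "L2", "R1"]


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attendance ("
        " student TEXT NOT NULL,"
        " lecture TEXT NOT NULL,"
        " status INTEGER NOT NULL CHECK (status IN (0, 1)),"
        " PRIMARY KEY (student, lecture))"  # assumption: one row per pair
    )
    return conn


def record_attendance(conn, student, lecture, status):
    """Insert or overwrite one (student, lecture, status) row."""
    conn.execute(
        "INSERT OR REPLACE INTO attendance (student, lecture, status)"
        " VALUES (?, ?, ?)",
        (student, lecture, status),
    )
    conn.commit()


def attendance_for(conn, student, classes):
    """Return (lecture, status) pairs, but only for the listed classes."""
    rows = conn.execute(
        "SELECT lecture, status FROM attendance WHERE student = ?",
        (student,),
    ).fetchall()
    recorded = dict(rows)
    # Assumption: a class with no record yet is displayed as 0 (absent).
    return [(name, recorded.get(name, 0)) for name in classes]
```

Removing "L2" from the list hides that lecture from the page while its rows stay untouched in the table, and adding a new class name requires no schema change at all.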

Separation of Concerns

There are only two files that run the whole site: one for reading in the URL and returning the HTML files, and another for communicating with the database.

The first file is the main Flask server. Flask makes everything very easy: when someone loads the site, Flask finds the function to run through a Python function decorator. For example, if someone loads thewebsite.mit.edu/syllabus/, it finds the following code:

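@app.route("/syllabus/")
def load_syllabus():
    # The route has no URL variable, so the page name is fixed here.
    page = "syllabus"
    # Make sure the page exists in the database before loading it.
    dynamic_page_exists = db.does_dynamic_page_exist(page)
    if not dynamic_page_exists:
        return return_message("6.UAT: Page does not exist")
    # Pull the stored HTML and check whether to show the "edit" button.
    body = db.get_dynamic_page_body(page)
    is_staff = is_current_user_staff()
    return render_template_with_nav('dynamic_page.html', page_name=page, body=body, is_staff=is_staff)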

If you take this code one line at a time, it's straightforward to follow. Just note that the "miniCMS" on this site is called a Dynamic Page. So first, it checks to make sure the page exists in the database (DB). If it doesn't exist, it returns that message to the user. If it does exist, then it pulls the content from the DB. Next, it checks whether or not the current user is a staff member (which will determine if the "edit" button is available on the page). Finally, it returns the HTML back to the user with the loaded content.

Student Users

Finally, I made some considerations on how students are stored on the site. The username is based on the student's MIT email, and the password can be chosen by the user and changed at any time. But internally, the unique identifier of each user is actually their username appended to the current semester and year.

Note: Of course, the passwords are not saved in the database as plain text. They are salted and hashed automatically for us by the "generate_password_hash" function from Werkzeug, the library Flask is built on.

For example, the username of John Smith might be jsmith, but the internal representation is 2019spring_jsmith. This ensures that there will be a way to distinguish him from a future student who has the same email (since MIT emails are recycled after a student graduates).
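Here is a minimal sketch of both ideas together: the semester-prefixed identifier and the salted password hash. The helper names and the record layout are hypothetical; generate_password_hash and check_password_hash are real functions in werkzeug.security, which is installed alongside Flask.

```python
from werkzeug.security import check_password_hash, generate_password_hash


def internal_user_id(year: int, semester: str, username: str) -> str:
    """Build the internal identifier, e.g. 2019spring_jsmith."""
    return f"{year}{semester}_{username}"


def new_user_record(year: int, semester: str, username: str, password: str) -> dict:
    """Hypothetical record layout: only the salted hash is stored."""
    return {
        "id": internal_user_id(year, semester, username),
        "username": username,
        "password_hash": generate_password_hash(password),
    }


def password_matches(record: dict, attempt: str) -> bool:
    """Check a login attempt against the stored hash."""
    return check_password_hash(record["password_hash"], attempt)
```

A jsmith who enrolls in a later semester gets a different internal identifier, so the two students' records never collide even though the username is the same.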

Closing Thoughts

I think this website will last until 2034 without the need for a full rewrite. Unless there are dramatic changes to the class that render it unrecognizable from its current state, the site should only need minor tweaks each semester to adjust to new assignments, requirements, and structure.

I think most of the longevity will come from two core decisions: (1) not using a front end web framework and (2) using the simplest possible back end framework. But the other considerations (the database, the loose coupling of the back end, and the data structures used) will also help.