For the past several days I’ve been working on a website for fun. You can check out what I have so far at http://discuss.nathancheek.com. I have never done much web programming beyond simple static websites, so this is my first project of this kind. It was supposed to be a simple discussion website, but it has turned into quite the project for me. My programming language of choice has been PHP, which has been mostly straightforward. However, I have definitely learned a lot in the past couple of days. Here are some of the things I’ve learned so far.

First of all, I went into this project with security in mind. This means using htmlspecialchars() to sanitize inputs, but also keeping best practices in mind when storing passwords. For password hashing, I used the new password_hash() function available as of PHP 5.5. One issue that I ran into today was with special characters. I thought I’d covered every aspect, but in fact I had created a problem. If the user submits a special character, it is converted properly (for example, & becomes &amp;). However, the rest of the site didn’t know how to handle that converted text. I have spent a few hours this evening working to fix this issue. Also, when I used htmlspecialchars(), I thought that would cover all issues with inserting text into SQL queries. That is not the case, since htmlspecialchars() only escapes characters that are special in HTML, but fortunately it was an easy fix with mysqli_escape_string(). I thought that might cause some trouble with the other character problem I’m dealing with, but by escaping within the query functions I’ve created, I haven’t found any issues so far.
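To make the three pieces above concrete, here is a minimal sketch rather than the site's actual code. The helper names escape_for_html() and escape_for_sql() and the sample strings are made up for illustration. It assumes each kind of escaping is applied right where the text is used (HTML escaping when printing, SQL escaping when building a query), which is one way to avoid the converted-text problem described above. mysqli_real_escape_string() is the function that mysqli_escape_string() is an alias of.

```php
<?php
// Hypothetical helper: escape text at the moment it is printed into HTML.
function escape_for_html($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: escape text at the moment it is placed into a query.
// It needs an open mysqli connection, so it is only defined here, not called.
function escape_for_sql(mysqli $db, $text)
{
    return mysqli_real_escape_string($db, $text);
}

// Sample input containing characters that are special in HTML.
$raw = 'Tom & "Jerry" <b>';
$safe_html = escape_for_html($raw);

// Store only the hash; password_verify() checks a login attempt against it.
$hash = password_hash('correct horse', PASSWORD_DEFAULT);
$ok   = password_verify('correct horse', $hash);
$bad  = password_verify('wrong guess', $hash);
```

Here $safe_html holds the HTML-safe version of the text, while $raw stays unchanged for anything else that needs it.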

Second, I have been building in support for the various problem scenarios I can think of. Examples include creating a discussion whose name already exists, trying to view a discussion that doesn’t exist, etc. These were easy enough to deal with, and of course this would be a horrible website if it couldn’t handle such cases. However, there were other issues I had to design for. For example, the way I am designing the site requires each discussion to have an ID. Instead of giving it a simple incrementing numerical ID, which would allow an attacker to modify their request and access things they shouldn’t, I opted to implement a pseudorandom ID generator. I created a function for this and also included a check to make sure the ID isn’t already in use (even though a collision is extremely unlikely with a 10-character string). The problem was that when it checked the database, it never seemed to return anything, even when I coded it to request an ID already in use. I spent numerous hours over a couple of days trying to figure out why. When I finally figured out what I had done (or rather not done), I felt very stupid. The issue was caused by two things I’d left out of the function:

  1. I did not have an include statement to include the database connection information.
  2. I did not have any way to pass a variable to the function.

I had failed to realize something very simple about functions: they are separate from the rest of the code running on the page. The variables set in the main PHP code aren’t normally visible inside a function, so they must be passed to it. Currently I have each function include the database connection PHP file, but I suppose I could also pass the connection as an argument.
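Here is a rough sketch of the idea, not my actual function. The names random_id(), new_discussion_id() and $existing_ids are made up for illustration, and a plain array stands in for the database lookup so the example runs on its own. It also assumes PHP 7 or later, because it uses random_int(). The point is that the list of taken IDs is handed to the function as an argument instead of being expected to be visible inside it.

```php
<?php
// Hypothetical stand-in for the database: the IDs that are already taken.
// On the real site this would be a query made through the database connection.
$existing_ids = array('aaaaaaaaaa', 'bbbbbbbbbb');

// Build a random ID from lowercase letters and digits.
// random_int() requires PHP 7 or later.
function random_id($length = 10)
{
    $alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
    $id = '';
    for ($i = 0; $i < $length; $i++) {
        $id .= $alphabet[random_int(0, strlen($alphabet) - 1)];
    }
    return $id;
}

// $existing_ids from the main code is NOT visible in here,
// so the list of taken IDs is passed in as the $taken argument.
function new_discussion_id(array $taken, $length = 10)
{
    do {
        $id = random_id($length);
    } while (in_array($id, $taken, true)); // retry on the rare collision
    return $id;
}

$id = new_discussion_id($existing_ids);
```

Passing a database connection to the function would work the same way as passing the array does here.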

The next thing on my list is to get cookies working correctly. They were working fine this afternoon, but the issue with special characters has given me a little more to work through. This project is far from over, so I plan to post updates on my adventure. I’ll probably look back at this code in a few years and wonder how in the world I could write so messily.