Sunday, February 06, 2011

Magic Quotes and a Lesson in Security

PHP is a kludgy language. It is common for a PHP page to take raw data from web input, translate the data into an SQL string, and execute the SQL query. A programmer might have a page with the URL "page.php?id=5". The program page.php might have the code:

mysql_query('SELECT * FROM PageData WHERE id = '.$id);

$id contains the value from the URL string. If the id is 5, PHP sends the SQL command "SELECT * FROM PageData WHERE id = 5" to the database. Since one is sending straight text to MySQL, hackers have learned that they can often inject SQL code into PHP strings. For example, one might call page.php with the URL "page.php?id=5; SELECT password FROM SecurityTable WHERE user=1".

The variable $id now contains the string "5; SELECT password FROM SecurityTable WHERE user=1". The text handed to mysql_query() now holds two SQL commands. mysql_query() itself refuses to run more than one statement per call, but other database interfaces will run both, and an attacker can get a similar result from a single statement with a UNION or an OR clause.
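To make the concatenation concrete, here is a minimal sketch that builds the query text without touching a database. The helper name build_page_query() is hypothetical and only stands in for the string that page.php hands to mysql_query().

```php
<?php
// Hypothetical helper: builds the same SQL text that page.php passes to mysql_query().
function build_page_query($id)
{
    // The value is pasted into the SQL text with no checks at all.
    return 'SELECT * FROM PageData WHERE id = ' . $id;
}

// A normal request: page.php?id=5
echo build_page_query('5'), "\n";

// A hostile request: the attacker appends a second statement after a semicolon.
echo build_page_query('5; SELECT password FROM SecurityTable WHERE user=1'), "\n";
```

The second line of output contains both statements, which is exactly what the database interface receives.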

When looking at the logs for a web site, you will occasionally find hackers sending requests that try to inject SQL strings into PHP pages. Hackers will even run scripts that systematically attack every variable sent to a page to see if it was properly validated. This type of attack is called SQL injection.

When the value sits inside a quoted SQL string, the key to the injection is the single quote character "'", which closes the string early so that the rest of the input is read as SQL.

To prevent SQL injection, programmers must validate all input from the web. For a string, that means changing the quote character into something that won't be wrongly interpreted by SQL. PHP provides a function called addslashes(), which escapes quotes with a backslash. addslashes() translates "'" into "\'", which the database correctly interprets as a literal quote and not as the end of a string in a SQL statement.
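A short sketch of what addslashes() does to a value before it is placed in a query. The Users table and the query text are invented for this example.

```php
<?php
// addslashes() puts a backslash in front of each single quote, double quote,
// backslash, and NUL byte.
$name = "O'Brien";
$escaped = addslashes($name);

echo $escaped, "\n"; // prints O\'Brien

// Hypothetical query text; the Users table is an assumption for illustration.
$sql = "SELECT * FROM Users WHERE name = '" . $escaped . "'";
echo $sql, "\n";
```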

Early PHP manuals instructed programmers to run addslashes() on all data from the web.

But even one sloppy piece of code in a large program created a security hole.

Tired of being hacked, web hosts embraced a clever idea called "Magic Quotes", a PHP configuration setting (magic_quotes_gpc).

Magic Quotes automatically runs addslashes() on all data coming into the web site. There were two big mistakes in this decision.

The first was that addslashes() also escapes backslashes. If you add slashes to the same data twice, you start getting extra slashes. The call addslashes("'") returns "\'". addslashes(addslashes("'")) returns "\\\'". Applying the function three times returns "\\\\\\\'".
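The growth is easy to check with a small loop. This is only a sketch; the variable names are arbitrary.

```php
<?php
// Apply addslashes() repeatedly to a single quote and count the backslashes.
$value = "'";
for ($pass = 1; $pass <= 3; $pass++) {
    $value = addslashes($value);
    // substr_count() counts the backslash characters in the current value.
    echo "pass $pass: $value (", substr_count($value, '\\'), " backslashes)\n";
}
```

The counts come out as 1, 3, and 7: each pass doubles the existing backslashes and adds one more for the quote.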

This is actually an interesting application of the reflective paradox.

All of the code written to PHP best practices prior to Magic Quotes started returning multiple slashes. Programmers had to convert their code to addslashes(stripslashes(data)).
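A sketch of that workaround, wrapped in a hypothetical helper named escape_once(). It gives the same result whether or not the data was already escaped once, but it has a cost: a backslash that was part of the real data gets stripped.

```php
<?php
// Hypothetical helper: strip any existing escaping, then escape exactly once.
function escape_once($data)
{
    return addslashes(stripslashes($data));
}

echo escape_once("O'Brien"), "\n";    // raw input            -> O\'Brien
echo escape_once("O\\'Brien"), "\n";  // already escaped once -> O\'Brien

// Caveat: a backslash that belonged to the original data is lost.
echo escape_once('C:\temp'), "\n";    // -> C:temp
```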

The second major flaw of Magic Quotes is that people around the world want to write programs in their native language using their native character sets.

Different character sets around the world encode text differently. addslashes() and Magic Quotes work byte by byte and assume everything is written in a Latin-based character set, so in other encodings they can escape the wrong bytes or leave a quote effectively unescaped.
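One commonly cited illustration involves the GBK character set, where the byte 0x5C (the backslash) can be the second half of a two-byte character. The sketch below only shows the bytes that addslashes() produces; how a particular database reads them depends on the connection's character set, which is an assumption here.

```php
<?php
// Two raw bytes as they might arrive from a form:
// 0xBF followed by a single quote (0x27).
$input = "\xbf\x27";

// addslashes() inserts a backslash (0x5C) in front of the quote byte.
$escaped = addslashes($input);

echo bin2hex($escaped), "\n"; // prints bf5c27

// Read as GBK, the bytes BF 5C form one two-byte character, so the
// quote byte (27) that follows is no longer preceded by a backslash.
```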

One does not know what characters need to be escaped until after establishing the connection to the database. Each database has its own preferred mechanism for validating data. MySQL offers an awkward function called mysql_real_escape_string().

Magic Quotes turned out to be a fiasco. Programmers who've written their own validation procedures want the feature off. A small number of programs have grown dependent on it.

Now here is the interesting security problem: any code written with the assumption that Magic Quotes is running becomes a security hole when Magic Quotes is off.
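The following sketch shows the problem with one hypothetical function, build_user_query(), and an invented Users table. The addslashes() call in the first case stands in for what Magic Quotes would have done before the page code ran.

```php
<?php
// Hypothetical page code written on the assumption that Magic Quotes has
// already escaped $name. The Users table is invented for this example.
function build_user_query($name)
{
    return "SELECT * FROM Users WHERE name = '" . $name . "'";
}

$attack = "x' OR '1'='1";

// Magic Quotes on: PHP would have delivered the value already escaped.
echo build_user_query(addslashes($attack)), "\n";

// Magic Quotes off: the same code receives the raw value.
echo build_user_query($attack), "\n";
```

In the first query the whole input stays inside the quoted string. In the second, the quote in the input closes the string and the OR clause becomes part of the SQL.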

So, Magic Quotes created an interesting example in which a well-intentioned security mechanism made programs less secure.
