Databases: Why should I use prepared statements?

Started by
9 comments, last by Flyverse 8 years, 8 months ago

I know that it deflects SQL Injections. But why use prepared statements, if I can just verify user input myself, before putting it in the string-query?

Pseudo-Code:


//Prone to SQL Injections
String userInput = ...;
String query = "SQL BLABLA" + userInput + "OTHER SQL BLABLA";

//NOT prone to SQL Injections, in my opinion
String userInput = verifyUserInput(...);
String query = "SQL BLABLA" + userInput + "OTHER SQL BLABLA";
...
function verifyUserInput(String userInput){
    if userInput is weird: return default input or something
}

I mean, most of the time you need to verify the input anyway.

Advertisement

Because users can get creative in ways you might have not imagined; they might be able to circumvent your verifyUserInput method in a way that you did not expect. Use parameters in queries instead, which accept data for the types they really are irrespective of its content.

I.e.


SELECT * FROM MyTable WHERE SomeField=@MyParameter

And then you define the type and value of @MyParameter -- the specific implementation details depend on the programming language, API and SQL language you're using.

Are you SURE you can validate it yourself? Are you absolutely certain that you are up-to-date on all the latest attacks? Do you want to implement those tests everywhere?

Of course, validation is only one reason.

Another reason is performance. Even if you are transmitting the query each time rather than storing it on the database, most database systems will cache them and preserve their optimized, precompiled version of the request.

Another reason is batching, many systems or some types of queries you can batch many queries with a single command.

Robert'); drop table students; --

Robert'); drop table students; --


https://xkcd.com/327/


function verifyUserInput(String userInput){
if userInput is weird: return default input or something
}

This was touched on already by frob in a way, but if you are validating the input and you know it is weird, do you know it is weird for all database types your program might ever query?

Some databases escape a ' symbol with a backslash, others use double quotes, and yet others do other strange things. Then, there are things to account for like unicode support (if the database supports it, can it be used to break out of the query in some odd unexpected way?) support for newlines in strings, or lack of, etc etc.

Never ever validate strings yourself (this goes for any form of user input!) if there is a pre-built function to do it, as the person who wrote the pre-built function probably thought of more edge cases than you, and you're probably just wasting your time to reinvent a crooked wheel...

Need I remind you that SQL injections stem from the fact that you're treating user input as SQL. "Validating" the user input the way you suggest amounts to treating it as SQL and then checking that it's not SQL. Uh, okay? Why not just, you know, not treat it as SQL but as a parameter to a query (i.e. prepared statements)? The user input still needs to be validated against the parameter's type (string, boolean, etc) and also according to your usual business rules, of course, since user input always starts off untyped as a raw string, but that's no excuse for doing not only useless but illogical and dangerous "SQL validation" on user input that should never be treated as SQL in the first place.

There is no reason to validate, it does not even make sense from a typing perspective. The user input is not just another random chunk of the query, it is a parameter that is referenced by the query. Sure, you can validate all you want, and it might eventually work (actually the typing rules are quite complex owing to SQL's odd grammar) but it's pointless because you don't need to. All you need to do is treat the user's input like a parameter and not like a part of the raw query, and that's what prepared statements do. Same result, but vastly simpler (and more logical) solution: you simply say "THIS is the query, THOSE are the parameters, and you plug them HERE, HERE and THERE" and suddenly the problem is gone. You didn't even need a validation function! (and neither does the database, by the way; databases don't magically consume SQL statements, they parse it and then do computation based on the action, parameters, conditions, etc.. involved).

This is an obvious typing problem, and the reason it's so widespread is because of the untyped, string-based nature of raw SQL, which is rather unfortunate (if convenient). Types exist for a reason, if people actually took advantage of the type system they have available to them they wouldn't need half the validation methods they are using.

So, to pertinently answer your question: why not use prepared statements?

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

I came here to say something very similar to what Bacterius said.

You can verify it yourself... but prepared statements don't need to be verified. That's is entirely different.

Verifying has two mains issues:

  1. Bugs. If you missed some injection opportunity (as others have said: newlines, unicode, weird slashes & colons, false positives, etc), you're vulnerable or left with a broken server that rejects valid input.
  2. Performance: You need to spend cpu cycles verifying each statement.

Prepared statements have none of those issues because they don't need to verify at all.

Ok then, thanks for the explanation!

Kind regards,
Flyverse


But why use prepared statements, if I can just verify user input myself, before putting it in the string-query?

As a general principle: where the option exists to use functionality that's been written by domain experts and has been extensively tested in the real world, you should strongly prefer that over your own code.

This topic is closed to new replies.

Advertisement