Things that are not strings
As programmers, we have a collective delusion that anything that can be represented as a string, is a string. This thinking causes a whole bunch of problems.
Let’s take SQL for example. Every API in every programming language that I’ve seen treats a SQL statement as a string.
function execute(sql: string): Promise<Result>
The problem with this API is that not every string is valid SQL (and even when it is, it is not always the SQL you actually want to run).
Here’s a classic example of the misuse:
const query = 'SELECT * FROM posts WHERE id = '
  + params.id;
In this example, params.id can be anything, including invalid or malicious SQL.
The root problem here is not the lack of sanitization. The problem is that SQL is treated as a string.
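To make that concrete, here is a minimal sketch of what the concatenation above produces for one hostile input. The value of params.id is made up for illustration.

```typescript
// Hypothetical request parameters; in a real app they arrive from the outside world.
const params = { id: "1 OR 1 = 1" };

// The same concatenation as in the example above.
const query = "SELECT * FROM posts WHERE id = " + params.id;

// The caller-controlled value is now part of the statement itself,
// so the WHERE clause matches every row instead of a single post.
console.log(query);
// SELECT * FROM posts WHERE id = 1 OR 1 = 1
```

Nothing in the type of query tells you where its pieces came from, which is the whole point.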
Think about JSON for another example. You could certainly implement adding an item to a hash by doing this (I hope this code makes you cringe):
function addKeyValue(json, key, value) {
  return json.substr(0, json.length - 1)
    + ', "' + key + '": "' + value + '"}';
}
As with the SQL example, you could add escaping and sanitization, but those are just hacks hiding the real problem:
A string can be a representation of a thing, but it’s not the thing itself.
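For contrast, here is a sketch of the same operation done on the thing rather than on its representation: parse once, change the structure, serialize once. JSON.parse and JSON.stringify are standard; the JsonObject alias and the error message are my own assumptions.

```typescript
// Illustrative alias for "a parsed JSON object".
type JsonObject = Record<string, unknown>;

function addKeyValue(json: string, key: string, value: unknown): string {
  const parsed: unknown = JSON.parse(json); // throws SyntaxError on invalid JSON
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new TypeError("expected a JSON object");
  }
  // Build a new object instead of splicing characters into a string.
  const updated: JsonObject = { ...(parsed as JsonObject), [key]: value };
  return JSON.stringify(updated); // escaping is the serializer's job
}

console.log(addKeyValue('{"x":1}', "y", 2));
// {"x":1,"y":2}
```

Quotes in keys, empty objects and non-string values all stop being special cases, because the code never looks at the characters.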
And it’s not only about concatenating strings. Can you spot the problem with this function?
function isSafeDomain(url: string): boolean {
  return url.includes('https://safe.com/');
}
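One problem with the function above is that includes only asks whether the text appears somewhere in the string, so a URL like https://evil.com/?next=https://safe.com/ passes the check. A minimal sketch of the same check done on a parsed URL, using the standard URL class (the parameter name and the decision to return false on unparseable input are my assumptions):

```typescript
function isSafeDomain(input: string): boolean {
  let url: URL;
  try {
    url = new URL(input); // throws on strings that are not absolute URLs
  } catch {
    return false;
  }
  // Compare the parsed origin (scheme + host + port), not a substring.
  return url.origin === "https://safe.com";
}

console.log(isSafeDomain("https://safe.com/login"));
// true
console.log(isSafeDomain("https://evil.com/?next=https://safe.com/"));
// false
```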
Or in this one?
function checkPassword(pass: string, hash: string): boolean {
  return sha1(pass) === hash;
}
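One likely issue here is that === on strings is not a constant-time comparison, so response time can leak how much of the hash matched. On top of that, both arguments are plain strings, so nothing stops a caller from swapping them. Below is a sketch of a small opaque type built on Node's crypto module. The PasswordHash class and its method names are hypothetical, and I have swapped sha1 for a salted scrypt digest, which is my substitution and not part of the original example.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Hypothetical opaque type: a hash can only be created from a password,
// and the only thing you can do with it is test a candidate against it.
class PasswordHash {
  private constructor(
    private readonly salt: Buffer,
    private readonly digest: Buffer,
  ) {}

  static fromPassword(password: string): PasswordHash {
    const salt = randomBytes(16);
    return new PasswordHash(salt, scryptSync(password, salt, 32));
  }

  matches(candidate: string): boolean {
    const digest = scryptSync(candidate, this.salt, this.digest.length);
    // Both buffers have the same length, as timingSafeEqual requires.
    return timingSafeEqual(digest, this.digest);
  }
}

const stored = PasswordHash.fromPassword("correct horse");
console.log(stored.matches("correct horse")); // true
console.log(stored.matches("wrong horse"));   // false
```

A real implementation would also need a way to store and load the salt and digest; that is left out to keep the sketch short.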
Strings are lower level, and thus much more flexible than they need to be to properly implement valid operations on the higher-level concepts.
Incomplete list of things that are not strings:
- SQL
- HTML
- JSON
- URL
- File path
- Password
Things are… things
You can save yourself a lot of headaches if you stop treating everything that can be represented as a string as a string.
Both OO and FP styles allow for abstracting away something as a type or a class. You can make a closed opaque structure for the thing and limit the ways it can be constructed.
For example, for SQL, you might want to make sure it’s only created from static string literals.
// Allowed
new SQL('SELECT * FROM posts WHERE id = ?');
// Not allowed (e.g. via a lint rule)
new SQL('SELECT * FROM posts' + filter);
Of course, at some point, you will need to serialize the thing into a string to pass it into an API that was designed to consume a string. Do it at the last possible moment and try to limit it to a single place in the codebase.
function execute(sql: SQL): Promise<Result> {
  return unsafeExecute(sql.toString());
}
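The SQL type used above can be sketched a little further. This is one possible shape, not the only one: instead of a public constructor plus a lint rule, it uses a tagged template so that the literal parts become SQL text and every interpolated value becomes a bound parameter. The sql tag, the two-argument unsafeExecute stand-in and its return type are all assumptions made for the example.

```typescript
// Opaque SQL value: the constructor is private, so the tag below is the way in.
class SQL {
  private constructor(
    readonly text: string,
    readonly values: readonly unknown[],
  ) {}

  // Literal chunks are joined with "?" placeholders; interpolations never touch the text.
  static tag(strings: TemplateStringsArray, ...values: unknown[]): SQL {
    return new SQL(strings.join("?"), values);
  }
}

const sql = SQL.tag;

// Hypothetical stand-in for a driver call that takes text plus bound parameters.
async function unsafeExecute(text: string, values: readonly unknown[]): Promise<string> {
  return `${text} <- ${JSON.stringify(values)}`;
}

// The single place where a SQL value is turned back into a string.
function execute(query: SQL): Promise<string> {
  return unsafeExecute(query.text, query.values);
}

const id = "1 OR 1 = 1";
const query = sql`SELECT * FROM posts WHERE id = ${id}`;
console.log(query.text);
// SELECT * FROM posts WHERE id = ?
```

The hostile id from earlier now travels separately in query.values and never becomes part of the statement. Someone can still call the tag as a plain function with a hand-built array, so treat this as a guardrail that a lint rule should back up.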
Strings come into your app from the outside world. Don’t trust them to be what they seem to be. Convert them into proper things as soon as possible, and convert them back to strings as late as possible.
Here are a few libraries for inspiration on how to treat things as… things:
- SQL: Slonik, LINQ
- HTML: React, Elm, rum
- JSON: a dictionary in any programming language
- URL: url - Rust
- File paths: std::path::Path - Rust
- Passwords: secstr, securemem