Skip to content

tip 3 : never ever whatsoever

When using select in your SQL, do so by specifying the name of each individual field

never use select *

Old man reading news paper early_in the morning at Basantapur

This file is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.

tip three

When using select in your SQL, do so by specifying the name of each individual field. Never ever use ‘select *’

because

  • unnecessary field selections can slow your query
  • the contract between you and your client is then meaningless, even if your client is you

the line of least resistance

Ooh, ‘select *’, tempting isn’t it? When you’re querying a row in a view, a table or a table-function, why not grab all of the fields? Just scoop up everything in one go, you can leave it to the callers of your routine to sort through what comes back!

As long as you’re careful in the code that unpacks the results, ‘select *’ can save you from having to keep up with the structure of whatever you’re querying. And then there’s that gut feeling of yours, maybe (you tell yourself) it’s quicker and less trouble for SQL Server just to scoop up the whole row, rather than picking through all of the individual fields and sorting through to find whatever you wanted.

Plus, what if you change your mind about what you want, or if a field in the table /view/function changes its name? Then if you didn’t use select *, you may need to change the code that unpacks and accesses the data. Or what if you’re only writing the data access layer for other client apps that do their own unpacking? You don’t even know which fields they want… yeah, it’s select * for me, any day of the week. I mean, while you’re down there, anyway, picking up the row. Just hoover up all of the data, and pass it on!

This is like when your grandad asks you to go to the sitting room and bring his glasses, ‘while you’re there, just pick up my copy of the newspaper, thanks very much. Oh and my briefcase too, it’s next to my glasses’. See? Yes, SQL Server has gone to all the trouble to actually navigate to the row, best to scoop up everything in one go. What harm can it do?

Well, as it turns out, there are a couple of possible drawbacks to this approach. Let’s just think about grandad for a moment, what if he’s a bit forgetful? Maybe the newspaper’s not in the sitting-room after all, which means you have to go and look for it somewhere else. Oh no! It’s in the loft! And this is the Sunday newspaper, which comes in several sections, by now scattered around the house (for the purposes of this discussion, you’ll be pleased to hear that Grandad wants all sections of the paper). So just by agreeing to retrieve one item for an elderly relative, you’ve given over your whole afternoon to an exasperating hunt for the gardening section. Which Grandad won’t read because he doesn’t even like gardening, and anyway he’s asleep in his chair by the time you get back with it (it was under the dog’s water bowl, by the way). Thanks very much.

Sorry, what? Yes it is, a query to SQL Server is just like you and your grandad. As in, sometimes the server will need to potter about a bit to get what it needs.

potter

For a start, not every field in a row is necessarily stored in the row itself. Large variable-length fields can be stored on row-overflow pages, which you need to access in order to read the whole row. Which means that grabbing the whole row is not always accomplished in a single read.

Then there are some data types (text, ntext or image) for which the data is stored off-row by default. That is to say, even if you have a really small amount of data in those columns you’re committed to wandering off elsewhere to go and find their data.

You might not unreasonably say to yourself that the table or view whose contents you’re grabbing is nice and compact, with none of those unpleasant off-row data types. But then you’re relying on the notion that this will always be the case. If some bright spark comes along and redesigns your table so that each row now contains an image, well in that case your query or stored procedure is now committed to hauling in those images every time you call it. Which means that every row access will involve hunting around the database hauling in the corresponding LOB Data page to grab the image.

I’m not suggesting that protecting yourself against dim-witted database designers is the issue here (probably/possibly your colleagues are very competent). The problem is that in writing a select * query you’ve left yourself open to another pitfall to deal with if ever you have to redesign the table. It’s far from impossible that the redesign will be sufficiently urgent to preclude the opportunity to investigate all of the dependencies on the table. Why risk falling into the pit?

heavy work

Let’s take another look at Grandad (stay with me, I know where I’m going with this), he also wanted his briefcase, didn’t he? What if it’s really heavy? The second pitfall can only occur when you query a view or table function. We’ll revisit this later as a fault in a class of it’s own – it’s the hidden operation.

If you select * from a view or table function then it’s perfectly feasible to perform whole sets of operations that you just don’t need. A field in a view can hide a call to a resource-heavy scalar function, or it could be supplied by a join with a huge unindexed table. Unless you’ve dug right into the view (and sometimes even when you have), you can miss these tripwires.

breach of contract

The final reason not to select * is to protect the users of your query. By calling select * you risk breaking any contract between you and your caller (even if the caller is you). Somebody calling your query may be expecting a specific set of columns in a particular order. By using select * you risk the possibility that you’ll deliver something other than what is expected. A table change can break somebody else’s program, even though you yourself did nothing.

avoiding the pit

You can’t be expected to catch every single gotcha. But it’s possible to minimise any potential damage by only selecting what you need, and no more. Because if you don’t select a field that’s really a function then the function isn’t called. If you don’t select a field that’s delivered via an expensive join, you won’t pay the price of that join. If you don’t select a field that’s stored off-row then SQL Server doesn’t have to go and get it for you. It’s as simple as that.

The final extra bonus for only selecting what you need is that your query results will use less memory in SQL Server, which should cut down on the time it takes to deliver your precious data. So save yourself the bother and select only what you need.

Leave a Reply

Your email address will not be published. Required fields are marked *