Answer is actually very simple, but several of my friends have missed the point of having size modifier in MySQL column type definitions, so I suppose it’s something non-obvious. Main confusion, I believe, has its roots in string data-type definitions:

VARCHAR(N)
CHAR(N)

where N is the maximum number of characters you want to store into field, i.e. it is field’s length. And depending on your settings you might receive an error (in strict SQL mode) when you are inserting something that exceeds the pre-defined length. Even if you don’t receive error or warning, the field is limited to the length you set, and the exceeding part is truncated. So, VARCHAR(20) and VARCHAR(40) are really different with respect to how big the stored value could be.

Now, let’s get back to numeric types. For numeric types length modifier is actually display width, which has nothing to do with what can be stored in field. Yes, that’s it – both TINYINT(1) and TINYINT(4) can store values in range -128..127 (or for unsigned values 0..255), and are absolutely identical datatypes, with one small twist (which concerns date retrieval – see below).

Here is explanation of what is meant by display width, taken directly from the source:

The display width does not constrain the range of values that can be stored in the column, nor the number of digits that are displayed for values having a width exceeding that specified for the column. For example, a column specified as SMALLINT(3) has the usual SMALLINT range of -32768 to 32767, and values outside the range allowed by three characters are displayed using more than three characters.

So, if display width doesn’t constrain the range of values our numeric type can hold, why to use it at all? On MySQL forum, there was a post suggesting that by using the display width, we set mental reminder to ourselves on how big the field should be. Me personally, find this quite logical, this is some kind of mental tooltip which pops up when you review your schema.

Of course, there’s a more deliberate functionality for having display width. One obvious example would be ZEROFILL attribute, which when used in datatype definition guarantees that the value returned would always have pre-defined width, and default space padding is replaced with zeroes. For example, when column is defined as INT(5) ZEROFILL, and value to be retrieved is 4, MySQL would return it as 00004.

So, the bottom line: for numeric types, N in TYPE(N) has nothing to do with value range, and refers to display width.

Have a nice weekends!

  • Digg
  • Sphinn
  • del.icio.us
  • Facebook
  • Mixx
  • Google Bookmarks
  • Reddit
  • Technorati
  • email
  • Print
  • DZone
  • eKudos
  • LinkedIn
  • StumbleUpon
  • Tumblr
  • Twitter
Trackback

5 comments untill now

  1. //Me personally, find this quite logical, this is some kind of mental tooltip which pops up when you review your schema.//

    Well said and I agree with you!

    //for numeric types, N in TYPE(N) has nothing to do with value range, and refers to display width.//

    Exactly :)

    Cheers,
    Raghavan alias Saravanan M.

  2. Whats a good type for primary key?

    I usually use signed BIGINT autoincrement, but I see other people using INT.
    This always bothers me, what I should use for primary key as I don’t want to run out of numbers and I want to keep it simple etc.

    Phil

  3. @philip andrew:
    Well, actually integer is quite big enough to make sure you would not run out of numbers. If you use unsigned int, then the maximum is 4294967295 – and believe me it is a huge number!

    Of course there might be situations when even this number is not enough, and you resort to BIGINT, but it happens more rare than one might think. Primary keys are used to uniquely identify some entity in your table – and if your application has 4294967295 of anything, it’s quite a big application.

    Why I prefer INT? Because of performance – if you don’t need big number, and you use it, then you are wasting not only your resources but computing power as well. MySQL is extremely good when it comes to int primary key traversal (for selects, joins), so try to stick with INT, unless you really need BIGINT.

  4. Firstly, an int on 95% of machines will be faster than tinyint or smallint even though it is bigger — this is because it fits into your 32bit cpus register without requiring masks for operations. This is why in C an int is defined as the fastest, most natural, datatype for your architecture — and only guaranteed to be as big as a small, and no bigger than a long. In a true 64bit env, a 64bit BIGINT might be faster than a tinyint; but, again we’re talking pennies in performance considering many people who use MSSQL don’t even understand the different types of indexes.

    Secondly, tinyint(X) embeds a presentational component into the database. This is always bad practice. You gain nothing at all in querying capabilities from this. And, to change the display requires either a higher level hack, or modification of the datatype’s psuedo width which will result in an change in what is returned to all querying programs many of which will not, and should not, have knowledge of the database’s underlying widths and “virtual widths.”

  5. Evan, you’re not considering the space wasted when you use an int instead of a tinyint when you know there will only be small numbers.

    The smaller the data, the faster your queries *can* be. I agree with your argument that the usual int will fit into a register, but that speed gain may not be noticable in current cpus since they are quite fast already.

    If you can reduce 3/4 of the column size you can keep more data in the cache and in memory, which is a better gain than a few cpu cycles you may get from using a native int instead of tinyint.

    regards to all.

Add your comment now