Thursday, September 1, 2011

JavaScript Format File Size Truncate Decimals

I found a nice bit of JavaScript code that decides which size abbreviation to display for a given number of bytes. It was a bit too limited for my situation, so I kept its awesome bit of logarithmic logic and added the ability to handle null and to round to a number of decimal places (defaulting to 2 decimals).

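function formatSize(bytes, decimals) {
    // null, undefined, NaN and 0 all count as "no size" (Math.log(0) is -Infinity)
    if (!bytes) return 'n/a';
    var sizes = ['Bytes', 'KB', 'MB', 'GB', 'TB'],
        // floor(log1024(bytes)) is the index into sizes; clamp it to the ends of the list
        i = Math.min(Math.max(Math.floor(Math.log(bytes) / Math.log(1024)), 0), sizes.length - 1),
        // compare against null so an explicit 0 decimals is respected
        decMult = Math.pow(10, decimals == null ? 2 : decimals);
    // scale to the chosen unit, then round to the requested number of decimals
    return Math.round((bytes / Math.pow(1024, i)) * decMult) / decMult + ' ' + sizes[i];
}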
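Note that the function above rounds its result with Math.round. If dropping the extra digits is what is wanted instead, a helper along these lines might do. This is only a sketch: the name truncateDecimals is made up for illustration, and it assumes an environment that provides Math.trunc (ES2015 or later).

```javascript
// Hypothetical helper (not from the original code): drop, rather than round,
// everything past the requested number of decimal places.
function truncateDecimals(value, decimals) {
    // == null catches both null and undefined, so an explicit 0 is respected
    var decMult = Math.pow(10, decimals == null ? 2 : decimals);
    return Math.trunc(value * decMult) / decMult;
}

console.log(truncateDecimals(1.999));            // 1.99
console.log(Math.round(1.999 * 100) / 100);      // 2 (rounding, for comparison)
console.log(truncateDecimals(24.892578125, 0));  // 24
```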

The logarithmic logic may not be the obvious choice for most developers; I certainly haven't seen it anywhere else, but it works. Since it piqued my interest, I wanted to understand how this line of reasoning could be obvious to someone else.

Let's first consider what it would take to achieve the goal with a non-logarithmic solution. It could involve casting the number to a string to find its length and then doing some calculation on that length to produce the index that maps to the abbreviation. You would have to account for possible edge cases. Potentially this is easy enough.
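For comparison, here is a rough sketch of one non-logarithmic route. It is not the string-length idea described above; it swaps in repeated division by 1024, which avoids the base 10 versus base 1024 mismatch. The function name sizeIndexByDivision and its maxIndex parameter are invented for this example.

```javascript
// Hypothetical non-logarithmic version: count how many times the value
// can be divided by 1024 before it drops below 1024.
function sizeIndexByDivision(bytes, maxIndex) {
    var i = 0;
    while (bytes >= 1024 && i < maxIndex) {
        bytes /= 1024;
        i++;
    }
    return i;
}

var sizes = ['Bytes', 'KB', 'MB', 'GB', 'TB'];
console.log(sizes[sizeIndexByDivision(25490, sizes.length - 1)]);   // KB
console.log(sizes[sizeIndexByDivision(2048000, sizes.length - 1)]); // MB
```

The loop does the same job as the logarithm, just one step at a time; the log version gets the answer in a single expression.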

For those who despise math, the following may sound like Vogon poetry. Also, I am not a math teacher, so bear with me while I attempt to think through it logically.

Logarithms are used in many everyday situations. The two that come to mind first are decibels (sound volume) and the Richter scale (earthquakes); both use base 10. Each step up the scale is 10 times the previous value; we can call each step a period. In short, a logarithm provides a way to turn an exponential equation/graph into a linear equation/graph. It is not important to understand the math, but notice that a ratio of 10 maps to 10 dB and a ratio of 100 maps to 20 dB.

From: http://www.ndt-ed.org/GeneralResources/decibel/decibel.htm (note: log is in base 10)
Ratio between Measurement 1 and 2 | Equation | dB
10 | dB = 10 log (10) | 10 dB
100 | dB = 10 log (100) | 20 dB
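The two rows above can be checked with a couple of lines of JavaScript. This is just a sketch; the function name decibels is made up here, and it assumes Math.log10 is available (ES2015 or later).

```javascript
// dB = 10 * log10(ratio), the equation from the table above.
// "decibels" is a hypothetical name used only for this example.
function decibels(ratio) {
    return 10 * Math.log10(ratio);
}

console.log(decibels(10));  // 10
console.log(decibels(100)); // 20
```

Multiplying the ratio by 10 only adds 10 to the result, which is the exponential-to-linear effect described earlier.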


In this situation we want to determine which period a given number of bytes falls in (so to speak). Abstractly, the periods are numbered from the right: "...,<3>,<2>,<1>,<0> bytes".

For example:
25490 would fit into <1> per the representation above, since it is at least 1024 but less than 1024 * 1024 (1048576).

Let's start to put the decibel/Richter scale and this case together. Decibels use a base (period) of 10. A base (period) of 10 does not translate usefully to bytes. However, a period of 1024 does mean something.

So we can use log base 1024 of x bytes (let's write this as log1024(x)) to find out which period it falls in. JavaScript has a function (Math.log10, standard since ES2015) which evaluates log base 10 of a value. However, it doesn't let you specify a different base. I will elide a lot of logic and say that JavaScript also has a function (Math.log) which returns the natural log of a value. See the links below for more information on natural log. To simplify things, we can solve any log base "b" of "a" using only natural logs (math shorthand: ln).

logb(a) = ln(a) / ln(b)
Thus,
log1024(a) = ln(a) / ln(1024)
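In JavaScript the change-of-base formula above is a one-liner. The sketch below uses a made-up function name, logBase, and tries it on the 25490 example from earlier.

```javascript
// Change of base: log_b(a) = ln(a) / ln(b). Math.log is the natural log.
// "logBase" is a hypothetical name used only for this example.
function logBase(a, b) {
    return Math.log(a) / Math.log(b);
}

var result = logBase(25490, 1024);
console.log(result.toFixed(2));  // prints 1.46
console.log(Math.floor(result)); // prints 1, i.e. period <1>
```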

So we can use Wolfram Alpha to do some examples.

ln(20480) / ln(1024) = 1.432...
http://www.wolframalpha.com/input/?i=ln%2820480%29%2Fln%281024%29

ln(2048000) / ln(1024) = 2.096...
http://www.wolframalpha.com/input/?i=ln%282048000%29%2Fln%281024%29

If we take the mathematical floor of our natural log formula, we can map this to our list of abbreviations.
...,<3>,<2>,<1>,<0> bytes
[..., GB, MB, KB, Bytes]

Mapping the results yields KB for the first example (20480 bytes, floor of 1.432 is 1) and MB for the second (2048000 bytes, floor of 2.096 is 2).
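Here is a small sketch of that floor-and-map step on its own. The name sizeIndex is invented for illustration, and the sketch assumes a positive input below 1024^5 so the index stays inside the list. The array is written in ascending order so that the period number is also the array index.

```javascript
var sizes = ['Bytes', 'KB', 'MB', 'GB', 'TB'];

// Floor of log base 1024 gives the period, which doubles as the array index.
// "sizeIndex" is a hypothetical name; it assumes 0 < bytes < 1024^5.
function sizeIndex(bytes) {
    return Math.floor(Math.log(bytes) / Math.log(1024));
}

console.log(sizes[sizeIndex(20480)]);   // KB
console.log(sizes[sizeIndex(2048000)]); // MB
```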

This is a pretty cool line of thought for solving a programming problem mathematically.

Additional information:
The original code that spurred my interest can be found here: http://codeaid.net/javascript/convert-size-in-bytes-to-human-readable-format-%28javascript%29
