A flower, An alligator; When to use “A” and “An”

“A” and “An” often confuse writers of English. In terms of meaning, you can rest easy: both mean exactly the same thing, so no one will mistake what you say or write because you used the wrong one. The only real purpose of the distinction is ease of pronunciation. It’s there so that the language flows better.

Just say both out loud and you’ll hear that one of them sounds weird:
- A flower
or
- An flower

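Pronouncing the N and then the F feels awkward and unlinked. Just remember this simple rule: the article should never put two consonant sounds or two vowel sounds one after the other. It is the sound that counts, not the spelling, which is why we say “an hour” (silent H) and “a university” (it starts with a “you” sound).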

An flower and a alligator don’t work because there’s N and F following each other (2 consonants in a row) and A and A following each other (2 vowels in a row).
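If you like seeing rules as code, here is a rough Python sketch of that one. The function name article_for is made up for this example, and it only looks at the first letter of the word, not at how the word sounds, so treat it as an illustration rather than a real grammar checker.

```python
VOWELS = "aeiou"


def article_for(word: str) -> str:
    """Pick "a" or "an" from the first letter only (a simplification)."""
    first = word.strip().lower()[:1]
    # An empty string falls through to "a"; vowels get "an".
    return "an" if first and first in VOWELS else "a"


if __name__ == "__main__":
    for noun in ("flower", "alligator"):
        print(article_for(noun), noun)
```

Because it checks letters instead of sounds, words such as “hour” would need a list of exceptions on top of this.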

There’s also the plural trick. Look at the following:
1. “The flowers like the Sun.”
2. “The flower likes the Sun.”
Both are correct.

However, the following isn’t:
3. “The flower like the Sun.”
4. “The flowers likes the Sun.”

Case 3 is a special one: it can be read as an incomplete sentence rather than an incorrect one. If the intent was to say that the flower likes (as in loves) the sunlight, then the grammar is wrong. However, completing the sentence as follows makes it right: “The flower, like the Sun, has a nice yellow color.” A few punctuation differences and voilà, it makes sense.

See how, without the S at the end of “likes”, the verb turns into the comparison word “like” (as in likeness, likely, likewise, alike). That’s why the S is there.

Case 4 is a major mistake, and an easy one to make because it doesn’t sound entirely wrong. Make sure never to add an S to the verb when the noun is plural.

If you need a trick to remember it, think of it this way: between the noun and its verb, there’s only ever one S.

Back from Vacation
Scala, Twitter and Crashed WebFaction FTP

So, I’m back today from vacation at a beautiful cottage, and there’s nothing like waking up to the smell of a shiny new article about Microsoft and Yahoo making a partnership… Great, I missed a lot during those 5 days (a couple of hours of reading on TechCrunch, that is). Anyway, while listening to the Floss Weekly podcast that featured DHH from 37 Signals, the creator of Ruby on Rails, my interest in Ruby and Rails naturally rose, kind of like back from the dead.

I’d previously given up on Ruby, and I think I’ll be doing exactly the same thing again, without actually having picked it up again *. Why? This whole Rails thing and DHH’s “FU” humor prompted me to do a bit of research into where Twitter was with all their Ruby problems. And so, just when I thought the language didn’t really matter anymore and that the database was the only real hog, boy was I proven wrong by this brilliant article, “Twitter on Scala”.

Yes, I did read all the criticism towards this article, but, really, who do you trust most: angry Rails fans or Twitter developers? As for me, I decided to place my bet on the Twitter developers, and I am once again throwing Ruby out the window, and every other dynamic language along with it. Ironically this blog runs on PHP, but let’s say I never expect it to need scaling beyond a simple single-core server and WP Super Cache, for which WebFaction provides ample solutions.

Oh, talking about WebFaction, there’s currently an FTP outage on my server there. Hurray, not even a week and I’m already having problems. Ironically my (mt) account’s FTP still works very well, but I can’t say I haven’t seen hiccups there either. Fortunately my sites are in top shape, but it’s already been 37 minutes since I posted a ticket, and with no response and FTP still down, it’s not cool.

The only criticism I would make of Scala applies to every other language that’s not out-of-the-box like PHP or that isn’t tied to an IDE; that is, the installation process (symbolic links to your compiler, etc.) can be rather shady and is very often not explained in books. In fact, to learn it, I’ve had to figure out for myself what was going on, both on Windows and Unix systems, which have very different ways of doing it. Frankly I think Windows’ way is simpler to comprehend albeit less powerful, but really, I’m thinking of making a well-explained tutorial about that.

Edit: WebFaction finally responded, although a bit late, and the issue had been fixed before they responded. I’m guessing they had more than just me notifying them.

* Edit: I actually just did so, I picked up Rails and threw it away again. Despite having learned quite a bit of Ruby, which I did enjoy, I don’t like Rails, as usual.

phpBB spam no more!

Since we introduced the visual antibot and the question antibot plugin on AMV-Canada’s phpBB board, we’ve completely eliminated the bot problem. Quite simply, since May 4th 2009, we haven’t had ANY bot come through. We’ve even reduced the complexity of the captcha, and eventually disabled it completely, and still the spammers haven’t come back.

Yes, I mentioned the visual antibot, basically an upgraded captcha almost impossible to figure out even for humans (it puts random pictures in the background, making it REALLY hard to read; we’ve actually had some people complain to us that they couldn’t legitimately register), but we got rid of that. The only thing we have to protect ourselves on top of the default installation is this: The Question Antibot

That thing is holy. Basically it’s a question you make up and provide the answer for. For example, a mathematical question. It’s also super easy to set up, and you can change it every day if you want. One big advantage is that it puts off just about any bot, because they don’t know what to do with it. And if you have a high-traffic site and somebody programs a bot to answer the question, just change the question! It’s so effective you don’t need a captcha.

Unless computers become sentient, I believe this should put off just about any spam bot.
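To show how little there is to the idea, here is a minimal Python sketch of a question check. It is not the plugin’s actual code (the plugin is a phpBB mod written in PHP); the question, the accepted answers and the function names are all invented for this illustration.

```python
# Hypothetical question and answers, picked by the board admin.
QUESTION = "What is three plus four?"
ACCEPTED = {"7", "seven"}


def normalize(answer: str) -> str:
    """Lowercase the answer and collapse stray whitespace."""
    return " ".join(answer.strip().lower().split())


def passes_question_antibot(submitted: str, accepted_answers: set[str]) -> bool:
    """Return True only if the submitted answer matches an accepted one."""
    return normalize(submitted) in {normalize(a) for a in accepted_answers}


if __name__ == "__main__":
    print(QUESTION)
    print(passes_question_antibot(" Seven ", ACCEPTED))
```

Changing the question is just a matter of editing the two values at the top, which is what makes a bot written for yesterday’s question useless today.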
Hit the link: http://www.phpbb.com/community/viewtopic.php?f=69&t=645075&start=0

PHP Particularities: Escaping characters the right way

PHP, like any language, has its particularities. One of them is that it does not interpret escaped characters unless they are inside double quotes. So yes, there is a technical difference between a single quote and a double quote in a programming language, in PHP at least, if anyone asks.

How does it work? Let’s look at some examples!

$stringData = '<?xml version="1.0" encoding="UTF-8"?> \n SoraGami';
fwrite($fh, $stringData);

So, that PHP script generates an XML file, as you can see in our variable $stringData. I’ve omitted the rest of the code as it isn’t the focus of this article. The \n in that string is the famous escaped character which, in PHP and in all C-inspired languages, creates a new line. However, to use it, you must put it inside double quotes (“) and not single quotes (‘). Unfortunately, in the scenario presented here, we needed to put the whole string inside single quotes so that PHP wouldn’t trip over the double quotes inside the XML declaration.

In case you didn’t know, if you have elements with quotes in your string, you can use a combination of double and single quotes to make it work, like this:

'  "..."  '
"  '...'  "

Maybe that helped. Anyway, back to our previous example. The way we made it, because \n is within single quotes, it won’t work and appear as text instead. In this case, $stringData would give this:

<?xml version="1.0" encoding="UTF-8"?> \n SoraGami

Unfortunately this is far from what we wanted to achieve, which was to have SoraGami on a new line. We could do the following:

$stringData = "<?xml version='1.0' encoding='UTF-8'?> \n SoraGami";
fwrite($fh, $stringData);

That would work, but in my opinion single quotes don’t look like XML. So, instead, you can do it in two sequences:

$stringData = '<?xml version="1.0" encoding="UTF-8"?>';
fwrite($fh, $stringData);

$stringData = "\n SoraGami";
fwrite($fh, $stringData);

While this works, you might be wondering why the second sequence doesn’t overwrite what we did before. Our PHP $fh variable is written like this:

$fh = fopen($myFile, 'w') or die("can't open file");

‘w’ opens the file for writing and empties it, but that only happens once, at the moment fopen() is called. After that, every fwrite() on the same $fh handle continues from where the previous one stopped, so the second write lands after the first instead of replacing it. To overwrite, either run the script again (reloading the page calls fopen() again) or fclose() the file and make a new fopen() and fwrite().

If you want to append to the file, just change the ‘w’ parameter to ‘a’. Tizag has a nice tutorial on PHP File Handling that explains more about what we did here.

You might also be wondering why ‘\n’ doesn’t work and “\n” does. The reason is simple: inside double quotes, PHP interprets \n as a new line wherever it appears. That means if you wrote this:
“The server can be found through Windows on \nopi”, you would get this:

The server can be found through Windows on
opi

However, since you don’t want PHP to interpret the \n in this situation, you would write the following:
‘The server can be found through Windows on \nopi’, which would give the intended:

The server can be found through Windows on \nopi

One catch: a doubled backslash (\\) is an escape in both kinds of quotes and comes out as a single backslash. To print a Windows network path that starts with two backslashes, such as \\nopi, you have to write ‘\\\\nopi’.

How secure should your wireless network be?

Wireless network security is often overlooked. Networking is already complicated enough that most people simply skip the complicated setup and go along with unprotected network access for years. But even if you go through the trouble, there are a ton of ways to secure your router, some better than others, and choosing the right one is not always easy. This is why I decided to create this short guide, in which I’ll explain the security solutions, what’s good about them, and what’s bad.

Unsecured Access

Although maybe not the best idea, unsecured access guarantees compatibility, speed and ease of use. No complicated key to enter, your network is always available no matter what. This is the worst solution of course, but something to consider if you live in a remote farm area.

SSID (Service Set IDentifier) Broadcast Hiding

Your SSID is your network name. Through your router’s setup, you can choose anything you like. This facilitates recognizing which network is yours when having to connect between multiple networks. It’s also how Windows or other OSes will be able to remember your network settings and automatically connect you. Your SSID is always broadcast over the air so that devices scanning for your network can find it. One easy technique to augment network security has been to stop broadcasting your SSID. This is an easy thing to do. What it does is it hides your router from scanning. That way, only people knowing what your SSID is can access your network.

SSID hiding is, however, flawed. Each time a device connects to your network, be it your laptop turning on or a gaming console, your SSID is transferred in the clear, even on an encrypted connection. Widely available software lets anyone sniff network connections and easily retrieve the SSID. Additionally, most of the time your network isn’t even hidden: it simply comes up as a blank wireless entry. Connecting to it still requires entering the SSID, but a cracker can force one of your devices to reconnect, and the SSID is sent in the clear when it does.

In my opinion, SSID hiding is more of a bother than a useful thing. I never hide my SSID; it would just make my already long connection setup longer, and for no real security benefit.

MAC Address Filtering

Every network device in the world has a unique identifier called a MAC address, something like this: 00-0A-5E-54-59-BF. The theory is that if every adapter has a unique ID, it should be possible to allow only the desired devices onto your network. Fortunately, it is: every single router has that feature, or at least it should. Unfortunately, it’s no real means of protection and, again, more of a bother than a useful thing. The problem is that MAC addresses can be easily spoofed, even more easily than a hidden SSID can be uncovered, and detecting which MAC addresses work on a given network is also a piece of cake if you’re the least bit resourceful, as they are transferred in the clear (without encryption).

WEP (Wired Equivalent Privacy)

This deprecated protection scheme (yup, deprecated) is a very flawed but highly compatible security solution for wireless networks. WEP uses the stream cipher RC4, which is unfortunately an old and completely insecure encryption algorithm, so much so that WEP has been declared deprecated since 2004. In fact, with software mentioned on Wikipedia, I can crack any WEP connection in under a minute. There are even step-by-step articles, not shady and very easy to find, on how to operate the tool that performs Klein’s attack on WEP-secured networks. Why isn’t this being pulled from the web? Simply because WEP is deprecated. Such tools are widely available as a proof of concept of why you should not use WEP protection.

WPA (Wi-Fi Protected Access)

WPA is sort of a half solution. It still uses the RC4 cipher, but unlike its cousin WEP, it implements a different security protocol called TKIP, which includes a countermeasure mechanism that makes it impossible to get your network key. However, in 2008, a TKIP vulnerability was discovered, though it only allows an attacker to play with packets on your network (the form in which data is sent in and out). This makes it possible for the attacker to perform ARP spoofing on your network and incidentally sniff data over the air, compromising that data’s security and privacy, or to mount a DoS (denial of service) attack, blocking all network traffic and essentially bringing down a server. While a DoS attack may not be of concern for a home network (who would want to DoS attack you, seriously), it certainly is a potential threat for a business.

In other words, WPA remains a perfectly fine solution for home networks and its use of the RC4 cipher makes it compatible with legacy WEP hardware.

WPA2

However similar the name may be, if there is one thing WPA2 is not, it is similar to WPA. Version 2 is the correctly implemented 802.11i standard. Yes, WPA was made in a hurry, before the standard was even finalized, so that router makers could address the issues with WEP. This is why WPA support is sketchy, and some routers may offer variants of WPA that are not compatible with other devices. Conversely, WPA2-compliant routers all use the exact same standard, but you need recent hardware or firmware for that. Getting WPA2 protection on a computer or router can be as simple as installing updated firmware, but even recent gaming devices like the PSP 3000 often do not support it, especially because of WPA2’s increased overhead (the Nintendo DSi does support it).

Unlike WPA, WPA2 uses a completely different protocol and cipher, respectively CCMP (Counter Mode with Cipher Block Chaining Message Authentication Code Protocol) and AES (Advanced Encryption Standard, the Rijndael cipher that won the AES selection process). Unlike RC4, AES is an extremely sophisticated encryption algorithm used today to encrypt everything from US government secret information to the TLS (SSL) secure connections you use when you shop online.

AES is uncrackable. No one has ever found a way to crack this encryption scheme, with the exception of brute-forcing. Brute-forcing a connection involves trying every possible password until you can access the network. In practice, brute-forcing often means dictionary attacks, where common words are tried against the network’s authentication to find the password. This can be easily avoided with a full 63-character ASCII key, which you can make here: https://www.grc.com/passwords.htm

A brute-force on such a key is estimated to take a trillion years, and counter-brute-force mechanisms can slow that down several times. In other words, WPA2 is uncrackable if you use a good key.
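If you would rather generate such a key yourself, here is a small Python sketch using the standard secrets module. It is not how the linked page works; the alphabet (printable ASCII without the space) and the function names are my own assumptions. The second function only measures how big the search space is, it says nothing about how long an actual attack would take.

```python
import math
import secrets
import string

# Assumed alphabet: letters, digits and punctuation (94 printable ASCII symbols).
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def make_key(length: int = 63) -> str:
    """Build a random key using a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def keyspace_bits(length: int = 63, alphabet_size: int = len(ALPHABET)) -> float:
    """Size of the search space, expressed in bits of entropy."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    print(make_key())
    print(round(keyspace_bits()), "bits of entropy")
```

A random key like this contains no dictionary words at all, which is exactly what defeats the dictionary attack described above.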

The Perfect Solution

Unfortunately, WPA2 is not supported on all hardware, and using combined WPA/WPA2 for increased compatibility breaks your perfect, uncrackable, unsniffable protection. Fortunately for home users, routers such as the D-Link DIR-655 can handle two networks at the same time. Yup, you can set up a main network in WPA2, and a separate guest network with any protection scheme desired for incompatible devices. You can even prevent routing between the two networks so that your secure WPA2 network remains completely isolated from the less secure network.

I use this technique at home to enable compatibility with my PSP, which only supports WPA. My main network is WPA2-only, and my guest network is isolated (not routable) with a WPA-only scheme. This makes my main network, for credit card transactions over the Internet for example, completely secure, while still leaving gaming access for older machines. Since WPA can only be sniffed, it also makes it impossible for anyone unauthorized to use my bandwidth, which could happen if I left the guest connection open or on WEP security.

PAE vs 64 bit – What manufacturers don’t want you to know

Note: This article is old, dating back to May 2009, and severely outdated. Whatever the information included in my post below, it’s probably advisable not to care today. As a simple fact of the matter, tell yourself that if your computer does not yet support 64 bit, it’s time to get a new one.

From 32 bit to 64 bit

You’ve heard the drill, any system with 4GiB+ of RAM requires a 64 bit operating system. Why? Because the total addressing space of the memory (the number of locations on the physical memory of your computer) in 32 bit memory architectures is limited to a total of 4 GiB.

This can be calculated with the following formula: 2^32, which equals 4,294,967,296 bytes, or 4 GiB.

Conversely, 64 bit has the following formula: 2^64, which equals a rather stunning 18,446,744,073,709,551,616 bytes, which translates into 16 EiB or 17,179,869,184 GiB.

So, it’ll take a while before we run out of addressing space in 64 bit memory architectures.
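If you want to check those numbers yourself, the arithmetic fits in a few lines of Python. The function and constant names below are just mine, chosen for this sketch.

```python
GIB = 2 ** 30  # bytes in one gibibyte
EIB = 2 ** 60  # bytes in one exbibyte


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses with the given address width."""
    return 2 ** address_bits


if __name__ == "__main__":
    for bits in (32, 64):
        total = addressable_bytes(bits)
        print(f"{bits} bit: {total:,} bytes = {total // GIB:,} GiB")
```

Python integers have no fixed size, so the 64 bit figure is computed exactly rather than rounded.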

Anyway, you’ve probably recently heard about that thing called PAE, or Physical Address Extension. Most likely, you’ve heard it as a trick to make Windows XP recognize that extra ram, or your 32 bit Vista perhaps.

Is it true? Does it really work? The short answer is yes. It does work. However, we need to dive into the specifics a little bit to know why in the world Microsoft and other hardware/software companies decided to hide this Intel invention.

PAE Explained

Intel invented the x86 architecture (its 64 bit extension, x86-64, came from Advanced Micro Devices and is also known as AMD64). Intel also created something called Physical Address Extension (PAE) as a feature of the x86 architecture to make up for the future memory limitations of the 32 bit architecture.

The trick is not a trick, it is a real hardware feature that consists of augmenting the number of memory address lines on the CPU from 32 bit to 36 bit, bumping the total possible amount of memory from 4 GiB to 64 GiB.

However, the processor remains a 32 bit processor, as well as its supporting motherboard and memory chips, preventing simultaneous use of more than 4 GiB. In other words, it remains possible to exploit the total 64 GiB of memory but no single virtual RAM instance (e.g. Photoshop) or physical memory unit can use more than 4 GiB of memory at the same time.

This translates into the impossibility of mounting anything larger than 4 GiB sticks of RAM in your system, and probably even more than a total of 4 GiB because of the 32 bit architecture of your motherboard, which most likely only supports a total of 4 GiB of RAM, regardless of the CPU.

Now, since the total simultaneous limit is 4 GiB, you may be wondering how the operating system is capable of expanding this to 64 GiB. Since the x86 architecture is old and very common, operating systems including Windows and Unix variants like Mac OS X and Linux have long supported PAE, and with a small configuration change it’s possible to enable this “hidden” feature (on Windows it’s a boot option; other operating systems have their own mechanism).

In Windows, applications reach that extra memory through Address Windowing Extensions (AWE), which map, or “window”, regions of physical memory beyond 4 GiB into an application’s own 4 GiB virtual address space.

Why nobody told you

If you’re just a tad geeky about computers, you probably understood right away why software and hardware makers are pushing you straight towards 64 bit instead of temporarily using PAE.

And there, I’ve just said it, it’s a temporary solution. In a few years, systems will already be exceeding 64 GiB of memory, and applications (virtual memory instances) will need much more than blocks of 4 GiB of memory, a limitation of PAE.

Since 64 bit is a much more future-ready effort, instead of having to do two switches in a short period of time, Intel and Microsoft, among other companies, decided to literally hide this capability and make 64 bit the only apparent way to get more than 4 GiB of addressing space.

Should you use PAE

For the moment, if you’ve been using anything other than Windows, for instance a Mac, 64 bit should not have been an issue for you, because earlier processors used by Apple, like the PowerPC G5 (before the Intel x86-64 switch), were already 64 bit architectures.

Unlike Microsoft Windows and Linux, Apple’s tight integration of hardware and software allowed a much harder transition, from the IBM PowerPC to the Intel x86-64 microprocessor architecture, to be done in a very smooth way.

However, Microsoft’s much larger user base doesn’t allow this freedom and Windows XP remains a largely 32 bit operating system, seeing very limited support of its 64 bit counterpart, released in 2003, the same year Apple released its own first 64 bit system, the Power Mac G5.

To smooth the transition, Microsoft also released Windows Vista and the more recent Windows 7 in two editions, 32 bit and 64 bit, although in this case support is much more widespread, thanks especially to the wider availability of 64 bit processors from Intel.

Along with Windows Vista came the mainstreaming of 4 GiB of system RAM and its related problems.

If you were hesitating to switch over to Windows Vista 64 bit and found PAE to be a good way to solve your RAM problem while keeping Windows XP, I would reconsider.

With the imminent launch of Windows 7 and Windows XP’s extended support period ending in 2014, PAE is but a very temporary solution.

How to

But hey, it can be useful where 64 bit is not possible because of a 32 bit processor, so why not. Here goes: this option is compatible with any Intel Pentium Pro, Pentium II, III, 4, Core, Core 2, Core i7 and later processor, along with every recent AMD processor, including the Athlon series.

Windows XP

1. Open an explorer window
2. Tools > Folder Options > View Tab
3. Select the radio button labeled “Show hidden files and folders”
4. Click OK to accept changes and close the dialog box
5. Go to your local drive where Windows is installed, most likely C:
6. Locate the file called BOOT.INI
7. Right-click on the file and click Properties
8. In the Properties dialog box, make sure the Read-only attribute is unchecked (checking it will prevent you from modifying the file)
9. Click OK to accept changes and close the dialog box
10. Open the BOOT.INI (default opens with Notepad)
11. It should look something like this:

[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINNT
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINNT="Microsoft Windows XP Professional" /noexecute=optin /fastdetect

12. Append at the end of last line the following: /PAE
13. It should now look like this:

[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINNT
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINNT="Microsoft Windows XP Professional" /noexecute=optin /fastdetect /PAE

14. If it does, save the file, exit Notepad and restart Windows

Congratulations, your Windows XP system now runs with PAE enabled.

Windows Vista / Windows 7

1. Click on the Start Orb
2. Search for CMD
3. Right-click CMD or Command Prompt in the search results and click Run as administrator
4. In the command line, enter:
bcdedit /set pae ForceEnable
5. Close the command line

Congratulations, your Windows Vista/7 32 bit system now runs with PAE enabled.

To enable PAE for a specific boot entry if you’re using a dual/multi-boot system, refer to the MSDN BCDEdit /set command documentation for instructions on how to set the ID of the boot entry.

Note: Official Microsoft documentation does not as of this writing specify explicit support for PAE in Windows 7.

Note: Applications not specifically built to use Microsoft’s AWE API will not be able to use more than 4 GiB of RAM even with PAE enabled. This includes just about every application that isn’t server-specific, so no, Photoshop and company won’t get more than 4 GiB of RAM each.

Note: Even though PAE technically enables 64 GiB of RAM, Windows XP, along with Windows Vista 32 bit, is limited to a total of 4 GiB. Detailed information can be found in this MSDN article on memory limits of Windows. Windows Vista 32 bit and Windows 7 32 bit also support PAE, as mentioned in other Microsoft articles spread amongst MSDN and other resources. However, Microsoft documentation isn’t clear on whether it is the physical RAM limit or the physical total addressable memory limit that is 4 GiB so results of enabling PAE may vary. With luck, they meant the total physical RAM limit, and not the total physical addressable memory limit, since this one is already 4 GiB with the 32 bit architecture, which would make PAE useless.

Note: Although the x86-64 (64 bit) architecture does support PAE, Windows’s AWE does not support it.

Note: Even though Intel’s specifications limit PAE to an extended total of 36 bit of addressable memory, it is technically possible to expand this amount without changing the hardware. For example, Windows Server 2003 SP1 Datacenter Edition uses a special 37 bit PAE capable of supporting 128 GiB of RAM.

Mac and Linux

PowerMac processors have been 64 bit since the G5, and any Intel Mac is 64 bit too, so you don’t have to worry about PAE at all, just the OS’s own RAM limitations (up to 16 TiB of RAM in OS X Snow Leopard 10.6). In Linux, any distro on version 2.6 and up of the kernel has PAE included natively and most often enabled by default, so it’s another thing you don’t have to worry about.

kb, kB, KiB… What’s Up With That?

The world of computing uses many metrics to measure data. Probably the most significant is the number of bits contained in a given file. However, no one has ever truly agreed on how to write these measures, and efforts to standardize them are only recent.

Nevertheless, many governments and companies have started moving ahead with the adoption of more standard practices, even though companies like Microsoft have not yet adapted.

Why Should You Know This?

But the real question is: why should you, as a user, know this? Why should you care about kb, kB and KiB at all? It turns out that the moment you use a computer, it concerns you. Better yet, with today’s caps on Internet bandwidth, understanding the difference between the various measures is of crucial importance to understanding your telecom bill. There’s more to it than simple measures, but it’s at least part of the puzzle.

First off, the bit and the byte

In computers, all data (documents, pictures, videos and music) consists of binary information somewhere in the memory of your computer. Binary information is represented by 1s and 0s.

These 1s and 0s are understood by your computer via electricity. In its most basic state, 0 represents off, and 1 represents on. Various techniques are used to represent such data on different media. For example, compact discs present a series of microscopic holes on a circular plate, where a hole may represent 1 and no hole may represent 0.

One such 1 or 0 is always measured as a bit. Fun trivia, a standard CD has 5.6 billion possible holes, or in other words a capacity of 5.6 billion bits.

But a bit alone doesn’t mean much. All it can do is tell the computer whether it represents 1 or 0, which is not very exciting, and this is where the word comes in. The word represents the natural unit of data understood by a processor. It’s a group of bits that represents the most basic data set a processor can understand. For example, in typical systems, the letter A is represented by the 8-bit word 01000001. It’s called 8-bit because it has a length of 8 bits. Therefore, if you throw 16 bits at the processor, it will divide them into 2 distinct data sets and try to understand these.
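You can see this for yourself with a couple of lines of Python. This is only an illustration of the idea, not of what a processor literally does; the function names are invented, and I’m assuming plain ASCII text and 8-bit words.

```python
def to_bits(text: str, word_size: int = 8) -> list[str]:
    """Show each ASCII character as a fixed-width string of 1s and 0s."""
    return [format(byte, f"0{word_size}b") for byte in text.encode("ascii")]


def split_words(bits: str, word_size: int = 8) -> list[str]:
    """Cut a long run of bits into word-sized chunks."""
    return [bits[i:i + word_size] for i in range(0, len(bits), word_size)]


if __name__ == "__main__":
    print(to_bits("A"))
    print(split_words("0100000101000010"))
```

The second call cuts 16 bits into two 8-bit words, which happen to be the letters A and B.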

But a processor cannot understand individual bits. It is bound by its word size, meaning that if you throw 7 bits at a processor designed to handle 8-bit words, it won’t understand.

Because of this limitation, bits alone are not a good way to represent data when programming a computer. To solve this, we need to use the byte, a unit meant to represent a set of bits.

Historically, because the byte was not defined as a fixed size, it was analogous to a word. However, in modern systems, byte is universally 8 bits, stemming from the once popular 8-bit word size. But modern processors use varying word sizes, so the use of words to quantify data is ill-advised.

In this way, word is a unit relegated to the specifics of memory management in processor architectures, while byte is strictly used as a general unit of data representing 8 bits.

To sum it all up, what’s important to know is that on a computer with an 8-bit word length, data can only be represented in multiples of 8 bits. Therefore, bits are seldom used to represent data, in favor of the byte, except perhaps when measuring Internet connection speeds, usually in bits per second.
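That bits-versus-bytes split matters as soon as you try to guess how long a download will take. Here is a quick Python sketch; the function name and the example numbers are hypothetical, and it ignores protocol overhead entirely.

```python
BITS_PER_BYTE = 8


def download_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Ideal transfer time: file size in bytes over a link rated in bits per second."""
    return size_bytes * BITS_PER_BYTE / link_bits_per_second


if __name__ == "__main__":
    # A 700,000,000 byte file over a link rated at 8,000,000 bits per second.
    print(download_seconds(700_000_000, 8_000_000), "seconds")
```

Forget the factor of 8 and your estimate will be eight times too optimistic.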

Handling Larger Numbers

Once computer innovation started to ramp up, the quantity of data we work with expanded continually. Before engineers knew it, they had to deal with thousands of bytes, creating the need for a way to express data in smaller numbers. Unlike with money, which has existed for far longer than bits and bytes, it was easy to foresee that we would eventually reach much more than billions of bytes of data. To sum things up better, engineers decided to borrow from the SI system, formally “Le Système International d’Unités”, or International System of Units, a French invention for expressing large numbers in the metric system. If you live anywhere other than the United States, you’re probably already familiar with this notation, which appears in nearly everything.

In the SI system, a lower case k represents a thousand, a capital M represents a million, a capital G represents a billion, etc. Each measure has its full written form too, which consists of a prefix to add to any measure. For data, in order to represent a thousand bytes, you can say a kilobyte, a megabyte for a million, and so on.

The importance of case sensitivity: The SI system makes great use of lower and upper case in its symbols. For example, while a million is represented by an upper case M or mega, a milli, or 1/1000th, is represented by a lowercase m. When combined with a measure of distance, the very similar Mm and mm represent vastly different things, namely a megameter and a millimeter, the latter being one billion times smaller.

In the case of the bit, the SI system does well. Since the SI system is always based on powers of 10, it can be applied to the bit in a perfectly standard way: 1000 bits equal 1 kilobit (kb).

However, when it comes to bytes, the story is a bit different. Because of the computer’s binary nature, memory addresses (where the data is physically located on a memory chip) are written as binary sequences. As such, the number of addressable memory locations is counted in powers of 2. For example, an 8 bit address space has a total addressable memory of 256 bytes. The newer 16 bit memory architectures of the time could leap past the 256 byte limit and offer a much more potent 65,536 bytes of addressing space. Applying the SI system and its powers of 10 to this produced a somewhat awkward number, 65.536 kilobytes (kB). To remedy the situation, engineers took the SI prefixes and applied them as powers of 2 to match addressable memory, so that a kilobyte would equal 1024 bytes while a kilobit still equaled 1000 bits. That gave the 16 bit architecture a nice round 64 kilobytes of memory, and brought a confusion that still reigns in the world of computing today.
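The 16 bit arithmetic is easy to reproduce. Here it is as a short Python sketch, with variable names of my own choosing.

```python
ADDRESS_BITS = 16

# Number of addressable bytes in a 16 bit address space.
total = 2 ** ADDRESS_BITS

# The same quantity with a power-of-10 "kilo" and a power-of-2 "kilo".
decimal_kb = total / 1000
binary_kb = total / 1024

if __name__ == "__main__":
    print(total, "bytes")
    print(decimal_kb, "kB when kilo means 1000")
    print(binary_kb, "kB when kilo means 1024")
```

Same memory, two different answers, depending only on which “kilo” you pick.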

Controversy

Invented in America, where the metric system has not yet been adopted as of 2009, the secluded world of computing did not wake anyone up with its awkward borrowing of the SI system until 1995, when the IUPAC and the NIST proposed a newer, unambiguous system to the IEC. It was only accepted in 1999, giving birth to the kibibyte and the mebibyte, named after kilobinary and megabinary, later followed by the other higher multiples in 2000.

However, by that time, the computer industry had already been using the SI multiples as powers of 2 for 40 years. Ten years after the standardization of the IEC format, adoption has been nearly nonexistent, with the only systems using the new units being a few very recent Linux distributions. Even the latest Windows and Mac OS X still use the SI prefixes for bytes.

But using the SI prefixes for bytes isn’t necessarily wrong, since the IEC standard allows it. You just have to use them as powers of 10, so one kilobyte equals 1000 bytes and one kibibyte equals 1024 bytes. It goes without saying that this is a highly contested and controversial standard. It does bring clarity and remove ambiguity, but it also brings confusion for legacy systems.
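To get a feel for how far apart the two readings are, here is a small Python sketch (the names are mine, not part of any standard) that puts each decimal unit next to its binary counterpart:

```python
# Pairs of (SI symbol, IEC symbol), in increasing order of size.
PREFIXES = [("kB", "KiB"), ("MB", "MiB"), ("GB", "GiB"), ("TB", "TiB")]

def decimal_and_binary_sizes():
    """Return (SI symbol, SI bytes, IEC symbol, IEC bytes) for each prefix pair."""
    rows = []
    for power, (si_symbol, iec_symbol) in enumerate(PREFIXES, start=1):
        rows.append((si_symbol, 1000 ** power, iec_symbol, 1024 ** power))
    return rows

for si_symbol, si_bytes, iec_symbol, iec_bytes in decimal_and_binary_sizes():
    gap_percent = (iec_bytes - si_bytes) / si_bytes * 100
    print(f"1 {iec_symbol} = {iec_bytes} B, 1 {si_symbol} = {si_bytes} B, gap {gap_percent:.1f}%")
```

The gap starts at 2.4% for the kilobyte and grows to roughly 10% by the terabyte, so the ambiguity matters more as storage gets bigger.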

For example, MP3 player users are used to seeing their storage capacity in GB when it should be in GiB. Adding the i will obviously raise a lot of questions in stores where another brand might not use the same notation. Most store clerks would probably be unable to even explain the difference. This is why marketing forces at Apple and Microsoft, along with almost every hardware manufacturer, decided to keep the original measurement.

Additionally, there’s very little explanation of why this system should be implemented at all. After all, in traditional computing, a kilobyte never meant anything other than 1024 bytes, so there’s very little proof that the IEC’s newer system would improve the situation. Memory manufacturers won’t start making memory in powers of 10, and so many think that the confusion would just be transferred over to the kibibyte, which would often be mistaken for a thousand bytes.

Adoption

However, when it comes to the educated world, standards go a long way. Most of today’s technical documentation and teaching material has adopted the IEC’s standard, referring to kilobytes as powers of 10 and kibibytes as powers of 2. This means that computer classes in school will start teaching it that way, and that eventually, operating system makers and memory companies will have to adapt, regardless of controversy.

Conclusion

While your computer probably still uses the wrong notation, assuming you’re reading this article as of 2009, future systems will probably adopt the IEC standard, so let’s review the whole thing. I’ve also included the bit version of the IEC units, known as the kibibit (Kib) and its brothers, in this review, along with an in-depth explanation of capitalization rules.

Note on k/K: Although the SI system strictly uses a lower case k for a thousand because the upper case K stands for the kelvin, remember that the computing use of SI prefixes has never been standardized as part of the SI’s base or derived units, and many sources suggest that a capital K can also be used for a thousand.

A KB or a Kb cannot be mistaken for a kelvinbyte or kelvinbit, because neither makes sense. Also, since binary quantities aren’t subdivided and thus have no use for the lower case d, c, m and other fractional prefixes, having only the k in lower case would make for an inconsistent notation.

For these reasons, the IEC chose to write the kibibyte with a capital K, and many people use a capital K for the kilobyte and the kilobit as well. Note that a lower case k (i.e. kiB) for a kibibyte is not accepted.

As the writer of this article though, I am very much a purist when it comes to standards. Since the kilobyte and the kilobit both borrow from the SI standard, I think they should use a lower case k, regardless of application. However, this hasn’t been standardized, and you can use whichever you think is better. The review below uses a lower case k.

1 bit      (b)   = 1 bit          (b)
1 byte     (B)   = 8 bits         (b)

1 kilobyte (kB)  = 1000 bytes     (B)
1 megabyte (MB)  = 1000 kilobytes (kB)
1 gigabyte (GB)  = 1000 megabytes (MB)
1 terabyte (TB)  = 1000 gigabytes (GB)

1 kibibyte (KiB) = 1024 bytes     (B)
1 mebibyte (MiB) = 1024 kibibytes (KiB)
1 gibibyte (GiB) = 1024 mebibytes (MiB)
1 tebibyte (TiB) = 1024 gibibytes (GiB)

Higher prefixes can be seen here for the SI system, and here for the IEC system.
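If you want to play with the review above, here is a rough Python sketch of a formatter that prints a byte count in either notation. The function and list names are only illustrative, and it stops at tera and tebi like the tables do.

```python
SI_UNITS = ["B", "kB", "MB", "GB", "TB"]       # powers of 1000
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB"]  # powers of 1024

def format_bytes(count, binary=False):
    """Format a byte count with SI units, or IEC units when binary=True."""
    base = 1024 if binary else 1000
    units = IEC_UNITS if binary else SI_UNITS
    value = float(count)
    index = 0
    # Move up one unit at a time until the value fits or we run out of units.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {units[index]}"

print(format_bytes(65536))                       # 65.54 kB
print(format_bytes(65536, binary=True))          # 64.00 KiB
print(format_bytes(8_000_000_000, binary=True))  # 7.45 GiB
```

The last line is the store shelf situation in a nutshell: a device holding 8 billion bytes is 8 GB in SI terms, but only about 7.45 GiB.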

Kibibits, mebibits and so on also exist, meaning bit multiples at powers of 2, exactly like kibibytes, mebibytes and company. However, the IEC pretty much created these just for the sake of it, as they aren’t really useful. Maybe in the future 1024 bit architectures will be called 1 kibibit architectures, but most architectures are far from 1024 bit in any case.

Also, another notation exists for bits instead of b: literally using the full word bit, which stays invariable. The SI prefixes are the traditional k, M, G, etc. and the IEC prefixes are Ki, Mi, Gi, etc., which gives: Kibit, Mibit, Gibit. The IEC seems to particularly encourage this notation to further distinguish between a byte and a bit, which are traditionally told apart only by an upper or a lower case letter.
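As a small illustration of why spelling out bit helps, this Python sketch converts bit units into bytes. The lookup table and the function name are assumptions made for the example.

```python
BITS_PER_BYTE = 8

# Number of bits in each unit: SI prefixes at powers of 10, IEC prefixes at powers of 2.
BIT_UNITS = {
    "kbit": 1000,
    "Mbit": 1000 ** 2,
    "Gbit": 1000 ** 3,
    "Kibit": 1024,
    "Mibit": 1024 ** 2,
    "Gibit": 1024 ** 3,
}

def to_bytes(amount, unit):
    """Convert an amount expressed in a bit unit into bytes."""
    return amount * BIT_UNITS[unit] / BITS_PER_BYTE

print(to_bytes(1, "kbit"))   # 125.0
print(to_bytes(1, "Kibit"))  # 128.0
```

With the unit spelled out, there is no way to confuse the result with a kB or a KiB, which are eight times larger.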

Compute your head out of it; WolframAlpha

Sometimes, brilliant stuff happens on the Internet. What would you do without Charlie Bit my Finger on YouTube… Well, no, I mean really smart stuff, like Wikipedia or Google.

In my opinion, there is now a third product on the Internet that I’d classify as being as smart as Google or Wikipedia: Wolfram Alpha. Not a very marketable name, but hey, every geek knows Wolfram. Go there now, http://www.wolframalpha.com/, and input some mathematics, or anything. Take a look at the examples especially, because at first you might think you’re just too stupid to use it.

Well, I’m thinking this might just make math homework a lot easier.

USB Key—Permission Denied; What to do

Have you ever searched forums all night long just to find stupid answers like “Did you restart, check your drive letter, etc.” because your OS keeps telling you that it cannot write to or format your USB key because permission was denied? If so, here’s a solution you may never have thought of:

Check the physical body of your USB key; maybe there’s a write protection switch turned on. Yes, like SD cards, some USB keys are uselessly equipped with that hardware feature. And by uselessly, I mean that it causes more problems than it solves, because it can become loose, or you could turn it on by accident.

Cause you probably searched for it—Windows XP Tip

Yes it’s about XP. It’s old, but a lot of people still use it, and sometimes there’s some stuff that’s really deep in there. So, here’s the story.

At work, I had this very annoying default that every Explorer window would open with: the folders view turned on. And there was no way to change it. Furthermore, turning off the folders view made any folder open in another window, with the folders view on.

So, here’s how, in case you searched for it:

> Explorer Window > Tools > Folder Options
>  File Types > Folder > Advanced

And set the default action to open instead of explore.

Right, not too hard, but definitely an awkward place for such a setting.