The confusing world of Newlines – Why your text file may have become a single line

If you’re in the web business, you’re bound to have come across this one: you open a text file and it shows up entirely on one line, or as some kind of gibberish.

While a lot of it can be attributed to localization and the use of incorrect encodings for languages, we’re gonna talk more specifically about Newlines here.

In simple terms, each system has its own way of marking the end of a line in a text file, using special invisible characters. Windows uses CR+LF (a carriage return, byte 0x0D, followed by a line feed, byte 0x0A), UNIX a lone LF and Mac a lone CR.

Now, Mac OS X is based on UNIX so the CR-only format doesn’t apply to newer Macs, only to pre-OS-X systems. Mac OS X uses a single LF for a new line.
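If you’d rather check than guess which convention a file uses, counting the raw bytes is enough. Here is a minimal sketch in Python; the function name `count_newline_styles` and the sample bytes are made up for illustration.

```python
def count_newline_styles(data: bytes) -> dict:
    """Count how often each newline convention appears in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CR+LF pair contains one CR and one LF, so subtract the pairs
    # to be left with the lone LFs (UNIX) and lone CRs (pre-OS-X Mac).
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"CRLF": crlf, "LF": lf, "CR": cr}


# Hypothetical sample mixing all three styles.
sample = b"one\r\ntwo\nthree\rfour\r\n"
print(count_newline_styles(sample))  # {'CRLF': 2, 'LF': 1, 'CR': 1}
```

For a real file, read it in binary mode (`open(path, "rb").read()`) so that Python doesn’t translate the newlines before you get to count them.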

So, if somebody gives you a text file made on a Linux machine or a Mac, chances are Notepad will show it all on a single line. Some upload tools automatically convert the file to the destination system’s format, and since your web server probably runs Linux, a file you upload and later download again may well come back with LF newlines and open as a single line.

A very good example is opening Engadget’s RSS XML file in Notepad. It’s pretty ugly. Now, this doesn’t mean Engadget is working on UNIX-based boxes; rather, that XML file was probably generated automatically by some kind of PHP script running on none other than a very common Linux server.

Open it in a proper text editor, say, Notepad++, and you’ll see it in all its glory (and you can confirm it uses LF: the status bar at the bottom shows the file’s line-ending format).

The solution?
There is none.
Yeah… unfortunately nobody has made one format the standard yet, so instead you’ll need to use software that supports all the formats. Luckily, just about every piece of software on earth that can handle raw text, even FrontPage and Wordpad, recognizes all formats; just not Notepad. With a bit more bad luck, however, UNIX-based servers aren’t as versatile as those big software suites and may read the extra CR in Windows-encoded files as crap, resulting in mysterious application errors and whatnot.
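To see how that extra CR turns into crap, here is a small hedged sketch in Python. The two parsers and the `name=value` settings format are invented for the example; they just stand in for any program that assumes LF-only newlines.

```python
def parse_naive(text: str) -> dict:
    """Split on LF only, the way an LF-only program might."""
    return dict(line.split("=", 1) for line in text.split("\n") if line)


def parse_tolerant(text: str) -> dict:
    """str.splitlines() treats CR+LF, LF and CR all as line breaks."""
    return dict(line.split("=", 1) for line in text.splitlines() if line)


# Hypothetical settings file saved with Windows newlines.
windows_text = "name=value\r\nmode=fast\r\n"

print(parse_naive(windows_text))     # values keep a trailing "\r"
print(parse_tolerant(windows_text))  # values come out clean
```

With the naive version, `"mode"` maps to `"fast\r"` rather than `"fast"`, so a later comparison against `"fast"` quietly fails. That is the kind of mysterious error a stray CR tends to cause.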

The moral of this story is to choose your software carefully.

  • Linux Server = LF capable editor (Notepad++, Dreamweaver)
  • Windows Server = Anything really; your server’s not going to panic if the code is on one line, except if it’s Ruby. Try to use Windows stuff in this case: any Microsoft software fits the bill, as well as Notepad++ and Dreamweaver.
  • General Public = Windows-compatible. The big majority of users are still on Windows, so don’t go publishing raw-text files with UNIX newlines in them unless you want 95% of the planet to be unable to read your files. (Remember, .txt files open in Notepad by default, not Wordpad, so you don’t really have a choice but to accommodate Notepad.)
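If your editor won’t do the conversion for you, a few lines of script will. The sketch below is one possible approach in Python, not the only one; `convert_newlines` is a made-up helper name. It first collapses everything to LF, then rewrites to whichever target you pick: CR+LF for the general public, LF for a Linux server.

```python
def convert_newlines(data: bytes, target: bytes = b"\r\n") -> bytes:
    """Rewrite every newline in `data` as `target` (CR+LF by default)."""
    # Collapse to LF first. CR+LF must be handled before lone CR,
    # otherwise each Windows newline would turn into two line breaks.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", target)


# Hypothetical input mixing UNIX, Windows and old Mac newlines.
mixed = b"a\nb\r\nc\rd"
print(convert_newlines(mixed))          # Windows-friendly, for Notepad
print(convert_newlines(mixed, b"\n"))   # UNIX-friendly, for the server
```

Work on bytes read and written in binary mode (`"rb"` / `"wb"`); in text mode Python may translate newlines on its own and undo the conversion.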