VP8 (WebM) vs H.264 (MP4) – August 2010

While making a chart of the WebM and H.264 bitrates everyone should use when encoding acceptably high-quality videos for the web, I obviously ended up needing to compare the codecs.

After various tests, I concluded that both codecs are approximately equal, except that one comes out superior in the end. Therefore, I can use the same set of bitrates for the H.264 and WebM versions of the same media. Read on to find out.

Lucky you! Windows 2K Still Gets Updates After End of Support

As you all know, July 13, 2010 was the last day of support for Windows 2K and Windows XP SP2 (SP3 is still supported until April 8, 2014, which is just simply WTF).

OK, it’s a bit of an overstatement to say that you will still get updates with Windows 2000 Professional today, since it’s gone out of support.

Essentially, what I mean is that if you’re installing Windows 2K, you can still update it to its latest iteration. Now of course, don’t forget new security issues will never be fixed by Microsoft, so once you’ve updated to all the latest, that’s the last you’ll ever hear of Windows Update on Win 2K.

Enjoy your lasting past.

Moving to Windows 7 and sad to see your good old sounds go? Simply search for Change system sounds in the Control Panel and change them with these, the original Win 2K sounds, courtesy of myself: Win2Ksounds

Learn Japanese The Right Way

This is a tale of my experience learning Japanese and how you can make the best of it to enhance your own experience.

What makes me in the position to make such a bold statement as to how to learn Japanese?

I started learning Japanese in the second year of high school. That was sometime between the end of 2001 and the start of 2002. In short, I could safely say I started learning Japanese on my 13th birthday in October, which means that at the time of writing this article, soon turning 22, I will have been learning Japanese for close to 9 years.

In all of those years, I have not mastered Japanese, which is a short way of saying both that no learning solution has been adequate to date, and that I have learned Japanese in a very slow-paced, organic fashion.

The reason I can make such a bold statement about learning Japanese is that I’ve tried, with both success and failure, probably all the ways one could think of for learning a language.

Among these, you’d find courses, textbooks, dictionaries, kanji books of all sorts, online interactive courses, Rosetta Stone, etc.

I’ve spent a significant amount of money on these solutions and I can say none is quite adequate as a complete solution. Read on to find out about the biggest traps and losses of time while learning Japanese.

You should learn Romaji, but not too much

If you dive into Japanese learning material, you’ll eventually meet some people who will tell you to never use romaji. In fact, they will tell you straight out that learning romaji along with Japanese is just an inhibitor to your learning process.

While absolutely true, the reality is you still have to learn romaji anyway. Japanese written in our alphabet is called romaji. The word Sushi is a typical example of romaji used in daily English. The thing is, unlike the imported word Sushi, actual romaji, which can represent any Japanese content, is specifically pronounced in a Japanese way.

While learning romaji certainly won’t help you learn to pronounce Japanese, it’s used as a primary way to input Japanese on computers. You would be depriving yourself of an easy way to write Japanese if you were to completely forgo learning romaji. Additionally, learning to pronounce Japanese in romaji can also help you learn to pronounce English the Japanese way, or the wrong way Japanese use for imported words. Considering the surprising amount of English-imported words in Japanese, nearly all new vocab since WW2, romaji will also help you get a fundamental understanding of how Japanese import English words into their language.

This could seem rather useless, but on the contrary, it will help you learn English-bound Japanese vocabulary faster, and it will help you understand Japanese people pronouncing foreign items in Japanese right in your home country. For example, I could not have guessed the sushi chef meant Carlton University when he asked me if I studied at Karuton university. It might seem like an easy thing to get, but it takes training to perceive such a word at fast, regular, native conversational speed, which is exceptionally fast in comparison to the pace of English speech.

Don’t learn Japanese with romaji

While you should definitely learn romaji, one of the common pitfalls of learning Japanese you should avoid like the plague is learning Japanese with romaji.

All this will do is hamper the speed at which you can learn actual written Japanese, which is composed of linguistic concepts simply impossible to represent in romaji. Many, many things are ambiguous in romaji and could lead you to learn bad pronunciations and bad ways to write a word in kana (the Japanese syllabic scripts, hiragana and katakana), further slowing your intake of the language.

Keep in mind you could always learn Japanese with romaji and only learn to speak. Doing so won’t prevent you from learning Japanese, but it will take you a lot more time to eventually learn all of Japanese. I should know: it’s the path I took, and it’s a major spoke in your wheel.

Avoid kanji books

Quite frankly, kanji books, in practically all of their forms, are just like dictionaries.

Trying to learn Kanji like that is like trying to learn French words one by one with no context other than a definition.

Kanji, in Japanese, aren’t just an alphabet and should not be learned by heart. They form literal vocabulary in context and learning them by heart is the most inefficient way to learn them.

Let me repeat myself here. Kanji are not an alphabet, no matter what people say to you. They are words and parts of words.

There are a few basic kanji you can remember that will help you understand and properly look at more complex kanji, but just like there is no trick in learning vocabulary other than to use it, there is no trick in learning Kanji.

The best way to learn kanji is to learn them through their primary function: as a reading device.

The best textbooks you will find will gradually introduce kanji right in the text and tell you about them and their associated use in context, just like you would learn English by reading.

Additionally, Japanese is much easier to read with kanji, because they bring precision and structure to the text. Without kanji, Japanese would be an excessively ambiguous language to write. While learning Japanese, you’ll discover that reading without kanji is an extremely confusing and painfully structure-less experience.

Learn without translating

One of the most common mistakes while learning languages is the act of translating.

For instance, learning words by comparing them with their equivalent in your own language is the most inefficient way to learn fluent speech.

What you have to do is learn to identify things and speak about things by thinking in Japanese. This is why reading is also so important in order to learn katakana and hiragana, Japanese’s two phonetic alphabets, because you have to memorize the sound they represent, and not the English romaji equivalent they represent. Reading will force you to do so because there is no other way to improve your reading speed. You might read like a child at first, but regular practice will solidify your memory of these phonetics. The same goes for kanji and their meaning.

In the same regard, visual language learning solutions like Rosetta Stone provide an excellent way to learn many concepts in a native way, rather than by translating. By doing so, you’ll be learning to speak just like you learned as a child, breaking the oft-cited barrier to language learning that is adulthood.

However, using Rosetta Stone and the like should not mean forgoing good grammar study. Just like in any language, grammar is fundamental to mastering a language. Some people, notably Rosetta Stone, will try to sell you their lack of grammar tools by saying no child learns grammar to learn a language. That is correct, and grammar is in no way a natural aspect of language learning, so you shouldn’t go overboard with it, but literate people, including educated nations’ children, learn grammar.

Grammar is not a learning tool

Contrary to popular belief, grammar will not help you learn to speak. Grammar does not occur naturally. Languages are rather defined by a list of exceptions and accepted uses in varying contexts, which humans refer to as grammar.

Because grammar is so intricately unnatural by definition, learning it won’t really help you since languages don’t follow any given logic perfectly.

Rather, the best learning tool is context and use cases. Again, this stresses the importance of reading, which will provide you with an array of valid and often well written use cases of the language. Trying to understand the subtleties of a language’s grammar and structure won’t help you learn to speak fluently and write correctly.

As a proof of the preceding statement, ask yourself this question: have you ever thought about the grammar of what you were reading just now, or what gossip you were telling your friend over the phone yesterday?

Chances are you answered no, because the use of grammar is not a natural occurrence in a language. It is simply humans’ attempt at defining our languages.

Every language has two grammars

When referring to grammar as the set of rules and exceptions defining a language, it’s observable that every language has two grammars.

One of them is the formal, correct, or written grammar, and the other is the incorrect, or spoken grammar. The problem with textbooks, or even Rosetta Stone, is that they only focus on correct grammar.

However, speaking remains an essential aspect of any language and forgoing learning the bad spoken grammar can mean you’ll never be able to understand spoken Japanese.

So, in your oh so important curriculum of reading practice, you should also add listening to the spoken language, like Japanese television shows. If you’re an Anime fan, you could watch Anime and read Manga, although many manga use a speech-like grammar, so be aware that reading traditional texts is also essential here.

This will help you train your ear to the spoken language, as well as help your pronunciation greatly. Many university students I met who took Japanese courses were impossible to understand because of their thick accent and major pronunciation mistakes. Of the students who did actually speak well, all of them actively watched Japanese anime and drama. Lots of them also listened to and sang Japanese music. In fact, if you’re into music, singing can be a really good way to learn to pronounce. Additionally, you will be exposed to faster speech and lots of native pronunciation variations by doing so.

In other words, if you don’t include proper spoken material in your Japanese learning curriculum, you’ll always be a lousy speaker.

No, it’s not correct if you speak a language with an accent from another language. It just proves you didn’t learn the language as you should have.

I address this warning especially to English speakers, who, because of the difference between Japanese and English sounds, will have a much harder time pronouncing the language than, say, a French speaker.

Also watch out for non-native university teachers teaching you the wrong pronunciation. Heck, native English teachers even make mistakes while teaching English, which explains the proliferation of words like Template being pronounced as tem-pleyt instead of tem-plit.

Japanese luckily does not suffer from such ambiguities, so simply having a native speaker say it for you should be enough.

Conclusion

In the end, there’s nothing like practice and exposure to a language, but I hope my advice will help you choose the best material and avoid the worst.

As always, just remember exposure to real written and spoken Japanese material is your best bet at advancing your skills.

To help you in your quest, I advise you to pay a visit to jisho.org and smart.fm. Kodansha also makes excellent books like the Communicative English-Japanese Dictionary and Japanese for Busy People. Rosetta Stone is also a good place to start, and all the speech is from native Japanese speakers, unlike many other audio-based solutions like Rocket Japanese and JapanesePod101, both of which you should avoid. You might also want to check the Rikaichan browser add-on, which can help you read Japanese online, although it’s only useful if you’re already fairly comfortable with the language. Books like Japanese for Busy People, notably the kana version of the first volume, will be better as initial reading material.

None of these will provide a complete guide to the language. Instead, use them in conjunction with native material like books and television shows and you’ll be on your way to speaking and reading fluent Japanese.

Google Pacman

Today Google released probably the coolest doodad you can make for a logo. It’s a JavaScript version of Pacman, along with the original sounds, although these use Flash.

I’ve managed to save a local copy of the game for the sake of keeping such a cool piece of JavaScript code. I hope Google or Namco don’t mind. In any case, it doesn’t run directly on my server, for reasons I still don’t know (I have to go through minified JavaScript from Google and it’s a real puzzle), but this copy should manage to run from your hard drive.

I’ll probably be releasing the JavaScript code in human-readable form soon so that we can all learn from this cool piece of JavaScript. Additionally, with the <audio> element, HTML now has a sound API (kinda), so I believe it would be possible to implement this game without having to use Flash at all. Google didn’t do this, but it’s understandable as the game works on IE as well. A pure HTML5 solution probably wouldn’t be so cross-browser for now. With the launch of WebM however, I have a vested interest in exploring the possibilities.

Google decided http://www. was too complicated

In a very surprising move, Google just touched what no one dared to touch before, what actually appears in the address bar.

Well, actually Microsoft did decide to touch it once, by graying out everything that isn’t part of the domain in the address bar when you’re not typing and your mouse isn’t on the address bar. Google copied them with Chrome but botched the whole goal: it removed the un-graying action when you’re actually typing something and left the whole domain name, sub-domain included, in full black. That makes the whole exercise pretty worthless, since it’s technically supposed to help people see when they are on phishing web sites.

For example, a fake PayPal web site would look like this in both scenarios:

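IE 8 http://paypal.fake.com/something/blabla.ext (only fake.com in black, the rest grayed out)
Chrome http://paypal.fake.com/something/blabla.ext (all of paypal.fake.com in black, the rest grayed out)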

I thus think it’s very clear for the end user of IE 8 that they’re not on paypal.com but on fake.com, whereas the distinction isn’t so clear in Google Chrome. Note that Firefox, Opera and Safari provide absolutely no such clues to inform users on the validity of a web site. As a reader of this blog, you might think that having highlighting is unnecessary and that even without it anyone can tell if it’s a fake address, but you’d be surprised at how frequently and how easily Internet users get fooled, even on IE 8. Actually, there’s a large percentage of users who don’t even know what a sub-domain is.
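To make the highlighting idea concrete, here is a minimal Python sketch of how an address could be split into the part worth emphasizing and the sub-domain in front of it. The function name split_for_highlight is hypothetical, and the "last two labels" rule is a simplifying assumption of mine: real browsers rely on a public suffix list so that domains such as example.co.uk are handled properly.

```python
from urllib.parse import urlsplit


def split_for_highlight(url):
    """Return (sub_domain, registered_domain) for a URL.

    Assumption: the registered domain is simply the last two labels
    of the host name. This is wrong for suffixes like .co.uk.
    """
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    registered_domain = ".".join(labels[-2:])
    sub_domain = ".".join(labels[:-2])
    return sub_domain, registered_domain


sub_domain, domain = split_for_highlight(
    "http://paypal.fake.com/something/blabla.ext"
)
# `domain` is the part IE 8 keeps in black; `sub_domain` is what a
# phishing site uses to look legitimate.
print(sub_domain, domain)
```

With a split like this, graying out everything except the second value is what lets a user spot that the site belongs to fake.com and not to PayPal.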

But in Google’s case, I tend to believe they highlight the address that way to favor quick identification instead of doing it for security purposes. Considering Google’s practical aim for speed, efficiency, and, well, search, it’d make a lot of sense for them to think that way, especially given the efforts they have gone through to simplify the address bar. Google is the first to combine the search and the address bar by default, and Chrome is also unique in being the first browser to prioritize direct results (e.g. engadget.com) while typing in the address bar instead of giving the most recent history item (e.g. www.engadget.com/2010/04/13/editorial-engadget-on-microsoft-kin/).

What Google Did

Starting with Chromium 5.0.377.0 (I think; I noticed it after installing this release, which was a jump from 370, so the change landed somewhere in between), Google decided that the famous but confusing http://www. was no more. Instead, it’s now stripped from the address bar on whatever web site you go to.

So, what’s up with that and why is it so important? Keep reading to find out.

History of the “www”

WWW, or World Wide Web, is, well, the Internet as we know it today. Back then, the FTP and NNTP protocols used prefixes like ftp:// and nntp:// in order to define what they were. They still do today, but the practice of using such prefixes to define services was carried over to the WWW.

However, because web sites weren’t transferred over a www protocol but over the http protocol, which they still are today, it was impossible to prefix a web site with www://. It had to be http://, because of its underlying protocol. Web sites on the Internet, however, came to be known as a whole as the World Wide Web, or WWW, and not HTTP. And so, in order to market web sites as being web sites, the generic www sub-domain was prefixed to practically every web site in existence.

When web sites were just a new thing, the practice of prefixing addresses with www was meaningful. Anything preceded by www on a business card for instance automatically referred to a web site.

They could have written http, but web browsers were quick to implement automatic http:// input before the web even became mainstream. Given that the HTTP protocol had to be written http://, www was much more elegant, hence its popularity in the mainstream market.

www is actually a sub-domain

But in reality, www is actually a sub-domain. Contrary to popular belief, there is no real difference between www, ww2, www2, blablabla and somethingelse. If any of these precede a web site, they’re a sub-domain. Some people are so used to www they think they have to put it in front of actual sub-domains, resulting in ugliness like www.sub-domain.domain.com. Yes, this is actually a dual sub-domain, and believe it or not, famous web sites like deviantART implement it to avoid cases where users would type www.user.deviantart.com and come up with an error instead of the user page.

However, a sub-domain is distinct from its domain, so http://www.pacoup.com and http://pacoup.com actually don’t necessarily point to the same web site. In practice though, http:// and http://www. usually point to the same web site, with one often redirecting the user to the correct sub-domain or domain. That is the case for most dynamic web sites, which require a consistent web domain in order to function properly, this blog being no exception.

In fact, you can test it yourself. Simply type www.pacoup.com and press enter, and you’ll find that you’ve been redirected to pacoup.com without the www. In most cases where a web site works on both domains, it’ll be a static web site.

Arising problems because of uneducated admins

In the old days of the web, www became so well-known that a significant number of web sites didn’t work at all without www before the address. In fact, the host didn’t even support accessing the domain without the www sub-domain, which led to the false belief that web addresses must be preceded by a www. Heck, the ability to access a web site with or without the preceding www even became a feature on some low-tech-consumer-targeting hosts such as 1and1 and still is today.

Unfortunately, major web sites, like practically all the Government of Canada web sites, and a myriad of smaller web sites don’t work without the www. Even the USA’s CIA web site doesn’t work without it, because its SSL certificate is attached to www.cia.gov and not cia.gov.

Fortunately, in the case of Google Chrome, it redirects you to a Google Search page proposing the correct address. Badly set up SSL web sites like cia.gov however throw you a very pretty “dangerous web site” warning.

To www or to not www?

I’m very much in line with what Google thinks: www should go. The prefix www is inherently wrong so you should not use it. It’s not questionable, it’s just the way it’s made. Referring to your main web site via a sub-domain is just plain stupid, even though the industry’s been doing it for ages.

For the sake of compatibility, you should include a redirecting www sub-domain for the uneducated masses out there, but you should never default your web site to www. Don’t redirect users from http:// to http://www., redirect them from http://www. to http://. That’s the way it’s meant to be, and in fact, the first web site to ever exist didn’t use any www prefix (although it was a sub-domain, namely nxoc01.cern.ch).
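The redirect direction described here can be sketched in a few lines of Python. This is only an illustration: the function name www_redirect_target is hypothetical, and an actual site would normally do this in its web server or application configuration by answering requests for the www host with a permanent redirect to the address computed below.

```python
from urllib.parse import urlsplit, urlunsplit


def www_redirect_target(url):
    """Return the www-less URL to redirect to, or None if none is needed.

    Only a leading "www." label is removed; other sub-domains are kept.
    Any user name or password in the URL is dropped in this sketch.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.startswith("www."):
        return None
    bare_host = host[len("www."):]
    # Preserve an explicit port if the original URL had one.
    netloc = bare_host if parts.port is None else f"{bare_host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


print(www_redirect_target("http://www.pacoup.com/about?x=1"))
print(www_redirect_target("http://pacoup.com/about?x=1"))
```

Returning None for an address that is already www-less keeps the rule one-way, so the site never bounces visitors back and forth between the two hosts.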

Is there confusion because you didn’t include the www? No, frankly, people today recognize .com, .net, .org, or .anything for that matter. Advertisers discovered it was more efficient to forgo a prefix before their company name. www.microsoft.com/silverlight is not as easy on the eyes as microsoft.com/silverlight, and the latter provides the added benefit of providing the reader with the name of the company first instead of a generic “www” marking.

How Chrome handles it

In all actuality, Google did not remove the entire www sequence of web sites; they just removed the protocol sequence. In fact, it would be suicide to remove any www, because let’s not forget it’s a sub-domain. Such a rule would also be bound to remove regular sub-domains, which are very important.

But on web sites where Chrome detects it makes sense, such as www.engadget.com or www.techcrunch.com, the www is also stripped out. I don’t know how this is figured out, but it looks like it relies on a few hints, such as whether the web site is powered by WordPress, etc.

Chrome is also smart in that it only hides the http sequence. Copying any address from the bar will copy the complete address, http sequence included.
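The difference between what is shown and what is copied can be illustrated with a short Python sketch. This is not Chrome’s actual code: the function name display_url is hypothetical, it only covers the protocol part (not the site-specific www stripping mentioned above), and leaving other schemes such as https:// visible is my assumption.

```python
def display_url(url):
    """Return the address as it would be shown in the address bar.

    Only a leading "http://" is hidden; the underlying URL is untouched.
    Assumption: every other scheme stays visible.
    """
    prefix = "http://"
    if url.startswith(prefix):
        return url[len(prefix):]
    return url


full_address = "http://paypal.fake.com/something/blabla.ext"
shown = display_url(full_address)  # what the user sees
copied = full_address              # copying still yields the full address
print(shown)
print(copied)
```

Keeping the full address as the stored value and deriving the shortened form only for display is what makes the copy behavior come for free.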

Simplicity Rules

In other words, Google has decided to push the simplicity of the web even further by getting rid of the www. Google Search still runs on the www, but I’m guessing changing this one is a bit more complicated considering the size of it. If this Chrome thing sticks though, I think it’s easy to see a very close future where Google.com comes without the www.

Advice for Web Masters

  1. Redirect www to the raw http:// address
  2. Never refer to your web site by including the word “www”
  3. Keep a www sub-domain for compatibility reasons
  4. Teach your friends and family to type addresses without the http or www sequence
  5. Contact the owners of web sites that don’t provide www-less access with a polite email or letter of advice
  6. Tweet a thank-you to web sites like arstechnica.com that use the correct web address syntax

What’s the ideal video quality for Theora?

I’ve run a few tests on --videoquality encoding with Theora via ffmpeg2theora, which I think is the best way to encode OGV media. Earlier tests conducted with ffmpeg2theora 0.25 reveal that 2-pass target-bitrate encoding with Theora is less efficient than videoquality encoding, which is a one-pass, constant-quality, variable-bitrate encode.

This might be surprising if you come from an H.264 encoding background, where quality-based encoders are almost nonexistent, but it’s quite the buzz with Theora. I could get the best quality encode possible on a somewhat motion-intensive animation with only 4.4 Mb/s.

Here’s the result of an intensive scene taken from an AMV that will appear in February at the G-AMV 2010 contest made by AMV-Canada. I’m the president of AMV-Canada and I do pretty much all of the grunt work of web encoding for the moment, so I was testing OGV encoding for our HTML 5 Open Video technology.

This scene is a very intensive scene which goes up to 7 Mb/s in VQ 10. What you have to look for is discrepancies in the screenshots from VQ 9 down to VQ 0 compared to VQ 10. The video is 740 x 410 pixels, which is a close approximation of what you can expect for an average SD video.

Benchmark

VQ 10

Size 99.03 MiB
Average Bitrate 4499 kb/s

VQ 9

Size 78.35 MiB
Average Bitrate 3589 kb/s

VQ 8

Size 61.41 MiB
Average Bitrate 2770 kb/s

VQ 7

Size 48.91 MiB
Average Bitrate 2255 kb/s

VQ 6

Size 39.79 MiB
Average Bitrate 1800 kb/s

VQ 5

Size 31.43 MiB
Average Bitrate 1415 kb/s

VQ 4

Size 24.52 MiB
Average Bitrate 1117 kb/s

VQ 3

Size 19.12 MiB
Average Bitrate 859 kb/s

VQ 2

Size 15.19 MiB
Average Bitrate 682 kb/s

VQ 1

Size 11.40 MiB
Average Bitrate 512 kb/s

VQ 0

Size 8.86 MiB
Average Bitrate 398 kb/s

Analysis

What you can observe is that the codec, in general, doesn’t start to drop in perceptible quality until VQ 8, and it’s still very subtle. VQ 8 is ideal for high quality encoding with minimal space.
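To put a number on the "minimal space" part, the average bitrates from the benchmark can be compared against VQ 10. The following Python sketch does just that; the figures are copied from the benchmark, while the names AVERAGE_BITRATE_KBPS and relative_to_vq10 are only illustrative.

```python
# Average bitrates in kb/s, copied from the benchmark figures.
AVERAGE_BITRATE_KBPS = {
    10: 4499, 9: 3589, 8: 2770, 7: 2255, 6: 1800, 5: 1415,
    4: 1117, 3: 859, 2: 682, 1: 512, 0: 398,
}


def relative_to_vq10(vq):
    """Return the average bitrate at `vq` as a fraction of the VQ 10 bitrate."""
    return AVERAGE_BITRATE_KBPS[vq] / AVERAGE_BITRATE_KBPS[10]


for vq in sorted(AVERAGE_BITRATE_KBPS, reverse=True):
    print(f"VQ {vq:2d}: {relative_to_vq10(vq):.0%} of the VQ 10 bitrate")
```

For this particular clip, VQ 8 comes out at roughly 62% of the VQ 10 bitrate, which is where the space saving comes from. Other sources will give different ratios.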

VQ 7 starts to show the noticeable blocking which often plagues Theora videos. It’s very particular and while not exactly in your face, it’s always sort of there in lower quality Theora videos. VQ 7 might be better suited to high quality web streaming.

VQ 6 and VQ 5 on the other hand fit perfectly in what the average H.264 encode looks like in terms of bitrate. While of substantially lower quality than H.264 at the same bitrate, both choices remain watchable and ideal for the web.

From VQ 4 and down, the codec starts to lose its ability to define straight lines and makes text increasingly hard to read. VQ 1 and VQ 2 then lose any ability to define detail and text becomes unreadable in most cases.

You can also enable --optimize in ffmpeg2theora, which makes encoding slightly slower and increases quality a bit. There’s definitely less blocking, but the difference is so minimal that with side-by-side comparison it’s often hard to tell which is actually better. So while this might be a must when encoding, it’s definitely not worth re-encoding your whole library for it.
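For batch encoding, the command line can be assembled from a small Python helper. This is a sketch under a few assumptions: the helper name ffmpeg2theora_command and the file names are made up for illustration, and it only builds the argument list using the --videoquality and --optimize options discussed here plus -o for the output file. It does not run the encoder.

```python
import shlex


def ffmpeg2theora_command(source, output, video_quality, optimize=False):
    """Build an ffmpeg2theora argument list for a constant-quality encode."""
    if not 0 <= video_quality <= 10:
        raise ValueError("video quality must be between 0 and 10")
    command = ["ffmpeg2theora", source, "--videoquality", str(video_quality)]
    if optimize:
        command.append("--optimize")
    command += ["-o", output]
    return command


# Hypothetical file names; VQ 8 is the high-quality setting from the analysis.
command = ffmpeg2theora_command("input.avi", "output.ogv", 8, optimize=True)
print(shlex.join(command))
```

The resulting list can be handed to subprocess.run(command, check=True) on a machine where ffmpeg2theora is installed.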

Opera on Mac gets new makeup; From 9.6 to 10.5

I’m seriously happy about Opera right now. Not only have they succeeded at keeping up with the technical updates, their browser is now finally another proud owner of a native shell on Mac.

This whole native shell movement really makes my eyes happy. After all, we did go from 9.6 to 10.5:

From the left to the right, Opera 9.6, Opera 10.1, Opera 10.5 Pre-Alpha for Labs.

10.5 also has a nice addition, smoothly animated tabs à la Safari. We might see additional changes by the time this hits release, but it really makes me want to use Opera more. (Note: Opera 10.5 has other interface niceties not covered here, but you get the gist of it: more animation and coolness, including for the first time being able to tab through checkboxes on Mac.)

A note on speed

Opera Software proved once again that sheer technical expertise can surpass open source communities, as well as Google, for that matter. On my computer, I got 393.2 ms for Opera 10.5 and 428.4 ms for Chrome 4 in the SunSpider JavaScript Benchmark, and it’s apparently even faster on Windows but I haven’t tested that yet. Amazing.

Learn Java the Awesome Free Way

In my new career as a Java developer, and being French, I’ve been using siteduzero.com’s Java tutorial. Actually, that’s just when I didn’t want to get my big Deitel and Deitel book out, but seriously, forget about SiteDuZero’s crap. I’m really sorry for the author, but as a message to you: “Your code sucks”.

All that said, if you want a serious and reliable source to learn Java, just try Java Tutorials! It’s an official publication from Sun, it’s free, and it’s more than complete (check out that massive index). You can even buy it in book form.

The only hiccup with this is the coding style. While K&R style might be the Java convention, it might not be your cup of tea. I know, I’m allergic to K&R too and I prefer Allman. If you really can’t look at Sun’s code, there’s always Deitel and Deitel’s Java books.

Note: This does not cover Java EE. For that go to the Java EE 5 Tutorials (lol, it’s amazing really, they have it for like everything).