Robert Speer Web Development: Symfony, PHP, Wordpress, Business Analysis

16 Tips for Multi-Language Websites

Most of us in web dev are familiar with building CMS-based websites, but adding languages is only an occasional task. The good news is that multiple languages don’t have to be very hard. The bad news is that each added language increases the complexity of managing content, and translations done poorly are a budget killer.

Most CMS platforms, like Drupal, WordPress, & ExpressionEngine, can handle translations in some capacity. When I did my research I found that my preferred CMS, Apostrophe 1.5, did a good job, and I was happy with the results. However, my best advice is to pick a good team and do what they suggest… after verifying, of course. If you’re looking for an experienced team with competitive U.S. rates, please contact me to discuss. Even if I can’t help you, I’ll probably be able to point you to someone who can.

Web Site Translation Terms:

Web Site Translation Gotchas:

  1. The Russian language is the Ivan Drago (Rocky IV) to your layout. It’s going to come in and blow stuff out just because it’s bigger.
  2. Every language is likely to need some tweaking; mitigate this with a layout that has a high tolerance for variable-length content.
  3. Facilitate the language-specific changes by appending the language code to the body tag’s class. This allows CSS rules to be written for specific languages. Example: <body class="sinatra en"> OR <body class="sinatra ru">
  4. Each alphabet is going to need its own set of fonts. Very few fonts have every character of every alphabet. Seriously, don’t ignore this. If you do, you’ll be wondering why everything looks different in IE9, and I’ll forward you this URL and be a jerk about it. Examples of different alphabets would be Cyrillic for Russian, the Greek alphabet, & the various Asian character sets.
    1. [update Dec 11, 2012] A good reference for Chinese Fonts: “… recommendation for a sans-serif font stack is: font-family: arial, 黑体, 微软雅黑, 宋体, sans-serif;”
  5. Images:
    1. Avoid text in images
    2. Avoid images in the CSS files that change based on language
    3. Depending on the CMS, image captions may be difficult to translate
    4. Often images will need to be re-entered into the CMS for each language, which is not fun
  6. Video:
    1. Avoid embedded text; it’s expensive to re-edit videos
    2. Make sure you plan on translating any voice-overs
    3. You may find that services like YouTube.com & Vimeo.com are not available in all countries, such as China
  7. Arabic & other right-to-left languages are tough; if you don’t need them, call that out to help lower your estimate
  8. Start looking for a translator service ASAP, they can be a PITA, perhaps worse than developers ;)
  9. Try to get your translators to use the CMS for translations, that way they are more likely to fit the space provided & context. Plus you don’t have to migrate the content from one format to another and deal with the inevitable human errors in languages you don’t know.
  10. Make sure each language has its own URL & that the language metadata is set correctly, so that search engines like Google will index the site correctly
  11. You’ll need to decide if each translation represents a language, a country, or a culture. This is so you can use the correct ISO code in your URLs and database metadata. ISO code resources are linked inline above.
    1. Language codes identify the language only, regardless of where it is spoken
    2. Country codes are just for the country and imply a site whose content is also tailored to that country
    3. Cultures are a combination of country & language (British English, American English). You may prefer cultures if you’ll have different content for different regions that use the same language; this is L10n (localization).
  12. Content change monitoring across languages is going to be tough. Most CMSs don’t log when content is changed, so you have to keep track of where each language stands, for example when you’ve changed the English version but not the Spanish version.
  13. Most of your problems are going to be in CSS; a frontend developer experienced in multi-language websites will help the project go smoothly.
  14. Testing is a MUCH bigger problem. Instead of all pages in all browsers, it’s all pages in all browsers in ALL LANGUAGES. And the types of issues that come up require eyeballs & discipline to find.
  15. Avoid flags or other symbols of nationality, and make sure to respect things like Simplified & Traditional Chinese, which imply mainland China and Taiwan respectively. Both sides may have hurt feelings when you mix them up, so don’t be rude.
  16. Google Translate is not very reliable, but it looks good in a pinch.
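
Tips 3, 10, and 11 fit together: once you have settled on a language or culture code, the body class and the URL prefix can both be derived from it. Here is a minimal sketch in Python of how that might look. The names `Culture`, `parse_culture`, `body_class`, and `url_prefix` are made up for illustration and do not come from any CMS, and the sketch assumes codes are either a bare language ("ru") or a language plus country ("en-GB" or "en_GB"); longer tags with script parts are not handled.

```python
from typing import NamedTuple, Optional


class Culture(NamedTuple):
    language: str            # language code, e.g. "en"
    country: Optional[str]   # country code, e.g. "GB", or None if not given


def parse_culture(code: str) -> Culture:
    """Split a code such as "en", "en-GB" or "en_GB" into its parts.

    Assumption: at most one country part follows the language part.
    """
    parts = code.replace("_", "-").split("-")
    language = parts[0].lower()
    country = parts[1].upper() if len(parts) > 1 else None
    return Culture(language, country)


def body_class(theme: str, culture: Culture) -> str:
    """Build the body class from tip 3: theme name plus the language code."""
    return f"{theme} {culture.language}"


def url_prefix(culture: Culture) -> str:
    """Give each translation its own URL prefix (tip 10), e.g. /en-gb/."""
    if culture.country:
        return f"/{culture.language}-{culture.country.lower()}/"
    return f"/{culture.language}/"


culture = parse_culture("en_GB")
print(f'<body class="{body_class("sinatra", culture)}">')  # <body class="sinatra en">
print(url_prefix(culture))                                 # /en-gb/
```

Keeping the parsing in one place means the CSS hook, the URLs, and the database metadata are less likely to drift apart when a new language is added.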

All these things will cause you pain and suffering if you are not prepared for them.
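
Being prepared for tip 12 can be as simple as recording a last-modified time per language and comparing them. The sketch below is one possible approach, not a feature of any particular CMS: the function name `stale_translations`, the choice of English as the source language, and the sample dates are all assumptions for illustration.

```python
from datetime import datetime
from typing import Dict, List


def stale_translations(
    last_modified: Dict[str, datetime], source_language: str = "en"
) -> List[str]:
    """Return the languages whose copy of a page is older than the source copy."""
    source_time = last_modified[source_language]
    return sorted(
        language
        for language, modified in last_modified.items()
        if language != source_language and modified < source_time
    )


# Hypothetical last-modified times for one page, keyed by language code.
page = {
    "en": datetime(2012, 12, 11, 9, 30),
    "es": datetime(2012, 11, 2, 14, 0),
    "ru": datetime(2012, 12, 12, 8, 15),
}
print(stale_translations(page))  # ['es']
```

A timestamp only tells you that a translation is older than the source, not what changed, so a human still has to review the flagged pages.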

Got more tips? Please post them in the comments, and thanks :)

Thanks to Betsey Kershaw of Sugarbeet Creative for commiseration on these hard knocks & the headline.

  • Great collection of tips. Wish I had it when I did my first multi-language project.

    Some additions, though:

    1) German is also likely to blow up your layout due to long words (“Donaudampfschiffahrtsgesellschaftskapitän” is a perfectly acceptable word). Corollary: Think about hyphenation, preferably automatic.

    2) Get a decision about how to address the customers ASAP (http://en.wikipedia.org/wiki/T%E2%80%93V_distinction). They might be reasonably put off by being addressed like a family member, let alone like a spouse.

    3) Set up a glossary early on. Not every translator is versed in technical or financial terminology.

    4) “culture” is a Microsoft term. Java developers use “locale”.

    5) Try to use the same charset for all languages, if possible. IE needs special TLC (http://stackoverflow.com/questions/153527/setting-the-character-encoding-in-form-submit-for-internet-explorer)

  • Great compilation, very useful tips for an early approach

  • Hi Robert,

    Avoiding text in images is easier said than done. Sometimes it’s a necessary evil and needs to be done.

    Now a note about Google translate, I think it’s weird that it doesn’t have a built-in learning system – Google translate is the same as it was 2 years ago. On the other hand, it’s still much, much better than the alternatives.

  • @itoctopus: I can see that sometimes text is going to be an image, it’s pretty rare, but it happens. If an image really needs to have translated text in it, then it’s very important that the image not be loaded through CSS so it’s easier to translate.

    I did stumble upon the Google Translator Toolkit, which looks like it does have some kind of learning:
    http://translate.google.com/toolkit/list?hl=en#translations/active