A few reasons why Rosetta should not be considered as a translation platform for existing open source projects

Last week I received an invitation to join the LXDE project, “an extremely faster, performing and energy saving desktop environment maintained by an international community of developers”, to collaborate and help organize their translation effort. As I have maintained the Brazilian Portuguese translation for Openbox, one of the components that make up LXDE, and am very much involved with the translation of GNOME, KDE, XFCE, and other modules, I felt compelled to accept the invitation.

One of the very first questions posed to the translation guys was which tool/platform to use in order to empower collaborators and provide an easy way to organize and maintain the translation effort. A poll was set up giving people 3 platforms to vote on.

The initial result showed that Transifex was the favorite choice of the majority, who also opted to have it hosted locally instead of taking advantage of the instance already set up by the Fedora project. Those of you who have followed my work from afar may be surprised to know that I also voted for that same option, going as far as pointing out that Launchpad’s Rosetta should not be considered for this enterprise. Those of you who have followed my work more closely probably know that through the years I have reduced my use of Rosetta to zero, choosing to do my translations directly with the upstream projects instead.

I have been asked about my reasons for not supporting the same tool that helped me get started in the open source world, and I have answered most of those questions via personal emails. I have also spoken to some close friends about this same topic, who have encouraged me to write a blog post explaining my position and maybe allow more people to learn how strongly I feel against the use of Rosetta for existing open source projects.

Disclaimer

First of all, I want to make a few things very clear so as to avoid any confusion about my intentions. For a long time Rosetta was my tool of choice and the one I recommended to anyone wanting to join the translation hordes. The reason?

  • It was dead easy to use!
  • It bridged the (huge) gap between regular users and open source projects!
  • Being web based meant anyone could contribute from anywhere, using any OS, so long as you had an internet connection (yeah, you could also download a PO file and work offline)!

For quite a while I was probably one of the most active Rosetta users out there (at least that is what I was told by someone really close to Rosetta) and a huge supporter. I still believe that the 3 points above are very much valid today. Rosetta has a lot of potential and I wholeheartedly believe that its developers are trying really hard to tighten it up.

Now that I have added my disclaimer, let me explain what caused me to go from being a great supporter to someone who doesn’t recommend it for translating existing open source projects.

Seeing the tree but not the forest

A common workflow for someone using Rosetta is to search for modules that need some work done (either missing translations or messages that need to be reviewed) and then filter the list to display only those messages that need to be worked on. Great, right? Not quite! What if you were translating the fictitious application “Mouse Trap”, a board game where you move a cute little mouse through a maze to get to a pile of cheese, and asked Rosetta to only display messages that had yet to be translated?

Let’s just say that after filtering for the untranslated messages, you are presented with the string “Mouse”. Since the context here is a board game with a little mouse, one could fairly easily be tempted to translate it as the word in your language for an actual mouse, the rodent. That was easy, and once you commit your work you’ll have a 100% translated game. However, what if that specific message was located in the game’s configuration dialog, where the user selects how to control the mouse through the maze? Since we specifically filtered for untranslated messages only, we missed the preceding messages that provided useful context, “Please select how to control the character:” and “Keyboard”, which were already translated. If you speak Brazilian Portuguese you’d probably be confused by a choice between controlling the game via a keyboard or an actual mouse (the rodent), as the language uses 2 different words for the animal and the pointing device.

Obviously this is not a problem with Rosetta per se, as I believe that in order to maintain quality translations a team should always look at the whole picture before committing their work. Unfortunately for Rosetta, this feature has most definitely generated a lot of issues similar to the one I’ve attempted to describe above. Before someone starts using it to contribute translations, instructions about the team’s workflow and processes should be something that a new contributor gets as part of their “welcome” package.
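The filtering scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog, the sample translations, and both helper functions are hypothetical and have nothing to do with how Rosetta is actually implemented. It simply shows how an "untranslated only" view drops the neighboring strings that would have told the translator which kind of mouse is meant.

```python
# Hypothetical message catalog for the fictitious "Mouse Trap" game, in
# source order. Each entry is (source string, existing translation or None).
# The translations are illustrative placeholders.
catalog = [
    ("Please select how to control the character:",
     "Selecione como controlar o personagem:"),
    ("Keyboard", "Teclado"),
    ("Mouse", None),
]


def untranslated_only(entries):
    """Mimic an 'untranslated only' filter: keep strings with no translation."""
    return [source for source, translation in entries if translation is None]


def with_neighbors(entries, radius=2):
    """Pair each untranslated string with the source strings around it."""
    result = []
    for i, (source, translation) in enumerate(entries):
        if translation is None:
            start = max(0, i - radius)
            context = [s for s, _ in entries[start:i + radius + 1]]
            result.append((source, context))
    return result


# The filtered view shows the bare string with nothing around it.
print(untranslated_only(catalog))

# The neighbor-aware view keeps the dialog prompt and "Keyboard" visible.
for source, context in with_neighbors(catalog):
    print(source, "<-", context)
```

The second helper is one possible way a team could review filtered strings without losing sight of the whole picture. Whether a given tool offers something similar is a separate question.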

Redundancy

Through the years it became apparent to me that there was a lot of redundancy in the work being done by the translation teams. Since most open source projects have their own translation teams, we often had situations where (for instance) the GNOME and Ubuntu Brazilian Portuguese teams would translate the same piece of software in parallel. Even though the GNOME team (from now on called upstream) already had their own vocabulary and process in place, it was trivial for someone to overwrite their work in Rosetta, something that understandably ticked off many upstream translators. The Ubuntu Brazilian team got such a bad rap for this that we would be immediately spurned from IRC channels or mailing lists related to upstream projects. Heck, I was even called names by the Brazilian KDE / Latin America representative (or whatever that asshole’s “title” is) just for introducing myself and trying to start a collaborative relationship so as to avoid redundancy.

Swimming against the flow

The real problem in my opinion, and the reason why I believe Rosetta is flawed by design, is that the work done by anyone using Rosetta will only benefit Ubuntu and its derivatives. There is no two-way traffic happening as far as translations go, and none of the many thousands (yes, thousands) of strings that I’ve translated during the many hundreds of hours I spent translating with Rosetta ever made their way back upstream, where other distributions could also use them.

Everyone who’s worked with open source projects knows that if you’re going to make use of them, the least you can do is send any, if not all, improvements you make back to the source. It doesn’t necessarily mean that your patch/work will be taken or incorporated, but it is something a good citizen of the open source world should do, especially a project like Ubuntu that is so popular among the general GNU/Linux user base!

What to do then?

I have many times attempted to provide more feedback and action plans to improve these areas in Rosetta, even presenting at this year’s FOSSCamp in Boston in February to a small group including Mark Shuttleworth and Troy Unrau, who was representing the KDE project. My last attempt was when I wrote this blueprint trying to detail a possible new translation workflow whereby quality wouldn’t be sacrificed in favor of quantity and contributions made via Rosetta would start to make their way back to the upstream projects. Unfortunately nothing has ever come of my attempts, which after a while became less frequent on my part. I understand that the Rosetta team is fairly shorthanded, with a huge list of issues and features piling up in their queue, so it is not their fault that some of the massive features can’t get implemented in the short term. But as I have talked to many people during these last few months about this same topic, I dare to ask:

Could the reason why Rosetta is still doing things backwards be more aligned with how much return these features could fetch Canonical from enterprise users?



23 Responses to “A few reasons why Rosetta should not be considered as a translation platform for existing open source projects”

  1. I don’t think it’s intentional on Canonical’s part that Rosetta isn’t getting the right kind of care and attention it needs. “Good enough” seems to work and so the resources are given to other projects instead. I think it’s simply that good, collaborative translation systems are REALLY HARD to develop. Desktop usability is the current shiny–which is a good thing to have as shiny! But it’d be awesome if we could figure out how to make translations a shiny project as well.

    I’ve worked on a few different Web based systems (as a developer, not as a translator), and was also the I18N coordinator for the Linux Documentation Project for a little while. I’m interested in working more on the problem. I think you’ll find that others are interested as well. Please ping me directly and let’s figure out how to make translations a shiny problem worth solving. :)

  2. Hi there emmajane, thank you so much for your reply. You can expect an email from me as soon as the Christmas hype is over. :)

    Merry Christmas!

  3. I don’t think there is some kind of conspiracy or evilness. My personal experience shows that they just don’t care about this area enough. Just take a look at my i18n-related bugs in LP – most of them are ignored, or nobody has time AND some clue to solve it.
    Probably this would change if we could send them paying customers bitching about the quality of Ubuntu translations :) . Sending bugs and hatemail seems not to help, apparently.

  4. “Swimming against the flow” does not make any sense. You say this is for an open-source project, not a Ubuntu package.

    An open-source project *is* the upstream.

  5. Vadim, the proper way would be to do the work upstream and then let it trickle down to distributions and derivatives. Ubuntu does the opposite and that is what I wrote about.

  6. Jaap Woldringh Says:

    I find your post very interesting. I am a Dutch translator for KDE and use Kubuntu and Ubuntu, both of which I love.
    But I hate to see what becomes of some of my translations for KDE and some other programs when I see them in (K)Ubuntu: sometimes they are not used, or even worse, sometimes they are replaced by translations that are many years old and very obsolete. Mistakes exist (again) from times before I took over the translations, mistakes which induced me to offer to correct them, knowing (some) mathematics, physics, astronomy (and navigation, though I have not had to use that yet, and some programming, which comes in very handy sometimes).
    I have translated for Ubuntu too, message string by message string, via Rosetta: the translations did not get through (I think by a democratic vote, where I translate the more specialist programs such as KStars, KMplot, Qalculate, and so on, and see that my mathematical and physics translations are not accepted). My translation of gcalctool, from a .po file, took 3 or 4 years before it was used. And at least once (I do not check that often, just to take care of my heart :) ) that translation is replaced again by the previous very defective one (mathematically speaking).

    In the Netherlands KDE is translated by a very motivated team in which one person coordinates all translations, and supervises the process. This works perfectly, and I can safely say that the Dutch KDE translations generally are of a high standard.
    This I can see in a distribution such as openSUSE, which I have on one of my computers just to see what my latest translations look like without having been tampered with, as quite often happens in Ubuntu.

    So I quite agree with you being against using Rosetta now. Personally I am glad that modifications made in Rosetta do not get upstream, so that in the other distributions the translations of KDE remain of a high quality.

    Translation is the ONLY aspect where Ubuntu sucks.

    Greets, and a merry Christmas, and of course a very good 2009,

    Jaap

  7. Can you go into why you don’t want to list the module at the existing transifex website?

    You have every right to want to build a new instance, it’s open tech. But by doing so you aren’t necessarily benefiting from the growing existing transifex based community which works at the existing site. So with that in mind, I’d like to hear the reasons why you want to set up a new service instance.

    If the primary concern about the existing transifex implementation is that it’s too closely aligned with the Fedora contributor process, note that the developers behind transifex are planning on going independent with transifex.net. Yep, that’s right. An open technology incubated by Fedora has the freedom to grow its own business interests outside of Red Hat’s. Isn’t that neat?

    http://www.indifex.com/about/

    You should talk to them about their plans with regard to transifex.net and how it’s going to interact with other transifex implementations. I don’t know all the ins and outs, but they could have already figured out how to federate multiple instances and have different translation communities working together.

    -jef

  8. Hi Og,
    After the launch of Ubuntu, I read the critiques of the Ubuntu project and its efforts on my Debian mailing lists. It was mentioned, a few months after the birth of Rosetta, after folks had time to understand what it was, that Rosetta had this inherent flaw of not using other translations and not giving its work back upstream. I am surprised you bring it up now after a few years of working with it. But you joined the FLOSS effort and have been a valuable contributor to it regardless of the route you took to get here. Happy $HOLIDAY to you and your family.
    Kev

  9. Hi Kevin, happy $HOLIDAYS to you and yours as well. :) It actually took me several months to understand not only how translations worked under Rosetta, but also upstream. You see, I was a newbie who took advantage of the low barrier to entry that Rosetta provided. Several months later I got a pretty good trial by fire experience with upstream and was then able to better understand the issues at hand. That is when I brought the entire Brazilian Rosetta team to do translations directly with the GNOME crew and for the first time managed to get the GNOME desktop modules 100% translated. That was back in late 2006 and I never returned to Rosetta after that (though I continued to pester its developers to change its ways). :)

    Take care,

    Og

  10. Hi Jef, thanks for your comment. My personal position is that I don’t care whether the LXDE project chooses to go with their own install or not. I have actually helped translating a module for Fedora (right before the 10 release) and am more than OK with using the same instance. To me the limiting factor is actually having the bandwidth and resources to host the translations, and if someone at LXDE wants to take on this task, so be it. I really agree with you on the last paragraph and I’ll do some more digging and present it to the LXDE guys to see if we can finalize it and get things started for 2009.

    Cheers,

    Og

  11. Og,

    The transifex developers are really keen on solving the problem generally in a way that every open project can efficiently benefit. Multi-lingual translation expertise is a precious commodity, and it cuts across all possible distribution channels, all possible project definitions. It really is its own community with its own distributed workflow needs, which is overlaid on the traditional code and documentation development processes. Even if the transifex technology as it stands isn’t the right solution, I think the people working on transifex are going to be the people who will get it right, because I think they fundamentally understand how translators interact.

    I’m just as wary of fracturing that translation community along individual distributor lines as I am about fracturing it along individual project lines. Any fracturing of the community increases the likelihood that language coverage of the entire project space will suffer. If individual projects are going to have their own transifex implementations, some care may need to be taken to make sure that each project-specific translation community is part of the global transifex translator community. But that’s me, that’s the sort of stuff I worry about.

    -jef

  12. Hey Jef, I think that you and I have a lot in common, as I also share the same ideas and concerns defined in your last paragraph. Do you think that we could schedule some time to chat a bit via Skype after the holidays?

    Cheers,
    Og

  13. I have the same comment as Vadim. The “Swimming against the flow” part is about the Ubuntu project, but the introduction speaks about an upstream project considering using Launchpad.

    If upstream uses Launchpad the translations don’t need to be sent anywhere, so that point doesn’t really fit into the rest of the post.

    [I had written a long paragraph more about stuff with which I disagree from your post, but I haven't read your specs so I'll better just shut up until I'm informed enough :P .] You make some good points, though :) .

  14. KDE upstream has been hearing all these exact same things for a while now. Having talked with Jono a bit about it, I do know they are aware of the issues. Hopefully they can do something about it for the sake of their users.

    At this point, simply using upstream translations for many projects would be a much better solution.

    Anyways …

    “Since the context here is a board game with a little mouse”

    We solved that in KDE4 with the concept of context, so you can do this in the code: i18nc("The pointing device", "Mouse") and the translator will see “The pointing device” as context when translating, and if “Mouse” appears twice in the code it is disambiguated using the context. There is lots of clever i18n stuff in KDE4 (including the ability to use JavaScript to translate really bizarre things at runtime).

    “I was even called names by the Brazilian KDE / Latin America representative”

    I assume you mean Helio? Please feel free to email me privately with what happened so that we can do an internal debriefing and try to make sure such bad feelings are not stirred up in the future. We (KDE) try to be good community members, though sometimes individuals need some support in doing so.

    That said, I hope whatever happened actually was worth calling him an “asshole” publicly on your blog. I know Helio personally and he’s certainly Brazilian, but not an asshole, in my experience.

  15. Hi Og and thanks for posting your thoughts.

    I think you made a huge fundamental confusion in this post.
    There is a big difference between a free/open source application and a free/open source distribution. Both are FLOSS projects, but only a distribution needs to deal with upstream problems.

    If an open source project other than a distribution is hosted in Launchpad and uses Launchpad Translations, it will not have to deal with the upstream problems (as Launchpad is the upstream source).

    From my point of view your whole post is about translating a distribution in Rosetta, but this is not stated clearly.

    I know the upstream projects are bitching about the Ubuntu translation teams, but all this work is available to everyone under a BSD licence, and every upstream project has access to the work of Ubuntu translators and can integrate the changes upstream.

  16. Hello,

    You are mixing two different issues: Ubuntu and Launchpad Translations (Rosetta). Ubuntu is “just” a project using Launchpad Translations. So the policy of allowing suggestions by everybody is Ubuntu only.

    You should not translate your project through Ubuntu, but register a separate upstream project in Launchpad.

    So the arguments about suggestion policy and upstream cooperation are not really arguments against Rosetta itself.

    I could show you a lot of errors made by translators using traditional tools and not taking the context into account. This has to be handled by a review process, as in every other team/platform too.

  17. The real problem is the translating itself. It shouldn’t be done.

  18. Hi Sebastian, thank you for your reply. As far as I know, registering a project in Launchpad does not synchronize the translations done via Rosetta back up the source code tree. There is a blueprint for it but you still have to manually download the translations from Rosetta (offered as individual or lump tarballs) and then apply those into the source code yourself. Transifex has an enormous advantage there…

    “So the policy to allow suggestions by everybody is Ubuntu only.”

    Right, and though my post may have deviated a bit from its original purpose, why Rosetta shouldn’t be used by upstream projects, I still think it is valid from the point that Ubuntu’s policy is flawed for not doing the right thing.

    Your last comments are absolutely correct and part of my blueprint (mentioned in the post) is dedicated to trying to improve this area.

    Cheers,

    Og

  19. Hi Aaron, thank you for your feedback/reply. The feature you’ve mentioned about providing more context for translators is something that is being discussed in the GNOME community as well, and if I remember correctly it will be part of our workflow with the incorporation of the newer version of gettext. The thing is, afaik, Rosetta doesn’t expose this yet and a translator would only benefit from the feature if doing the translation using the PO file itself. Though I did translate a few KDE packages this year I may have picked modules that did not include context strings. It is great to know about this feature!

    About the last part of your comment, or how I was publicly mistreated by Helio (and only him), everything is available in the mailing lists, though in Portuguese. My experience with *every* other member of the KDE community has been very pleasant to say the least, and you can only imagine my surprise when I was attacked out of the blue.

    I will gladly send you a private email as soon as the holidays are over. Though this incident took place a while back and has not affected my desire or interest to contribute to the KDE translation effort, I’d very much like to see this type of behavior eliminated, especially from someone who is supposed to represent this project across all of Latin America. I can only take your word about his true character, but to me he has shown nothing but arrogance and disrespect.

    Cheers,

    Og

  20. Benjamín Valero Says:

    Og, my story is like yours, but in Spanish. I discovered that my translations for Ubuntu were almost all in vain. That’s why I decided to join the GNOME translators group, for example, so my work will be available in any distro, not just Ubuntu.

    Of course, translations just for Ubuntu have meaning in strings used only in that distro, as the ones of the installer. This also happens in other distros.

    By the way, I use Fedora since then, but this is just another story.

  21. Rosetta has supported the msgctxt feature of PO files since very early on (i.e. soon after it was introduced in GNU gettext). Rosetta also fully supports translator comments.

    KDE4 makes good use of that, but Rosetta was there first (of all the translation tools that I know of, except gettext itself): https://blueprints.launchpad.net/rosetta/+spec/rosetta-message-contexts (“Completed on 2007-08-23”, gettext 0.15 was released on 2006-07-21).

    Rosetta even supports special KDE3 formatting for both plural forms and contexts in PO files and displays that nicely.

  22. I like translationproject.org. Some very popular software packages are translated this way. It’s simple and it works.

  23. Hi Og,
    We had the same problems with Polish translations. Now, as an official Ubuntu translation team, we are encouraging people to join upstream teams. We only use Rosetta for Ubuntu-related and independent projects. But there are still some people who translate in Rosetta once they discover that their favourite apps lack a translation. It’s a pity that the work can’t be synced.
