Chromium translations explained: part 2b

February 10, 2011

In the second part of this series of posts about the Chromium translations, I mentioned a problem with a recent change in Launchpad that caused the loss of hundreds of strings from contributors. I also mentioned a possible future evolution meant to improve the translation coverage of stable builds. After a long fight, it seems things are finally back on track. Here is what changed, and why…

If you haven’t read the previous parts, I highly recommend doing so. In the first part of this series, I covered Grit, the translation format used upstream for Chromium (and Google Chrome, ChromeOS, ...). In the second part, I explained the workflow, going from Grit to GetText and back to Grit, via Launchpad.

Here is where we left off after part 2:

Trunk was the base of the upstream translations, and Launchpad received only those. Contributions were merged by Launchpad and exported. Then, for each of the 4 Chromium branches, those strings were merged with the branch-specific templates and strings. It worked fine for a while, but with a small gap that grew with the age of the branch considered. Typically, the gap was no more than 10%, but that could definitely be improved.

So I devised this way to fix it:

It’s almost the same, except that this time Launchpad gets a merge of the 4 upstream branches (templates and translations).

This design has a few advantages:

  1. it eliminates the gap, so translators can translate even stable builds, with no more deadline. As stable builds get minor updates, it’s still possible to refresh the translations
  2. more stable builds automatically inherit translation improvements, both from more recent upstream branches and from Launchpad.

But before I even finished developing this feature, Launchpad changed and seriously broke this workflow. Launchpad dropped the distinction between “translated upstream”, “translated in Launchpad” and “translation updated in Launchpad”. Worse, each time a new upstream change was imported, Launchpad dropped the “updated in Launchpad” strings (in fact, they are still there, but hidden, and missing from the export). After some discussions with the Launchpad developers behind this change, I was told that this use case is not supported and that I now have to merge everything myself.

My first attempt was to re-import the gettext pool populated by Launchpad (the export branch), and to resurrect the lost strings using the bzr revisions from before the change to the present day. It was a painful experience:

  1. it’s big: 3500 strings x 57 langs x 4 branches = ~800,000 strings, each to import, parse, check, export
  2. many things changed upstream in the interval (a new, additional JSON format for the policy template, some massive template and string updates, new strings, ...)
  3. some LP contributors tried to re-import their changes several times
  4. as days passed, more revisions had to be reviewed (up to 25, meaning ~20 million strings)
  5. the LP export is asynchronous, often taking 2 days to update the export branch after a given change
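
To make the scale concrete, the figures in the list can be multiplied out. This is only the arithmetic, using the approximate counts quoted above; the variable names are made up for the illustration.

```python
# Figures taken from the list above; everything else is plain arithmetic.
STRINGS_PER_TEMPLATE = 3500  # approximate number of strings per template
LANGUAGES = 57
BRANCHES = 4
REVISIONS = 25               # upper bound on export revisions to review

# One full pass over every branch and language.
strings_per_pass = STRINGS_PER_TEMPLATE * LANGUAGES * BRANCHES

# Reviewing every revision means repeating that pass once per revision.
strings_all_revisions = strings_per_pass * REVISIONS

print(f"one pass: {strings_per_pass:,} strings")
print(f"{REVISIONS} revisions: {strings_all_revisions:,} strings")
```

That is 798,000 strings for one pass and 19,950,000 for 25 revisions, which matches the rounded ~800,000 and ~20 million.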

I eventually did it and it seemed to work, but the next day, the strings disappeared once again. They didn’t stick, most probably because of the 2-day delay:

  • day 0: I send everything to LP
  • day 1: LP doesn’t give me back what I expected, so I send it the incomplete set of strings
  • day 2: LP gives me the strings I wanted on day 1
  • day 3: LP gives me the incomplete set from day 1
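
The timeline can be reproduced with a toy simulation. It is a sketch under two assumptions that are not confirmed anywhere: the export on a given day reflects exactly what was imported 2 days earlier, and after day 0 the naive loop simply feeds the latest export back in. The function name and the "complete"/"incomplete" labels are hypothetical.

```python
EXPORT_DELAY_DAYS = 2  # assumption: an export reflects the import made 2 days earlier


def simulate(days, repair_day=0):
    """Return (day, sent, received) tuples for a naive 'send back the export' loop."""
    sent = {}  # day -> what was imported into Launchpad on that day
    log = []
    for day in range(days):
        # Launchpad exports what it was given EXPORT_DELAY_DAYS ago;
        # anything imported before the repair was the incomplete set.
        received = sent.get(day - EXPORT_DELAY_DAYS, "incomplete")
        # On the repair day the complete set is sent; on every other day
        # the naive loop just re-imports whatever the export contained.
        sent[day] = "complete" if day == repair_day else received
        log.append((day, sent[day], received))
    return log


for day, sent_set, received_set in simulate(4):
    print(f"day {day}: sent {sent_set}, received {received_set}")
```

In this model the incomplete set sent on day 1 comes back on day 3, so the repair never settles as long as the loop only forwards the latest export.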

Obviously, the solution was to keep feeding everything to Launchpad:

So to feed Launchpad, I now import the 4 Grit branches, the previous gettext import and the gettext export, and perform some magic spell to determine what should prevail. More seriously, I had to make some choices. The most important one is that I always prefer the “updated in Launchpad” strings over the upstream translations. The drawback is that it is no longer possible to return to an upstream translation once it diverged. I consider it a small price to pay at the moment, but maybe it will have to be revisited one day.
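
One way such a precedence rule could look is sketched here. This is not the actual merge script: the function `pick_translation`, its arguments, and the "newest branch first" ordering are all assumptions made for the illustration. It handles a single string in a single language.

```python
def pick_translation(upstream, previous_import, lp_export):
    """Choose the translation to send to Launchpad for one string.

    upstream: translations from the upstream branches, newest branch first
              (None where a branch has no translation). Hypothetical layout.
    previous_import: what was sent to Launchpad last time, or None.
    lp_export: what Launchpad exported, or None.
    """
    upstream_values = {text for text in upstream if text}

    # 1. A Launchpad string that differs from every upstream translation
    #    is treated as a contribution and always wins.
    if lp_export and lp_export not in upstream_values:
        return lp_export

    # 2. The export matches upstream (or is empty). That may just be
    #    Launchpad dropping a contribution, so keep the diverged string
    #    that was sent previously.
    if previous_import and previous_import not in upstream_values:
        return previous_import

    # 3. Otherwise fall back to the newest upstream translation, if any.
    return next((text for text in upstream if text), None)


# Upstream says "bar"; a contributor changed it to "baz" in Launchpad.
print(pick_translation(["bar"], "bar", "baz"))
# Launchpad later exports "bar" again; the contribution is kept.
print(pick_translation(["bar"], "baz", "bar"))
```

Rule 2 is what makes contributed strings stick, and it is also the source of the drawback: an export that equals the upstream translation cannot be told apart from a dropped contribution, so a diverged string can be changed to something new but not back to the upstream value.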

For now, I keep an eye on the numbers and so far, it seems OK. I’d appreciate it if translators could check that their strings are back to what they expect. Please let me know.

  1. February 10, 2011 at 09:51

    I have checked the Spanish translation and it seems fine.

    • fta
      February 10, 2011 at 10:23

      Excellent, thanks!

  2. February 10, 2011 at 12:54

    “The drawback is that it is no longer possible to return to an upstream translation once it diverged.”

    Suppose there’s an upstream translation “foo”, and someone incorrectly changes it in launchpad to “bar”. Does that mean that it’s now impossible to change it back to “foo”, or does it simply mean that the restored “foo” value will be forever considered to be a divergence from upstream, even though it’s the same string?

    • fta
      February 10, 2011 at 13:18

      It means that if string “foo” is translated upstream as “bar” in language “X”, and someone then sets it to “baz” in Launchpad, it is no longer possible to change it back to “bar”. But it could be changed again to something like “bat”. I know it’s not ideal.
      It’s a restriction I had to add because Launchpad now overwrites the LP-contributed strings whenever there is an upstream update in the template, even an unrelated one, which happens almost every day in Chromium.
      As I said, if the problem is ever solved in Launchpad, I should be able to remove this restriction.

      If you have this situation, please tell me.

  3. February 21, 2011 at 14:04

    I have a problem and it seems to be related to all these relationships among Launchpad and upstream, but I am a bit confused by this workflow, and I am not sure about it.

    In Basque, some time ago I made a typo and translated the word “Offer” (in “generated resources”) with the word “Eskaiki”. It should be “Eskaini”. Now, each time I correct the mistake in Launchpad (4 times by now), the next day I get the mistaken word again in Launchpad.

    Any idea about this?
