Chromium translations explained: part 2b
In the second part of this series of posts about the Chromium translations, I mentioned a problem with a recent change in Launchpad that caused the loss of hundreds of strings from contributors. I also mentioned a possible future evolution meant to improve the translation coverage of stable builds. After a long fight, things finally seem to be back on track. Here is what changed, and why…
If you haven’t read the previous parts, I highly recommend doing so. In the first part of this series, I covered Grit, the translation format used upstream for Chromium (and Google Chrome, ChromeOS, ...). In the second part, I explained the workflow, going from Grit to GetText and back to Grit, via Launchpad.
Here is where we left off after part 2:
Trunk was the base of the upstream translations, and Launchpad received only those. Contributions were merged by Launchpad and exported. Then, for each of the 4 Chromium branches, those strings were merged with the branch-specific templates and strings. It worked fine for a while, but it left a small gap in translation coverage that grew with the age of the branch. Typically, it was no more than 10%, but that could definitely be improved.
So I devised this way to fix it:
It’s almost the same except that this time, Launchpad is getting a merge of the 4 upstream branches (templates and translations).
This design has a few advantages:
- it eliminates the gap, so it becomes possible for translators to translate even stable builds: no more deadline. As stable builds get minor updates, it’s still possible to refresh the translations.
- the more stable builds automatically inherit translation improvements, both from more recent upstream branches and from Launchpad.
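To make the idea of feeding Launchpad a merge of the 4 branches a bit more concrete, here is a minimal Python sketch. It is not the actual tooling: the branch names, the `merge_branches` helper and the rule that the most recent branch wins are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical branch names, ordered from most recent to oldest.
BRANCHES: List[str] = ["trunk", "dev", "beta", "stable"]


def merge_branches(per_branch: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Merge the strings of all branches into a single pool.

    per_branch maps a branch name to {string id: translation}.
    Assumption: when several branches translate the same id, the most
    recent branch wins, and empty translations are ignored.
    """
    merged: Dict[str, str] = {}
    for branch in BRANCHES:
        for msgid, text in per_branch.get(branch, {}).items():
            if text and msgid not in merged:
                merged[msgid] = text
    return merged


pool = merge_branches({
    "trunk": {"IDS_OPEN": "Ouvrir"},
    "stable": {"IDS_OPEN": "Ouvre", "IDS_OLD_LABEL": "Ancien"},
})
print(pool)
```

In this toy pool, a string that only exists in the oldest branch is still offered to translators, which is the point of the new design.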
But before I had even finished developing this feature, Launchpad changed and seriously broke this workflow. Launchpad dropped the distinction between “translated upstream”, “translated in Launchpad” and “translation updated in Launchpad”. Worse, each time a new upstream change was imported, Launchpad dropped the “updated in Launchpad” strings (in fact, they are still there, but hidden, and missing from the export). After some discussions with the Launchpad developers behind this change, I was told that this use case is not supported and that I now have to merge everything myself.
My first attempt was to re-import the gettext pool populated by Launchpad (the export branch), and to resurrect the lost strings using the bzr revisions from before the change to the present day. It was a painful experience:
- it’s big: 3500 strings x 57 languages x 4 branches = ~800,000 strings, each to import, parse, check, export
- many things changed upstream in the interval (a new additional json format for the policy template, some massive template and string updates, new strings, ...)
- some LP contributors tried to re-import their changes several times
- as the days passed, more revisions had to be reviewed (up to 25, meaning ~20 million strings)
- the LP export is asynchronous, often taking 2 days to update the export branch after a given change
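For the record, here is the back-of-the-envelope arithmetic behind those figures, using the rounded numbers quoted in the list (the variable names are only for illustration):

```python
# Rounded figures from the list above.
strings_per_template = 3500
languages = 57
branches = 4
revisions_to_review = 25

strings_per_revision = strings_per_template * languages * branches
strings_in_total = strings_per_revision * revisions_to_review

print(strings_per_revision)  # 798000, i.e. roughly 800,000
print(strings_in_total)      # 19950000, i.e. roughly 20 million
```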
I eventually did it and it seemed to work, but the next day, the strings had disappeared once again. They didn’t stick, most probably because of the 2-day delay:
- day 0: I send everything to LP
- day 1: LP doesn’t give me back what I expected, so I send it the incomplete set of strings
- day 2: LP gives me the strings I wanted on day 1
- day 3: LP gives me the incomplete set from day 1
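The timeline above can be reproduced with a toy simulation. This is only a sketch under strong assumptions: the export lag is taken to be exactly 2 days, the sets of strings are reduced to the labels "full" and "incomplete", and the `simulate` function and its `resend_everything` flag are hypothetical, not part of any real tool.

```python
from typing import Dict, List

EXPORT_DELAY_DAYS = 2  # assumption: the export always lags by exactly 2 days


def simulate(days: int, resend_everything: bool) -> List[str]:
    """Return what the LP export contains on each day, starting at day 0.

    On day 0 the full set is sent. On the following days, either the full
    set is sent again (resend_everything=True), or whatever the export
    returned that day is sent back unchanged.
    """
    # What was sent to LP on each day; before day 0, LP only had the
    # incomplete set.
    sent: Dict[int, str] = {-2: "incomplete", -1: "incomplete"}
    exports: List[str] = []
    for day in range(days):
        exported = sent[day - EXPORT_DELAY_DAYS]
        exports.append(exported)
        if day == 0 or resend_everything:
            sent[day] = "full"
        else:
            sent[day] = exported
    return exports


print(simulate(6, resend_everything=False))
print(simulate(6, resend_everything=True))
```

In this model, echoing the export back makes the full and incomplete sets alternate forever, while sending the full set every day makes the export settle on the full set from day 2 onwards.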
Obviously, the solution was to keep feeding everything to Launchpad:
So to feed Launchpad, I now import the 4 grit branches, the previous gettext import and the gettext export, and cast some magic spell to determine what should prevail. More seriously, I had to make some choices. The most important one is that “updated in Launchpad” strings always take precedence over the upstream translations. The drawback is that it is no longer possible to return to an upstream translation once a string has diverged. I consider it a small price to pay at the moment, but maybe it will have to be revisited one day.
For now, I keep an eye on the numbers and so far, it seems ok. I’d appreciate it if translators could check that their strings are back to what they expect them to be. Please let me know.
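As an illustration of that precedence rule, here is a minimal Python sketch for a single language. It is not the real merge code: the `Entry` type, the `origin` bookkeeping and the way a Launchpad update is detected (the export differs from what was sent last time) are assumptions chosen to keep the example small.

```python
from typing import Dict, NamedTuple


class Entry(NamedTuple):
    text: str
    origin: str  # "upstream" or "launchpad" (hypothetical bookkeeping)


def merge_for_launchpad(
    upstream: Dict[str, str],
    previous_import: Dict[str, Entry],
    lp_export: Dict[str, str],
) -> Dict[str, Entry]:
    """Decide, string by string, what to send to Launchpad next.

    upstream:        merged translations from the grit branches
    previous_import: what was sent to Launchpad last time
    lp_export:       what Launchpad exported (hidden strings are missing
                     or empty)
    """
    merged: Dict[str, Entry] = {}
    for msgid, upstream_text in upstream.items():
        sent = previous_import.get(msgid)
        exported = lp_export.get(msgid, "")
        if exported and (sent is None or exported != sent.text):
            # Assumption: a difference means a translator changed it in LP.
            merged[msgid] = Entry(exported, "launchpad")
        elif sent is not None and sent.origin == "launchpad":
            # Keep an earlier LP update even if the export no longer shows it.
            merged[msgid] = sent
        else:
            merged[msgid] = Entry(upstream_text, "upstream")
    return merged


result = merge_for_launchpad(
    upstream={"hello": "Bonjour", "bye": "Au revoir", "new": "Nouveau"},
    previous_import={
        "hello": Entry("Bonjour", "upstream"),
        "bye": Entry("Salut", "launchpad"),
    },
    lp_export={"hello": "Coucou", "bye": ""},
)
for msgid, entry in result.items():
    print(msgid, entry.text, entry.origin)
```

In this example, "hello" takes the text changed in Launchpad, "bye" keeps its earlier Launchpad text although the export hides it, and "new" falls back to upstream.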