translating longer documents

Bug #298921 reported by EmmaJane
Affects: Launchpad itself
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

I need an interface to translate longer files. Specifically I am working with the screen casting team (translating subtitle files) and the Desktop Training Course (translating chapters of a book). I think the best interface would be to have a long file broken into input boxes according to line breaks (new paragraphs). With the original language on the left, and the translation box (suggested text) on the right. (Although this might be flipped for rtl languages.)

I am interested in hearing ideas from other people on how to translate longer files. The only other system I have used is Drupal's i18n module. It allows you to do short strings (like .po files) and also whole pages of text. I would like something in between. :)

This is a feature request.
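
A minimal sketch of the segmentation proposed in the description: split a long file at blank lines and pair each paragraph with a suggestion box. The names `split_into_units` and `side_by_side` are hypothetical and are not part of Launchpad; the sketch assumes paragraphs are separated by one or more blank lines.

```python
import re


def split_into_units(text):
    """Split a document into translation units at blank lines."""
    blocks = re.split(r"\n\s*\n", text.strip())
    # Collapse internal line wraps so each unit is one logical paragraph.
    return [" ".join(block.split()) for block in blocks if block.strip()]


def side_by_side(source_units, suggestions=None):
    """Pair each source unit with a suggested translation, or an empty box."""
    suggestions = suggestions or {}
    return [(unit, suggestions.get(unit, "")) for unit in source_units]


if __name__ == "__main__":
    sample = "First paragraph\nstill first.\n\nSecond paragraph.\n"
    for original, suggestion in side_by_side(split_into_units(sample)):
        print(f"{original!r} | {suggestion!r}")
```

For right-to-left languages the pairs could simply be rendered in the opposite column order; that choice belongs to the display layer, not to the segmentation.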

Matthew Paul Thomas (mpt) wrote :

What's the difficulty with the current system? Is it presenting the entire file as a single string? If so, that seems like a problem with the original .pot file generation, not a problem with Launchpad Translations.

EmmaJane (emmajane) wrote :

This is a feature request as the current system does not meet the needs of several projects that are also working on translations. Longer documents are not written using translatable .po files... they are written in DocBook XML (training manual), or .srt files (subtitles for screencasts).

Martin Albisetti (beuno) wrote :

I also think that translating big docs needs a different way of displaying the information.
Probably in paragraphs, sequential, and have the whole document on one page.
So I think this feature will probably drive some improvements to the translations UI, a new view for docs, and being able to parse more sources of translations.

Personally, I'd love to see this in Launchpad.

Changed in rosetta:
status: New → Triaged
Данило Шеган (danilo) wrote :

At the moment, you can use many available tools to transform XML to the PO file format. E.g., xml2po transforms DocBook (and any other XML, for that matter) to the PO file format, and is used for both Ubuntu and GNOME documentation. It's far from perfect, but that's a separate issue (we can support xml2po-generated PO files better).
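
To illustrate the kind of conversion meant here, the following is a rough sketch that pulls titles and paragraphs out of a DocBook fragment and writes them as PO template entries. It is not xml2po and does not reproduce its behaviour: it flattens inline markup, assumes un-namespaced (DocBook 4 style) element names, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def docbook_to_pot(xml_text, tags=("title", "para")):
    """Emit one msgid per distinct title/para, in document order."""
    root = ET.fromstring(xml_text)
    seen = set()
    entries = []
    for element in root.iter():
        if element.tag not in tags:
            continue
        # Flatten inline markup and normalise whitespace (a simplification).
        text = " ".join("".join(element.itertext()).split())
        if text and text not in seen:
            seen.add(text)
            entries.append(f'msgid "{po_escape(text)}"\nmsgstr ""\n')
    return "\n".join(entries)


if __name__ == "__main__":
    sample = (
        "<chapter><title>Intro</title>"
        "<para>Hello <emphasis>world</emphasis>.</para></chapter>"
    )
    print(docbook_to_pot(sample))
```

Because entries are emitted in document order, a PO file produced this way reads sequentially, paragraph by paragraph.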

Translating longer paragraphs could do with some improvements as well, but that's again an orthogonal problem (and not restricted to document translation).

Btw Martin, we are already displaying all PO files sequentially, which means that we are also displaying all xml2po-generated PO files sequentially. It's not done by accident, it's a feature of both xml2po and Launchpad. IMO, anything that would improve the UI for document translation would improve it for other PO files as well.

Changed in rosetta:
status: Triaged → Won't Fix
EmmaJane (emmajane) wrote :

I believe you have closed this *feature request* prematurely.

I would like to see a translation interface for longer DocBook XML files as well as .srt subtitle files. If the conversion of DocBook XML to .pot is trivial, it would be lovely to have this feature integrated into Rosetta.

The Ubuntu Desktop Course is currently available as an LP project:
https://edge.launchpad.net/ubuntu-desktop-course

And the screen cast subtitles are currently hosted on a Canonical static file server. This is an example of a subtitle (.srt) file:
http://static.screencasts.ubuntu.com/20070909_installing_ubuntu_part_1_en.srt

I look forward to working with you on making Rosetta better for our documentation-related Ubuntu teams! :)

Данило Шеган (danilo) wrote :

The world is full of different formats. If we even start to support all of them, we are going to run into a maintenance nightmare. If something can be easily transformed into PO files with external tools, we should let people do it themselves (each tool does it slightly differently, and has to make certain restrictions: I know that because I am an author of xml2po). For instance, GNOME uses xml2po for documentation, and KDE uses poxml (their own tool). Others probably use po4a. By blessing only one approach to translating documentation, we are going to limit our user base.

Also, I know there are a bunch of subtitle formats, and I don't see why and how we can decide to support one, and not the others. Writing a tool and maintaining it externally to transform them into PO files and back is much easier, and that's what I recommend as an approach. If you need help working on such a tool, we'd be happy to help.
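
As a rough idea of what such an external tool could look like for .srt files, here is a minimal sketch that converts subtitle cues to PO template entries and writes translations back into the original timing. The function names are hypothetical, repeated cue texts are merged into one msgid, and only the basic .srt layout (index line, timing line, text lines, blank line) is assumed.

```python
import re


def parse_srt(srt_text):
    """Return (index, timing, text) tuples for each well-formed cue."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip malformed cues that lack text
        cues.append((lines[0].strip(), lines[1].strip(), "\n".join(lines[2:])))
    return cues


def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def srt_to_pot(srt_text):
    """Emit one PO entry per distinct cue text, with timing as a comment."""
    seen = set()
    entries = []
    for index, timing, text in parse_srt(srt_text):
        if text in seen:
            continue
        seen.add(text)
        entries.append(
            f'#. cue {index}: {timing}\nmsgid "{po_escape(text)}"\nmsgstr ""\n'
        )
    return "\n".join(entries)


def apply_translations(srt_text, translations):
    """Rebuild the .srt file, falling back to the original text."""
    blocks = []
    for index, timing, text in parse_srt(srt_text):
        translated = translations.get(text) or text
        blocks.append(f"{index}\n{timing}\n{translated}\n")
    return "\n".join(blocks)
```

The `translations` mapping is assumed to be loaded from a finished PO file by whatever PO reader the project already uses; untranslated cues keep their original text so the output stays playable.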

Otherwise, Launchpad will become a resource hog, will have infinitely more bugs, and people will get a worse impression of it. By concentrating on a single (or few) native formats, we can make sure we do absolutely the best job with them.

Martin Albisetti (beuno) wrote :

So, this is interesting.
You can't translate documentation in Launchpad today. The tools are just not right for it.
People who help with translations and documentation are most often not very technical people, so telling them that they need to convert between formats when we could potentially do it ourselves is just silly.
We all agree that resources and code maintenance are an issue, but the way you're proposing to solve it doesn't solve it at all.
So, the request for a proper tool to translate documentation is a valid one, and one that would benefit all open source projects. Since this bug is exactly about that, I think it should stay open until that's solved.

Changed in rosetta:
status: Won't Fix → Triaged
Данило Шеган (danilo) wrote :

Martin, people are translating documentation in Launchpad today (so your assertion is wrong). I'd say that anyone who is able to write DocBook is a technical person (and they are likely not using it as such, since they are likely to be converting it to something else).

Now, can anyone clearly define a problem, please?

I can see several issues being mentioned in this bug report, so let me go one by one.

This much I can say: subtitle translation support is a won't fix.

DocBook (or most XML-based documentation formats) support: too many legacy issues (i.e. different handling in different communities), and too complex (I am not talking out of the blue sky: I have implemented a converter from DocBook to PO and back, and we'd need something even more complex here). Take note that DocBook support is a hard problem, because different communities interpret DocBook differently. If we are going to integrate it, we should integrate it with DocBook generation as well (i.e. over-the-web documentation creation), because then we can use a controlled subset of it. This is not something we need a bug for: it's not like we are going to forget that we don't provide support for documentation in Launchpad.

Improving translation of long documents with longer units of translation: we should do it. This will benefit not only current documentation translation in Launchpad, but all other use cases as well. If that's what you want to make this bug about, feel free to do so.

The other thing we can do is improve support for specific PO outputs coming from documents as well (i.e. xml2po has quite a few specific features for figures and similar, contained stand-alone elements and such). That's an interesting improvement which would basically only touch our UI, and that's where I think we can make the biggest improvement.

EmmaJane (emmajane) wrote :

I'm not asking for a "fix." This is a feature request or a "wishlist" item. I will be sad if you, Danilo, are the only developer who works on Launchpad Translations and you say that you will never implement this request. It doesn't seem to be in the spirit of open or more to the point--it doesn't seem to be in the spirit of encouraging participation to help distribute the work load involved in translating documents. To dismiss the request because of legacy issues also makes me sad. I want to improve the tools. To say, "I tried once and it's too hard" does not seem like the right approach. I have identified a problem and now I would like to identify a solution. If there is a better place to put this feature request, I would like you to help me find that place so that we can work through the problems and find a solution!

Here is the clearly defined problem that I am trying to solve: In the last 24 hours I have worked (as a volunteer) with two other volunteers who wanted to help with translations.

Here is a real life Problem #1 that I dealt with yesterday for nearly three hours:
- LP holds the "code" (DocBook XML) for the Ubuntu desktop course.
- Volunteer downloads the files (using bzr branch) and translates them on their system using Poedit.
- What does the volunteer do with these translated files? In their mind it is not a separate project, it is a translation of an existing project.
- And furthermore, the XSLT style sheets that the UDC uses are language specific. They do not work with the translation the volunteer has just spent several months (probably without backups) working on.
- Using lp/bzr for the UDC code, there is no way to integrate a team of people --- how do you know that others are working on a translation of the project if there is no "translation" project tied to LP "code"?
- The "code" can't be tied to translations because there is no automated way to pull in the DocBook XML files without first "branching" the content and making it into .pot files to be uploaded as a separate project. At least no way that I could find and my staging account has not been approved so I have no way of testing it.

Problem #2 (dealt with this problem for about 20 minutes last night):
- Volunteer wants to translate Ubuntu screen casts -- they log into #ubuntu-screencasts and want to work on translations TONIGHT. (Awesome!)
- There is no "code" for a screen cast because it's a binary OGG file. Someone has to transcribe the video and then convert this into a .srt (subtitle) file. This file can now be translated.
- But how can these transcript and translation files be stored? They do not currently fit into the "code" model of Launchpad, but they definitely fit into the concept of "translatable files."
- The volunteer who transcribes a video is not necessarily proficient at anything other than typing what they are hearing.
- The volunteer who translates the file is not necessarily proficient at anything other than translating what they are reading/hearing within the context of a video.
- There is no way to facilitate the translation of these files within Launchpad Translations, and I think there should be.

These are common problems. Maybe they need...


Данило Шеган (danilo) wrote : Re: [Bug 298921] Re: translating longer documents

Hi Emma,

On Thu, 20 Nov 2008 at 22:07 +0000, EmmaJane wrote:
> I'm not asking for a "fix." This is a feature request or a "wishlist"
> item. I will be sad if you, Danilo, are the only developer who works
> on Launchpad Translations and you say that you will never implement
> this request.

I am not the only developer, but you might be amazed at how many things
of pretty high priority we've already got on our plate. As I said, I am
being realistic about what we can achieve in the near future.

I am sorry if it hurts your feelings, but that's how software
development works: there are too many excellent ideas, but you get to
implement only a small subset of them.

Would you feel better if I said: "ok, we are going to implement this in
five years time?" (maybe we are, but would you really, honestly care?).
I understand how you may be unfamiliar with bug statuses in Launchpad,
but if I wanted to say that I believe this is a silly request, I would
have marked it "Invalid", not "Won't fix". "Won't fix" acknowledges
that this is a legitimate request, but says that this is something we
don't have time to work on.

> It doesn't seem to be in the spirit of open or more to the
> point--it doesn't seem to be in the spirit of encouraging participation
> to help distribute the work load involved in translating documents.

I offered to help set up an appropriate system, and judging from your
"use-cases", I think I was correct in what you need. You have an option
to disregard my experience (or simply not to trust me), but the choice
is yours.

> To dismiss the request because of legacy issues also makes me sad. I want
> to improve the tools. To say, "I tried once and it's too hard" does not
> seem like the right approach.

I am not sure what you are offering here, because this sounds insulting.
I think I've proven my good motives by sitting down and spending
countless hours developing a tool as a volunteer, to help other
volunteers translate documentation.

> I have identified a problem and now I
> would like to identify a solution. If there is a better place to put
> this feature request, I would like you to help me find that place so
> that we can work through the problems and find a solution!

If you want to have a discussion, launchpad-users mailing list is the
best place to have it on. If you want to engage in a technical
discussion with Launchpad Translations developers, you can email us at
<email address hidden> as well. Or if you've got any particular issues
you want to discuss with me (eg. get help about setting up a module for
translation using xml2po), you can find me on <email address hidden>.

> Here is the clearly defined problem that I am trying to solve: In the
> last 24 hours I have worked (as a volunteer) with two other volunteers
> who wanted to help with translations.

Thank you for sharing these two particular use cases. I am quick to
mark a bug as "Won't fix" as soon as it's a blue-sky bug, because we get
hundreds of requests like this. And they don't help.

> Here is a real life Problem #1 that I dealt with yesterday for nearly three hours:

And we could have instead used those 3 hours to set up a project for
translation using existing ...

Changed in rosetta:
status: Triaged → Won't Fix
EmmaJane (emmajane) wrote :

Excellent, this is now progress!! I have snipped the parts that are not
a productive use of our time.

Данило Шеган wrote:
> On Thu, 20 Nov 2008 at 22:07 +0000, EmmaJane wrote:
>> It doesn't seem to be in the spirit of open or more to the
>> point--it doesn't seem to be in the spirit of encouraging
>> participation to help distribute the work load involved in
>> translating documents.
>
> I offered to help set up an appropriate system, and judging from your
> "use-cases", I think I was correct in what you need. You have an
> option to disregard my experience (or simply not to trust me), but
> the choice is yours.

As I mentioned before, I have added documentation to the "staging"
server to test the functionality and still have not received a response
saying that my documents have been approved. If this is something you
can look into, I would appreciate it.

>> I have identified a problem and now I would like to identify a
>> solution. If there is a better place to put this feature request, I
>> would like you to help me find that place so that we can work
>> through the problems and find a solution!
>
> If you want to have a discussion, launchpad-users mailing list is the
> best place to have it on. If you want to engage in a technical
> discussion with Launchpad Translations developers, you can email us
> at <email address hidden> as well. Or if you've got any particular
> issues you want to discuss with me (eg. get help about setting up a
> module for translation using xml2po), you can find me on
> <email address hidden>.

Is rosetta@launchpad a discussion list? I have also emailed you directly.

>> Here is the clearly defined problem that I am trying to solve: In
>> the last 24 hours I have worked (as a volunteer) with two other
>> volunteers who wanted to help with translations.
>
> Thank you for sharing these two particular use cases. I am quick to
> mark a bug as "Won't fix" as soon as it's a blue-sky bug, because we
> get hundreds of requests like this. And they don't help.

My preference would be to keep one open and mark others as a duplicate.
If there are so many that you have to deal with surely it would be more
efficient to point people to where the discussion has already taken
place than to have the discussion fresh each time? "Won't fix" is very
different than "duplicate." Remember also that Bug #1 in Launchpad is
about as blue-sky as it gets. ;)

>> - LP holds the "code" (DocBook XML) for the Ubuntu desktop course.
>> - Volunteer downloads the files (using bzr branch) and translates
>>   them on their system using Poedit.
>> - What does the volunteer do with these translated files? In their
>>   mind it is not a separate project, it is a translation of an
>>   existing project.
>> - And furthermore, the XSLT style sheets that the UDC uses are
>>   language specific. They do not work with the translation the
>>   volunteer has just spent several months (probably without backups)
>>   working on.
>> - Using lp/bzr for the UDC code, there is no way to integrate a team
>>   of people --- how do you know that others are working on a
>>   translation of the project if there is no "translation" project
>>   tied to LP "code"?
>> - The "code" ca...
