As long as we’re generalizing past the specific example of Django documentation in Italian, I think that it’s worth reflecting on the active maintenance required to keep any sort of diversity effort in effect. Maintaining multiple different ports for different architectures and operating systems, for example, has all of the same dynamics, due to the need to find maintainers who “speak Windows”, “speak ARM”, etc.
I think that’s a great analogy, because supporting different architectures and platforms also has the benefit of finding bugs that are hidden or difficult to trigger on some platforms. When I started writing open source code, the recommendation was to make sure it ran on both an i386 BSD variant and SPARC64 Solaris and then it would run anywhere: you had 32/64-bit, big/little endian, strong/weak memory order, strong/weak alignment requirements, BSD/SysV userland and kernel APIs covered. Now it’s much harder to find a pair of platforms that have that kind of coverage.
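To make the endianness part of that concrete, here is a minimal sketch (not from the original comment) of the kind of bug that stays hidden until the code runs on a platform with a different byte order. The helper name `decode_u32` and the sample value are made up for illustration; only the standard `struct` and `sys` modules are used.

```python
import struct
import sys


def decode_u32(raw: bytes, byte_order: str) -> int:
    """Decode 4 bytes as an unsigned 32-bit integer.

    byte_order is "<" for little endian or ">" for big endian.
    """
    return struct.unpack(byte_order + "I", raw)[0]


# Bytes as a little-endian machine would write the value 1 to disk or a socket.
raw = struct.pack("<I", 1)

# The same four bytes, read back under each byte-order assumption.
little = decode_u32(raw, "<")
big = decode_u32(raw, ">")

print(f"host byte order: {sys.byteorder}")
print(f"read as little endian: {little}")
print(f"read as big endian: {big}")
```

Code that writes and reads such data in the host's native order works on every little-endian machine and only misbehaves once a big-endian port exists, which is why testing on a second, very different platform used to catch so much.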
Similarly, if you have translators then they will be the ones most likely to pick up on ambiguous phrasing or poor explanations. The best feedback that I’ve received on any of my books came from the Japanese translator. For one of them, I was very fortunate that he was writing the Japanese translation from my near-final draft, so I was able to fix everything he found in the final print version. He did a far better job than any of the copyeditors because he wasn’t just reading the words passively: he was actively trying to express the same ideas and so couldn’t just skim past things that didn’t make sense.
You need some solid infrastructure for this to be easy. In particular, you need to be able to pull out each sentence from your docs so that translators can just fix things that have changed. A style of one sentence per line in TeX / Markdown / whatever sources works well for this, but there are some tools that make it better.
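As a rough illustration of why one sentence per line helps, the sketch below compares two revisions of a source file line by line and lists only the sentences a translator would need to revisit. It is a minimal example under stated assumptions, not one of the tools alluded to above: the function name `sentences_needing_translation` and the sample text are hypothetical, and it relies only on Python's standard `difflib`.

```python
import difflib


def sentences_needing_translation(old_source: str, new_source: str) -> list[str]:
    """Return lines of new_source that were added or changed since old_source.

    Assumes the sources are written one sentence per line, so each
    changed line corresponds to exactly one sentence to retranslate.
    """
    old_lines = old_source.splitlines()
    new_lines = new_source.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)

    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" are the opcodes that introduce new text;
        # "equal" and "delete" need no new translation work.
        if tag in ("replace", "insert"):
            changed.extend(line for line in new_lines[j1:j2] if line.strip())
    return changed


old = (
    "Install the package.\n"
    "Run the tests.\n"
    "Open a pull request.\n"
)
new = (
    "Install the package.\n"
    "Run the test suite before you commit.\n"
    "Open a pull request.\n"
    "Wait for review.\n"
)

todo = sentences_needing_translation(old, new)
for sentence in todo:
    print(sentence)
```

With paragraphs wrapped at a fixed column instead, a one-word change can reflow several lines and make the whole paragraph look modified, which is the problem the sentence-per-line style avoids.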
Flesch-Kincaid is an established metric for linguistic complexity (sorry, I mean “Flesch-Kincaid warns you of hard words”) in the same way that McCabe complexity is an established metric for code readability or code coverage is an established metric for test completeness. Optimising for F-K score means writing for low-reading-grade audiences, which if your audience requires detailed technical complexity is doing them a disservice. In the same way that writing in English if your audience requires Italian is doing them a disservice.
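For reference, the Flesch-Kincaid grade level is 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below computes it with a deliberately crude syllable counter (groups of consecutive vowels), which is an assumption made here for brevity and not how real readability tools count syllables; the function names and sample sentences are likewise made up for illustration.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per group of consecutive vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level using the heuristic syllable counter above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * (len(words) / len(sentences))
        + 11.8 * (syllables / len(words))
        - 15.59
    )


simple = "Run the tests. Fix the bugs."
dense = "Instantiate the internationalization infrastructure before initializing authentication."

print(f"simple: {flesch_kincaid_grade(simple):.2f}")
print(f"dense:  {flesch_kincaid_grade(dense):.2f}")
```

The score only sees sentence length and syllable counts, so precise technical vocabulary is penalized exactly like needlessly ornate prose, which is the limitation being pointed at here.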
Actually, the point I was trying to make in the article is slightly different.
In the documentation of open source software, it would be better to keep the language simple (which does not mean pitching it at a low level of learning) while maintaining high technical value. This means avoiding slang, idioms, unnecessarily complex sentence constructions, and obsolete terms.
P.S. I don’t consider coverage a measure of test completeness, but rather a basic test condition.
If you want to get involved – with Italian or any other language – see the project (and the note in the README for why the translated content is in specific branches).
Django’s documentation currently has at least partial translations into ten languages other than English, and of course improvements to the English version to improve readability are always welcome, too.
Also, the strings included in Django, as well as things like date/time and numeric formatting, have a lot of translations/localizations you can contribute to if you want to improve them, and the django-localflavor project (formerly django.contrib.localflavor but now maintained separately) aims to provide country- and region-specific model and form fields, validation rules, and data for all sorts of other things like phone numbers, national ID numbers, postal codes, and so on.
This is a topic where I have changed my mind a little over the years. My first “bigger” contributions were to the German PHP manual, first proofreading and syncing updates, then translating whole pages. There were several reasons: I liked translating, the project was in good shape so it seemed beneficial to help keep it that way (i.e. I wasn’t someone starting a new translation for a couple of users), and more.
But I started getting less and less interested in this. Of course the main reason was that I was contributing more code and didn’t suddenly have unlimited time to work on the project, but I also grew more and more doubtful that the work was actually helpful. I think I’ve been to a single German IRC channel (again, PHP) that offered more than just “we talk about this topic” and actually had developers and deeper discussions (a lot of the German-speaking devs hung out there, so it wasn’t a general PHP channel, and with the right people around the internals were of course also a topic). But apart from that, everything is in English anyway: docs, bug trackers, source code comments. You can’t avoid English, and most developers I’ve met are able to read and understand the English docs reasonably well even if their spoken English is bad.
Or maybe I’m just bitter and have given up. I wouldn’t dream of telling anyone to stop these efforts, but in my personal experience there’s no way around English anyway and that’s why I stopped my efforts in that direction.
I feel this is very important. We shouldn’t needlessly divide our community.
The Django effort to increase English readability reminds me of the easy_rust project.
Very interesting example. Thanks.