Are you working in a non-English codebase?

Yes. I am not talking about local languages like ДРАКОН or 1C. I am talking about Elixir and Erlang.

Universities are usually several years behind on everything in programming. And still, any degree beyond a bachelor’s requires a good command of English. Academia in CS is 100% in English. If you take a look at any Erlang-related paper from Swedish academia or any Elixir-related paper from Brazilian academia, you’ll find that all of them are in English. So your point is completely wrong in practice.

Anyway, the level of knowledge of a person who does not speak English would be limited by what a few teachers know. And I personally do not teach anybody without English knowledge, since I don’t have time to translate every paper I want a student to know (and generally speaking, there is no point in translating it). Teachers tend to give the fishing rod, not the fish. So the first thing every CS program does is English courses.

3 Likes

I learn a lot by following etalab/transport-site on GitHub (its French description reads: “Make transport data available, promote it, and improve it”).

Can you recommend any similar codebases written in English? I’m looking for something similar in terms of code base, team size, open practices, and government support.

1 Like

Well, I think it’s fine to comment or document code in languages other than English, since we can easily translate full sentences back to English or any other language. For a single word taken out of its context, that won’t be the same thing though. I’m talking about variable and function names here…

However, upon quickly reviewing the repository you’re sharing, I noticed that apart from the comments, the names of variables, functions, and modules all seem to be in English. This would make it easy for a non-French speaking programmer to work on this as well.

One aspect of English that I appreciate is that comments or variable names in my code are generally quite short, even without resorting to abbreviations.

And since most programming languages, database languages, and essential system tools like the shell have their syntax in English, it makes sense to go with the flow, because you’re already immersed in it anyway.

For instance, I attempted to name an Elixir module with an accented French word or a word in Japanese characters, and I immediately received an error message stating that such characters are not permitted.

It’s a bit off-topic, but the thing I still find most challenging in English (because it’s not my native language) is speaking and listening. For this reason, I prefer written books and programming guides over video courses and tutorials. I know there are tools for translating speech into text, but I don’t find it enjoyable at all to read automatically translated subtitles when watching a video on YouTube. The translation is far from satisfactory for me, and it’s exhausting to keep pausing to read everything.

4 Likes

It makes sense only in the “Western world”. Think about what would happen if people from China suddenly spread across all countries. In that case English would no longer be the top natural language. The BRICS countries, for example, have something like half of the world’s population. I would not be surprised if many countries with a big budget have translated the core resources. That said, I think I have already seen a few such sites. One for sure is elixirschool.com, which supports 25 languages and is an open source project. If there were only one language for writing code, I would prefer Esperanto, even though I have not learned it yet. :bulb:

1 Like

Not really, it’s just that English is mostly accepted as the world language. Sure, there are literal billions of people speaking Chinese and Hindi but again, if a local company needs remote contractors then they’ll need their code to be in English.

Until there’s another universal / world language, English it is. Resisting this is pointless IMO.

2 Likes

Making lingua franca discussions political is entirely wrong in my opinion. Language is a tool for people to understand each other. For example, medicine uses Latin, though there are no native Latin speakers left. It is just that there was already a language with a ton of terms, and it made no sense to invent terms in a new one.

And it definitely makes no sense to talk about a lingua franca in terms of popularity, fairness, or politics.

English and Mandarin have almost the same number of speakers. And Mandarin is still the top language by native speakers. So we already live in the world you’ve described, and English is still the lingua franca, and it is most definitely the lingua franca in CS and SE.


Coming back to software development, we are finally living in a UTF-8 world, and no matter what flaws it has, it is better to live in a world where every text has exactly the same encoding, as opposed to a world with different encodings.
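To illustrate why a single encoding matters, here is a minimal Python sketch. The sample word is borrowed from later in this thread, and Latin-1 is chosen only as an example of a mismatched legacy encoding; it is an assumption made for illustration, not something anyone here reported hitting.

```python
# The same bytes read back under two different encoding assumptions.
text = "carnê"
utf8_bytes = text.encode("utf-8")

# Reader that agrees with the writer: the text round-trips intact.
right = utf8_bytes.decode("utf-8")

# Reader that assumes Latin-1 (illustrative choice): the two bytes of "ê"
# are shown as two unrelated characters.
wrong = utf8_bytes.decode("latin-1")

print(right)  # carnê
print(wrong)  # carnÃª
```

When every tool agrees on UTF-8, the second case simply cannot happen.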

3 Likes

I thought that already happened, otherwise what am I doing? :slight_smile:

English won. In a sense it is easier for us Chinese: it is a vastly different language, so we code switch: Chinese in private life, English in professional life. I guess it is a bit harder for native speakers of a European language; they might be tempted to mix up the two.

3 Likes

Off-topic: I do the same with books I read for pleasure. I speak English at a level that impresses British and American people, but I can’t make myself read a normal book in English. If it’s not in my native tongue then I don’t want it. I guess my brain has forever associated English with work.

1 Like

In your case I’d run.

In my case, I’m a native German speaker, I’d also run.

3 Likes

In my current job in Germany, we handle it the same: Code must be written in English, while documentation and comments should be German. I once had a C# codebase with German names for classes and methods, but reading it felt really awkward.

When reimplementing parts of it in Ruby, I switched to English names. Pluralization in Rails was another pain point. Thanks for not pluralizing tables and controllers in Phoenix, btw.

One reason for localized names may be doing DDD in German or whatever language and not translating the ubiquitous language to English. But that should be the job of business people and domain experts, not software engineers.

2 Likes

I believe this was the reason, yes.

My 2 cents on this discussion, as a Brazilian, is that some business logic and terms don’t exist in English. So how would you describe them? There’s no English equivalent for “boleto”, nor for “pix”, “carnê” and so many other terms, and translating them would lose the original meaning… So yes, sometimes I prefer to code in Portuguese (comments, variables, and database tables/columns).

Also, if a local company only has Brazilian workers who don’t know English, why make the effort to translate everything to the “world language”?

“But if a remote developer starts working at this company, they would need to learn the mother language.” Yes, that is literally what I need to do, as a Brazilian remote freelancer who needs to understand English for international companies…

Now, if you’re building an open source library whose target audience is indeed the “world”, it makes more sense to code in English.

2 Likes

The whole thread is so rich and interesting. It reminds me of an inspirational keynote at PyCon 2007 by Robert ‘r0ml’ Lefkowitz.

Among many amazing ideas packed in the presentation, he brought up the point of “why do we have to program in English, why can’t we all use our preferred spoken languages like in other domains like web browsers, text editors, operating systems…”

Sometimes I have toyed with that idea… it seems plausible when working with low-syntax programming languages like those in the LISP family. But the reality and economics of software seem to be going in a different direction.

I have not found any “spoken-language agnostic” programming language, let alone something viable long-term for production code. Can’t prove they don’t exist though :slight_smile:

2 Likes

J or any other esoteric special-character-only language.

It is also possible to redefine names in languages with preprocessors like C.

But still, any dependency you’re using will most likely be written in English, which will make a mess of your code.

I really like some languages built on top of non-English natural languages, and I love esoteric languages like J or Brainfuck, but, as an old meme says, “I am so tired of looking at the bad screen, can’t wait to go home to look at good screen”, or rephrased: there is programming we do for money, and there is programming we do for joy. Sometimes they intersect, but still all programming for money should be done in English, since it is a convenient default. (Though I know one exception: the neat 1C language, which is used for accounting in Russian-speaking countries.)

1 Like

Oh, yeah, I’m aware of J, APL, etc. But those are not what I meant.

Take a code base in J. I picked one at random, https://github.com/jonghough/jlearn: it is full of English (function names, etc).

The comparison with a web browser goes like this: “if you want to browse the web, do you need to know English? No.” You choose the language in which you want to use your web browser and off you go (and probably most people don’t even need to “choose”, it is somehow chosen for them based on their geographical location). The same browser can be used in multiple different languages.

The equivalent programming language environment would need to be automatically suitable to be “translated” into several distinct spoken languages… imagine a code base where you can choose if you want to code in English or German or Portuguese, without affecting what other programmers in the team can choose, and everyone gets to have a wonderful experience. Variables and functions and other constructs are available in multiple languages, perhaps automatically, perhaps with manual effort. I’m unaware of anything like that :upside_down_face:

Thanks for sharing, that one is new to me! Funny enough, ChatGPT gave me an example code snippet in English… and I complained it was supposed to be in Russian and it translated everything… I have no idea how much hallucination that was!

If such a tool existed and was broadly used in the industry, it would basically mean that the OP’s task would not be about translating the code base from German into English by replacing the existing text, but, instead, translating the code base to a new language while preserving all existing text: German speakers can consult the German version, English speakers consume the English one, and so on.

We do that with tools like gettext for web applications and other UI, but don’t do it for programming languages and code (even though IDEs support multiple spoken languages for their UI)!
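For readers who have not used gettext, here is a rough Python sketch of the idea for UI strings. `DictTranslations` and the sample messages are hypothetical names made up for this illustration; real projects usually load compiled `.mo` catalogs with `gettext.translation(...)` rather than an in-memory dict.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog, standing in for a compiled .mo file."""

    def __init__(self, messages):
        super().__init__()
        self._messages = messages

    def gettext(self, message):
        # Fall back to the original (English) string when no translation exists.
        return self._messages.get(message, message)


german = DictTranslations({"Save": "Speichern", "Cancel": "Abbrechen"})
english = gettext.NullTranslations()  # no catalog: strings pass through unchanged

for catalog in (english, german):
    _ = catalog.gettext
    print(_("Save"), "/", _("Cancel"))
```

The source code keeps a single set of English message ids, and each user sees the catalog for their own language. Nothing comparable is applied to identifiers in this sketch; that is exactly the gap being discussed.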

That’s a fun task to research in academia or by anybody who has a lot of free time and knowledge in automated translation. I would definitely make a student write a diploma thesis about software engineering translation.

Tools like gettext require knowledge of every language you’re supporting.

This might come close: https://www.unison-lang.org

Not sure if this still applies, but basically when you write a function definition it gets hashed and stored. Should you choose to rename your function definition at a later stage (or even translate it), the rest of the codebase, or even other code that depends on your library, does not care, because it uses the hash and not the name you have chosen for the function definition.

Edit: see 💡 The big idea · Unison programming language
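To make the hashing idea concrete, here is a toy Python sketch. It is not how Unison is implemented: the `Codebase` class, its methods, and the sample names are all hypothetical, and it hashes raw text, whereas Unison (as I understand it) hashes the structure of the definition.

```python
import hashlib


class Codebase:
    """Toy content-addressed store: definitions keyed by hash, names are just labels."""

    def __init__(self):
        self.definitions = {}  # hash -> definition text
        self.names = {}        # human-readable name -> hash

    def add(self, name, source):
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        self.definitions[digest] = source
        self.names[name] = digest
        return digest

    def alias(self, existing_name, new_name):
        # A second name (for example a translation) for the same definition.
        self.names[new_name] = self.names[existing_name]

    def rename(self, old_name, new_name):
        self.names[new_name] = self.names.pop(old_name)


codebase = Codebase()
price_hash = codebase.add("total_price", "quantity * unit_price")

# A dependent definition records the hash, never the name.
dependent_refs = [price_hash]

codebase.alias("total_price", "gesamtpreis")     # German label
codebase.rename("total_price", "preco_total")    # the English label is replaced

# The dependent still resolves, whatever the labels are called now.
resolved = codebase.definitions[dependent_refs[0]]
print(resolved)
print(sorted(codebase.names))
```

Because names live in a separate table, adding a German or Portuguese label changes nothing for code that already depends on the definition.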

2 Likes

Adding complexity to an already very complex ecosystem is not the way to go IMO.

While I generally agree, I would suggest you take a (closer) look at Unison. They have some interesting ideas and the whole hashing of definitions is done behind the scenes so you would not need to call a definition by its hash or something.

Another interesting idea is that your code base is essentially a database (SQLite). You do not have a folder with files in it; instead you have a code base server running which you can query. Once you have found what you want to change or add, you can instruct your code base to open a scratch pad where you write all your code.

They have some very interesting ideas which are very different from what we have been doing so far, and it might be useful to at least read and think about them.