Module naming conventions

joeerl · September 16, 2018, 5:29pm

I have been studying the scenic code and have a question about the module naming convention.

Here’s my problem - I quickly discovered that the first module to be called was MyApp so I wondered where the code was. It happens to be in a file called my_app.ex - this then supplies a module name MyApp.Sensor.Supervisor so I thought to myself “where can I find this code” – the answer is in the file lib/sensors/supervisors.ex (the name change from MyApp to my_app seems strange – file names are UTF8 so I don’t see why camel case file names are not used). Keeping the file and module names identical seems to be fairly common in many languages …

Now lib/sensors/supervisors.ex just says:

defmodule MyApp.Sensor.Supervisor do
…
end

In other words it defines a single module MyApp.Sensor.Supervisor.

So now I wonder “is this a local convention or a global convention?”

In Erlang module names and file names are exactly the same, that is the module xyz will always be found in a file xyz.erl this makes finding module code (if you know the name) very easy - also since module names are unique we can put all the modules in the same directory (people tend not to do this - but I do, since it makes the problem of deciding a directory name go away).

If a .ex file contains a single module definition I don’t really understand why the file name should not be exactly the same as the module name. If this were the case finding where the code is can be done with a simple find command.

In the scenic case the names differ - firstly some CamelCase stuff happens, and then some pluralisation: MyApp.Sensor.XXX is in a directory called lib/sensors/XXX.ex (note the plural) – is this a local convention or a widely adopted practice? Had there been only one sensor would the directory have been called lib/sensor and the plural s dropped?

If the conventions here are widely used then where are they documented?

Cheers

josevalim · September 16, 2018, 5:39pm

Hi @joeerl! In Elixir, we usually follow a direct mapping from module name to file name. So a module named MyApp.Sensors.Supervisor should be put in lib/my_app/sensors/supervisor.ex. So maybe you want to report this to the scenic project and get their $.02 on the topic.

Note it is just a convention though, Elixir does not care about the filename as long as it ends with .ex.
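
A minimal sketch of that mapping, assuming sources live under lib/. Macro.underscore/1 is part of Elixir's standard library; the ConventionalPath module and its path_for/1 function are hypothetical names used only for illustration.

```elixir
defmodule ConventionalPath do
  # Hypothetical helper, not part of Elixir or Mix.
  # Macro.underscore/1 turns "MyApp.Sensors" into "my_app/sensors":
  # CamelCase becomes snake_case and each "." becomes a "/".
  def path_for(module) when is_atom(module) do
    relative = module |> inspect() |> Macro.underscore()
    Path.join("lib", relative <> ".ex")
  end
end

IO.puts(ConventionalPath.path_for(MyApp.Sensors.Supervisor))
# prints lib/my_app/sensors/supervisor.ex
```

The module does not need to exist for this to work, since only its name is inspected.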

joeerl · September 16, 2018, 5:58pm

Hi @josevalim - I’m not sure about this convention. Breaking the convention will confuse people.

Suppose you have a single module defining say JoesApp.lib.misc and put it in a file called crypto/utilities/rsa.ex this will confuse everybody and is a violation of the principle of least astonishment. If you have hundreds of thousands of modules then you’ll have to build an index to find a particular module.

I would have liked a stronger convention - something like if there is no good counter-reason the file and module names should be identical. And if they are different the reason should be documented.

(actually I’d really like no readable file names at all - all file names should be SHA’s of the content and in a global content addressable store - but we’re not there yet)

A counter case would be when the file name is (say) the SHA1 of the content or a UUID (I have used both conventions) - here it would be obvious that the file name and module name differ.
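
To make the content-addressed idea concrete, here is a rough sketch in which the file name is a digest of the file's content. The ContentStore module, its put/2 and get/2 functions, and the demo directory name are all hypothetical; it uses SHA-256 rather than the SHA1 mentioned above, purely as an illustrative choice.

```elixir
defmodule ContentStore do
  # Hypothetical sketch of a content-addressed store: the file name is the
  # SHA-256 hex digest of the content, so identical content always gets the
  # same name and one name can never refer to two different contents.

  def put(dir, content) when is_binary(content) do
    name = :crypto.hash(:sha256, content) |> Base.encode16(case: :lower)
    File.mkdir_p!(dir)
    File.write!(Path.join(dir, name), content)
    name
  end

  def get(dir, name), do: File.read!(Path.join(dir, name))
end

# Demo in a temporary directory (assumed location, for illustration only).
dir = Path.join(System.tmp_dir!(), "content_store_demo")
source = "defmodule MyApp.Sensor.Supervisor do\nend\n"
name = ContentStore.put(dir, source)
IO.puts(name)
IO.inspect(ContentStore.get(dir, name) == source)
```

With names like these, nobody would expect the file name to say anything about the module inside, which is the point of the counter case.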

Failing a stronger convention, a warning “Gratuitous module/file name used …” might warn the user that whatever convention they were using might confuse people later.
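
A sketch of what such a warning could look like. This is not an existing Elixir, Mix, or Credo feature: the NameCheck module and its functions are hypothetical, it assumes sources live under lib/, and it finds modules with a regular expression rather than by parsing the code properly.

```elixir
defmodule NameCheck do
  # Hypothetical checker. Only matches unindented `defmodule Name do` lines,
  # so nested modules are ignored.
  defp pattern, do: ~r/^defmodule\s+([A-Z][\w.]*)\s+do/m

  def modules_in(source) do
    pattern() |> Regex.scan(source, capture: :all_but_first) |> List.flatten()
  end

  # Returns one warning string per module whose file is not where the
  # module-name-to-path convention would put it.
  def check(path, source) do
    source
    |> modules_in()
    |> Enum.map(fn name -> {name, Path.join("lib", Macro.underscore(name) <> ".ex")} end)
    |> Enum.reject(fn {_name, expected} -> expected == path end)
    |> Enum.map(fn {name, expected} ->
      "Gratuitous module/file name used: #{name} is in #{path}, expected #{expected}"
    end)
  end
end

source = "defmodule MyApp.Sensor.Supervisor do\nend\n"

"lib/sensors/supervisors.ex"
|> NameCheck.check(source)
|> Enum.each(&IO.puts/1)
```

For the scenic file discussed earlier this emits one warning, and it emits none when the same source sits at lib/my_app/sensor/supervisor.ex.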

tmbb · September 16, 2018, 6:12pm

How do you edit these files? Do you update the filename on each save? This sounds unworkable

joeerl · September 16, 2018, 6:37pm

Edit with a regular text editor - you could use one file per save or make a write-append file and save diffs. It’s very workable.
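
A minimal sketch of the write-append variant, under some simplifying assumptions: each save appends a full copy of the content rather than a diff, and the AppendLog module, its function names, and the log file name are hypothetical.

```elixir
defmodule AppendLog do
  # Hypothetical append-only version log. Each save appends one line:
  # the SHA-256 hex digest, a space, then the Base64-encoded content.
  # Nothing already in the log is ever rewritten.

  def save(log, content) when is_binary(content) do
    digest = :crypto.hash(:sha256, content) |> Base.encode16(case: :lower)
    File.write!(log, [digest, " ", Base.encode64(content), "\n"], [:append])
    digest
  end

  # All versions, oldest first, as {digest, content} tuples.
  def versions(log) do
    log
    |> File.read!()
    |> String.split("\n", trim: true)
    |> Enum.map(fn line ->
      [digest, encoded] = String.split(line, " ", parts: 2)
      {digest, Base.decode64!(encoded)}
    end)
  end

  def latest(log), do: log |> versions() |> List.last() |> elem(1)
end

log = Path.join(System.tmp_dir!(), "append_log_demo.log")
File.rm(log) # start the demo from an empty log
AppendLog.save(log, "version one\n")
AppendLog.save(log, "version two\n")
IO.inspect(length(AppendLog.versions(log)))
IO.inspect(AppendLog.latest(log))
```

The user would still see a regular file name; the digests only identify versions inside the log.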

GIT essentially does this - but the granularity is “per commit” and not “per file save” - I wrote a wiki years ago that saved all old versions of everything forever - it used diffs and compression - surprisingly saving all old versions forever does not result in huge text files.

Of course, the user should never be aware of the SHAs of the file but see ‘regular’ file names.

Far from being unworkable I think the opposite is true. Write append only files with all old versions have many desirable properties. I hardly dare say that blockchains are “merely” write append stores with all old versions in the chain - the untrusted nature of things like bitcoin makes implementations energy inefficient - but in a trusted environment a blockchain reduces to a write append only store with a few crypto certificates thrown in for good measure.

Actually journaling file systems are used everywhere, and these are just write append logs.

Storing files in mutable stores which can be overwritten many times has loads of other problems. Long subject - not enough room here to make the case for a content-addressed file store.

josevalim · September 16, 2018, 7:25pm

Exactly. The convention would ask you to put it inside lib/joes_app/lib/misc.ex. So as long as you follow the convention, no surprises should arise.

tmbb · September 16, 2018, 7:33pm

Ok, that makes sense. I thought you meant user-visible SHA-based names. I like your approach, yeah. I think version control should be more integrated in the systems we use everyday.

AstonJ · September 16, 2018, 9:23pm

I’ve found that generally in Elixir and Phoenix we don’t pluralise (pluralisation does happen in other frameworks in other languages, like Rails, however it can get messy with edge cases, such as ‘octopus’!).

Naming conventions help a lot in frameworks, so for instance Phoenix will infer the name of certain modules from the name of others. For example it infers the name of the view, AppWeb.UserView, from the controller, AppWeb.UserController. Similarly, the view modules infer their template locations from the module name. So in the example in the Phoenix book, RumblWeb.UserView would look for templates in the rumbl_web/templates/user directory.
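
To illustrate the kind of inference described above, here is a rough sketch in plain Elixir. It is not Phoenix's actual implementation: the NamingSketch module and both functions are hypothetical and only mirror the two examples in this post.

```elixir
defmodule NamingSketch do
  # Hypothetical illustration of suffix-based name inference.
  # Assumes module names with at least two segments, such as AppWeb.UserView.

  # AppWeb.UserController -> AppWeb.UserView
  def view_for(controller) when is_atom(controller) do
    controller
    |> Module.split()
    |> List.update_at(-1, &(String.replace_suffix(&1, "Controller", "") <> "View"))
    |> Module.concat()
  end

  # RumblWeb.UserView -> "rumbl_web/templates/user"
  def template_dir(view) when is_atom(view) do
    [root | rest] = Module.split(view)

    resource =
      rest
      |> List.last()
      |> String.replace_suffix("View", "")
      |> Macro.underscore()

    Path.join([Macro.underscore(root), "templates", resource])
  end
end

IO.inspect(NamingSketch.view_for(AppWeb.UserController))
IO.puts(NamingSketch.template_dir(RumblWeb.UserView))
```

Because the names follow a fixed suffix pattern, the derived module and directory can be computed from the name alone, without any configuration.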

If you’re wondering why RumblWeb becomes rumbl_web in the directory or why PageController in the module name becomes page_controller in the file name - I guess it’s for readability (and possibly inherited from the Ruby/Rails world - after I imagine quite a bit of discussion around it). There may be other benefits which I am not aware of though.

Once you get used to these conventions, it does make life easier as you don’t have to specify them manually (although of course you can should you wish to break these conventions).

With regards to the pluralisation you found in Scenic, maybe @boydm could chime in and let us know his reasoning

mgwidmann · September 17, 2018, 12:27am

@joeerl I think it’s important to understand that not everyone uses the same tools and while putting everything in one directory may work for you it doesn’t for plenty of others. I’ve experienced what you’re describing with Swift (and maybe Obj-C too?). Apple’s XCode editor organizes source files into groupings that look like folders while you’re building things and then modifies a big XML file which saves this change but the files are not organized that way in the file system. Problem is, when looking at the source code via an external tool (maybe you don’t use XCode, or via GitHub or anything else), it’s a giant unmanageable flat list of files for the larger projects that goes on for many screens. The compiler doesn’t recursively traverse directories or something silly like that, it’s quite a hack IMO.

I don’t believe external tools will ever get on board with anything like this, and since there will always be more and more of them, it will just become a bigger headache for advanced users and newcomers alike to deal with the two different organizational structures.

Elixir usually goes by uppercase camel names for module names and lower snake case for file names. I think this is particularly fitting because the lowercase version is typically accessed via command line (where the standard is typically lowercase for faster typing) and module names make sense as uppercase to denote their importance with respect to the other things around them (variables, etc.).

boydm · September 17, 2018, 8:16am

In general, I’ve been trying to follow the non-plural forms. The only exception has been the Scenic.Primitives and Scenic.Components modules, which are collections of helper files to instantiate those data structures.

I’ll go review the scenic_new project. Came together late in the game…

boydm · September 17, 2018, 8:39am

@joeerl This is a really good question.

When I set up the scenic_new project I took a look at Phoenix and tried to replicate the pluralization there. When you run mix phx.new my_app, it creates a lib/controllers folder, but the modules within have controller in the singular form.

Same goes for views.

When it comes to Scenic, it made sense to follow the Phoenix pattern. An app will have a collection of scenes. So put those in the scenes folder. However, each module defines a single scene, so it would have a name like MyApp.Scene.Whatever.

On further thinking, the folder structure in Phoenix generated apps doesn’t really map to the Module names. In other words,

lib/my_app_web/controllers/page_controller.ex contains MyAppWeb.PageController instead of MyAppWeb.Controller.Page or MyAppWeb.Controllers.Page

To me this breaks the philosophy @josevalim described above.

Personally, I would prefer a module name of MyAppWeb.Controller.Page, which (to me) denotes that it is a single controller. I could go either way if it should live in a folder named controller or controllers, although I have a slight preference for controllers, since that folder contains a collection of individual controllers.

That’s the philosophy I used in the scenic_new generator.

As far as the sensor supervisor goes, the version in master is MyApp.Sensor.Supervisor, which is singular. It lives in the sensors folder, which would be where I would put all the sensors. (I assume a real project would have more than one)

If there is a strong opinion on pluralization of the folders, I can change it, but for now I think it reflects the folder pluralization in Phoenix.

LostKobrakai · September 17, 2018, 9:08am

Afaik phoenix opted out of the convention to prevent having tons of .Controller and .View modules.

peerreynders · September 17, 2018, 2:21pm

My perspective on MyAppWeb.Controllers.Page:

Page module
in the MyAppWeb.Controllers “namespace”
therefore I expect Page to be a single controller - one among many in the MyAppWeb.Controllers “namespace”.

So the pluralization isn’t in the service of a good folder or module name but to create the sense of a namespace.

Now when it comes to the wisdom of having separate MyAppWeb.Controllers and MyAppWeb.Views namespaces given how tightly coupled controllers and views tend to be - that is a separate discussion (as I recall there are technical reasons for this as views require very different build processing from controllers - I’ve never been fond of the Rails convention of collecting “like things” under the same namespace (when we are not dealing with a (standard) library); I’m more of a put things that work as a cohesive whole under the same namespace sort of guy).

AstonJ · September 17, 2018, 3:27pm

Ah that makes total sense. (For some reason I was thinking about automatic pluralisation (and conventions) that happen in frameworks like Rails (and not in Phoenix) of modules/schemas etc, rather than the generated structure on creation.)

I quite like that too!

Thinking about it, and I remember it was discussed somewhere but not sure where now - does anyone know why in Phoenix we dropped the web folder for MyAppWeb? I think Web.Controller.Page or Web.Controllers.Page is much cleaner.

OvermindDL1 · September 17, 2018, 4:45pm

I’ve noticed that myself, I’ve accidentally let slip a camel-cased filename in my folders… ^.^;

A ‘direct’ mapping would be putting it in lib/MyApp/Sensors/Supervisor.ex though?

Lol, so git without the filename part of the tag? ^.^

My dream has always been a database of code personally. Everything properly linked together, denormalized, etc… Mmmm…

As something similar, OCaml uses lower-case-initial-character filenames but Modules always start with an upper case character. I’ve always found that odd too. In C++ I have the filenames match the case of the namespace/static-class that is defined in them.

I prefer the MyAppWeb.Controllers.Page style too, but that is harder to macro’ize automatic linking with Views and all as the ‘Controllers’ part could be arbitrarily higher from the current controller name if you use a hierarchy and such things, so for macro’ing reasons having it be appended seems ‘safer’.

boydm · September 17, 2018, 4:58pm

Ah. That makes sense. I’m not trying to macro-ize anything, so am going to try to stick with the more readable name.

wojtekmach · September 17, 2018, 5:30pm

I believe the reason that Phoenix has PostController instead of Controllers.Post is that it’s pretty common to alias modules and it wouldn’t be possible to have in the same module the following:

# contrived:
alias Controllers.Post
alias Views.Post
alias Blog.Post
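
A runnable sketch of that clash and one way around it. The module names below are made up for the example and are not Phoenix conventions: the short name Post can only point at one module at a time, so the other two need an explicit :as name.

```elixir
defmodule Blog.Post do
  def kind, do: :schema
end

defmodule Controllers.Post do
  def kind, do: :controller
end

defmodule Views.Post do
  def kind, do: :view
end

defmodule AliasDemo do
  # All three modules end in Post, so three plain aliases would all compete
  # for the same short name. Giving two of them an explicit :as name keeps
  # every module reachable.
  alias Blog.Post
  alias Controllers.Post, as: PostController
  alias Views.Post, as: PostView

  def kinds, do: [Post.kind(), PostController.kind(), PostView.kind()]
end

IO.inspect(AliasDemo.kinds())
# prints [:schema, :controller, :view]
```

Suffixed names like PostController bake that disambiguation into the module name, so no :as option is needed.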

A good example of not sticking to a 1-1 mapping between file and module is in Elixir itself: https://github.com/elixir-lang/elixir/tree/master/lib/elixir/lib/calendar, where lib/calendar/date.ex defines Date instead of Calendar.Date. It’s very convenient to keep these files together.

What I personally try to do is to stick to 1-1 mapping in the vast majority of cases but sometimes, very rarely, diverge from it when it has some benefits.

dimitarvp · October 3, 2018, 4:15am

@joeerl I quite like Java’s way of enforcing the same file name as the class name (when the class is public anyway), and forcing one class per file:

Go has conventions that the compiler enforces:

I don’t quite like the more liberal approach of Elixir because I’ve seen people put 4 modules in a single file and the name of the file does not even reflect the purpose of the bunch of modules. Granted that’s not often the case but IMO the tooling should not give you the ability to shoot yourself in the foot.

Conventions are only good for as long as the inhabitants of the ecosystem are willing to comply with them.

I understand some of the counter-arguments though – like convenient aliasing and naming consistency for the sake of macro automation / code generation.

I would absolutely do this. It gets tiring chasing files in directories even with an IDE. When a project grows beyond a certain point however, a single directory becomes unsustainable (I’d say if you go beyond 40-50 files; the whole thing has to be human-friendly after all). But right now I think that having separate directories for 1-2 modules (usually the root of your app – lib/my_app/my_app.ex) is taking it too far and is unnecessary. Same for the separate directory for 1-2 supervisors. Configuration and tests do make a lot of sense in where they are now, but nothing much else really.

I’d like to see more conventions inside the lib/ or lib/my_app directory contents, like supervisors/ or background_workers/. And a flatter directory structure overall.

I can’t argue a directory hierarchy is very useful for Phoenix projects though. Or umbrella projects where you can have an app for Phoenix site, GraphQL gateway, REST gateway etc.

EDIT:
In a recent project I had these two files in lib/: my_app.ex and cli.ex, where the latter file contained a module named MyApp.CLI. Still not sure if the CLI file should be named my_app.cli.ex or be forced to my_app/cli.ex. I lean toward the former because I think Elixir projects have too many directories with too few files in them.

axelson · October 3, 2018, 4:19am

If you want to use credo to enforce the module name there’s a PR up for that feature: https://github.com/rrrene/credo/pull/587

joeerl · October 3, 2018, 7:15am

Just a followup - over the years I have tried many different ways of organising files - none of them are good.

What I currently do (or am tending towards) is the following:

a) All modules that I think I might reuse are put in the same directory (called Dropbox/elib) with hopefully meaningful module names.

The advantage of this is I can grep for things in this directory and easily find things.

The Dropbox bit means my files are synced across all the machines I use - so for shared code there is one version (the latest) with a unique name on all machines.

It’s not even in GIT - I don’t want all old versions of the code just the latest and I’ll change the name if I make significant changes to the code - (and I’m not collaborating with anybody here - for collaborative code I do use GIT - but for private projects I don’t care about branches and saving old versions - just the latest is fine)

(I should add that I’m old-school - I learned to program before revision control systems were invented - so we developed strategies based on file naming that were useful - these strategies are a lot simpler than things like GIT and are fine for personal projects - but not for big collaborative projects)

b) All projects go into a sub-directory of Dropbox/experiments with a meaningful name

c) Inside a project directory (say Dropbox/experiments/transluder) local modules are just stored anywhere - but modules using the shared code in lib I just add as symbolic links.

This has a few advantages - my ls command colors symbolic links so I can see they are shared. The editor always edits the master version (in lib) - all experiments using the shared library get to see the same version of the code.

What I actually hate is having multiple versions of a file with the same name in different directories. These files end up with different sizes and modification dates. Some go back 30 years and all are slightly different.

I also hate the idea that files with the same name (in different directories) can have different contents. It’s ok for a file with a unique name in a single directory to have time varying content - I’m usually only interested in the latest version.

Before GIT (and friends) I used to name files with names like lib_this_vsn1, lib_this_vsn2 etc., bumping the version when appropriate. This was great - all the old versions were frozen and we knew that the greatest vsn could change without warning. This was how erlang was developed - there was no GIT - @rvirding and I worked on a module until we got fed up and then mailed the latest version to each other.

GIT etc hides this so we just say lib_this - the problem here is this fails the “telephone test” - ie if we were to tell somebody “the code is in the file XYZ” two people could view the file at the same time (both see it is called XYZ) but they are looking at different versions of the code and are not aware of this fact. This has happened to me so many times since files get themselves detached from their revision control systems (for example by being sent in mails) and is a large source of errors.

Actually this is a symptom of a much deeper problem - but there is not room here to explain this (short version - the problem lies in editors, they do not record where the data in the file came from in the first place)