The best format for multiple-word identifiers

Darius J Chuck

2021-07-07

Let’s consider the examples of multiple-word identifier formats from Wikipedia:

Formatting	Name(s)
`twowords`	flat case
`TWOWORDS`	upper flat case
`twoWords`	(lower) camelCase, dromedaryCase
`TwoWords`	PascalCase, UpperCamelCase, StudlyCase
`two_words`	snake_case, pothole_case
`TWO_WORDS`	SCREAMING_SNAKE_CASE, MACRO_CASE, CONSTANT_CASE
`two_Words`	camel_Snake_Case
`Two_Words`	Pascal_Snake_Case
`two-words`	kebab-case, dash-case, lisp-case
`TWO-WORDS`	TRAIN-CASE, COBOL-CASE, SCREAMING-KEBAB-CASE
`Two-Words`	Train-Case, HTTP-Header-Case
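
To make the table concrete, here is a minimal Python sketch that derives each of the formats above from a list of words. The function name `format_identifier` and the dictionary keys are hypothetical, chosen only for this illustration; the sketch assumes plain ASCII words.

```python
def format_identifier(words):
    """Render a list of words in each format from the table above.

    Hypothetical helper for illustration; assumes plain ASCII words.
    """
    lower = [w.lower() for w in words]
    upper = [w.upper() for w in words]
    title = [w.capitalize() for w in words]
    # camelCase variants: first word lowercase, the rest capitalized
    camel = lower[:1] + title[1:]
    return {
        "flat case": "".join(lower),
        "upper flat case": "".join(upper),
        "camelCase": "".join(camel),
        "PascalCase": "".join(title),
        "snake_case": "_".join(lower),
        "SCREAMING_SNAKE_CASE": "_".join(upper),
        "camel_Snake_Case": "_".join(camel),
        "Pascal_Snake_Case": "_".join(title),
        "kebab-case": "-".join(lower),
        "TRAIN-CASE": "-".join(upper),
        "Train-Case": "-".join(title),
    }


formats = format_identifier(["two", "words"])
for name, rendered in formats.items():
    print(f"{rendered}\t{name}")
```

Every format is a lossy encoding of the same word list: the separator and the capitalization rule are the only things that vary.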

These are the common ways of working around the fact that spaces are forbidden in identifiers in virtually all modern programming languages. This was not always the case, however. As the same article notes:

Historically some early languages, notably FORTRAN (1955) and ALGOL (1958), allowed spaces within identifiers, determining the end of identifiers by context. This was abandoned in later languages due to the difficulty of tokenization.

What seems to remain obscure is that early Lisp (ca. 1956-1958) also allowed spaces within identifiers (there called symbols). A footnote added in 1995 to the foundational paper on Lisp (John McCarthy, April 1960) states:

Imbedded blanks could be allowed within symbols, because lists were then written with commas between elements.

Here the difficulty of tokenization certainly wasn’t an issue, since the commas marked where each symbol ended. So why did Lisp move away from this? I can find no clear explanation. I suspect, though, that it comes down to a historical accident.

Moreover, I’d argue that allowing spaces in identifiers was the right idea and that it can be made to work very well with a simple syntax akin to Lisp’s.

This would truly be a better format for multiple-word identifiers than all of the existing ones, because the need to map between natural language and the programming language would be removed altogether, without compromise.
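
As a rough illustration of why commas make tokenization easy, here is a minimal Python sketch of a reader for a Lisp-like notation in which list elements are separated by commas, so symbols may contain spaces. The concrete syntax, the whitespace normalization, and the names `tokenize` and `parse` are all assumptions made for this example; this is not a reconstruction of early Lisp's actual notation.

```python
import re


def tokenize(source):
    """Split on the delimiters ( ) and , keeping them as tokens.

    Anything between delimiters is a symbol; surrounding whitespace is
    trimmed and inner runs of whitespace collapse to one space
    (an assumption of this sketch).
    """
    tokens = []
    for piece in re.split(r"([(),])", source):
        text = " ".join(piece.split())
        if text:
            tokens.append(text)
    return tokens


def parse(source):
    """Parse one expression into nested Python lists of symbol strings."""
    tokens = tokenize(source)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            if pos < len(tokens) and tokens[pos] == ")":
                pos += 1  # empty list
                return items
            while True:
                items.append(read())
                if pos >= len(tokens):
                    raise SyntaxError("missing closing parenthesis")
                separator = tokens[pos]
                pos += 1
                if separator == ")":
                    return items
                if separator != ",":
                    raise SyntaxError(f"expected ',' or ')', got {separator!r}")
        if token in (")", ","):
            raise SyntaxError(f"unexpected {token!r}")
        return token  # a symbol, possibly containing spaces

    result = read()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return result


tree = parse("(total price, (unit price, item count))")
print(tree)
```

Because only parentheses and commas delimit tokens, a symbol such as `total price` needs no special casing or quoting; the cost is one extra separator character between list elements.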