A Rose by Any Other Name...

Coding Horrors Community

Started by Ectara May 15, 2014 04:42 AM

16 comments, last by Ectara 9 years, 11 months ago

samoth

9,833

May 17, 2014 08:01 PM

The only place where I really find case sensitivity harmful instead of helpful is filenames under Unix (or unix-like).

DOS/Windows got it right that time (yes, that's a rare thing, but it happens!). I want to be able to type del thatfileyousentme.zip and it should work. I don't want to type ThatFileYouSentMe.zip or any other possible capitalization. I don't want to be forced to remember how you spelled that file some 3 or 4 weeks ago, and even if I remember (or ran ls to find out), capitalizing is much more complicated to type than necessary.

It doesn't get in my way often, admittedly, since most people are too lazy and use all-lowercase anyway. But it does happen, e.g. every time I update VirtualBox guest additions. Autorun doesn't work because of sudo, nor does double-clicking an icon, so you have to do it by hand in a root shell with something like sh ./VBoxLinuxAdditions.run, but really, wtf? Why can't I just type sh ./vboxlinuxadditions.run and have it work anyway? The user's intent is very clear; there's no way it could be misunderstood. Sure, v and V are different characters to a computer, as are b and B, but why do I have to care as a user?

swiftcoder

18,997

May 18, 2014 01:54 PM

It doesn't get in my way often, admittedly, since most people are too lazy and use all-lowercase anyway.

And, you know, tab completion and shell glob both work on Unix/Linux, so in reality you would be typing something like 'sh ./V<TAB>'...

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

samoth

9,833

May 18, 2014 02:38 PM

And, you know, tab completion and shell glob both work on Unix/Linux, so in reality you would be typing something like 'sh ./V<TAB>'...

You'd wish. Unfortunately, it doesn't work nearly as well as you'd like. There are three or so files with identical-prefix names, and tab completion will complete up to the next word, which means you need to type a capital "L" next. Of course tab and shift are the same finger on the left hand, so hitting tab and moving the hand makes the subsequent shift so unergonomic that it's faster to type "Linux" than to use tab completion. It's better for "Additions", since that's the right shift key you'll use.

But how nice would it be if you could just type 3-4 lowercase letters, and tab-complete them correctly (assuming it's not ambiguous, of course), and it would just work.

Luckily this is something that happens only once or so per month, so it's tolerable.

Lactose

11,696

May 18, 2014 03:38 PM

There are three or so files with identical-prefix names, and tab completion will complete to the next word

In Windows' command line, hitting Tab multiple times will cycle through the matching names. Is this not the case for Unix/Linux?

Also, I might type strangely, but I use different fingers for LShift and Tab. Pinky finger for LShift, ring finger for Tab.

Hello to all my stalkers.

swiftcoder

18,997

May 18, 2014 04:20 PM

In Windows' command line, hitting Tab multiple times will cycle through the matching names. Is this not the case for Unix/Linux?

It does (in essence, the actual behaviour is different), but it only shows matches that match *case* as well as characters.

Of course, what samoth does not know about is the relevant bash configuration option:
    set completion-ignore-case on
Drop that baby in your ~/.inputrc, and you are ready to party like it's 1998...
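
For illustration, here is a minimal Python sketch of the difference that option makes when matching a typed prefix against file names. It is not how readline is implemented; the `complete` function and the directory listing are hypothetical, made up for this example.

```python
def complete(prefix, names, ignore_case=False):
    """Return the names that start with prefix, sorted for stable output."""
    if ignore_case:
        # Compare case-folded copies, but return the names as stored.
        key = prefix.casefold()
        return sorted(n for n in names if n.casefold().startswith(key))
    return sorted(n for n in names if n.startswith(prefix))


# Hypothetical directory listing, for illustration only.
names = ["VBoxLinuxAdditions.run", "VBoxSolarisAdditions.pkg", "notes.txt"]

print(complete("vboxl", names))                    # case-sensitive matching
print(complete("vboxl", names, ignore_case=True))  # case-insensitive matching
```

With case-sensitive matching the lowercase prefix finds nothing; with folding it narrows to the single Linux installer, which is the behaviour samoth was asking for.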

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

TheChubu

9,484

May 18, 2014 05:45 PM

I hope there isn't a setting so bash adds those stupid double quotes that Windows' cmd has all over the place...

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

My journals: dustArtemis ECS framework and Making a Terrain Generator

Ectara

3,097

Author

May 18, 2014 09:12 PM

DOS/Windows got it right that time (yes, that's a rare thing, but it happens!). I want to be able to type del thatfileyousentme.zip and it should work. I don't want to type ThatFileYouSentMe.zip or any other possible capitalization. I don't want to be forced to remember how you spelled that file some 3 or 4 weeks ago, and even if I remember (or ran ls to find out), capitalizing is much more complicated to type than necessary.

It usually gets on my nerves because you never know the name of the original file. I see that a program is trying to load "LONGFILENAME.TXT" from somewhere, and it is failing to parse it, so I'm trying to find it, and I'm passing right over it, or it doesn't show up in the search results, because it's named "LongFileName.txt". The only way to solve the problem is to be able to search case-insensitively; it seems to be propagating the responsibility of handling case-insensitivity all of the way up the toolchain to keep its usage from breaking at one critical point.

I always hear stories of people working on database projects with others who find that colleagues were connecting to the database "mydb" and scanning the table "the_table", and then have the (cross-platform) program abort when they move to a different platform, because the database was actually "myDB" and the table was "The_Table". If the programmers knew it was spelt one way, why did they purposely spell it another way?

Another notion that I hear a lot is "I always use the wrong case in an identifier name, or I hold the shift key too long when I type an identifier name." The very idea of a file full of "foo", "Foo", "FOo", "foO" (whoops, left caps-lock on, too. Still good!) all referring to the same thing scares me. Say we're now refactoring the code, and that name wasn't very descriptive, or we removed the dependency on that library. Now, let's just change all occurrences of that variable... Great, now we need a case-insensitive search, and we need to keep from replacing things that look like "foo", like "FooBar" or "Footer".
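
To make that refactoring problem concrete, here is a small Python sketch of the search it would require: case-insensitive, but restricted to whole identifiers so that "FooBar" and "Footer" are left alone. The function name `rename_identifier` and the sample line are assumptions for illustration; a real refactoring tool would work on the parsed program, not on text.

```python
import re


def rename_identifier(source, old, new):
    """Replace whole-word occurrences of old, in any letter case, with new."""
    # \b anchors the match at identifier boundaries, so "FooBar" and
    # "Footer" do not match when old is "foo".
    pattern = re.compile(r"\b" + re.escape(old) + r"\b", re.IGNORECASE)
    # A function replacement avoids backslash processing in new.
    return pattern.sub(lambda match: new, source)


# Hypothetical line of case-insensitive source code.
code = "foo = Foo + FOo; FooBar(foO); Footer = 1"
print(rename_identifier(code, "foo", "widget_count"))
```

Even this sketch would still rewrite matches inside strings and comments, which is part of why consistent casing is the cheaper option.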

One-off command line interactive input: okay, if you want it case-insensitive, set the correct option, and you're good to go. If you're writing code, why would you knowingly type the identifier in an inconsistent case? Saving a little bit of effort seems hardly worth the lack of readability. There's always the option of lowercase_with_underscores if you don't want to wonder what the correct case is.

samoth

9,833

May 18, 2014 10:00 PM

"myDB"

That used to be (still is?) an issue with "portable" webservers, too. If someone named the page you were trying to read ThePage.htm, the webserver would show a 404 because you would of course try to get http://example.com/thepage.html (which would fail on two counts, the capitalization and the extension). But hey, why do you have to do the dance for the darn computer? It's not rocket science to figure out what you want if there is just one name with the same (except capitalisation) basename and an alternate spelling of the extension.
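
A rough Python sketch of that fallback, assuming the server has a list of the names that actually exist: try the exact name first, then accept a match that differs only in case or in the .htm/.html spelling, but only when exactly one such name exists. The names `find_page` and `EQUIVALENT_EXTENSIONS` are hypothetical, not taken from any real web server.

```python
import posixpath

# Extensions treated as alternate spellings of each other (an assumption).
EQUIVALENT_EXTENSIONS = [{".htm", ".html"}]


def _key(name):
    """Fold case and map equivalent extensions to one representative."""
    base, ext = posixpath.splitext(name.casefold())
    for group in EQUIVALENT_EXTENSIONS:
        if ext in group:
            ext = sorted(group)[0]
    return base, ext


def find_page(requested, existing):
    """Return the stored name to serve, or None for a 404."""
    if requested in existing:
        return requested  # an exact match always wins
    wanted = _key(requested)
    matches = [name for name in existing if _key(name) == wanted]
    # Only guess when the guess is unambiguous.
    return matches[0] if len(matches) == 1 else None


print(find_page("thepage.html", ["ThePage.htm", "index.html"]))
```

When two stored names fold to the same key, the sketch refuses to guess and returns None, which is the ambiguity the rest of the thread argues about.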

file full of "foo", "Foo", "FOo", "foO" (whoops, left caps-lock on, too. Still good!) all referring to the same thing scares me

Yup. The important difference between filenames and URLs on the one hand and a source file on the other is that, for at least half of these different versions, it actually makes sense to have them in a source file (... meaning different things). Whereas for a file name, there is usually only one such file, and case sensitivity only harms you.

FILE* file = fopen(...) tells me purely from the capitalization rules that FILE is "some obscure macro definition" and file is a variable. Of course I could choose a better, more intelligent name for file that does not "collide" with a standard library name, but is there a better name for, well... a file?

Same for class names and instances, I often find myself wanting to have an instance that's called the same as the class (only it's all lowercase, not camelcase).

On the other hand, I don't care whether a file that I want to open is spelled foobar or FOOBAR or FooBaR. There exist no other files with alternative spellings, and I'm just interested in opening the darn thing, regardless of such sophistries as whether "F" and "f" are binary identical or not.

Ectara

3,097

Author

May 18, 2014 11:21 PM

That used to be (still is?) an issue with "portable" webservers, too. If someone named the page you were trying to read ThePage.htm, the webserver would show a 404 because you would of course try to get http://example.com/thepage.html (which would fail on two counts, the capitalization and the extension). But hey, why do you have to do the dance for the darn computer? It's not rocket science to figure out what you want if there is just one name with the same (except capitalisation) basename and an alternate spelling of the extension.

On one hand, users shouldn't be exposed to bare file path URLs, with modern URL rewriting and the majority of pages being accessed through links from other pages.

On the other hand, I don't care whether a file that I want to open is spelled foobar or FOOBAR or FooBaR. There exist no other files with alternative spellings, and I'm just interested in opening the darn thing, regardless of such sophistries as whether "F" and "f" are binary identical or not.

Whoops, forgot you were only talking about file names, sorry.

I've always felt that case-insensitivity in file names was a relic of an outdated era (like using FAT32, with its case-insensitivity but case preservation), and was due to the constraints of the technology (and the convention that followed from its use), rather than a conscious design decision on the OS's part. Case-insensitivity also reduces the number of possible file names by a great factor.

Now, where do we draw the line on case-sensitivity? What about diacritics? Should an 'ñ' and an 'n' be considered the same letter in a case-insensitive file system? A lot of people make no distinction between case-sensitive and accent-sensitive. I wouldn't want to be the poor sap that was writing a Spanish application that looked for "año.jpg", and found "ano.jpg". How about Japanese? I found it useful to be in a Japanese book store, and look for a particular book by typing its name in katakana in the search terminal; I know how the title is read, but I can't write it in kanji. However, for file names, that's a poor solution (see an example of this, using romaji instead of katakana, here: http://www.cjk.org/cjk/reference/japhom.htm#2); there are many different words and phrases that wind up being homophones, so being able to type a file name in katakana or hiragana means that you'll hit serious ambiguity with several different file names of different meaning corresponding to the same phonetic sounds that can be expressed in katakana or hiragana. So, while this is suitable for, say, a book store search terminal to allow illiterate people like me to find literature, I find this to be a poor solution for a file system. There are many other languages to consider in this regard.
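
The distinction between case and accents can be shown with a short Python sketch using the standard `unicodedata` module. The helper names `fold_case` and `fold_case_and_accents` are made up for this example, and it makes no claim about how any particular file system compares names.

```python
import unicodedata


def fold_case(name):
    """Case-insensitive key: accents are kept, only letter case is folded."""
    return unicodedata.normalize("NFC", name).casefold()


def fold_case_and_accents(name):
    """Case- and accent-insensitive key: combining marks are dropped too."""
    decomposed = unicodedata.normalize("NFD", name.casefold())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# Case folding alone treats these as the same name...
print(fold_case("AÑO.jpg") == fold_case("año.jpg"))
# ...but still keeps "año" and "ano" apart.
print(fold_case("año.jpg") == fold_case("ano.jpg"))
# Stripping accents as well is what makes the two collide.
print(fold_case_and_accents("año.jpg") == fold_case_and_accents("ano.jpg"))
```

The collision between "año.jpg" and "ano.jpg" only appears in the second helper, so it is a consequence of accent-insensitivity, not of case-insensitivity as such.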

In short, our notion of file system case-insensitivity generally only applies to ASCII English characters, and, in my opinion, is a hold-over from when even that was more than we wanted to or could care about. It's a slippery slope, without considering the implications of anything that isn't English, leading to an inconsistent representation of file names. Furthermore, we already know it is possible to encounter "File.txt" and "file.txt" at the same time in a lot of filesystems; using a scheme that can't deterministically pick one or the other just asks to be bitten by this.

Oberon_Command

6,371

May 20, 2014 05:17 AM

Now, where do we draw the line on case-sensitivity?

How about on case-sensitivity?

A lot of people make no distinction between case-sensitive and accent-sensitive.

But they should make that distinction, because accents and case have separate orthographical functions, and conflating them means handling case-insensitivity improperly. So those people are doing it wrong.

Furthermore, we already know it is possible to encounter "File.txt" and "file.txt" at the same time in a lot of filesystems; using a scheme that can't deterministically pick one or the other just asks to be bitten by this.

I think that's a different issue - isn't that just a matter of the filesystem not actually enforcing full case-insensitivity? I would expect a case-insensitive file system to be able to deterministically pick one or the other, i.e. to reject naming a file "File.txt" if there already existed a file "file.txt" in that directory, while still preserving the casing of whatever filename I originally gave it.
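
As a sketch of that expectation, assuming a directory is just a mapping from names to entries: index the entries by a case-folded key, store the name as originally typed, and refuse a new name whose key is already taken. The class `CaseInsensitiveDirectory` is hypothetical and ignores everything else a real file system has to do.

```python
class CaseInsensitiveDirectory:
    """Case-insensitive, case-preserving set of file names."""

    def __init__(self):
        self._entries = {}  # folded name -> name as originally given

    def create(self, name):
        key = name.casefold()
        if key in self._entries:
            # Enforce insensitivity: "File.txt" cannot join "file.txt".
            raise FileExistsError(
                f"{name!r} collides with {self._entries[key]!r}")
        self._entries[key] = name

    def lookup(self, name):
        """Return the stored spelling; raises KeyError if absent."""
        return self._entries[name.casefold()]

    def listing(self):
        return sorted(self._entries.values())


d = CaseInsensitiveDirectory()
d.create("file.txt")
print(d.lookup("FILE.TXT"))  # any spelling finds the stored name
```

Because the collision is rejected at creation time, a lookup can never face two candidates, which is the determinism described above.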

My stance is that case-insensitive filesystems (for those languages for which case is actually a relevant concept) are more convenient to the user than case-sensitive filesystems. Since writing software is ultimately about doing things that are useful to the user, I conclude that to me, a software developer, the user's needs are more important than mine in this case.

This topic is closed to new replies.