A Rose by Any Other Name...

I figure, it's my turn to try my hand at a coding horror. This one had me stumped for a couple better-spent-elsewhere hours:

menu_prompt:
call WriteString
call WriteChar
call Crlf

cmp al, 'p' ; If 'p' was entered...
je print    ; ...print nodes

...

mov edx, OFFSET error	; Print invalid option message
call WriteString

print:
push cur
call Print

...

done:
call WaitMsg	; hold display window open
exit
main ENDP
.data
...
.code
...

Print PROC
...
Print ENDP


I was trying to figure out why the values of the cur and head pointers weren't on the stack when the Print procedure ran. The program repeatedly generated a memory access violation when it tried to dereference the pointer, and I'd sit there rooting around the stack frame, with the values nowhere to be found, as if they had never been pushed before the call. Huh. Visual inspection of the code provided no answers. If you see it already, you're better read than I am on this assembler.

The answer? MASM transforms all identifiers to uppercase by default, for case-insensitivity! Thus, my je instruction was jumping straight to the Print procedure, without pushing its parameters on the stack, instead of jumping to the print label. I have no idea why the assembler wouldn't raise an error or a warning for that, but I've learned my lesson (I only started using MASM, in particular, not long ago).
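For anyone who wants to see the collision without firing up an assembler, here is a rough Python sketch of the idea: fold every name to uppercase, the way the post describes, and report the names that land on the same folded key. The function name `find_case_collisions` and the symbol list are made up for illustration; this is not how MASM itself is implemented.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Group identifiers that become identical once folded to uppercase."""
    groups = defaultdict(list)
    for name in names:
        groups[name.upper()].append(name)
    # Keep only the folded keys that more than one distinct spelling maps to.
    return {key: spellings for key, spellings in groups.items()
            if len(set(spellings)) > 1}


# Hypothetical symbol list loosely modelled on the listing above.
symbols = ["menu_prompt", "print", "done", "main", "Print"]
print(find_case_collisions(symbols))
```

Running a check like this over the labels and procedure names would flag the print/Print pair that the assembler accepted silently.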

Share on other sites

Writing assembly in this day and age might be the real WTF

It's for a class I have to take; I'd really rather not have to take it, but it's required.

Share on other sites

It's for a class I have to take; I'd really rather not have to take it, but it's required.
At least it's x86 assembly; the assembly class I had taught us 8085 assembly.

Share on other sites

Writing assembly in this day and age might be the real WTF

Hey, it's not entirely useless if you want to play with microcontrollers.

(Ok, most of them have C compilers these days...ahem...)

Share on other sites

Assuming that case matters and that you can safely use "Print" and "print" seems to be a bigger WTF to me.  Most of the world has long since moved on from those days, and even in places where it still does matter (such as C) seeing this kind of thing would make me wince.

The sooner that the rest of the world wises up and ditches case-sensitivity the better.  I long for the day when I'm writing or reading code and I no longer have to fret over whether it was really intended to use "E" or "e" (simplistic examples to illustrate a point) in the current block (particularly if they're the same type).

Share on other sites

Yeah, except when you are debugging someone else's application, like, I don't know, something written by 20 unpaid undergraduates over the course of 5 years (true story, which I guess is more common than I think). Have fun with that one in a case-sensitive language. Because yes, you'll find plenty of Objects, oBjects, ObJeCts and so on (not to mention my all-time favorite: little i and big I as loop counters in the same complex nested loop) littered all over the code base, making the whole thing a mess to debug.

I never run into that problem. Constant? All upper, with underscores. Composite type? Title case. Function or variable? Camel case. Dealing with others' code is behind a layer, so that the differing styles are contained.

So in fact you run into this problem all the time, but you're so used to working around it with a convention that you don't notice it. And you're hoping that everyone (current and future), everywhere, is using the exact same one when you're coding with others. :p

Share on other sites

So in fact you run into this problem all the time, but you're so used to working around it with a convention that you don't notice it. And you're hoping that everyone (current and future), everywhere, is using the exact same one when you're coding with others.
On the contrary. This is making use of an available feature to one's advantage. Code using consistent capitalization rules to distinguish identifier families (such as constants, function names, variables...) is (usually) much easier to read and understand than if this is not encoded at all or using something like "hungarian" (which is an abomination, think of the LPWTF stuff in the Windows headers, for example).

Using capitalization to your advantage is in my opinion a very good compromise between "simply not knowing" and "knowing, but having to parse unintelligible letter salad".

Consider something like szName (which is already not that great compared to name, if you ask me: it's harder to read than necessary, impossible to pronounce, and what if you want to change the type one day?). Now you also wish to encode the fact that this is a local variable and not a function or class. So it will become something like varSzName, or szName_var. That doesn't improve the legibility, intelligibility, or pronounceability of your code.

Share on other sites

Sorry, I may explain myself poorly sometimes; English isn't my primary language and I forget that my jokes tend to fall flat.

Yes I agree with that, having some sort of convention (details varying from coder to coder) to name and differentiate variables / function / ... is of course a good thing no matter if the language is case sensitive or not. I am not disputing that in the slightest.

The point I was trying to raise, poorly I guess, is that in an environment with multiple coders over a long period of time, case-sensitive languages have a tendency (in my experience, which is not that big, true) to lead to the amusing stuff I talked about in the first part of my post. It's not the fault of the language itself, but that combined with "meh" coders makes the work of the debugger (me, in that case) tenfold harder. I have to check the syntax and logic (that's standard), and also triple-check that each of those successive coders, who are not there anymore, is really calling Save and not save, or some variant that was used in past years for a totally different purpose. But hey, how could he know that someone else, 2 years earlier, used that name to save something in a part of the code he never touched?

The example is a bit extreme. I had to fix things in a very (very) large code base that was full of this kind of stuff for 6 months, so I may be kinda biased.

Edited by SerialKicked

Share on other sites

The only place where I really find case sensitivity harmful instead of helpful is filenames under Unix (or unix-like).

DOS/Windows got it right that time (yes, that's a rare thing, but it happens!). I want to be able to type del thatfileyousentme.zip and it should work. I don't want to type ThatFileYouSentMe.zip or any other possible capitalization. I don't want to be forced to remember how you spelled that file some 3 or 4 weeks ago, and even if I remember (or ran ls to find out), capitalizing is much more complicated to type than necessary.

It doesn't get in my way often, admittedly, since most people are too lazy and use all-lowercase anyway. But it does happen, e.g. every time I update VirtualBox guest additions. Autorun doesn't work because of sudo, nor does double-clicking an icon, so you have to do it by hand in a root shell with something like sh ./VBoxLinuxAdditions.run, but really, wtf? Why can't I just type sh ./vboxlinuxadditions.run and have it work anyway? The user's intent is very clear, there's no way it could be understood wrong. Sure, v and V are different characters to a computer, as are b and B, but why do I have to care as a user?

Share on other sites

It doesn't get in my way often, admittedly, since most people are too lazy and use all-lowercase anyway.

And, you know, tab completion and shell glob both work on Unix/Linux, so in reality you would be typing something like 'sh ./V<TAB>'...

Share on other sites

And, you know, tab completion and shell glob both work on Unix/Linux, so in reality you would be typing something like 'sh ./V<TAB>'...

So you'd wish; unluckily, it doesn't work nearly as well as you'd like. There are three or so files with identical-prefix names, and tab completion will complete to the next word, which means you need to type a capital "L" next. Of course tab and shift are the same finger on the left hand, so hitting tab and moving the hand makes the subsequent shift so unergonomic that it's faster to type "Linux" than to use tab completion. It's better for "Additions" since that'll be the right shift key you'll use.

But how nice would it be if you could just type 3-4 lowercase letters, and tab-complete them correctly (assuming it's not ambiguous, of course), and it would just work.

Luckily this is something that happens only once or so per month, so it's tolerable.

Edited by samoth

Share on other sites

There are three or so files with identical-prefix names, and tab completion will complete to the next word

In Windows' command line, hitting Tab multiple times will cycle through the matching names. Is this not the case for Unix/Linux?

Also, I might type strangely, but I use different fingers for LShift and Tab. Pinky finger for LShift, ring finger for Tab.

Share on other sites

In Windows' command line, hitting Tab multiple times will cycle through the matching names. Is this not the case for Unix/Linux?

It does (in essence, the actual behaviour is different), but it only shows matches that match *case* as well as characters.

Of course, what samoth does not know about is the relevant bash (readline) configuration option:
set completion-ignore-case on
Drop that baby in your ~/.inputrc, and you are ready to party like it's 1998...

Share on other sites

I hope there isn't a setting so bash adds those stupid double quotes that Windows' cmd has all over the place...

Share on other sites

DOS/Windows got it right that time (yes, that's a rare thing, but it happens!). I want to be able to type del thatfileyousentme.zip and it should work. I don't want to type ThatFileYouSentMe.zip or any other possible capitalization. I don't want to be forced to remember how you spelled that file some 3 or 4 weeks ago, and even if I remember (or ran ls to find out), capitalizing is much more complicated to type than necessary.

It usually gets on my nerves because you never know the name of the original file. I see that a program is trying to load "LONGFILENAME.TXT" from somewhere, and it is failing to parse it, so I'm trying to find it, and I'm passing right over it, or it doesn't show up in the search results, because it's named "LongFileName.txt". The only way to solve the problem is to be able to search case-insensitively; it seems to be propagating the responsibility of handling case-insensitivity all of the way up the toolchain to keep its usage from breaking at one critical point.
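As a rough illustration of the kind of search described above, here is a minimal Python sketch that lists a directory and matches names with case ignored. The helper name `find_ignoring_case` and the file name are hypothetical; it only looks in one directory and does not recurse.

```python
import os
import tempfile


def find_ignoring_case(directory, wanted):
    """Return the names in `directory` that equal `wanted` when case is ignored."""
    target = wanted.casefold()
    return sorted(name for name in os.listdir(directory)
                  if name.casefold() == target)


# Demo in a throwaway directory; the file name is made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "LongFileName.txt"), "w").close()
    print(find_ignoring_case(tmp, "LONGFILENAME.TXT"))
```

The comparison is done on the listed names rather than by asking the file system, so it behaves the same whether the underlying file system is case-sensitive or not.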

I always hear stories of people working on database projects with others, and find that people were connecting to the database "mydb" and scanning the table "the_table", and having the (cross-platform) program abort when they move to a different platform because the database was actually "myDB", and the table was "The_Table". If the programmers knew it was spelt one way, why did they purposely spell it another way?

Another notion that I hear a lot is "I always use the wrong case in an identifier name, or I hold the shift key too long when I type an identifier name." The very idea of a file full of "foo", "Foo", "FOo", "foO" (whoops, left caps-lock on, too. Still good!) all referring to the same thing scares me. Say we're now refactoring the code, and that name wasn't very descriptive, or we removed the dependency on that library. Now, let's just change all occurrences of that variable... Great, now we need a case-insensitive search, and we need to keep from replacing things that look like "foo", like "FooBar" or "Footer".
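To make the refactoring headache concrete, here is a small Python sketch of the search-and-replace it would take: case-insensitive, but anchored on word boundaries so that longer names survive. The function name `rename_identifier` and the replacement name `widget_count` are made up for the example, and a regex like this knows nothing about strings, comments, or scoping.

```python
import re


def rename_identifier(source, old, new):
    """Replace whole-word, case-insensitive occurrences of `old` with `new`."""
    # \b anchors keep "FooBar" and "Footer" intact, because the character
    # after "Foo" in those names is still a word character.
    pattern = re.compile(r"\b" + re.escape(old) + r"\b", re.IGNORECASE)
    return pattern.sub(new, source)


code = "foo = Foo + FOo; FooBar(foO); Footer = 1"
print(rename_identifier(code, "foo", "widget_count"))
```

With consistent casing in the first place, a plain whole-word replace would have been enough.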

One-off command line interactive input: okay, if you want it case-insensitive, set the correct option, and you're good to go. If you're writing code, why would you knowingly type the identifier in an inconsistent case? Saving a little bit of effort seems hardly worth the lack of readability. There's always the option of lowercase_with_underscores if you don't want to wonder what the correct case is.

Share on other sites
"myDB"

That used to be (still is?) an issue with "portable" webservers, too. If someone named the page you were trying to read ThePage.htm, the webserver would show a 404 because you would of course try to get http://example.com/thepage.html (which would fail on two accounts, the capitalization and the extension). But hey, why do you have to do the dance for the darn computer? It's not like rocket science to figure out what you want if there is just one name with the same (except capitalisation) basename and an alternate spelling of the extension.

file full of "foo", "Foo", "FOo", "foO" (whoops, left caps-lock on, too. Still good!) all referring to the same thing scares me

Yup. The important difference between filenames and URLs and a source file is that at least for half of these different versions, it actually makes sense to have them in a source file (... meaning different things). Whereas as a file name, there is usually only one such file, and case sensitivity only harms you.

FILE* file = fopen(...) tells me purely from the capitalization rules that FILE is "some obscure macro definition" and file is a variable. Of course I could choose a better, more intelligent name for file that does not "collide" with a standard library name, but is there a better name for, well... a file?

Same for class names and instances, I often find myself wanting to have an instance that's called the same as the class (only it's all lowercase, not camelcase).

On the other hand, I don't care whether a file that I want to open is spelled foobar or FOOBAR or FooBaR, there exist no other files with alternative spellings, and I'm just interested in opening the darn thing, regardless of such sophistries whether "F" and "f" are binary identical or not.

Share on other sites

That used to be (still is?) an issue with "portable" webservers, too. If someone named the page you were trying to read ThePage.htm, the webserver would show a 404 because you would of course try to get http://example.com/thepage.html (which would fail on two accounts, the capitalization and the extension). But hey, why do you have to do the dance for the darn computer? It's not like rocket science to figure out what you want if there is just one name with the same (except capitalisation) basename and an alternate spelling of the extension.

On one hand, users shouldn't be exposed to bare file path URLs, with modern URL rewriting and the majority of pages being accessed through links from other pages.

On the other hand, I don't care whether a file that I want to open is spelled foobar or FOOBAR or FooBaR, there exist no other files with alternative spellings, and I'm just interested in opening the darn thing, regardless of such sophistries whether "F" and "f" are binary identical or not.

Whoops, forgot you were only talking about file names, sorry.

I've always felt that the case-insensitivity in file names was a relic of an outdated era (like using FAT32, with its case-insensitivity, but case remembrance), and was due to the constraints of the technology (and the convention that followed from its use), rather than a conscious design decision on the OS's part. Case-insensitivity also reduces the number of possible file names by a great factor.

Now, where do we draw the line on case-sensitivity? What about diacritics? Should an 'ñ' and an 'n' be considered the same letter in a case-insensitive file system? A lot of people make no distinction between case-sensitive and accent-sensitive. I wouldn't want to be the poor sap that was writing a Spanish application that looked for "año.jpg", and found "ano.jpg". How about Japanese? I found it useful to be in a Japanese book store, and look for a particular book by typing its name in katakana in the search terminal; I know how the title is read, but I can't write it in kanji. However, for file names, that's a poor solution (see an example of this, using romaji instead of katakana, here: http://www.cjk.org/cjk/reference/japhom.htm#2); there are many different words and phrases that wind up being homophones, so being able to type a file name in katakana or hiragana means that you'll hit serious ambiguity with several different file names of different meaning corresponding to the same phonetic sounds that can be expressed in katakana or hiragana. So, while this is suitable for, say, a book store search terminal to allow illiterate people like me to find literature, I find this to be a poor solution for a file system. There are many other languages to consider in this regard.

In short, our notion of file system case-insensitivity generally only applies to ASCII English characters, and, in my opinion, is a hold-over from when even that was more than we wanted to or could care about. It's a slippery slope, without considering the implications of anything that isn't English, leading to an inconsistent representation of file names. Furthermore, we already know it is possible to encounter "File.txt" and "file.txt" at the same time in a lot of filesystems; using a scheme that can't deterministically pick one or the other just asks to be bitten by this.

Edited by Ectara

Share on other sites
Now, where do we draw the line on case-sensitivity?

A lot of people make no distinction between case-sensitive and accent-sensitive.

But they should make that distinction, because accents and case have separate orthographical functions, and not doing so is handling case-insensitivity improperly. So those people are doing it wrong.
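The distinction between the two is easy to show in code. Below is a minimal Python sketch with two comparison keys: one that folds case only, and a looser one that also strips combining marks. The function names are made up for the example, and neither is claimed to match what any particular file system does.

```python
import unicodedata


def fold_case(name):
    """Case-insensitive key: ignores case but keeps accents."""
    return unicodedata.normalize("NFC", name).casefold()


def fold_case_and_accents(name):
    """Looser key: also drops combining marks, so 'ñ' collapses to 'n'."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return stripped.casefold()


# Case-only folding treats these two as the same name...
print(fold_case("AÑO.jpg") == fold_case("año.jpg"))
# ...but still keeps "año" and "ano" apart.
print(fold_case("año.jpg") == fold_case("ano.jpg"))
# Accent stripping is what conflates them.
print(fold_case_and_accents("año.jpg") == fold_case_and_accents("ano.jpg"))
```

Only the second key produces the "año.jpg" versus "ano.jpg" mix-up, which is the accent-insensitive behavior, not the case-insensitive one.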

Furthermore, we already know it is possible to encounter "File.txt" and "file.txt" at the same time in a lot of filesystems; using a scheme that can't deterministically pick one or the other just asks to be bitten by this.

I think that's a different issue - isn't that just a matter of the filesystem not actually enforcing full case-insensitivity? I would expect a case-insensitive file system to be able to deterministically pick one or the other, ie. reject naming a file "File.txt" if there already existed a file "file.txt" in that directory while still preserving the casing of whatever filename I originally gave it.
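Here is a toy Python sketch of that expectation: names are compared with case ignored, a clashing spelling is rejected, and the casing you originally typed is what gets stored. The class `CaseInsensitiveDirectory` is purely illustrative and is not modelled on any real file system's implementation.

```python
class CaseInsensitiveDirectory:
    """Toy directory: names compare case-insensitively, original casing is kept."""

    def __init__(self):
        self._entries = {}  # folded name -> name as originally given

    def create(self, name):
        key = name.casefold()
        if key in self._entries:
            raise FileExistsError(
                f"{name!r} clashes with existing {self._entries[key]!r}")
        self._entries[key] = name

    def lookup(self, name):
        """Return the stored spelling for any casing of `name`."""
        return self._entries[name.casefold()]


d = CaseInsensitiveDirectory()
d.create("file.txt")
print(d.lookup("FILE.TXT"))
try:
    d.create("File.txt")
except FileExistsError as err:
    print("rejected:", err)
```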

My stance is that case-insensitive filesystems (for those languages for which case is actually a relevant concept) are more convenient to the user than case-sensitive filesystems. Since writing software is ultimately about doing things that are useful to the user, I conclude that to me, a software developer, the user's needs are more important than mine in this case.

Edited by Oberon_Command

Share on other sites

I think that's a different issue - isn't that just a matter of the filesystem not actually enforcing full case-insensitivity? I would expect a case-insensitive file system to be able to deterministically pick one or the other, ie. reject naming a file "File.txt" if there already existed a file "file.txt" in that directory while still preserving the casing of whatever filename I originally gave it.

It's not really about issues within your own case-insensitive file system. It's more about the inevitable issues that arise if you physically plug in, or mount over the network, a drive with different case-sensitivity rules.

Even on Windows, you can obtain a variety of 3rd-party drivers for natively case-sensitive file systems (ext3, HFS+, and so on). These drivers have to perform some truly ghastly file name transforms to ensure that case-insensitive Windows APIs don't screw up the contents of the disk.

Share on other sites

...That doesn't quite answer the question. As outlined, there are other languages (disregarding accents) that have different character sets that can be substituted for each other (barring grammatical convention). When we think "case-insensitive", we tend to think of common Germanic, Romance, Indo-European, etc. languages only. There's a potential for this behavior in other languages that we intentionally ignore. It seems rather strange to have this core kernel feature that is only effective for a portion of the languages spoken by the user base, and thus, I don't feel that it belongs in the kernel. It seems like an esoteric part of the time period where a subset of the English alphabet was all we cared about, with everyone else as an after-thought.

I think that's a different issue - isn't that just a matter of the filesystem not actually enforcing full case-insensitivity?

swiftcoder nailed this one precisely - I frequently dual-boot between Windows and Linux. I could mount my NTFS partition right now and create two files that differ in name only by case. It doesn't seem wise to pretend that this can't happen (though, I know it will forever stick around due to the compatibility issues of ever changing it).

My stance is that case-insensitive filesystems (for those language for which case is actually a relevant concept) are more convenient to the user than case-sensitive filesystems.

Though I'm a developer, I'm also a user: I find it a big pain. Nothing like trying to find something you've referred to, only to have trouble finding it because it wasn't named with the case that you've always thought it had. I find case-insensitivity to be very brittle; any break in the case-insensitive seal, and all hell breaks loose. Often, as a user, I find that software doesn't do case insensitivity properly.

This reminds me of a bug in the product inventory software I had written at work. To make life easier on the people entering the data for product listings, you could enter a partial name, and it would show a drop-down list of possible matches based on a case-insensitive search. Well, it would search case-insensitively, and it would find it for display, but if you didn't click on it and have it fill out the correct name for you, and you didn't type in the correct case, it would pass the check to see if it is in the database (case-insensitively) after submitting, but it would choke when it accidentally looked up the ID of the product in a case-sensitive manner, resulting in several listing records having null product IDs, failing after the input was already validated.

My point to the story is that if you ever decide to uphold case-insensitivity, your workload is now doubled: if you slip up even once and do something in a case-sensitive manner, it may produce serious bugs that defy expectations, especially if it fails on the backend, after the input's already declared valid.
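A stripped-down Python sketch of that kind of bug, for illustration only: validation folds case, a later lookup does not, and a name that passed validation comes back with no ID. The product table and all function names here are hypothetical, not the actual inventory software.

```python
# Hypothetical product table: name as stored -> product ID.
PRODUCTS = {"Blue Widget": 101, "Red Gadget": 102}


def is_valid(name):
    """Validation step: compares case-insensitively."""
    return name.casefold() in {stored.casefold() for stored in PRODUCTS}


def lookup_id_buggy(name):
    """Later step that forgot to fold case, so it can miss a validated name."""
    return PRODUCTS.get(name)


def lookup_id(name):
    """Same lookup, folding case the same way the validation does."""
    by_folded = {stored.casefold(): pid for stored, pid in PRODUCTS.items()}
    return by_folded.get(name.casefold())


typed = "blue widget"
print(is_valid(typed), lookup_id_buggy(typed), lookup_id(typed))
```

The input passes validation, the buggy lookup returns None, and only the lookup that folds case consistently finds the ID.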

This is why I feel that case-insensitivity should be in the top-most layer, facing the user, and out of the core:

1. It is only useful for part of the user base.
2. Not all users enjoy the feature; many hate it, especially those that spend more time in other operating systems.
3. It is very brittle.