My Biggest Fear for the Future of Human-Computer Interfaces

openstack vim human computer interface

I recently had to install and configure an 18-node OpenStack cluster, a process which involved a lot of SSHing and text-editing in terminals. I thought about learning Vim, but I was afraid of the incredibly steep learning curve, so I made do with GNU nano. It's not at all powerful, but it's easy.

Eventually I realized, "This is my job. This is what I do every day. Why am I holding off on learning something now, thinking it will slow me down, and that I'll have time to learn it later? It's not like I'm anticipating a major career shift any time soon."

With that in mind, I quit nano cold-turkey and moved to Vim. I won't waste time explaining why it's so great. There are already plenty of fantastic articles on that subject.

I'll just say this: Vim is powerful because it opens up a new interface to interact with your entire computer. Especially on Unixes, there's hardly anything you can't do with a shell and a good text editor. Which means you have one consistent interface that exposes everything on your computer.

Think of that. How many poorly designed, mismatched UIs do you use on a daily basis? Right now I have Chrome, Steam, Spotify, Windows Explorer, and Visual Studio open. I see about fourteen different UI paradigms cobbled together here. And, if I click in the lower-left corner, I get a completely disorienting context switch into an entirely different paradigm (that of the dreaded Metro tiles). I'm at the mercy of all these hapless UI designers.

Each one of those programs has a UI that I had to learn, each with its own quirks and bugs. Granted, Spotify and Chrome are both shining examples of UI design. I think they're about as good as it gets. Incidentally, web browsing and music organization are two things I will probably never do in a terminal.

Exceptions aside, it's incredibly empowering to be able to operate your computer on your own terms. And that brings me to my biggest fear for the future of human-computer interfaces:

There is no terminal in the cloud, or on mobile.

"That's good, right? CLIs are old and not at all user-friendly."

No argument there. But imagine for a second what a UI would look like if it had all the capabilities of a CLI with none of the cruft.
  • Again, it would provide one consistent interface between you and all your apps.
  • On the other hand, it would allow you to operate your apps on your own terms. Going with the analogy, right now you can choose one of 17 different shells and 5 text editors. Apache doesn't care what editor you used to configure it.
  • It would glue together all your applications, connecting them however you want. In a CLI, that's accomplished with a single keystroke: the pipe.
Compare that with current trends:
  • Cloud applications are the future of computing. Yet, to copy a picture from Facebook to Gmail, I still have to download the image, save it to disk, and upload it to another server. Most people don't have time to figure out how to do that.
  • Mobile applications, the uh, other future of computing, are notorious for not working with each other. Particularly on iOS, where the filesystem is almost completely opaque. On Android, it might as well be.
  • In both cases, each app has its own set of paradigms which do not relate to other apps at all.
The whole point of the internet is to connect things together through a common interface: HTTP and hyperlinks. These days, web apps have a single URL with a giant hash fragment appended. That breaks the interface. I can't write a script against that; I'd have to simulate user clicks.

"No you wouldn't, it's probably calling a RESTful API!"

Yes, the one shining light of hope is that every web app now has a nice, friendly, documented, open API. No, there are still major problems:
  • The main use of these APIs is still just identity. Great, I can connect my Facebook to my account on the pygmy llama forums I visit! Oh wait, all it does is save me the hassle of logging in all the time. I still can't have these two "apps" communicate with each other in any meaningful way.
  • Third-party clients are the other use-case. Great, I can choose between 3,000 different Twitter client apps! Oh wait, each one still only talks to Twitter and nothing else.
  • In the few instances where apps do talk to each other, it's only because the users bugged the developers enough for them to coordinate a common interface. The users can't operate their computer on their own terms. They're dependent on the developers to add this functionality.
Contrast this to command-line tools, where every program is designed from the ground up to work with other programs through common abstractions, most notably files and pipes, and where having a UI automatically entitles you to a scriptable API.
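
Here's a minimal sketch of what I mean by common abstractions. None of these programs knows the others exist; they only agree on lines of text flowing through a pipe. The list of editor names is made-up sample input, not real data.

```shell
# Hypothetical input: one editor name per line, as if collected from a survey.
# sort groups the duplicates, uniq -c counts each group, sort -rn ranks the
# counts, head keeps the winner, and awk strips the count off the front.
top_editor=$(printf 'vim\nnano\nvim\nemacs\nvim\nnano\n' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -n 1 \
  | awk '{print $2}')

echo "$top_editor"
```

Swap any stage for a different tool and the rest of the pipeline doesn't care, which is exactly the property cloud and mobile apps are missing.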

In short, open web APIs are good but not good enough. The question is, can we design an interface that has the power of a CLI with the user-friendliness of a GUI, and that is designed from the start for cloud and mobile environments?

If we don't, we will eventually lose control of our own computers; we'll be at the mercy of app developers.

Mirrored on my blog

Mar 18 2013 03:33 AM

Interesting read, thanks!

Mar 18 2013 04:28 AM

I have to disagree a little with you here.


First, you start talking about UI and UI design - true, more or less most applications put their own little twist on their UI, both visually and organizationally. But while using Win7, I ran into a lot of programs that stuck to the Win7 look. Windows 8 and the whole Metro tile concept is new, and most people have yet to write software for it. Given enough time, a unified UI will come into play.


Second, talking about the Unix command line I had to laugh. I'm not one of those *nix gurus that has detailed intricate knowledge of the shell commands, and related materials. I do use Fedora on a daily basis now at my work. And I found the command line to be just as disjoint as the UI is between major software. Want to find something using the shell? Easy, it's:

find [switches] [path] [expression]

But wait, want to look for text inside some files?

grep [options] [expression] [files]


To put it another way, "find" takes the haystack first, then the needle, while "grep" takes the needle first, then the haystack (using the old looking-for-a-needle-in-a-haystack analogy).

Sure, it's minor and sure, I can remember it, but it's just as problematic as remembering different UI paradigms and designs.
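
As a rough sketch of that ordering difference, here are both commands run against a throwaway directory. The file names and the search text are invented for the example.

```shell
# Build a scratch directory so the example is self-contained.
demo_dir=$(mktemp -d)
printf 'a needle here\n' > "$demo_dir/notes.txt"
printf 'nothing to see\n' > "$demo_dir/other.txt"

# find: haystack (the path) first, then the needle (the expression).
found_by_name=$(find "$demo_dir" -name 'notes.txt')

# grep: needle (the pattern) first, then the haystack (the files).
# -l prints only the names of files that contain a match.
found_by_text=$(grep -l 'needle' "$demo_dir"/*.txt)

echo "$found_by_name"
echo "$found_by_text"
```

Both commands end up pointing at the same file, but you have to flip the argument order in your head to get there.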


In fact it's worse in some ways because the [options] that those programs accept are different from each other. In fact, most shell commands have options that differ, even when potentially trying to represent the same thing. 

See, these programs were not written as a unified tool, but each was tackled individually, often by different people, over different periods of time.


For those who read webcomics, the somewhat recent xkcd on this subject makes me laugh. And you know what? I couldn't think of a valid tar command either. F*ck.

It's not that these tools aren't useful - they are. And it's not that these tools aren't powerful - they are that too. But they are just as disjointed and often harder to learn than any UI, for the sole reason that UIs can visually represent information to make them easier to use, while with the command line you're at best stuck reading the online linked man pages. (At worst, you're stuck reading the man pages under vim.)


That's in terms of UI. None of these reasons suggest that you should not learn shell commands, and I think that any programmer should have some knowledge of them, since they will inevitably be forced to use them one way or another.




Now for your second point, you talk about control and APIs. I think your example of copying a picture from Facebook to Gmail isn't accurate - the problem isn't that Facebook doesn't provide some common API - it does - the images are transferred in a standard way, and are accessible via standard ways. Gmail can also accept drag-and-drop images.

Not entirely surprisingly then, in the version of Chromium you can actually drag an image from one site, and drop it right into Gmail. Yeah. You can. Try it.


Still, despite that, while you have a point, the UI design and differing paradigms of applications have nothing to do with control and interaction of those applications. The UI is just a UI - the data handling is what's important, and is actually being worked on. And even in the case of command line tools and piping input/output, they too are limited in terms of what they can do. There is no easy way, as far as I know, to use a command line tool, or combination thereof, to copy an image from a site and insert it into Gmail. Maybe someone will prove me wrong, but in the end those tools weren't designed for that sort of work - they were mostly designed for text-based operations.


My point is, command line tools have their own discrepancies and inconsistencies, despite still being very useful, and there actually are efforts in some areas to improve application interaction, similar to what you're talking about. However, UI design and differing UI paradigms aren't really related to the common API problem, nor will having one unified interface guarantee free interaction.

Mar 18 2013 11:36 AM

Very nice post. Joel on Software's Strategy Letter VI is slightly related as well (Note: The first few paragraphs will seem completely unrelated, but they lead into it).

Mar 20 2013 03:58 PM
When he talks about the *nix CLI, he means the interop of the commands. Yes, every command does things differently, but they can all interoperate with each other. For instance, you can pipe a result from find into grep, which then passes its result to sed, which then manipulates the stream. This interop creates an ecosystem where amazing things can be accomplished in a simple manner.
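
A quick sketch of that find-into-grep-into-sed chain, using a scratch directory with made-up file names (the .conf files here are placeholders, not real configs):

```shell
# Scratch directory with a few hypothetical files.
work_dir=$(mktemp -d)
touch "$work_dir/apache.conf" "$work_dir/nova.conf" "$work_dir/readme.txt"

# find lists every file, grep keeps only the paths ending in .conf,
# and sed strips the directory and the extension from each surviving line.
# The final sort just makes the output order predictable.
services=$(find "$work_dir" -type f \
  | grep '\.conf$' \
  | sed -e 's|.*/||' -e 's|\.conf$||' \
  | sort)

echo "$services"
```

Each command has its own option syntax, but all three agree on one thing: lines of text in, lines of text out.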
