Archived

This topic is now archived and is closed to further replies.

Oluseyi

Re-Imagineering Unix

Reapplying the Unix Philosophy
I think that the 9 Tenets of the Unix Philosophy by Mike Gancarz (and a brief exposition of the nine tenets, as well as the ten lesser tenets) are solid principles for the design and implementation of software in general and operating systems in particular. Today we encounter more and more accusations of Unix not being "user-friendly" or being "difficult". While Unix enthusiasts generally shrug off such comments with disdain, pointing to the simple elegance of the system and the power it provides, how can we fundamentally improve Unix? How can we maintain its power and expressiveness while making it easier to use? No, just throwing a GUI on top is not enough.

GUI and CLI Inadequacy
The big flaw of the graphical user interface is that it leads to the construction of applications with captive user interfaces and a lot of code redundancy. In addition, no one so far has been able to conceive an intuitive way to string GUI applications together to create powerful custom tools the way we can with CLI utilities. The big flaw of the command-line interface, OTOH, is that unless you already know what the commands do or where to look for documentation, you're screwed. GUIs can take advantage of exploration and recognition to make themselves more usable while CLIs can't. Is the solution to make GUI tools front ends to CLI utilities? To supposedly get the "best of both worlds" by making applications usable in multiple modalities? I don't think so.

Concept: Micro-Services
CLI utilities fall into two categories in terms of their data input and output properties: generators and filters. A utility that "consumes" data - accepts input and generates no output - is a virus. Utilities that we are all familiar with include ls, cp, rm and find.

I've been wondering whether it is possible to convert the vast majority of system utilities, and even many third-party utilities, into "micro-services" - headless applets of a System daemon that perform the core processing and return results to whatever application is requesting them. CLI and GUI utilities alike would use the same processing, meaning that less code - including glue - would need to be written for each of them (we'll get to revamping CLIs and GUIs themselves in a sec). As an example, CLI ls would simply pass the filename pattern to System:ls and display the results, while a GUI file browser would do the same but then render icons or fill out detail tables based on the data returned (System:ls would return a pair of unordered lists of files and directories respectively by default, but could be instructed to return more information). Thus, the interface tool focuses on interfacing while the functionality is made a system-accessible service of sorts. Any other application that seeks to obtain a list of files simply makes a request to System:ls.

Now how is this better than calling an API function? The invocation mechanism is the same whether you're calling System:ls, System:cp or Oluseyi:Discombobulate - akin to a network send/recv session - and thus there are no special libraries or bindings involved. Furthermore, applications would be able to direct one micro-service to pass its output to another micro-service, thus emulating the command line's piping and redirection features whether in CLI or GUI apps (less of a big deal for CLIs, but fairly significant IMO for GUIs).

Concept: I/O Abstraction
The concept here is that almost every application needs to perform some kind of file I/O sooner or later; the more applications you have that can read and write the same file format (except for plain text), the more copies of functionally equivalent code you have on your disk.

Data falls into relatively few categories (one of which is a hedge, basically for "unclassifiable" data): Text, Image, Audio, Video and Application - the basic MIME types. By structuring I/O hierarchically underneath these headings - thus obtaining text/plain, text/html, text/xml, audio/mp3, audio/ogg, video/divx, video/mpeg4, image/gif, image/png and so forth - it is conceptually possible to create a system where applications deal with the high-level type (e.g. Image) and system codecs deal with converting from a specific format (e.g. image/png) to the high-level type. The upsides of such a system are 1) less identical or functionally equivalent code on your system, which can be telling when totalled; and 2) the addition of a codec can potentially allow all applications that deal with the associated high-level type to read the specific format - adding a PNG codec means that Adobe Photoshop, Internet Explorer, Microsoft Word, MS Paint and all other apps that render and/or write images (using Windows examples) can now read and write PNGs.

Essentially, I/O becomes another system API or shell script that everyone can use in developing their app, and continue to gain from even after their app is shipped. Developers no longer need to focus on inanities like supporting obscure formats, and users can use their favorite editor with their favorite format, even if the editor's developer didn't support the format, by adding a codec to their system as a whole.

Concept: GUI Assembly
Taking a page from graphical flow process tools like National Instruments' various instrumentation lab simulators and coupling it with the micro-service concept (especially when our micro-services return structured information about themselves upon request), a GUI app could be written that allows for the rapid graphical construction of CLI-style custom tools/utilities laden with piping and redirection. It's not as efficient as the real thing, but it allows users unfamiliar with the system to reap many of the same functionality benefits even if they don't do it at the same speed as CLI wizards.

These aren't very concrete ideas, but I just wanted to share and see what others think or come up with.

GUI and CLI Inadequacy
I've often thought about how one could adapt the strengths of the Unix command line to a GUI. For what it's worth, I haven't read other papers on this topic, or really discussed or tried to implement my ideas.

My favorite idea for an implementation right now is an extremely drag-and-drop-able GUI, mostly because:

  • It could be done with existing toolkits, if an appropriate transport framework exists for the drag-and-dropping, and existing applications would benefit.

  • It's more immediate and compact than "pipeline construction" style programs.



A special "sink" variety of file manager might have to be implemented, or plugged into an existing file manager, to replicate some of the file manipulation tasks.

The text manipulation tools (awk, sed, et cetera) would probably be the hardest to replicate in a GUI; maybe a "pipeline construction" program would be applicable for this?

Of course, while this approach may be more intuitive overall than a "pipeline construction" approach, it may be more tedious unless a simple way to "replay" a series of tasks (macros, whatever) is devised.

Concept: I/O Abstraction
This is something that belongs in a user-land library, in my opinion (not to say that you wanted it in the kernel, or whatever). Libraries that do this already exist, to an extent, for audio and video (and combined) files: GStreamer, and to an extent libxine, among others. So this comes down to the whole "getting everyone to use the same library" deal.
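As a rough illustration of what such a user-land codec library might look like, here is a small Python sketch. Everything in it is an assumption made for the example: the `Image` type, the `install_codec`, `load_image` and `save_image` functions, and the choice of plain-text PGM (labelled here as `image/x-portable-graymap`) as the one format supported. It is not how GStreamer or libxine work.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Image:
    """High-level type that applications work with (grayscale rows)."""
    width: int
    height: int
    rows: List[List[int]]


# Each codec is a (decode, encode) pair registered under a MIME type.
Codec = Tuple[Callable[[bytes], Image], Callable[[Image], bytes]]
_codecs: Dict[str, Codec] = {}


def install_codec(mime: str,
                  decode: Callable[[bytes], Image],
                  encode: Callable[[Image], bytes]) -> None:
    """Make a format available to every application at once."""
    _codecs[mime] = (decode, encode)


def load_image(mime: str, data: bytes) -> Image:
    if mime not in _codecs:
        raise LookupError(f"no codec installed for {mime}")
    return _codecs[mime][0](data)


def save_image(mime: str, image: Image) -> bytes:
    if mime not in _codecs:
        raise LookupError(f"no codec installed for {mime}")
    return _codecs[mime][1](image)


def decode_pgm(data: bytes) -> Image:
    """Decode plain-text PGM ("P2"); comment lines are not handled."""
    tokens = data.decode("ascii").split()
    if tokens[0] != "P2":
        raise ValueError("not a plain PGM file")
    width, height = int(tokens[1]), int(tokens[2])
    values = [int(t) for t in tokens[4:]]  # tokens[3] is the max gray value
    rows = [values[r * width:(r + 1) * width] for r in range(height)]
    return Image(width, height, rows)


def encode_pgm(image: Image) -> bytes:
    """Encode as plain-text PGM with a fixed max gray value of 255."""
    lines = ["P2", f"{image.width} {image.height}", "255"]
    lines += [" ".join(str(v) for v in row) for row in image.rows]
    return ("\n".join(lines) + "\n").encode("ascii")
```

An application calls only `load_image` and `save_image`. Until someone runs `install_codec("image/x-portable-graymap", decode_pgm, encode_pgm)`, those calls raise `LookupError`; afterwards every application using the library can read and write the format without being rebuilt.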

This is just an "off the top of my head" post, feel free to critique or whatever.

quote:
Original post by Oluseyi
In addition, no one so far has been able to conceive an intuitive way to string GUI applications together to create powerful custom tools the way we can with CLI utilities.

It's called Object Linking and Embedding, aka OLE. The very thing that allows you to put an Excel spreadsheet into a Word document that contains a Visio diagram.

I think that making a GUI a frontend for a CLI is a very bad idea. This is essentially what Linux people are doing (KDE, Gnome) and the results speak for themselves. I think there is a much better solution that Windows Longhorn will probably expose. If Microsoft does this right, it will blow away the Linux CLI. MS Office components are already scriptable (look up COM automation). Anyone with sufficient knowledge of VB (or any other COM-enabled language) can do very powerful things with Office, a lot more powerful than the Office GUI allows. Now, take this one step further and make the shell a full-blown scripting environment based on a modern scripting language (Python, anyone?). I suppose Microsoft will use C#, although I think a dynamically typed language would be a much better solution. Voila, you now have a very powerful GUI and an even more powerful CLI. Also, since the .NET framework is natively scriptable, you can take full advantage of a huge library in your scripts.

I've already thought about this extensively, and come up with essentially the same ideas.

What you describe is essentially Bonobo and GStreamer, components of Gnome. Don't get me wrong, Gnome is nothing like what you describe, but the architecture is there. Gnome, as it currently stands, is essentially Windows by another name, but the value of the Gnome 2.x series is to provide an upgrade path.

Application developers and users currently cannot understand this kind of UI, so implementing applications such as AbiWord and Gnumeric as applications instead of components makes sense at the moment, especially given the slow, methodical progress that Free and Open Source Software tends to make. When the components exist, they can switch to the new model with a minimum of effort and loss of functionality.

It's kind of creepy that you mention flowcharting applications as one method of constructing these applications. There happens to be such an application for constructing GStreamer pipelines. I also considered this, but realized it was much too slow to begin to compare with command line applications. But this is not a fundamental problem. In fact, it can be possible, with a surplus of information, to make command line applications seem wasteful (in the time it takes to create meaningful applications), in addition to being user-unfriendly. Look to vi and Blender for examples of what I'm talking about. There are fundamental problems with both UIs, but they're actually the same problems that exist in command lines! They don't show the available options and the current state. The only thing that needs to be grafted on to these UIs is a system for graphically representing the state. (Both UIs are modal, and both use single- and double-keystroke commands to select from the available options in the current mode.) I could explain this, but it's somewhat off-topic at this point.

Back on topic, there are additional issues you haven't considered: data persistence and network transparency.

Data Persistence:
You had the right idea, that all applications should be able to read files using services provided to each component. You didn't consider the flipside of this. Let's say you have a server status application that updates the cells in a spreadsheet component, and you decide you want to add a cell to the bottom that averages some of the other cells, and then send that average to a logging program.

How do you save that mess to disk? If you just saved the spreadsheet, you'd probably just get the numbers as they were whenever you saved it, and the numbers would not continue to update. Similar things happen if you try to save the other components individually. So you need to save them all into the same file and preserve their connections.

I'm pretty sure that's what the GDOM project is about.
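One way to picture "save them all into the same file and preserve their connections" is the Python sketch below. It is only an assumption-laden illustration: `Component`, `Document`, the JSON layout and the component names are all invented for the example, and it says nothing about how GDOM or any real component system stores documents.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Component:
    """One part of the compound document, e.g. a spreadsheet or a logger."""
    kind: str
    state: dict = field(default_factory=dict)


@dataclass
class Document:
    """All components plus the wiring between them, saved as one unit."""
    components: Dict[str, Component] = field(default_factory=dict)
    connections: List[Tuple[str, str]] = field(default_factory=list)

    def dumps(self) -> str:
        """Serialize components and their connections into one JSON text."""
        return json.dumps({
            "components": {n: asdict(c) for n, c in self.components.items()},
            "connections": [list(pair) for pair in self.connections],
        }, indent=2)

    @classmethod
    def loads(cls, text: str) -> "Document":
        """Rebuild the components and re-create the same connections."""
        raw = json.loads(text)
        return cls(
            components={n: Component(**c)
                        for n, c in raw["components"].items()},
            connections=[(src, dst) for src, dst in raw["connections"]],
        )


def example() -> Document:
    """The scenario from the post: status feed -> spreadsheet -> logger."""
    doc = Document()
    doc.components["status"] = Component("server-status", {"host": "web1"})
    doc.components["sheet"] = Component("spreadsheet", {"average_of": "A1:A5"})
    doc.components["log"] = Component("logger", {"path": "average.log"})
    doc.connections += [("status", "sheet"), ("sheet", "log")]
    return doc
```

Because the connections are stored next to the component state, reloading the file gives back a spreadsheet that is still wired to its data source rather than a snapshot of the numbers.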

Network transparency:
You've already allowed for the creation of a unique identifier for each service. Why not just extend that to the full URI spec? Then you could have components in your pipeline that were not on your computer. They could be rendered over a remote X connection, and you'd never notice the difference, if it weren't for latency. Or these things would be provided to a local component via XML, which would be decoded and GUIfied. It doesn't matter which, because there's no difference between programs and pipelines.

This is almost free from the way everything else is tied together (X and CORBA both have network transparency. Bonobo probably does too). Some stuff that isn't free might be provided by Mono.
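A small Python sketch of what extending service names to URIs could mean in practice. The `svc` scheme, the `ServiceAddress` type and the host name `fileserver.example` are assumptions invented for illustration; only the standard-library `urllib.parse.urlsplit` call is real.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class ServiceAddress(NamedTuple):
    host: Optional[str]  # None means "this machine"
    name: str            # e.g. "System:ls"


def parse_service_uri(uri: str) -> ServiceAddress:
    """Split a hypothetical svc:// URI into a host and a service name."""
    parts = urlsplit(uri)
    if parts.scheme != "svc":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    name = parts.path.lstrip("/")
    if not name:
        raise ValueError("URI does not name a service")
    return ServiceAddress(parts.hostname, name)


def is_local(address: ServiceAddress) -> bool:
    """Decide whether a request can be served without the network."""
    return address.host in (None, "localhost")
```

With this, `svc:///System:ls` names the local service and `svc://fileserver.example/System:ls` names the same service on another machine; the caller's code is identical apart from the string.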

quote:
Original post by CoffeeMug
It's called Object Linking and Embedding, aka OLE. The very thing that allows you to put an Excel spreadsheet into a Word document that contains a Visio diagram.

Why doesn't anyone on Windows actually use OLE (besides Microsoft, of course) already? As you say, the advantages are obvious with Office.

Or do they and just never tell me about it?

quote:
Original post by Flarelocke
Original post by CoffeeMug
It's called Object Linking and Embedding, aka OLE. The very thing that allows you to put an Excel spreadsheet into a Word document that contains a Visio diagram.

Why doesn't anyone on Windows actually use OLE (besides Microsoft, of course) already? As you say, the advantages are obvious with Office.

Or do they and just never tell me about it?


They do. They just don't advertise it much. Microsoft don't lard their adverts with technical jargon for a very good reason: most people who hold the purse-strings for a business would be put off by terms like "COM+", "ActiveX" and "OLE". These people wouldn't know the difference between the Windows Scripting Host and a game show host.

The people who DO know all this stuff tend to ignore marketing anyway and know where to find the info that cuts to the chase.

As for what else uses OLE: Internet Explorer is another example. In fact, anything that is an "ActiveX" object is scriptable to some extent. And Windows has tons of such objects. Need a rich text box? It's there: just drop it into your app. Need to play an MPEG video in your app? Drop in the Media Player component.

Windows' Scripting Host is capable of running scripts in any supported language, which is one of the main reasons for wanting a CLI in the first place. (The Host comes with support for VBScript and ECMAScript as standard, but support for other scripting languages, including Perl, is already out there.)

Microsoft may have their faults, but they're not stupid. Many users are not IT savvy, so there's no point blinding them with science. The MS publicity machine only ever talks about the tip of the Windows features iceberg.

As for KDE and GNOME, those teams still can't get cut and paste working properly. I wouldn't hold my breath waiting for them to deliver robust drag and drop.

--
Sean Timarco Baggaley

Cut and paste works just fine.

quote:
They do. They just don't advertise it much. Microsoft don't lard their adverts with technical jargon for a very good reason: most people who hold the purse-strings for a business would be put off by terms like "COM+", "ActiveX" and "OLE". These people wouldn't know the difference between the Windows Scripting Host and a game show host.

Perhaps I wasn't clear, but I meant: who makes OLE components (that's what it's for, isn't it?) besides Microsoft? Kinda boring if everything that can be included in a program has to be included in the monopoly, too.

Just out of curiosity, why do people think that a GUI as a frontend to a CLI program is a bad thing? I mean, if the wheel already exists, you don't need to reinvent it. You just need to make it look nicer.

Granted, this only applies to existing software that works well, and I certainly agree with you if you're talking about new programs, or improving on aging ones or something. But otherwise, I don't see why it's so bad.

The Artist Formerly Known as CmndrM

http://chaos.webhop.org

quote:
Original post by Flarelocke
Perhaps I wasn't clear, but I meant, who makes OLE components (that's what it's for, isn't it?) besides Microsoft?

OLE is an old (about 1994) term for COM (Component Object Model). There's A LOT of software for Windows that takes advantage of COM components. COM components aren't necessarily scriptable, but they can be if the developer chooses to go the extra mile. Most business/financial software is based on COM architecture because many third parties need to take advantage of its functionality programmatically. The reason why many applications are not based on COM mainly has to do with the fact that COM components are ridiculously hard to make in C++. However, I believe everything made in VB is based on COM. With Microsoft pushing .NET the scene is changing. Every class essentially becomes a component that natively supports scripting without COM's pain of macros and dual interfaces. We can't really take advantage of that yet in Unix's CLI sense because the shell framework isn't in place. There are rumors that Windows Longhorn will have a very powerful shell, most likely based on VB or C# (although I think it will support all .NET languages). This will probably be the time when we see a modern CLI, much more powerful and intuitive than the old Unix one.

quote:
Original post by Flarelocke
You've already allowed for the creation of a unique identifier for each service. Why not just extend that to the full URI spec?


This is already in place. DCOM (Distributed COM) allows you to do just that. This isn't often used on home machines, but in financial software you often don't know whether the component you're using resides on your local machine or somewhere else on the network.

quote:
Original post by Flarelocke
How do you save that mess to disk?


.NET components natively support serialization. In three lines of code you can save very complex hierarchies to disk and load them next time you need them. Of course it also provides very powerful interfaces for customization, in case default serialization doesn't suit you.

quote:
Original post by Strife
Just out of curiosity, why do people think that a GUI as a frontend to a CLI program is a bad thing?

Because a CLI is designed to be just that: a command line interface. What works well for CLIs doesn't work well for GUIs and vice versa. Look at all the Linux GUI tools that use CLI programs on the backend. They suck, and they do so for a reason. Try a small experiment: take a simple CLI command and try to design a good GUI for it. Chances are your GUI won't be very intuitive because it has to fit a design with a completely different philosophy.

quote:
Original Post by 9 Tenets of the Unix Philosophy
The program is loaded into memory, accomplishes its function, and then gets out of the way to allow the next single-minded program to begin.


This is exactly why a GUI cannot be designed as a frontend for a CLI. The philosophy is entirely different. Note that the philosophy from the quote above might have been OK forty years ago, when computers were mainly used as time-sharing machines for processing data. Today, when personal computers are used to browse the web, write documents, do video conferencing, etc., this philosophy is simply outdated.

I disagree with many other standpoints in that article because they're designed for entirely different purposes. Computers today have different purposes than they did forty years ago, and software engineering has also changed.

quote:
OLE is an old (about 1994) term for COM (Component Object Model)
I was under the impression that OLE was a protocol (for lack of a better word) that utilized COM for the purpose of embedding heterogeneous types of documents. Googling is inconclusive on the matter.

quote:
The reason why many applications are not based on COM mainly has to do with the fact that COM components are ridiculously hard to make in C++.
I'm also inclined to suspect that application developers are unwilling to risk their applications becoming fungible and invisible, or that developers will co-opt the functionality of their software and include features that will obviate any need to upgrade. Then again I could just be spiteful or paranoid.

quote:
.NET components natively support serialization. In three lines of code you can save very complex hierarchies to disk and load them next time you need them. Of course it also provides very powerful interfaces for customization, in case default serialization doesn't suit you.
Nontrivial applications of this ability, by any developer other than Microsoft, seem to be vaporware at the moment. There's no reason to suspect Microsoft will beat Mono and Gnome to useful application of these ideas.

quote:
Just out of curiosity, why do people think that a GUI as a frontend to a CLI program is a bad thing? I mean, if the wheel already exists, you don't need to reinvent it. You just need to make it look nicer.
Two reasons: data types and persistence.

Data Types:
With a few exceptions, everything going through the pipes that connect programs is text. The exceptions would be things like tarballs sent through an unzipper on their way to tar. Naturally, it would be inefficient to mandate that every piece of software accept every type of data and ignore the types it does not understand. It would be unstable and user-unfriendly to have the programs operate on the data anyway.

So you could send a MIME type through a pipe at the beginning, but this would likely break the existing software base, at least partially, and reusing that software base was the impetus in the first place.
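The "send a MIME type through the pipe first" idea can be sketched in a few lines of Python. The `Content-Type:` header line is an assumed framing invented for this example, not an existing pipe convention, and `io.BytesIO` stands in for a real pipe so the sketch is self-contained.

```python
import io
from typing import BinaryIO, Tuple

# Assumed framing: one header line naming the MIME type, then the payload.
HEADER_PREFIX = b"Content-Type: "


def write_typed(stream: BinaryIO, mime: str, payload: bytes) -> None:
    """Write the type header followed by the raw payload."""
    stream.write(HEADER_PREFIX + mime.encode("ascii") + b"\n")
    stream.write(payload)


def read_typed(stream: BinaryIO) -> Tuple[str, bytes]:
    """Read the type header, then return (mime, payload)."""
    header = stream.readline()
    if not header.startswith(HEADER_PREFIX):
        raise ValueError("stream has no type header")
    mime = header[len(HEADER_PREFIX):].strip().decode("ascii")
    return mime, stream.read()


def upper_filter(source: BinaryIO, sink: BinaryIO) -> None:
    """A filter that understands only text and refuses anything else."""
    mime, payload = read_typed(source)
    if not mime.startswith("text/"):
        raise TypeError(f"cannot process {mime}")
    write_typed(sink, mime, payload.upper())


def run_example() -> bytes:
    """Push typed text through the filter and return what comes out."""
    source, sink = io.BytesIO(), io.BytesIO()
    write_typed(source, "text/plain", b"hello\n")
    source.seek(0)
    upper_filter(source, sink)
    return sink.getvalue()
```

The breakage mentioned above is visible here too: an existing tool that knows nothing about the header would treat the `Content-Type:` line as part of its input.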

UI Persistence:
You don't enter data in a spreadsheet, tell some other program to read the data, and expect the original data to disappear, or for the spreadsheet itself to disappear, which is the equivalent of what happens in a pipe. You normally expect that a spreadsheet and its data will, barring any mishaps, remain in place and intact where you left them until dismissed. This is a necessary and desirable aspect of many applications, and only the most hardcore CLI addict could not do without its console analogs -- vi and emacs.

quote:
Original post by Flarelocke
I was under the impression that OLE was a protocol (for lack of a better word) that utilized COM for the purpose of embedding heterogeneous types of documents.

Google is unclear because there are no clear definitions. OLE generally refers to drag and drop and object embedding, and was based entirely on COM. Later on Microsoft dropped the term "OLE" in favor of "COM".

quote:
I'm also inclined to suspect that application developers are unwilling to risk their applications becoming fungible and invisible, or that developers will co-opt the functionality of their software and include features that will obviate any need to upgrade.

Perhaps, but I really don't think it's the primary reason. It's very hard to design your application in a way that allows exposing your API to others, for a number of reasons. First, architecturally, internal APIs are generally not ready for public use. It would take a lot more time and money to develop usable APIs along with documentation. Second, scripting is hard to support in languages like C and C++. With COM it requires a lot of dirty, repetitive work. Of course ATL simplified the job, but it's still not something I'd like to do. Plus, COM was ridiculously hard to manage due to registration, etc.

quote:
Nontrivial applications of this ability, by any developer other than Microsoft, seem to be vaporware at the moment. There's no reason to suspect Microsoft will beat Mono and Gnome to useful application of these ideas.

I really hope you're kidding. The serialization functionality is available in .NET right now, today, this very second. A lot of people are taking advantage of this functionality. I've looked at Bonobo and frankly it's a crappier version of COM. I can't imagine why anyone would choose to use it for simple software. Essentially the developer is forced to do all the housekeeping, while with .NET it's done for you automatically in a clean, elegant and consistent manner.

I'm taking a look right now at a system called Athene... it seems all the programs/scripts/system calls in it can be written in DML. Maybe this would create a better integration between GUI and CLI, since with DML (I think) you could write things independently of how they are presented on screen. Well, maybe I got it all wrong; in that case just ignore me...

Victor.

You know, whenever I see the letters "ML" at the end of something, I automatically become very cautious about real, practical use of the technology. I still think XML has no real practical use even though it's used everywhere for some screwed up reason.

quote:
Original post by CoffeeMug
I really hope you're kidding. The serialization functionality is available in .NET right now, today, this very second. A lot of people are taking advantage of this functionality. I've looked at Bonobo and frankly it's a crappier version of COM. I can't imagine why anyone would choose to use it for simple software. Essentially the developer is forced to do all the housekeeping, while with .NET it's done for you automatically in a clean, elegant and consistent manner.

Mono is essentially a clone of .NET, with ties to Gnome. I don't remember the acronyms, but they've got a working implementation of an interpreter for whatever the bytecode that C# compiles down to is called.

Yes, Bonobo has many unfortunate similarities with COM.

quote:
You know, whenever I see the letters "ML" at the end of something, I automatically become very cautious about real, practical use of the technology. I still think XML has no real practical use even though it's used everywhere for some screwed up reason.

Yeah, I thought so too, at first. Then I learned about XSLT, and it started to make a lot more sense.

quote:
Original post by CoffeeMug
You know, whenever I see the letters "ML" at the end of something, I automatically become very cautious about real, practical use of the technology. I still think XML has no real practical use even though it's used everywhere for some screwed up reason.


lol; with me it's exactly the opposite...

Victor.

quote:
Original post by -vic-
Just like Strife, I don't understand what the problem is with writing GUI front-ends to CLI utilities. Someone said the GUI front-ends in Linux suck, but I have to disagree. Yes, some of them suck, but not all of them. Examples: the Gnome System Tools, Komba, etc.


This was exactly my point. I know that many suck, but many don't. I think a lot of it is up to how the GUI is designed. It can either be done crappily or well. I still don't see why CLI programs can't work well behind a GUI frontend.

And again, I do agree that new software should not be frontends as much as possible. But in some circumstances (e.g., programs like mkisofs and cdrecord), why reinvent the wheel?

The Artist Formerly Known as CmndrM

http://chaos.webhop.org

quote:
Original post by CoffeeMug
OLE is an old (about 1994) term for COM (Component Object Model).

No, it isn't. OLE is built on top of COM, and comprises a bunch of COM interfaces that components must implement in order to be linkable and embeddable. OLE is more like an old term for ActiveX, which has now evolved into dotNet.

.NET holds an interesting prospect for scripting. Since it natively includes a couple of languages (VB and JScript .NET that I know of) with bindings to all .NET languages, and the scripts are JIT'd, any application that wishes to script its components can do so very easily. Say I have a C# program that has a bunch of functionality, but I want to be able to manipulate it through scripts. I build my C# program and provide simple code (no more than 25 lines) for running a script. I add references in a given script to my C# assembly, perhaps a few more .NET assemblies, and there I have it: fully scriptable. It doesn't get much simpler than that.
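For readers without .NET at hand, the same host-plus-script arrangement can be sketched in Python. This is only an analogue, not the .NET mechanism described above: `Inventory` and `run_script` are hypothetical names, and the built-in `exec` used here offers no sandboxing, so it should only ever be fed trusted scripts.

```python
from typing import Any, Dict


class Inventory:
    """Stand-in for the host application's own functionality."""

    def __init__(self) -> None:
        self.items: Dict[str, int] = {}

    def add(self, name: str, count: int = 1) -> None:
        """Add `count` of an item, creating the entry if needed."""
        self.items[name] = self.items.get(name, 0) + count


def run_script(source: str, **exposed: Any) -> Dict[str, Any]:
    """Run script text with the given host objects visible by name.

    Returns the script's namespace so the host can read back results.
    """
    namespace: Dict[str, Any] = dict(exposed)
    exec(compile(source, "<script>", "exec"), namespace)
    return namespace


def example() -> Dict[str, int]:
    """Host side: expose one object, run a two-line script against it."""
    inventory = Inventory()
    script = "inventory.add('sword', 2)\ninventory.add('sword')\n"
    run_script(script, inventory=inventory)
    return inventory.items
```

The host decides what the script can see by choosing which keyword arguments to pass to `run_script`, which plays roughly the role of the assembly references mentioned in the post.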



Gamedev for learning.
libGDN for putting it all together.
An opensource, cross platform, cross API game development library.

quote:
Original post by SabreMan
OLE is built on top of COM, and comprises a bunch of COM interfaces that components must implement in order to be linkable and embeddable. OLE is more like an old term for ActiveX, which has now evolved into dotNet.

Yes, OLE and ActiveX can be used interchangeably. However, as you said above, OLE is based on COM, and according to Eller (he was responsible for the MS systems group at some point; I forgot his first name) Microsoft decided to drop marketing OLE (and ActiveX) in favor of marketing the underlying technology - COM. You're correct though, my statement that OLE and COM can be used interchangeably was misleading. Sorry about that.

[edited by - CoffeeMug on July 3, 2003 5:20:13 PM]

quote:
Original post by Flarelocke
Why doesn't anyone on Windows actually use OLE (besides Microsoft, of course) already? As you say, the advantages are obvious with Office.

Or do they and just never tell me about it?


COM is the next generation of OLE, and everyone uses it (and .NET is the generation after that).
