Linux's Critical Path To The Desktop

I've been using Linux on and off for about 5 years now, and every time I try it, it's much better than the last time. So much so that people are now seriously comparing it to Windows, and asking whether "normal" users could make the transition to Linux and still get their emailing, word processing, video watching etc. done.

Well, my personal experience is that, even with a reasonable knowledge of Linux, I still struggle with tasks that are obscenely simple on Windows and STILL just "hard" on Linux. That is true even though, once you've mastered a task on Linux, it works faster and more efficiently than on Windows AND you have that warm fuzzy feeling of not contributing to the damnation of the world's software via paying tax to the beast ;).

So this is my own little list of critical things that we in the Linux community need to push/fix so that the Linux desktop becomes a reality.

Now I may be missing some "obvious" things here, but the overall impression I get is still somewhat accurate. Get the "spirit" of what I'm saying rather than bitching about how "wrong" I am.

Desktop Configuration Formats/Files

KDE, Gnome and Rox all have their own MIME format database and their own application menu definition. This is just silly: applications that want to read or write these configurations need 3 or more different bits of code to do so, and if a user switches between several different desktops, the config is not shared.

It just makes sense for all desktops to use the same file formats and locations for configuration of MIME data and application menus (and any other sharable configuration).

So Free Desktop have put together some projects to define, once and for all, a common MIME database format/location and Application Menu standard.

These are going to be critical to getting Linux's ease of use even close to that of Windows. The user is not going to put up with editing this information in different places, let alone understand why there are different places for the same information in the first place.
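
To make the payoff of a shared format concrete, here is a minimal sketch of one reader serving every desktop. It assumes application entries are INI-style text with a "Desktop Entry" section and a semicolon-separated MimeType key, which is how I understand the Free Desktop entry files to look; the sample application, its values and the function names are made up for illustration.

```python
import configparser

# Illustrative entry text; "Example Editor" and its values are hypothetical.
SAMPLE_ENTRY = """\
[Desktop Entry]
Type=Application
Name=Example Editor
Exec=example-editor %f
MimeType=text/plain;text/html;
"""


def read_entry(text):
    """Parse one INI-style application entry into a plain dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(text)
    section = parser["Desktop Entry"]
    raw_types = section.get("MimeType", fallback="")
    mime_types = [m for m in raw_types.split(";") if m]
    return {
        "name": section["Name"],
        "exec": section["Exec"],
        "mime_types": mime_types,
    }


def apps_by_mime(entries):
    """Build a MIME type -> list of application names lookup."""
    table = {}
    for entry in entries:
        for mime in entry["mime_types"]:
            table.setdefault(mime, []).append(entry["name"])
    return table


entry = read_entry(SAMPLE_ENTRY)
print(apps_by_mime([entry]))
```

With one format, the same two functions could feed the menu and the "open with" list on any desktop, instead of 3 separate parsers.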

Media

JWZ succinctly sums up the "pathetic state of Linux usability" in regards to video, and I just have to agree. After some 6 hours of playing with the applications available for playing a DVD I pretty much gave up and headed back to Windows... where I can use polished programs that "just work". OK, so the kernel recompiles were my own fault for using Gentoo, but things like TV out are a mind bending exploration of driver related command line hell (which is far faaar worse than DLL hell). We aren't there yet: users can't configure a frame buffer via the command line, and it seems neither can I, a seasoned computer user.

As a side issue, sound is somewhat sorted out. Things tend to work "out of the box" on a modern distribution. I even had sound when I first booted after installing. We are getting there with sound.

The system should come with standardised codecs for audio and video compression/decompression, usable in any application.

The sound system should have unlimited virtual channels of audio muxed together to output on the sound device. An application should also be able to claim exclusive access to the sound IO device for specific purposes. For example, a DVD player might want to lock out all system sounds while DVD playback is in progress.
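
As a rough sketch of what "virtual channels muxed together" plus exclusive access could mean, the toy mixer below sums 16-bit PCM streams sample by sample, clamps the result, and lets one named stream lock out the others. The class, method and stream names are hypothetical, and a real sound server would also have to deal with sample rates, buffering and timing.

```python
def mix_channels(channels, sample_min=-32768, sample_max=32767):
    """Sum equal-rate 16-bit PCM streams sample by sample, clamping to range."""
    if not channels:
        return []
    length = max(len(c) for c in channels)
    mixed = []
    for i in range(length):
        # Shorter streams simply contribute silence once they run out.
        total = sum(c[i] for c in channels if i < len(c))
        mixed.append(max(sample_min, min(sample_max, total)))
    return mixed


class Mixer:
    """Toy mixer: any number of named streams, optional exclusive owner."""

    def __init__(self):
        self.exclusive_owner = None

    def claim(self, owner):
        """Give one stream exclusive use of the output device."""
        if self.exclusive_owner not in (None, owner):
            raise RuntimeError("device already claimed by " + self.exclusive_owner)
        self.exclusive_owner = owner

    def release(self, owner):
        if self.exclusive_owner == owner:
            self.exclusive_owner = None

    def output(self, streams):
        """Mix a dict of stream name -> samples, honouring an exclusive claim."""
        if self.exclusive_owner is not None:
            streams = {k: v for k, v in streams.items() if k == self.exclusive_owner}
        return mix_channels(list(streams.values()))


mixer = Mixer()
mixer.claim("dvd")  # system beeps are now locked out
print(mixer.output({"dvd": [100, 200], "beep": [50, 50]}))
```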

The video hardware's overlay functionality should be exported to applications via an easy to use API.

Application Install

*sigh*, where do I start? I've been a believer in the "simple is better" approach to software distribution for a long time now. All my own applications come in a zip file which, if unzipped on your disk, will run without any install as such. Now this is marginally more complicated than a Windows "setup.exe", but its benefits outweigh the disadvantages.

When looking at package management, application install and library dependencies I feel we are "over engineering" a solution to what has become a barrier to entry for new users. I hate to admit it, but Windows has a far easier system for application installation, so much so that any grandma can install a behemoth application like Microsoft Office, without so much as blinking.

I've given up on packages as completely pointless. That's why I switched to Gentoo, solely to take advantage of its Portage package management, which on the whole has served me well (apart from its data tree breaking at one point). A simple command, "emerge mozilla", downloads and compiles what is a fairly hefty app. That's usability, folks. Now on non-source distributions Mozilla comes as an X11 install app with all the data files in the download. This is pretty much equivalent to the Windows way of doing it, and I salute the Mozilla developers for making life so easy.

Good requirements for application management/deployment are:

  • A one file download that can be run or decompressed into place.
  • An application and all its configuration can be moved or backed up from a single directory. This allows the ultimate uninstall - delete the directory, and puts the user in control, rather than holding them hostage to some (usually buggy) package manager.
  • Application dependencies should either be included inside the install, part of the base OS or locatable by the application itself. If you go for the final option, the application should still run without the library and reduce its functionality to match the installed libraries. This may mean that your app's download is padded out by a few libraries you really need, but the side effect is that users won't ever have to hunt for some missing dependency, which invariably conflicts with something else, even a different version of the same library.
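
The last requirement, running without a library and reducing functionality to match, can be sketched in a few lines. This is only an illustration of the idea: the feature names and the missing module name are made up, and "json" stands in for a library that happens to be installed.

```python
import importlib


def load_optional(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def available_features(optional_modules):
    """Map each feature name to True/False depending on its library."""
    return {
        feature: load_optional(module) is not None
        for feature, module in optional_modules.items()
    }


# Hypothetical feature table: the second module name is deliberately bogus.
features = available_features({
    "json export": "json",
    "spell check": "no_such_spellcheck_library_xyz",
})
for name, enabled in features.items():
    print(name, "enabled" if enabled else "disabled (library not found)")
```

The app starts either way; the user just sees a greyed out menu item instead of a cryptic loader error.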

Text Encoding

In 5 years' time I don't think we'll remember when exactly we stopped caring about the encoding that text was in. But sooner or later everything is going to be Unicode. It's not perfect, but it is so much better than the alternatives that I don't see why we should put up with charset based encodings. Sure, legacy apps are still going to use charset/8bit, but I'd like to see all new applications written to handle Unicode natively, and the OS should push apps to conform by providing the easiest APIs for Unicode. For instance, it'd be nice if X11 automatically added a Unicode version of strings provided by applications for selections and DND. Ideally it should be simple to write an app that only works with Unicode while the OS provides IO conversion for that app.

Likewise, all keyboard input should be converted to UTF-32 by the GUI before the applications have to process it.
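
A small sketch of the "Unicode inside, conversion at the IO boundary" idea: legacy bytes are decoded once on the way in, the app only ever sees Unicode, and text is encoded again only on the way out. The function names are illustrative; the charset names are the standard Python codec names.

```python
def decode_legacy(raw, charset):
    """Convert bytes in a legacy charset to a Unicode string at the input edge."""
    return raw.decode(charset)


def to_utf32_code_points(text):
    """Return one integer per character, i.e. the UTF-32 view of the text."""
    return [ord(ch) for ch in text]


def encode_for_output(text, charset):
    """Convert back to bytes for a legacy consumer, replacing what won't fit."""
    return text.encode(charset, errors="replace")


word = decode_legacy(b"caf\xe9", "latin-1")   # bytes from an 8bit app
print(to_utf32_code_points(word))             # what a Unicode-only app sees
print(encode_for_output(word, "utf-8"))       # bytes for a UTF-8 consumer
```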

Fonts

X11 fonts have been one of the less stellar things about working in Linux. But the situation is rapidly getting better thanks to FontConfig and Xft/XRender. These tools provide the sort of font management and rendering that Windows users take for granted. The sooner everyone uses them, the sooner the headache of X11 fonts will go away. The Linux port of my widget set is already Xft only and will use FontConfig when it's finished. Some legacy APIs are so bad that it is best to demand something better, or just not run at all, to force people into the new millennium. This is the approach I've taken: certain libraries are just part of a modern Linux setup.

Server Administration

Some server capabilities are becoming standard functions in a Linux install. Things like Samba, NAT, Apache, FTP, Mail and so on. Users don't want to know about text configuration files for setting up NICs and NAT and the associated parameters. It's so simple under Windows, yet after all these years it still takes quite a bit of research and fiddling around to get basic network functions like Samba and NAT working. I've just been there, and what is a single checkbox on Windows is a complicated config file hidden away in /etc.

I guess my real complaint here is that desktops need to provide a robust UI for administering services like the ones I've mentioned above. And the problem is that even if one desktop does it well, which is unlikely in the near future, the rest of them have to do their own implementation.

Maybe a simple solution is to make a common "NetConfig" project like the FontConfig project that puts a lot of similar network related administration and setup under the one root, regardless of the desktop, or distribution.

Desktop GUI

X is going to die... one way or another. People have been looking for an alternative for years and there have been a few. Personally, a lot of my enjoyment of an environment is related to the GUI's strengths. BeOS was an interesting take on what a graphical environment could be, as are many of the non-X, non-M$ OSes. Every time I fire up X I cringe; it brings the beefiest machines to their knees. Try explaining the benefits of X (like remote display) to a user who only ever uses their computer live in person at the keyboard. People don't need remote display, and even when they do you can tack a product on top of the OS to give you that functionality.

X needs replacing and no-one thinks it can be done. But seriously, how many apps use Xlib directly? Not that many; most use a widget set to hide them from the pain that is Xlib. So if we produced a truly modern GUI I think that it would be simple enough to port GTK and Qt to that new API, and you'd instantly have enough apps to run a reasonable desktop. On top of that you'd be able to implement a legacy Xlib implementation that connects to the new GUI server to allow legacy apps to run natively.

It's a big project, but the rewards would be worth it IMHO. X11 is big and slow, and people aren't going to put up with it forever.

It would be an excellent opportunity to fix a swath of problems: (the server is the graphical environment, the client is the app in that environment, unlike X's wrong nomenclature - get it right geez)

  • Windows should be clipped to the invalid (dirty) region by the server when repainting.
  • Windows should be able to have a "Popup" flag so that the server closes them when the user clicks somewhere else.
  • DND should be standardized by the GUI API, not an ad-hoc extension like XDND, which as good as it is, really makes life hard for the independent app. A lot of that code should really just be done for you by the server.
  • The latency when dragging or resizing a window is an order of magnitude too high.
  • Message passing throughput is abysmal.
  • Hardware graphics features should be easily accessible via APIs. Things like mapping a window's client area to the TV out should be a simple one line call in an application.
  • Something like Xft should be the default API for text. Missing glyphs should be replaced from a different font by the server.
  • The GUI should natively support popups/menus (which counts X out).
  • The GUI shouldn't allow a non-topmost window to have focus.
  • All file move, copy and delete processes should be in the same progress window. An API should be provided to facilitate this.
  • The GUI should provide transparency and shadows etc using graphics hardware.
  • The system should provide standard dialogs to applications for printing, print preferences, colour selection, find/replace, open/save file and selecting a directory.
  • The system should provide an easy way for applications to share system wide look and feel settings. Including colours for everything visible on the screen, widths and heights for standard components and other settings that affect look and feel.
  • GUI mouse / keyboard capture shouldn't be exclusive. If a widget/window wants to know about ALL mouse or keyboard events, then it should just be able to set a flag and get those events regardless of whether the mouse is over the widget or the widget has the focus.
  • The GUI should never ever EVER take the focus away from what the user is doing. Ever!
  • Standard config for mouse, keyboard, display, network, sound and media settings should be available in a control panel.
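
To illustrate the first item in the list, clipping a repaint to the invalid (dirty) region, here is a minimal sketch using plain rectangles. It assumes the dirty region is a single rectangle with exclusive right/bottom edges; a real server would track a region made of many rectangles, and all names here are hypothetical.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    left: int
    top: int
    right: int    # exclusive
    bottom: int   # exclusive


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlap of two rectangles, or None if they don't overlap."""
    left = max(a.left, b.left)
    top = max(a.top, b.top)
    right = min(a.right, b.right)
    bottom = min(a.bottom, b.bottom)
    if left >= right or top >= bottom:
        return None
    return Rect(left, top, right, bottom)


def clip_for_repaint(window: Rect, dirty: Rect) -> Optional[Rect]:
    """The only area the server should let the window draw into."""
    return intersect(window, dirty)


window = Rect(0, 0, 100, 100)
dirty = Rect(50, 50, 200, 200)
print(clip_for_repaint(window, dirty))
```

If the server applies this clip itself, an app can repaint everything naively and still only touch the pixels that actually need redrawing.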

There are probably quite a few more things in there that I either don't know about or can't remember right now, but either X11 has a lot of changing to do or we need to start from scratch.

File System

All files should have their mime-type stored as metadata. Additionally, the user should be able to associate an application with a file using the application's mime-type, stored as a separate piece of metadata: "open with".

I would like a case preserving but not case sensitive file system. Users think this way, so the OS should work this way too.

The file system browser should be standard, and usable from within other apps. It should basically be the same as M$ Explorer, which is still an order of magnitude better than anything the OS community has offered up. Although M$ hasn't got the threading right, so you can't still do work while the network drivers are timing out.
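
"Case preserving but not case sensitive" is easy to state and easy to get subtly wrong, so here is a toy in-memory directory showing the intended behaviour. It is a sketch only: the class is hypothetical and it uses Python's str.casefold() for comparison, which is one possible folding rule among several.

```python
class CaseInsensitiveDirectory:
    """Toy directory: names compare without case, but display as typed."""

    def __init__(self):
        self._entries = {}  # folded name -> (original name, content)

    def create(self, name, content=b""):
        key = name.casefold()
        if key in self._entries:
            # "readme.txt" and "ReadMe.txt" are the same file to the user.
            raise FileExistsError(name)
        self._entries[key] = (name, content)

    def lookup(self, name):
        """Return (name as originally typed, content); KeyError if missing."""
        return self._entries[name.casefold()]

    def listing(self):
        return sorted(original for original, _ in self._entries.values())


d = CaseInsensitiveDirectory()
d.create("ReadMe.txt", b"hello")
print(d.lookup("README.TXT"))  # found, and still spelled "ReadMe.txt"
print(d.listing())
```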

Kernel

Lower and upper limits should be available for process resources, like HD bandwidth, % CPU usage, network throughput etc.

There should always be enough CPU to type in commands at a term. This may mean limiting the throughput of virtual memory swapping so it doesn't lock things up under massive load.

The system should provide a standard interface to see what applications are running and what resources they're using, and give the user the option to terminate applications. And the terminate option should work all the time, not just when the system feels like it (which counts the Linux kernel out). This UI should probably let the user edit process limits as well. A process should have access to its own limits so it can tell the system what it actually NEEDS to run properly. Here are a few applications for resource limits that I know don't work well on current systems:

  • DVD players every now and then don't get enough CPU to decode their data streams, causing them to skip.
  • Video capture should be able to allocate a set amount of HD bandwidth to prevent other applications stalling their pipeline to write video.
  • An MP3 player might want to allocate a slice of CPU to guarantee playback without skips.
  • A broken application might get out of control allocating memory in an infinite loop, you should be able to limit the memory it can allocate.
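
The last case, capping a runaway allocator, is the one upper limit that can already be approximated from user space on Linux. The sketch below uses Python's standard resource module to set an address space limit (RLIMIT_AS) on a child process before it starts; the helper name and the sizes are arbitrary choices for illustration, and it does nothing for the lower limits (guaranteed CPU or HD bandwidth) asked for above.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(argv, max_bytes):
    """Run a command with its address space capped; return its exit code."""

    def apply_limit():
        # Runs in the child just before exec. Only the soft limit is lowered,
        # and never above the existing hard limit.
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        limit = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (limit, hard))

    completed = subprocess.run(argv, preexec_fn=apply_limit,
                               stderr=subprocess.DEVNULL)
    return completed.returncode


one_gib = 1024 ** 3
# A child that tries to grab 2 GiB in one go; it stands in for the broken app.
greedy = [sys.executable, "-c", "x = bytearray(2 * 1024 ** 3)"]
print("greedy app exit code:", run_with_memory_limit(greedy, one_gib))
```

The greedy child fails with a memory error instead of dragging the whole machine into swap, which is roughly the behaviour the task manager UI described above should expose to the user.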