KDE 4: Beyond Hierarchical Data. The Desktop as a Searchable Web of Context. Talk at FOSDEM 2005 by Scott Wheeler.

These are my rough notes only, corrections welcome.

Scott's goal is to test all the presentation templates by coming up with the longest, most pretentious names possible, and he seems to have won again.

He gave a talk at aKademy and again in Chile, but his ideas have changed since then. This talk is similar to the one at Linux Bangalore, but probably nobody from there is in here (although he has said that before and Linux Bangalore people were there).

Screenshot of an MS-DOS prompt. People Scott's age and older will remember that. Gopher screenshot. Gopher was a hierarchy-based predecessor to the web; you browsed its hierarchy like a filesystem. KControl is a hierarchy: a tree on the left, then tabs, then advanced buttons. GConf is the same thing. Yahoo tried to do the same for the web; nobody would try this now, and Google took over soon enough. Windows registry editor screenshot, he he he. Same sort of structure. Windows control panel, same thing. Does anyone remember the original Windows 3 control panel? It had 4 groups with a dozen settings in each, and hierarchies work fine for that, but KControl has maybe 1000 options in it and the hierarchy stops being useful. KDE developers get lost in KControl, so that probably means the users do too. With file managers, the only differences between those of today and the one from the original Mac System are file previews and anti-aliased fonts.

These days filesystems have thousands of files; his has about half a million. All the desktops are struggling with what to do with settings. We have choices: we can remove settings, which can help, or we can hide them, the way GConf and the Windows Registry tried to hide them from "normal users". He doesn't quite know what a normal user is; he's never met one. He doesn't think there really is such a thing: everyone has quirks in how they use their desktop. We have to aim for something more general than helping "a normal user".

Users remember how to find settings and files. He can remember how to change his mouse settings (regions->peripherals->mouse settings) but he can't remember most things. He can remember where he saves most files. The same thing happened with Gopher and Yahoo: you could remember where the things you mostly used were. If you remember the early 90s, the web was a few hundred gigabytes and it already needed search. Now most hard disks are the same size. The web is huge and hierarchical sorting has broken down; we have moved to metadata and data-aware approaches for information.

There used to be loads of search engines but now everyone uses Google. What are they doing right? There is the PageRank algorithm, which looks at relationships between pages, not just content. Metadata can be fooled, but the links make a web of context that can be used. Can we do the same sort of links on the desktop? Links can work much better on the desktop; the web is restricted by HTML and network time, issues we don't have on the desktop. "Why can I find things faster on the web via Google than on my own hard drive?" If he is searching for a header file, he looks on Google and then finds that file on his hard disk. The information is already on the hard disk but he can't find it; there is something wrong there.
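A rough illustration of the link-based ranking idea, not something shown in the talk: the sketch below is the simplified textbook form of PageRank (power iteration with a damping factor), not Google's actual implementation. The function name, the page names and the damping value of 0.85 are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank. `links` maps a page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share, the rest flows along links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical pages: "a" and "b" both link to "c", and "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
best = max(ranks, key=ranks.get)
```

Here "c" ends up ranked highest purely because of who links to it; nothing about its content is examined, which is the point being made about relationships versus metadata.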

So he started writing a desktop index and search tool; this was before Google Desktop Search, Beagle etc. Even though he had this search tool on the desktop, he could still find things faster in Google. So something was wrong.

This is not a search tool like the ones we've seen coming from Google, Beagle etc. in the last 9 months. It's not a database or storage tool. It's a mechanism for storing relationships and a framework for building applications (such as a search tool).

At one point keyboard layout and language settings were in different parts of KControl. They've been moved around a lot in KControl, but there's no way to show that relation. On the web you can just put in a link to show a relationship. If someone navigates from keyboard to language a lot, that information is currently lost. So links can be made from usage patterns. Links can also be made explicitly by users.
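A minimal sketch of how links might be derived from usage patterns, assuming a simple rule: two settings pages become linked once the user has moved directly between them a few times. The class name, the threshold and the page names are hypothetical; the talk did not describe a concrete mechanism.

```python
from collections import Counter


class UsageLinks:
    """Hypothetical tracker that turns repeated navigation into links."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.transitions = Counter()
        self.previous = None

    def visit(self, page):
        if self.previous is not None and self.previous != page:
            # Store the pair unordered so the resulting link works both ways.
            self.transitions[frozenset((self.previous, page))] += 1
        self.previous = page

    def related(self, page):
        """Pages linked to `page` because the user moved between them often enough."""
        result = []
        for pair, count in self.transitions.items():
            if page in pair and count >= self.threshold:
                (other,) = pair - {page}
                result.append(other)
        return sorted(result)


tracker = UsageLinks(threshold=2)
for page in ["keyboard", "language", "mouse", "keyboard", "language"]:
    tracker.visit(page)
related_to_keyboard = tracker.related("keyboard")
```

With this made-up history the keyboard and language pages become linked after the second hop between them, while the single visit to the mouse page creates no link.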

KNotes is a sticky notes program for KDE. With real sticky notes you stick them to things, your monitor or fridge. If we had a framework for links, you could take a sticky note and stick it to an e-mail. Unlike with real sticky notes, we could also search in the other direction.

With e-mails you get attachments, so you just save the file. That throws away a lot of information, such as who it came from and when. If you are searching for things, it would be useful to have all that information. So we can track back through the web of context using what we know about that e-mail. The project is currently called KLink; some basic structural elements exist but nothing is fixed. The target is KDE 4, which is a year or more away.
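To make the attachment example concrete, here is a small sketch of keeping the e-mail's details alongside a saved file. It is not KLink's API (nothing is fixed yet); the class, method names, file paths and addresses are invented for illustration.

```python
class LinkStore:
    """Hypothetical store linking saved files to the e-mails they came from."""

    def __init__(self):
        self.origin = {}  # file path -> details of the originating e-mail

    def save_attachment(self, path, sender, date, subject):
        # Record where the file came from at the moment it is saved.
        self.origin[path] = {"sender": sender, "date": date, "subject": subject}

    def where_from(self, path):
        """Follow the link from a file back to its e-mail, if one was recorded."""
        return self.origin.get(path)

    def files_from(self, sender):
        """Follow the links the other way: all files received from one person."""
        return sorted(p for p, mail in self.origin.items() if mail["sender"] == sender)


store = LinkStore()
store.save_attachment("/home/user/report.pdf", "bob@example.org", "2005-02-26", "Report")
store.save_attachment("/home/user/photo.png", "alice@example.org", "2005-02-27", "Photo")
bobs_files = store.files_from("bob@example.org")
```

The same record answers both "where did this file come from?" and "what did Bob send me?", neither of which a plain save to disk can answer.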

The data is represented as a graph with nodes and edges. The web of context is represented as a large bi-directional graph. Google shows pages of results rather than one endless list of links. This is because with a graph you get what feels like an infinite number of results, so you can't just search it all at once; paging gives the search a point at which to stop.
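The graph and the paging idea could be sketched as follows, assuming an adjacency-set representation and a breadth-first walk that is only advanced one page at a time. All names and node labels are hypothetical and say nothing about how KLink will actually store its data.

```python
from collections import deque
from itertools import islice


class ContextGraph:
    """Hypothetical bi-directional graph of desktop items."""

    def __init__(self):
        self.edges = {}

    def add_link(self, a, b):
        # Store the edge in both directions so it can be followed either way.
        self.edges.setdefault(a, set()).add(b)
        self.edges.setdefault(b, set()).add(a)

    def walk(self, start):
        """Lazily yield items reachable from `start`, nearest first."""
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in sorted(self.edges.get(node, ())):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
                    yield neighbour


def pages(results, page_size):
    """Cut a possibly very long result stream into pages, computed on demand."""
    iterator = iter(results)
    while True:
        page = list(islice(iterator, page_size))
        if not page:
            return
        yield page


graph = ContextGraph()
graph.add_link("note: call Bob", "email: from Bob")
graph.add_link("email: from Bob", "file: report.pdf")
graph.add_link("file: report.pdf", "contact: Bob")

result_pages = pages(graph.walk("note: call Bob"), page_size=2)
first_page = next(result_pages)
```

Because both the walk and the paging are lazy, only as much of the graph is explored as is needed to fill the page being asked for.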

These ideas aren't completely new; the semantic desktop and the semantic web have been researched in universities. The research mockups use pretend desktops with pretend data and pretend applications. In KDE we can do this for real, beyond a research prototype.

Question about reindexing: that takes a lot of time, so they'll try to keep it to a minimum. Links can be updated during saves and loads.

Why use nodes and links instead of the RDF model (not the RDF XML format)? Mostly this is because he didn't know about RDF when he was first looking into this. He is looking at that now. JonFlux says RDF is about building a model; with KLink you don't worry about building a model because it should happen naturally.

Has he looked at object filesystems etc? He's read some research papers and it's all pretty interesting, but it's not in real use; until you can send e-mail with it, nobody will be interested.

Reiser 4? It's cool but not something we can rely on yet because it's not used by enough people.

Metadata? Of course that's useful, but you need links based on something else to find answers which are not obvious to those making the metadata.

Desktop independent? Yes at some point, but he wants to get it working first and then start convincing other people that it's worth using, so it can be used beyond KDE on servers and other desktops.

What's fun about this is that when he starts talking to people he can see them thinking, and then they come up with some cool use for these ideas.
