An intermediary-free broadcast system can easily be implemented by setting up a memory segment that is shared between processes. One process writes to the memory segment, and many other processes can read from it (usually after being signaled to do so via a condition variable).
I dispute the notion that it is "easy" and the implicit notion that it's better than a system with an intermediary.
* Inter-process condition variables are not supported on all operating systems. The next best thing would be sockets or message queues but these don't support broadcasting.
* Having multiple processes read and modify the same memory segment requires some kind of locking. Inter-process locks are expensive (or at least, more expensive than local-process locks) on most operating systems. On Linux a lock file is more expensive than a POSIX mutex. This can really kill performance compared to a system in which sockets are used to communicate with an intermediary and other bus listeners.
* Any process can corrupt the shared memory segment and screw up the state of all other processes that use it.
* With shared memory it becomes very hard to reliably detect dead processes. Suppose a process connects to the bus and increments a reference counter in the shared memory. If that process crashes without decrementing the counter, the other processes never learn that the number of participants has gone down. You can add checks, such as verifying that the other PIDs still exist, but that is unreliable and kills performance.
If you can prove me wrong, that'd be great, because I'm looking for an IPC mechanism that's faster than Unix domain sockets. I tried inventing my own with shared memory and inter-process locks and couldn't get it nearly as fast as sockets.
I didn't mean to imply that using shared memory would be better than an intermediary. I do still think that implementing a simple broadcast mechanism with shared memory is fairly easy, provided that you have access to inter-process condition variables. If not, things do get a lot more complicated (e.g. polling, yuck).
I don't think I can prove you wrong, per se. I do have a suggestion, though: have you tried using shared memory for data transfer, and using a UNIX socket for synchronization? Only a tiny bit of data would have to be transferred over the UNIX socket (shared memory address to read from, etc), and the rest of the data could be read via the shared memory.
This really seems like something that shouldn't belong in the kernel. The D-Bus protocol is a bit more complex than other IPC, and it makes sense to me to keep that level of complexity in userspace. (Of course, if it were a simpler model -- just a bunch of multicast channels rather than all this filtering -- it might make sense to put bits in the kernel.)
That's the downside of having VM-based system emulation easily available: people who think that userland scripting is too complex move this complex functionality, which sits in semi-critical areas, into C code, possibly into the kernel, because they can debug it just fine when they're trying it out in the VM.
Fast forward: the thing gets deployed by default on Ubuntu (or SuSE, or Fedora ... whatever), and people start getting hard-to-debug crashes. Someone, somewhere, will run into an ugly corner case. Or some app that has been quietly chugging along in the corner suddenly causes serious breakage.
That's the moment when you get all those bugs closed with WORKSFORME, INVALID or other non-solutions, because users can't get the debugging info on their production system, and the enthusiastic developers can't reproduce it on their system and are not as enthusiastic about figuring out which obscure corner case caused the system to misbehave.
As an example, see here:
https://bugs.launchpad.net/ubuntu/+source/ureadahead/+bug/48...
Under various conditions, the boot process just gets stuck: for example, when something has juggled the hard drive letters and what used to be /dev/sda is now /dev/sdb and vice versa, or a routine fsck check runs, or something else happens. When you switch off the splash screen (or don't have it switched on at all, for obvious reasons), all you see is the (spurious) message from ureadahead-other. Net result: these spurious hangs on boot, which were reasonably easy to diagnose in the time of SysVinit, are now a serious PITA, and the developers responsible for the boot process don't give a fart because people don't have the knowledge that's needed to debug the subtle interplay of dbus, upstart, plymouth (it's obvious that "plymouth" is the splash-screen software and not a device driver for a CRT-based face-tanning device, isn't it?), and possibly others.
(Full disclosure: I twice or thrice spent a whole morning figuring out a random boot problem without getting anywhere. After several hours, I found out that part of the problem was caused by mountall not finding something and then asking whether you want to continue booting, without actually displaying a prompt that says so. Back then, I managed to find the corresponding bug on Launchpad among the 5+ reports about weird, hard-to-diagnose boot problems where all you see is the ureadahead-other message.)
TL/DR: if you consider moving X into the kernel, please imagine yourself in a situation where X breaks in production and you have to guide a user through the debugging of X and various assorted components that talk to X in order to find out whether it's X or Y or ABC and what exactly is breaking. If that thought would give you a bad dream, please refrain from putting X into the kernel. Thank you.
-10 for fixing it by putting dbus-daemon into the kernel
To redeem yourself please try again with the following requirements:
* No dbus-daemon
* No changing or adding kernel interfaces
Hint: put all the brains inside of libdbus