good-taste

We have now two implementations, one good enough for a standalone project, another one good enough for a library. But we are not done.

Can we do better than GLib?

I’ve often argued that the gold standard for development is the Linux project (by “Linux” I’m always referring to the kernel, because that’s the name of the project). Every time I’ve seen a debate whether of development practices, code-style, tooling, you name it, Linux is doing it right. And never have I seen a case where another project did better.

Of course this is subjective and a matter of taste, but taste is precisely what we are exploring, so how is Linux approaching this?

The Linux way

struct llist_head {
	struct llist_node *first;
};

struct llist_node {
	struct llist_node *next;
};

Well, that’s certainly different.

For starters it’s two structures rather than one (I’ll explain why later), but more importantly: there’s no data pointer. Where are we supposed to put our data?

It turns out that Linux developers know C, and I mean: really know C.

What we can do is embed a structure inside another structure, for example:

struct base {
	int id;
	char *name;
};

struct special {
	void *extra;
	struct base base;
};

In this case the base structure is inside the special structure, so we can access the members of struct base through the special structure, like this: x.base.name.

That means we can embed the generic node inside a custom node:

struct node {
	int value;
	struct llist_node node;
};

To access the next element of our custom node, we do x.node.next. Problem solved.

Not so fast.

The next element is a struct llist_node *, not a struct node * – it’s the generic node, not the custom node. So how can we access the value of the next element – if the value is in the custom stucture and all we have is a pointer to the next generic structure?

This is where it gets tricky.

If the address of x.node is 0x1008 and node is at offset 8 in the struct, then x is at 0x1000. But how can we calculate that programmatically? There’s a macro for that: offsetof.

offsetof(struct node, node) gives us 8, so all we have to do is substract that from the address of the generic node: &x.node - offsetof(struct node, node) == &x.

Do we really have to do this calculation? No, linux has a convenient macro:

#define container_of(ptr, type, member) ((type *)((void *)(ptr) - offsetof(type, member)))

So we can do container_of(&x.node, struct node, node) to get our custom node data. Linux has a llist_entry macro to help us, but that’s basically the same thing as container_of, and since we know the specific type of our structure and its member, we can create our own macro:

#define node_entry(ptr) container_of(ptr, struct node, node)

So we simply do node_entry(&x.node). Given a generic struct llist_node pointer p, we can fetch its corresponding custom value with node_entry(p)->value.

Didn’t I tell you they really know C? Here is a blog post from Greg Kroah-Hartman explaining the macro.

But we are not done yet. If we rearrange our structure to have the struct llist_node on top like this:

struct node {
	struct llist_node node;
	int value;
};

Then offsetof(struct node, node) will always be zero, therefore the container_of macro is essentially doing nothing:

#define container_of(ptr, type, member) ((type *)((void *)(ptr) - offsetof(type, member)))
#define container_of(ptr, type, member) ((type *)(ptr))

This is just a cast, so a struct node * can be treated like a struct llist_node *.

If we have these structures with similar members:

struct base {
	int id;
	char *name;
};

struct special {
	int id;
	char *name;
	void *extra;
};

As long as the extra members are stored after the base ones, variables of type struct special can be considered struct base. So if we have a variable struct special *x, these two statements are equivalent:

x->name
((struct base *)x)->name

Why? Because in struct base the name member is at offset 8, and in struct special it’s also at offset 8.

This doesn’t change if we completely embed one structure inside the other:

struct special {
	struct base base;
	void *extra;
};

Before the member extra was at offset 16, and since sizeof(struct base) is 16, it still is.

So we can embed the list node structure inside a custom structure, add any elements we want, and it will still behave as if it was a list node:

struct node {
	struct llist_node node;
	int value;
};

So, instead of this:

struct node {
	int value;
	struct llist_node node;
};
#define container_of(ptr, type, member) ((type *)((void *)(ptr) - offsetof(type, member)))
#define node_entry(ptr) container_of(ptr, struct node, node)

node_entry(p)->value

We can do this:

struct node {
	struct llist_node node;
	int value;
};

((struct node *)p)->value

That’s a lot of complexity just to declare a node, and we haven’t even created one. How is this “better”?

Well, in GLib this is how we created a node:

node_data a = { .value = 0 };
GSList link_a = { .data = &a };

Note: in the real world we would be allocating this memory, so instead of two initializations we would have two malloc()s and therefore subsequently we would need to do two free()s.

How do you create a node in the linux version?

struct node a = { .value = 0 };

By pushing all the complexity into the library, the users of that library become really simple. The best programmers in the planet have spent countless hours improving these libraries, it would be really hard to find any possible area of improvement, this is quite simply: state of the art.

Now it’s time to actually do something with these data.

Delete

In Linux there’s multiple implementations of linked lists, and the closest one to our needs is the lock-less single linked list (llist). We are going to ignore the lock-less part part, and focus the meat of the code.

This is what the code for deleting an element looks like:

Actually, there is no code for that.

Because Linux cares a lot about performance, if you want to delete a single element inside a list, a single linked list is simply not the right data structure for it: you want a doubly linked list. By not providing an implementation for this, linux is forcing you to pick the right data structure.

But let’s pretend that deleting an element inside a single linked list does make sense, what would that code look like?

This is my best initial attempt:

void llist_del(struct llist_head *list, struct llist_node *entry)
{
	struct llist_node *p;

	for (p = list->first; p; p = p->next) {
		if (p->next != entry) continue;
		p->next = entry->next;
		return;
	}
}

This code looks different than the implementations in part 1 and 2 because now we have two structures, is that even necessary?

Before making any judgements, let’s compare it with the best version we have so far:

GSList *g_slist_remove_link(GSList *list, GSList *link)
{
	GSList **p;

	for (p = &list; *p; p = &(*p)->next) {
		if (*p != link) continue;
		*p = (*p)->next;
		break;
	}

	return list;
}

One obvious difference: one version returns a GSList * the other one doesn’t return anything.

At this point it should be obvious that we are forgetting something.

Tests

OK, we already explored how to declare nodes, so:

struct node a = { .value = 0 };
struct node b = { .value = 1 };
struct node c = { .value = 2 };

To link the nodes:

a.node.next = &b.node;
b.node.next = &c.node;

So far nothing particularly special, but we haven’t done anything regarding that other structure struct llist_head:

struct llist_head list = { .first = &a.node };

Our list is in fact a data structure that contains a pointer to the first node in the list.

OK, but why? Couldn’t we just have a pointer to the first struct llist_node? We’ll come back to that, first let’s finish the tests.

// Remove the second node
llist_del(&list, &b.node);
if (a.node.next != &c.node) return false;

// Remove the first node
llist_del(&list, &a.node);
if (list.first != &c.node) return false;

// Remove the same node again
llist_del(&list, &a.node);
if (list.first != &c.node) return false;

If we run the tests they actually fail, so clearly we missed something.

The head

If we do a little debugging, we quickly find out that removing the first node doesn’t work.

Let’s go back to the code:

void llist_del(struct llist_head *list, struct llist_node *entry)
{
	struct llist_node *p;

	for (p = list->first; p; p = p->next) {
		if (p->next != entry) continue;
		p->next = entry->next;
		return;
	}
}

What happens if we try to remove the first element in the list? Shouldn’t list->first be updated in that case?

It turns out that we need an extra check for the special case where we want to remove the first element:

if (entry == list->first) {
	list->first = entry->next;
	return;
}

If we run the tests, now they pass. We finally have a version that works.

So far it seems this separate struct head * is only creating headaches for no gain.

But hold on… Linus Torvalds told us that it’s better to avoid special cases, can we refactor the code to avoid this special case for the first element?

To do that we would need the for loop:

for (p = list->first; p; p = p->next) {
	if (p->next != entry) continue;
	p->next = entry->next;
	return;
}

To somehow do this as well:

if (entry == list->first) {
	list->first = entry->next;
	return;
}

Well, these two lines seem similar enough:

list->first = entry->next;
p->next     = entry->next;

All we need is list->first to somehow be equivalent to p->next, but not only are the members different, list is a struct head * and p is a struct node *, so it seems we hit a dead end.

But we already saw that in C one structure can pretend to be another structure, so a struct llist_head * can pretend to be a struct llist_node *, as long as the first members of the structures are equivalent, but are they?

struct llist_head {
	struct llist_node *first;
};

struct llist_node {
	struct llist_node *next;
};

Actually, the first member of both structures is a struct llist_node *, the only difference is the name.

So in theory this is true:

&p->next == &list->first

Then all we need to do is pretend that (struct llist_node *)p == (struct llist_head *)list.

So this:

if (entry == list->first) {
	list->first = entry->next;
	return;
}

Becomes this:

p = (struct llist_node *)list;
if (entry == p->next) {
	p->next = entry->next;
	return;
}

Now the code is doing something very similar to what the for loop is doing:

for (p = list->first; p; p = p->next) {
	if (p->next != entry) continue;
	p->next = entry->next;
	return;
}

In both cases p->next is compared with entry, and if they are equal, then p->next is updated with entry->next.

The difference is that the loop starts with list->first, and then continues with p->next in the negative case. The special check starts with list and does not continue.

We can pretend list is a node pointer, then the code in the for loop will do the same as above in the affirmative, and go to p->next in the negative, which is the same as list->first, thus continuing with the behavior of the original loop.

So all we need to do is start with list instead of list->first (and cast it):

void llist_del(struct llist_head *list, struct llist_node *entry)
{
	struct llist_node *p;

	for (p = (struct llist_node *)list; p; p = p->next) {
		if (p->next != entry) continue;
		p->next = entry->next;
		return;
	}
}

Believe it or not, this does exactly the same thing as this:

void llist_del(struct llist_head *list, struct llist_node *entry)
{
	struct llist_node *p;

	if (entry == list->first) {
		list->first = entry->next;
		return;
	}

	for (p = list->first; p; p = p->next) {
		if (p->next != entry) continue;
		p->next = entry->next;
		return;
	}
}

Another way to think about this is that struct llist_head acts as a fake first node in the list, a node that can never be removed and points to the actual first node.

Good bye special case.

Now we are starting to see the usefulness of a separate structure for the head of the list.

What about GLib?

OK, so a separate list head seems to be useful, but why doesn’t GLib do this?

Let’s comapre the two versions:

void llist_del(struct llist_head *list, struct llist_node *entry)
{
	struct llist_node *p;

	for (p = (struct llist_node *)list; p; p = p->next) {
		if (p->next != entry) continue;
		p->next = entry->next;
		return;
	}
}

GSList *g_slist_remove_link(GSList *list, GSList *link)
{
	GSList **p;

	for (p = &list; *p; p = &(*p)->next) {
		if (*p != link) continue;
		*p = (*p)->next;
		break;
	}

	return list;
}

The only reason GLib is updating the first GSList * is in the special case that we want to remove the first node. In that case the local variable list is updated to the new head of the list, and that value is returned so the caller can update the head:

head = g_slist_remove_link(head, &a);

If we followed the linux way, the head would be the first node, the head->next would be updated and there’s no need to return anything, and the call becomes simpler:

llist_del(&list, &a);

Also, because we are always updating a p->next pointer, there’s no need for an indirect pointer, and the implementation becomes simpler.

Moreover, by having a separate structure for the head, the compiler can check for mistakes itself (in case the user passes a node instead of a head).

The linux way is much better, but that’s not all.

Linux has macros to work with these lists, and there’s one to traverse it:

#define llist_for_each(pos, node) for ((pos) = (node); pos; (pos) = (pos)->next)

If we use it, the code becomes even more simple:

void llist_del(struct llist_head *list, struct llist_node *entry)
{
	struct llist_node *p;

	llist_for_each(p, (struct llist_node *)list) {
		if (p->next != entry) continue;
		p->next = entry->next;
		return;
	}
}

So the real reason why GLib developers didn’t do this clearly superior approach is simply that they didn’t think of it.

Conclusion

So we went from this:

typedef struct node {
	int value;
	struct node *next;
} node;

void remove_list_entry(node **head, node *entry)
{
	node *prev, *walk;

	prev = NULL;
	walk = *head;

	// Walk the list

	while (walk != entry) {
		prev = walk;
		walk = walk->next;
	}

	// Remove the entry by updating the
	// head or the previous entry

	if (!prev)
		*head = entry->next;
	else
		prev->next = entry->next;
}

To this:

struct llist_head {
	struct llist_node *first;
};

struct llist_node {
	struct llist_node *next;
};

struct node {
	struct llist_node node;
	int value;
};

#define llist_for_each(pos, node) for ((pos) = (node); pos; (pos) = (pos)->next)

void llist_del(struct llist_head *list, struct llist_node *entry)
{
	struct llist_node *p;

	llist_for_each(p, (struct llist_node *)list) {
		if (p->next != entry) continue;
		p->next = entry->next;
		return;
	}
}

Which is superior to what one of the most widely used libraries (GLib) is doing, and we can be relatively certain that if there are any possible improvements, they are very likely few.

By thinking a bit more about data structures we ended up with code that even the best programmers in the world would approve of.

This is good taste.