We have now two implementations, one good enough for a standalone project, another one good enough for a library. But we are not done.
Can we do better than GLib?
I’ve often argued that the gold standard for development is the Linux project (by “Linux” I’m always referring to the kernel, because that’s the name of the project). Every time I’ve seen a debate whether of development practices, code-style, tooling, you name it, Linux is doing it right. And never have I seen a case where another project did better.
Of course this is subjective and a matter of taste, but taste is precisely what we are exploring, so how is Linux approaching this?
The Linux way
struct llist_head {
struct llist_node *first;
};
struct llist_node {
struct llist_node *next;
};
Well, that’s certainly different.
For starters it’s two structures rather than one (I’ll explain why later), but more importantly
there’s no data
pointer. Where are we supposed to put our data?
It turns out that Linux developers know C, and I mean really know C.
What we can do is embed a structure inside another structure, for example:
struct base {
int id;
char *name;
};
struct special {
void *extra;
struct base base;
};
We can access the members of struct base
like x.base.name
. Great, so all the information of one
structure is contained in another structure, so we can do:
struct node {
int value;
struct llist_node node;
};
To access the next element of our custom node, we do x.node.next
. Problem solved.
Not so fast. The next element is a struct llist_node *
, not a struct node *
, so how can we
access the value of the next element? This is where it gets tricky.
If the address of x.node
is 0x1008
and node
is at offset 8 in the struct, then x
is at
0x1000
. But how can we calculate that programmatically? With offsetof.
offsetof(struct node, node)
gives us 8, so &x.node - offsetof(struct node, node) == &x
. Do we
really have to do this calculation? No, Linux has a convenient macro:
#define container_of(ptr, type, member) ((type *)((void *)(ptr) - offsetof(type, member)))
So we can do container_of(&x.node, struct node, node)
to get our custom node data. Linux has a
llist_entry
macro to help us, but that’s basically the same thing as container_of
, and since we
know the type of our structure and its member, we can create our own macro:
#define node_entry(ptr) container_of(ptr, struct node, node)
So we simply do node_entry(&x.node)
. Given a generic struct llist_node
pointer p
, we can
fetch its corresponding custom value with node_entry(p)->value
.
Didn’t I tell you they know C? Here is a blog post from Greg Kroah-Hartman explaining the macro.
But we are not done yet. If we rearrange our structure to have the struct llist_node
on top like
this:
struct node {
struct llist_node node;
int value;
};
Then offsetof(struct node, node)
will always be zero, therefore the container_of
macro is
essentially doing nothing:
#define container_of(ptr, type, member) ((type *)((void *)(ptr) - offsetof(type, member)))
#define container_of(ptr, type, member) ((type *)(ptr))
This is just a cast, so a struct node *
can be treated like a struct llist_node *
? Yes.
If we have these structures:
struct base {
int id;
char *name;
};
struct special {
int id;
char *name;
void *extra;
};
As long as the extra elements are stored after the base ones, variables of type struct special
can be considered struct base
. So if we have a variable struct special *x
, these two statements
are equivalent:
x->name
((struct base *)x)->name
Why? Because in struct base
the name
member is at offset 8, and in struct special
it’s also at
offset 8.
This doesn’t change if we completely embed one structure inside the other:
struct special {
struct base base;
void *extra;
};
Before the member extra
was at offset 16, and since sizeof(struct base)
is 16, it still is.
So we can embed the list node structure inside a custom structure, add any elements we want, and it will still behave as if it was a list node:
struct node {
struct llist_node node;
int value;
};
So, instead of this:
struct node {
int value;
struct llist_node node;
};
#define container_of(ptr, type, member) ((type *)((void *)(ptr) - offsetof(type, member)))
#define node_entry(ptr) container_of(ptr, struct node, node)
node_entry(p)->value
We can do this:
struct node {
struct llist_node node;
int value;
};
((struct node *)p)->value
That’s a lot of complexity just to declare a node, and we haven’t even created one. How is this “better”?
Well, in GLib this is how we created a node:
node_data a = { .value = 0 };
GSList link_a = { .data = &a };
Note: in the real world we would be allocating this memory, so instead of two initializations we
would have two malloc()
s and therefore subsequently we would need to do two free()
s.
How do you create a node in the Linux version?
struct node a = { .value = 0 };
By pushing all the complexity into the library, the users of that library become really simple. The best programmers in the planet have spent countless hours improving these libraries, it would be really hard to find any possible area of improvement, this is quite simply: state of the art.
Now it’s time to actually do something with these data.
Delete
In Linux there’s multiple implementations of linked lists, and the closest one to our needs is the lock-less single linked list. We are going to ignore the lock-less part part, and focus the meat of the code.
This is what the code for deleting an element looks like:
Actually, there is no code for that.
Because Linux cares a lot about performance, if you want to delete a single element inside a list, a single linked list is simply not the right structure for it: you want a doubly linked list. By not providing an implementation for this, Linux is forcing you to pick the right data structure.
But let’s pretend that deleting a single element inside a single linked list does make sense, what would that code look like?
This is my best initial attempt:
void llist_del(struct llist_head *list, struct llist_node *entry)
{
struct llist_node *p;
for (p = list->first; p; p = p->next) {
if (p->next != entry) continue;
p->next = entry->next;
return;
}
}
This code looks different than the implementations in part 1 and 2 because now we have two structures, is that even necessary?
Before making any judgements, let’s compare it with the best version so far:
GSList *g_slist_remove_link(GSList *list, GSList *link)
{
GSList **p;
for (p = &list; *p; p = &(*p)->next) {
if (*p != link) continue;
*p = (*p)->next;
break;
}
return list;
}
One obvious difference: one version returns a GSList *
the other one doesn’t return anything.
That’s because one version is supposed to be called like:
head = g_slist_remove_link(head, &b);
While the other:
llist_del(&list, &b);
At this point it should be obvious that we are forgetting something.
Tests
OK, we already explored how to declare nodes, so:
struct node a = { .value = 0 };
struct node b = { .value = 1 };
struct node c = { .value = 2 };
To link the nodes:
a.node.next = &b.node;
b.node.next = &c.node;
So far nothing particularly special, but we haven’t done anything regarding that other structure
struct llist_head
:
struct llist_head list = { .first = &a.node };
Our list is in fact a data structure that contains a pointer to the first node in the list.
OK, but why? Couldn’t we just have a pointer to the first struct llist_node
node?
Let’s go back to the code:
void llist_del(struct llist_head *list, struct llist_node *entry)
{
struct llist_node *p;
for (p = list->first; p; p = p->next) {
if (p->next != entry) continue;
p->next = entry->next;
return;
}
}
What happens if we try to remove the first element in the list?
p
is going to equal list->first
, which is going to equal entry
, but is not comparing p
to
entry
, it’s comparing p->next
.
Let’s see the call:
llist_del(&list, &a.node);
OK, so we are going to traverse list
until the next
pointer of p
is equal to &a.node
, which
can never happen because we are starting at &a.node
and the next
point of &a.node
can never be
itself.
And right now we are only updating p->next
, which is inside a struct node *
. Where are we updating the
first
pointer of struct head *
?
It turns out that we need a special check for the special case where we want to remove the first element:
struct llist_node *p;
if (entry == list->first) {
list->first = entry->next;
return;
}
for (p = list->first; p; p = p->next) {
if (p->next != entry) continue;
p->next = entry->next;
return;
}
This code updates the first
member of the list, but only when entry
is indeed the first element.
We finally have a version that works, and to test it:
llist_del(&list, &b.node);
if (a.node.next != &c.node) return false;
llist_del(&list, &a.node);
if (list.first != &c.node) return false;
llist_del(&list, &a.node);
if (list.first != &c.node) return false;
But wait, this is a bit cumbersome. We have a special case in llist_del()
, but Linus Torvalds told
us in his TED talk that it’s better to avoid special cases. Can we do that?
To avoid the special case we need this code:
for (p = list->first; p; p = p->next) {
if (p->next != entry) continue;
p->next = entry->next;
return;
}
To somehow do this:
if (entry == list->first) {
list->first = entry->next;
return;
}
Well, these two lines seem similar:
list->first = entry->next;
p->next = entry->next;
All we need is list->first
to somehow be equivalent to p->next
, but not only are the members
different, list
is a struct head *
and p
is a struct node *
, so game over.
Not quite. We already explained that in C one structure can pretend to be another structure, so a
struct llist_head *
can pretend to be a struct llist_node *
, as long as the first members of the
structures are equivalent, but are they?
struct llist_head {
struct llist_node *first;
};
struct llist_node {
struct llist_node *next;
};
Actually, the first member of both structures is a struct llist_node *
, the only difference is the
name.
So in theory this is true:
&p->next == &list->first
So all we need to do is pretend that (struct llist_node *)p == (struct llist_head *)list
.
Then this:
if (entry == list->first) {
list->first = entry->next;
return;
}
Becomes this:
p = (struct llist_node *)list;
if (entry == p->next) {
p->next = entry->next;
return;
}
And our for loop is basically doing the same:
for (p = list->first; p; p = p->next) {
if (p->next != entry) continue;
p->next = entry->next;
return;
}
Except p
is starting from list->first
, instead of list
, so we just adjust the starting
position:
void llist_del(struct llist_head *list, struct llist_node *entry)
{
struct llist_node *p;
for (p = (struct llist_node *)list; p; p = p->next) {
if (p->next != entry) continue;
p->next = entry->next;
return;
}
}
Believe it or not, this works perfectly.
If we want to delete the first element list->first == entry
, therefore ((struct llist_node
*)list)->next == entry
. And if we are updating p->next
only when p->next == entry
, then we are
updating p->next
in this case.
Another way to think about this is that struct llist_head
is a fake first node in the list, a node
that can never be removed and points to the actual first node.
Good bye special case.
This works great, but we were supposed to be comparing the Linux version versus the GLib version, let’s go there.
GLib isn’t that great
OK, so our objective is to compare these versions:
void llist_del(struct llist_head *list, struct llist_node *entry)
{
struct llist_node *p;
for (p = (struct llist_node *)list; p; p = p->next) {
if (p->next != entry) continue;
p->next = entry->next;
return;
}
}
GSList *g_slist_remove_link(GSList *list, GSList *link)
{
GSList **p;
for (p = &list; *p; p = &(*p)->next) {
if (*p != link) continue;
*p = (*p)->next;
break;
}
return list;
}
Surely GLib is returning a GSList *
for a reason, isn’t it?
Actually, no. The only reason GLib is updating the first GSList *
is in the special case that
we want to remove the first node.
If we followed the Linux way the head would be the first node, the head->next
would be updated and
there’s no need to return anything. And because we are always updating a p->next
pointer, there’s
no need for an indirect pointer and the code becomes simpler.
Additionally, by having a separate structure for the head the compiler can check for mistakes itself, in case the user passes a node instead of a head.
The Linux way is much better, but that’s not all.
Linux has macros to work with these lists, and there’s one to traverse it:
#define llist_for_each(pos, node) for ((pos) = (node); pos; (pos) = (pos)->next)
If we use it, the code becomes even simpler:
void llist_del(struct llist_head *list, struct llist_node *entry)
{
struct llist_node *p;
llist_for_each(p, (struct llist_node *)list) {
if (p->next != entry) continue;
p->next = entry->next;
return;
}
}
The only difference in the behavior with GLib’s version is that if we pass a NULL
entry, the code
crashes, but that’s actually good. Code shouldn’t be making stupid unnecessary calls, and in the
kernel they don’t want this to ever happen.
Conclusion
So we went from this:
typedef struct node {
int value;
struct node *next;
} node;
void remove_list_entry(node **head, node *entry)
{
node *prev, *walk;
prev = NULL;
walk = *head;
// Walk the list
while (walk != entry) {
prev = walk;
walk = walk->next;
}
// Remove the entry by updating the
// head or the previous entry
if (!prev)
*head = entry->next;
else
prev->next = entry->next;
}
To this:
struct llist_head {
struct llist_node *first;
};
struct llist_node {
struct llist_node *next;
};
struct node {
struct llist_node node;
int value;
};
#define llist_for_each(pos, node) for ((pos) = (node); pos; (pos) = (pos)->next)
void llist_del(struct llist_head *list, struct llist_node *entry)
{
struct llist_node *p;
llist_for_each(p, (struct llist_node *)list) {
if (p->next != entry) continue;
p->next = entry->next;
return;
}
}
Which is superior to what one of the most widely used libraries (GLib) is doing, and we can be relatively certain that if there any possible improvements, they are very likely few.
By thinking a bit more about data structures we ended up with code that even the best programmers in the world would approve of.
This is good taste.