A linguistic introduction to d3.js

It was the creation/deletion of nodes based on data.

When we first run a d3 program in a blank document, usually we have fewer nodes than data points, so that if we want to map nodes to data points, then we must create some.

If it is not the first time we have run the program in a document (if, for example, our program is hooked up to some ‘live’ data that is regularly updating), and we have fewer data points than before, then it will likely mean that we now have more nodes in the document than we need.

In this case it will be necessary to delete some of the now-redundant nodes that we created previously.

Let’s focus on the specific case in hand, in which we seek to map each data point in our data array onto a <rect> node in the document.

We need some way to represent each of three cases (assuming that the data array may be subject to change over time):

1. A node that we need to create, because a particular data point has not yet been mapped onto a node in the document
2. A node that we need to remove, because its associated data point has been removed, and along with it must go the associated node
3. A node that we need to update, because its associated data point has been updated

After having passed our data array (it always needs to be an array type) to the selection of nodes that we wish to map to it, by calling selection.data(data), d3 has all the information it needs to sort the current node selection into the above three cases.

d3 uses the terms enter, exit and update to refer to these three cases, respectively.

Since a selection is d3’s term for a collection of nodes, we refer to a collection of nodes that fall under case 1 as being the enter selection, a collection of nodes that fall under case 2 as the exit selection, and the collection of nodes that fall under case 3 as the update selection.
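To make the three cases concrete, here is a minimal sketch in plain JavaScript that models the sorting step. It is a toy model, not d3's implementation: the name partitionByIndex and the plain objects standing in for nodes are my own assumptions, and it matches nodes to data points purely by index (which is what d3 does when no key function is supplied).

```javascript
// A toy model of the enter / update / exit split, assuming nodes and data
// are matched purely by index. `partitionByIndex` is a hypothetical helper,
// not part of d3.
function partitionByIndex(nodes, data) {
  const shared = Math.min(nodes.length, data.length);
  return {
    // Data points beyond the last existing node: nodes still to be created.
    enter: data.slice(shared).map((datum) => ({ pseudo: true, __data__: datum })),
    // Existing nodes that still have a data point: they receive the new datum.
    update: nodes.slice(0, shared).map((node, i) => ({ ...node, __data__: data[i] })),
    // Existing nodes left without a data point: candidates for removal.
    exit: nodes.slice(shared),
  };
}

// First run in a blank document: everything lands in "enter".
const firstRun = partitionByIndex([], [4, 8, 15]);
console.log(firstRun.enter.length, firstRun.update.length, firstRun.exit.length); // 3 0 0

// Later run with fewer data points than nodes: one node becomes redundant.
const existing = [
  { tag: "rect", __data__: 4 },
  { tag: "rect", __data__: 8 },
  { tag: "rect", __data__: 15 },
];
const laterRun = partitionByIndex(existing, [16, 23]);
console.log(laterRun.enter.length, laterRun.update.length, laterRun.exit.length); // 0 2 1
```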

Here’s a diagram that I have shamelessly lifted straight from the site of Mike Bostock, the creator of d3:

A ‘binding’ is a special selection that provides methods that allow access to three selections that map to the three cases above.

3. ‘Pseudo-nodes’ (noun)

An astute reader will recognise at this point that the enter selection represents a selection of nodes that are yet to actually exist.

It is worth dwelling on this point for a second, because it gets right to the heart of what people often find difficult to grasp when they are learning d3.

‘A collection of nodes that don’t yet exist’ is a difficult one to get your head around.

If you think of it as a ‘collection of pseudo-nodes’ then that might help.

In any case, it will be necessary at this point for us to update our definition of a selection as ‘a collection of nodes, whether they actually exist in the document or not’.

Users of React or other ‘Virtual DOM’ libraries might find this concept natural: nodes are represented as JavaScript objects before they are actually created.

A selection is similar, in this case.

In contrast, the update and exit selections, by definition, will always refer to nodes that already currently exist in the document.

So let’s review where we are at.

By calling svg.selectAll("rect").data(data) for the first time, we end up with a binding, with references to:

- an enter selection that consists of 10 as-yet-uncreated nodes corresponding to the 10 elements of data. The type of these nodes (i.e. whether they are <rect>s, or whether they are something else like <circle>s) has yet to be specified. This selection is obtained by calling .enter().
- an exit selection that is empty. The document had no <rect> nodes to start with (svg.selectAll("rect") returned an empty selection), so there was no chance of this selection including any redundant <rect> nodes. This selection is obtained by calling .exit().
- an update selection that is empty. Again, since the document had no <rect> nodes to start with, there are none that require updating with new data either. This selection is what is returned to you by default; you don’t need to call any additional methods to select it.
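If you want to check this for yourself, a rough sketch like the following counts the nodes in each of the three selections. The helper name countSelections is hypothetical, and I am assuming that svg is a d3 selection of an <svg> element created elsewhere (for example with d3.select).

```javascript
// Hypothetical helper: counts the nodes in each of the three selections.
// Assumes `svg` is a d3 selection (e.g. d3.select("svg")) and `data` is an array.
function countSelections(svg, data) {
  // The binding doubles as the update selection.
  const binding = svg.selectAll("rect").data(data);
  return {
    enter: binding.enter().size(), // pseudo-nodes waiting to be created
    update: binding.size(),        // existing nodes that received data
    exit: binding.exit().size(),   // existing nodes left without data
  };
}
```

On a blank <svg> and a 10-element array, I would expect this to report 10 entering nodes and none in the other two selections.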

So a binding is simply a selection, consisting of our update nodes, and featuring some special methods that give us access to the enter and exit selections.

Hence, you may come across patterns like svg.selectAll("rect").data(data).enter().append("rect").

This can be interpreted to mean, in plain English: “select any existing <rect> nodes that descend from our <svg> node, and set the __data__ property of the ith <rect> element to the ith data point in data. For any data point that lacks an associated node, create a new <rect> element to represent it”.

Let’s expand on this briefly.

We learned earlier that selection.append serves to both mutate the nodes in a selection (by appending a new node to them) and return the newly appended nodes in the form of a fresh new selection.

In this case, the call to .append("rect") says “for each of the pseudo-nodes in our enter selection, append a <rect>, and return this fresh new selection of <rect>s”.
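Spelled out step by step, that chain might look like the sketch below. The wrapper function appendRects is a hypothetical name of mine, and svg is assumed to be a d3 selection of an <svg> element.

```javascript
// Assumes `svg` is a d3 selection of an <svg> element and `data` is an array.
// `appendRects` is a hypothetical name used only for this sketch.
function appendRects(svg, data) {
  return svg
    .selectAll("rect") // existing <rect> descendants (none on the first run)
    .data(data)        // bind data[i] to the ith node; the rest become pseudo-nodes
    .enter()           // switch to the selection of pseudo-nodes
    .append("rect");   // create one real <rect> per pseudo-node and return them
}
```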

4. Propagate (verb)

In the previous example, the ith <rect> node will end up storing the ith data point as its __data__ property.

This result is not actually obvious, and is worth highlighting explicitly.

We should expand our definition of selection.select as ‘for every node in this selection, select the first descendant node that matches x and propagate data to it’.

As we mentioned before, the .append method is actually a wrapper around .select, and so .append has this property as well.

In the case above, it serves to create a <rect> node for every one of our pseudo-nodes (each of which, if you remember, had __data__ bound to it), propagate the data points along to the newly created ‘real’ nodes in a 1:1 fashion, and then return these newly created nodes in a new selection.

(Note that selection.selectAll does not have this data-propagation property. Perhaps this is because there is no such 1:1 mapping to the newly selected nodes, so it is not easy to tell which data should go where.)
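Here is a toy model of that difference, using plain objects in place of DOM nodes. The functions selectFirst and selectEvery are hypothetical stand-ins that I wrote for illustration; they only look at direct children and are not how d3 is implemented.

```javascript
// Toy model of data propagation, using plain objects in place of DOM nodes.
// `selectFirst` and `selectEvery` are hypothetical stand-ins for the behaviour
// described for selection.select and selection.selectAll; they are not d3 code.
function selectFirst(parents, matches) {
  return parents.map((parent) => {
    const child = parent.children.find(matches);
    // One parent maps to at most one child, so the datum can be copied across.
    if (child && "__data__" in parent) child.__data__ = parent.__data__;
    return child;
  });
}

function selectEvery(parents, matches) {
  // One parent may map to many children, so nothing is copied.
  return parents.flatMap((parent) => parent.children.filter(matches));
}

const makeGroup = (datum) => ({
  tag: "g",
  __data__: datum,
  children: [{ tag: "rect" }, { tag: "rect" }],
});
const isRect = (node) => node.tag === "rect";

console.log(selectFirst([makeGroup(7)], isRect)[0].__data__); // 7
console.log(selectEvery([makeGroup(7)], isRect).some((n) => "__data__" in n)); // false
```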

selection.select has the side-effect of propagating data to the newly selected nodes, so that they have the same __data__ property as the node they were selected from. selection.selectAll does NOT have this property.

Any additional method calls that are chained on the end will operate on these new <rect> nodes.

Maybe you will want to give them a size, a position, or a colour.
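As a sketch of what such chained styling could look like, the function below gives each new <rect> a position, a size and a colour. The name drawBars, the bar width, the spacing and the fill colour are all arbitrary choices of mine, and svg is assumed to be a d3 selection of an <svg> element.

```javascript
// Assumes `svg` is a d3 selection of an <svg> element and `data` is an array
// of numbers. `drawBars`, the bar width and the spacing are hypothetical choices.
function drawBars(svg, data) {
  const barWidth = 20;
  return svg
    .selectAll("rect")
    .data(data)
    .enter()
    .append("rect")
    .attr("x", (d, i) => i * (barWidth + 5)) // position: one bar after another
    .attr("width", barWidth)                 // size: fixed width
    .attr("height", (d) => d)                // size: driven by the bound datum
    .attr("fill", "steelblue");              // colour
}
```

Every .attr call here operates on the freshly appended <rect> nodes only, which is fine as long as the data never changes.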

In our running example, we give them a height that is dependent on their __data__.

If the call to svg.selectAll("rect") had returned some nodes (i.e., not an empty selection), then the subsequent call to .data(data) would have updated some of them with new data.

By this I mean that their __data__ property would have been reassigned to some new value.

But this would be imperceptible to a user unless we perform a data-driven cosmetic transformation on these nodes.

Since the binding is itself a selection of these ‘updated’ nodes, we just need to call some cosmetic methods on it in order to make the new data visible, for example by setting the height again from the new __data__.
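A minimal sketch of such a visible update, assuming the <rect> nodes already exist in the document, might be the following. The name updateBars is hypothetical and svg is again assumed to be a d3 selection of an <svg> element.

```javascript
// Assumes `svg` is a d3 selection whose <rect> descendants already exist.
// `updateBars` is a hypothetical name used only for this sketch.
function updateBars(svg, data) {
  // The binding is the update selection: nodes that were already in the document.
  const binding = svg.selectAll("rect").data(data);
  // Re-apply the data-driven attribute so the new __data__ becomes visible.
  return binding.attr("height", (d) => d);
}
```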

Many of the d3 examples you will see online only make reference to the enter selection.

If your intention is to create a visual representation of some static data (i.e., assuming that the data array will never change), then an enter-only snippet like enterWithStyling.js, which creates and styles the nodes in the enter selection and nothing else, will suffice.

By operating on the update and exit selections as well, you are setting up a dynamic mapping that will be responsive to updates in the data.

There are at least two ways of combining enterWithStyling.js (the enter-only snippet) and visibleUpdate.js (the update-only snippet), i.e. to both create and style the <rect> nodes ‘from scratch’ and to update them when the data changes.

The ‘old’ way is to save a reference to the binding, which is in effect an update selection. From it, we obtain the enter selection (which we use to create <rect> nodes from scratch), and then merge it (think of it like a union of sets, if you are mathematically inclined) back with the update selection.

If we ask ourselves ‘what is the selection referring to’ after the ‘merge’ operation, we can say that it refers to ‘newly created <rect> nodes, along with the nodes that were already there in the first place’.

Now that the selection is referring to this combination of cases, we can style this selection of nodes in one fell swoop using selection.attr.
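In code, the ‘old’ way could be sketched roughly as follows. The function name renderBars is hypothetical, svg is assumed to be a d3 selection of an <svg> element, and I am assuming a d3 release that includes selection.merge.

```javascript
// Assumes `svg` is a d3 selection of an <svg> element and `data` is an array
// of numbers. `renderBars` is a hypothetical name used only for this sketch.
function renderBars(svg, data) {
  const binding = svg.selectAll("rect").data(data); // in effect, the update selection
  return binding
    .enter()
    .append("rect")            // newly created nodes
    .merge(binding)            // plus the nodes that were already there
    .attr("height", (d) => d); // style both groups in one go
}
```

Calling a function like this every time the data changes would create any missing <rect> nodes and restyle the existing ones.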

See here for an example of this occurring (I set up a ‘dynamic’ dataset in order to show the value of doing things this way).

You might have noticed that the code makes no reference to the exit selection.

Indeed, when you look at the example, you can see that over time, the number of <rect> nodes on the page never decreases, even though sometimes it should, in cases when the data array shortens in length.

Let’s fix that by making reference to the exit selection.

Each selection has a .remove method available on it, which removes all of the nodes the selection contains from the document.

(For current purposes, it is hard to see why one would want to remove any nodes other than those in the exit selection.)
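A sketch of the same render step with the exit selection handled might look like this. As before, renderWithExit is a hypothetical name and svg is assumed to be a d3 selection of an <svg> element.

```javascript
// Assumes `svg` is a d3 selection of an <svg> element and `data` is an array
// of numbers. `renderWithExit` is a hypothetical name used only for this sketch.
function renderWithExit(svg, data) {
  const binding = svg.selectAll("rect").data(data);

  // Remove nodes whose data point has disappeared.
  binding.exit().remove();

  // Create missing nodes, then style new and existing nodes together.
  return binding
    .enter()
    .append("rect")
    .merge(binding)
    .attr("height", (d) => d);
}
```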

Let’s see how that looks.

Ok, that’s better.

We can now see the number of <rect> nodes on the page increasing and decreasing in tandem with the length of the data array.

We finally have ourselves a ‘dynamic mapping’.

Newer releases of d3 introduce a new selection method, selection.join, which makes this whole process a lot easier, with less ‘boilerplate’.

You might recognise this method from Code Snippet One.

With it, we can rewrite the previous example by replacing the enter, merge and exit steps with a single call to .join("rect").

selection.join will append the specified element to each of the ‘pseudo-nodes’ in the enter selection, and it will return these new nodes and the updated nodes (if any) in a single selection, so that subsequent lines can be used for styling them both at the same time.

It will also remove any nodes that have been made redundant by the latest data binding.
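As a rough sketch, the join-based version could be as short as the following. The name renderWithJoin is hypothetical, svg is assumed to be a d3 selection of an <svg> element, and the d3 release in use is assumed to be recent enough to include selection.join.

```javascript
// Assumes `svg` is a d3 selection of an <svg> element and `data` is an array
// of numbers. `renderWithJoin` is a hypothetical name used only for this sketch.
function renderWithJoin(svg, data) {
  return svg
    .selectAll("rect")
    .data(data)
    .join("rect")              // append entering nodes, keep updating ones, remove exiting ones
    .attr("height", (d) => d); // style entering and updating nodes together
}
```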

Wrapping up

If you’ve made it this far, it means you have all the mental models and terminology necessary to understand Code Snippet One.

The only thing I haven’t covered yet is the yScale that features at the top of the code snippet, but I’m assuming you already know what a scale is.

In d3, creating a scale requires writing the following ‘skeleton’: generator().domain().range()

You end up with a function that maps values in your dataset to an output range that usually refers to SVG coordinates.
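To show the shape of that skeleton without pulling in d3, here is a hand-rolled stand-in for a linear scale. The linearScale function, the dataset and the chart height are all my own assumptions; in d3 itself you would reach for a built-in generator such as d3.scaleLinear, whose domain and range methods also act as getters when called with no arguments (this sketch only models the setters).

```javascript
// Hand-rolled stand-in for a linear scale, only to show the
// generator().domain().range() shape. Not d3's implementation.
function linearScale() {
  let domain = [0, 1];
  let range = [0, 1];
  // The returned function maps a value from the domain onto the range.
  const scale = (value) => {
    const t = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
  // Chainable setters, mirroring the skeleton above.
  scale.domain = (d) => { domain = d; return scale; };
  scale.range = (r) => { range = r; return scale; };
  return scale;
}

const data = [10, 25, 50]; // hypothetical dataset
const chartHeight = 200;   // hypothetical SVG height in pixels
const yScale = linearScale()
  .domain([0, Math.max(...data)])
  .range([0, chartHeight]);

console.log(yScale(25)); // 100
```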

It is worth highlighting that in SVG coordinates, the y value increases as you move down the page.
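One consequence, sketched in code: if a bar is meant to stand on the bottom edge of the chart, the top edge of its <rect> has to sit at the chart height minus the bar's height. The helper barGeometry and the numbers used are hypothetical.

```javascript
// Hypothetical helper: given a bar's height in pixels and the chart height,
// work out where the top edge of the <rect> must go so that the bar rests
// on the bottom edge of the chart.
function barGeometry(barHeight, chartHeight) {
  return { y: chartHeight - barHeight, height: barHeight };
}

console.log(barGeometry(150, 200)); // { y: 50, height: 150 }
```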

The diagram below should help explain why we need to set the y coordinate of each <rect> to height - yScale(d) (the y coordinate refers to the position of the top edge of the <rect> node).

It should go without saying that d3 also provides a myriad of ‘helper’ functions like these, which make the construction of all sorts of different chart types a whole lot easier.

Have fun exploring them all!

This should be enough for you to get going.

Your next ‘level up’ will involve coming to a better mental model of what a selection actually is.

I still need to introduce you to another noun, called a group.

This will help you come to a more sophisticated understanding of the data-binding process.

If you can’t wait to find out about this, check out this article.

If this tutorial was helpful to you, please do let me know below in the comments, and I will write another article that explains selections in more depth, and perhaps plug up any leaky abstractions.

Also, I would be interested to know what you think about this way of structuring a ‘how-to’ article in terms of nouns and verbs.

Was it helpful?

’Till next time.
