Displaying Vertical Data Point Lines in ggplot/R
Sometimes a vertical line makes for a better visual than a dot, e.g. when you’re trying to clearly demarcate intervals in plotted discrete data but don’t need error bars or axis ticks (or anything else that would overcomplicate things). It’s probably not something you need often (and if you do, there’s probably a better way to do it), but it can be very handy in certain situations.
First make sure your packages are up to date by running update.packages(). If you don’t already have the R devtools package installed, you’ll need to install it first with install.packages("devtools"), as we use a library currently served from GitHub for this.
Then we’ll install the ‘ungeviz’ library, a collection of great functions designed to “help you visualize uncertainty”. They’re also helpful and easy to use in many other scenarios. To do this, run devtools::install_github("wilkelab/ungeviz").
Now we have access to all those functions through the ungeviz:: namespace prefix. If you don’t want to use the scoping operator, just attach the library to your session with library(ungeviz), but watch for masking issues/overlapping object names.
So if we want to add vertical lines to our plot, we can chain geom_vpline() onto our ggplot() call like this:
+ ungeviz::geom_vpline(data=marks, aes(x=time, y=distance), color="blue", size=0.5)
The ‘size’ parameter controls the thickness of each line; generally 0.5 is a good starting point. Use ‘height’ to control how tall each line is, and the standard ggplot color options with ‘color’. If you’re doing a simple overlay on existing data, keep parameters like color and size out of the aes() function and put them after it, as shown above. Putting them in aes() will cause a legend to display; you can remove it, but there’s no reason to bother in the first place unless you’re doing more complex things.
We’ll just use some arbitrary test data of discrete values over a linear time series:
ggplot(vert_sample) + geom_point(aes(x=time, y=value)) + labs(title="Vertical Lines Test Graph")
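The vert_sample data itself isn’t shown here, so if you want to follow along, something like the sketch below should do as a stand-in. The column names (time, value) match the plot call above; the 0 to 60 second range, the 1 to 10 value range and the seed are all assumptions made up for illustration.

```r
# Hypothetical stand-in for vert_sample (the article's real data is not shown).
set.seed(1)  # make the random values reproducible

vert_sample <- data.frame(
  time  = 0:60,                              # one reading per second (assumed)
  value = sample(1:10, 61, replace = TRUE)   # discrete values from 1 to 10 (assumed)
)

head(vert_sample)  # quick look at the first few rows
```

Any data frame with a numeric time column and a discrete value column should work just as well.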
Now, to bring in the vertical marks, we’ll need to either add another column or slice some data into a new data frame. Say we want to more easily see where each group of ten seconds is demarcated. To do this, we’ll need a column containing the ‘value’ at only the ten-second intervals; there are quite a few ways to do this. One would be to create and autofill a column of all ten-second intervals within the time range, i.e. start at 0 and increment by ten. Then source the values from the original data at each of those increments.
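As a rough sketch of that first approach in base R: build the ten-second ticks with seq(), then look up the matching values with match(). The vert_sample below is a made-up stand-in, and the names ticks and marks are just illustrative choices (marks lines up with the data=marks argument used earlier).

```r
# Made-up stand-in data: one reading per second, values cycling from 1 to 10.
vert_sample <- data.frame(time = 0:60, value = ((0:60) %/% 3) %% 10 + 1)

# Every ten-second tick inside the observed time range: 0, 10, ..., 60.
ticks <- seq(0, max(vert_sample$time), by = 10)

# Look up the value recorded at each tick and store it in a separate data frame.
marks <- data.frame(
  time = ticks,
  mark = vert_sample$value[match(ticks, vert_sample$time)]
)
```

If a tick has no matching row in the original data, match() returns NA for it, so that mark ends up as NA rather than raising an error.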
You could also simply add a new column that contains the values only in the relevant rows, leaving every other row empty (NA).
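A minimal sketch of the same-data-frame approach, assuming time is recorded in whole seconds so the modulo test lands exactly on the ten-second rows. The vert_sample here is again a made-up stand-in; the column name mark matches the plot calls that follow.

```r
# Made-up stand-in data: one reading per second, values cycling from 1 to 10.
vert_sample <- data.frame(time = 0:60, value = ((0:60) %/% 3) %% 10 + 1)

# Copy the value into 'mark' only where time is a multiple of ten; NA elsewhere.
vert_sample$mark <- ifelse(vert_sample$time %% 10 == 0, vert_sample$value, NA)
```

ggplot2 geoms typically drop rows with missing values (with a warning about removed rows), so the NA rows simply shouldn’t draw a line.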
With the data ready to go, we can add the vertical lines; if you used a separate data frame, don’t forget to pass it to the geom_vpline function: geom_vpline(data=marks, aes(x=time, y=mark))
If you used the same data frame, you can omit the ‘data’ parameter and just set the aes() values to the relevant column names (the x axis will be the same as your points, and y will be the mark values).
Let’s generate the final graph:
ggplot(vert_sample) + geom_point(aes(x=time, y=value)) + geom_vpline(aes(x=time, y=mark), size=1, color="darkred", height=1) + labs(title="Vertical Lines Test Graph")
Great! Now our eyes can easily tell where the ten-second lines are without having to look away from the data.