Graphing IP Geolocation Data in R — 2021

dylan hudson
4 min read · Sep 24, 2021

This article has two sections: Retrieving IP geolocation data from within the R environment, and then graphing the results on a world map. The latter can be generalized to any latitude/longitude data, so if you’re just looking for info on world-map graphing, skip to Section 2.

Section 1: Geolocating IP Addresses from R
If you have a dataset with a list of IP addresses, you can use the ip_api() function from the rgeolocate package to make the API calls from inside R and return the results as a data frame. I prefer this over the flat-file MaxMind library option.

First, we need to install and load the rgeolocate package:
install.packages("rgeolocate")
library(rgeolocate)

You can now call ip_api() on IP addresses contained in a vector or column from a data frame. In this example, I have a data frame of IP addresses, and a corresponding ‘count’ column, like you might see if you were collecting information about website traffic.

Screenshot of dataframe with count and IP columns.

So we'll pass in the IP column and save the results in a data frame:
geo_results <- ip_api(ip_data$IP, as_data_frame = TRUE, delay = FALSE)

The resulting data frame should look something like this:

Screenshot of dataframe returned by the ip_api function.
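For reference, here is a minimal sketch of the lookup step from start to finish. The sample IPs and counts are made up for illustration, and the call needs both the rgeolocate package and internet access, so the sketch skips the lookup if either is missing. Setting delay = TRUE is an optional choice here, meant to pause between requests when you have a long list.

```r
# Hypothetical sample data standing in for the article's ip_data frame.
ip_data <- data.frame(
  IP = c("8.8.8.8", "1.1.1.1", "9.9.9.9"),
  count = c(12, 7, 3),
  stringsAsFactors = FALSE
)

geo_results <- NULL
if (requireNamespace("rgeolocate", quietly = TRUE)) {
  # The lookup goes over the network, so fail gracefully if it is unavailable.
  geo_results <- tryCatch(
    rgeolocate::ip_api(ip_data$IP, as_data_frame = TRUE, delay = TRUE),
    error = function(e) {
      message("Lookup failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Inspect the column names before relying on them in later steps.
if (!is.null(geo_results)) print(names(geo_results))
```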

The results will be in the same order as the list you passed in, so to set ourselves up for easy graphing, I like to combine the relevant data into a single data frame. We can do this by creating new columns in ip_data and populating them with the desired columns from geo_results. In this case, we'll need the latitude and longitude columns to graph the locations.
ip_data$latitude <- geo_results$Latitude
ip_data$longitude <- geo_results$Longitude
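Because this copy relies on both data frames lining up row by row, it can be worth checking that before combining. The sketch below uses made-up stand-ins for ip_data and geo_results (the values and the missing third lookup are invented, and the column names simply follow the ones used above), then drops any rows that came back without coordinates.

```r
# Hypothetical stand-ins for the article's objects; all values are made up.
ip_data <- data.frame(
  IP = c("192.0.2.1", "198.51.100.7", "203.0.113.9"),
  count = c(12, 7, 3),
  stringsAsFactors = FALSE
)
geo_results <- data.frame(
  Latitude = c("37.4", "-33.9", NA),
  Longitude = c("-122.1", "151.2", NA),
  stringsAsFactors = FALSE
)

# The row-by-row copy only makes sense if both frames have the same length.
stopifnot(nrow(ip_data) == nrow(geo_results))

ip_data$latitude <- geo_results$Latitude
ip_data$longitude <- geo_results$Longitude

# Drop rows whose lookup returned no coordinates.
ip_data <- ip_data[complete.cases(ip_data[, c("latitude", "longitude")]), ]
print(ip_data)
```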

Now we’re in good shape to start graphing the information on a map.

Section 2: Graphing Coordinates on a World Map

There are many ways to visualize map data in R. We’re just going to look at an easy way to get a visualization quickly, or as rdocumentation.org says, “a good place to start if you need some crude reference lines.” For publishing or presenting, you’ll want to eventually learn ggmap, but the simpler method is often handy when you’re in the middle of analyzing data and trying to develop your insights, or maybe just creating a quick graphic for your coworkers.

We'll need the ggplot2 package; install it if you don't already have it:
install.packages("ggplot2")
library(ggplot2)
We'll also need some map data:
install.packages("maps")
Then create the map layer (swap the formatting parameters as desired):
world_map <- borders("world", colour = "gray", fill = "gray")

We can use ggplot to plot the map and check it out:
map_plot <- ggplot() + world_map
map_plot
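As an illustration of swapping the formatting parameters, here is one possible variation. The colours, coord_quickmap(), and theme_minimal() are my own arbitrary choices rather than anything required, and the sketch assumes ggplot2 and maps are both installed.

```r
library(ggplot2)  # borders() also needs the maps package to be installed

# Light fill with white country outlines; the colours are arbitrary.
world_map <- borders("world", colour = "white", fill = "gray80")

map_plot <- ggplot() +
  world_map +
  coord_quickmap() +  # sets a map-like aspect ratio so the plot is not stretched
  theme_minimal()

map_plot
```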

map of the world

Now we’re ready to overlay our data. Since these are discrete locations, geom_point() is the best choice here, but you could certainly use other functions like geom_line() to show things like a flight path or itinerary. Since we have a scalar for each IP, i.e. the ‘count’ column, we can make the graph’s point size proportional to the value of each row’s count- bigger dots for larger amounts. Let’s add the points to our map_plot object, just like we would on a typical Cartesian plot. Note that longitude is represented on the X-axis, and latitude on the Y.
map_plot <- map_plot + geom_point(data = ip_data, aes(x = longitude, y = latitude), color = "red", size = ip_data$count)
Used as-is, though, the counts make the points far too large for this map scale. To compress the range, we're going to divide each 'count' value by 3, which keeps the sizes proportional while shrinking them overall. Use this call in place of the previous one, so the points aren't drawn twice:
map_plot <- map_plot + geom_point(data = ip_data, aes(x = longitude, y = latitude), color = "red", size = ip_data$count / 3)
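A different way to get sensibly sized points, instead of dividing by a hand-picked number, is to map count to size inside aes() and let a size scale handle the range. This is a swapped-in alternative, not the method used above. The sketch below uses made-up coordinates and counts, and max_size = 6 is just a starting value to tune.

```r
library(ggplot2)  # borders() also needs the maps package to be installed

# Hypothetical sample data; coordinates and counts are made up.
ip_data <- data.frame(
  longitude = c(-122.1, 151.2, 2.35),
  latitude = c(37.4, -33.9, 48.86),
  count = c(12, 7, 3)
)

map_plot <- ggplot() +
  borders("world", colour = "gray", fill = "gray") +
  geom_point(
    data = ip_data,
    aes(x = longitude, y = latitude, size = count),  # size is mapped, not fixed
    color = "red", alpha = 0.7
  ) +
  scale_size_area(max_size = 6, name = "Count")  # point area scales with count

map_plot
```

A side benefit of this approach is that ggplot2 draws a size legend for you, so readers can tell what the dot sizes mean.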

Note: if your coordinate data is causing an "Error: Discrete value supplied to continuous scale" message, you just need to convert the latitude and longitude columns to numeric data, e.g.
ip_data$latitude <- as.numeric(ip_data$latitude)
ip_data$longitude <- as.numeric(ip_data$longitude)
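Here is a small sketch of that conversion on made-up data, including a check for values that could not be parsed. Wrapping the column in as.character() first is an extra precaution on my part: it also covers factor columns, where as.numeric() alone would return the internal level codes rather than the coordinates.

```r
# Hypothetical data with coordinates stored as text; "n/a" marks a failed lookup.
ip_data <- data.frame(
  latitude = c("37.4", "-33.9", "n/a"),
  longitude = c("-122.1", "151.2", "n/a"),
  stringsAsFactors = FALSE
)

# Unparseable strings become NA (normally with a warning, silenced here).
ip_data$latitude <- suppressWarnings(as.numeric(as.character(ip_data$latitude)))
ip_data$longitude <- suppressWarnings(as.numeric(as.character(ip_data$longitude)))

str(ip_data)

# Count how many rows lost their coordinates in the conversion.
n_missing <- sum(is.na(ip_data$latitude) | is.na(ip_data$longitude))
print(n_missing)
```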

World map with IP geolocation data in red points

Great! Much easier to get a sense of where website visitors are, or threat actors, or potential customers, etc…
Always title and label your graphics (even quick ones) to avoid wasting time with later confusion or misunderstandings; we can add those to the plot, just as you would with other ggplot2 graphics:
map_plot <- map_plot + labs(title = "Traffic Map", x = "East-West", y = "North-South")
map_plot

World map with IP data points, titled IP Map, x-axis labeled "East-West", and y-axis labeled "North-South"

Hopefully this was a helpful intro to a few practical techniques for working with geolocation data in R. If you are having trouble debugging or found errors in the article, please leave a comment, and I’ll take a look. Thanks for reading!
