Using HyperLogLog with Leaflet-Hexbins
Leaflet-Hexbins is a powerful tool for aggregating data and visualizing it quickly on a map. It can size the hexbin radius based on a feature value. But what if we used HyperLogLog data to preserve user privacy? This post shows a convenient way to perform on-the-fly HyperLogLog unions for leaflet-hexbins.
No more backend-constraint for HyperLogLog
There are many HyperLogLog implementations out there, but few work as well as postgres-hll and js-hll. They can even talk to each other by exchanging hexstrings! postgres-hll is pretty much the standard extension when dealing with HyperLogLog, but it is obviously bound to a postgres database running somewhere. With js-hll, the backend constraint for HyperLogLog is gone, so users can perform HyperLogLog operations right in their frontend!
Leaflet-hexbins - classic
Take a look at these jsfiddles from a former blog post that make use of the standard implementation of hexbins.
A fixed-radius example with default settings looks like this. Find the jsfiddle here.
The radius is set to a fixed size, so the hexbins cover the whole area. The color scale, in contrast, is based on the number of points in a bin. Now, we can modify the hexbin by assigning a particular value to the radius. If, for example, a bin contains 3 points
and we would like to set the radius to the total number of posts (40 + 5 + 55 = 100), we would pass a reduce function summing up the values.
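As a minimal sketch of such a summing reducer, assume each entry of a bin wraps the original record in an `o` property (the shape leaflet-d3 is assumed to use here) and that the record has a `posts` field. The `sumField` helper and the sample bin are hypothetical and only illustrate the idea:

```javascript
// Hypothetical bin with 3 points; `o` holds the original record (assumed shape).
const bin = [
  { o: { posts: 40 } },
  { o: { posts: 5 } },
  { o: { posts: 55 } },
];

// Sum one numeric field over all points that fell into the bin.
function sumField(entries, field) {
  return entries.reduce((total, entry) => total + entry.o[field], 0);
}

const radiusValue = sumField(bin, "posts");
console.log(radiusValue); // 100
```

A function like `(d) => sumField(d, "posts")` could then be handed to the hexlayer as its radius value.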
Altogether it looks like this for some sample data. Find the jsfiddle here.
The crucial part of the function looks like this. Note that one can pass values not only for the radius but also for the color scale and hence switch around between different visualizations if needed. In this example, the radius is set to the posts sum and the color (range) to the users sum.
So far so good. Now what’s the fuss about HyperLogLog?
Let’s assume our sample data from above was derived from a postgres-hll database, i.e. we performed some union operation in our database to group posts by geohashes or coordinates. We then send the cardinalities (= distinct counts) straight to our frontend.
The issue with HyperLogLog cardinalities
If we simply summed up user cardinalities from coordinate A and coordinate B
we would count a user twice if this user posted something at both locations!
If we performed simple cardinality additions, the unit would change from [distinct users/hexbin] to [distinct location users/hexbin]. For our example of a hexbin aggregating data from two locations, A and B, where one user posted something at both, this would mean 1 distinct user but 1+1 = 2 distinct location users. If you’d like a nice visual explanation, watch this video.
As we are interested in distinct users in a hexbin, we must count such a user only once. But this is not possible with a plain JavaScript reduce function that merely sums the cardinalities.
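The double counting can be reproduced with exact sets. In this sketch, plain `Set` objects stand in for HyperLogLog sketches and the user ID is made up. It only illustrates why summing and unioning differ:

```javascript
// One and the same (hypothetical) user posted at location A and at location B.
const usersAtA = new Set(["user-1"]);
const usersAtB = new Set(["user-1"]);

// Adding cardinalities counts "distinct location users".
const summed = usersAtA.size + usersAtB.size;

// Unioning first, then counting, yields distinct users per hexbin.
const united = new Set([...usersAtA, ...usersAtB]).size;

console.log(summed); // 2
console.log(united); // 1
```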
Js-hll HyperLogLog unions
Luckily, there is js-hll to solve this problem. As postgres-hll and js-hll work well with each other by exchanging hexstrings, we can pass the respective hexstrings along with the cardinalities and perform the union right in the frontend, on the spot!
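To see why a union does not double count, it helps to look at what merging two sketches does conceptually. For two dense sketches with the same parameters, the union is the register-wise maximum. The following is a toy illustration with made-up register values. It is not js-hll's API, which works on its own objects built from the hexstrings:

```javascript
// Toy stand-in for a dense HyperLogLog sketch: a plain array of register values.
function unionRegisters(a, b) {
  if (a.length !== b.length) {
    throw new Error("sketches must have the same number of registers");
  }
  // Keep the larger value of each register pair.
  return a.map((value, i) => Math.max(value, b[i]));
}

// Merge any number of sketches, e.g. all points that fell into one hexbin.
function unionAll(sketches) {
  return sketches.reduce(unionRegisters);
}

const sketchA = [0, 3, 1, 0]; // hypothetical registers for location A
const sketchB = [2, 1, 1, 4]; // hypothetical registers for location B

const merged = unionAll([sketchA, sketchB]);
console.log(merged); // [ 2, 3, 1, 4 ]
```

Because the maximum is idempotent, merging a sketch with itself changes nothing, which is exactly the property a plain sum of cardinalities lacks.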
Leaflet-Hexbins with HyperLogLog
The whole trick is to make leaflet-hexbins union the hllSets on-the-fly. For every zoom level, this will happen automatically. All we need to do is to pass the right function.
Based on the above sample data, let’s provide the hexstrings in our input array. Theoretically, we could drop the cardinalities entirely, since we can estimate them from the hexstrings at any time, but I prefer to keep them anyway.
Let’s call the hexlayer with the hexbin radius set to the number of points in a bin and the color set to the number of distinct users.
And that’s it! Tweak around with the different visualization options, colors and radius limits as you please, add a tooltip with information as needed and a custom legend.
Creating a custom leaflet legend
If you want to add custom tooltips displaying the HyperLogLog cardinalities, use the same function as in the main code and add a hover handler (after
After polishing everything your map could look as neat as this:
Stay tuned for more posts! If you got any questions, feel free to contact me via mail. I’m happy to get any kind of feedback! 🦘