Light-Speed Analytics with Redis Bitmaps and Lua

Here at Belly, we collect a lot of data (about 3 GB per day), and for our in-house data analysis we typically run large MapReduce jobs in Hadoop that can take several minutes or more. However, when developing client-facing applications that use similar data, speed becomes a major factor.

In one of our recent projects, we were tasked with developing a dashboard that serves up analytics for each merchant location in our network.

When spec’ing out the project, the following requirements needed to be met:

  • Display all metrics (such as the number of unique customers at a given location) as a total over a date range, as well as by day and by month.
  • Be able to view these metrics over an ad-hoc date span, i.e. not pre-set time spans such as “Past Week” or “Past 30 Days”, but rather “October 28, 2013 – November 17, 2013”.
  • Should be able to scale to tens of thousands of unique customers per location.
  • This was going to be a client-facing application, so speed mattered.

Ad-hoc date spans meant that the more traditional avenue of batch jobs and pre-aggregated results was out of the question – every query needed to be performed ‘on the fly’. Additionally, performing complex ad-hoc MySQL queries proved to be quite expensive, and even at our current scale, querying for a single metric would sometimes take several seconds.

Enter Redis

Redis 2.6 introduced two additional and quite powerful features: bitmap operations (BITCOUNT and BITOP) and Lua scripting.

Bitmaps

A bitmap is essentially an array of bits (zeros and ones). In Redis, you can SET and GET bits at a given index on a key using the SETBIT / GETBIT commands. For example, in our case, if customers with id = 2 and id = 6 checked in to the merchant with id = 5 on November 12th, 2013, then we would issue the following commands:

Redis.setbit('business:5:customers:2013-11-12', 2, 1)
Redis.setbit('business:5:customers:2013-11-12', 6, 1)

Which would set the bits at indexes 2 and 6 to 1 on the key “business:5:customers:2013-11-12”, and could be symbolically represented as “0010001”.
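
Reading a bit back out works the same way with GETBIT; a quick sketch, using the same connection-object style as the snippets above:

# Check whether a given customer id has its bit set for that day
Redis.getbit('business:5:customers:2013-11-12', 2) # => 1, customer 2 checked in
Redis.getbit('business:5:customers:2013-11-12', 3) # => 0, customer 3 did not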

If we wanted to see how many unique users checked into this location on a given day we could perform the following:

Redis.bitcount('business:5:customers:2013-11-12')

Which performs an extremely fast population count (how many 1’s there are) on the bit-string, and in this case would result in a value of 2.

Not only can we perform bit counts, we can also perform bitwise operations such as “AND”, “OR”, “XOR”, and “NOT”.

So let’s get back to those ad-hoc queries…

To find the number of unique customers over a specific timespan, we can do the following:

# All the keys (dates) you want to include, one per day in the requested span
keys = ['business:5:customers:2013-11-12', 'business:5:customers:2013-11-13'] # ...and so on for each date
# Calculates the union of all the specified bitmaps and saves it to "business:5:customers:total"
Redis.bitop('OR', 'business:5:customers:total', keys)
# Calculates the population count
Redis.bitcount('business:5:customers:total')

Which gives us the number of unique customers across the requested timespan.
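
In practice, the keys array can be generated for whatever span the dashboard requests; a minimal sketch, assuming start_date and end_date are Ruby Date objects and Redis is the same connection object used throughout:

# Build one key per day in the requested (inclusive) span; Date#to_s yields "YYYY-MM-DD"
keys = (start_date..end_date).map { |date| "business:5:customers:#{date}" }
Redis.bitop('OR', 'business:5:customers:total', keys)
Redis.bitcount('business:5:customers:total')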

Further, if we wanted to take this result and calculate the number of males across the timespan, we can “AND” the resulting bitmap with a bitmap in which the bit is set for every user id that is male:

# The keys to perform the operation on
keys = ['business:5:customers:total', 'users:males']
# Calculates the intersection of the specified bitmaps and saves it to "business:5:males:total"
Redis.bitop('AND', 'business:5:males:total', keys)
# Calculates the population count
Redis.bitcount('business:5:males:total')
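
The “users:males” bitmap itself just needs the bit for each male user’s id flipped on. How and when that happens is up to you; a minimal sketch, assuming a hypothetical user object with an id and a gender:

# Hypothetical: set the user's bit once we know their gender
Redis.setbit('users:males', user.id, 1) if user.gender == 'male'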

So in summary, bitmaps are pretty freakin cool.

However, not everything that we were measuring needed to be a bitmap. Things such as “Number of Visits” did not need to be unique to a specific user, and thus, to save space, could just be stored as a counter in Redis. The counters could then be incremented on each visit via:

Redis.incr('business:5:visits:2013-11-12')

Initially, when calculating totals across a timespan, these summations were done in Ruby:

sum = 0
(start_date...end_date).each do |date|
  sum += Redis.get("business:5:visits:#{date}").to_i
end

This proved to be somewhat slow across a large number of days, and resulted in a large number of requests to Redis for a single metric.

Enter Lua

Redis 2.6+ allows you to write Lua scripts that are executed inside Redis and can themselves make calls to Redis from within the script. This means you can essentially create your own custom Redis commands. Thus, the summation loop above could be replaced by the following:

(Run on Redis) sum.lua:

local sum = 0
-- For each key we want to sum
for index, key in ipairs(KEYS) do
  -- redis.call('GET', key) fetches the value stored at that key
  local value = tonumber(redis.call('GET', key))
  if value then
    sum = sum + value
  end
end
return sum

(Run in Ruby):

keys = []
(start_date...end_date).each do |date|
  keys << "business:5:visits:#{date}"
end
# In production we actually evaluate the precompiled Lua script using EVALSHA, not EVAL.
Redis.eval(File.read('sum.lua'), keys)
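
For reference, the EVALSHA approach mentioned in the comment above can look something like this; a minimal sketch, assuming the SHA is cached once at boot (redis-rb exposes this via the script and evalsha commands):

# Load the script into Redis once and cache its SHA1 digest
SUM_SHA = Redis.script(:load, File.read('sum.lua'))

# Later, invoke the cached script by its SHA instead of re-sending the source
Redis.evalsha(SUM_SHA, keys)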

This small change means that a query for a year’s worth of data goes from 365 requests to just 1 request to Redis, and substantially increases performance.

Pros and Cons

As with anything, there are pros and cons for using the above method.

Pros:

  • Redis is all in memory, and is just a key-value store, so it is extremely fast.
  • Using bitmaps you can store a large amount of data about users in a really small amount of space. Additionally, they perform calculations EXTREMELY fast. In Spool’s blog post, they state: “In a simulation of 128 million users, a typical metric such as ‘daily unique users’ takes less than 50 ms on a MacBook Pro and only takes 16 MB of memory.” Pretty powerful stuff.
  • Lua scripts allow you to take advantage of Redis’ speed while still being able to perform complex queries.
  • Really easy to make “real-time”: every event we want to track simply increments a counter or sets a bit when received.

Cons:

  • Memory is more expensive than disk.
  • Redis is just a key-value store, so keys can get messy quickly. There are no relations, and “ORM-type” modeling typically goes out the window (although this looks interesting). Be sure to organize your keys and namespaces in a fashion that makes sense.
  • Since Redis is in-memory, in use cases such as ours (where it is not simply a cache, but a more persistent data store) backing up to disk is crucial, and not always trivial (see http://redis.io/topics/persistence).

In Summary

Jim Gray’s words that “memory is the new disk, and disk is the new tape” have been repeated a lot in recent years, perhaps indicating that as the amount of data applications must use increases, the traditional method of reading application data from a primary data store on disk may no longer be optimal for those who want near real-time speeds.

Whether or not this is proving to be true, taking that approach in this use case seemed to work for us. By utilizing an in-memory data store such as Redis, and condensing our data down to either a counter or a bitmap, we were able to successfully create data-heavy client dashboards with relatively low latency.
