A recent work project involved persisting user data in Redis. We wanted to estimate how much memory we would use, to determine whether the currently available infrastructure was appropriately sized.
Data Details
- We’re using Redis inside the user’s request, so it needs to be fast
- We’re using HSET to store a hash for each user (time complexity: O(1)), with a unique key per user
- Each key is composed of a prefix and a UUID representing the user
- We are setting up to 3 field/value pairs per user, which represent user interaction events
- We’re using HGETALL (O(N), where N is the size of the hash) to read the field and value pairs back out (a sample session follows this list)
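For concreteness, here is roughly what the write and read look like for a single user in redis-cli. The key prefix, UUID, field names, and timestamp values below are made up for illustration; only the shape of the commands reflects the description above.

127.0.0.1:6379> HSET user:9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d event_a 1618349261 event_b 1618349305 event_c 1618349400
(integer) 3
127.0.0.1:6379> HGETALL user:9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d
1) "event_a"
2) "1618349261"
3) "event_b"
4) "1618349305"
5) "event_c"
6) "1618349400"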
Determining Memory Use Per Key
In the article Estimate the memory usage for repeated keys in Redis, the technique uses the INFO command to look at how much memory usage increases after adding a key.
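As a rough sketch, that approach can be done from the shell as follows; the key name and field here are placeholders, and a single sample like this only gives a ballpark figure.

# Record used_memory, add one representative key, and diff the two readings.
before=$(redis-cli info memory | awk -F: '/^used_memory:/ {print $2}' | tr -d '\r')
redis-cli hset user:sample-uuid event_a 1618349261 > /dev/null
after=$(redis-cli info memory | awk -F: '/^used_memory:/ {print $2}' | tr -d '\r')
echo "approximate bytes for one key: $((after - before))"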
Since Redis version 4, there is a more direct method: the MEMORY USAGE command. I set a couple of keys and checked their size, and they were about 225 bytes per key.
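For example, against a key like the one set above (the key name is illustrative, and the exact byte count will vary with field names, value sizes, and encoding):

127.0.0.1:6379> MEMORY USAGE user:9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d
(integer) 225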
For 1,000,000 users, then, we’d expect to use roughly 225 MB (225 bytes * 1_000_000) of memory; call it 250 MB to leave some room for overhead.
Redis Mass Insertion Script
I modified the Redis Mass Insert example script below to insert 1,000,000 items into Redis that are equivalent to what the application will insert. The original example uses a simple SET operation with a key and value, so I changed it to perform an HSET with realistic values followed by an EXPIRE operation; a sketch of the modified loop appears after the original script.
#
# Usage:
#
#   $ ruby redis_insert_example.rb | redis-cli --pipe
#

# Build the raw Redis protocol (RESP) representation of a single command,
# so the output can be streamed straight into `redis-cli --pipe`.
def gen_redis_proto(*cmd)
  proto = ""
  proto << "*" + cmd.length.to_s + "\r\n"
  cmd.each do |arg|
    proto << "$" + arg.to_s.bytesize.to_s + "\r\n"
    proto << arg.to_s + "\r\n"
  end
  proto
end

# The original example emits 10 simple SET commands.
(0...10).each do |n|
  STDOUT.write(gen_redis_proto("SET", "Key#{n}", "Value#{n}"))
end
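Here is a sketch of what the modified loop might look like, reusing gen_redis_proto from above. The key prefix, field names, timestamp values, and 90-day TTL are illustrative assumptions, not the exact values from the real script.

require "securerandom"

TOTAL_USERS = 1_000_000
TTL_SECONDS = 90 * 24 * 60 * 60  # assumed TTL, for illustration only

TOTAL_USERS.times do
  key = "user:#{SecureRandom.uuid}"  # prefix plus a UUID per user
  # Three HSET commands (one field/value pair each) plus one EXPIRE per user,
  # which is why the run below reports 4,000,000 replies for 1,000,000 users.
  STDOUT.write(gen_redis_proto("HSET", key, "event_a", Time.now.to_i.to_s))
  STDOUT.write(gen_redis_proto("HSET", key, "event_b", Time.now.to_i.to_s))
  STDOUT.write(gen_redis_proto("HSET", key, "event_c", Time.now.to_i.to_s))
  STDOUT.write(gen_redis_proto("EXPIRE", key, TTL_SECONDS.to_s))
end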
To clean this up, the following bash script will delete the keys; however, it operates one key at a time, so it will be slow for large data sets. A faster alternative follows the script.
for key in `echo 'KEYS Key*' | redis-cli | awk '{print $1}'`
do echo DEL $key
done | redis-cli
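A quicker way, as a rough sketch (assuming Redis 4 or later for the non-blocking UNLINK command, and the same Key* prefix), is to iterate with SCAN via redis-cli --scan instead of KEYS, and batch the deletes:

# --scan walks the keyspace with SCAN, so it doesn't block the server like KEYS;
# xargs batches keys so each redis-cli invocation unlinks many at once.
redis-cli --scan --pattern 'Key*' | xargs -L 500 redis-cli UNLINK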
Memory Used
Below we’ve loaded equivalent data for 1_000_000 users: three field/value pairs per user (one HSET per pair), plus one EXPIRE at the hash key level to set a TTL on the key. This is why the script output below shows 4,000,000 replies (4 operations per user).
~/Projects/redis-mass-insert $ ruby redis_insert_groupon.rb | redis-cli --pipe
All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 4000000
Now we can check the used memory, which we see is about 294 MB, fairly close to our estimate above from checking a single key. The additional memory may be due to storing all the TTL values.
~/Projects/redis-mass-insert $ redis-cli -p 6379 -a password info| egrep "used_memory_human|total_system_memory_human"
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
used_memory_human:294.24M
total_system_memory_human:16.00G
Here is the Redis INFO keyspace output, where we can confirm that we have just over 1 million keys, 1,000,000 of which have expiration values.
127.0.0.1:6379> info keyspace
# Keyspace
db0:keys=1000006,expires=1000000,avg_ttl=7775864775