Using the Redis NoSql database with .NET Part 4: key name searches and the set data type
April 10, 2017
Introduction
In the previous post we looked at the list data type in Redis. Lists are implemented as linked lists in Redis, meaning that each item has a link to its immediate neighbours, i.e. the previous and the next items. It’s efficient to operate on the first or last item in a linked list: adding to and retrieving from the head or tail is very fast. Redis lists therefore lend themselves very well to queues and stacks. Index-based operations, e.g. getting the 3rd item in the list, are not as efficient. We then looked at a couple of list-related Redis commands such as LINSERT, RPOP and LRANGE. We also looked at how to add the Redis folder to the environment variables on Windows so that we can call upon the Redis executables without having to navigate to the folder in a command prompt.
In this post we’ll first look at how to retrieve the key names in our Redis database. Then we’ll continue our discussion of the Redis data types with sets.
Extracting keys
It is sometimes necessary to retrieve all the key names from Redis. There are at least two commands to achieve that: KEYS and SCAN.
KEYS accepts a search pattern. The pattern is glob-style, not a regular expression: the * character stands for any number of unknown characters in the key name, including none, whereas ? denotes exactly one unknown character. The following will find all key names with no filter:
KEYS *
In my case it returns…
1) “yesterdays-greeting”
2) “customer:643”
3) “todolist”
4) “tomorrows-greeting”
5) “name”
6) “current-greeting”
7) “customer-count”
…after the demos we’ve gone through so far in the series.
The command…
KEYS *ee*
…will find all key names that include ‘ee’:
1) “yesterdays-greeting”
2) “tomorrows-greeting”
3) “current-greeting”
Using ‘?’ like…
KEYS n???
…will match all key names that start with n followed by 3 unknown characters. There’s one match in our key set: name.
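The two wildcards can be tried out without a Redis server. The following is a minimal sketch that uses Python's fnmatch module as a stand-in for the Redis pattern matcher. The keys helper and the hard-coded key list are assumptions made for illustration only. The stand-in agrees with Redis for * and ?, but other parts of the syntax differ (Redis negates a character class with ^, fnmatch with !), so it is not a full emulation.

```python
from fnmatch import fnmatchcase

# Key names copied from the KEYS * output above.
KEY_NAMES = [
    "yesterdays-greeting", "customer:643", "todolist", "tomorrows-greeting",
    "name", "current-greeting", "customer-count",
]


def keys(pattern, names=KEY_NAMES):
    """Hypothetical helper: return the names that match a glob pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(keys("*ee*"))  # ['yesterdays-greeting', 'tomorrows-greeting', 'current-greeting']
print(keys("n???"))  # ['name']
print(len(keys("*")))  # 7
```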
Note that the Redis documentation of the KEYS command doesn’t recommend using it on a large key-set in a production system: it walks through every key in a single call, which can degrade performance. SCAN is a better option so let’s see a basic application of the command. Let’s first add some new keys:
MSET city "Minsk" country "Belarus" language "Belarusian" money "ruble" population "9.5m" internet-code "by"
So now we have the following keys, 13 in total:
1) “city”
2) “language”
3) “tomorrows-greeting”
4) “population”
5) “name”
6) “current-greeting”
7) “money”
8) “customer-count”
9) “todolist”
10) “customer:643”
11) “yesterdays-greeting”
12) “internet-code”
13) “country”
The SCAN command accepts an integer which stands for the cursor. The first iteration always starts with cursor position 0 like here:
SCAN 0
…which returns two values:
1) “13”
2)  1) “population”
    2) “name”
    3) “internet-code”
    4) “country”
    5) “city”
    6) “money”
    7) “todolist”
    8) “customer:643”
    9) “yesterdays-greeting”
   10) “current-greeting”
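The integer 13 is the cursor value to be provided for the next iteration to retrieve the next batch. It is an opaque value chosen by Redis, not a position in a list, so we just pass it back as it is. Note that we got 10 elements back. That is the default value of the COUNT option, which is a hint rather than a guarantee, so a call can return somewhat more or fewer elements. In order to continue with the key retrieval we need to call SCAN again with the updated cursor: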
SCAN 13
…which returns
1) “0”
2) 1) “language”
   2) “tomorrows-greeting”
   3) “customer-count”
The next cursor value is 0, meaning we’ve completed the iteration. So if we need to get all the key names then we keep calling the SCAN command with the updated cursor until it returns 0 as the next cursor.
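The loop described above can be sketched in a few lines of Python. This is a toy model that runs without a Redis server: fake_scan and scan_all are hypothetical helpers, and the cursor is a plain list offset here purely to keep the example short, whereas real Redis cursors are opaque values.

```python
# Key names in the order the SCAN examples above returned them.
KEY_NAMES = [
    "population", "name", "internet-code", "country", "city", "money",
    "todolist", "customer:643", "yesterdays-greeting", "current-greeting",
    "language", "tomorrows-greeting", "customer-count",
]


def fake_scan(cursor, count=10):
    """Toy stand-in for SCAN (assumption: the cursor is a list offset)."""
    batch = KEY_NAMES[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(KEY_NAMES):
        next_cursor = 0  # 0 signals that the iteration is complete
    return next_cursor, batch


def scan_all():
    """Keep calling fake_scan with the returned cursor until it comes back as 0."""
    found = []
    cursor = 0
    while True:
        cursor, batch = fake_scan(cursor)
        found.extend(batch)
        if cursor == 0:
            return found


print(len(scan_all()))  # 13
```

The important part is the shape of the loop: the cursor that comes back from one call is passed unchanged into the next one, and only a returned 0 ends the iteration.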
The COUNT option hints at how many items should be returned per call. Redis treats it as a guideline rather than a hard limit:
SCAN 0 COUNT 5
…gives…
1) “10”
2) 1) “population”
   2) “name”
   3) “internet-code”
   4) “country”
   5) “city”
There’s also a MATCH option which accepts a filter argument that has the same glob-style format as the KEYS filter argument. Example:
SCAN 0 MATCH *ee*
…returns…:
1) “13”
2) 1) “yesterdays-greeting”
   2) “current-greeting”
…meaning we have to continue the scan with…
SCAN 13 MATCH *ee*
…which then returns the last matching element:
1) “0”
2) 1) “tomorrows-greeting”
We can combine the MATCH and COUNT arguments:
SCAN 0 MATCH *ee* COUNT 5
1) “10”
2) (empty list or set)
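…which means that there are no matching items among the first 5 keys. The MATCH filter is applied after a batch of keys has been retrieved, so a call can return an empty list together with a non-zero cursor. We have to keep searching with the updated cursor of 10: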
SCAN 10 MATCH *ee* COUNT 5
…which returns…
1) “13”
2) 1) “yesterdays-greeting”
   2) “current-greeting”
…so finally we can run a last iteration:
SCAN 13 MATCH *ee* COUNT 5
1) “0”
2) 1) “tomorrows-greeting”
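The empty first batch is easier to understand with a small model of the filtering order. The sketch below is a self-contained toy that needs no Redis server. fake_scan is a hypothetical helper, the cursor is a plain list offset (real Redis cursors are opaque, which is why the cursor values differ from the ones above), and Python's fnmatch stands in for the Redis pattern matcher.

```python
from fnmatch import fnmatchcase

# Key names in the order the SCAN examples above returned them.
KEY_NAMES = [
    "population", "name", "internet-code", "country", "city", "money",
    "todolist", "customer:643", "yesterdays-greeting", "current-greeting",
    "language", "tomorrows-greeting", "customer-count",
]


def fake_scan(cursor, match="*", count=10):
    """Toy SCAN: pick a batch first, then filter it (cursor = list offset)."""
    batch = KEY_NAMES[cursor:cursor + count]
    next_cursor = cursor + count if cursor + count < len(KEY_NAMES) else 0
    # The filter runs on the batch, so the result can be empty
    # even though more matching keys exist further on.
    return next_cursor, [key for key in batch if fnmatchcase(key, match)]


steps = []
cursor = 0
while True:
    cursor, hits = fake_scan(cursor, match="*ee*", count=5)
    steps.append((cursor, hits))
    if cursor == 0:
        break

for step in steps:
    print(step)
# (5, [])
# (10, ['yesterdays-greeting', 'current-greeting'])
# (0, ['tomorrows-greeting'])
```

An empty batch therefore says nothing about whether the search is finished. Only a returned cursor of 0 does.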
Sets
Sets are similar to lists but sets can only contain unique string values. Also, the ordering of the items in a set is not guaranteed to be the same as when they were inserted. So if the ordering of the items is important then a list is the better option.
We’ll build two to-do lists, or to-do sets to be exact. The set-related commands are all documented here and we’ll look at the most commonly used ones in this post.
SADD will add one or more items to a set. We’ll try to add some duplicate entries:
SADD to-do-monday "cook" "study" "walk" "write" "study" "cook"
…which returns 4 meaning that 4 items were inserted in the set. SMEMBERS reads the values from a set:
SMEMBERS to-do-monday
1) “write”
2) “study”
3) “cook”
4) “walk”
…so the duplicates were removed. Item values are case-sensitive meaning…
SADD to-do-monday "COOK"
…will add “COOK” to the set:
1) “study”
2) “cook”
3) “COOK”
4) “write”
5) “walk”
We can remove it with SREM:
SREM to-do-monday "COOK"
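The return values of SADD and SREM can be modelled with a plain Python set. This is only a sketch of the semantics and doesn't talk to Redis: the sadd and srem functions are hypothetical helpers that mimic how the commands report the number of members actually added or removed.

```python
def sadd(target, *members):
    """Add members to a set and return how many of them were new."""
    before = len(target)
    target.update(members)
    return len(target) - before


def srem(target, *members):
    """Remove members from a set and return how many were present."""
    before = len(target)
    target.difference_update(members)
    return before - len(target)


to_do_monday = set()
print(sadd(to_do_monday, "cook", "study", "walk", "write", "study", "cook"))  # 4
print(sadd(to_do_monday, "COOK"))  # 1, because "COOK" and "cook" are different values
print(srem(to_do_monday, "COOK"))  # 1
print(sorted(to_do_monday))  # ['cook', 'study', 'walk', 'write']
```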
We’ll now create a similar set for Tuesday:
SADD to-do-tuesday "cook" "sleep" "run" "study" "read"
Redis has commands for the usual set operations: union, intersection and difference. We can find the union of unique action items for Monday and Tuesday as follows:
SUNION to-do-tuesday to-do-monday
1) “run”
2) “read”
3) “write”
4) “study”
5) “walk”
6) “sleep”
7) “cook”
Here’s how to find the common elements, i.e. the intersection:
SINTER to-do-tuesday to-do-monday
We have two action items to be performed on both days:
1) “study”
2) “cook”
SDIFF takes the first set argument and returns those of its items that are not present in any of the other provided sets. The order of the arguments therefore matters. So the command…
SDIFF to-do-monday to-do-tuesday
…takes the items from the Monday list and returns the ones that are only found for Monday and not in the other sets in the argument list:
1) “walk”
2) “write”
These two do not figure in the Tuesday list whereas “study” and “cook” do.
Consequently the command…
SDIFF to-do-tuesday to-do-monday
…returns…
1) “sleep”
2) “read”
3) “run”
…which are action items for Tuesday but not Monday.
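The same three operations exist on Python's built-in set type, which makes it easy to double-check the results above. This is a sketch of the set semantics only, with the two to-do sets copied in by hand. It does not involve Redis.

```python
to_do_monday = {"cook", "study", "walk", "write"}
to_do_tuesday = {"cook", "sleep", "run", "study", "read"}

union = to_do_tuesday | to_do_monday          # like SUNION
intersection = to_do_tuesday & to_do_monday   # like SINTER
monday_only = to_do_monday - to_do_tuesday    # like SDIFF to-do-monday to-do-tuesday
tuesday_only = to_do_tuesday - to_do_monday   # like SDIFF to-do-tuesday to-do-monday

print(len(union))            # 7
print(sorted(intersection))  # ['cook', 'study']
print(sorted(monday_only))   # ['walk', 'write']
print(sorted(tuesday_only))  # ['read', 'run', 'sleep']
```

Union and intersection give the same result whichever set comes first, whereas the difference depends on the argument order.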
The SINTER, SDIFF and SUNION commands can take more than 2 arguments. We create two more sets:
SADD to-do-wednesday "cook" "work" "run" "study" "eat" "watch tv"
SADD to-do-thursday "walk" "french" "swim" "study" "bake" "play"
Here’s the intersection for all 4 to-do lists:
SINTER to-do-monday to-do-tuesday to-do-wednesday to-do-thursday
…which returns the only common action item “study”.
SUNION to-do-monday to-do-tuesday to-do-wednesday to-do-thursday
…will return 14 unique action items:
1) “eat”
2) “swim”
3) “run”
4) “read”
5) “french”
6) “play”
7) “study”
8) “work”
9) “walk”
10) “cook”
11) “bake”
12) “write”
13) “sleep”
14) “watch tv”
SDIFF to-do-monday to-do-tuesday to-do-wednesday to-do-thursday
…returns a single item “write”. It is the only action item that is scheduled for Monday but not the other 3 days.
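The results for the four sets can be verified with Python sets as well, since the corresponding set methods also accept several arguments. Again this is only a sketch of the semantics with the data copied in by hand. The variable names are made up for the example.

```python
monday = {"cook", "study", "walk", "write"}
tuesday = {"cook", "sleep", "run", "study", "read"}
wednesday = {"cook", "work", "run", "study", "eat", "watch tv"}
thursday = {"walk", "french", "swim", "study", "bake", "play"}
days = [monday, tuesday, wednesday, thursday]

common = set.intersection(*days)  # like SINTER with 4 keys
everything = set.union(*days)     # like SUNION with 4 keys
# Like SDIFF: start from the first set, drop whatever occurs in the others.
monday_only = monday.difference(tuesday, wednesday, thursday)

print(sorted(common))       # ['study']
print(len(everything))      # 14
print(sorted(monday_only))  # ['write']
```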
SUNION, SDIFF and SINTER all have a “STORE” variant:
SUNIONSTORE
SDIFFSTORE
SINTERSTORE
…which doesn’t return the resulting set but saves it as a new set under the provided key and returns the number of items in it. Here’s an example for SUNIONSTORE:
SUNIONSTORE to-do-overall to-do-monday to-do-tuesday to-do-wednesday to-do-thursday
…which takes the 14 unique items from the 4 sets, puts them in a new set called to-do-overall and returns 14.
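To illustrate what the STORE variants do, here’s a small model in which a Python dictionary plays the role of the Redis key space. The sunionstore function and the keyspace dictionary are hypothetical stand-ins written for this example. They are not part of any Redis client library.

```python
def sunionstore(keyspace, destination, *source_keys):
    """Store the union of the source sets under destination; return its size."""
    result = set()
    for key in source_keys:
        result |= keyspace.get(key, set())  # a missing key counts as an empty set
    keyspace[destination] = result  # replaces whatever was stored there before
    return len(result)


keyspace = {
    "to-do-monday": {"cook", "study", "walk", "write"},
    "to-do-tuesday": {"cook", "sleep", "run", "study", "read"},
    "to-do-wednesday": {"cook", "work", "run", "study", "eat", "watch tv"},
    "to-do-thursday": {"walk", "french", "swim", "study", "bake", "play"},
}

stored = sunionstore(
    keyspace, "to-do-overall",
    "to-do-monday", "to-do-tuesday", "to-do-wednesday", "to-do-thursday",
)
print(stored)  # 14
print(len(keyspace["to-do-overall"]))  # 14
```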
SCARD gets the number of items in a set:
SCARD to-do-thursday
…which returns 6.
Finally, we can look at SMOVE which moves an item from the first set to the second. Let’s say we want to learn French on Wednesday instead of Thursday:
SMOVE to-do-thursday to-do-wednesday "french"
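The effect of SMOVE on the two sets, and on what SCARD would report afterwards, can be sketched with Python sets. The smove function is a hypothetical helper that mimics the command’s return value: 1 if the item was moved and 0 if it wasn’t in the source set. len plays the role of SCARD.

```python
def smove(source, destination, member):
    """Move member from source to destination; return 1 if moved, else 0."""
    if member not in source:
        return 0
    source.remove(member)
    destination.add(member)
    return 1


to_do_wednesday = {"cook", "work", "run", "study", "eat", "watch tv"}
to_do_thursday = {"walk", "french", "swim", "study", "bake", "play"}

print(smove(to_do_thursday, to_do_wednesday, "french"))  # 1
print(smove(to_do_thursday, to_do_wednesday, "french"))  # 0, it is already gone
print(len(to_do_thursday))   # 5
print(len(to_do_wednesday))  # 7
```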
Read the next part here.
You can view all posts related to data storage on this blog here.