Using Amazon DynamoDb for IP and co-ordinate based geo-location services part 4: lng/lat range strategy
April 19, 2015
Introduction
In the previous post we discussed our strategy to save the IP ranges in DynamoDb. We saw that it would be very inefficient to store the IPs as strings and run our queries based on some string manipulation. Instead we’ll store the IP ranges as lower and upper limit integers.
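As a quick reminder of what that looks like, here is a minimal Python sketch that turns dotted IPv4 strings into integers so that a range lookup becomes a plain numeric comparison. The function name and the sample addresses are made up for illustration; they are not taken from the data set used in this series.

```python
import ipaddress


def ip_to_int(ip: str) -> int:
    """Convert a dotted IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


# Hypothetical range: lower and upper limits stored as integers
lower = ip_to_int("1.2.3.0")
upper = ip_to_int("1.2.3.255")

# Checking whether an address falls in the range is a numeric comparison
candidate = ip_to_int("1.2.3.4")
in_range = lower <= candidate <= upper
```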
In this post we’ll discuss our strategy to save the longitude-latitude ranges for cities.
Lng/lat ranges
Let’s restate what we’d like to achieve. Given a longitude-latitude co-ordinate pair we’d like to find…:
- …all “objects” within a certain radius of those coordinates
- …or find the “object” that lies nearest to those coordinates
Actually those are much the same goals as we’ll see later. Here “object” can be anything: city, movie, restaurant etc. We’ll concentrate on cities as we said before.
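To see why the two goals are so closely related, consider the following brute-force Python sketch. It is not how the Amazon component works; it simply computes the great-circle distance with the haversine formula for every record, which is only feasible for tiny data sets. The city list, the coordinates and the function names are assumptions made for illustration. Once we have a distance for every object, the radius search is a filter on that distance and the nearest-object search is its minimum.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two lat/lng points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lng2 - lng1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Illustrative sample data: (name, latitude, longitude), approximate values
CITIES = [
    ("Stockholm", 59.3293, 18.0686),
    ("Uppsala", 59.8586, 17.6389),
    ("Gothenburg", 57.7089, 11.9746),
]


def within_radius(lat, lng, radius_km, cities=CITIES):
    """Goal 1: all objects within a certain radius of the coordinates."""
    return [c for c in cities
            if haversine_km(lat, lng, c[1], c[2]) <= radius_km]


def nearest(lat, lng, cities=CITIES):
    """Goal 2: the object with the smallest distance to the coordinates."""
    return min(cities, key=lambda c: haversine_km(lat, lng, c[1], c[2]))
```

For example, within_radius(59.33, 18.07, 100) keeps Stockholm and Uppsala and drops Gothenburg, while nearest(59.33, 18.07) picks Stockholm from the same distances.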
Achieving these goals is mathematically quite challenging. We’re lucky again as these are not new problems and have been solved by others. We’ll use Amazon and Google components to format the data records and query the data set. The Amazon component requires a certain format so that it can run the searches. Each data record must have the following columns:
- A column called “geoHash” of type Number: this is a large number that’s calculated based on the lng and lat values and is a result of some complex maths
- A column called “hashKey” of type Number which is a short hash key generated from the geohash
- A column called “geoJson” which will store the lng/lat pairs in a human readable JSON format
- A column called “rangeKey” of type String which we can define but must be unique for each record. We’ll set it to an integer ID starting from 1 and increment it with every new record
These are the obligatory columns for the AWS queries to work. It’s possible to add any other columns you need. We’ll only add one more and that will be the geoname ID so that we’ll be able to find the correct city in the Cities table.
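To make the record layout more concrete, here is a rough Python sketch of what a single record could look like. Note the assumptions: the real geohash is calculated by the Amazon and Google components, so toy_geohash below is only a simple bit-interleaving stand-in and will not produce the same numbers. Taking the leading six digits as the hash key, the shape of the geoJson string and the “geonameId” column name are likewise guesses made for illustration.

```python
import json


def toy_geohash(lat: float, lng: float, bits: int = 30) -> int:
    """Stand-in hash: interleave quantised lat/lng bits into one integer.

    NOT the hash used by the Amazon component; for illustration only.
    """
    scale = (1 << bits) - 1
    y = int((lat + 90.0) / 180.0 * scale)
    x = int((lng + 180.0) / 360.0 * scale)
    h = 0
    for i in range(bits):
        h |= ((x >> i) & 1) << (2 * i)      # lng bits on even positions
        h |= ((y >> i) & 1) << (2 * i + 1)  # lat bits on odd positions
    return h


def short_hash_key(geohash: int, length: int = 6) -> int:
    """Assumed rule: keep the leading `length` decimal digits of the hash."""
    return int(str(geohash)[:length])


def build_record(record_id: int, lat: float, lng: float, geoname_id: int) -> dict:
    """Assemble one record with the obligatory columns plus the geoname ID."""
    geohash = toy_geohash(lat, lng)
    return {
        "hashKey": short_hash_key(geohash),
        "rangeKey": str(record_id),  # unique per record, stored as a string
        "geoHash": geohash,
        "geoJson": json.dumps({"type": "Point", "coordinates": [lng, lat]}),
        "geonameId": geoname_id,     # hypothetical column name
    }


# Example values only; the geoname ID here is a placeholder
record = build_record(1, 59.3293, 18.0686, 123456)
```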
Just to give you a teaser here’s a printout from DynamoDb about the table we’re going to build:
Summary
We have now discussed our strategy to build the IP and lng/lat range tables. We’ll eventually build 3 tables in DynamoDb:
- One which holds the cities along with their geoname IDs
- One which will hold the IP ranges and have reference to the cities table through the geoname ID
- …and finally one that will store the longitude-latitude pairs with the same reference to the cities table
In the next post we’ll start building the IP range table.