Using Amazon DynamoDb for IP and co-ordinate based geo-location services part 9: uploading the co-ordinate range to DynamoDb
May 9, 2015
Introduction
In the previous post we successfully created the lng/lat import file that DynamoDb can understand and process.
In this post we’ll upload this file to DynamoDb. The process will be the same as what we saw in this post where we inserted the demo data into the IPv4 range table. If necessary, re-read that post to refresh your memory about the process. We’ll follow the strategy we laid out in this post.
Table setup in DynamoDb
Log onto the AWS web console and navigate to DynamoDb. Create a new table with the following characteristics:
- Name: geo-ip-coords-test
- Primary key type: hash and range
- Hash attribute name: “hashKey” of type Number
- Range attribute name: “rangeKey” of type String
- (Click Continue to come to the indexes page)
- Select Index type “Local secondary index”: this will automatically add “hashKey (Number)” as the index hash key
- As the index range key, enter “geohash” of type Number
- The index name will be autocompleted to geohash-index, that’s good enough
- Make sure that “All attributes” is selected in the Project attributes drop-down list
- Click “Add index to table”
- (Click continue)
- Set both the read and write capacity to 5 units. The same remarks apply as before: the write throughput should be much larger when you’re ready to import the real database, and the read throughput should be sized up as well in production.
- (Click continue)
- You can optionally set up a basic alarm – this is not vital for this demo exercise but is very useful for production databases. I’ll let you decide whether you want to be notified in case of a throughput limit breach
- Click continue to reach the Review pane and click Create
Wait for the table to reach status ACTIVE.
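If you prefer the command line to the web console, the same table can be created with the AWS CLI. This is a sketch only: it assumes the AWS CLI is installed and configured with credentials for your account, and it mirrors the console steps above (the table name, key names and index name are the ones we chose in this post).

```shell
# Create the table with the hashKey/rangeKey primary key and the
# geohash-index local secondary index, projecting all attributes.
aws dynamodb create-table \
    --table-name geo-ip-coords-test \
    --attribute-definitions \
        AttributeName=hashKey,AttributeType=N \
        AttributeName=rangeKey,AttributeType=S \
        AttributeName=geohash,AttributeType=N \
    --key-schema \
        AttributeName=hashKey,KeyType=HASH \
        AttributeName=rangeKey,KeyType=RANGE \
    --local-secondary-indexes '[{
        "IndexName": "geohash-index",
        "KeySchema": [
            {"AttributeName": "hashKey", "KeyType": "HASH"},
            {"AttributeName": "geohash", "KeyType": "RANGE"}
        ],
        "Projection": {"ProjectionType": "ALL"}
    }]' \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

# Block until the table reaches status ACTIVE
aws dynamodb wait table-exists --table-name geo-ip-coords-test
```

Note that a local secondary index must share the table’s hash key, which is why only the range key (geohash) differs from the primary key.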
Importing the records
This is the same process we saw before when we uploaded the IPv4 range table. I’ll copy the relevant sections with minor modifications here.
Click the “Export/Import” button in the menu:
Select the lng/lat range table “geo-ip-coords-test” and click to import into DynamoDb:
You’ll be directed to the “Create Import Table Data Pipeline” view. Select the input folder of the DynamoDb input JSON file we created previously. Provide an S3 log folder as well; these logs can provide important information in case the import process fails. You can set the throughput rate to 100% as no other process will need access to the table during this time. Leave the execution timeout at 24 hours. You can also provide an email address for notifications.
Follow the rest of the instructions in the post referenced above. You can view the progress in Data Pipeline and/or DynamoDb. Wait for the process to finish. If all goes well, the import process should finish without exceptions:
Go back to the tables list in DynamoDb and check the contents of geo-ip-coords-test:
Note how the coordinates were stored as JSON strings. You’ll also see the calculated geo hash values.
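You can also spot-check the imported records from the command line instead of the console. A quick scan returns a few items so you can confirm the JSON-string coordinate attributes and the calculated numeric geohash values are in place (again assuming a configured AWS CLI):

```shell
# Fetch a handful of items from the freshly imported table
aws dynamodb scan --table-name geo-ip-coords-test --limit 3

# Or just verify the total item count and table size
aws dynamodb describe-table --table-name geo-ip-coords-test \
    --query 'Table.[ItemCount,TableSizeBytes]'
```

Keep in mind that ItemCount is updated by DynamoDb only periodically (roughly every six hours), so it may lag behind the actual number of imported records.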
In the next post we’ll see how to query this table using the Java AWS SDK to find the geoname ID of a single lng/lat coordinate pair.
View all posts related to Amazon Web Services and Big Data here.