Using Amazon DynamoDb for IP and co-ordinate based geo-location services part 3: IPv4 range strategy

Introduction

In the previous post we went through the details of the CSV source files that show the IP and lng/lat ranges and the actual locations. We saw that the two source files are linked by the location ID.

The next task is to import the source into DynamoDb. Recall that we want to handle queries based on IPs and lng/lat pairs separately, those are the primary goals of this series. The way to query an IP database is very different from querying a lng/lat database. An IP will fit into some IP range and we’d like to find that record. Whereas if you have a lng/lat co-ordinate pair and would like to find the nearest city/school/hospital/etc. within a certain radius then that query will involve some complex maths instead.

Read more of this post

Randomly rearrange a string in .NET C#

Say you’d like to randomly rearrange the contents of a string, i.e. put the constituent characters into new, random positions. The following method can help you with that using a combination of LINQ and Random:

private static string RearrangeString(string startingPoint)
{
	Random num = new Random();
	string rand = new string(startingPoint.
		OrderBy(s => (num.Next(2) % 2) == 0).ToArray());
	return rand;
}

Let’s test it with my name:

string myName = "Andras Nemes";
string myNewName = RearrangeString(myName);

So my new name can be something like “drsNeeAna ms” or “AdaNmesnrs e” or even “ArsNmenda es”, I haven’t decided yet.

Filip Ekberg drew my attention to the following alternative solution:

string rand = new string(startingPoint.OrderBy(x => Guid.NewGuid()).ToArray());

View the list of posts on LINQ here.

Using Amazon RedShift with the AWS .NET API Part 10: RedShift in Big Data

Introduction

In the previous post we discussed how to calculate the more complex parts of the aggregation script: the median and nth percentile if the URL response time.

This post will take up the Big Data thread where we left off at the end of the series on Amazon S3. We’ll also refer to what we built [at the end of the series on Elastic MapReduce]. That post took up how to run an aggregation job via the AWS .NET SDK on an available EMR cluster. Therefore the pre-requisite of following the code examples in this post is familiarity with what we discussed in those topics.

In this post our goal is to show an alternative to EMR. We’ll also see how to import the raw data source from S3 into RedShift.

Read more of this post

Breaking up a collection into smaller fixed size collections with C# .NET

Occasionally you may need to break up a large collection into smaller size collections or groups. Consider the following integer list:

List<int> ints = new List<int>() { 1, 4, 2, 5, 2, 6, 5, 43, 6, 234, 645, 2, 12, 45, 13, 5, 3 };

The goal is to break up this list into groups of size 5 and if necessary an additional list for the remaining elements. The following generic method will do the trick:

Read more of this post

5 ways to concatenate strings with C# .NET

There are multiple ways to build a string out of other strings in .NET. Here come 5 of them.

Let’s start with the most obvious one that language learners encounter first, i.e. concatenation done by the ‘+’ operator:

string concatenatedOne = "This " + "is " + "a " + "concatenated " + "string.";

Read more of this post

Using Amazon RedShift with the AWS .NET API Part 9: data warehousing and the star schema 3

Introduction

In the previous post we started formulating a couple of Postgresql statements to fill in the dimension tables and the aggregation values. We saw that it wasn’t particularly difficult to calculate some basic aggregations over combinations of URL and Customer. We ignored the calculation of the median and percentile values and set them to 0. I’ve decided to dedicate a post just for those functions as I thought they were a lot more complex than min, max and average.

Median in RedShift

Median is also a percentile value, it is the 50th percentile. So we could use the percentile function for the median as well but median has its own dedicated function in RedShift. It’s not a compact function, like min() where you can pass in one or more arguments and you get a single value.

Read more of this post

Using Amazon DynamoDb for IP and co-ordinate based geo-location services part 2: MaxMind source files

Introduction

In the previous post we outlined the goals of this series and the tools that we’re going to use. We’ve got as far as downloading MaxMind’s free version of their geo-location source with IPs and longitude-latitude co-ordinates. We saw that the downloaded package had a number of CSV files.

In this post we’ll start off by looking at the structure of those files and how they are connected.

The source files

Read more of this post

Using Amazon DynamoDb for IP and co-ordinate based geo-location services part 1: introduction and goals

Introduction

There are a lot of applications out there that involve finding a point on a map. Putting hotels, restaurants, metro stations etc. on a map on your mobile device has become commonplace. Queries that find the nearest hospital, theatre or school need to be executed in a fast and efficient manner.

In this series we’ll discuss a possible solution to the following geo-location related scenarios:

  • You have a pair of longitude (lng) and latitude (lat) co-ordinates and you’d like to find all locations in a circle around that point or just the nearest relevant location, e.g. the nearest city
  • You have an IP address and you’d like to find the location details of that address, such as New York or Sydney

The series is centred around Amazon cloud based tools. Even if you’re not familiar with Amazon Cloud but looking for a solution to questions similar to the ones outlined above I encourage you to read on – you might just find something useful.

Read more of this post

Combinable enumerations in C# .NET

You’ve probably encountered cases with combined enum values using the pipe character, i.e. the “bitwise or” operator ‘|’:

Size.Large | Size.ExtraLarge

Let’s see an example of how to create such an enum.

The enumeration is decorated with the Flags attribute like in the following example:

Read more of this post

Using Amazon RedShift with the AWS .NET API Part 8: data warehousing and the star schema 2

Introduction

In the previous post we discussed the basics of data warehousing and the different commonly used database schemas associated with it. We also set up a couple of tables: one raw data table which we filled with some raw data records, two dimension tables and a fact table.

In this post we’ll build upon the existing tables and present a couple of useful Postgresql statements in RedShift. Keep in mind that Postgresql in RedShift is very limited compared to the full version so you often need to be resourceful.

Fill in the dimension tables

Recall that we have 2 dimension tables: DimUrl and DimCustomer. Both are referenced from the fact table by their primary keys. We haven’t added any data into them yet. We’ll do that now.

Read more of this post

Elliot Balynn's Blog

A directory of wonderful thoughts

Software Engineering

Web development

Disparate Opinions

Various tidbits

chsakell's Blog

WEB APPLICATION DEVELOPMENT TUTORIALS WITH OPEN-SOURCE PROJECTS

Once Upon a Camayoc

ARCHIVED: Bite-size insight on Cyber Security for the not too technical.