Using Amazon DynamoDb for IP and co-ordinate-based geo-location services Part 1: introduction and goals

Introduction

There are a lot of applications out there that involve finding a point on a map. Putting hotels, restaurants, metro stations etc. on a map on your mobile device has become commonplace. Queries that find the nearest hospital, theatre or school need to execute quickly and efficiently.

In this series we’ll discuss a possible solution to the following geo-location related scenarios:

  • You have a pair of longitude (lng) and latitude (lat) co-ordinates and you’d like to find all locations in a circle around that point, or just the nearest relevant location, e.g. the nearest city – the distance maths behind this scenario is sketched just after this list
  • You have an IP address and you’d like to find the location details of that address, such as New York or Sydney
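
The first scenario ultimately rests on a distance calculation: how far apart are two co-ordinate pairs on the Earth’s surface? Here’s a minimal C# sketch of the haversine formula that such queries build upon – just the underlying maths, not the DynamoDb-backed solution this series will develop, and the class and method names are mine:

```csharp
using System;

public static class GeoDistance
{
    const double EarthRadiusKm = 6371.0; // mean Earth radius

    // Great-circle distance between two lat/lng points using the haversine formula
    public static double DistanceKm(double lat1, double lng1, double lat2, double lng2)
    {
        double dLat = ToRadians(lat2 - lat1);
        double dLng = ToRadians(lng2 - lng1);
        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                 + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                 * Math.Sin(dLng / 2) * Math.Sin(dLng / 2);
        return EarthRadiusKm * 2 * Math.Asin(Math.Sqrt(a));
    }

    static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }
}
```

Checking whether a location falls within a circle around the query point is then just a comparison of this distance against the radius.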

The series is centred around Amazon cloud-based tools. Even if you’re not familiar with the Amazon cloud but are looking for a solution to questions similar to the ones outlined above, I encourage you to read on – you might just find something useful.

Read more of this post

Using Amazon RedShift with the AWS .NET API Part 8: data warehousing and the star schema 2

Introduction

In the previous post we discussed the basics of data warehousing and the different commonly used database schemas associated with it. We also set up a couple of tables: one raw data table which we filled with some raw data records, two dimension tables and a fact table.

In this post we’ll build upon the existing tables and present a couple of useful Postgresql statements in RedShift. Keep in mind that Postgresql in RedShift is very limited compared to the full version, so you often need to be resourceful.

Fill in the dimension tables

Recall that we have two dimension tables: DimUrl and DimCustomer. Both are referenced from the fact table by their primary keys. We haven’t added any data to them yet. We’ll do that now.
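
The actual inserts follow behind the cut, but mechanically they are plain INSERT statements pushed through the ODBC connection from C#. A minimal sketch – the column lists below are invented placeholders, not the real table definitions:

```csharp
using System.Data.Odbc;

public static class DimensionLoader
{
    // "conn" is an open OdbcConnection to the RedShift master node
    public static void FillDimensions(OdbcConnection conn)
    {
        string[] inserts =
        {
            "INSERT INTO DimUrl (id, url) VALUES (1, 'http://www.mysite.com')",
            "INSERT INTO DimCustomer (id, name) VALUES (1, 'Favourite Customer Inc')"
        };

        foreach (string insert in inserts)
        {
            using (var command = new OdbcCommand(insert, conn))
            {
                command.ExecuteNonQuery();
            }
        }
    }
}
```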

Read more of this post

Using Amazon RedShift with the AWS .NET API Part 7: data warehousing and the star schema

Introduction

In the previous post we dived into Postgresql statement execution on a RedShift cluster using C# and ODBC. We saw how to execute a single statement or many of them at once. We also tested a parameterised query which can protect us from SQL injection.
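
As a quick refresher, such a parameterised query over ODBC follows the shape sketched below – the table and column names are invented, and "connection" stands for an open OdbcConnection to the cluster:

```csharp
using System;
using System.Data.Odbc;

public static class QueryDemo
{
    // ODBC uses positional '?' placeholders; the bound value is never
    // concatenated into the SQL text, which is what guards against injection
    public static void PrintCustomerName(OdbcConnection connection, int id)
    {
        using (var command = new OdbcCommand(
            "SELECT name FROM DimCustomer WHERE id = ?", connection))
        {
            command.Parameters.AddWithValue("@id", id);
            using (OdbcDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine(reader.GetString(0));
                }
            }
        }
    }
}
```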

In this post we’ll deviate from .NET a little and concentrate on the basics of data warehousing and data mining in RedShift. In particular we’ll learn about a popular schema type often used in conjunction with data mining: the star schema.

Star and snowflake schemas

I went through the basic characteristics of star and snowflake schemas elsewhere on this blog, so I’ll copy the relevant parts here.

Read more of this post

Using Amazon RedShift with the AWS .NET API Part 6: Postgresql to master node using ODBC

Introduction

In the previous post we tested how to connect to the master node in code using the .NET AWS SDK and ODBC. We also executed our first simple Postgresql statement remotely. In this post we’ll continue along the same lines and execute some more Postgresql statements on our master node.
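
Mechanically, each of those statements will travel through the same ODBC plumbing we set up before. Here’s a minimal sketch of a helper that reads a script file and runs its statements one by one – the helper name is mine, and the naive split on ';' assumes the script has no semicolons inside string literals:

```csharp
using System;
using System.Data.Odbc;
using System.IO;

public static class ScriptRunner
{
    // "conn" is an open OdbcConnection to the RedShift master node
    public static void RunScript(OdbcConnection conn, string path)
    {
        string script = File.ReadAllText(path);

        // Naive split: fine for simple scripts without ';' inside literals
        foreach (string statement in script.Split(';'))
        {
            if (string.IsNullOrWhiteSpace(statement)) continue;
            using (var command = new OdbcCommand(statement, conn))
            {
                command.ExecuteNonQuery();
            }
        }
    }
}
```

With the file from the next section in place you’d call something like RunScript(conn, @"c:\path\to\postgresscript.txt"), where the path is wherever you saved it.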

Preparation

We’ll execute most of the scripts we saw in a previous blog post. Prepare a text file called postgresscript.txt with the following content and save it somewhere on your hard drive:

Read more of this post

Using Amazon RedShift with the AWS .NET API Part 5: connecting to master node using ODBC

Introduction

In the previous post we went through some basic C# code to communicate with Amazon RedShift. We saw how to get a list of clusters, start a new cluster and terminate one using the .NET AWS SDK.

We haven’t yet seen how to execute Postgresql commands on RedShift remotely from code. That is the main goal of this post.
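
To set expectations, here’s roughly what the end result will look like: a standard OdbcConnection with a RedShift-flavoured connection string. The driver name, endpoint and credentials below are placeholders – the real values depend on the driver we’re about to install:

```csharp
using System;
using System.Data.Odbc;

class RedshiftConnectionDemo
{
    static void Main()
    {
        // Driver name, endpoint and credentials are placeholders
        string connectionString =
            "Driver={Amazon Redshift (x64)};" +
            "Server=mycluster.abc123xyz.us-east-1.redshift.amazonaws.com;" +
            "Database=mydatabase;UID=masteruser;PWD=secret;Port=5439";

        using (var connection = new OdbcConnection(connectionString))
        {
            connection.Open();
            Console.WriteLine("Connected, server version: " + connection.ServerVersion);
        }
    }
}
```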

Installing the ODBC driver

In this section we’ll prepare our Windows environment to be able to connect to RedShift using ODBC. At times this can be a frustrating experience, so I’ll try to give you as much detail as I can.

Read more of this post

Using Amazon RedShift with the AWS .NET API Part 4: code beginnings

Introduction

In the previous post we looked into how to connect to the Amazon RedShift master node using a tool called WorkBenchJ. We also went through some very basic Postgresql statements and tested an equally basic aggregation script.

In this post we’ll install the .NET SDK and start building some test code.

Note that we’ll be concentrating on showing and explaining the technical code examples related to AWS. We’ll ignore software principles like SOLID and layering so that we can stay focused. It’s your responsibility to organise your code properly. There are numerous posts on this blog that take up topics related to software architecture.
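
To give you a feel for where this post is heading, here’s a minimal sketch of listing your RedShift clusters with the SDK. It assumes your AWS credentials are available to the SDK, e.g. via the app config, and the exact method shapes vary slightly between SDK versions:

```csharp
using System;
using Amazon;
using Amazon.Redshift;
using Amazon.Redshift.Model;

class ListClustersDemo
{
    static void Main()
    {
        // Credentials are picked up from the app config or the SDK credential store
        using (var client = new AmazonRedshiftClient(RegionEndpoint.USEast1))
        {
            DescribeClustersResponse response =
                client.DescribeClusters(new DescribeClustersRequest());

            foreach (Cluster cluster in response.Clusters)
            {
                Console.WriteLine(cluster.ClusterIdentifier + ": " + cluster.ClusterStatus);
            }
        }
    }
}
```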

Installing the SDK

Read more of this post

Using Amazon RedShift with the AWS .NET API Part 3: connecting to the master node

Introduction

In the previous post of this series we quickly looked at what a massively parallel processing database is. We also launched our first Amazon RedShift cluster.

In this post we’ll connect to the master node and start issuing Postgresql commands.

If you don’t have a RedShift cluster available at this point, you can follow the steps in the previous post so that you can try the example code.

Connecting to RedShift

Read more of this post

Using Amazon RedShift with the AWS .NET API Part 2: MPP definition and first cluster

Introduction

In the previous post we went through Amazon RedShift at an introductory level. In general we can say that it is a highly efficient data storage and data mining tool especially suited for Big Data scenarios. However, it also comes with serious limitations regarding the available Postgresql language features.

In this post we’ll first summarise an important term in conjunction with RedShift: MPP. We’ll then go on to create our first database in the RedShift GUI.

Read more of this post

Using Amazon RedShift with the AWS .NET API Part 1: introduction

Introduction

Amazon RedShift is Amazon’s data warehousing solution and is especially well-suited for Big Data scenarios where petabytes of data must be stored and analysed. It follows a columnar DBMS architecture and it was designed especially for heavy data mining requests.

This is the fifth – and probably for a while the last – installment of a series dedicated to out-of-the-box components built and powered by Amazon Web Services (AWS) enabling Big Data handling. The components we have looked at so far are the following:

Read more of this post

Using Amazon Elastic MapReduce with the AWS .NET API Part 8: connection to our Big Data demo

Introduction

In the previous post we saw how to start an Amazon EMR cluster and have it execute a Hive script which performs a basic aggregation step.
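
For reference, that start-up boiled down to a single RunJobFlow call, roughly like the sketch below. The bucket, script location, instance types and AMI version are all placeholders, and the exact request shape varies a little between SDK versions:

```csharp
using System;
using System.Collections.Generic;
using Amazon;
using Amazon.ElasticMapReduce;
using Amazon.ElasticMapReduce.Model;

class EmrHiveDemo
{
    static void Main()
    {
        using (var client = new AmazonElasticMapReduceClient(RegionEndpoint.USEast1))
        {
            // Classic script-runner pattern for executing a Hive script as a step;
            // the script location in S3 is a placeholder
            var hiveStep = new StepConfig
            {
                Name = "Run aggregation Hive script",
                ActionOnFailure = "TERMINATE_JOB_FLOW",
                HadoopJarStep = new HadoopJarStepConfig
                {
                    Jar = "s3://elasticmapreduce/libs/script-runner/script-runner.jar",
                    Args = new List<string>
                    {
                        "s3://elasticmapreduce/libs/hive/hive-script",
                        "--run-hive-script", "--args",
                        "-f", "s3://mybucket/aggregation.hql"
                    }
                }
            };

            var request = new RunJobFlowRequest
            {
                Name = "Big Data demo cluster",
                AmiVersion = "3.11.0", // placeholder AMI version
                Instances = new JobFlowInstancesConfig
                {
                    MasterInstanceType = "m1.medium",
                    SlaveInstanceType = "m1.medium",
                    InstanceCount = 3,
                    KeepJobFlowAliveWhenNoSteps = false
                },
                Steps = new List<StepConfig> { hiveStep }
            };

            Console.WriteLine("Job flow id: " + client.RunJobFlow(request).JobFlowId);
        }
    }
}
```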

This post will take up the Big Data thread where we left off at the end of the previous two series on the blob storage component Amazon S3 and the NoSql data store DynamoDb. Therefore the prerequisite for following the code examples in this post is familiarity with what we discussed in those series.

Read more of this post
