Using Amazon RedShift with the AWS .NET API Part 5: connecting to master node using ODBC

Introduction

In the previous post we went through some basic C# code to communicate with Amazon RedShift. We saw how to get a list of clusters, start a new cluster and terminate one using the .NET AWS SDK.

We haven’t yet seen how to execute Postgresql commands on RedShift remotely from code. That is the main goal of this post.

Installing the ODBC driver

In this section we’ll prepare our Windows environment to be able to connect to RedShift using ODBC. At times this can be a frustrating experience so I’ll try to give you as much detail as I can.

First off we’ll need an ODBC driver. Navigate to the Postgresql page with a list of ODBC drivers. You’ll see a long list of downloadable zip files. Scroll all the way down to see the latest versions.

ODBC drivers come in two versions: 64-bit and 32-bit and this is an important distinction as connectivity exceptions will arise if you’re trying to connect with the wrong architecture. At the time of writing this post these are the latest versions of both:

32 and 64 bit ODBC drivers on Postgresql home page

The one that ultimately worked for me was the very last file, i.e. psqlodbc_09_03_0400.zip which is actually for 32-bit systems even though I have a 64-bit system type. I cannot explain why this is the case but the 64-bit version didn’t work for me. However, if the below setup process fails for you then you’ll have to repeat it with psqlodbc_09_03_0300-x64-1.zip.

The downloaded package will include a setup file:

32 and 64 bit ODBC drivers on Postgresql home page

Double-click it and step through the installation.

We’ll now check if the driver has been properly registered. There’s an ODBC management tool on Windows. Actually, there are 2 of them: one 32-bit and one 64-bit and it really does matter which one you choose. On Windows 7 the 32-bit tool is located here:

c:\windows\sysWOW64\odbcad32.exe

…and the 64-bit version is located here:

c:\windows\system32\odbcad32.exe

Like above, I went with the 32-bit version. It opens the ODBC data source administrator window:

ODBC data source administrator tool

Select the Drivers tab and scroll down in the list of drivers. You should see two Postgresql drivers:

Postgresql drivers in ODBC tool

We’ll need to reference the Unicode one by its name in the connection string eventually. If you have to redo this process with the 64-bit version then the driver will be called PostgreSQL Unicode(x64) instead.

So far so good. We’ll now test the connectivity to our “urls” database. In RedShift if you click on the cluster name it will open a page where you can view the JDBC and ODBC connection strings. You’ll also see the connection endpoint which will look similar to the following:

RedShift connection endpoint

You’ll also find the JDBC and ODBC connection strings further down on the same page. The ODBC one is not entirely correct, however.

In the ODBC Data source administrator tool select the tab called “System DSN” and click Add.

Create new ODBC DSN

Scroll down and locate the driver called PostgreSQL Unicode and click Finish. This will open a setup window:

POstgreSQL ODBC driver setup window

Give the data source a name, like RedShiftUrls. The database name will be “urls”, the server is the endpoint you saw in RedShift, the port will be 5439 if you accepted the defaults. Fill in the master username and password fields. When you’re done click “Test”. If all goes well you’ll get a confirmation:

ODBC connection successful

Click OK and the click Save on the setup window. The system DSN has been registered:

PostgreSQL connection registered in ODBC tool

Click OK to close the ODBC tool.

Testing the connection in code

Let’s test the connectivity in code as well. Open the RedShift demo console application we started working on in the previous post and add the following method to RedShiftDemoService.cs:

public void TestConnectivityOnMasterNode(string clusterName)
{
	using (IAmazonRedshift redshiftClient = GetRedShiftClient())
	{
		try
		{
			DataSet ds = new DataSet();
			DataTable dt = new DataTable();
			DescribeClustersRequest describeClustersRequest = new DescribeClustersRequest() { ClusterIdentifier = clusterName };
			DescribeClustersResponse describeClustersResponse = redshiftClient.DescribeClusters(describeClustersRequest);
			Cluster firstMatch = describeClustersResponse.Clusters[0];

			String mainDbName = firstMatch.DBName;
			String endpointAddress = firstMatch.Endpoint.Address;
			int endpointPort = firstMatch.Endpoint.Port;
			string masterUsername = firstMatch.MasterUsername;
			string password = "your master password";


			string odbcConnectionString = string.Concat("Driver={PostgreSQL Unicode}; Server=", endpointAddress
				, "; Database=", mainDbName, "; UID=", masterUsername, "; PWD=", password
				, "; Port=", endpointPort);					
			string odbcConnectionDsn = "DSN=UrlsRedShiftDb";
			string query = "SELECT datname FROM pg_database WHERE datistemplate = false;";
			using (OdbcConnection conn = new OdbcConnection(odbcConnectionString))
			{
				try
				{
					conn.Open();
					OdbcDataAdapter da = new OdbcDataAdapter(query, conn);
					da.Fill(ds);
					dt = ds.Tables[0];
					foreach (DataRow row in dt.Rows)
					{
						Console.WriteLine(row["datname"]);
					}

				}
				catch (Exception ex)
				{
					Console.WriteLine("Exception caught while communicating with RedShift master node: ");
					Console.WriteLine(ex.Message);
				}
			}
		}
		catch (AmazonRedshiftException e)
		{
			Console.WriteLine("Postgresql command execution on master node has failed.");
			Console.WriteLine("Amazon error code: {0}",
				string.IsNullOrEmpty(e.ErrorCode) ? "None" : e.ErrorCode);
			Console.WriteLine("Exception message: {0}", e.Message);
		}
	}
}

We first locate the cluster using the DescribeClustersRequest object we saw previously. We then build up the connection string from various elements we can find from the cluster: the DB name, the endpoint, the port and the master user name. You cannot extract the password of course so you’ll need to fill that in. Check the first part of the odbcConnectionString variable: Driver={PostgreSQL Unicode}. If you need to retest the process using the 64-bit driver then this must be “Driver={PostgreSQL Unicode(x64)}” instead.

You’ll see an unused variable “odbcConnectionDsn” in the code. I only left it in as an alternative to building the connection string. We saved the DSN settings in the previous paragraph. You can refer to those settings by its name in a very short connection string like in the example. You can then pass odbcConnectionDsn into the OdbcConnection constructor instead of odbcConnectionString if you so wish.

So, we have a simple query that we already tested before, it gives the database names of the cluster. We open the connection, send the query and print out the results.

Call this method from Main:

static void Main(string[] args)
{
	RedShiftDemoService demoService = new RedShiftDemoService();
	demoService.TestConnectivityOnMasterNode("a-first-test");

	Console.WriteLine("Main done...");
	Console.ReadKey();
}

If everything goes well then we’ll see the same 3 database names as before:

Extracting table names using remote command execution with ODBC

Possible exceptions

I’ve encountered two different exceptions during my tests:

ERROR [IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified

The above occurred when I tried to access the RedShift master node with the wrong Postgresql driver. Seemingly everything was fine, I registered the 64-bit ODBC driver but the driver simply wasn’t found.

ERROR [IM014] [Microsoft][ODBC Driver Manager] The specified DSN contains an architecture mismatch between the Driver and Application

And the above indicates that the ODBC driver was registered but it’s not of the correct architecture, i.e. 32-bit instead of 64 or the way around. You’ll then need to install the correct ODBC driver type.

In the next post we’ll execute some other PostgreSQL requests through our ODBC connection.

View all posts related to Amazon Web Services and Big Data here.

About Andras Nemes
I'm a .NET/Java developer living and working in Stockholm, Sweden.

5 Responses to Using Amazon RedShift with the AWS .NET API Part 5: connecting to master node using ODBC

  1. Mahendra says:

    I am actually making a Amazon Redshift database connection through .net and I ran into this error – “ERROR [IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified.”

    Well, it worked when I installed Amazon Redshift ODBC 32 bit Driver on my 64 bit machine.

  2. Mahendra says:

    I got the Amazon Redshift database connectivity working in .net on 64 bit machine with the 32 bit Amazon Redshift driver.

  3. Chris says:

    This is great – thanks!

    Your site is an ***amazing*** resource and your time and effort are ***greatly*** appreciated.

Leave a comment

Elliot Balynn's Blog

A directory of wonderful thoughts

Software Engineering

Web development

Disparate Opinions

Various tidbits

chsakell's Blog

WEB APPLICATION DEVELOPMENT TUTORIALS WITH OPEN-SOURCE PROJECTS

Once Upon a Camayoc

Bite-size insight on Cyber Security for the not too technical.