Globalization | Exercises in .NET with Andras Nemes

Using NumberStyles to parse numbers in C# .NET Part 2

March 13, 2015

In the previous post we looked at some basic usage of the NumberStyles enumeration. The enumeration allows us to parse other representations of numeric values.

Occasionally negative numbers are shown with a trailing negative sign like this: “13-”. There’s a solution for that:

string number = "13-";
int parsed = int.Parse(number, NumberStyles.AllowTrailingSign);

“parsed” will be -13 as expected.

Using NumberStyles to parse numbers in C# .NET Part 1

March 11, 2015

There are a lot of number formats out there depending on the industry we’re looking at. E.g. negative numbers can be represented in several different ways:

-14
(14)
14-
14.00-
(14,00)

…and so on. Accounting, finance and other highly “numeric” fields have their own standards for representing numbers. Your application may need to parse all these strings and convert them into proper numeric values. The static Parse methods of the numeric types, such as int, double and decimal, all accept a NumberStyles enumeration. This enumeration is located in the System.Globalization namespace.
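As a minimal sketch of how the notations listed above could be handled, the NumberStyles flags can be combined with the bitwise OR operator. The helper name ParseAccounting and the hand-built NumberFormatInfo with a decimal comma are assumptions made for this illustration only:

```csharp
using System;
using System.Globalization;

public static class NegativeNumberParsing
{
    // Hypothetical helper: accepts a leading sign, a trailing sign,
    // parentheses and a decimal separator in a single call.
    public static decimal ParseAccounting(string text, IFormatProvider provider)
    {
        NumberStyles styles = NumberStyles.AllowLeadingSign
            | NumberStyles.AllowTrailingSign
            | NumberStyles.AllowParentheses
            | NumberStyles.AllowDecimalPoint;
        return decimal.Parse(text, styles, provider);
    }

    public static void Main()
    {
        // The invariant culture uses '.' as the decimal separator.
        CultureInfo invariant = CultureInfo.InvariantCulture;

        // Assumption: a format with ',' as the decimal separator for "(14,00)".
        NumberFormatInfo commaDecimal = new NumberFormatInfo
        {
            NumberDecimalSeparator = ",",
            NumberGroupSeparator = "."
        };

        Console.WriteLine(ParseAccounting("-14", invariant).ToString(invariant));   // -14
        Console.WriteLine(ParseAccounting("(14)", invariant).ToString(invariant));  // -14
        Console.WriteLine(ParseAccounting("14-", invariant).ToString(invariant));   // -14
        // The next two values are also negative fourteen, parsed from decimal notation.
        Console.WriteLine(ParseAccounting("14.00-", invariant).ToString(invariant));
        Console.WriteLine(ParseAccounting("(14,00)", commaDecimal).ToString(invariant));
    }
}
```

Passing the format provider explicitly keeps the result independent of the culture settings of the machine that runs the code.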

Getting the byte array of a string depending on Encoding in C# .NET

December 24, 2014

You can take any string in C# and view its byte array data depending on the Encoding type. You can get hold of an encoding using the Encoding.GetEncoding method. Some frequently used code pages have their own shortcuts:

Encoding.ASCII
Encoding.BigEndianUnicode
Encoding.Unicode – this is UTF-16 (little-endian)
Encoding.UTF7
Encoding.UTF32
Encoding.UTF8

Once you’ve got hold of an encoding you can call its GetBytes method to return the byte array representation of a string. You can use this method whenever another method requires a byte array input instead of a string.

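For backward compatibility, positions 0-127 are encoded identically in ASCII and in ASCII-compatible encodings such as UTF-8 and most legacy code pages. This range covers the standard English alphabet in both lower and upper case, the digits, punctuation and some control characters. So if you only use characters from this range, the byte values in the array will be the same in those encodings. UTF-16 and UTF-32 are the exception: they use two and four bytes per character in this range, so their arrays are longer. You can view the ASCII characters here: ASCII character set.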

The following code will print the same values for both the ASCII and Chinese encoding types:

string input = "I am feeling great";
byte[] asciiEncoded = Encoding.ASCII.GetBytes(input);
Console.WriteLine("Ascii");
foreach (byte b in asciiEncoded)
{
	Console.WriteLine(b);
}

Encoding chinese = Encoding.GetEncoding("Chinese");
byte[] chineseEncoded = chinese.GetBytes(input);
Console.WriteLine("Chinese");
foreach (byte b in chineseEncoded)
{
	Console.WriteLine(b);
}

If you’re trying to ASCII-encode a Unicode string which contains non-ASCII characters then you’ll see the ASCII byte value of 63, i.e. ‘?’:

string input = "öåä I am feeling great";
byte[] asciiEncoded = Encoding.ASCII.GetBytes(input);
Console.WriteLine("Ascii");
foreach (byte b in asciiEncoded)
{
	Console.WriteLine(b);
}

The first 3 positions will print 63 as the Swedish ‘öåä’ characters cannot be represented in ASCII. E.g. whenever you visit a website and see question marks and other funny characters instead of proper text then you know that there’s an encoding problem: typically the page was encoded with one encoding type and decoded with another.
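To see the data loss more directly, here is a small sketch that encodes a string and decodes it again with the same encoding. The RoundTrip helper is a made-up name for this example:

```csharp
using System;
using System.Text;

public static class EncodingRoundTrip
{
    // Hypothetical helper: string -> bytes -> string using one encoding.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        string input = "öåä I am feeling great";

        // ASCII replaces each unsupported character with '?' (byte value 63),
        // so the original characters cannot be recovered.
        Console.WriteLine(RoundTrip(input, Encoding.ASCII) == input);  // False

        // UTF-8 can represent every Unicode character, so nothing is lost.
        Console.WriteLine(RoundTrip(input, Encoding.UTF8) == input);   // True

        // Each of the three Swedish characters takes two bytes in UTF-8.
        Console.WriteLine(Encoding.UTF8.GetByteCount("öåä"));          // 6
    }
}
```

The comparison results are printed instead of the strings themselves because what the console shows for non-ASCII text depends on the console's own encoding.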

View all posts related to Globalization here.

Filed under .NET, Globalization Tagged with c#, encoding, globalization

Getting the list of supported Encoding types in .NET

December 23, 2014

Every text file and string is encoded using one of many encoding standards. Normally .NET will handle encoding automatically but there are times when you need to dig into the internals for encoding and decoding. It’s very simple to retrieve the list of supported encoding types, a.k.a. code pages, in .NET:

EncodingInfo[] codePages = Encoding.GetEncodings();
foreach (EncodingInfo codePage in codePages)
{
	Console.WriteLine("Code page ID: {0}, IANA name: {1}, human-friendly display name: {2}", codePage.CodePage, codePage.Name, codePage.DisplayName);
}

Example output:

Code page ID: 37, IANA name: IBM037, human-friendly display name: IBM EBCDIC (US-Canada)
Code page ID: 852, IANA name: ibm852, human-friendly display name: Central European (DOS)
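Going the other way, an ID or IANA name from such a list can be handed to Encoding.GetEncoding. The sketch below sticks to three encodings that should be available everywhere. The Describe helper is an invented name. Note that on .NET Core and later only a handful of encodings are available out of the box, so the list above may be much shorter there unless the code pages provider is registered:

```csharp
using System;
using System.Text;

public static class EncodingLookup
{
    // Hypothetical helper: shows the code page ID and the IANA name.
    public static string Describe(Encoding encoding)
    {
        return string.Format("{0}: {1}", encoding.CodePage, encoding.WebName);
    }

    public static void Main()
    {
        // Look up by code page ID.
        Console.WriteLine(Describe(Encoding.GetEncoding(65001)));      // 65001: utf-8

        // Look up by IANA name.
        Console.WriteLine(Describe(Encoding.GetEncoding("utf-16")));   // 1200: utf-16
        Console.WriteLine(Describe(Encoding.GetEncoding("us-ascii"))); // 20127: us-ascii
    }
}
```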

View all posts related to Globalization here.

Filed under .NET, Globalization Tagged with c#, encoding, globalization

Finding the user’s supported cultures using the CultureInfo class in .NET C#

September 16, 2014

The CultureInfo class has a static method to retrieve the supported locales on the user’s machine:

CultureInfo[] supportedCultures = CultureInfo.GetCultures(CultureTypes.AllCultures);

The GetCultures method accepts a CultureTypes enumeration:

AllCultures: show all cultures regardless of the type
FrameworkCultures: show all specific and neutral cultures that ship with .NET
InstalledWin32Cultures: all cultures installed on Windows
NeutralCultures: all language-specific, i.e. neutral cultures where the region which determines the specific culture is omitted
ReplacementCultures: custom cultures created by the user that replace an existing culture
SpecificCultures: the most precise culture type where both the language and the region are denoted
UserCustomCulture: custom cultures
WindowsOnlyCultures: deprecated, yields an array with 0 elements in .NET 4+

So in case you’d like to find all specific cultures and their names you can write as follows:

CultureInfo[] supportedCultures = CultureInfo.GetCultures(CultureTypes.SpecificCultures);
List<CultureInfo> ordered = supportedCultures.OrderBy(c => c.Name).ToList();
foreach (CultureInfo ci in ordered)
{
	Console.WriteLine(string.Concat(ci.Name, ": ", ci.EnglishName));
}

However, if you’re only interested in the supported languages then the following will do:

CultureInfo[] supportedCultures = CultureInfo.GetCultures(CultureTypes.NeutralCultures);
List<CultureInfo> ordered = supportedCultures.OrderBy(c => c.Name).ToList();
foreach (CultureInfo ci in ordered)
{
	Console.WriteLine(string.Concat(ci.Name, ": ", ci.EnglishName));
}

Read all posts related to Globalisation in .NET here.

Filed under .NET, Globalization Tagged with c#, cultureinfo

Finding the user’s current region using RegionInfo in .NET C#

August 15, 2014

The CultureInfo object helps a lot in finding information about the user’s current culture. However, on occasion it may not be enough and you need to find out more about that user’s regional characteristics. You can easily retrieve a RegionInfo object from CultureInfo which will hold information about a particular country or region.

You can find the current region in two ways from CultureInfo:

CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
RegionInfo regionInfo = new RegionInfo(cultureInfo.LCID);
// or 
regionInfo = new RegionInfo(cultureInfo.Name);

string englishName = regionInfo.EnglishName;
string currencySymbol = regionInfo.CurrencySymbol;
string currencyEnglishName = regionInfo.CurrencyEnglishName;
string currencyLocalName = regionInfo.CurrencyNativeName;

My computer is set to use Swedish-Sweden as the specific culture so I get the following values from top to bottom:

Sweden
kr
Swedish krona
Svensk krona

If I change the current culture to my home country, i.e. Hungary…

CultureInfo hungaryCulture = new CultureInfo("hu-HU");
Thread.CurrentThread.CurrentCulture = hungaryCulture;
regionInfo = new RegionInfo(hungaryCulture.LCID);
englishName = regionInfo.EnglishName;
currencySymbol = regionInfo.CurrencySymbol;
currencyEnglishName = regionInfo.CurrencyEnglishName;
currencyLocalName = regionInfo.CurrencyNativeName;

…then the values are of course adjusted accordingly:

Hungary
Ft
Hungarian Forint
forint

Read all posts related to Globalisation in .NET here.

Filed under .NET, C# language features, Globalization Tagged with c#, cultureinfo, regioninfo

Using DateTimeFormatInfo to localise date and time in .NET C#

August 13, 2014

Every programmer loves working with dates and time, right? Whether you like it or not, you have to show dates in a format that the viewer understands. You should not show dates presented according to the US format in Japan and vice versa.

The DateTimeFormatInfo class includes a range of useful properties to localise date and time. The entry point to the DateTimeFormatInfo class is CultureInfo. E.g. if you’d like to format a date according to various cultures – Swedish, Hungarian and German – then you can do it as follows:

CultureInfo swedishCulture = new CultureInfo("sv-SE");
DateTimeFormatInfo swedishDateFormat = swedishCulture.DateTimeFormat;

CultureInfo hungarianCulture = new CultureInfo("hu-HU");
DateTimeFormatInfo hungarianDateFormat = hungarianCulture.DateTimeFormat;

CultureInfo germanCulture = new CultureInfo("de-DE");
DateTimeFormatInfo germanDateFormat = germanCulture.DateTimeFormat;
			
DateTime utcNow = DateTime.UtcNow;

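// Pass the culture as well so that the day and month names are localised, not just the pattern
string formattedDateSweden = utcNow.ToString(swedishDateFormat.FullDateTimePattern, swedishCulture);
string formattedDateHungary = utcNow.ToString(hungarianDateFormat.FullDateTimePattern, hungarianCulture);
string formattedDateGermany = utcNow.ToString(germanDateFormat.FullDateTimePattern, germanCulture);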

…which yields the following formatted dates:

den 12 juni 2014 20:08:30
2014. június 12. 20:08:30
Donnerstag, 12. Juni 2014 20:08:30

DateTimeFormatInfo includes patterns for other types of date representations, such as MonthDayPattern, LongDatePattern etc.

You can also get the names of the days and months:

string[] swedishDays = swedishDateFormat.DayNames;
string[] germanDays = germanDateFormat.DayNames;
string[] hungarianDays = hungarianDateFormat.DayNames;

string[] swedishMonths = swedishDateFormat.MonthNames;
string[] hungarianMonths = hungarianDateFormat.MonthNames;
string[] germanMonths = germanDateFormat.MonthNames;

You can do a lot more with DateTimeFormatInfo:

The Calendar associated with the culture
The date separator
Abbreviated month and day names
First day of the week

…and more. I encourage you to inspect the available properties of the DateTimeFormatInfo object with IntelliSense in Visual Studio.
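A short sketch of reading some of the properties listed above. The Summarize helper is an invented name, and the invariant culture is used for the commented output because its values do not depend on the machine:

```csharp
using System;
using System.Globalization;

public static class DateFormatInspection
{
    // Hypothetical helper: collects a few DateTimeFormatInfo properties in one line.
    public static string Summarize(DateTimeFormatInfo format)
    {
        return string.Join(" | ",
            format.Calendar.GetType().Name,   // calendar associated with the culture
            format.DateSeparator,             // date separator
            format.AbbreviatedDayNames[0],    // abbreviated name of the first array entry (Sunday)
            format.AbbreviatedMonthNames[0],  // abbreviated name of the first month
            format.FirstDayOfWeek.ToString());
    }

    public static void Main()
    {
        Console.WriteLine(Summarize(CultureInfo.InvariantCulture.DateTimeFormat));
        // GregorianCalendar | / | Sun | Jan | Sunday

        // The same call for a specific culture. The values come from the
        // culture data installed on the machine, so no output is shown here.
        Console.WriteLine(Summarize(new CultureInfo("sv-SE").DateTimeFormat));
    }
}
```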

Read all posts related to Globalisation in .NET here.

Filed under .NET, C# language features, Globalization Tagged with c#, cultureinfo, DateTimeFormatInfo

Comparing strings using the CompareInfo class in .NET C#

August 12, 2014

It’s important to be aware of the cultural settings in a globalised application when comparing strings. The CompareInfo class and the CompareOptions enumeration provide a useful way to compare strings based on specific cultures.

One way to get hold of the CompareInfo class belonging to a specific culture is through the CultureInfo class:

CultureInfo swedishCulture = new CultureInfo("sv-SE");
CompareInfo swedishCompareInfo = swedishCulture.CompareInfo;

CultureInfo hungarianCulture = new CultureInfo("hu-HU");
CompareInfo hungarianCompareInfo = hungarianCulture.CompareInfo;

CultureInfo germanCulture = new CultureInfo("de-DE");
CompareInfo germanCompareInfo = germanCulture.CompareInfo;

The CompareInfo object has a Compare method which returns 0 if the strings are equal, a negative number (typically -1) if the first string sorts before the second and a positive number (typically 1) if the opposite is the case. The following comparison of two German strings returns -1 as by default the comparison is case-sensitive:

int comparison = germanCompareInfo.Compare("mädchen", "Mädchen");

This is where the CompareOptions enumeration proves useful. Here are the possible values:

IgnoreCase: make the comparison case-insensitive
IgnoreNonSpace: ignore diacritics, or officially non-spacing combining characters in Unicode. Example: “Madchen” will be equal to “Mädchen” with this flag
IgnoreSymbols: ignore symbols, like white-space, #, $, % etc. “Mädch$en” and “M#ädchen” will be considered equal with this flag
IgnoreKana and IgnoreWidth: concern mostly the Japanese language
None: the default value if the basic overload of Compare is called
Ordinal: quick but culture-insensitive comparison based on the Unicode value of each character
OrdinalIgnoreCase: same as Ordinal but the comparison is also case-insensitive
StringSort: use a sort algorithm where non-alphanumeric symbols, such as ‘-’, come before the alphanumeric characters
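The options are passed as the third argument of Compare. The following is a rough sketch using the CompareInfo of the invariant culture. The wrapper method and its Math.Sign normalisation are choices made for this example, not part of the API:

```csharp
using System;
using System.Globalization;

public static class CompareOptionsDemo
{
    // Hypothetical wrapper: compares with the invariant culture and
    // normalises the result to -1, 0 or 1.
    public static int Compare(string first, string second, CompareOptions options)
    {
        CompareInfo compareInfo = CultureInfo.InvariantCulture.CompareInfo;
        return Math.Sign(compareInfo.Compare(first, second, options));
    }

    public static void Main()
    {
        // Case matters by default, so the result is non-zero.
        Console.WriteLine(Compare("mädchen", "Mädchen", CompareOptions.None));

        // Case-insensitive: the strings are considered equal.
        Console.WriteLine(Compare("mädchen", "Mädchen", CompareOptions.IgnoreCase)); // 0

        // Diacritics and symbols ignored, as described in the list above.
        Console.WriteLine(Compare("Madchen", "Mädchen", CompareOptions.IgnoreNonSpace));
        Console.WriteLine(Compare("Mädch$en", "M#ädchen", CompareOptions.IgnoreSymbols));

        // Ordinal compares character codes: 'a' (97) is greater than 'B' (66).
        Console.WriteLine(Compare("a", "B", CompareOptions.Ordinal));                // 1
        Console.WriteLine(Compare("a", "A", CompareOptions.OrdinalIgnoreCase));      // 0
    }
}
```

Flags such as IgnoreCase and IgnoreNonSpace can also be combined with the bitwise OR operator, whereas Ordinal and OrdinalIgnoreCase must be used on their own.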

Read all posts related to Globalisation in .NET here.

Filed under .NET, C# language features, Globalization Tagged with c#, compareinfo, culture, cultureinfo

Finding the current culture settings using the CultureInfo class in .NET C#

August 8, 2014

Finding the culture settings – the locale – of a thread is straightforward in .NET:

CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;

We extract the current culture from the current thread. The CultureInfo class holds a number of properties to extract information from it. Examples:

string cultureName = cultureInfo.Name;
string cultureDisplayName = cultureInfo.DisplayName;
string nativeName = cultureInfo.NativeName;
string englishName = cultureInfo.EnglishName;
string cultureAbbreviation = cultureInfo.TwoLetterISOLanguageName;

As my computer is set to run with Swedish settings I got the following values from top to bottom:

sv-SE
Swedish (Sweden)
svenska (Sverige)
Swedish (Sweden)
sv

Cultures are represented by the following format:

The language, “sv” in the above case, is the ISO language name. On its own it forms a neutral culture
The region, “SE” in the above case, denotes the geographical location of the culture
The two elements are connected with a hyphen “-”, i.e. “sv-SE” in the above example, and together they form a specific culture

Specific culture is the most precise description of the user’s locale. It not only designates the language but the region as well. E.g. Swedish is spoken in Finland as well so the neutral culture “sv” is not enough to locate the user. There’s a specific “sv-FI” format for that. This aspect becomes a lot more important with broadly used languages such as French or English. French spoken in France is different from French spoken in Canada. Therefore we need to use fr-FR and fr-CA for the purpose of formatting and proper localisation.
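The relationship between neutral and specific cultures can also be inspected in code. A minimal sketch, where Describe is a made-up helper name and the culture names are the ones mentioned above:

```csharp
using System;
using System.Globalization;

public static class CultureParts
{
    // Hypothetical helper: shows whether a culture is neutral and what its parent is.
    public static string Describe(string cultureName)
    {
        CultureInfo culture = new CultureInfo(cultureName);
        return string.Format("{0}: neutral={1}, parent='{2}', language={3}",
            culture.Name,
            culture.IsNeutralCulture,
            culture.Parent.Name,
            culture.TwoLetterISOLanguageName);
    }

    public static void Main()
    {
        // A neutral culture: its parent is the invariant culture, whose name is empty.
        Console.WriteLine(Describe("sv"));     // sv: neutral=True, parent='', language=sv

        // Specific cultures: the parent is the neutral culture of the language.
        Console.WriteLine(Describe("sv-SE"));  // sv-SE: neutral=False, parent='sv', language=sv
        Console.WriteLine(Describe("sv-FI"));  // sv-FI: neutral=False, parent='sv', language=sv
        Console.WriteLine(Describe("fr-CA"));  // fr-CA: neutral=False, parent='fr', language=fr
    }
}
```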

Besides neutral and specific cultures there’s a third type of culture called invariant culture. This is not tied to any specific culture but is closest to English. It may be tempting to use this culture for other purposes such as date and time comparisons but you’ll most likely get false results in a globalised, culture aware application.

The current culture is used for formatting purposes behind the scenes:

decimal price = 10.43M;
string formattedDecimalDefault = price.ToString("C");

I got “10,43 kr” where ‘kr’ denotes the currency in Sweden, i.e. Swedish krona.

You can also set the current culture of the thread:

CultureInfo usCulture = new CultureInfo("en-US");
Thread.CurrentThread.CurrentCulture = usCulture;
string formattedPriceUs = price.ToString("C");

This immediately yields “$10.43”.

The invariant culture is denoted by an empty string:

CultureInfo invariantCulture = new CultureInfo("");
Thread.CurrentThread.CurrentCulture = invariantCulture;
// Yields "¤10.43": the invariant culture uses the generic currency sign
string formattedPriceInvariant = price.ToString("C");

Neutral cultures are denoted by the ISO language names, e.g.:

CultureInfo englishCulture = new CultureInfo("en");
Thread.CurrentThread.CurrentCulture = englishCulture;
string formattedPriceEnglish = price.ToString("C");

This will format the price according to the en-US specific culture.
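If you'd rather not change the culture of the thread, the culture can be passed directly to ToString as a format provider. A small sketch with an invented helper name, FormatPrice:

```csharp
using System;
using System.Globalization;

public static class PriceFormatting
{
    // Hypothetical helper: formats a price as currency for the given culture
    // without touching Thread.CurrentThread.CurrentCulture.
    public static string FormatPrice(decimal price, CultureInfo culture)
    {
        return price.ToString("C", culture);
    }

    public static void Main()
    {
        decimal price = 10.43M;

        // The invariant culture uses the generic currency sign.
        Console.WriteLine(FormatPrice(price, CultureInfo.InvariantCulture)); // ¤10.43

        // Specific cultures use their own symbol, separator and symbol position.
        Console.WriteLine(FormatPrice(price, new CultureInfo("en-US")));     // $10.43
        Console.WriteLine(FormatPrice(price, new CultureInfo("sv-SE")));     // e.g. 10,43 kr
    }
}
```

This keeps the formatting decision local to the call, which can be easier to reason about than a thread-wide setting.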

As you typed “Thread.CurrentThread.” in the editor you may have noticed the CurrentUICulture property. That also returns a CultureInfo object. The two properties have different jobs: CurrentCulture controls how prices, dates, times and numbers are formatted, parsed and sorted, while CurrentUICulture tells the resource manager which localised resources, e.g. translated UI texts, to load. CurrentCulture and CurrentUICulture will most often be the same, but be aware that they can be different. You might show the UI in one language but format the values according to another culture. Both properties can be set during an application’s lifetime, but the UI culture is normally set at start-up, before any resources are loaded.

Read all posts related to Globalisation in .NET here.

Filed under .NET, C# language features, Globalization Tagged with c#, culture, cultureinfo
