Comparing strings using the CompareInfo class in .NET C#
August 18, 2017
It’s important to be aware of the cultural settings in a globalised application when comparing strings. The CompareInfo class and the CompareOptions enumeration provide a useful way to compare strings based on specific cultures.
One way to get hold of the CompareInfo object belonging to a specific culture is through the CompareInfo property of the CultureInfo class:
CultureInfo swedishCulture = new CultureInfo("sv-SE");
CompareInfo swedishCompareInfo = swedishCulture.CompareInfo;
CultureInfo hungarianCulture = new CultureInfo("hu-HU");
CompareInfo hungarianCompareInfo = hungarianCulture.CompareInfo;
CultureInfo germanCulture = new CultureInfo("de-DE");
CompareInfo germanCompareInfo = germanCulture.CompareInfo;
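The CultureInfo route is not the only one. As a small sketch, the snippet below gets a comparer for the same Swedish culture in two ways and also picks up the comparer of the current culture. The variable names are made up for this illustration, and it assumes the runtime has culture data available, i.e. it is not running in invariant globalization mode:

```csharp
using System;
using System.Globalization;

// Route used in this post: go through a CultureInfo instance.
CompareInfo viaCultureInfo = new CultureInfo("sv-SE").CompareInfo;

// Alternative: ask the CompareInfo class directly by culture name.
CompareInfo viaFactory = CompareInfo.GetCompareInfo("sv-SE");

// The comparer of whatever culture the current thread runs under.
CompareInfo currentCompareInfo = CultureInfo.CurrentCulture.CompareInfo;

Console.WriteLine(viaCultureInfo.Name);
Console.WriteLine(viaFactory.Name);
Console.WriteLine(currentCompareInfo.Name);
```

Both of the first two objects should report the same culture name, so which route you take is mostly a matter of what you already have at hand.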
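The CompareInfo object has a Compare method which returns 0 if the strings are equal, a negative number if the first string sorts before the second and a positive number if the opposite is the case. In practice the culture-sensitive overloads return -1, 0 or 1, but the documentation only guarantees the sign, so it's safer to test for less than or greater than 0. The following comparison of two German strings returns -1 because the comparison is case-sensitive by default: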
int comparison = germanCompareInfo.Compare("mädchen", "Mädchen");
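Here is a minimal sketch of how the result could be interpreted. It branches on the sign of the return value instead of checking for -1 or 1 literally. The verdict variable is only there for the demo:

```csharp
using System;
using System.Globalization;

CompareInfo germanCompareInfo = new CultureInfo("de-DE").CompareInfo;

int comparison = germanCompareInfo.Compare("mädchen", "Mädchen");

// Branch on the sign rather than on the literal values -1 and 1.
string verdict = comparison < 0 ? "sorts before"
               : comparison > 0 ? "sorts after"
               : "is equal to";

Console.WriteLine($"\"mädchen\" {verdict} \"Mädchen\" in {germanCompareInfo.Name}");
```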
This is where the CompareOptions enumeration proves useful. Here are the possible values:
- IgnoreCase: make the comparison case-insensitive
- IgnoreNonSpace: ignore diacritics, which Unicode officially calls non-spacing combining characters. Example: “Madchen” will be equal to “Mädchen” with this flag
- IgnoreSymbols: ignore symbols, like white-space, #, $, % etc. “Mädch$en” and “M#ädchen” will be considered equal with this flag
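- IgnoreKanaType and IgnoreWidth: concern mostly the Japanese language. IgnoreKanaType treats hiragana and katakana characters that stand for the same sound as equal, IgnoreWidth treats full-width and half-width forms of the same character as equal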
- None: the default value if the basic overload of Compare is called
- Ordinal: quick but culture-insensitive comparison based on the UTF-16 code unit value of each character. It cannot be combined with other flags
- OrdinalIgnoreCase: same as Ordinal but the comparison is also case-insensitive. It also has to be used on its own
- StringSort: use the string sort algorithm, where non-alphanumeric symbols such as ‘-’ come before the alphanumeric characters
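The flags above are passed to an overload of Compare that accepts a CompareOptions argument. The following sketch tries a few of them on the German comparer. CompareOptions is a flags enumeration, so the culture-sensitive values can be combined with the | operator, whereas combining Ordinal with another flag is rejected with an ArgumentException. The variable names are invented for this example, and it assumes the runtime has culture data available:

```csharp
using System;
using System.Globalization;

CompareInfo german = new CultureInfo("de-DE").CompareInfo;

// Single flag: case differences no longer matter.
int ignoreCase = german.Compare("mädchen", "Mädchen", CompareOptions.IgnoreCase);

// Combined flags: ignore both case and diacritics.
CompareOptions loose = CompareOptions.IgnoreCase | CompareOptions.IgnoreNonSpace;
int ignoreCaseAndDiacritics = german.Compare("madchen", "Mädchen", loose);

// Culture-sensitive: "a" sorts before "B".
int cultural = german.Compare("a", "B");

// Ordinal: 'a' (U+0061) has a higher code unit value than 'B' (U+0042).
int ordinal = german.Compare("a", "B", CompareOptions.Ordinal);

// OrdinalIgnoreCase: ordinal comparison that disregards case.
int ordinalIgnoreCase = german.Compare("mädchen", "MÄDCHEN", CompareOptions.OrdinalIgnoreCase);

// Ordinal must be used alone, so this combination should throw.
bool ordinalCombinationRejected = false;
try
{
    german.Compare("a", "B", CompareOptions.Ordinal | CompareOptions.IgnoreCase);
}
catch (ArgumentException)
{
    ordinalCombinationRejected = true;
}

Console.WriteLine($"IgnoreCase: {ignoreCase}");
Console.WriteLine($"IgnoreCase | IgnoreNonSpace: {ignoreCaseAndDiacritics}");
Console.WriteLine($"Culture-sensitive sign: {Math.Sign(cultural)}");
Console.WriteLine($"Ordinal sign: {Math.Sign(ordinal)}");
Console.WriteLine($"OrdinalIgnoreCase: {ordinalIgnoreCase}");
Console.WriteLine($"Ordinal | IgnoreCase rejected: {ordinalCombinationRejected}");
```

Note that the ordinal result is printed through Math.Sign. An ordinal comparison may return the difference between the code units rather than a neat -1 or 1, so only the sign is meaningful.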