Replacing substrings using Regex in C# .NET: string cleaning example

We often need to sanitize string inputs whose values are outside our control. Some of those inputs can come with unwanted characters. The following method uses Regex to remove every character that is not a word character (a letter, digit or underscore, as matched by \w) or one of ‘@’, ‘-’ and ‘.’:

private static string RemoveNonAlphaNumericCharacters(String input)
{
	return Regex.Replace(input, @"[^\w\.@-]", string.Empty);
}

Calling this method like…

string cleanString = RemoveNonAlphaNumericCharacters("()h{e??l#'l>>o<<");

…returns “hello”.
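
Note that \w is broader than the ASCII letters and digits: in .NET it also matches the underscore and Unicode letters and digits. If only ASCII alphanumerics should survive, an explicit character class is one option. The following is a minimal sketch for comparison; the class name StringCleaner and the method KeepAsciiAlphaNumerics are hypothetical and not part of the original post:

```csharp
using System.Text.RegularExpressions;

public static class StringCleaner
{
	// The method from the post: \w keeps Unicode letters, digits and '_'.
	public static string RemoveNonAlphaNumericCharacters(string input)
	{
		return Regex.Replace(input, @"[^\w\.@-]", string.Empty);
	}

	// Hypothetical stricter variant: keeps only ASCII letters, ASCII digits, '.', '@' and '-'.
	public static string KeepAsciiAlphaNumerics(string input)
	{
		return Regex.Replace(input, @"[^a-zA-Z0-9.@-]", string.Empty);
	}
}
```

For the input “josé_99@example.com!” the first method returns “josé_99@example.com”, whereas the stricter variant returns “jos99@example.com”.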

View all posts related to string and text operations here.

Replacing substrings using Regex in C# .NET: phone number example

In the date format example we saw how Regex can reformat date strings. Let’s look at another example: changing the following phone number formats…

  • (xxx)xxx-xxxx: (123)456-7890
  • (xxx) xxx-xxxx: (123) 456-7890
  • xxx-xxx-xxxx: 123-456-7890
  • xxxxxxxxxx: 1234567890

…into (xxx) xxx-xxxx.

Here’s a possible solution:

private static string ReformatPhone(string phone)
{
	Match match = Regex.Match(phone, @"^\(?(\d{3})\)?[\s\-]?(\d{3})\-?(\d{4})$");
	// Return unrecognised input unchanged instead of the misleading "() -".
	if (!match.Success) return phone;
	return string.Format("({0}) {1}-{2}", match.Groups[1], match.Groups[2], match.Groups[3]);
}

Calling this function with any of the four examples above returns “(123) 456-7890”.
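
Since the task is a substitution, the same pattern can also be passed to Regex.Replace, with $1, $2 and $3 referring to the three captured groups. This is only a sketch of that alternative; the class PhoneFormatter and the method ReformatPhoneWithReplace are hypothetical names:

```csharp
using System.Text.RegularExpressions;

public static class PhoneFormatter
{
	// Same pattern as in the post; the three parenthesised groups capture the digits.
	private const string PhonePattern = @"^\(?(\d{3})\)?[\s\-]?(\d{3})\-?(\d{4})$";

	// Hypothetical variant of ReformatPhone built on Regex.Replace.
	// Input that does not match the pattern is returned unchanged.
	public static string ReformatPhoneWithReplace(string phone)
	{
		return Regex.Replace(phone, PhonePattern, "($1) $2-$3");
	}
}
```

With this variant there is no Match object to inspect: a string that does not fit any of the formats simply comes back as it was passed in.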

Extracting information from a text using Regex and Match in C# .NET

Occasionally you need to extract some information from a free-text form. Consider the following text:

First name: Elvis
Last name: Presley
Address: 1 Heaven Street
City: Memphis
State: TN
Zip: 12345

Say you need to extract the full name, the address, the city, the state and the zip code into a pipe-delimited string. The following function is one option:

private static string ExtractJist(string freeText)
{
	// Lazy .*? followed by \r?$ keeps a Windows '\r' out of the captured values.
	StringBuilder patternBuilder = new StringBuilder();
	patternBuilder.Append(@"First name: (?<fn>.*?)\r?$\n")
		.Append(@"Last name: (?<ln>.*?)\r?$\n")
		.Append(@"Address: (?<address>.*?)\r?$\n")
		.Append(@"City: (?<city>.*?)\r?$\n")
		.Append(@"State: (?<state>.*?)\r?$\n")
		.Append(@"Zip: (?<zip>.*?)\r?$");
	Match match = Regex.Match(freeText, patternBuilder.ToString(), RegexOptions.Multiline | RegexOptions.IgnoreCase);
	// Without this check a non-matching input would produce " ||||".
	if (!match.Success) return string.Empty;
	string fullname = string.Concat(match.Groups["fn"], " ", match.Groups["ln"]);
	string address = match.Groups["address"].ToString();
	string city = match.Groups["city"].ToString();
	string state = match.Groups["state"].ToString();
	string zip = match.Groups["zip"].ToString();
	return string.Concat(fullname, "|", address, "|", city, "|", state, "|", zip);
}

Call the function as follows:

string source = @"First name: Elvis
Last name: Presley
Address: 1 Heaven Street
City: Memphis
State: TN
Zip: 12345
";
string extracted = ExtractJist(source);
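
A single fixed pattern requires the fields to appear in exactly this order. When the order may vary, one alternative is to match each “Label: value” line separately and collect the results in a dictionary. The sketch below assumes one field per line; the class FreeTextParser and the method ParseFields are hypothetical names used for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FreeTextParser
{
	// Hypothetical helper: collects every "Label: value" line into a dictionary.
	public static Dictionary<string, string> ParseFields(string freeText)
	{
		// ^ anchors at each line start (Multiline); [^\r\n]* stops before any line break.
		Regex lineRegex = new Regex(@"^(?<key>[^:\r\n]+):[ \t]*(?<value>[^\r\n]*)", RegexOptions.Multiline);
		// Labels are looked up case-insensitively.
		Dictionary<string, string> fields = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
		foreach (Match match in lineRegex.Matches(freeText))
		{
			// A repeated label overwrites the earlier value.
			fields[match.Groups["key"].Value.Trim()] = match.Groups["value"].Value.Trim();
		}
		return fields;
	}
}
```

The pipe-delimited string can then be assembled from the individual entries, for example with string.Join("|", ...) over fields["Address"], fields["City"] and so on.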

Phone and ZIP format checker examples in C# .NET

It’s a common task to check the validity of an input in any application. Some inputs must follow a specific format, like phone numbers and ZIP codes. Here are two regular expression examples that help with that:

private static bool IsValidPhone(string candidate)
{
	return Regex.IsMatch(candidate, @"^\(?\d{3}\)?[\s\-]?\d{3}\-?\d{4}$");
}

The above function returns true for the following formats:

  • (xxx)xxx-xxxx: (123)456-7890
  • (xxx) xxx-xxxx: (123) 456-7890
  • xxx-xxx-xxxx: 123-456-7890
  • xxxxxxxxxx: 1234567890
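
Because each parenthesis is optional on its own, the pattern also accepts unbalanced input such as “(123456-7890”. If only the four formats listed above should pass, one option is an alternation for the area code. This is a sketch, not a complete phone validator; the class PhoneValidator and the method IsValidPhoneStrict are hypothetical names:

```csharp
using System.Text.RegularExpressions;

public static class PhoneValidator
{
	// The pattern from the post: each parenthesis is optional on its own.
	public static bool IsValidPhone(string candidate)
	{
		return Regex.IsMatch(candidate, @"^\(?\d{3}\)?[\s\-]?\d{3}\-?\d{4}$");
	}

	// Hypothetical stricter variant: the area code is either "(xxx)" followed by an
	// optional whitespace character, or "xxx" followed by an optional hyphen.
	public static bool IsValidPhoneStrict(string candidate)
	{
		return Regex.IsMatch(candidate, @"^(?:\(\d{3}\)\s?|\d{3}-?)\d{3}-?\d{4}$");
	}
}
```

The stricter variant is narrower in other respects too: it rejects “123 456-7890”, which the original pattern accepts.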

Let’s now see a possible solution for a US ZIP code:

private static bool IsValidZip(string candidate)
{
	return Regex.IsMatch(candidate, @"^\d{5}(\-\d{4})?$");
}

This function returns true for the following formats:

  • xxxxx-xxxx: 01234-5678
  • xxxxx: 01234
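
Two details of .NET regular expressions are worth keeping in mind here: $ also matches just before a trailing newline, and \d matches any Unicode decimal digit, not only 0 to 9. Where that matters, \z and [0-9] are stricter. The following sketch shows the difference; the class ZipValidator and the method IsValidZipStrict are hypothetical names:

```csharp
using System.Text.RegularExpressions;

public static class ZipValidator
{
	// The pattern from the post.
	public static bool IsValidZip(string candidate)
	{
		return Regex.IsMatch(candidate, @"^\d{5}(\-\d{4})?$");
	}

	// Hypothetical stricter variant: [0-9] accepts ASCII digits only,
	// and \z matches only at the very end of the input.
	public static bool IsValidZipStrict(string candidate)
	{
		return Regex.IsMatch(candidate, @"^[0-9]{5}(?:-[0-9]{4})?\z");
	}
}
```

With these definitions, IsValidZip("01234\n") returns true while IsValidZipStrict("01234\n") returns false.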

Replacing substrings using Regex in C# .NET: date format example

Say your application receives the dates in the following format:

mm/dd/yy

…but what you actually need is this:

dd-mm-yy

You could try to achieve that with string operations such as IndexOf and Replace. Regular expressions, however, let you perform more sophisticated substring operations. The following method performs the required change:

private static string ReformatDate(String dateInput)
{
	return Regex.Replace(dateInput,
		@"\b(?<month>\d{1,2})/(?<day>\d{1,2})/(?<year>\d{2,4})\b",
		"${day}-${month}-${year}");
}

Calling this method with “10/28/14” returns “28-10-14”.
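
A replacement pattern can only rearrange the captured text. When the pieces need further processing, Regex.Replace also accepts a MatchEvaluator delegate that builds each replacement in code. The sketch below uses it to pad the day and month to two digits and, like the original, converts every date found in a longer text; the class DateReformatter and the method ReformatDatePadded are hypothetical names:

```csharp
using System.Text.RegularExpressions;

public static class DateReformatter
{
	// Same pattern as in the post.
	private const string DatePattern = @"\b(?<month>\d{1,2})/(?<day>\d{1,2})/(?<year>\d{2,4})\b";

	// Hypothetical variant of ReformatDate: the lambda is called once per match
	// and returns the text that replaces it.
	public static string ReformatDatePadded(string dateInput)
	{
		return Regex.Replace(dateInput, DatePattern, match =>
			match.Groups["day"].Value.PadLeft(2, '0') + "-" +
			match.Groups["month"].Value.PadLeft(2, '0') + "-" +
			match.Groups["year"].Value);
	}
}
```

Calling this variant with “1/5/14” returns “05-01-14”.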

Reformatting extracted substrings using Match.Result in C# .NET

Say you have the following Uri:

http://localhost:8080/webapps/bestapp

…and you’d like to extract the protocol and the port number and concatenate them. One option is a combination of a regular expression and matching groups within the regex:

private static void ReformatSubStringsFromUri(string uri)
{
	Regex regex = new Regex(@"^(?<protocol>\w+)://[^/]+?(?<port>:\d+)?/");
	Match match = regex.Match(uri);
	if (match.Success)
	{
		Console.WriteLine(match.Result("${protocol}${port}"));
	}
}

The groups are named “protocol” and “port” and are referenced in the call to Result. The Result method reformats the extracted groups, i.e. the substrings, according to a replacement pattern. In this case we just concatenate them. Calling this method with the URL above prints “http:8080”.

However, you can use a more descriptive string format, e.g.:

Console.WriteLine(match.Result("Protocol: ${protocol}, port: ${port}"));

…which prints “Protocol: http, port: :8080”.
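
The doubled colon appears because the “port” group itself includes the ‘:’ separator. One way around it is to move the colon into a non-capturing group so that the named group holds only the digits. The sketch below shows that variant next to a regex-free alternative based on System.Uri; the class UriParts and both method names are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class UriParts
{
	// Hypothetical variant: the colon sits in a non-capturing group,
	// so ${port} holds the digits only.
	public static string DescribeWithRegex(string uri)
	{
		Regex regex = new Regex(@"^(?<protocol>\w+)://[^/]+?(?::(?<port>\d+))?/");
		Match match = regex.Match(uri);
		return match.Success ? match.Result("Protocol: ${protocol}, port: ${port}") : string.Empty;
	}

	// Alternative without a regex: System.Uri exposes the scheme and the port directly.
	public static string DescribeWithUri(string uri)
	{
		Uri parsed = new Uri(uri);
		return string.Format("Protocol: {0}, port: {1}", parsed.Scheme, parsed.Port);
	}
}
```

For the URL above both methods return “Protocol: http, port: 8080”. They differ when the URL has no explicit port: the regex variant leaves the port empty, whereas Uri.Port reports the default port of the scheme.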
