Extracting information from a text using Regex and Match in C# .NET

Occasionally you need to extract some information from a free-text form. Consider the following text:

First name: Elvis
Last name: Presley
Address: 1 Heaven Street
City: Memphis
State: TN
Zip: 12345

Say you need to extract the full name, the address, the city, the state and the zip code into a pipe-delimited string. The following function is one option:

private static string ExtractJist(string freeText)
{
	StringBuilder patternBuilder = new StringBuilder();
	patternBuilder.Append(@"First name: (?<fn>.*$)\n")
		.Append("Last name: (?<ln>.*$)\n")
		.Append("Address: (?<address>.*$)\n")
		.Append("City: (?<city>.*$)\n")
		.Append("State: (?<state>.*$)\n")
		.Append("Zip: (?<zip>.*$)");
	Match match = Regex.Match(freeText, patternBuilder.ToString(), RegexOptions.Multiline | RegexOptions.IgnoreCase);
	string fullname = string.Concat(match.Groups["fn"], " ", match.Groups["ln"]);
	string address = match.Groups["address"].ToString();
	string city = match.Groups["city"].ToString();
	string state = match.Groups["state"].ToString();
	string zip = match.Groups["zip"].ToString();
	return string.Concat(fullname, "|", address, "|", city, "|", state, "|", zip);
}

Call the function as follows:

string source = @"First name: Elvis
Last name: Presley
Address: 1 Heaven Street
City: Memphis
State: TN
Zip: 12345
";
string extracted = ExtractJist(source);

View all posts related to string and text operations here.

Advertisements

About Andras Nemes
I'm a .NET/Java developer living and working in Stockholm, Sweden.

2 Responses to Extracting information from a text using Regex and Match in C# .NET

  1. Reblogged this on public interface IReadable { IEnumerable Read(); } and commented:
    Regex has never been easy for me for some reason :/

  2. Andras Nemes says:

    It’s probably not easy for anyone for that matter, it’s so cryptic. I know a handful of them, otherwise I need to google for a solution.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

ultimatemindsettoday

A great WordPress.com site

iReadable { }

.NET Tips & Tricks

Robin Sedlaczek's Blog

Developer on Microsoft Technologies

HarsH ReaLiTy

A Good Blog is Hard to Find

Softwarearchitektur in der Praxis

Wissenswertes zu Webentwicklung, Domain-Driven Design und Microservices

the software architecture

thoughts, ideas, diagrams,enterprise code, design pattern , solution designs

Technology Talks

on Microsoft technologies, Web, Android and others

Software Engineering

Web development

Disparate Opinions

Various tidbits

chsakell's Blog

Anything around ASP.NET MVC,WEB API, WCF, Entity Framework & AngularJS

Cyber Matters

Bite-size insight on Cyber Security for the not too technical.

Guru N Guns's

OneSolution To dOTnET.

Johnny Zraiby

Measuring programming progress by lines of code is like measuring aircraft building progress by weight.

%d bloggers like this: