Extracting information from a text using Regex and Match in C# .NET
January 11, 2017
Occasionally you need to extract some information from a free-text form. Consider the following text:
First name: Elvis
Last name: Presley
Address: 1 Heaven Street
City: Memphis
State: TN
Zip: 12345
Say you need to extract the full name, the address, the city, the state and the zip code into a pipe-delimited string. The following function is one option:
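// Requires: using System.Text; and using System.Text.RegularExpressions;
private static string ExtractJist(string freeText)
{
    // Build one pattern with a named group per field. With RegexOptions.Multiline,
    // $ matches at the end of every line, so each group captures the rest of its line.
    StringBuilder patternBuilder = new StringBuilder();
    patternBuilder.Append(@"First name: (?<fn>.*$)\n")
        .Append("Last name: (?<ln>.*$)\n")
        .Append("Address: (?<address>.*$)\n")
        .Append("City: (?<city>.*$)\n")
        .Append("State: (?<state>.*$)\n")
        .Append("Zip: (?<zip>.*$)");
    Match match = Regex.Match(freeText, patternBuilder.ToString(), RegexOptions.Multiline | RegexOptions.IgnoreCase);

    // Read each captured value by group name. If the text does not match,
    // match.Success is false and every group is an empty string.
    string fullname = string.Concat(match.Groups["fn"], " ", match.Groups["ln"]);
    string address = match.Groups["address"].ToString();
    string city = match.Groups["city"].ToString();
    string state = match.Groups["state"].ToString();
    string zip = match.Groups["zip"].ToString();

    // Join the fields into a pipe-delimited string.
    return string.Concat(fullname, "|", address, "|", city, "|", state, "|", zip);
}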
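The function above depends on RegexOptions.Multiline. By default, $ only matches at the very end of the input (or just before a final newline), so a $ in the middle of the pattern would never match. The following minimal sketch shows the difference on a shortened two-line sample. The class name MultilineDemo and its members are made up for this illustration and are not part of the original function.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class MultilineDemo
{
    // Shortened sample with two of the six fields, using "\n" line endings.
    public const string Sample = "First name: Elvis\nLast name: Presley\n";

    // Same shape as the pattern in ExtractJist: a named group runs up to $,
    // followed by an explicit newline before the next label.
    public const string Pattern = @"First name: (?<fn>.*$)\nLast name: (?<ln>.*$)";

    // Returns true if the pattern matches the sample under the given options.
    public static bool MatchesWith(RegexOptions options)
    {
        return Regex.IsMatch(Sample, Pattern, options);
    }

    // Reads the first name through its group name, with Multiline enabled.
    public static string FirstName()
    {
        Match match = Regex.Match(Sample, Pattern, RegexOptions.Multiline);
        return match.Groups["fn"].Value;
    }
}
```

MatchesWith(RegexOptions.None) returns false because the first $ is not at the end of the input, while MatchesWith(RegexOptions.Multiline) returns true and FirstName() yields "Elvis".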
Call the function as follows:
string source = @"First name: Elvis
Last name: Presley
Address: 1 Heaven Street
City: Memphis
State: TN
Zip: 12345
";

string extracted = ExtractJist(source);
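Two things are worth keeping in mind when calling a function like this. First, if the input does not match, Regex.Match returns an unsuccessful Match whose groups are all empty, so the result is just a string of separators. Second, a verbatim string in a source file saved with Windows line endings contains "\r\n", and since . also matches "\r", each captured value would end with a carriage return. The following is one possible defensive variant, not the original function: the names GistExtractor and TryExtract are hypothetical, and the \r? before each $ is an assumption about how you might want to treat CRLF input.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical defensive variant of ExtractJist, for illustration only.
public static class GistExtractor
{
    // Lazy groups followed by an optional \r keep a trailing carriage
    // return out of the captured values when the input uses CRLF.
    private static readonly Regex Pattern = new Regex(
        @"First name: (?<fn>.*?)\r?$\n" +
        @"Last name: (?<ln>.*?)\r?$\n" +
        @"Address: (?<address>.*?)\r?$\n" +
        @"City: (?<city>.*?)\r?$\n" +
        @"State: (?<state>.*?)\r?$\n" +
        @"Zip: (?<zip>.*?)\r?$",
        RegexOptions.Multiline | RegexOptions.IgnoreCase);

    // Returns false (and an empty result) when the text does not match,
    // instead of silently producing a string of empty fields.
    public static bool TryExtract(string freeText, out string result)
    {
        Match match = Pattern.Match(freeText);
        if (!match.Success)
        {
            result = string.Empty;
            return false;
        }

        string fullName = match.Groups["fn"].Value + " " + match.Groups["ln"].Value;
        result = string.Join("|",
            fullName,
            match.Groups["address"].Value,
            match.Groups["city"].Value,
            match.Groups["state"].Value,
            match.Groups["zip"].Value);
        return true;
    }
}
```

For the sample text, with either "\n" or "\r\n" line endings, TryExtract returns true and sets result to "Elvis Presley|1 Heaven Street|Memphis|TN|12345".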