C# - Regex problems, extracting data to groups
How I love regex!
I have a string that is a mangled form of XML, like:
<category>dir</category><location>dl123a</location><reason>because</reason><qty>42</qty><description>some desc</description><ipaddress>127.0.0.1</ipaddress>
Everything is on one line, and the 'headers' can differ.
So I need to extract the information from the string above and put it into a dictionary/hashtable.
--
string mystring = @"<category>dir</category><location>dl123a</location><reason>because</reason><qty>42</qty><description>some desc</description><ipaddress>127.0.0.1</ipaddress>";

// headers and values are collections assumed to be declared elsewhere (e.g. List<string>).

// This will extract the name of the label in the header.
Regex r = new Regex(@"(?<header><[A-Za-z]+>?)");

// Create a collection of matches.
MatchCollection mc = r.Matches(mystring);
foreach (Match m in mc)
{
    headers.Add(m.Groups["header"].Value);
}

// This will try to get the values.
r = new Regex(@"(?'val'>[A-Za-z0-9\s]*</?)");
mc = r.Matches(mystring);
foreach (Match m in mc)
{
    string match = m.Groups["val"].Value;
    if (string.IsNullOrEmpty(match) || match == "><" || match == "> <")
        continue;
    else
        values.Add(match);
}
--
I hacked this together from previous work with regexes, and it is the closest I could get. It doesn't work the way I want it to.
The 'header' regex pulls the angle brackets in.
The 'value' regex pulls in a lot of empties (hence the dodgy if statement in the loop). It also doesn't work on strings with periods, commas, spaces, etc.
It would be better if I could combine the two statements so I don't have to loop through the regex twice.
Can anyone give me some info on how I can improve it?
If it looks like XML, why not use the XML parser functionality of .NET? All you need to do is add a root element around it:
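// Requires: using System.Collections.Generic; using System.Xml.Linq;
string mystring = @"<category>dir</category><location>dl123a</location><reason>because</reason><qty>42</qty><description>some desc</description><ipaddress>127.0.0.1</ipaddress>";

var values = new Dictionary<string, string>();

// Wrap the fragment in a root element so it parses as a single XML document.
var xml = XDocument.Parse("<root>" + mystring + "</root>");

// Each child element becomes one name/value entry.
foreach (var e in xml.Root.Elements())
{
    values.Add(e.Name.ToString(), e.Value);
}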
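If you would still prefer to stay with a regex, the two passes from the question can probably be combined into one pattern with two named groups. The following is only a minimal sketch, not a replacement for the parser approach: the class name TagExtractor and the method Extract are hypothetical names made up for illustration, and the pattern assumes tag names consist of letters only, values never contain '<', and elements are not nested. Keeping the angle brackets outside the header group stops them from being captured, [^<]* lets periods, commas and spaces through, and the \k<header> backreference requires the closing tag to match the opening one.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original posts.
public static class TagExtractor
{
    // <header>value</header> in one pattern:
    //   header = letters between the angle brackets (brackets are not captured)
    //   val    = everything up to the next '<' (may be empty)
    //   \k<header> forces the closing tag name to equal the opening one
    private static readonly Regex Pair = new Regex(
        @"<(?<header>[A-Za-z]+)>(?<val>[^<]*)</\k<header>>");

    public static Dictionary<string, string> Extract(string input)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in Pair.Matches(input))
        {
            // Indexer assignment: a repeated header overwrites the earlier
            // value instead of throwing, which Add would do.
            result[m.Groups["header"].Value] = m.Groups["val"].Value;
        }
        return result;
    }

    public static void Main()
    {
        string mystring = @"<category>dir</category><location>dl123a</location><reason>because</reason><qty>42</qty><description>some desc</description><ipaddress>127.0.0.1</ipaddress>";

        // Print each extracted header/value pair.
        foreach (var pair in Extract(mystring))
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

Unlike the parser, this sketch does not decode entities such as &amp;amp; and silently skips anything that does not fit the pattern, so the XDocument version remains the safer choice when the input really is XML-like.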