regex - Regular expression to extract text from XML-ish data using GNU sed
I have a file full of lines that were extracted from an XML file using "gsed regexp -i filename". Each line in the file is in one of these two formats:
<field number='1' name='account' type='string'w/>
<field number='2' name='advid' type='string'w>
I've inserted a 'w' at the end, which represents optional whitespace. The order and number of attributes are not the same in all lines throughout the file, although "number" always comes before "type".
What I'm searching for is a regular expression "regexp" that I can give to the GNU sed command:
gsed regexp -i filename
so that it gives me a file with lines looking like this:
1 string
2 string
I don't care about the amount of whitespace in the result, as long as there is some after the number and a newline at the end of each line.
I'm sure this is possible, but I can't figure out how in a reasonable amount of time. Can anyone help?
Thanks a lot, Jules
I'm sure it can be optimized, but this works for me and answers the question:
sed "s/^.*number='\([0-9]*\)'.*type='\([^']*\)'.*$/\1 \2/" filename
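Here is a minimal sketch of the substitution run against sample data. The two input lines are made up, modelled on the formats in the question, and the temporary file and the `sample` and `result` variable names are assumptions for illustration. The type group is written as `[^']*`, so it stops at the closing quote even when other attributes follow "type" on the line.

```shell
# Hypothetical sample input: one self-closing tag, one open tag with
# an attribute after "type" (the question says the order can vary).
sample=$(mktemp)
printf '%s\n' \
  "<field number='1' name='account' type='string' />" \
  "<field number='2' type='string' name='advid'>" > "$sample"

# Capture the digits of number='...' and the quoted value of type='...'.
# [^']* matches up to, but not including, the next single quote.
result=$(sed "s/^.*number='\([0-9]*\)'.*type='\([^']*\)'.*$/\1 \2/" "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

To change the file in place as asked in the question, GNU sed accepts `-i` before the expression. Lines that do not match the pattern are passed through unchanged.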
Having said that, I think the others are right: if you have an XML file, you should use an XML parser.