c# - regex: matching phrases without a > or white space
I am parsing some HTML using regex, and I want to match lines that begin with text rather than an HTML tag, ignoring any leading white space. My first pattern was:
pattern = @"^\s*([^<])";
which tries to consume all the leading white space and then capture any non-'<' character. Unfortunately, if the line contains only white space before the first '<', it returns the last white space character before the '<'.
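The behavior described above can be reproduced with a short console program. This is only a sketch: the pattern is the form reconstructed from the question, and the three sample lines are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Pattern as given in the question (reconstructed form, an assumption).
        var pattern = new Regex(@"^\s*([^<])");

        // Hypothetical sample lines: tag after white space, plain text, bare tag.
        string[] lines = { "   <div>", "   some text", "<p>" };

        foreach (string line in lines)
        {
            Match m = pattern.Match(line);
            string captured = m.Success
                ? "'" + m.Groups[1].Value + "'"
                : "(no match)";
            Console.WriteLine("'" + line + "' -> " + captured);
        }
    }
}
```

For the first line the captured group is a single space. `\s*` is greedy and first takes all three spaces, but `[^<]` then fails on '<', so the engine backtracks, gives one space back, and `[^<]` matches that space, because white space is also "not '<'". The second line captures 's', and the third does not match at all.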
Don't use regular expressions to parse HTML. It is a very bad idea and, at best, your code will be flaky. Whatever your language or platform, there will be a fully functional HTML parser available for it. Just use that.
No regular expression can properly handle all of the cases that occur in HTML.
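As a rough illustration of the parser approach, here is a minimal sketch that assumes the third-party HtmlAgilityPack NuGet package is installed. The answer does not name a specific parser, so this library is only one possible choice, and the sample HTML is made up.

```csharp
using System;
using HtmlAgilityPack; // third-party package, assumed to be installed; not named in the answer

class Program
{
    static void Main()
    {
        // Hypothetical input: one phrase inside a <p>, one directly inside the <div>.
        string html = "<div>\n   <p>first phrase</p>\n   second phrase\n</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select every text node; SelectNodes returns null when nothing matches.
        var textNodes = doc.DocumentNode.SelectNodes("//text()");
        if (textNodes == null) return;

        foreach (HtmlNode node in textNodes)
        {
            // Decode entities such as &lt; and trim surrounding white space.
            string text = HtmlEntity.DeEntitize(node.InnerText).Trim();

            // Skip nodes that contain only white space between tags.
            if (text.Length > 0)
                Console.WriteLine(text);
        }
    }
}
```

Because the parser separates markup from text, there is no need to reason about '<' characters or leading white space by hand. Trimming each text node is enough to get the phrases.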