php - Split a text into single words
I would like to use PHP to split a longer text into single words. Do you have any idea how to do this?
My approach:
    function tokenizer($text) {
        $text = trim(strtolower($text));
        $punctuation = '/[^a-z0-9äöüß-]/';
        $result = preg_split($punctuation, $text, -1, PREG_SPLIT_NO_EMPTY);
        for ($i = 0; $i < count($result); $i++) {
            $result[$i] = trim($result[$i]);
        }
        return $result; // contains the single words
    }

    $text = 'This is an example text, it contains commas and full stops. Exclamation marks, too! Question marks? All the punctuation marks you know.';
    print_r(tokenizer($text));

Is this a good approach? Do you have any ideas for improvement?
Thanks in advance!
Use the class \p{P}, which matches any Unicode punctuation character, combined with the \s whitespace class:

    $result = preg_split('/((^\p{P}+)|(\p{P}*\s+\p{P}*)|(\p{P}+$))/', $text, -1, PREG_SPLIT_NO_EMPTY);

This splits on a group of one or more whitespace characters, and also swallows any punctuation directly adjacent to that whitespace. It additionally matches punctuation at the beginning or end of the string. Punctuation inside a word is left alone, which distinguishes cases such as "don't" (the apostrophe stays part of the word) from "he said, 'here!'" (the surrounding quotes and the exclamation mark are removed).
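To make the behaviour described above easier to try out, here is a minimal sketch that wraps the pattern in a function. The function name splitWords is hypothetical and not from the answer. The /u modifier is also my addition, on the assumption that the input is UTF-8, so that \p{P} is matched against whole characters rather than single bytes.

```php
<?php
// Hypothetical helper: splits $text into words using the pattern from the answer.
// Assumption: $text is valid UTF-8, hence the added /u modifier.
function splitWords(string $text): array
{
    // 1) punctuation at the start, 2) whitespace plus adjacent punctuation,
    // 3) punctuation at the end of the string.
    $pattern = '/((^\p{P}+)|(\p{P}*\s+\p{P}*)|(\p{P}+$))/u';

    $words = preg_split($pattern, $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false on failure (for example, invalid UTF-8 with /u).
    return $words === false ? [] : $words;
}

// The inner apostrophe survives; surrounding quotes and "!" are stripped.
// Resulting words: He, said, don't, stop
print_r(splitWords("He said, 'don't stop!'"));
```

Unlike the character-class approach in the question, this version does not lowercase the text and does not need a hand-maintained list of allowed letters.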