Regex for the href attribute of an mp3 file URL in Python
Based on an earlier Stack Overflow question and a contribution by cgoldberg, I came up with this regex using the Python re module:

    import re
    urls = re.finditer('http://(.*?).mp3', htmlcode)

The variable urls is an iterable, so if the page contains more than one mp3 file URL I can use a loop to access each one:

    for url in urls:
        mp3fileurl = url.group(0)

This technique, however, only works sometimes. I understand that regular expressions will never be as reliable as a fully-fledged parsing module, but sometimes this is not reliable even on the same page. For some URL entries I get everything that comes before the mp3 URL as well.

I'm relatively new to regular expressions, so I'm wondering whether there is a more reliable way to go about this.
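The symptom described above is consistent with how this pattern matches: a match starts at the first "http://" the engine finds and runs to the first "mp3" after it, so it can swallow unrelated links that precede the mp3 link. The following is only a sketch of that behaviour. The HTML sample is made up, and the tighter pattern assumes double-quoted href values, which real pages do not guarantee.

```python
import re

# Hypothetical sample: a non-mp3 link appears before the mp3 link.
htmlcode = (
    '<a href="http://example.com/about.html">About</a> '
    '<a href="http://example.com/song.mp3">Song</a>'
)

# Pattern from the question: the match begins at the first "http://",
# even though that URL is not an mp3 link.
loose = [m.group(0) for m in re.finditer('http://(.*?).mp3', htmlcode)]

# Tighter pattern (an assumption, not a general fix): stay inside one
# double-quoted href value and escape the dot before "mp3".
tight = re.findall(r'href="(http://[^"]*\.mp3)"', htmlcode)

print(loose)  # one long match that starts at the about.html URL
print(tight)  # only the mp3 URL
```

Even the tighter pattern breaks on single-quoted or unquoted attributes, which is why a parser is usually the safer choice.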
Thanks in advance. I'm new to Stack Overflow and keen to contribute some answers as well.

As always, I suggest using an HTML parser such as lxml.html instead of regular expressions to extract information from HTML files:

    import lxml.html

    tree = lxml.html.fromstring(htmlcode)
    for link in tree.findall(".//a"):
        url = link.get("href")
        # href may be missing, in which case get() returns None
        if url and url.endswith(".mp3"):
            print(url)
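If lxml is not installed, the same idea can be sketched with the standard library's html.parser module. This is an alternative under assumptions rather than the method from the answer: the class name Mp3LinkParser and the sample HTML are hypothetical, and the case-insensitive suffix check is an added choice.

```python
from html.parser import HTMLParser


class Mp3LinkParser(HTMLParser):
    """Collect href values of <a> tags that end in .mp3 (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.mp3_urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().endswith(".mp3"):
                self.mp3_urls.append(value)


# Made-up sample input for illustration.
htmlcode = (
    '<a href="http://example.com/about.html">About</a> '
    '<a href="http://example.com/song.mp3">Song</a> '
    '<a name="no-href">Anchor</a>'
)

parser = Mp3LinkParser()
parser.feed(htmlcode)
print(parser.mp3_urls)
```

Because the parser only looks at href attributes of anchor tags, unrelated links and surrounding markup never end up in the result.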