I am trying to extract the address of a website's Facebook page by running a regular expression search over its HTML.
Usually the link appears as
<a href="http://www.facebook.com/googlechrome">Facebook</a>
but sometimes the address will contain a dot, like http://www.facebook.com/some.other,
and sometimes it contains numbers.
At the moment the regex that I have is
'(facebook.com)\S\w+'
but it won't catch the last 2 possibilities.
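To show what I mean, here is a small sketch of how the pattern behaves on two of the sample addresses. The variable names are just for illustration; only the pattern itself is what I actually use.

```python
import re

# The pattern from above, unchanged.
pattern = re.compile(r'(facebook.com)\S\w+')

samples = [
    'http://www.facebook.com/googlechrome',
    'http://www.facebook.com/some.other',
]

# Full text matched for each sample: \w+ stops at the dot in "some.other".
matched = [pattern.search(s).group(0) for s in samples]
# matched == ['facebook.com/googlechrome', 'facebook.com/some']

# findall returns only the parenthesised group, not the page name.
groups = pattern.findall(samples[0])
# groups == ['facebook.com']
```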
What is it called when I want the regex to match something but not return it? For instance, I want the regex to match the www.facebook.com part but not include that part in the result, only the part that comes after it.
Note: I am using Python with re and urllib2.
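To make the goal concrete, here is a rough sketch of the behaviour I am after. It assumes a lookbehind assertion is the right tool for "match but don't return", and that page names contain only letters, digits, underscores, dots, and hyphens. The names PAGE_RE and facebook_page_names and the character class are my own guesses, not something I have confirmed against real pages.

```python
import re

# Hypothetical pattern: (?<=...) requires "facebook.com/" to precede the match
# but leaves it out of the result. The character class is an assumption about
# which characters can appear in a page name.
PAGE_RE = re.compile(r'(?<=facebook\.com/)[\w.-]+')

def facebook_page_names(html):
    """Return the text that follows facebook.com/ for each link in html."""
    return PAGE_RE.findall(html)

html = ('<a href="http://www.facebook.com/googlechrome">Facebook</a> '
        '<a href="http://www.facebook.com/some.other">Facebook</a> '
        '<a href="http://www.facebook.com/12345">Facebook</a>')

names = facebook_page_names(html)
# names == ['googlechrome', 'some.other', '12345']
```

A plain capturing group around the page name, such as facebook\.com/([\w.-]+), would presumably give the same list from findall.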