tormeu Posted July 29, 2007

I'm trying to match the alt= and src= attributes of an image tag with a regular expression. I'm reusing code from this forum (thanks). This is the HTML I'm working with:

<p align="center"><a title="Gato con carrito de compra de supermercado invisible" class="imagelink" rel="attachment" id="p203" href="http://www.ecnc.com/blog/2007/07/23/voy-al-supermercado-me-compras-unas-toallas-femeninas-mi-amor/gato-con-carrito-de-compra-de-supermercado-invisible/"> <img alt="Gato con carrito de compra de supermercado invisible" id="image203" src="https://www.lecnc.com/blog/wp-content/uploads/2007/07/carrito_de_compra_invisible.jpg" /></a></p> <p>Fuente: <a target="_blank" title="Funny animals" href="http://www.flickr.com/photos/funny_animals/380169474/">Flickr Funny Animals </a></p>

The code I'm using:

/\< *[img][^\>]*alt *= *[\"\']{0,1}([^>]*) *src *= *[\"\']{0,1}([^\"\'\ >]*)/

It matches both attributes, but match 1 (the alt=) comes back as:

Gato con carrito de compra de supermercado invisible" id="image203"

I need to keep id="image203" out of that capture. Any suggestions are appreciated.
PKHelloNasty Posted July 29, 2007 (edited)

This is working for me. I don't know why people keep using the complicated regex from the other thread.

\<img.+?alt="(.+?)".+?src="(.+?)".+?\/>
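For readers who want to try the pattern above, here is a minimal sketch in Python (an assumption; the thread does not say which regex engine is in use). It runs the pattern against the `<img>` tag from the original post. The variable names are illustrative only.

```python
import re

# The <img> tag from the original post, split across lines for readability.
html = (
    '<img alt="Gato con carrito de compra de supermercado invisible" '
    'id="image203" '
    'src="https://www.lecnc.com/blog/wp-content/uploads/2007/07/'
    'carrito_de_compra_invisible.jpg" />'
)

# PKHelloNasty's pattern. Python needs no backslash before "<" or "/",
# so those two escapes are dropped here; the pattern is otherwise unchanged.
pattern = re.compile(r'<img.+?alt="(.+?)".+?src="(.+?)".+?/>')

match = pattern.search(html)
alt_text, src_url = match.groups()
print(alt_text)
print(src_url)
```

The lazy `(.+?)` stops at the first closing double quote, so the alt capture no longer runs on into `id="image203"`. The pattern still relies on alt coming before src, on double-quoted values, and on a self-closing `/>`.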
+mrbester (MVC) Posted July 30, 2007

Both of those assume that the alt attribute appears before the src attribute. They also assume that the values are quoted, which we all know ain't necessarily so when it comes to FrontPage-created pages (or HTML4 for that matter). PKHN's also assumes you're writing XHTML. Countering these assumptions takes either an annoyingly complicated regex or a selection that drills down to what you're after: first get an image tag, then get its attributes, and then look through those to retrieve the ones you want.

Lest we forget, > can appear unencoded in a quoted attribute value (though it is frowned upon), so /\<img[^>]*>/ might only match '<img alt="My pic" title="My pictures >' in '<img alt="My pic" title="My pictures > my pic" src=mypic.gif>' (note the unquoted src attribute, which works because the value contains no spaces). That means you have to check whether that occurrence of > comes directly before the end of the string or the start of the next tag.
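The drill-down described above can be sketched roughly as follows, assuming Python. The names `TAG_RE`, `ATTR_RE`, and `img_attributes` are hypothetical helpers written for this illustration, not code from the thread. The tag pattern treats quoted values as units, so a > inside quotes does not end the tag, and the attribute pattern accepts double-quoted, single-quoted, and unquoted values in any order.

```python
import re

# Step 1: find a whole <img ...> tag. Quoted values are consumed as units,
# so a ">" inside quotes is not mistaken for the end of the tag.
TAG_RE = re.compile(r'''<img\b(?:"[^"]*"|'[^']*'|[^>"'])*>''', re.IGNORECASE)

# Step 2: pull out name=value pairs, whether double-quoted,
# single-quoted, or unquoted.
ATTR_RE = re.compile(r'''([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))''')


def img_attributes(html):
    """Return one dict of attributes per <img> tag found in html."""
    results = []
    for tag in TAG_RE.finditer(html):
        attrs = {}
        for m in ATTR_RE.finditer(tag.group(0)):
            name = m.group(1).lower()
            # Exactly one of the three value groups is set for each match.
            value = next(g for g in m.groups()[1:] if g is not None)
            attrs.setdefault(name, value)  # keep the first occurrence
        results.append(attrs)
    return results


sample = '<img alt="My pic" title="My pictures > my pic" src=mypic.gif>'
for attrs in img_attributes(sample):
    # Step 3: look through the attributes for the ones you are after.
    print(attrs.get("alt"), attrs.get("src"))
```

This is still a regex approximation of HTML parsing. A real HTML parser (for example Python's standard `html.parser` module) is likely to cope better with malformed markup.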
tormeu (Author) Posted July 30, 2007

Quote (PKHelloNasty):
This is working for me. I don't know why people keep using the complicated regex from the other thread.
\<img.+?alt="(.+?)".+?src="(.+?)".+?\/>

Thanks. I also tested this solution from regexadvice. It works fine, although, as you said, it isn't pretty:

<img[^>]*?alt=\x22([^\x22]*)\x22[^>]*?src=\x22([^\x22]*)[^>]*?>|<img[^>]*?src=\x22([^\x22]*)\x22[^>]*?alt=\x22([^\x22]*)\x22[^>]*?>

It should match both src and alt no matter what order they appear in within the img tag.
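One practical detail of the two-branch pattern is that the captures land in different groups depending on which branch matched: groups 1 and 2 when alt comes first, groups 3 and 4 when src comes first. A minimal sketch of handling that, assuming Python (where `\x22` is a double quote); `EITHER_ORDER` and `alt_and_src` are hypothetical names used only for this illustration:

```python
import re

# The regexadvice pattern from the post, split over two string literals.
EITHER_ORDER = re.compile(
    r'<img[^>]*?alt=\x22([^\x22]*)\x22[^>]*?src=\x22([^\x22]*)[^>]*?>'
    r'|<img[^>]*?src=\x22([^\x22]*)\x22[^>]*?alt=\x22([^\x22]*)\x22[^>]*?>'
)


def alt_and_src(html):
    """Return (alt, src) for the first matching <img> tag, or None."""
    m = EITHER_ORDER.search(html)
    if m is None:
        return None
    alt_first, src_second, src_first, alt_second = m.groups()
    if alt_first is not None:  # the alt-before-src branch matched
        return alt_first, src_second
    return alt_second, src_first  # the src-before-alt branch matched


print(alt_and_src('<img alt="A cat" id="x" src="cat.jpg" />'))
print(alt_and_src('<img src="cat.jpg" alt="A cat">'))
```

Both calls return the same `(alt, src)` pair. The pattern still requires double-quoted values and, because of `[^>]`, does not handle a > inside an attribute value.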