Old 08-18-2003, 10:32 AM   #1
red penguin
[^\d\D]
 
 
Join Date: Jun 2001
Location: Brooklyn, NY
Posts: 3,254
regexp help

Hello coders...

To set up the question: here is what I have thus far. It is working, but I am looking for advice and/or improvements.

1. Search a file for any reference to an <a> tag and grab the href value. Done.
2. Slap said values into an array. Done.
3. Display these in a checkbox form so the user can select as many as they'd like. Done.
4. Submit these values (an array) back to a script that will: grab the very same links from the very same file, figure out which links the user has selected, and replace the extension on the selected links. Done.
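
Step 4 above can be sketched roughly as follows. This is only an illustration, not the code from the app described in this post: the function name swapSelectedExtensions, the sample markup, and the assumption that the submitted values are bare href strings wrapped in double quotes in the file are all hypothetical.

```php
<?php
// Hypothetical helper: change .htm/.html to .php, but only inside the
// href values the user ticked in the checkbox form.
// Assumes $selected holds bare href values (no surrounding quotes) and
// that the file wraps each href in double quotes.
function swapSelectedExtensions($content, $selected)
{
    foreach ($selected as $href) {
        // Same replace pattern as in the post: ".htm" or ".html" becomes ".php"
        $newHref = preg_replace('!\.html?!', '.php', $href);
        // Replace the quoted attribute value so unselected links stay untouched
        $content = str_replace('"' . $href . '"', '"' . $newHref . '"', $content);
    }
    return $content;
}
```

Matching on the quoted value keeps a selected link such as about.html from also rewriting a longer unselected link that merely contains the same text.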

Okay, so you ask, what's the problem? Well, the idea here is to allow the user to select links from a page (file) and change the extension from .htm/.html TO .php. Yes. This works fine via a few functions working closely together and passing a few variables around. (see above)

Now, the regexp that I use to REPLACE is:
PHP Code:
!\.html?! 
Easy, right? Originally, I was using:
PHP Code:
!\.\w{3,4}! 
This works fine on all relative links, which, btw, are really the only thing the user wants to select/change. However, it does not work well on a value such as http://www.foobar.com/contact/index.html, because a pattern like !\.\w{3,4}! also matches the .foob and .com parts of the host name. (There should be no real reason to change that absolute link anyway...)
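
The difference between the two replace patterns can be seen with a short test. This is a minimal sketch; the helper name replaceExtension and the sample relative path are made up for illustration, while the two patterns and the absolute URL are the ones quoted above.

```php
<?php
// Hypothetical helper: apply one of the replace patterns to a single href.
function replaceExtension($pattern, $href)
{
    return preg_replace($pattern, '.php', $href);
}

$broad  = '!\.\w{3,4}!'; // the original pattern
$narrow = '!\.html?!';   // the pattern currently in use

$relative = 'contact/index.html';
$absolute = 'http://www.foobar.com/contact/index.html';

echo replaceExtension($broad, $relative), "\n";  // contact/index.php
echo replaceExtension($broad, $absolute), "\n";  // http://www.phpar.php/contact/index.php
echo replaceExtension($narrow, $absolute), "\n"; // http://www.foobar.com/contact/index.php
```

The broad pattern treats any dot followed by three or four word characters as an extension, so it rewrites the host name as well as the file name.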

(FYI: I am building an app that writes some PHP dynamically to a page, even a page that is .html, and changes its extension... If the user's site is built around HTML and not PHP, there should be relative links within the site... This allows them to selectively change extensions quite easily.)

So...Should I modify the search regexp to ONLY include relative links? Do I keep it as is and just tell the user that (s)he shouldn't select any absolute links (unless they are part of his/her site)?

Sorry if this is confusing... Again, it is working; however, ALL href values are selected from a page. I am really only concerned with relative links. I'm thinking it'd be a better process if only those (rel links) were returned by my search regexp:
PHP Code:
## from my search function
if(preg_match_all("%<a href=(['\"]+[^<]*['\"]+)%",$orig_content,$args))
        {
            for($i=0;$i<count($args[1]);$i++) array_push($arr,$args[1][$i]);
        }
        return $arr;
tia.
-red
__________________

komielan.com
Old 08-19-2003, 01:30 AM   #2
freddycodes
Master of Nothing
 
Join Date: Dec 2002
Location: San Diego, CA
Posts: 2,468

Well I am too lazy to come up with the negating regex tonight. But this will get the job done.

Code:
function getRelativeLinks($data)
{
	$retVal = array();
	// Capture from the href's opening quote through the last quote before the tag's closing >
	preg_match_all("%<a href=(['\"]+?[^>]*['\"]+)%",$data,$args);
	for($i=0;$i<count($args[1]);$i++) 
	{
		// Keep the value only if it does not start with a scheme or www prefix
		if(!preg_match("#^('|\")?(javascript|http|www|ftp|mailto)#", $args[1][$i]))
		{
			array_push($retVal,$args[1][$i]);
		}
	}

	return $retVal;
}
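
For reference, the "negating regex" mentioned above could look something like the sketch below, which folds the prefix check into the search pattern with a negative lookahead. The function name getRelativeLinksLookahead is hypothetical, and the behaviour differs slightly from getRelativeLinks: it returns the href values without their quotes and matches case-insensitively.

```php
<?php
// Hypothetical single-pattern variant: the negative lookahead (?!...) rejects
// any href whose value starts with one of the listed prefixes.
function getRelativeLinksLookahead($data)
{
    // Group 1 is the opening quote, group 2 the href value, \1 the same closing quote.
    $pattern = '%<a\s+href=([\'"])(?!javascript|http|www|ftp|mailto)([^\'">]*)\1%i';
    preg_match_all($pattern, $data, $args);
    return $args[2];
}
```

As with the two-step version, a relative file whose name happens to begin with one of the prefixes (for example httpd-notes.html) would be skipped as well.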
Old 08-25-2003, 01:42 AM   #3
red penguin
[^\d\D]
 
 
Join Date: Jun 2001
Location: Brooklyn, NY
Posts: 3,254

Thanks freddy...

I have been vacationing and not doing much else, but I shall take this function and try to fit it into the existing project.

I will post results to this thread.

Back to the beach!