Problems with the "Kill off-site Images" filter

 
Computer Cops Forum Index -> Proxomitron General
LWC

Posted: Mon Apr 12, 2004 10:26 am

This filter kills any form <input> field that has a value=http:// entry
in it, but only if the entry leads to an external URL.

However, it also kills the field if there's no URL after "http://" at all.
Now, which sites use "http://" with no URL, you ask?
Well, you must know those forms that ask you to provide a URL and
hint that you should start it with "http://", like Google's translation page, for example
(i.e. ).

So the bottom line is this: the filter thinks it's OK if the URL is not
external. How can I convince it that it's also OK if there's
no URL at all?

z12

Posted: Tue Apr 13, 2004 8:48 am

Hi LWC,

Modify the matching expression like so:
Code:

Match = "\0<i(mg|nput)(*alt=$AV(\2)|)*>\3"
        "&*=\w://([^/]+{1,*}/&&(^\h)*)"
        "&(^*(width=[#0:75]|height=[#0:20]))"


I don't understand the reasoning behind the height/width match; it seems a bit odd.

Also, matching for *src= and removing the anchor tag from the bounds check makes more sense to me. Depending on how your filters are arranged, Multi may not be needed either.

Of course, you can still get offsite images from js or background attributes.

HTH
Mike

LWC

Posted: Tue Apr 13, 2004 1:29 pm

Hmm, the original filter has only two lines in "match".

I see you haven't touched the third one, changed
about half of the first one and added a new second line.

This filter has so much code in it that it looks like Chinese...can you
explain a little what you did? For example, why did you change
the "alt" part?

But most importantly, you've added a new parameter (number 3), but
haven't provided a new "replace" line so it's not even used, is it?

I think the only thing that needs to be changed is the first line:
Code:

\1<i(mg|nput)(*alt="\0"|)*>\2&*http://(^\h)

and within it, only this part
Code:

http://(^\h)

the h (host) part is clever ("if it's the same host, it's ok"), but I want to
convince it to support empty URLs too (just "http://").
In other words, what it should be is:
Code:

http://(^\h|^)

But, unfortunately it doesn't work...

z12

Posted: Tue Apr 13, 2004 2:47 pm

Hi LWC,

Hmm, it seems that maybe we have different filters. I always back up the default config before I switch to mine, but it's possible that I've modified it.

Here is the filter I was referring to:
Code:

Name = "Kill off-site Images"
Active = FALSE
Multi = TRUE
Bounds = "<(a\s[^>]++href=*</a>|i(mg|nput)\s*>)"
Limit = 800
Match = "\1<i(mg|nput)(*alt="\0"|)*>\2&*http://(^\h)"
        "&(^*(width=[#0-75]|height=[#0-20]))"
Replace = " \1<font size=1>[\0]</font>\2"


I see I should have included the new Replacement expression, so here's the whole thing:
Code:

Name = "Kill off-site Images2"
Active = TRUE
Multi = TRUE
Bounds = "<(a\s[^>]++href=*</a>|i(mg|nput)\s*>)"
Limit = 800

Match = "\0<i(mg|nput)(*alt=$AV(\2)|)*>\3"
        "&*=\w://([^/]+{1,*}/&&(^\h)*)"
        "&(^*(width=[#0:75]|height=[#0:20]))"

Replace = " \0<font size=1>[\2]</font>\3"


Note: I added an empty line in this post above & below the matching expression for clarity.

The first tweak was to replace the alt match with $AV() to make sure the alt tag value was captured no matter what quotes were used.
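
For readers who don't read Proxomitron syntax, here is a rough Python analogue of what the $AV() tweak buys compared with the old alt="\0" form, which only matches double-quoted values. It is only a sketch: `alt_value` is a hypothetical helper name, and the regex is an approximation for illustration, not Proxomitron's actual attribute-value matcher.

```python
import re

# Approximation: accept double-quoted, single-quoted, or unquoted values.
ALT_RE = re.compile(
    r'''\balt\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))''',
    re.IGNORECASE,
)

def alt_value(tag):
    """Return the alt text of a tag, or None if there is no alt attribute."""
    m = ALT_RE.search(tag)
    if m is None:
        return None
    # Exactly one of the three groups took part in the match.
    return next(g for g in m.groups() if g is not None)
```

With this, `alt="logo"`, `alt='logo'` and `alt=logo` all yield the same captured text.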

The second tweak was to remove
Code:

 &*http://(^\h)

from the first line. Apparently I changed the variables in the first line for no good reason.

The next tweak was to insert the new matching code for \h
Code:

        "&*=\w://([^/]+{1,*}/&&(^\h)*)"


This part:
         &*=\w://

replaces the old
         &*http://

I used \w because it will match any quotes and protocol, not just http.
(I would use &*src= to limit matching to image attributes.)

This:
Code:

[^/]+{1,*}/

is used to capture the domain name up to and including the / character.
It won't match unless a domain name and a path delimiter follow the :// character sequence (since we don't want to match "http://" without a domain name).

Finally we use "&&" to match the domain name captured above to the host name as follows:
Code:

([^/]+{1,*}/&&(^\h)*)
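
The two-step idea above (first require a host followed by a slash, then test that host against the page's own host) can be approximated in Python. This is a sketch under assumptions: `is_offsite` and `page_host` are made-up names, the regex only imitates the Proxomitron expression, and the comparison is a plain string inequality rather than whatever \h does internally.

```python
import re

# Capture everything between "://" and the next "/" (at least one character).
URL_HOST_RE = re.compile(r'''=["']?\w+://([^/]+)/''')

def is_offsite(tag, page_host):
    """True if the tag carries a URL whose host part differs from page_host."""
    m = URL_HOST_RE.search(tag)
    if m is None:
        # No "scheme://something/" at all, so there is nothing to block.
        return False
    return m.group(1).lower() != page_host.lower()
```

In this sketch a bare `value="http://">` is left alone because nothing follows the `://` up to a slash, and so is a URL that consists of a domain with no path.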


For matching the width & height values, I changed the format from [#0-75] to [#0:75], which is the newer way of doing numeric matches.

In the replacement text, I changed the variables to match the new code.
As far as using the replacement text, I probably wouldn't, but that's just me. You'll have to try it and see how you like it.

I also don't like the matching expression for width & height; I would probably delete it, or at least replace it. It really limits blocking of off-site images to fairly large ones (probably necessary to fit in the replacement text).
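
To make the effect of the width/height clause concrete, here is a small Python sketch of the same test: a tag counts as "small" if it declares a width of 0 to 75 or a height of 0 to 20, and small tags are the ones the filter leaves alone. `is_small` is a hypothetical name, and tolerating quotes around the number is an assumption of this sketch, not something the filter text confirms.

```python
import re

def is_small(tag, max_width=75, max_height=20):
    """True if the tag declares width <= max_width or height <= max_height."""
    for name, limit in (("width", max_width), ("height", max_height)):
        # Look for name=NUMBER, with optional quotes around the number.
        m = re.search(name + r'''\s*=\s*["']?(\d+)''', tag, re.IGNORECASE)
        if m and int(m.group(1)) <= limit:
            return True
    return False
```

A 468x60 banner is not small by this test and would still be a candidate for blocking, while a 60x60 button would be skipped.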

HTH
Mike

LWC

Posted: Tue Apr 13, 2004 6:18 pm

Well, I see now. Your version is sort of a compromise.
When it's not the same host, it still accepts it if there's no slash in it
(i.e. domain alone).

It's a lot better than before, but can't there be a solution that
doesn't even accept a domain?

Also, the Google page does work now, but Altavista's translation
page still doesn't work because the code there is:
Code:

<input type="text" size="45" style="width:400" name="url" value="http://" />

See that useless slash at the end? They just had to go and put it...
it's not even correct HTML and because of that little slash, even
your version still rejects the entire code.
It would be great if you could tell the filter to expect a space or
a quote sign after the URL.
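
A quick way to see why the closing slash matters is to run the Altavista tag through two Python regexes. This is an illustration only: the patterns are simplified stand-ins for the Proxomitron expressions, and the variable names `loose` and `strict` are made up here.

```python
import re

tag = '<input type="text" size="45" style="width:400" name="url" value="http://" />'

# "Host" defined as anything up to the next slash: the slash of "/>" qualifies.
loose = re.search(r'://([^/]+)/', tag)

# "Host" may not contain a quote or a space either, so it cannot start here.
strict = re.search(r'''://([^/"' ]+)''', tag)
```

The loose pattern "finds" a host consisting of a quote and a space, which is how the tag ends up looking like an off-site URL. Excluding quotes and spaces from the host part is one way to express "expect a space or a quote sign after the URL".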

Thanks!


Last edited by LWC on Wed Apr 14, 2004 5:34 am, edited 1 time in total

Lepus

Posted: Tue Apr 13, 2004 7:13 pm

one quick'n'dirty fix might be to change...

Code:
(^\h)


to...

Code:
(^\h)[a-z0-9]


an inverted match using (^....) doesn't use up any of the text - it's just a true/false test at that spot. so adding [a-z0-9] after it will start at the first character of the hostname and make sure there's at least one valid letter or number after the "http://".
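
The same "test without consuming, then require a real character" trick exists in Python regexes as a negative lookahead, so the behaviour can be sketched there. Assumptions: `offsite_pattern` is a hypothetical helper, `www.example.com` stands in for the current page's host, and the lookahead is only an analogue of (^\h), not a claim about how Proxomitron implements it.

```python
import re

def offsite_pattern(page_host):
    """Build a regex: '://' not followed by page_host, then one letter or digit."""
    # (?!...) is a zero-width test, so [a-z0-9] is checked at the same position.
    return re.compile(
        r'://(?!' + re.escape(page_host) + r')[a-z0-9]',
        re.IGNORECASE,
    )

pattern = offsite_pattern('www.example.com')
```

A bare `http://` fails because the character after it is a quote, the page's own host fails the lookahead, and any other host passes both checks.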

Quote:

See that useless slash at the end? They just had to go and put it...
it's not even correct HTML and because of that little slash, even
your version still rejects the entire code.


Actually it's valid (and even required) for XML/XHTML.

z12

Posted: Tue Apr 13, 2004 9:52 pm

Lepus, thanks for jumpin in!

Yeah, I was going the quick & dirty route with [^/]+{1,*}/

In most all of my replacement code, I use something like <proxo killed blah blah /> so I don't know where my head was at. I guess I assumed there would always be a path to an image.

I tried (^\h)[a-z0-9] on my filter but no joy. It sounds interesting, perhaps you can post something. The old \h never seems to work the way I think it would.

Anyway, here's a new filter to test drive:
Code:

Name = "New offsite-image killer 1"
Active = TRUE
Bounds = "<i(mg|mage|nput)*>"
Limit = 1024
Match = "<(\w)\0*((*alt=$AV(\1))|)*"
        "&*src=\w://(([^/"' ]+{1,*})\2&&(^\h)*)"
Replace = "<proxo killed="\0" with="\2" />"

I'm sure this isn't the final version!

For replacement text, there's plenty of options we can try:
1. Just Kill it.
2. Put the alt tag on the page.
3. Replace the image with a local ptron gif
4. Any or all of the above
5. Do something else instead.

Also, we can add image dimension checks.

Let me know how this works.

Mike

Lepus

Posted: Tue Apr 13, 2004 11:03 pm

z12 wrote:

I tried (^\h)[a-z0-9] on my filter but no joy. It sounds interesting, perhaps you can post something.


How were you using it? I tested using a very simple match of...

Code:

<tag * http://(^\h)[a-z0-9] * >


which seemed to work (at least in the tester window).

Code:

<tag src="http://" >    (no match)
<tag src="http://shonen.knife.com" >   (no match)
<tag src="http://offsite.com" >   (match)


The tester window's idea of the current URL seems to be "www.Shonen.Knife.com".

You might want to make sure some of your other logic isn't affecting this somehow.

LWC

Posted: Wed Apr 14, 2004 5:42 am

Lepus, your reverse logic worked ("must meet a positive value" instead of
"must ignore a negative value")!

Your 8 letters fixed both Altavista's and Google's translation pages.
I wonder if it fixed any other "http://" only input tags pages, but I can't
think of any others right now.

z12

Posted: Wed Apr 14, 2004 7:09 am

Hi,

Simplicity, gotta love it. That works great Lepus.

Code:

Name = "offsite-image replacer"
Active = TRUE
Bounds = "<i(mg|mage|nput)*>"
Limit = 1024
Match = "(\#( src=\w://(^\h)[a-z0-9]\w))+{1}\#"
Replace = "\# src="http://Local.ptron/clear.gif" \@"


Mike

All times are GMT - 5 Hours