SK12
Trooper
Joined: Mar 08, 2004
Posts: 11
Location: USA
|
Posted: Mon Mar 08, 2004 2:22 am Post subject: Please help - Bounds not working |
|
|
Here is the bounds match:
Code: |
<TD*CLASS="story"*</TD>
|
And the Matching expression:
Code: |
<BR>$SET(1=<br><br>)
|
The idea is to replace every <br> with two <br> tags in order to make the text more readable. Here is the HTML code that must be recognized and modified:
Code: |
<TD COLSPAN="2" valign="top" CLASS="story">
<P><br><BIG CLASS="heading">The heading</BIG><BR>
<BIG CLASS="line">
Subtext
</BIG>
<BR>
<P CLASS="story">
<TABLE BORDER="0" CELLSPACING="0" CELLPADDING="0" WIDTH="150" ALIGN="right">
<tr>
<td>&&</td>
<td align="right"><p style="font-size : 10px;">ESS/the user</p></td>
</tr>
<TR>
<TD>&&</TD>
<TD><BR>The heading<BR>&</TD>
</TR>
</TABLE>
<P CLASS="story">
Here is the story.<BR>
Another Story.<BR>
Thrd one.<BR><BR>
</TD>
|
I am amazed that <TD COLSPAN="2" valign="top" CLASS="story"> is not matched! I've been trying for hours to do something with it. Any help would be much appreciated. Thanks!
|
|
z12
Sergeant
Joined: Jul 17, 2002
Posts: 135
Location: USA
|
Posted: Mon Mar 08, 2004 6:24 am Post subject: |
|
|
What I gather from this is that you're trying to replace:
Code: |
<BR> with <BR><BR>
within the <td*class="story">
|
There are a couple of issues:
First, the Bounds Match will stop at the first </TD>
that's encountered.
Second, the Matching Expression will always fail because the initial
Code: |
<TD COLSPAN="2" valign="top" CLASS="story">
|
is not matched by anything within the Matching Expression.
Try this:
Code: |
Bounds Match =$NEST(<TD*CLASS="story"*>,</td>)
Matching Expression=(\#(<BR>)$SET(\#=<BR><BR>))+\1
Replacement Text=\@\1
|
I advise you to include a URL Match and make the Byte Limit big enough to do the job.
Mike
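For anyone who finds it easier to read code, here is a rough Python sketch of the difference between a bounds match that stops at the first </TD> and a nest-aware match in the spirit of $NEST. It is only an illustration, not how Proxomitron works internally; the function names and the sample HTML are made up for the example.

```python
import re

# Simplified sample: a "story" cell that contains a nested table, then a menu cell.
HTML = (
    '<TD CLASS="story">intro<BR>'
    '<TABLE><TR><TD>inner<BR></TD></TR></TABLE>'
    'story<BR></TD><TD>menu<BR></TD>'
)

def first_close_span(html):
    """Match from the opening tag to the FIRST </TD>, ignoring nesting."""
    m = re.search(r'<TD[^>]*CLASS="story"[^>]*>.*?</TD>', html, re.I | re.S)
    return m.group(0) if m else None

def nested_span(html):
    """Match from the opening tag to its balancing </TD> by counting depth."""
    start = re.search(r'<TD[^>]*CLASS="story"[^>]*>', html, re.I)
    if not start:
        return None
    depth = 0
    rest = html[start.start():]
    for tag in re.finditer(r'<(/?)TD\b[^>]*>', rest, re.I):
        depth += -1 if tag.group(1) else 1   # closing tag: -1, opening tag: +1
        if depth == 0:                       # back at the level we started from
            return rest[:tag.end()]
    return None                              # unbalanced markup: no match

print(first_close_span(HTML))  # ends inside the nested table
print(nested_span(HTML))       # ends at the story cell's own </TD>
```

The first function cuts the story cell short at the inner table's </TD>, which is why the later <BR> tags were never seen by the Matching Expression.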
|
|
SK12
Trooper
Joined: Mar 08, 2004
Posts: 11
Location: USA
|
Posted: Mon Mar 08, 2004 11:04 am Post subject: |
|
|
Thanks! It seems much clearer to me now. There is a side effect, though, when I set the byte limit too high: <BR><BR> starts to appear in many other <TD> areas that contain a single <BR>, too, although they don't match the bounds.
I am trying to clean up a Finnish newspaper site that has lots of banners and uses just one <BR> in the main story area, which makes the text hard to read. Here is the link: http://www.ess.fi. Setting the byte limit to 10,000 makes the text readable but adds extra space between the left menu items. I've had this same problem before, too, when I tried to clean up other pages. Is it somehow related to the $NEST command?
|
|
z12
Sergeant
Joined: Jul 17, 2002
Posts: 135
Location: USA
|
Posted: Mon Mar 08, 2004 11:55 am Post subject: |
|
|
I haven't had a chance to look at the link you posted, but the $NEST command will match all the way to the end of the code you posted earlier.
Is this the section of code you want to match?
Code: |
<P CLASS="story">
Here is the story.<BR>
Another Story.<BR>
Thrd one.<BR><BR>
</TD>
|
Mike
Edit:
I checked out the link, but I can't read it!
|
|
SK12
Trooper
Joined: Mar 08, 2004
Posts: 11
Location: USA
|
Posted: Mon Mar 08, 2004 2:40 pm Post subject: |
|
|
Yeah, it's not a very common language. I simplified the code, actually, hoping that would make it easier for others to help me solve this problem. The real code looks like this (with most of the Finnish text still ripped out):
Code: |
<TD COLSPAN="2" valign="top" CLASS="juttu">
<!-- tulosta_juttu --><!-- hae_tamapaivays --><!-- /hae_tamapaivays --><!-- muokkaa_ja_esita_juttu --><P><br><BIG CLASS="otsikko">Heading</BIG><BR>
<P CLASS="juttu">
The article text and single <BR>:s are here
<P><BIG CLASS="lisarivi">Subheading</BIG><P CLASS="juttu">
The text going on, with <BR>:s
<P CLASS="juttu"><A HREF="index.pl?osasto=1&pvm=2004/03-08&juttu=00308173020">Lue lisää...</A><BR>&
<!-- /muokkaa_ja_esita_juttu --><!-- /tulosta_juttu --> <BR>
</TD>
|
So I would like it to match everything inside the outermost <TD> tags. In the meantime I also tried setting the byte limit very high (around 20,000), and this messed the page up a lot.
Edit: I'll check the code more carefully. I guess that, for some reason, the matching starts "leaking" outside the right <TD> tags at high byte limit values.
Edit: there is one <B CLASS="juttu"> (inside the <TR> tags) right before the <TR><TD that prints the left menu, in case that means something.
|
|
z12
Sergeant
Joined: Jul 17, 2002
Posts: 135
Location: USA
|
Posted: Mon Mar 08, 2004 4:01 pm Post subject: |
|
|
OK, I think the problem was in the bounds match: it wasn't specific enough, so it matched more than once.
Try this for the Bounds Match:
Code: |
$NEST(<TD COLSPAN="2" valign="top" CLASS="juttu">,</td>)
|
When I did a debug on this version, it matched only once and in the right place (if I read my Finnish right).
For the Byte Limit I used 2048. The text that matched was 1455 characters, so you may have to tweak that a bit.
Hope that works.
Mike
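To picture why the Byte Limit has to be larger than the matched text, here is a small Python sketch. It assumes the limit simply caps how far past the opening bound the search may look, which is a simplification and not a description of Proxomitron's internals. It also counts characters rather than bytes and ignores nesting. The function name and sample page are invented for the example.

```python
def bounded_match(page, open_tag, close_tag, byte_limit):
    """Return the text from open_tag through close_tag, or None if out of reach."""
    start = page.find(open_tag)
    if start == -1:
        return None
    window = page[start:start + byte_limit]  # never look further than the limit
    end = window.find(close_tag)
    if end == -1:
        return None                          # closing bound lies beyond the limit
    return window[:end + len(close_tag)]

# A story cell of roughly the size mentioned above, followed by more page content.
page = '<TD CLASS="juttu">' + "x" * 1400 + "</td><p>rest of page</p>"

print(bounded_match(page, '<TD CLASS="juttu">', "</td>", 1024))  # None: limit too small
print(len(bounded_match(page, '<TD CLASS="juttu">', "</td>", 2048)))
```

With a limit that is too small the bounds never close and nothing is filtered; with a very large limit a loosely written bound has more room to match in places you did not intend.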
|
|
SK12
Trooper
Joined: Mar 08, 2004
Posts: 11
Location: USA
|
Posted: Tue Mar 09, 2004 9:11 am Post subject: |
|
|
Thanks for that idea! I made a little change: $NEST(<TD (COLSPAN="2" valign="top" | )CLASS="juttu">,</td>) Now it can handle pages where the cell has just CLASS="juttu", too. I set the limit to 10,000 and it works just fine. Then a new problem appeared: when a story has an image with it, the filter doesn't work. For example, with this kind of code:
Code: |
<TD CLASS="juttu">
<!-- tulosta_juttu --><!-- valitse_kansio --><!-- /valitse_kansio --><!-- varmista_sisalto --><!-- /varmista_sisalto --><!-- muokkaa_ja_esita_juttu --><table border="0"><tr><td><P><BR><BR><BIG CLASS="otsikko">Torpedos jatkaa keilailun SM-liigassa </BIG><BR><BR>
<P CLASS="juttu">
<TABLE BORDER="0" CELLSPACING="0" CELLPADDING="0" WIDTH="150" ALIGN="right">
<TR>
<TD>&&</TD>
<TD>
<IMG SRC="imagesource.jpg" BORDER="0"><BR>Some text<BR>&
</TD>
</TR>
</TABLE><P CLASS="juttu">
And the story follows
|
This new nested table with its <TD> tags probably has some effect on the filter.
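As a side note, the optional-attribute trick in the changed bounds, (COLSPAN="2" valign="top" | ), can be pictured as an optional group in an ordinary regular expression. The Python below is only an approximate translation for illustration; the name is_story_cell is made up, and Proxomitron's own whitespace handling may differ.

```python
import re

# Approximate regex equivalent of: <TD (COLSPAN="2" valign="top" | )CLASS="juttu">
OPENING_TD = re.compile(
    r'<TD\s+(?:COLSPAN="2"\s+valign="top"\s+)?CLASS="juttu">',
    re.I,
)

def is_story_cell(tag):
    """True if the opening tag is a story cell, with or without the extra attributes."""
    return OPENING_TD.fullmatch(tag) is not None

print(is_story_cell('<TD COLSPAN="2" valign="top" CLASS="juttu">'))  # True
print(is_story_cell('<TD CLASS="juttu">'))                           # True
print(is_story_cell('<TD CLASS="menu">'))                            # False
```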
Last edited by SK12 on Tue Mar 09, 2004 3:34 pm, edited 1 time in total
|
|
z12
Sergeant
Joined: Jul 17, 2002
Posts: 135
Location: USA
|
Posted: Tue Mar 09, 2004 3:20 pm Post subject: |
|
|
Well, I'm learning some things too!
Try this:
Code: |
Bounds Match:$NEST(<td,</td>)
Matching Expression:(<TD CLASS="juttu"*>)\#(\#(<BR>)$SET(\#=<BR><BR>))+\#
Replacement Text:\@
|
If I add this to the end of the sample code shown:
Code: |
</td>
</tr>
</table>
</td>
|
it closes everything up, and the filter matches the way I think it should.
Mike
|
|
SK12
Trooper
Joined: Mar 08, 2004
Posts: 11
Location: USA
|
Posted: Tue Mar 09, 2004 4:17 pm Post subject: |
|
|
This is groovy, you did it! All the pages now look fine, with or without images. Please tell me how you came up with this expression. I am just starting to learn regular expressions; I "found" Proxomitron just 2 months ago.
Please correct me if I am wrong in translating your expression:
1. The first \# captures the text from <TD class="juttu"*> up to the first <BR> and pushes it onto the stack.
2. In the loop, \# captures anything before the next <BR>, then pushes <BR> onto the stack and replaces it with <BR><BR> on top of the stack?
3. Then + tries to repeat step 2; once the loop is finished, the last \# pushes everything up to the ending bound onto the stack.
In the replacement, \@ "pops" all the items off the stack into the page code.
Sven
|
|
z12
Sergeant
Joined: Jul 17, 2002
Posts: 135
Location: USA
|
Posted: Tue Mar 09, 2004 5:23 pm Post subject: |
|
|
Glad it finally works!
I wasn't sure if you would need <td *class="juttu"*> in the matching expression.
I've been using Proxo for several years now; I can't remember not using it! I've spent a lot of time reading the help files, looking at other people's filters, and trying stuff.
As far as this filter goes, it looks to me like you got it figured out:
Code: |
Match:
1. The first \# captures the text from <TD class="juttu"*> up to the first <BR> and pushes it onto the stack.
2. In the loop, \# captures anything before the next <BR> and pushes it onto the top of the stack.
3. <BR> is not captured.
4. $SET pushes <BR><BR> onto the top of the stack.
5. Loop over steps 2, 3, and 4 until there are no more <BR>s.
6. \# pushes the remaining text up to the ending bound onto the stack.
Replace: \@ "pops" all the items off the stack into the page code.
Note: if you leave out the $SET, all the <BR>s will be removed. Not what you want for this, but it's handy for stripping stuff out.
|
Glad to help a fellow proxo fan!
Mike
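The steps above can be mimicked in a few lines of Python, which may help if the stack idea is new. This is just a sketch of the same push-and-join logic, assuming case-insensitive matching of <BR>; it is not Proxomitron code, and rewrite_br is a made-up name.

```python
import re

def rewrite_br(block, replacement="<BR><BR>"):
    """Rebuild `block`, swapping each <BR> for `replacement` via a list used as a stack."""
    stack = []
    pos = 0
    for br in re.finditer(r"<BR>", block, re.I):
        stack.append(block[pos:br.start()])  # \# : text before this <BR>
        stack.append(replacement)            # $SET(\#=<BR><BR>) : pushed in its place
        pos = br.end()                       # the original <BR> itself is not kept
    stack.append(block[pos:])                # final \# : text up to the ending bound
    return "".join(stack)                    # \@ : write out everything, in order

cell = '<TD CLASS="juttu">First line.<BR>Second line.<br>Third line.</TD>'
print(rewrite_br(cell))
print(rewrite_br(cell, replacement=""))      # like leaving out $SET: the <BR>s vanish
```

Passing an empty replacement corresponds to the note about leaving out $SET: the captured text is kept and the <BR> tags simply disappear.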
|
|
SK12
Trooper
Joined: Mar 08, 2004
Posts: 11
Location: USA
|
Posted: Wed Mar 10, 2004 4:49 am Post subject: |
|
|
Now I've been messing with that newspaper a lot more. I have centered the main table, removed the almost-empty right bar, and also made my own custom menu (I am using it to replace the default one just below the logo).
Now I've run into another problem. There is a pop-up function, and the menu uses it. The menu shouldn't appear in this pop-up window, but it does. The URLs where my new menu should appear look like this: http://www.ess.fi/cgi-bin/* but the URL of the pop-up window looks like this: http://www.ess.fi/webcam/uula.htm. There are some other pop-ups, too, all without cgi-bin. So I guess my menu-injecting filter should only apply to URLs with cgi-bin. How would I implement that? I tried this:
Code: |
URL match: www.ess.fi/cgi-bin/*& $TYPE(htm)
|
But Proxomitron still applies the filter to the pages without cgi-bin as well. It also still matches when I leave the URL match empty.
Also, what is the best way to inject my own code into the existing code? I know that consuming old code is sometimes bad. But (^(^<tag)) or (^(^tag>)) tends to crash Proxo or make it very slow (when $SET() follows). So I've been using an approach like this:
Code: |
<tag>$SET(1=<tag><my own tag1><my own tag2>...)
|
|
|
z12
Sergeant
Joined: Jul 17, 2002
Posts: 135
Location: USA
|
Posted: Wed Mar 10, 2004 6:37 am Post subject: |
|
|
To test the URL match, I made a simple filter:
Code: |
Filter Name:title test match
URL Match:www.ess.fi/cgi-bin/*
Bounds Match:(<title>)\1
Matching Expression:*
Replacement Text:\1
|
When I went here:
Code: |
+++GET 420+++
GET /cgi-bin/uusinetlari/index.pl HTTP/1.1
Host: www.ess.fi
User-Agent: Mozilla/5.0 (U; en-US; rv:1.7a) Gecko/20040220 Firefox/0.8.0+
Accept: */*
Accept-Encoding: gzip, deflate
Connection: close
|
the filter matches.
When I went here:
Code: |
+++GET 465+++
GET /webcam/uula.htm HTTP/1.1
Host: www.ess.fi
User-Agent: Mozilla/5.0 (U; en-US; rv:1.7a) Gecko/20040220 Firefox/0.8.0+
Accept: text/html
Accept-Encoding: gzip, deflate
Connection: close
|
the filter doesn't match.
Note: both the main page & pop-up have a title tag.
So, for the URL match, www.ess.fi/cgi-bin/* seems to work.
You might try clearing your browser's cache and see if that helps.
By the way, I'll be out of town till Friday night, so I won't be able to check back here till then.
Let me know if this works.
Mike
Edit:
For injecting the code, it depends.
Are you replacing code or adding new code?
If you're injecting JavaScript for this, you could do that at the end of the page.
|
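The URL test above can be reproduced outside Proxomitron with a shell-style wildcard match. The Python sketch below assumes the URL Match is compared against host plus path without the http:// prefix, and that * matches any run of characters; url_matches is a made-up helper, not part of any Proxomitron tooling.

```python
from fnmatch import fnmatchcase

PATTERN = "www.ess.fi/cgi-bin/*"

def url_matches(host, path, pattern=PATTERN):
    """Compare host + path against a wildcard pattern, like the URL Match field."""
    return fnmatchcase(host + path, pattern)

# The two requests from the log output above.
print(url_matches("www.ess.fi", "/cgi-bin/uusinetlari/index.pl"))  # True
print(url_matches("www.ess.fi", "/webcam/uula.htm"))               # False
```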
|
Kye-U
Sergeant
Joined: Oct 18, 2003
Posts: 149
|
Posted: Wed Mar 10, 2004 4:38 pm Post subject: |
|
|
Try this:
Filter Name:title test match
URL Match:www.ess.fi/(cgi-bin/|)*
Bounds Match:(<title>)\1
Matching Expression:*
Replacement Text:\1 |
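One thing worth checking before relying on this pattern: the alternation (cgi-bin/|) has an empty second branch, so it accepts URLs with or without cgi-bin. The Python sketch below is a rough regex translation of the URL Match (an approximation, not Proxomitron's own matcher) that tests it against the two URLs from this thread.

```python
import re

# Rough regex translation of: www.ess.fi/(cgi-bin/|)*
PATTERN = re.compile(r"www\.ess\.fi/(?:cgi-bin/|).*")

def url_matches(url):
    """True if the whole URL (without http://) fits the pattern."""
    return PATTERN.fullmatch(url) is not None

print(url_matches("www.ess.fi/cgi-bin/uusinetlari/index.pl"))  # True
print(url_matches("www.ess.fi/webcam/uula.htm"))               # True as well
```

Under this reading the pattern covers every page on the site, so it would suit a filter meant for the whole site rather than one limited to the cgi-bin pages.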
|
|
z12
Sergeant
Joined: Jul 17, 2002
Posts: 135
Location: USA
|
Posted: Fri Mar 12, 2004 5:50 pm Post subject: |
|
|
Hi Sven, did you get your code injection problem sorted out?
Mike |
|
|
SK12
Trooper
Joined: Mar 08, 2004
Posts: 11
Location: USA
|
Posted: Sat Mar 13, 2004 3:13 am Post subject: |
|
|
Thanks Kye-U! That one might come in handy.
Mike: Hi! I've been very busy and haven't had much time to try Proxo in the last two days. I was just wondering how I could always inject code without consuming any existing tags. I think I've figured out that when the original tags are "consumed" like this: <original tag>$SET(1=<original tag><my tag1>...), they cannot be reused in other filters, because Proxo has already "been there". I am just not sure whether I have understood the code injection concept correctly.
|
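The "consume the tag and write it back with additions" approach described above can be pictured with a plain regex substitution. This Python sketch shows only that single step; it does not model how several Proxomitron filters interact on the same text, and inject_after and the sample markup are made up for illustration.

```python
import re

def inject_after(html, tag_pattern, extra):
    """Insert `extra` right after the first tag matching `tag_pattern`.

    The matched tag is consumed and then written back unchanged, followed by
    the new markup, much like <tag>$SET(1=<tag><my own tag1>...).
    """
    return re.sub(
        tag_pattern,
        lambda m: m.group(0) + extra,  # original tag first, then the addition
        html,
        count=1,
        flags=re.I,
    )

page = '<html><body class="x"><p>hi</p></body></html>'
print(inject_after(page, r"<body[^>]*>", '<div id="my-menu"></div>'))
```

Using a function as the replacement avoids having to escape backslashes in the injected markup.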
Back to top |
|
|
|
|
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum
|
Powered by phpBB 2.0.8a © 2001 phpBB Group
Version 2.0.6 of PHP-Nuke Port by Tom Nitzschner © 2002 www.toms-home.com
Version 2.2 by Paul Laudanski © 2003-2004 Computer Cops
|