HTMLelem 1.2
Generates listings from an HTML document.
by Jenda@Krynicky.cz
htmlelem { -tag name [-if condition] -out format }+ < input > output
! htmlelem is a filter !
> htmlelem -tag a -if href -out "Line %4$_line$: $href$\n"
Means: print "Line ", the line number, ": " and the href argument
of every <a> tag that has an HREF argument.
Every printout has to be on a new line.
If we specify more than one tag, all are printed in order of appearance.
The condition and format apply only to the last specified tag!
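To make the effect of the example concrete, here is a rough Python sketch that produces the same kind of listing with the standard html.parser module. It is not how htmlelem is implemented; the names HrefLister and list_hrefs are made up for this illustration.

```python
from html.parser import HTMLParser


class HrefLister(HTMLParser):
    """Collects one formatted line per <a> tag that carries an href argument."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands over tag and argument names already lowercased.
        attr_map = dict(attrs)
        if tag == "a" and "href" in attr_map:
            line_no, _column = self.getpos()  # position where the tag starts
            href = attr_map["href"] or ""     # an argument without a value is None
            # Roughly what -out "Line %4$_line$: $href$\n" asks for.
            self.lines.append("Line %4d: %s" % (line_no, href))


def list_hrefs(html_text):
    """Return the listing lines for html_text (hypothetical helper)."""
    parser = HrefLister()
    parser.feed(html_text)
    parser.close()
    return parser.lines


if __name__ == "__main__":
    sample = '<html>\n<a href="x.html">x</a>\n<a name="n">y</a>\n<A HREF="pic.gif">z</A>\n'
    for line in list_hrefs(sample):
        print(line)
```

The second link in the sample has no href, so it is skipped, just as the -if href condition would skip it.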
parameters:
-tag name_of_tag [ name_of_tag ... ] - you can specify several tag names
-t name_of_tag - short form of -tag
-if condition
-i condition - short form of -if
condition :- condition ' -or ' condition
| condition ' -OR ' condition
| { item ' ' }+
- item list, all items should be true (AND)
item :- name - is this argument present?
| '!' name - is this argument absent?
!!! NO SPACE between '!' and name !!!
| name relation value - does the relation hold?
- ! no spaces in between !
- Ex: 'href=\.gif$'
| '!' name relation value - negate the relation
| ' [ ' condition ' ] ' - group - the spaces are necessary !!!
| ' ![ ' condition ' ] ' - negated group
- you cannot use "!" on its own ("! item" is not correct) !!!
name :- {alphanumeric | '_'}+
relation :- '=' - matches a regexp
| '==' - numerically equal
| '!=' - does not match a regexp
| '<>' - numerically not equal
| '<' - less than (numerically)
| '>' - greater than (numerically)
| '<=' - less than or equal (numerically)
| '>=' - greater than or equal (numerically)
value :- decimal_number
| regular_expression
You can use -and/-AND; it is ignored.
By default, regular expressions are case insensitive.
You can change that with the -i switch.
-i+ case insensitive
-i- case sensitive
-i the same as -i+
The -i switch may occur as a standalone switch or inside a condition.
'-tag a -i- -if ...' , '-tag ... -if ... -i- ... -out ...'
! '-tag h1 h2 -i- h3 h4 -if ...' is wrong !
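The item rules above can be sketched in Python as follows. This is only an illustration of the described semantics, not htmlelem's own code: the helper name item_holds is invented, Python's re syntax stands in for the tool's regular expressions, and the treatment of a missing argument or of non-numeric text in a numeric relation is an assumption.

```python
import operator
import re

# [!]name[relation value], with no spaces in between (see the grammar above).
ITEM = re.compile(r"^(!?)([A-Za-z0-9_]+)(?:(==|!=|<>|<=|>=|=|<|>)(.*))?$")

NUMERIC = {"==": operator.eq, "<>": operator.ne, "<": operator.lt,
           ">": operator.gt, "<=": operator.le, ">=": operator.ge}


def item_holds(item, attrs, ignore_case=True):
    """Evaluate one item against attrs, a dict of lowercased argument names
    mapped to string values. ignore_case mirrors the -i+ / -i- switch."""
    m = ITEM.match(item)
    if m is None:
        raise ValueError("not a valid item: %r" % item)
    negated, name, relation, value = m.groups()
    actual = attrs.get(name.lower())
    if relation is None:
        result = actual is not None          # bare name: is the argument present?
    elif actual is None:
        result = False                       # assumption: missing argument fails
    elif relation in ("=", "!="):
        flags = re.IGNORECASE if ignore_case else 0
        matched = re.search(value, actual, flags) is not None
        result = matched if relation == "=" else not matched
    else:
        try:
            result = NUMERIC[relation](float(actual), float(value))
        except ValueError:
            result = False                   # assumption: non-numeric text fails
    return (not result) if negated else result


if __name__ == "__main__":
    print(item_holds(r"href=\.gif$", {"href": "PIC.GIF"}))                     # default -i+
    print(item_holds(r"href=\.gif$", {"href": "PIC.GIF"}, ignore_case=False))  # like -i-
```

An item list (AND) would then be all(item_holds(i, attrs) for i in items), and -or would combine such lists with any().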
-out format
-o format - short form of -out
format - similar to C.
You can use escape sequences,
variables : $name$
or %format$name$ - where format means the same as in C
but you DO NOT write the last character specifying type !!!
(%5$line$ instead of %5d$line$ !!!)
variables $_line$ - number of the line the tag starts at.
$_pos$ - its position on the line
$_gnum$ - ordinal number among all printouts
$_num$ - ordinal number within one specification (-tag ...)
$_body$ - the body of the tag : THIS
may occur only once in a format
%c$_body$ means : skip comments
%t$_body$ means : skip tags
$_tag$ - print the whole tag as HTML, without body!
$_name$ - name of the tag
$name$ - argument "name" from the tag - case insensitive
(Ex: $href$, $onClick$ ...)
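As an illustration of the $name$ and %format$name$ notation, the following Python sketch expands such a format string from a dictionary of values. It is an approximation written for this note, not the tool's implementation: the function name expand is hypothetical, the type character is guessed from the Python value (d for integers, s otherwise), and escape sequences as well as the special %c$_body$ / %t$_body$ forms are not handled.

```python
import re

# Optional "%flags/width/precision" (without the type character), then $name$.
PLACEHOLDER = re.compile(r"(?:%([-+ 0#]*\d*(?:\.\d+)?))?\$(\w+)\$")


def expand(fmt, values):
    """Expand $name$ and %spec$name$ in fmt using values (keys lowercased)."""

    def substitute(match):
        spec, name = match.group(1), match.group(2)
        value = values.get(name.lower(), "")   # names are case insensitive
        if spec is None:
            return str(value)
        # The format omits the type character, so pick one from the value.
        kind = "d" if isinstance(value, int) else "s"
        return ("%" + spec + kind) % value

    return PLACEHOLDER.sub(substitute, fmt)


if __name__ == "__main__":
    print(expand("Line %4$_line$: $href$", {"_line": 12, "href": "index.html"}))
```

Here %4$_line$ is treated like C's %4d, which is why the format itself carries no type character.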
Arguments are printed without quotes. If you want to print them with
the quotes that were used in the document use $~name$ instead of $name$.
! If you're producing HTML, use $~name$ and not "$name$" - the argument
can contain the other kind of quotes so it could become incorrectly quoted !
Ex:
--f file.tag Inserts the contents of file.tag into the program arguments.
The file should contain arguments just as they are on
the command line except without system processing -
no variable expansion etc.
You can mix normal arguments with several --f.
see (*.tag)