HTMLelem 1.2
Generate listings from an HTML document.
by Jenda@Krynicky.cz

  htmlelem { -tag name [-if condition] -out format }+ < input > output

  ! htmlelem is a filter !

  > htmlelem -tag a -if href -out "Line %4$_line$: $href$\n"

  Means: print "Line ", the line number, ": " and the href argument of
  every A tag that has an HREF argument.
  Every printout has to be on a new line (hence the \n).

  If we specify more than one tag, all are printed in the order of
  appearance. The condition and format apply only to the last specified tag!

parameters:

  -tag name_of_tag [ name_of_tag ... ]   - you can specify more than one tag name
  -t name_of_tag                         - short for -tag

  -if condition
  -i condition                           - short for -if

    condition :- condition ' -or ' condition
               | condition ' -OR ' condition
               | { item ' ' }+     - item list, all items must be true (AND)

    item :- name                   - is this argument present?
          | '!' name               - is this argument absent?
                                     !!! NO SPACE between '!' and name !!!
          | name relation value    - does the relation hold?
                                     ! no spaces in between !
                                     Ex: 'href=\.gif$'
          | '!' name relation value
                                   - negate the relation
          | ' [ ' condition ' ] '  - group
                                     the spaces are necessary !!!
          | ' ![ ' condition ' ] ' - negated group
                                     you cannot use "!" on its own
                                     ("! item" is not correct) !!!

    name :- {alphanumeric | '_'}+

    relation :- '='    - matches a regexp
              | '=='   - numerically equal
              | '!='   - does not match a regexp
              | '<>'   - numerically nonequal
              | '<'    - less              \
              | '>'    - greater            \___ numerically
              | '<='   - less or equal      /
              | '>='   - greater or equal  /

    value :- decimal_number
           | regular_expression

    You can use -and/-AND. It is ignored.

    By default, regular expressions are case insensitive.
    You can change that with the -i switch.

      -i+   case insensitive
      -i-   case sensitive
      -i    the same as -i+

    The -i switch may occur as a standalone switch or inside a condition:
      '-tag a -i- -if ...' , '-tag ... -if ... -i- ... -out ...'
    ! '-tag h1 h2 -i- h3 h4 -if ...' is wrong !

  -out format
  -o format                              - short for -out format

    - similar to C.
    You can use escape sequences and variables:
      $name$ or %format$name$
        - where format means the same as in C, but you DO NOT write
          the last character specifying the type !!!
          (%5$line$ instead of %5d$line$ !!!)

    variables

      $_line$  - number of the line the tag starts at
      $_pos$   - its position on the line
      $_gnum$  - ordinal number over all printouts
      $_num$   - ordinal number within one specification (-tag ...)
      $_body$  - the body of the tag
                 : THIS may occur only once in a format
                 %c$_body$ means: skip comments
                 %t$_body$ means: skip tags
      $_tag$   - print the whole tag as HTML, without the body!
      $_name$  - name of the tag
      $name$   - argument "name" from the tag - case insensitive
                 (Ex: $href$, $onClick$ ...)

    Arguments are printed without quotes. If you want to print them with
    the quotes that were used in the document, use $~name$ instead of $name$.

    ! If you're producing HTML, use $~name$ and not "$name$" - the
      argument can contain the other kind of quotes, so it could become
      incorrectly quoted !
    Ex:

  --f file.tag

    Inserts the contents of file.tag into the program arguments.
    The file should contain arguments just as they are on the command line,
    except without system processing - no variable expansion etc.
    You can mix normal arguments with several --f.
    see (*.tag)
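
To make the -out format rules more concrete, here is a minimal Python sketch of how $name$ and %format$name$ placeholders could be expanded. It is not htmlelem's own code: the function name expand and the values dictionary are hypothetical, and the choice of the C conversion ('d' for integers, 's' for everything else) is an assumption, since the text only says that the type character is left out. $~name$ and the %c / %t forms of $_body$ are not covered.

```python
import re

# Hypothetical helper, not part of htmlelem: it only illustrates the
# $name$ and %format$name$ placeholder syntax described above.
# The optional group captures C-style flags, width and precision,
# i.e. a printf format without its final type character.
_PLACEHOLDER = re.compile(r"(?:%([-+ 0#]*\d*(?:\.\d+)?))?\$(\w+)\$")


def expand(fmt, values):
    """Fill the placeholders in fmt from values (a dict with lowercase keys)."""
    def substitute(match):
        spec = match.group(1)
        name = match.group(2).lower()   # variable names are case insensitive
        value = values.get(name, "")    # unknown names expand to nothing
        if spec is None:
            return str(value)           # plain $name$
        # Assumption: integers are formatted with 'd', anything else with 's'.
        conversion = "d" if isinstance(value, int) else "s"
        return ("%" + spec + conversion) % value
    return _PLACEHOLDER.sub(substitute, fmt)


line = expand("Line %4$_line$: $href$\n", {"_line": 12, "href": "index.html"})
print(line, end="")
```

With the values above, %4$_line$ is treated like C's %4d, so the line number is right-aligned in a field of four characters.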