.\" Process this file with
.\" groff -man -Tascii foo.1
.\"
.TH ONLY 1 "" Haskell ""
.SH NAME
only \- an advanced filter for words, lines, and more
.SH SYNOPSIS
.B only [\-[bcwlf]
.I EXPR
.B ] ...
.I file
.B ...
.SH DESCRIPTION
.B Only
is an advanced filtering tool, like
.BR grep ,
but instead of filtering only on lines, it can also filter on
characters or words, collectively called
.IR tokens .
When tokens
.IR match ,
there are two options that allow for greater control than
.BR grep .
They can appear before and/or after a
.I regex
and are called
.I absolute indices
and
.IR "relative indices" ,
respectively.
.I Absolute indices
refer to matches, whereas
.I relative indices
refer to tokens, with the match being token zero. For example,
.B -l
.I N/regex/M
will show the M-th line after the N-th occurrence of
.IR regex .
.P
For a more detailed description, see below.
.SH OPTIONS
.IP (-b|--bytes)=EXPR
Byte mode
.IP (-c|--chars)=EXPR
Character mode
.IP (-w|--words)=EXPR
Word mode
.IP (-l|--lines)=EXPR
Line mode
.IP (-f|--files)=EXPR
File mode
.SH EXTENDED DESCRIPTION
The original goal of
.B only
was to combine the features of
.BR head ,
.BR tail ,
.BR grep ,
and
.B cut
into a single utility that covers all of their features, with the
power to do much more. For example,
.B head
and
.B tail
are good for selecting the first n lines or the last n lines of a file,
but what if you want lines 10-30? Neither utility can do that alone,
and combining them to accomplish your goal is awkward.
Granted, one could probably construct a one-liner in
.B awk
or
.B perl
to achieve the desired effect, but at the expense of clarity.
.P
To give an overview of the features of
.BR only ,
there are two major kinds of inputs:
.I files
and
.IR modes .
A file can either be a filename or
.B '\-'
which means standard input. The modes currently supported are:
.BR bytes ,
.BR characters ,
.BR words ,
.BR lines ,
and
.BR files .
The difference between each mode is what the pattern /^.*$/ will match.
When a number is given instead of a pattern, it refers to a token of
the appropriate type: for example, the first word, the second line, etc.
.P
In byte mode, the input is broken up into 8-bit octets, so a pattern
must match only a single byte. In character mode, the input is broken
up according to the specified encoding (or UTF-8 if unspecified), where
each character may be multiple bytes. In word mode, the separators can
be any white-space, so
.B only
tries to remember which separator was originally there and puts it back
before displaying. In line mode,
.B only
behaves very similarly to
.B grep
but with a few extra features. In file mode, the filenames are not
shown (unless -F is used) but the entire file is shown if it matches
the pattern.
.SH \ \ \ Syntax
.I Matching expressions
are expressions written in a small language that forms a superset of
regular expressions. The
.B syntax
of matching expressions is the same regardless of the current mode.
This is true even of byte mode, where you must write "\\xFF" if you
want a non-printable character. Matching expressions can be as simple
as a number or a word. First,
.B only
tries to parse an expression as a number, then as an expression of the
form
.IR N/regex/M ,
and if that fails, it treats the entire expression as a regex.
Each N and M may be a numeric expression.
.I Numeric expressions
have the syntax (in pseudo-Parsec):
.P
.nf
    num     = [+\-]?[0-9]+
    numeric = sepBy numbers ','
    numbers = num ';' num ':' num   # from A to C step (B-A)
            | num ':' num ';' num   # from A to B step C
            | num ':' num           # from A to B
            | num
.fi
.P
which means you can specify just a single number (3) or something as
complicated as multiple ranges (such as 3:5,100:109). These numeric
expressions can occur on either side of the regex, or on both sides
with a combined effect.
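.P
As a rough illustration of the numeric grammar above, the following
Python sketch expands a numeric expression into the list of integers it
denotes. It is not taken from the implementation of
.BR only ;
the names
.I expand_numeric
and
.I expand_numbers
are hypothetical. The sketch assumes that the sign is optional, that
ranges include both endpoints, and that the form A;B:C steps by B minus
A. Negative numbers are kept as written, so resolving them against the
number of matches is left out.
.P
.nf
```python
import re

# One signed or unsigned integer, captured as a group.
N = r"([+-]?[0-9]+)"


def inclusive(start, stop, step):
    """List the integers from start to stop, both included, by step."""
    if step == 0:
        raise ValueError("step must not be zero")
    end = stop + (1 if step > 0 else -1)
    return list(range(start, end, step))


def expand_numbers(item):
    """Expand one 'numbers' alternative, tried in grammar order."""
    m = re.fullmatch(N + ";" + N + ":" + N, item)   # A;B:C
    if m:
        a, b, c = map(int, m.groups())
        return inclusive(a, c, b - a)
    m = re.fullmatch(N + ":" + N + ";" + N, item)   # A:B;C
    if m:
        a, b, c = map(int, m.groups())
        return inclusive(a, b, c)
    m = re.fullmatch(N + ":" + N, item)             # A:B
    if m:
        a, b = map(int, m.groups())
        return inclusive(a, b, 1)
    m = re.fullmatch(N, item)                       # single number
    if m:
        return [int(m.group(1))]
    raise ValueError("not a numeric expression: " + item)


def expand_numeric(expr):
    """Expand a comma-separated numeric expression."""
    result = []
    for item in expr.split(","):
        result.extend(expand_numbers(item))
    return result
```
.fi
.P
Under these assumptions, expand_numeric("3:5,100:109") yields 3, 4, 5
followed by 100 through 109.
.P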
The syntax of the entire matching expression is:
.P
.nf
    expr = do optional numeric
              c <- punct ; regex ; c
              (try c ; regex ; c ; optional num | optional numeric)
         | numeric
         | regex
.fi
.P
where
.I punct
is any ASCII punctuation character except ".,:;", and
.I regex
is a POSIX extended regular expression.
.\" This serves to describe the syntax of matching expressions.
.SH \ \ \ Semantics
The
.B semantics
of matching expressions are a little harder to describe. However, a
generalization of the example given above should hold true:
.P
.RS
"N/regex/M" means the M-th
.I tokens
relative to the N-th
.I matches
.RE
.P
The default for
.I N
is
.B 1:-1
and the default for
.I M
is
.BR 0 .
The
.I N
are known as
.IR "absolute indices" ,
and the
.I M
are known as
.IR "relative indices" .
Absolute indices take the list of matches (the list of tokens that
were matched by the regular expression) and use the numbers in N as
indices into this list. This gives you the ability to select the first
match (1) or the last match (-1). Negative numbers count backwards
from the end of the file, so (-2) is the second to last match.
Relative indices take the list of matches and the original list of
tokens, and for each match form a virtual list where 0 refers to the
match's index in the list of tokens. This allows one to emulate
.BR grep 's
\-A (after) and \-B (before) options.
.P
Here are some equivalent command-lines for "after" a match:
.P
.nf
    grep -A3 expr file.txt
    only -l/expr/0:3 file.txt
.fi
.P
Here are some equivalent command-lines for "before" a match:
.P
.nf
    grep -B3 expr file.txt
    only -l/expr/-3:0 file.txt
.fi
.P
Normally, these would be used to select lines by number, for example
when a compiler reports an error in a file with 10 million lines and
you just want to see the surrounding text.
.SH FILES
.I ~/.onlyrc
.RS
A user configuration file. [Not implemented yet]
.RE
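.SH EXAMPLES
The semantics of "N/regex/M" described above can be sketched in Python.
This is an illustration only, not the implementation of
.BR only .
The function name
.I select
and its parameters are hypothetical, Python's re module stands in for
POSIX extended regular expressions, and the sketch assumes that selected
tokens are shown once each in their original order and that indices
falling outside the input are ignored.
.P
.nf
```python
import re


def select(tokens, pattern, absolute=None, relative=(0,)):
    """Pick tokens at the relative offsets around the chosen matches."""
    regex = re.compile(pattern)
    # Positions in the token list of every token the regex matches.
    matches = [i for i, tok in enumerate(tokens) if regex.search(tok)]
    if absolute is None:
        chosen = matches  # default N = 1:-1, that is, every match
    else:
        chosen = []
        for n in absolute:
            # 1 is the first match, -1 is the last match.
            k = n - 1 if n > 0 else len(matches) + n
            if n != 0 and 0 <= k < len(matches):
                chosen.append(matches[k])
    picked = set()
    for pos in chosen:
        for offset in relative:  # offset 0 is the match itself
            if 0 <= pos + offset < len(tokens):
                picked.add(pos + offset)
    return [tokens[i] for i in sorted(picked)]


lines = ["a", "b", "error 1", "c", "d", "error 2", "e"]
# Roughly "/error/0:1": each match and the line after it.
after = select(lines, "error", relative=range(0, 2))
# Roughly "-1/error/-1:0": the last match and the line before it.
before_last = select(lines, "error", absolute=[-1], relative=[-1, 0])
```
.fi
.P
With the sample input, the first call keeps "error 1", "c", "error 2"
and "e", and the second keeps "d" and "error 2".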