File do_it_with_sed.txt, 54.8 KB (added by Fran Boon, 14 years ago)
1Date: Wed Sep 25 13:11:26 1996
2Path: news.demon.co.uk!dispatch.news.demon.net!demon!usenet2.news.uk.psi.net!uknet!usenet1.news.uk.psi.net!uknet!EU.net!Portugal.EU.net!news.rccn.net!news.ist.utl.pt!beta.ist.utl.pt!L38076
3From: L38076@beta.ist.utl.pt (Carlos Jorge G.duarte)
4Newsgroups: comp.editors
5Subject: do-it-with-sed (long)
6Date: 24 Sep 1996 17:18:28 GMT
7Organization: Instituto Superior Tecnico
8Lines: 2137
9Distribution: inet
10Message-ID: <529554$sc4@ci.ist.utl.pt>
11NNTP-Posting-Host: beta.ist.utl.pt
12X-Newsreader: TIN [version 1.2 PL2v [AXP/VMS]]
13
Hi everyone, this is a little (~50k) document on how to use sed,
with some examples at the end.
16
17Here it is now, after my name
18
19-- Carlos
20
21----
22:r! sed -ne '/^-----/{;n;h;n;/^----/{;g;/^.\{72\}$/s/ */ /;p;};}' %
23
24Introduction
25Regular expressions
26Using sed
27Sed resume
28Sed commands
29Examples
30 Squeezing blank lines (like cat -s)
31 Centering lines
32 Delete comments on C code
33 Increment a number
34 Get make targets
35 Rename to lower case
36 Print environ of bash
37 Reverse chars of lines
38 Reverse lines of files
39 Transform text into a C "printf"able string
40 Prefix non-blank lines with their numbers (cat -b)
41 Prefix lines by their number (cat -n)
42 Count chars of input (wc -c)
43 Count lines of input (wc -l)
44 Count words of input (wc -w)
45 Print the filename component of a path (basename)
46 Print directory component of a path (dirname)
47 Print the first few (=10) lines of input
48 Convert a sed script to a bash-command-line command
49 Print last few (=10) lines of input
50 The tee(1) command in sed
51 Print uniq lines of input (uniq)
52 Print duplicated lines of input (uniq -d)
 Print only non-duplicated lines (uniq -u)
54Index of sed commands
55Author and credits and date etc...
56
57========================================================================
58
59------------
60Introduction
61------------
62
63This is a little document to help people using sed, not very fancy
64but better than nothing :-)
65
66There are several uses for sed, some of them totally exotic.
67
Most of the scripts that appear in this text are of little practical
use, as there are (UNIX) utilities that do the same job (and more)
faster and better. They are intended to show real examples
of sed, and to show both its power and its
weaknesses.
73
74========================================================================
75
76-------------------
77Regular expressions
78-------------------
79
80To know how to use sed, people should understand regular expressions (RE for
81short).
82
This is a brief summary of the regular expressions used in sed.
84
85c a single char, if not special, is matched against text.
86
87* matches a sequence of zero or more repetitions of previous char,
88 grouped RE, or class.
89
90\+ as *, but matches one or more.
91
92\? as *, but only matches zero or one.
93
\{i\} as *, but matches exactly <i> repetitions (a number between
 0 and some limit -- in Henry Spencer's regexp(3) library,
 this limit is 255)

\{i,j\} matches between <i> and <j> repetitions, inclusive.

\{i,\} matches <i> or more repetitions.

\{,j\} matches at most <j> repetitions.
103
104\(RE\) groups RE as a whole, this is used to:
105
106 - apply postfix operators, like `\(abcd\)*'
107 this will search for zero or more whole
108 sequences of "abcd", if `abcd*', it would
109 search for "abc" followed by zero or more "d"s
110
111 - use back references (see below)
112
113. match any character
114
^ match the null string at beginning of line, i.e. what
 appears after ^ must appear at the
 beginning of line

 e.g. `^#include' will match only lines where "#include" is
 the first thing on line, but if there are one or two spaces
 before, the match fails
122
123$ the same as ^, but refers to end of line
124
125\c matches character `c' -- used to match special chars,
126 referred above (and some more below)
127
128[list] matches any single char in list. e.g. `[aeiou]' matches
129 all vowels
130
131[^list] matches any single char NOT in list
132
133 a list may be composed by <char1>-<char2>, and means
134 all chars between (inclusive) <char1> and <char2>
135
136 to include `]' in the list, make it the first char
137 to include `-' in the list, make it the first or last
138
139RE1\|RE2
140 matches RE1 or RE2
141
142\1 \2 \3 \4 \5 \6 \7 \8 \9, => \i
143 matches the <i>th \(\) reference on RE, this is called
144 back reference, and usually it is (very) slow
145
146Notes:
147------
 - some implementations of sed may not have all REs mentioned,
 notably `\+', `\?', `\|' and `\{,j\}', which are not part of
 POSIX basic regular expressions

 - the RE is greedy, i.e. if two or more matches are possible, the
 one that starts first in the text is selected, and if several
 matches start at that same place, the longest of them is selected
154
155Examples:
156---------
157
158 `abcdef' matches "abcdef"
159 `a*b' matches zero or more "a"s followed by a single "b",
160 like "b" or "aaaaaab"
161 `a\?b' matches "b" or "ab"
162 `a\+b\+' matches one or more "a"s followed by one or more
163 "b"s, the minimum match will be "ab", but
164 "aaaab" or "abbbbb" or "aaaaaabbbbbbb" also
165 match
166 `.*' all chars on line, of all lines (including empty
167 ones)
168 `.\+' all chars on line, but only on lines containing
169 at least one char, i.e. empty lines will not
170 be matched)
171
172 `^main.*(.*)' search for a line containing "main" as the first
173 thing on the line, that line must also
174 contain an opening and closing parenthesis
175 being the open paren preceded and followed
176 by any number of chars (including none)
177
178 `^#' all lines beginning with "#" (shell and
179 make comments)
180
181 `\\$' all lines ending with a single `\' (there are
182 two for escaping `\') -- line continuation
183 in C and make, and shell, etc...
184
 `[a-zA-Z_]' any letter or underscore
186
187 `[^ ]\+' (a tab and a space) -- one or more sequences
188 of any char that isn't a space or tab.
189 Usually this means a word
190
 `^.*A.*$' match a whole line that contains an "A" anywhere
 (beginning, middle or end)

 `A.\{9\}$' match an "A" that is exactly the tenth character
 from the end of the line
196
197 `^.\{,15\}A' match the last "A" on the first 16 chars of the
198 line
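
A quick way to get a feel for these REs is to run them through sed itself.
The lines below are only a sketch: the sample strings are made up, and
`s/RE/[&]/' is used just to put brackets around whatever the RE matched.
`\{0,3\}' is used instead of `\{,3\}' because it is the spelling that POSIX
specifies.

```shell
# `^#include' matches only when "#include" is the very first thing on the line
anchored=$(printf '%s\n' '#include <a.h>' '  #include <b.h>' | sed -n '/^#include/p')

# `a*b' : zero or more "a"s followed by a single "b"
starred=$(printf '%s\n' 'xaaab' | sed 's/a*b/[&]/')

# `.\{3\}$' : exactly the last three chars of the line
lastthree=$(printf '%s\n' 'abcdef' | sed 's/.\{3\}$/[&]/')

# `^.\{0,3\}A' : the last "A" found within the first four chars of the line
lastA=$(printf '%s\n' 'AxAxAxA' | sed 's/^.\{0,3\}A/[&]/')

printf '%s\n' "$anchored" "$starred" "$lastthree" "$lastA"
```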
199
200========================================================================
201
202---------
203Using sed
204---------
205
206The usual format of sed is:
207
208 sed [-e script] [-f script-file] [-n] [files...]
209
210files...
211 are the files to read, if a "-" appears, read from stdin,
212 if no files are given, read also from stdin
213
214-n
 by default, sed writes each line (whatever it has become) to
 stdout when it reaches the end of the script;
 this option prevents that, i.e. there is no output unless a
 command specifically orders sed to write (like p)

-e
 an "in-line" script, i.e. a script for sed to execute, given on the
 command line. Multiple command line scripts can be given,
 each with an -e option; in fact, -e is only needed when more
 than one script is present (specified by another -e or -f
 option)
226
227-f
228 read scripts from specified file, several -f options
229 can appear
230
231- Scripts are concatenated as they appear, forming a big script.
232- That script is compiled into a sed program.
233- That program is then applied to each line of given files (the
234 script itself can change this behavior).
- The results are always written to stdout, although some commands
 can send stuff to specific files
237- Input files are seen as one to sed, i.e. `sed -n $= *' gives the
238 number of lines of ALL *, something like `cat * | wc -l'
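
The last two points can be checked with a few throw-away files. This is just
a sketch: the file names and contents are invented, and mktemp(1) is assumed
to be available.

```shell
tmp=$(mktemp -d)
printf 'a\nb\n'    > "$tmp/one"
printf 'c\nd\ne\n' > "$tmp/two"

# both files are seen as one input, so `$=' prints the total number of lines
total=$(sed -n '$=' "$tmp/one" "$tmp/two")

# two -e scripts are concatenated into one program, applied in order
both=$(printf 'foo\n' | sed -e 's/foo/bar/' -e 's/bar/baz/')

rm -rf "$tmp"
printf '%s\n' "$total" "$both"
```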
239
240I usually use (sorry the pleonasm!) sed in the following ways:
241
242---- in shell scripts, invoking sed like this
243
244#!/bin/sh
245
246sed [-n] '
247
248whole script
249
250'
251
252---- as an executable itself, like
253
254#!/usr/bin/sed -f
255
256or
257
258#!/usr/bin/sed -nf
259
260---- on the command line, as being part of a shell script, or in an
261 alias (tcsh), or in a function (bash, sh, etc)
262
For the command line, there are two things to know. First, there is no
need to use one -e for each command, although that can be done.

Second, commands may be separated by semi-colons `;', with some exceptions.
267
Example: sed '/^#/d;/^$/d;:b;/\\$/{;N;s/\\\n//;bb;}'

 this would

 /^#/d delete all lines beginning with `#' (comments?)
 /^$/d delete all empty lines (/./!d could be used instead)
 :b
 /\\$/{
 N
 s/\\\n//
 bb
 }
 would join all lines ending with `\', after deleting
 the `\' itself
282
283 the format of this explained script (except the
284 descriptions themselves) could be used in a file script,
285 but can also be given to sed on one line, without using
286 lots of '-e's
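
To try the line-joining loop out, here is a small sketch with made-up input.
The script is given in its multi-line form inside single quotes (not every
sed accepts a `;' right after a label), and it uses `s/\\\n//' so that the
`\' goes away together with the newline.

```shell
joined=$(printf '%s\n' '# a comment' '' 'one \' 'two \' 'three' 'four' | sed '
/^#/d
/^$/d
:b
/\\$/{
N
s/\\\n//
bb
}')
printf '%s\n' "$joined"
```

The comment and the empty line are dropped, and the three continued lines
come out as one.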
287
288Though, there are exceptions to this `;' ending rule: the direct text
289handling and read/write commands.
290
291There are functions, that handle user text directly (insert, append, change).
292The format of that text is
293
294command\
295first line\
296second line\
297...\
298last line
299
300 no ending \ for the last line
301
302example in a sed script file:
303
304/#include <termios\.h>/{
305 i\
306 #ifdef SYSV
307 a\
308 #else\
309 #include <sgtty.h>\
310 #endif
311}
312
313 that would search for lines `#include <termios.h>' and then
314 would write
315
316#ifdef SYSV
317#include <termios.h>
318#else
319#include <sgtty.h>
320#endif
321
Now, for writing the same script on one line, the -e mechanism
is needed... what follows each -e can be considered as an input
line from a sed script file, so nothing keeps us from doing
325
326sed -e '/#include <termios\.h>/{' \
327 -e 'i\' \
328 -e '#ifdef SYSV' \
329 -e 'a\' \
330 -e '#else\' \
331 -e '#include <sgtty.h>\' \
332 -e '#endif' \
333 -e '}'
334
on the command line. Of course the trailing `\'s (the ones outside the
quotes) can be omitted if we write all of this on one line, which makes
for a fast edit-and-test cycle
338
339and of course, lines that don't need to be alone can be joined with
340the `;' mechanism... rewriting the above, we could get something
341like:
342
343sed -e '/#include <termios\.h>/{;i\' -e '#ifdef SYSV' -e 'a\' -e '#else\' \
344 -e '#include <sgtty.h>\' -e '#endif' -e '}'
345
NOTE that this fancy work on the shell command line can be a real
pain due to the quoting mechanisms of the various shells. For [ba]sh the
above should be fine, but for [t]csh for instance, the '...\' would quote
the ' and mess everything up.
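
As a sanity check, the -e form can be run on a two-line sample (a sketch for
a Bourne-like shell; the sample input is invented):

```shell
patched=$(printf '%s\n' 'int x;' '#include <termios.h>' | sed \
    -e '/#include <termios\.h>/{' \
    -e 'i\' \
    -e '#ifdef SYSV' \
    -e 'a\' \
    -e '#else\' \
    -e '#include <sgtty.h>\' \
    -e '#endif' \
    -e '}')
printf '%s\n' "$patched"
```

The inserted text comes out before the matched line and the appended text
after it, while "int x;" passes through untouched.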
350
351--
352
353Generally speaking, we can put the above in the following manner:
354
3551. sed commands are usually on one line
356
2. if we want more (multi-line commands), then we must end the
 first line with an `\' -- this is not the same as the classic
 trailing `\' in C or make, etc... this one says: "Hey sed! This
 command has more than one line.", whereas in C, make, etc, it says: "Hey
 make, (g)cc, etc... this line is so huge that I wrote its
 continuation on the next line!"

3. if a command is one line only, it can be separated from the next by a `;'

4. if it is a multi-line command, then each of its lines
 (except the first) must be on a line by itself
368
369...and...
370
3715. on command line, what follows a `-e' is like a whole line in
372 a sed script
373
374--
375
The insert etc... commands deal with text so, obviously, they are
multi-line commands by default, i.e. at least two lines: one for the
command, and another for the text (which can be empty), but any other
command may be a potential multi-liner

The read/write commands are exceptions: they need a whole (last)
line for themselves, i.e. after the `r' or `w' the rest of
the line is treated as a filename. So nothing more can follow
them on that line (though other commands can come before them).
385
386========================================================================
387
388----------
389Sed resume
390----------
391
392Input
393-----
394
Sed's input is a set of files (stdin by default), seen as one whole stream.
396
397For instance,
398
399 sed -f some_script /etc/passwd /etc/passwd
400
401is exactly the same as
402
403 ( cat /etc/passwd; cat /etc/passwd ) | sed -f some_script
404
405or
406
407 cat /etc/passwd > foo
408 cat /etc/passwd >> foo
409
410 cat foo | sed -f some_script
411
412or yet
413
414 sed -f some_script foo
415
416i.e. lines from files are read, but no kind of information exists
417to keep track of where they come from.
418
419Description
420-----------
421
Sed reads lines from its input, and applies some actions (or commands,
or functions -- a matter of choice) to them.
424
425By default, the print command is applied before the next line is read.
426
427So
428 sed '' /etc/passwd
429
430will be like
431
432 cat /etc/passwd
433
434i.e. each line of /etc/passwd is written after being read.
435
436An equivalent form is
437
438 sed -n 'p' /etc/passwd
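
These equivalences are easy to verify. The sketch below (sample text is made
up) also shows what happens when `p' is used without -n: every line comes
out twice, once from `p' and once from the default print.

```shell
sample=$(printf 'x\ny\n')

plain=$(printf '%s\n' "$sample" | sed '')        # empty script: like cat
explicit=$(printf '%s\n' "$sample" | sed -n 'p') # no default print, explicit p
doubled=$(printf '%s\n' "$sample" | sed 'p')     # default print plus p

printf '%s\n' "$doubled"
```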
439
440The general format of an action/function/command is
441
442 [first_address][,second_address] <function> [arguments] [\]
443
444first_address
445 specifies that <function> should be executed only on lines
446 at those addresses (more of these below).
447 By default, <function> will be executed on ALL lines
448
449first_address,second_address
450 when second_address is specified, first_address must also exist,
451 and the format is as above.
452 <function> will be applied to all lines that match the formed
453 range (including bounds)
454
455function
456 see list of them below
457
458arguments
459 are particular to each function, some functions don't even
460 have arguments
461
462\
 a sed function is a one-line function, but there are some
 exceptions -- in that case, a `\' must be at the end of the
 line to tell sed that the specified function is composed
 of more than one line

 Note that this is not the classical `\' that we are used
 to seeing in C, make, sh, etc... this is not continuation
 on the next line -- a sed command is read until a line
 which does not end in a `\' is found. Usually, the line
 that contains the command satisfies this, but if a command
 extends itself across lines, then all except the last line
 must end with `\' (more about this on the i(nsert), a(ppend),
 c(hange) and s(ubstitute) commands)
476
477Applying commands
478----------------
479
480The commands are gathered into a big command buffer.
481
482They are fetched as they appear on script's input, either being
483fetched from command line, or from files.
484
485All leading space is ignored (more about this on i(nsert), and
486company).
487
Then, the big command buffer is compiled into a sed program.
The compiled form is cheap to execute on each line (it is a kind of
byte code), which is one reason why sed is fast and convenient.
491
492Each command of the program will be applied to the current line
493if there is nothing that prevents this (like specifying an
494address that does not match the current line).
495
496Commands are applied one by one, sequentially, and [possibly]
497transformations on the line are "applied" before the next
498command is executed.
499
500Sequence can be changed with some commands (more on this
501below-- b(ranch) and t(est)).
502
503Pattern space
504-------------
505
506Well, I have been referring to the input of each sed command
507as a "line".
508
509Actually this is not correct, because a sed command can be applied
510to more than one line, or even on some parts of several lines.
511
512The input of each sed command, is called "pattern space".
513
514Usually the pattern space is the current line, but this behavior
515can be changed with sed commands (N,n,x,g and G).
516
517Addresses
518---------
519
520There are two kinds of addresses: line addresses and context
521addresses.
522
523Each line read is counted, and one can use this information to
524absolutely select which lines commands should be applied to.
525
526For instance:
527
 30= will write "30" if there are at least 30 lines
 on input, because the `=' command (print current
 line number) will only be executed on line 30

 30,60= will write "30", "31"... "60" under the same condition
 as above, i.e. a number N is written only if the input
 contains at least N lines
535
536 $= will write down the number of the last line, a kind
537 of `wc -l'
538
So, summarizing:
540
541 1 first line
542 2 second line
543 ...
544 $ last line
545 i,j from i-th to j-th line, inclusive. j can be $
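
With a four-line input (made up for the occasion), the line addresses behave
like this. Just a sketch, using small numbers instead of 30 and 60:

```shell
second=$(printf 'a\nb\nc\nd\n' | sed -n '2=')    # only line 2
middle=$(printf 'a\nb\nc\nd\n' | sed -n '2,3=')  # lines 2 to 3, inclusive
last=$(printf 'a\nb\nc\nd\n' | sed -n '$=')      # last line, a kind of wc -l
none=$(printf 'a\n' | sed -n '3=')               # no line 3: nothing written

printf '%s\n' "$second" "$middle" "$last"
```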
546
The second kind of addresses are context, or RE, addresses.
They are regular expressions, and commands will be executed
on all pattern spaces matched by that RE.
550
551Examples:
552
553 /.\{73,\}/d will delete all lines that have more than 72
554 characters
555
556 /^$/d will delete all empty lines
557
558 /^$/,/^$/d delete from first empty line seen to the next
559 empty, eating everything appearing in the middle
560 (not very useful)
561
562The context addresses can be mixed up with line addresses, so:
563
 1,/^$/d delete from the first line up to and including the
 first empty line found after it
566
Summary:
--------
569
570- commands may take 0, 1 or 2 addresses
571- if no address is given, a command is applied to all pattern spaces
572- if 1 address is given, then it is applied to all pattern spaces
573 that match that address
574- if 2 addresses are given, then it is applied to all formed pattern spaces
575 between the pattern space that matched the first address, and the next
576 pattern space matched by the second address.
577
578 If pattern spaces are all the time single lines, this can be said
579 like, if 2 addrs are given, then the command will be executed on
580 all lines between first addr and second (inclusive)
581
582 If the second address is an RE, then the search starts only on
583 the next line. That's why things like this work:
584
585 /foo/,/foo/<cmd>
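
A small sketch of the `/foo/,/foo/' case, with invented input. The range
opens on a "foo", closes on the next "foo", and a later "foo" opens a new
range that simply runs to the end of input.

```shell
ranged=$(printf '%s\n' 1 foo 2 foo 3 foo 4 | sed -n '/foo/,/foo/p')

# a single context address: delete all empty lines
noempty=$(printf 'a\n\nb\n' | sed '/^$/d')

printf '%s\n' "$ranged"
```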
586
587========================================================================
588
589------------
590Sed commands
591------------
592
593The following description is arranged in this way:
594
595(arg-number)<function> -- mnemonic, short description
596
597 full description
598
599At the end of the file (after examples) is an index of all
600commands, sorted by name (i.e. letter) with the short description
601and mnemonic.
602
603Line-oriented commands
604----------------------
605
606(2)d -- d(elete), delete lines
607
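 - delete (i.e. don't write) specified lines
 - the next line is read, and execution restarts at the beginning
 of the script

 this is somewhat like

 s/.*//
 b

 except that `d' writes nothing at all, whereas the above writes an
 empty line (unless -n is in effect)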
615
616(2)n -- n(ext), next line
617
 - jumps to next line, i.e. the current pattern space is written
 (unless -n was given) and is then replaced with the contents
 of the next line
 - execution continues with the command following the `n' command
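
A sketch of `d' and `n' at work on made-up input. An even number of lines is
used on purpose, because what `n' does when there is no next line differs
between implementations.

```shell
# n then p, with -n: every second line is printed
evens=$(printf '1\n2\n3\n4\n' | sed -n 'n;p')

# n then d, without -n: n writes the odd line, d throws the even one away
odds=$(printf '1\n2\n3\n4\n' | sed 'n;d')

# d with an address: comment lines never reach the output
nocomments=$(printf '# c\nx\n' | sed '/^#/d')

printf '%s\n' "$evens" "$odds" "$nocomments"
```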
621
622Text commands
623-------------
624
625(1)a\
626<text> -- a(ppend), append lines
627
628 - add <text> after the specified line (if address isn't given, then
629 <text> will be added after EACH line of input that executes
630 this, of course)
631
632 - <text> can have any number of lines, the general format is
633
634 a\
635 1st line\
636 2nd\
637 ...\
638 last line
639 `next command'
640
641 - suppose that we have
642
643 sed -e '$a\' -e '<the end>'
644
 then a single line containing "<the end>" is appended to the file.
 If we add

 -e 's/.*//'

 as the first command, then the only thing we will see on output
 will be "<the end>", after a bunch of blank lines. i.e. <text>
 is written after the line has been processed, but this doesn't
 mean that the line will be written. Usually this is what
 happens, but nothing imposes it.
655
656(1)i\
657<text> -- (i)nsert, insert lines
658
 - works like the append command, but <text> will be inserted
 before specified line
661
662(2)c\
663<text> -- (c)hange, change lines
664
665 - this will delete current pattern space, and replace it
666 with 'text'
667
668 - this is roughly the same as insert then delete, or
669 append then delete, or
670
671 s/.*/<text>/
672 b
673
 note : sed doesn't honor leading spaces, so the leading spaces
 in <text> will be removed

 To avoid this behavior, a `\' can be placed before
 the first space that one wants to see written. That
 way the space is conveniently escaped and will
 be treated like a normal char.

 GNU sed (as of version 2.05) doesn't follow this
 ignoring-leading-space procedure, i.e. it keeps the
 leading spaces

 note2: <text> is not processed by the sed program, i.e.
 we insert/change/append raw text directly to output
687
688Substitution
689------------
690
691This command is so often used that it deserves a whole section!
692
693(2)s/RE/<replacement>/[flags] -- (s)ubstitute, substitute
694
695 - on specified lines, text matched by RE, if any, is replaced
696 by <replacement>
697
698 - if replacement is done, the flag that permits the `test'
699 command to be performed is set (more about this on
700 `t' command)
701
702 - the `/' separator, in fact could be ANY character. Usually
703 it is `/' due to the fact that almost every program with
704 regular expressions can use it. Exceptions are
705 grep and lex, that don't use any char as a delimiter.
706
707 - <replacement> is raw text. The only exceptions are:
708
709 & it is replaced by all text matched by RE
710 Being so, then
711 s/RE/&/
712 is a null op, whatever the RE, except for
713 setting the test flag
714
715 \d where `d' is a digit (see below for more),
716 is replaced by the d-th grouped \(\) sub-RE
717
718 some implementations of sed (more precisely,
719 some implementations of regex(3) library, that
720 some implementations of sed use), limit `d'
721 to be a single digit (1-9). Others, such as GNU
722 sed (2.05 at least) accept a valid number.
723
724 GNU sed also accepts and understands `\0'
725 as a `&'. i.e. the whole matched RE.
726 I don't know if this behavior is standard.
727
728 If there isn't a d-th grouped \(\), then
729 \d is replaced by the null string.
730
731 \c where `c' is any char except digits, quote `c'
732
733 Note that besides the above, _all_ other text is raw,
734 so `\n' or `\t' doesn't work as one might expect. To
735 insert a newline for instance, one must do
736
737 s/foo/bar-on-this-line\
738 foo-on-next/
739
740 - <flags> are optional, and can be combined
741
742 g replace all occurrences of RE by <replacement>
743 (the default is to replace only the first)
744
745 p write the pattern space only if the substitution was
746 successful
747
748 w <file>
749 work as `p' flag, but the pattern space is written
750 to <file>
751
752 d where `d' is a digit, replace the d-th occurrence,
753 if any, of RE by <replacement>
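
Here is a sketch that goes through the pieces of `s' one at a time (all the
sample strings are invented):

```shell
amp=$(printf 'hello world\n' | sed 's/o/[&]/')     # & is the matched text
all=$(printf 'hello world\n' | sed 's/o/[&]/g')    # g: every occurrence
second=$(printf 'hello world\n' | sed 's/o/0/2')   # 2: second occurrence only

# \1 and \2 refer to the first and second \(...\) group
swapped=$(printf 'john smith\n' | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')

# any char can be the separator, handy when the text is full of `/'
path=$(printf '/usr/local/bin\n' | sed 's|/usr/local|/opt|')

# a newline in <replacement> is written as `\' followed by a real newline
split=$(printf 'foo\n' | sed 's/foo/bar-on-this-line\
foo-on-next/')

printf '%s\n' "$amp" "$all" "$second" "$swapped" "$path" "$split"
```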
754
755Output and files
756----------------
757
758(2)p -- (p)rint, print
759
760 - write specified lines to output
761
762(2)l -- (l)ist, list
763
764 - this works more or less like vi's :list, i.e. it prints
765 specified lines, but shows some special characters in \c format
766 like \n and \t
767
768 - useful to debug sed scripts :-)
769
 note: the list command is present in GNU sed 2.05, and it is
 not a GNU extension: `l' is one of the commands specified
 for POSIX sed
774
775(2)w <filename> -- w(rite), write to <filename>
776
777 - write specified lines to <filename>
778
(1)r <filename> -- r(ead), read the contents of <filename>
780
781 - insert contents of <filename> after specified line
782
783 - there is no way of adding contents of <filename> before
784 first line, but if someone wants that, then include
785 <filename> before the other input
786
787 - if file cannot be opened, sed continues as though the
788 command doesn't exist. i.e. it silently fails
789
790Multiple lines
791--------------
792
793(2)N -- (N)ext, (add) next line
794
795 - next line of input is added to current pattern space, and
796 a `\n' gets embedded in the pattern space
797
798(2)D -- (D)elete, delete first part of the pattern space
799
 - delete everything up to (inclusive) the first newline
 and then jump to the beginning of the script; in POSIX and
 GNU sed the script restarts on what is left of the pattern
 space, without a new line of input being read

 - if the pattern space is a single line, then `D' is the same as `d'
805
806(2)P -- (P)rint, print first part of the pattern space
807
808 - writes everything up to (inclusive) the first newline
809
810 - if pattern space is a single line, then `P' is the same as `p'
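
Two small sketches of the multi-line commands, on made-up input. The first
joins lines in pairs with `N'. The second is a well-known N/P/D idiom that
behaves like uniq(1): `$!N' adds the next line (except on the last one),
`P' prints the first line only when it differs from the second, and `D'
drops the first line and starts the script again.

```shell
pairs=$(printf '1\n2\n3\n4\n' | sed 'N;s/\n/+/')

uniqd=$(printf 'a\na\nb\nb\nb\nc\n' | sed '$!N;/^\(.*\)\n\1$/!P;D')

printf '%s\n' "$pairs" "$uniqd"
```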
811
812Hold buffer
813-----------
814
815Sed contains one buffer, where it can keep temporary stuff to work on
816later.
817
818(2)h -- (h)old, hold pattern space
819
820 - copy current pattern space to hold buffer, overwriting
821 whatever was in it
822
823(2)H -- (H)old, hold pattern space -- append
824
 - add a newline followed by the current pattern space to the _end_ of the
 hold buffer (if the hold space is empty, then this is like `h', except
 for the leading newline)
827
828(2)g -- (g)et, get contents of hold area
829
830 - copy the contents of hold space to current pattern space
831 - pattern space is overwritten
832
833(2)G -- (G)et, get contents of hold area -- append
834
 - adds a newline followed by the contents of hold space to the _end_ of
 current pattern space
836
837(2)x -- e(x)change, exchange
838
839 - exchanges current pattern space with hold buffer
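
The hold buffer is easier to understand with something to run. Below is a
sketch with invented input: the first command reverses the order of the
lines (on every line but the first, `G' appends what was held so far, `h'
holds the result, and `$p' prints it at the end), the second shows that `G'
with an empty hold buffer still adds a newline.

```shell
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')

# hold buffer is empty, so G just adds an empty line after each line
spaced=$(printf 'a\nb\n' | sed G)

printf '%s\n' "$reversed" "$spaced"
```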
840
841Control flow
842------------
843
844(2)!<command> -- Don't
845
846 - negate address specification of next command
847 - note that if we omit the address, then we mean ALL lines,
848 so, negation of all is nothing. i.e.
849
850 sed '!s/foo/bar/'
851
852 will be as good as nothing
853
 On the other hand,

 sed '/./!d'

 has a real meaning: delete all empty lines. Why? Because
 `/./' matches any line with at least one char, therefore `/./!' selects
 the lines that have no char at all.
860
861 - this can be applied to negate 0, 1 or 2 addresses, negating 0
862 doesn't make much sense (as indicated above), but negating 1 or
863 2 addresses proves to be highly useful. Sometimes it is
864 easier to construct an RE that does not match what we want
865 than the other way.
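
A sketch of negated addresses on made-up input, one with a single address
and one with a range:

```shell
# delete every line that does NOT contain a char, i.e. the empty ones
nonempty=$(printf 'a\n\nb\n' | sed '/./!d')

# print every line that does NOT match /foo/ (like grep -v foo)
others=$(printf 'keep\nfoo\nkeep2\n' | sed -n '/foo/!p')

# delete everything outside lines 2 to 3
inside=$(printf '1\n2\n3\n4\n' | sed '2,3!d')

printf '%s\n' "$nonempty" "$others" "$inside"
```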
866
867(2){ -- {} as in C or sh(1), Grouping
868
869 - groups a set of commands that are executed on the specified lines
870
871 - the first command of the group may appear right after the `{'
872 (i.e. on the same line) -- usually it is kept on the next line
873
 - the closing `}' usually appears on a line by itself; on the
 command line it can instead follow a `;', as in `{;N;bb;}'
875
876 - `{...}' can be nested
877
878 addr1,addr2{
879 cmds...
880 }
881
882 can be replaced by
883
884 addr1,addr2 first_grouped_cmd
885 addr1,addr2 second_grouped_cmd
886 ...
887 addr1,addr2 last_grouped_cmd
888
889(0):<label> -- `:' usual markers of labels (C, asm, ...), place a label
890
891 - mark a place with a label, to where `t' and `b' commands can
892 jump to
893
 - note that trailing space is significant (space between command
 and arguments isn't, however), so with (output a-la vi :list)

 :label_name $

 b label_name$

 the branch will fail, because the label that was defined is
 "label_name " (with a trailing space), and there isn't any
 label called "label_name".
903
904(2)b<label> -- (b)ranch, branch to label
905
906 - do an unconditional branch to specified label
907
908 - A label is not mandatory. If it is not given, the default is
909 to jump to the end of the script. i.e. nothing more is done
910 on this line.
911
912(2)t<label> -- (t)est, test substitutions
913
914 - works like `b', but the jump is only done if a previous
915 substitution has been successfully done (on current pattern
916 space)
917
 - the flag that determines whether the jump is taken or not
 is:
 - set on a successful substitution (whatever it was)
 - reset after `t' has been executed
 - reset after reading a new line
925
926 warning:
927 a common mistake is doing something like
928
929 /./!b
930
931 s/!/!!/g
932 s/^/-!-/
933 s/$/-!-/
934
935 :a
936 s/-!-\([^!]\|!!\)\(.*\)\([^!]\|!!\)-!-/\3-!-\2-!-\1/
937 ta
938
939 s/-!-//g
940 s/!!/!/g
941
942 (this is a sed script to reverse all chars on each line)
943
 Note that `ta' will _always_ jump at least one time
 and that's not what we intend (at least, not what I intend).

 In fact, before `ta' and its related substitution there are three
 other substitutions, and of those three the last will
 _always_ be successful. So, whether the `s' right before `ta'
 succeeds or not, the flag will be set, and `ta'
 will jump anyway.

 To correct the situation, a fake `ta' is inserted right after the
 label: it jumps once, which resets the flag, so that the real `ta'
 only depends on the substitution before it.
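
The same trap shows up more visibly when `t' selects between two pieces of
code. The sketch below is not the reverse script; it is a made-up minimal
case, and the label names `done' and `reset' are arbitrary. In the first
version the flag left over from `s/x/y/' makes `t done' jump although
`s/foo/bar/' failed; in the second, a `t' to a label placed right after it
clears the flag first.

```shell
buggy=$(printf 'x\n' | sed '
s/x/y/
s/foo/bar/
t done
s/$/ (no foo)/
:done')

fixed=$(printf 'x\n' | sed '
s/x/y/
t reset
:reset
s/foo/bar/
t done
s/$/ (no foo)/
:done')

printf '%s\n' "$buggy" "$fixed"
```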
955
956Miscellaneous
957-------------
958
959(0)# -- comment
960
961 - comment. The whole line is ignored.
962
963(2)y/<list1>/<list2>/ -- (y)?, translates
964
 - remaps all characters present in <list1> to the character
 with the same index in <list2>
967
968 - the size of <list1> must be the same as <list2>
969 - all characters are literals. i.e. no ranges, etc...
970 - the separator `/' may be replaced by any other char
971
972 to remap uppercase to lower case do
973
 y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/

(1)= -- `=' like vi/ed, equals

 - writes the current line number to output
978
979(1)q -- (q)uit, quit
980
 - ends sed program. i.e. no further lines will be read, and
 command execution ends here; the current pattern space is
 still written (unless -n was given).
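
A short sketch of these three on invented input:

```shell
# y: map each upper case letter to the lower case one at the same index
lower=$(printf 'Hello World\n' | sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/')

# =: the line number is written on a line of its own, before the line
numbered=$(printf 'a\nb\n' | sed =)

# q: quit on line 2, after it has been written (a kind of head -2)
head2=$(printf '1\n2\n3\n' | sed 2q)

printf '%s\n' "$lower" "$numbered" "$head2"
```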
983
984========================================================================
985
986--------
987Examples
988--------
989
990Here are some (exotic) examples of sed use.
991
992------------------------------------------------------------------------
993 Squeezing blank lines (like cat -s)
994------------------------------------------------------------------------
995
 Leaves a blank line at the beginning and at the end, if there are
 some there already.
998
999#!/usr/bin/sed -f
1000
1001# on empty lines, join with next
1002:x
1003/^\n*$/{
1004N
1005bx
1006}
1007
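# now, squeeze the leading '\n's into one; this can also be done by
# s/^\(\n\)*/\1/
# (a bare `\n*' would also match the empty string, and so would add a
# blank line before every non-empty line)
s/^\n\n*/\
/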
1011
 Leaves a blank line only at the end (if there is one).
1013
1014#!/usr/bin/sed -f
1015
1016#delete all leading empty lines
10171,/^./{
1018/./!d
1019}
1020
1021# find an empty line, keep it, and remove all following empty lines
1022:x
1023/./!{
1024N
1025s/^\n$//
1026tx
1027}
1028
1029 Squeeze all, and remove all leading and trailing blank lines.
1030 This is also the fastest.
1031
1032#!/usr/bin/sed -nf
1033
1034# delete all blanks
1035/./!d
1036
1037# get here: so there is a non empty
1038:x
1039# print it
1040p
1041# get next
1042n
1043# got chars? print it again, etc...
1044/./bx
1045# no, don't have chars: another empty line
1046:z
1047# get next
1048n
1049# also empty? then ignore it, and get next... this will remove ALL empty
1050# lines, if we get to end, sed script will finish on n(ext) command
1051# so no trailing empty lines are written
1052/./!bz
1053
1054# all empty lines were deleted/ignored, but we have a non empty, as
1055# what we want to do is to squeeze, insert a blank line artificially
1056i\
1057
1058bx
1059
1060------------------------------------------------------------------------
1061 Centering lines
1062------------------------------------------------------------------------
1063
1064#!/usr/bin/sed -f
1065# center all lines of a file, on a 80 columns width
1066#
1067# to change that width, the number in \{\} must be replaced, and the number
1068# of added spaces also must be changed
1069#
1070
# del leading and trailing spaces (the first char given to `y' is a tab)
y/	/ /
s/^ *//
s/ *$//

# add 80 spaces to end of line (10 spaces here, times 8 copies below)
s/$/          /
s/ *$/&&&&&&&&/
1079
1080# keep 1st 80 chars
1081s/^\(.\{80\}\).*$/\1/
1082
1083# split trailing spaces, into two halves, 1st for beg, 2nd to end of line
1084s/\( *\)\1$/#\1%\1/
1085s/^\(.*\)#\(.*\)%\(.*\)$/\2\1\3/
1086
------------------------------------------------------------------------
 Delete comments from C code
------------------------------------------------------------------------

#!/usr/bin/sed -f

# no /* on this line: print it and get the next one
/\/\*/!b

# here we've got a /*, append lines until we get the corresponding
# */
:x
/\*\//!{
N
bx
}

# delete /*...*/ (greedy: from the first /* to the last */ collected)
s/\/\*.*\*\///

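A small usage sketch of the script above, wrapped in a shell function (the
name strip_c_comments is made up here). It shows a comment that spans two
lines being collected with N and then removed. Note that, because .* is
greedy, two comments on the same line would also lose the code between them.

```shell
# strip_c_comments: hypothetical wrapper around the script above
strip_c_comments() {
    sed '
/\/\*/!b
:x
/\*\//!{
N
bx
}
s/\/\*.*\*\///
'
}
```

For example, feeding it `int a; /* one` followed by `two */ int b;` gives the
single line `int a;  int b;`.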
------------------------------------------------------------------------
 Increment a number
------------------------------------------------------------------------

#!/usr/bin/sed -f

# algorithm by:
#   Bruno <Haible@ma2s2.mathematik.uni-karlsruhe.de>

# incrementing a number is just adding 1 to its last digit, i.e. replacing
# it by the following digit
#
# there is one exception: when a carry happens, the digits to the left
# must be incremented as well
#
# this solution by `Bruno <Haible@ma2s2.mathematik.uni-karlsruhe.de>'
# is very clever and smart
#
# a carry can only happen when the last digit is a 9;
# all other cases are just fine
#
# for a number ending in any digit except 9, just replace that digit by
# the next digit; for a number ending in a 9, "remove" the 9 and proceed
# as above for the rest, i.e. all trailing 9s are "removed" until a non-9
# is found; if no digit remains, a 0 is inserted

# replace all trailing 9s by _ (any other char except digits could be used)
#
:d
s/9\(_*\)$/_\1/
td

# if there aren't any digits left, add a most significant digit 0
#
s/^\(_*\)$/0\1/

# incr last digit only - there is no need for more
#
s/8\(_*\)$/9\1/
s/7\(_*\)$/8\1/
s/6\(_*\)$/7\1/
s/5\(_*\)$/6\1/
s/4\(_*\)$/5\1/
s/3\(_*\)$/4\1/
s/2\(_*\)$/3\1/
s/1\(_*\)$/2\1/
s/0\(_*\)$/1\1/

# replace all _ by 0s
#
s/_/0/g

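To see the carry handling at work, the same commands can be wrapped in a
shell function (incr is a name chosen for this sketch). With 199 as input the
pattern space goes 199, 19_, 1__, 2__, and finally 200; with 999 it goes ___,
0___, 1___, 1000.

```shell
# incr: hypothetical wrapper, increments the number on each input line
incr() {
    sed '
:d
s/9\(_*\)$/_\1/
td
s/^\(_*\)$/0\1/
s/8\(_*\)$/9\1/
s/7\(_*\)$/8\1/
s/6\(_*\)$/7\1/
s/5\(_*\)$/6\1/
s/4\(_*\)$/5\1/
s/3\(_*\)$/4\1/
s/2\(_*\)$/3\1/
s/1\(_*\)$/2\1/
s/0\(_*\)$/1\1/
s/_/0/g
'
}
```

The order of the digit substitutions matters: they run from 8 down to 0, so a
digit that has just been bumped is never bumped a second time.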
------------------------------------------------------------------------
 Get make targets
------------------------------------------------------------------------

#!/usr/bin/sed -nf

# make-targets
#
# tries to catch all targets in a Makefile
#
# the purpose of this is to be used with the complete [make] feature
# of tcsh... so it should be simple and fast
#
# this is not a shell script, exactly for that reason... hopefully
# the kernel will interpret this executable as a sed script and
# feed it directly to sed
#
# the name of the makefile, unfortunately, must be hard coded in the
# complete code, and it is "Makefile"

# take care of \ ended lines
:n
/\\$/{
    N
    s/\\\n//
    bn
}

# turn tabs into spaces (\t is a GNU sed escape, like the \+ and \? below)
y/\t/ /

# delete all comments
/^ *#/d
s/[^\\]#.*$//

# register variables, the only ones in here are the ones of the form
#
#   VAR = one_word_def
#
# in that way, most vars will be skipped, and things like
#
#   SED_TARGET = sed
#
# will still work
#

/\([A-Za-z_0-9-]\+\) *= *\([A-Za-z_0-9./-]\+\) *$/{

    s/ //g
    s/$/ /
    H
    b
}

# now, perform the substitution

/\$[({][A-Za-z_0-9-]\+[)}]/{

    s/$/##/
    G
    s/\(\$[{(]\)\([A-Za-z_0-9-]\+\)\([)}]\)\(.*\)##.*\2=\([A-Za-z_0-9./-]\+\).*/\5\4/g
}

# finally, print the targets

tt
:t
s/^\([A-Za-z_0-9./-]\+\)\(\( \+[A-Za-z_0-9./-]\+\)*\) *:\([^=].*\)\?$/\1\2/
tx

d

# now, this is a final selection of targets to be printed
# kind of 'prog | grep -v ...'

# don't print *.[hco] targets because in most cases that makes very long output
:x
/\.[och]$/!p

------------------------------------------------------------------------
 Rename to lower case
------------------------------------------------------------------------

    This is a very abusive use of sed. We take text (file names) and
    transform it into shell commands, then just feed them to the shell.

    The main body of this is the sed script, which remaps the name
    from lower to upper (or vice-versa) and even checks whether the
    remapped name is the same as the original name.

#!/bin/sh -
# rename files to lower/upper case...
#
# usage:
#    move-to-lower *
#    move-to-upper *
# or
#    move-to-lower -r .
#    move-to-upper -r .
#

help()
{
    cat << eof
Usage: $0 [-n] [-r] [-h] files...

-n      do nothing, only see what would be done
-r      recursive (use find)
-h      this message
files   files to remap to lower case

Examples

    $0 -n *     (see if everything is ok, then...)
    $0 *

    $0 -r .

eof
}

apply_cmd='sh'
finder='echo $* | tr " " "\n"'
files_only=

while :
do
    case "$1" in
        -n) apply_cmd='cat' ;;
        -r) finder='find $* -type f';;
        -h) help ; exit 1 ;;
        *) break ;;
    esac
    shift
done

[ "$1" ] || {

    echo "Usage: $0 [-n] [-r] files..."
    exit 1
}

LOWER='abcdefghijklmnopqrstuvwxyz'
UPPER='ABCDEFGHIJKLMNOPQRSTUVWXYZ'

case `basename "$0"` in
    *to-lower*)
        FROM=$UPPER; TO=$LOWER ;;
    *to-upper*)
        TO=$UPPER; FROM=$LOWER ;;
    *lower*)
        FROM=$UPPER; TO=$LOWER ;;
    *upper*)
        TO=$UPPER; FROM=$LOWER ;;
    *)
        FROM=$UPPER; TO=$LOWER ;;
esac

eval $finder | sed -n '

# remove all trailing /s
s/\/*$//

# add ./ if there is no path, only a filename
/\//!s/^/.\//

# save path+filename
h

# remove path
s/.*\///

# do conversion only on filename
y/'$FROM'/'$TO'/

# swap, now line contains original path+file, hold space contains conv filename
x

# add converted file name to line, which now contains something like
# path/file-name\nconverted-file-name
G

# check if converted file name is equal to original file name, if it is, do
# not print anything
/^.*\/\(.*\)\n\1/b

# now, transform path/fromfile\ntofile, into mv path/fromfile path/tofile
# and print it
s/^\(.*\/\)\(.*\)\n\(.*\)$/mv \1\2 \1\3/p

' | $apply_cmd

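The sed part can be tried on its own, without renaming anything. The sketch
below hard codes the upper-to-lower mapping and only prints the mv commands
it would generate (the function name to_lower_cmds is made up for this
illustration). Names that are already lower case produce no output.

```shell
# to_lower_cmds: hypothetical helper, reads file names (one per line) and
# prints the mv commands that would lower-case the last path component
to_lower_cmds() {
    sed -n '
s/\/*$//
/\//!s/^/.\//
h
s/.*\///
y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
x
G
/^.*\/\(.*\)\n\1/b
s/^\(.*\/\)\(.*\)\n\(.*\)$/mv \1\2 \1\3/p
'
}
```

For the input lines src/FOO.C, bar and ReadMe this prints
`mv src/FOO.C src/foo.c` and `mv ./ReadMe ./readme`. As in the full script,
the names are not quoted, so file names with spaces or shell
metacharacters are not handled.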
------------------------------------------------------------------------
 Print environ of bash
------------------------------------------------------------------------

#!/bin/sh

# penv -- print environ vars of bash

set | sed -n '

:x

# possible start of functions section
/^.*=() /{

    # save it, in case this is a var like FOO="() "
    h
    n
    # next line is not {, so this was really a var like FOO
    # print it, and process next line
    /^{/!{

        x
        p
        x
        bx

    }

    # all right, start of fn section...

#   :z
#   /\({[^{}]}\)\+/d
#
#   N
#   bz

    # the above works all right, but since nothing more comes after
    # the fns, we can just quit
    q

}

p

'

------------------------------------------------------------------------
 Reverse chars of lines
------------------------------------------------------------------------

#!/usr/bin/sed -f

# reverse all chars of each line, keep line ordering

# ignore empty lines, i.e. nothing to reverse
/./!b

# escape ! by doubling it, place markers at beginning and end of line
# the markers are -!- which can never happen after the escaping of !
s/!/!!/g
s/^/-!-/
s/$/-!-/

# swap the first char after the first marker with the first char before
# the last marker, and then advance the markers past the swapped chars
ta
:a
s/-!-\([^!]\|!!\)\(.*\)\([^!]\|!!\)-!-/\3-!-\2-!-\1/
ta

# delete markers, and then unescape the !s
s/-!-//g
s/!!/!/g

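A worked run of the marker technique, as a sketch that assumes GNU sed (the
\| alternation is a GNU extension) and uses a made-up function name,
rev_chars. For the input abc the pattern space goes -!-abc-!-, then
c-!-b-!-a, where no further swap is possible, and the markers are removed to
give cba.

```shell
# rev_chars: hypothetical wrapper, reverses the characters of each line
rev_chars() {
    sed '
/./!b
s/!/!!/g
s/^/-!-/
s/$/-!-/
ta
:a
s/-!-\([^!]\|!!\)\(.*\)\([^!]\|!!\)-!-/\3-!-\2-!-\1/
ta
s/-!-//g
s/!!/!/g
'
}
```

The doubled !! is treated as one unit by the swap, which is why a line such
as `a!b` still comes out as `b!a`.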
------------------------------------------------------------------------
 Reverse lines of files
------------------------------------------------------------------------

#!/usr/bin/sed -nf

# reverse all lines of input, i.e. first line becomes last, ...

# first line is pasted into buffer
# (if it is also the last line, print it: there is nothing to reverse)
1{h;$p;b;}

# for all other lines, the buffer (which contains all previous lines)
# is appended to the current line, so the order is being reversed
# in the buffer; after that is done, store everything in the buffer
# again
G;h

# the last line (after having done the above job) gets the contents
# of the buffer, and prints it
${g;p;}

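The same hold-space idea is often written as a one-liner. This is a compact
variant, not the script above: every line except the first gets the buffer
appended, every line is saved back, and only the last cycle prints. The
function name reverse_lines is chosen for this sketch.

```shell
# reverse_lines: hypothetical wrapper around the compact form
reverse_lines() {
    sed -n '1!G;h;$p'
}
```

Both forms keep the whole input in the hold space, so they are only suitable
for inputs that fit in memory.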
------------------------------------------------------------------------
 Transform text into a C "printf"able string
------------------------------------------------------------------------

#!/usr/bin/sed -f

# The purpose of this script is to construct C programs like this
#
# printf("\
# common text
# ...
#
#
# ...
# last line of text
#
# and then pipe through this filter the portion between printf and the last
# line of text, and get a valid C statement
#
# That's why " is placed on the last line, and not on the first, for example

# escape all special chars " and \ inside a string...
s/["\\]/\\&/g

# add a \n\ to the end of each line, except the last, which gets \n"
s/$/\\n/
$!s/$/\\/
$s/$/"/

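A short run makes the escaping easier to follow. The sketch below wraps the
four commands in a function (cstring is a made-up name) so that it can be
used as a filter.

```shell
# cstring: hypothetical wrapper, turns text into the body of a C string
cstring() {
    sed -e 's/["\\]/\\&/g' -e 's/$/\\n/' -e '$!s/$/\\/' -e '$s/$/"/'
}
```

Given the two input lines `say "hi"` and `bye`, the output is `say \"hi\"\n\`
followed by `bye\n"`, ready to be pasted after an opening `printf("\`.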
------------------------------------------------------------------------
 Prefix non blank lines with their numbers (cat -b)
------------------------------------------------------------------------

#!/usr/bin/sed -nf

# copy all lines of input, prefixing only non blank lines by their number,
# kind of `cat -b'

# init counter
1{
    x
    s/^/0/
    x
}

# for blanks, don't incr count, but print
/./!{
    p
    b
}

# for the rest it is the same as a `cat -n'
x
:d
s/9\(_*\)$/_\1/
td
s/^\(_*\)$/0\1/
s/8\(_*\)$/9\1/
s/7\(_*\)$/8\1/
s/6\(_*\)$/7\1/
s/5\(_*\)$/6\1/
s/4\(_*\)$/5\1/
s/3\(_*\)$/4\1/
s/2\(_*\)$/3\1/
s/1\(_*\)$/2\1/
s/0\(_*\)$/1\1/
s/_/0/g

# pad with 6 spaces, then keep the last 6 chars (like printf's "%6d")
s/^/      /
s/^.*\(......\)/\1/

G
s/\n/ /p
s/^ *//
s/ .*//
h

------------------------------------------------------------------------
 Prefix lines by their number (cat -n)
------------------------------------------------------------------------

#!/usr/bin/sed -nf

# copy all lines of input, prefixed by their number, kind
# of `cat -n'

# switch to buffer
x

# init the counting
1{
    s/^/0/
}

# increment the count: first line == number 1
:d
s/9\(_*\)$/_\1/
td
s/^\(_*\)$/0\1/
s/8\(_*\)$/9\1/
s/7\(_*\)$/8\1/
s/6\(_*\)$/7\1/
s/5\(_*\)$/6\1/
s/4\(_*\)$/5\1/
s/3\(_*\)$/4\1/
s/2\(_*\)$/3\1/
s/1\(_*\)$/2\1/
s/0\(_*\)$/1\1/
s/_/0/g

# format the number like printf's `"%6d"':
# pad with 6 spaces, then keep the last 6 chars
s/^/      /
s/^.*\(......\)/\1/

# append the line to the number, and write: "<number> <line>"
# note: this is the format of gnu-cat
G
s/\n/ /p

# after printing the line, transform the line into the number, and
# store it in the buffer again
s/^ *//
s/ .*//
h

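The counter lives in the hold space between cycles, which is easier to see
on a tiny input. The sketch below wraps the commands in a function
(number_lines is a made-up name). It assumes a six-space pad before the
"keep the last 6 chars" step and a single space between number and line;
GNU cat itself separates them with a tab.

```shell
# number_lines: hypothetical wrapper, prefixes each line with its number
number_lines() {
    sed -n '
x
1{
s/^/0/
}
:d
s/9\(_*\)$/_\1/
td
s/^\(_*\)$/0\1/
s/8\(_*\)$/9\1/
s/7\(_*\)$/8\1/
s/6\(_*\)$/7\1/
s/5\(_*\)$/6\1/
s/4\(_*\)$/5\1/
s/3\(_*\)$/4\1/
s/2\(_*\)$/3\1/
s/1\(_*\)$/2\1/
s/0\(_*\)$/1\1/
s/_/0/g
s/^/      /
s/^.*\(......\)/\1/
G
s/\n/ /p
s/^ *//
s/ .*//
h
'
}
```

On the first line the hold space goes from empty to 0 to 1; after printing,
everything from the first space on is cut off again, so only the bare number
is stored for the next cycle.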
------------------------------------------------------------------------
 Count chars of input (wc -c)
------------------------------------------------------------------------

#!/usr/bin/sed -nf

# count all chars of input, kind of `wc -c'

# the buffer holds the count
x

1{
    s/^/0/
}

# we have a line, so at least there is one char: the `\n'
tx
:x
s/9\(_*\)$/_\1/
tx
s/^\(_*\)$/0\1/
s/ \(_*\)$/0\1/
s/8\(_*\)$/9\1/
s/7\(_*\)$/8\1/
s/6\(_*\)$/7\1/
s/5\(_*\)$/6\1/
s/4\(_*\)$/5\1/
s/3\(_*\)$/4\1/
s/2\(_*\)$/3\1/
s/1\(_*\)$/2\1/
s/0\(_*\)$/1\1/
s/_/0/g

# get back to the line
x

# for each char in the line, increment the count
tc
:c
s/.//
x
tx

# on last line, all is done, so print the count, and quit
${p;q;}

# put the count (which has been swapped with the current line) in the buffer
h

------------------------------------------------------------------------
 Count lines of input (wc -l)
------------------------------------------------------------------------

#!/usr/bin/sed -nf

# count lines of input, kind of `wc -l'

$=

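As a quick check, the same command can be run from the shell. The function
name count_lines is made up for this sketch.

```shell
# count_lines: hypothetical wrapper, prints the number of the last line
count_lines() {
    sed -n '$='
}
```

One difference from `wc -l` is worth knowing: on empty input there is no
last line, so nothing at all is printed instead of 0.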
------------------------------------------------------------------------
 Count words of input (wc -w)
------------------------------------------------------------------------

#!/usr/bin/sed -nf

# count all words on input
# words are separated by tabs, newlines and spaces
# ([[:blank:]] matches a space or a tab)

# the buffer holds the count
1{;x;s/^/0/;x;}

s/^[[:blank:]]*/\
/
ts
:t
s/^/w/
ts
:s
s/^\(.*\n\)[^[:blank:]]\+[[:blank:]]*/\1/
tt

s/\n.*$//

# the above replaced all words by `w', and deleted everything else
# except newlines, so now the only job left is counting chars
#
# from here on, this is the same as count-chars, but first we must
# delete one char (to keep up with the extra newline)
/./!{;${;g;p;q;};d;}
s/.//

x
# we have a line, so at least there is one char: the `\n'
tx
:x
s/9\(_*\)$/_\1/
tx
s/^\(_*\)$/0\1/
s/ \(_*\)$/0\1/
s/8\(_*\)$/9\1/
s/7\(_*\)$/8\1/
s/6\(_*\)$/7\1/
s/5\(_*\)$/6\1/
s/4\(_*\)$/5\1/
s/3\(_*\)$/4\1/
s/2\(_*\)$/3\1/
s/1\(_*\)$/2\1/
s/0\(_*\)$/1\1/
s/_/0/g

# get back to the line
x

# for each char in the line, increment the count
tc
:c
s/.//
# put count on line
x
tx

# update buffer with count
h

# on last line, all is done, so print the count
$p

------------------------------------------------------------------------
 Print the filename component of a path (basename)
------------------------------------------------------------------------

#!/usr/bin/sed -f

# usage: fbasename file
# or
# usage: find path -print | fbasename
#
#
# this is a basename, but it reads filenames from stdin; each line
# contains the path and a possible suffix
#
# this will produce one output line per input line, with
# the filename component of the path, with the (possible) suffix
# removed
# ([[:blank:]] matches a space or a tab)

s/^[[:blank:]]*//
s/[[:blank:]]*$//

tc
:c

s/[[:blank:]][[:blank:]]*/\
/
ta

s/\/*$//
s/.*\///
b

:a

h
s/.*\n//
x
s/\n.*//

s/\/*$//
s/.*\///

tb
:b
G
s/^\(.*\)\(.*\)\n\2$/\1/
t

P
d

------------------------------------------------------------------------
 Print directory component of a path (dirname)
------------------------------------------------------------------------

#!/usr/bin/sed -f

# usage: find path -print | fdirname
#
# fdirname acts like dirname, but reads files from stdin

# print the directory component of a path

# special case: `/' is given
/^\/$/c\
/

# strip trailing `/'s if any
s/\/*$//
# strip trailing filename
s/[^/]*$//

# if we get no chars after these, then we have the current dir (things
# like `bin/ src/' were given)
/./!c\
.

# delete the trailing `/', unless it is the whole path
# ("/usr/bin/ls" --> "/usr/bin/", this makes "/usr/bin";
#  "/usr" --> "/", which must stay "/")
s/\(.\)\/*$/\1/

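A few sample paths show which branch of the script handles what. This is a
sketch wrapped in a function named fdirname after the usage comment; its last
substitution only removes a trailing `/` when something is left in front of
it, so that a path like /usr yields / and not an empty line.

```shell
# fdirname: sketch of a dirname filter, one path per input line
fdirname() {
    sed '
/^\/$/c\
/
s/\/*$//
s/[^/]*$//
/./!c\
.
s/\(.\)\/*$/\1/
'
}
```

With this, /usr/bin/ls gives /usr/bin, both ls and bin/ give a single dot,
and /usr gives /.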
------------------------------------------------------------------------
 Print the first few (=10) lines of input
------------------------------------------------------------------------

#!/usr/bin/sed -f

# display first 10 lines of input

# the number of displayed lines can be changed, by changing the number
# before the `q' command to `n' where `n' is the number of lines wanted

10q

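The remark about changing the number before `q` can be turned into a small
shell function that takes the count as an argument. The name first_n and the
default of 10 are choices made for this sketch.

```shell
# first_n: hypothetical helper, prints the first N lines (default 10)
first_n() {
    sed "${1:-10}q"
}
```

For example, `first_n 3 < file` prints three lines. Because `q` stops sed as
soon as line N has been printed, the rest of a large input is never read.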
------------------------------------------------------------------------
 Convert a sed script to a bash-command-line command
------------------------------------------------------------------------

#!/usr/bin/sed -nf

# converts a sed script (like this) to a (one-line) command line
# sed expression
#
# usually, writing sed expressions on the command line permits very
# fast development of the idea, but less readability
#
# this permits converting (small) sed scripts, and incorporating
# them in an alias, for instance
#
# Rules are:
#
# - ignore lines beginning with [space] # -- comments (see note1)
# - delete all beginning white space (see below: note1)
# - empty lines are ignored (see below: note1)
# - `'' and `!' chars are escaped (__!!__see below__!!__ :note2)
# - commands across lines (terminated with `\') see each line
#   of them go to a -e 'line'
# - all other commands are tried to go on a single -e '...'
#   by being separated by `;'
#
# note1:
#   for one-line commands only, or, in other words, only
#   for the first line of each command
#
#   if a command is multi-line, then all lines, except the first,
#   are read literally to a -e 'line', so blank lines,
#   lines beginning with `#' and beginning white space are
#   all preserved (useful for an `i', `c', `a' command)
#
# note2:
#   the output is designed for bash
#   for tcsh it should work also, but....
#
#   the particularities for bash are:
#   - `'' escapes everything, except `'' and `!' (this
#     one was introduced by the history mechanism; for instance,
#     there's no way of quoting the `!' (as of bash 1.14.5)
#     in this expression: echo 'Hi!Good day.')
#
#   so, both `!' and `'' are escaped in the following way
#
#   close the preceding `'' with a `'', then escape the
#   offensive char with `\<char>', then reopen an escaped
#   expression with another `'', so, if I had
#
#       /./!d
#
#   this would become
#
#       '/./'\!'d'
#
#   and if I had
#
#       s/'\([^']P\)'/\1/
#
#   it would be
#
#       's/'\''\([^'\'']P\)'\''/\1/'
#
#   and all of this is good for the bash command line
#
# bugs:
#   - the objective is to produce the smallest command line
#     possible; this fails on non-text multi-line commands,
#     for instance
#
#       s/.*\
#       /<here was a newline>/
#
#     will be translated to
#
#       -e 's/.*\' -e '/<here was a newline>/' -e '...'
#
#     and, of course, the `...' could be added to the end
#     of `/<here was a newline>/' with a prefixed `;'
#
#     this is nasty to do, due to the `i\' etc.. commands
#     whose last line can NOT be concatenated with a
#     suffixed `;' to the next command
#
#
#   - the r(ead) and w(rite) commands need a whole line
#     for themselves; currently they are not checked, and
#     are treated like ordinary commands, which is wrong
#     e.g.
#       r foo
#       s/foo/bar/
#       ...
#
#     is converted to
#
#       'r foo;s/foo/bar/;...'
#
#     which would try to read the file named `foo;s/foo/bar/'
#     and that's not what was intended
#

# init the buffer (what will be the command line)
# if #!/usr/bin/sed -n --> line starts with sed -ne '
# else starts with sed -e
1{
    /#!.*sed.*-[^ ]*n/ba

    x
    s/^/sed -e '/
    bd
:a
    x
    s/^/sed -ne '/
:d
    x
}

# leading white space goes, and so do comment lines and empty lines
# ([[:blank:]] matches a space or a tab)
s/^[[:blank:]]*//
/^#/be

/./!be

# quote '! chars special to bash
s/['!]/'\\&'/g

# on sed multi-line commands, read the following lines literally,
# and wrap each one in a -e 'line' on the command line
/\(\\\\\)*\\$/{
    :c
    s/$/' -e '/
    N
    /\(\\\\\)*\\$/bc
    s/$/' -e '/
    bb
}

# if normal line, then append a `;' and go on
s/$/;/

# add to existent command line
:b
H

# at the end,
# - delete all `\n's lying around
# - remove last ; if there is one
# - remove unnecessary -e '' (i.e. all -e '' that are not preceded
#   by something terminated with \' (literally))
:e
${
    x
    s/\n//g
    s/;\?$/'/
    s/\([^\\]'\) -e ''/\1 /g
    p
}

------------------------------------------------------------------------
 Print last few (=10) lines of input
------------------------------------------------------------------------

#!/usr/bin/sed -f

# this is a tail command, it displays the last 10 lines of input
# if there are 10 or more; if there are fewer, it displays all of them

# to change the number of displayed lines, the number of "$b;N"
# statements after the "1{" must be changed to `n-2', where `n'
# is the number of lines wanted, e.g. for 10 lines there
# should be 8 `$b;N'

# to do that with vi, just go to the first `$b;N' and do `d/^}/-2 dd 8p'

1{
    $b;N
    $b;N
    $b;N
    $b;N
    $b;N
    $b;N
    $b;N
    $b;N
}

$b;N

$!D

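The `n-2` rule is easier to check on a small window. The sketch below keeps
the last 3 lines, so the `1{ }` block holds exactly one `$b` / `N` pair (the
function name last3 is made up, and each command sits on its own line to stay
away from the `b;` form).

```shell
# last3: hypothetical helper, prints the last 3 lines of input
last3() {
    sed '
1{
$b
N
}
$b
N
$!D
'
}
```

For input lines 1 to 5 the pattern space holds 1,2,3 after the first pass;
D then drops the oldest line and restarts the script without reading, N
brings in the next one, and when the last line arrives the window 3,4,5 is
printed.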
------------------------------------------------------------------------
 The tee(1) command in sed
------------------------------------------------------------------------

#!/bin/sh -

# emulation of tee using sed, and a sh(1) for loop

cmd=
for i
do
    cmd="$cmd -e 'w $i'"
done

eval sed $cmd

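The same loop works as a shell function, which makes it easy to try. In this
sketch (sed_tee is a made-up name) every argument becomes one `w file`
expression, and sed's normal output plays the role of tee's standard output.
File names containing quotes or spaces are not handled, and at least one
file name must be given, since sed needs a script.

```shell
# sed_tee: hypothetical helper, copies stdin to stdout and to each file
sed_tee() {
    cmd=
    for i
    do
        cmd="$cmd -e 'w $i'"
    done
    eval sed "$cmd"
}
```

A call such as `ls | sed_tee one.txt two.txt` lists the directory on the
terminal and leaves the same listing in both files.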
------------------------------------------------------------------------
 Print uniq lines of input (uniq)
------------------------------------------------------------------------

#!/usr/bin/sed -f

# print all uniq lines on a sorted input -- only one copy of a duplicated
# line is printed
# like `uniq'

:b
$b
N
/^\(.*\)\n\1$/{
    s/.*\n//
    bb
}

$b

P
D

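A small run shows how the two-line window moves. The sketch wraps the script
in a function (uniq_sed is a made-up name).

```shell
# uniq_sed: hypothetical wrapper around the script above
uniq_sed() {
    sed '
:b
$b
N
/^\(.*\)\n\1$/{
s/.*\n//
bb
}
$b
P
D
'
}
```

With the input a, a, b, c, c the pattern space first holds "a\na", which
collapses to "a"; "a\nb" differs, so P prints a and D leaves b; "b\nc"
prints b; finally "c\nc" collapses to c, which is printed at the end of
input.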
------------------------------------------------------------------------
 Print duplicated lines of input (uniq -d)
------------------------------------------------------------------------

#!/usr/bin/sed -nf

# print one copy of each duplicated line on a sorted input
# like `uniq -d'

$b
N
/^\(.*\)\n\1$/{
    s/.*\n//
    p
    :b
    $b
    N
    /^\(.*\)\n\1$/{
        s/.*\n//
        bb
    }
}

$b

D

------------------------------------------------------------------------
 Print only non-duplicated lines (uniq -u)
------------------------------------------------------------------------

#!/usr/bin/sed -f

# print only the lines that occur once on a sorted input -- no copies
# of duplicated lines are printed
# like `uniq -u'

$b
N
/^\(.*\)\n\1$/!{
    P
    D
}

:c
$d
s/.*\n//
N
/^\(.*\)\n\1$/{
    bc
}
D

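The difference from plain uniq shows up as soon as a line repeats. The sketch
below wraps the script in a function (uniq_u is a made-up name).

```shell
# uniq_u: hypothetical wrapper, prints only lines that are not repeated
uniq_u() {
    sed '
$b
N
/^\(.*\)\n\1$/!{
P
D
}
:c
$d
s/.*\n//
N
/^\(.*\)\n\1$/{
bc
}
D
'
}
```

With the input a, b, b, c only a and c are printed: once "b\nb" is seen, the
loop at :c keeps swallowing copies of b, and the final D throws the last copy
away before the next line is examined.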
========================================================================

---------------------
Index of sed commands
---------------------

(2)!<cmd>                       -- Don't apply to specified addresses
(0)#                            -- comment
(0):<label>                     -- place a label
(1)=                            -- display line number
(2)D                            -- delete first part of the pattern space
(2)G                            -- append contents of hold area
(2)H                            -- append pattern space on buffer
(2)N                            -- append next line
(2)P                            -- print first part of the pattern space
(1)a                            -- append text
(2)b<label>                     -- branch to label
(2)c                            -- change lines
(2)d                            -- delete lines
(2)g                            -- get contents of hold area
(2)h                            -- hold pattern space
(1)i                            -- insert lines
(2)l                            -- list lines
(2)n                            -- next line
(2)p                            -- print
(1)q                            -- quit
(1)r <file>                     -- read the contents of <file>
(2)t<label>                     -- test substitutions and branch on successful substitution
(2)w <file>                     -- write to <file>
(2)x                            -- exchange buffer space with pattern space
(2){                            -- group commands
(2)s/RE/<replacement>/[flags]   -- substitute
(2)y/<list1>/<list2>/           -- translates <list1> into <list2>

========================================================================

----------------------------------
Author and credits and date etc...
----------------------------------

Author:
    the "I"s in this text mean me: Carlos Duarte <cgd@teleweb.pt>

Credits:
    - The regular expressions were learned by reading re_format(7)
      by Henry Spencer, version "@(#)re_format.7 8.2 (Berkeley) 3/16/94"

    - The sed resume was adapted from the usd-doc paper on sed,
      by Lee E. McMahon, version "@(#)sed 6.1 (Berkeley) 5/22/86",
      originally at "August 15, 1978"

    - The algorithm to increment a number was taken
      from GNU source code of gettext library, and is
      from: Bruno Haible <Haible@ma2s2.mathematik.uni-karlsruhe.de>

    - The rest of stuff is mine

    - Some minor language corrections by Casper Boden-Cummins
      <bodec@sherwood.co.uk>

Date:
    this was started on 7-Sep-96