About
sed stands for stream editor.
It is a filter program used to filter and transform text.
It:
- takes a stream on standard input,
- modifies it based on one or more commands,
- and returns it as standard output.
In the stream, it can:
- search-and-replace all occurrences of one string with another.
- delete a range of lines.
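As a minimal sketch of these two operations, the commands below feed a small sample text to sed on standard input. The sample text and the variable names are hypothetical; only standard sed commands are used.

```shell
# Sample input: three lines (hypothetical data)
input=$(printf 'red apple\nred pear\ngreen kiwi\n')

# Search-and-replace: every "red" becomes "blue"
replaced=$(printf '%s\n' "$input" | sed 's/red/blue/g')

# Delete a range of lines: lines 1 to 2 are dropped
deleted=$(printf '%s\n' "$input" | sed '1,2d')

printf '%s\n' "$replaced"
printf '%s\n' "$deleted"
```

In both cases the input is left untouched: sed writes the modified stream to standard output.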
It's one of the GNU utilities.
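Sed is line-based: each line is processed without its trailing newline, so it is hard to match newlines or to manipulate end-of-line (EOL) characters.
For end-of-line conversion, a dedicated utility such as dos2unix is easier to use.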
Example
How to extract content with a regular expression?
This expression captures and prints the group of the regular expression
sed -n 's/.*\(your-group-expression\).*/\1/p'
This is:
- the s substitution command
- with these arguments:
  - .*\(your-group-expression\).*: the regular expression to match; the group is captured by the escaped parentheses \( and \)
  - \1: the replacement, where \1 is a back-reference to the first captured group (\2 to the second and so on)
  - p: a flag that prints the line when a substitution was made (it is a flag of the substitution command, not the p command)
- the -n option, which suppresses the automatic printing of every line so that only the lines with a match are output
Note that -n and p are not necessary if the input is a single line that matches.
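The extraction can be tried on a concrete input. The sketch below assumes hypothetical log lines with a user=<name> field; the field name and the [a-z]* group are illustrative choices, not part of the expression above.

```shell
# Hypothetical log lines; only some contain a "user=<name>" field
log=$(printf 'login user=alice ok\nheartbeat\nlogin user=bob failed\n')

# Capture the lowercase name after "user=" and print only that group.
# Lines without a match are not printed because of -n.
users=$(printf '%s\n' "$log" | sed -n 's/.*user=\([a-z]*\).*/\1/p')

printf '%s\n' "$users"
```

The heartbeat line produces no output: the substitution does not apply to it, so the p flag never prints it.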
Syntax
sed
sed 'command1;...;commandN' inputFileName
sed -e 'command1' -e 'commandN' inputFileName
sed --expression='command1' inputFileName
sed -f myscript-with-commands.sed input.txt
sed --file=myscript-with-commands.sed input.txt
# In place editing - No outputFileName needed
sed -i 'command1;...;commandN' inputFileName
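The forms above can be compared on a throwaway file. This is a sketch that assumes GNU sed (the -i form without a suffix argument is GNU syntax); the file name and its content are hypothetical.

```shell
# Hypothetical sample file created in a temporary directory
workdir=$(mktemp -d)
printf 'hello world\n' > "$workdir/input.txt"

# Two commands separated by a semicolon in one expression
with_semicolon=$(sed 's/hello/goodbye/;s/world/moon/' "$workdir/input.txt")

# The same two commands, each passed with its own -e option
with_e=$(sed -e 's/hello/goodbye/' -e 's/world/moon/' "$workdir/input.txt")

# In-place editing: the file itself is rewritten, nothing is printed
sed -i 's/hello/goodbye/;s/world/moon/' "$workdir/input.txt"
in_place=$(cat "$workdir/input.txt")

printf '%s\n' "$with_semicolon" "$with_e" "$in_place"
rm -r "$workdir"
```

The three variants should yield the same text; they differ only in where the result is written.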
Command
command syntax is
[LineAddressSelector]SingleLetterCommand[CommandOptions][sep]
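where:
- LineAddressSelector: an optional address (a line number, a range such as 30,35 or a /regexp/) that selects the lines the command applies to
- SingleLetterCommand: the command itself (s, p, d, q, ...)
- CommandOptions: the options that some commands take (for instance the pattern, the replacement and the flags of s)
- sep: the separator between two commands, a semicolon or a newline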
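To illustrate this syntax, the sketch below applies commands with a line-range address, with a regular expression address and with no address at all. The four-line input and the variable names are hypothetical.

```shell
# Hypothetical four-line input
input=$(printf 'alpha\nbeta\ngamma\ndelta\n')

# Address "2,3" + command "s" + its options: only lines 2 and 3 are changed
range_result=$(printf '%s\n' "$input" | sed '2,3s/a/A/g')

# Address "/^g/" + command "d": lines starting with "g" are deleted
regex_result=$(printf '%s\n' "$input" | sed '/^g/d')

# No address: the command applies to every line
all_result=$(printf '%s\n' "$input" | sed 's/a/A/g')

printf '%s\n' "$range_result" "$regex_result" "$all_result"
```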
Script
Using a script file avoids problems with shell escaping or substitutions.
Example script.sed: a sed script file with one command per line and a shebang
#!/bin/sed -f
sedCommand1
sedCommand...
sedCommandN
Run it:
- with the -f option
sed -f script.sed inputFileName > outputFileName
- directly (thanks to the shebang)
chmod u+x script.sed
./script.sed inputFileName > outputFileName
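A complete run with a script file could look like the sketch below. The file names, the input and the two commands are hypothetical; the script is written through a quoted here-document so that the shell does not touch its content, and it is run with -f rather than through a shebang so that no particular sed path is assumed.

```shell
# Hypothetical input file in a temporary directory
workdir=$(mktemp -d)
printf 'name: foo\n\nname: bar\n' > "$workdir/input.txt"

# Script file: one command per line, comment lines start with #
cat > "$workdir/script.sed" <<'EOF'
# delete empty lines
/^$/d
# replace the "name: " prefix with "id="
s/^name: /id=/
EOF

# Run the script with the -f option
sed -f "$workdir/script.sed" "$workdir/input.txt" > "$workdir/output.txt"
script_result=$(cat "$workdir/output.txt")

printf '%s\n' "$script_result"
rm -r "$workdir"
```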
Command
The command is the first part of a sed expression of the form command/regularExpression/modifier.
s: Substitution
The substitution command replaces a string.
It's:
- line based by default (ie you can't match \n in your pattern)
- document based with the -z or --null-data option (lines are then separated by NUL characters instead of newlines)
Syntax:
s/regexp/replacement/[flags]
# First occurrence of each line only (default)
sed 's/searchString/replacementString/' inputFileName > outputFileName
# All occurrences, thanks to the g flag at the end
sed 's/searchString/replacementString/g' inputFileName > outputFileName
# In place editing - No outputFileName needed
sed -i 's/searchString/replacementString/g' inputFileName
# to use backslash escapes: show each tab as an arrow and each end of line as a pilcrow
sed $'s/\t/→/g;s/$/¶/'
where, in the expression 's/searchString/replacementString/':
- s stands for “substitution”.
- searchString: the search string, the text to find.
- replacementString: the replacement string
- g stands for global (ie replace all occurrences)
- -i is the option to edit the file in place: no outputFileName is needed (a temporary output file is created in the background)
- $'...' (a $ before the opening single quote) is the bash ANSI-C quoting format, in which the shell interprets backslash escapes such as \t
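The difference between the default, the g flag and the -z option can be seen in the following sketch. It assumes GNU sed for -z; the two-line input is hypothetical.

```shell
# Hypothetical two-line input
input=$(printf 'a-b-c\nd-e-f\n')

# Without g: only the first "-" of each line is replaced
first_only=$(printf '%s\n' "$input" | sed 's/-/+/')

# With g: every "-" of each line is replaced
all=$(printf '%s\n' "$input" | sed 's/-/+/g')

# With -z the whole input is a single record, so \n can be matched
joined=$(printf '%s\n' "$input" | sed -z 's/\n/,/g')

printf '%s\n' "$first_only" "$all" "$joined"
```

With -z the final newline is also part of the record, which is why the joined result ends with a trailing comma.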
p: Print specific lines
The -n option suppresses the automatic printing of every line, so only the lines selected for the p command are output
# prints only line 45
sed -n '45p' file.txt
# prints the first line of the first file (one.txt) and the last line of the last file (three.txt)
# because the files are read as one continuous stream. Use -s to treat each file separately instead.
sed -n '1p ; $p' one.txt two.txt three.txt
# Print line that matches an expression
sed -n "/patternExpression/p" one.txt
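The sketch below runs the three kinds of selection (line number, range and pattern) on a hypothetical four-line input, and shows what happens when -n is left out.

```shell
# Hypothetical four-line input
input=$(printf 'one\ntwo\nthree\nfour\n')

# Print only line 2
line2=$(printf '%s\n' "$input" | sed -n '2p')

# Print lines 2 to 3
range=$(printf '%s\n' "$input" | sed -n '2,3p')

# Print the lines that contain the letter "o"
matching=$(printf '%s\n' "$input" | sed -n '/o/p')

# Without -n, a printed line appears twice:
# once from the p command, once from the automatic printing
doubled=$(printf 'one\n' | sed 'p')

printf '%s\n' "$line2" "$range" "$matching" "$doubled"
```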
d: Delete lines
The d (delete) command deletes lines (to delete a word, substitute it with nothing)
# deletes the lines that match the regular expression
'/regularExpression/d'
# deletes lines 30 to 35
'30,35d'
Example:
- delete lines that are either blank or only contain spaces
sed '/^ *$/d' inputFileName
- delete word (ie substitute with empty)
s/yourword//g
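These deletions can be checked on a small input. The sketch below uses hypothetical sample lines; the word removed in the last command is an arbitrary choice.

```shell
# Hypothetical input: line 2 is empty, line 3 contains only spaces
input=$(printf 'keep 1\n\n   \ndrop me\nkeep 2\n')

# Delete the lines that are blank or contain only spaces
no_blank=$(printf '%s\n' "$input" | sed '/^ *$/d')

# Delete lines 2 to 4
no_range=$(printf '%s\n' "$input" | sed '2,4d')

# Delete a word (and the space after it) by substituting it with nothing
no_word=$(printf 'a very long day\n' | sed 's/very //g')

printf '%s\n' "$no_blank" "$no_range" "$no_word"
```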
q: quit
Search for the first line that starts with foo and quit with the exit code 42 (the exit code argument is a GNU extension)
/^foo/q42
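A sketch of how the exit code can be observed from the shell, assuming GNU sed (the exit code argument of q is a GNU extension). The input and the variable names are hypothetical.

```shell
# Hypothetical input: the second line starts with "foo"
input=$(printf 'bar\nfoo 1\nbaz\n')

# Lines are printed up to and including the first one that starts with "foo",
# then sed stops reading and exits with the code 42
exit_code=0
before_quit=$(printf '%s\n' "$input" | sed '/^foo/q42') || exit_code=$?

# When no line matches, q is never executed and sed exits with 0
no_match_code=0
printf 'bar\n' | sed '/^foo/q42' > /dev/null || no_match_code=$?

printf '%s\n' "$before_quit"
printf 'exit codes: %s %s\n' "$exit_code" "$no_match_code"
```

This makes q usable as a test in a shell script: the exit code tells whether the pattern was found.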
More commands
https://www.gnu.org/software/sed/manual/sed.html#sed-commands-list
How to
test the search pattern expression
sed -n "/patternExpression/p" targetFilePath
where p means print
Flow
Flow of control can be managed by:
- the use of a label (a colon followed by a string)
- and the branch instruction b.
An instruction b followed by a valid label name will move processing to the block following that label.
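A common use of a label and a branch is a loop that gathers all lines before a substitution. The sketch below joins the lines of a hypothetical input with commas; the label name a is arbitrary, and the N command and the $! address come from standard sed rather than from the text above.

```shell
# Hypothetical three-line input
input=$(printf 'a\nb\nc\n')

# :a    defines the label "a"
# N     appends the next input line to the pattern space
# $!ba  branches back to the label "a" unless the last line has been read
# s     then replaces every embedded newline with a comma
joined=$(printf '%s\n' "$input" | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g')

printf '%s\n' "$joined"
```

Each part is passed with its own -e option so that the label name ends cleanly at the end of its expression.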