This works well for fetching any single field, but as soon as a field is assigned, the original field separators "disappear":
$ echo "a|b-c|d" | awk 'BEGIN{FS="[|-]"} {$3="z"}1'
a b z d
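This happens because the output field separator, OFS, defaults to a single space, and awk uses it to rebuild the record whenever a field is modified.
Unfortunately, a statement such as OFS=FS="[|-]" does not help, because it sets OFS to that literal string rather than treating it as a regexp.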
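To make the failure concrete, here is a minimal sketch of that attempt. It reuses the sample input from above; the variable name `out` is only there for illustration.

```shell
# OFS is inserted verbatim when awk rebuilds $0; it is never interpreted as a regexp.
out=$(echo "a|b-c|d" | awk 'BEGIN{OFS=FS="[|-]"} {$3="z"} 1')
echo "$out"
# prints: a[|-]b[|-]z[|-]d
```

Every separator is replaced by the five-character string `[|-]`, so the original `|` and `-` are lost either way.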
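I understand that choosing an output field separator could get tricky for awk when there are several candidates, but if no new fields are added it could simply keep the existing ones.
So, is there a simple way to set OFS to the very same regexp as FS, so that I get this?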
$ echo "a|b-c|d" | awk '... {$3="z"}1'
a|b-z|d
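Alternatively, is there a way to capture all the separators in an array?
The same question applies to the record separator RS (and its counterpart ORS).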
Solution
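If you have GNU awk, there is another way: as shown in "column replacement with awk, with retaining the format" (Ed Morton's answer), you can use split(), and specifically its fourth argument. Why? Because it stores the separator found between each pair of pieces: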
gawk 'BEGIN{FS="[|-]"}                    # set FS
{
    split($0, a, FS, seps)                # split $0 on FS: pieces go into a,
                                          # separators go into seps
    a[3]="z"                              # change the 3rd field
    for (i=1; i<=NF; i++)                 # print the data back,
        printf "%s%s", a[i], seps[i]      # keeping the separators
    print ""                              # print a newline
}'
As a one-liner:
$ gawk 'BEGIN{FS="[|-]"} {split($0, a, FS, seps); a[3]="z"; for (i=1;i<=NF;i++) printf "%s%s", a[i], seps[i]; print ""}' <<< "a|b-c|d"
a|b-z|d
split(string, array [, fieldsep [, seps ] ])
Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created. seps is a gawk extension, with seps[i] being the separator string between array[i] and array[i+1]. If fieldsep is a single space, then any leading whitespace goes into seps[0] and any trailing whitespace goes into seps[n], where n is the return value of split() (i.e., the number of elements in array).