Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


macos - Remove (CR) from CSV

On OSX I need to remove line-ending CR (\r) characters (represented as ^M in the output from cat -v) from my CSV file:

$ cat -v myitems.csv

output:

strPicture,strEmail^M
image1xl.jpg,me@example.com^M

I have tried lots of options with sed and perl but nothing works.

Any ideas?


1 Reply


Solutions with stock utilities:

Note: Except where noted (the sed -i incompatibility), the following solutions work on both OSX (macOS) and Linux.

Use sed as follows, which replaces \r\n with \n:

sed $'s/\r$//' myitems.csv

To update the input file in place, use

sed -i '' $'s/\r$//' myitems.csv

-i '' specifies updating in place, with '' indicating that no backup should be made of the input file; if you specify an extension, e.g., -i'.bak', the original input file will be saved with that extension as a backup.
Caveats:
* With GNU sed (Linux), to not create a backup file, you'd have to use just -i, without the separate '' argument, which is an unfortunate syntactic incompatibility between GNU sed and the BSD sed used on OSX (macOS) - see this answer of mine for the full story.
* -i creates a new file with a temporary name and then replaces the original file; the most notable consequence is that if the original file was a symlink, it is replaced with a regular file; for a detailed discussion, see the lower half of this answer.
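
One possible way to sidestep the -i incompatibility from the first caveat, offered only as a sketch: both GNU sed and BSD sed accept a backup suffix attached directly to -i (no space), so the same command line works on both if you always create a backup and delete it afterward. The helper name strip_cr_inplace and the throwaway demo file are made up for illustration; the CR is built with printf so the snippet does not depend on $'...' support.

```shell
# Hypothetical helper (not from the original answer): strip one trailing CR
# per line, in place, with a sed invocation both GNU and BSD sed understand.
strip_cr_inplace() {
  cr=$(printf '\r')                      # a literal CR character
  # -i.bak (suffix attached) is accepted by both seds; remove the backup after.
  sed -i.bak "s/${cr}\$//" "$1" && rm -f "$1.bak"
}

# Demo on a throwaway file that mimics the CSV from the question
demo=$(mktemp "${TMPDIR:-/tmp}/crdemo.XXXXXX")
printf 'strPicture,strEmail\r\nimage1xl.jpg,me@example.com\r\n' > "$demo"
strip_cr_inplace "$demo"
cat -v "$demo"                           # inspect the result; no ^M should remain
```

The symlink caveat still applies here, since sed -i replaces the file either way.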

Note: The above uses an ANSI C-quoted string ($'...') to create the \r character in the sed command, because BSD sed (the one used on OS X) doesn't natively recognize such escape sequences (note that the GNU sed used on Linux distros would).
ANSI C-quoted strings are supported in Bash, Ksh, and Zsh.

If you don't want to rely on such strings, use:

sed 's/'"$(printf '\r')"'$//' myitems.csv

Here, the \r is created via printf and spliced into the sed command with a command substitution ($(...)).


Using perl:

perl -pe 's/\r\n/\n/' myitems.csv | cat -v

To update the input file in place, use

perl -i -pe 's/\r\n/\n/' myitems.csv  # -i'.bak' creates backup with suffix '.bak' first

The same caveat as above for sed with regard to in-place updating applies.


Using awk:

awk '{ sub("\r$", ""); print }' myitems.csv  # shorter: awk 'sub("\r$", "")+1'

BSD awk offers no in-place updating option, so you'll have to capture the output in a different file; to use a temporary file and have it replace the original afterward, use the following idiom:

awk '{ sub("\r$", ""); print }' myitems.csv > tmpfile && mv tmpfile myitems.csv

GNU awk v4.1 or higher offers -i inplace for in-place updating, to which the same caveat as above for sed applies.
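
The temporary-file idiom can be wrapped in a small function; the following is a sketch under a few assumptions, not the answer's own code. The name strip_cr_awk is hypothetical, mktemp is used instead of a fixed tmpfile name to avoid clobbering an existing file, and the result is copied back with cat rather than mv, so the original file keeps its inode and permissions and a symlink stays a symlink (at the cost of the update not being atomic).

```shell
# Hypothetical wrapper around the "write to a temp file, then replace" idiom.
strip_cr_awk() {
  tmp=$(mktemp "${TMPDIR:-/tmp}/stripcr.XXXXXX") || return 1
  if awk '{ sub(/\r$/, ""); print }' "$1" > "$tmp"; then
    # Write back *through* the original name instead of mv-ing over it.
    cat "$tmp" > "$1" && rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Demo on a throwaway file
demo=$(mktemp "${TMPDIR:-/tmp}/crdemo.XXXXXX")
printf 'strPicture,strEmail\r\nimage1xl.jpg,me@example.com\r\n' > "$demo"
strip_cr_awk "$demo"
cat -v "$demo"
```

If atomic replacement matters more than preserving symlinks, the mv form shown above is the better choice.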


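Edge case for the sed and awk variants above: If the very last char. in the input file happens to be a lone \r without a following \n, it will also be replaced with a \n (GNU sed removes it without adding a \n). The perl command only matches \r\n pairs, so it leaves such a trailing \r in place.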


For the sake of completeness: here are additional, possibly suboptimal solutions:

None of them offers in-place updating, but you can employ the > tmpfile && mv tmpfile myitems.csv idiom introduced above.


Using tr: a very simple solution that simply removes all \r instances; thus, it can only be used if \r instances only occur as part of \r\n sequences; typically, however, that is the case:

tr -d '\r' < myitems.csv
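
To check whether the tr shortcut is safe for a given file, one could first count the CRs that are not at the end of a line. This is only a sketch: count_lone_cr is a hypothetical helper name, and the demo data is invented to include one CR in the middle of a field.

```shell
# Hypothetical helper: print how many CRs are NOT the last character of a
# line, i.e. CRs that tr -d '\r' would delete in addition to the line endings.
count_lone_cr() {
  awk '{ sub(/\r$/, ""); n += gsub(/\r/, "") } END { print n + 0 }' "$1"
}

# Demo: the second line contains a CR in the middle of a field
demo=$(mktemp "${TMPDIR:-/tmp}/crdemo.XXXXXX")
printf 'a,b\r\nc\rd,e\r\n' > "$demo"

if [ "$(count_lone_cr "$demo")" -eq 0 ]; then
  tr -d '\r' < "$demo"
else
  echo "lone CRs present; prefer the sed or awk variants" >&2
fi
```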

Using pure bash code: note that this will be slow; like the tr solution, this can only be used if \r instances only occur as part of \r\n sequences.

while IFS=$'\r' read -r line; do
  printf '%s\n' "$line"
done < myitems.csv

$IFS is the internal field separator, and setting it to \r causes read to read everything before \r, if present, into variable $line (if there's no \r, the line is read as is). -r prevents read from interpreting \ instances in the input.

Edge case: If the input doesn't end with \n, the last line will not print - you could fix that by using read -r line || [[ -n $line ]].
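
Put together, the loop with that fix applied might look as follows. This is a sketch, not the original author's code: strip_cr_loop is a made-up name, the CR is produced with printf instead of $'\r', and [ -n "$line" ] is used as the portable spelling of [[ -n $line ]].

```shell
# Hypothetical helper: print each line of file $1 without its trailing CR,
# including a final line that is not terminated by a newline.
strip_cr_loop() {
  cr=$(printf '\r')
  # read returns non-zero on an unterminated last line but still fills $line;
  # the || test keeps the loop body running for that line.
  while IFS=$cr read -r line || [ -n "$line" ]; do
    printf '%s\n' "$line"
  done < "$1"
}

# Demo: the last line deliberately lacks a trailing newline
demo=$(mktemp "${TMPDIR:-/tmp}/crdemo.XXXXXX")
printf 'strPicture,strEmail\r\nimage1xl.jpg,me@example.com' > "$demo"
strip_cr_loop "$demo" | cat -v
```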

