Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


io redirection - Powershell: stdout and stderr to separate files, without new lines

I'm trying to store the stdout and stderr outputs of a command to two separate files. I'm doing this like so:

powershell.exe @_cmd 2>"stderr.txt" >"stdout.txt"

Where $_cmd is an arbitrary string command.

This works, but the output files have newlines appended after the output. I'd like to modify this to eliminate the newlines. I know you can use cmd | Out-File ... -NoNewline or [System.IO.File]::WriteAllText(..., [System.Text.Encoding]::ASCII), but I'm not sure how to accomplish this with the stderr output.

EDIT: I've realized that the issue isn't the trailing new line specifically (although I still want to remove it), but the fact that I need the output file to be UTF-8 encoded. The trailing new line is not a valid UTF-8 character apparently, which is what's causing me grief. Perhaps there's a way to capture the stderr and stdout to separate variables, and then use Out-File -Encoding utf8?


1 Reply


Your own Start-Process-based solution that uses -RedirectStandardOutput and -RedirectStandardError indeed creates (BOM-less) UTF-8-encoded output files, but note that they too invariably have a trailing newline.

However, you do not need Start-Process, because you can make PowerShell's redirection operator, >, produce UTF-8 files too (also with a trailing newline).

The following examples use a sample cmd.exe call that produces both stdout and stderr output.

  • In PowerShell (Core) v6+, no extra effort is needed, because > produces (BOM-less) UTF-8 files by default (a default that is used consistently; if you want UTF-8 with a BOM, you can use the technique detailed for Windows PowerShell below, but with value 'utf8bom'):

    cmd /c 'echo hü & dir c:\nosuch' 2>stderr.txt >stdout.txt
    
  • In Windows PowerShell, > produces UTF-16LE ("Unicode") files by default, but in version 5.1 you can (temporarily) reconfigure it to use UTF-8 instead, albeit invariably with a BOM; see this answer for details. Another caveat is that the first stderr line captured in the file will be formatted "noisily", like a PowerShell error:

    # Windows PowerShell v5.1:
    # Make `>` and its effective alias, Out-File, use UTF-8 with a BOM in the
    # remainder of the session.
    # Save and restore any previous value if you want to scope the behavior
    # to select commands only.
    $PSDefaultParameterValues['Out-File:Encoding'] = 'utf8'
    
    cmd /c 'echo hü & dir c:\nosuch' 2>stderr.txt >stdout.txt
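
The "save and restore" advice in the comments above could look roughly like the following. This is only a sketch: the helper name Invoke-WithOutFileEncoding is made up for illustration and is not a built-in command, and it assumes Windows PowerShell v5.1, where > honors the Out-File default.

```powershell
# Hypothetical helper (not a built-in): runs a script block with
# 'Out-File:Encoding' temporarily set, then restores the previous state.
function Invoke-WithOutFileEncoding {
    param(
        [Parameter(Mandatory)] [scriptblock] $ScriptBlock,
        [string] $Encoding = 'utf8'
    )
    $key = 'Out-File:Encoding'
    # Remember whether a default existed before, and what it was.
    $hadValue = $PSDefaultParameterValues.ContainsKey($key)
    $oldValue = $PSDefaultParameterValues[$key]
    try {
        $PSDefaultParameterValues[$key] = $Encoding
        & $ScriptBlock
    }
    finally {
        # Restore the old value, or remove the entry if there was none.
        if ($hadValue) { $PSDefaultParameterValues[$key] = $oldValue }
        else { $PSDefaultParameterValues.Remove($key) }
    }
}

# Example call (Windows only, because it uses cmd.exe):
# Invoke-WithOutFileEncoding { cmd /c 'echo hü & dir c:\nosuch' 2>stderr.txt >stdout.txt }
```

The try / finally pair is there so that the previous setting comes back even if the redirected command fails.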
    

Caveat:

  • Whenever PowerShell processes an external program's output, it invariably decodes it into .NET strings first. Any external program is assumed to produce output based on the character encoding stored in [Console]::OutputEncoding, which defaults to the system's active OEM code page. This works as expected with cmd.exe, but there are other console applications that use different encodings - notably node.exe (Node.js) and python, which use UTF-8 and the system's active ANSI code page, respectively - in which case [Console]::OutputEncoding must be set to that encoding first; see this answer for more information.
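
To illustrate the caveat above, here is a minimal sketch of temporarily switching [Console]::OutputEncoding while calling a program that emits UTF-8. The helper name Invoke-WithConsoleOutputEncoding is hypothetical, and the commented example assumes node.exe is installed and on the PATH.

```powershell
# Hypothetical helper (not a built-in): decodes external-program output
# with the given encoding for the duration of one script block only.
function Invoke-WithConsoleOutputEncoding {
    param(
        [Parameter(Mandatory)] [scriptblock] $ScriptBlock,
        # Defaults to UTF-8 (this constructor creates a BOM-less encoding).
        [System.Text.Encoding] $Encoding = [System.Text.UTF8Encoding]::new()
    )
    $previous = [Console]::OutputEncoding
    try {
        [Console]::OutputEncoding = $Encoding
        & $ScriptBlock
    }
    finally {
        # Put the original encoding back, even if the call failed.
        [Console]::OutputEncoding = $previous
    }
}

# Example call, assuming node.exe is available:
# $lines = Invoke-WithConsoleOutputEncoding { node -e "console.log('hü')" }
```

For a program that uses the ANSI code page instead, you would pass a matching encoding object via -Encoding.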

As for your statements and questions:

"The trailing new line is not a valid UTF-8 character apparently"

PowerShell's > operator and file-output cmdlets apply their character encoding consistently, so the trailing newline's encoding is always consistent with that of the other characters in the file.

Most likely, the true problem was the UTF-16LE ("Unicode") encoding that Windows PowerShell uses by default, and you may have only noticed it with respect to the newline.

"Perhaps there's a way to capture the stderr and stdout to separate variables"

Stdout can be captured by a simple variable assignment, which captures multiple output lines as an array of strings:

$stdout = cmd /c 'echo hü & dir c:\nosuch'

You cannot capture stderr output separately in a variable, but you can merge stderr into stdout with 2>&1 and later separate the streams' respective output lines again, based on their data types: stdout lines are always strings, whereas stderr lines are always [System.Management.Automation.ErrorRecord] instances:

# Note the 2>&1 redirection.
$stdoutAndErr = cmd /c 'echo hü & dir c:\nosuch' 2>&1

# If desired, you can split the captured output into stdout and stderr output.
# The [string[]] cast converts the [ErrorRecord] instances to strings too.
$stdout, [string[]] $stderr = $stdoutAndErr.Where({ $_ -is [string] }, 'Split')

# Now $stdout is the array of stdout lines, and $stderr the array of stderr lines.
# If desired, you could write them to files *without a trailing newline* as follows:
$stdout -join [Environment]::NewLine | Set-Content -NoNewLine -Encoding utf8 stdout.txt
$stderr -join [Environment]::NewLine | Set-Content -NoNewLine -Encoding utf8 stderr.txt
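
Note that in Windows PowerShell, -Encoding utf8 writes a BOM. If you need BOM-less UTF-8 without a trailing newline in both PowerShell editions, one option is to call .NET directly. The following is a sketch, not the only way to do it; the function name Write-Utf8NoBomNoNewline is invented for illustration.

```powershell
# Hypothetical helper (not a built-in): joins lines with the platform newline
# and writes them as BOM-less UTF-8, without a trailing newline.
function Write-Utf8NoBomNoNewline {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string[]] $Lines
    )
    # .NET does not share PowerShell's current location, so resolve to a full path.
    $fullPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Path)
    $text = $Lines -join [Environment]::NewLine
    # $false = do not emit a BOM.
    [System.IO.File]::WriteAllText($fullPath, $text, [System.Text.UTF8Encoding]::new($false))
}

# Example calls, using the variables from above:
# Write-Utf8NoBomNoNewline -Path stdout.txt -Lines $stdout
# Write-Utf8NoBomNoNewline -Path stderr.txt -Lines $stderr
```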

You can also apply these techniques to PowerShell-native commands (and you can even merge all other streams that PowerShell supports into the success output stream, PowerShell's analog to stdout, with *>&1).
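
As a rough illustration of *>&1 with PowerShell-native code, the sketch below merges all streams of a sample script block and then sorts the collected objects by type. The script block and its messages are made up, and it assumes the default $ErrorActionPreference of 'Continue'.

```powershell
# Merge all streams of a sample script block into the success stream.
$all = & {
    'out line'                  # success stream
    Write-Error 'err line'      # error stream
    Write-Warning 'warn line'   # warning stream
} *>&1

# Separate the collected objects again, based on their types.
$outLines  = @($all | Where-Object { $_ -is [string] })
$errLines  = @($all | Where-Object { $_ -is [System.Management.Automation.ErrorRecord] })
$warnLines = @($all | Where-Object { $_ -is [System.Management.Automation.WarningRecord] })
```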

However, if a given PowerShell-native command is a cmdlet / advanced script or function, the more convenient alternative is to use the common -OutVariable parameter (for success-stream output) and common -ErrorVariable parameter (for error-stream output).
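
A small sketch of that alternative, using Get-Item with one path that exists and one that is assumed not to exist (the 'nosuch' item under $PSHOME is just a stand-in):

```powershell
# One existing path and one (assumed) nonexistent path.
$paths = $PSHOME, (Join-Path $PSHOME 'nosuch')

# -OutVariable collects success output in $out, -ErrorVariable collects errors in $err.
# Note that the variable names are passed without the $ sigil.
# -ErrorAction SilentlyContinue only silences the display; $err is still populated.
Get-Item -Path $paths -OutVariable out -ErrorVariable err -ErrorAction SilentlyContinue |
    Out-Null

# $out now holds the item for $PSHOME; $err holds the error record for the missing path.
```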

