Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

scripting - How can I filter out text twice in PowerShell?

I have a PowerShell script that returns output close to what I want, but there are a few lines and HTML-style tags I still need to remove. I already have the following code to filter the output:

get-content "txtfile.txt" | select-string -Pattern '<fields>' -Context 1

However, if I pipe that output into a second "select-string", I don't get any results back. I looked at regex examples online, but most of what I've seen relies on loops. I'm more used to the Linux shell, where you can pipe output through multiple greps to filter text. Is there a way to achieve the same thing, or something similar, with PowerShell?

Here's the file I'm working with, as requested:

<?xml version="1.0" encoding="UTF-8"?>
<CustomObject xmlns="http://soap.force.com/2006/04/metadata">
<actionOverrides>
    <actionName>Accept</actionName>
    <type>Default</type>
</actionOverrides>
<actionOverrides>
    <actionName>CancelEdit</actionName>
    <type>Default</type>
</actionOverrides>
<actionOverrides>
    <actionName>Today</actionName>
    <type>Default</type>
</actionOverrides>
<actionOverrides>
    <actionName>View</actionName>
    <type>Default</type>
</actionOverrides>
<compactLayoutAssignment>SYSTEM</compactLayoutAssignment>
<enableFeeds>false</enableFeeds>
<fields>
    <fullName>ActivityDate</fullName>
</fields>
<fields>
    <fullName>ActivityDateTime</fullName>
</fields>
<fields>
    <fullName>Guid</fullName>
</fields>
<fields>
    <fullName>Description</fullName>
</fields>
</CustomObject>

So, I only want the text inside the <fullName> elements, and I have the following so far:

get-content "txtfile.txt" | select-string -Pattern '<fields>' -Context 1

This gives me each <fields> line plus one line of context on either side; what I actually need is just the content of the <fullName> line, without the XML tags.


1 Reply


The simplest PSv3+ solution is to use PowerShell's built-in XML DOM support, which makes an XML document's nodes accessible as a hierarchy of objects with dot notation:

PS> ([xml] (Get-Content -Raw txtfile.txt)).CustomObject.fields.fullName
ActivityDate
ActivityDateTime
Guid
Description

Note that .fields is an array representing all child <fields> elements of the top-level <CustomObject> element, yet .fullName can be applied to it directly: it returns the values of the child <fullName> elements across all array elements (the <fields> elements) as an array.

This ability to access a property on a collection and have it implicitly applied to the collection's elements, with the results getting collected in an array, is a generic PSv3+ feature called member enumeration.
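To make member enumeration concrete, here is a minimal sketch that compares it with an explicit loop. The inline XML is a trimmed, hypothetical stand-in for the file in the question, and the variable names are illustrative only.

```powershell
# Hypothetical inline sample, trimmed from the file in the question.
$xmlText = @'
<?xml version="1.0" encoding="UTF-8"?>
<CustomObject xmlns="http://soap.force.com/2006/04/metadata">
<fields>
    <fullName>ActivityDate</fullName>
</fields>
<fields>
    <fullName>Guid</fullName>
</fields>
</CustomObject>
'@

$doc    = [xml] $xmlText
$fields = $doc.CustomObject.fields   # array with one entry per <fields> element

# Member enumeration (PSv3+): the property is read from every array element.
$names = $fields.fullName

# The explicit per-element form, which should give the same result.
$viaLoop = $fields | ForEach-Object { $_.fullName }

$names -join ', '     # ActivityDate, Guid
$viaLoop -join ', '   # ActivityDate, Guid
```

The dot-notation form is shorter, while the ForEach-Object form is the one to reach for if the script has to run on PSv2, which lacks member enumeration.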


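As an alternative, consider the Select-Xml cmdlet (available in PSv2 too). It supports XPath queries, which generally allow for more complex extraction logic, though that is not strictly needed here. Select-Xml is a high-level wrapper around the .SelectNodes() method of the [xml] type (System.Xml.XmlDocument).
The following is the equivalent of the solution above. Note that the (...).Node.InnerText shorthand itself relies on member enumeration, so on PSv2 you would pipe the results to ForEach-Object instead: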

$namespaces = @{ ns="http://soap.force.com/2006/04/metadata" }
$xpathQuery = '/ns:CustomObject/ns:fields/ns:fullName'
(Select-Xml -LiteralPath txtfile.txt -XPath $xpathQuery -Namespace $namespaces).Node.InnerText

Note:

Unlike with dot notation, XML namespaces must be considered when using Select-Xml.

Given that <CustomObject> and all its descendants are in the document's default namespace (declared with the xmlns attribute and identified by the URI http://soap.force.com/2006/04/metadata), you must:

  • define this namespace in a hashtable that you pass as the -Namespace argument
    • Caveat: The default namespace xmlns is special in that it cannot be used as the key in the hashtable; instead, choose an arbitrary key name such as ns, and be sure to use that same key name as the node-name prefix (see next point).
  • prefix all node names in the XPath query with the chosen key name followed by a colon (:); e.g., ns:CustomObject
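
The effect of the namespace prefix can be checked with a small sketch like the one below. It is illustrative only: the inline XML is a trimmed, hypothetical stand-in for the file in the question, it is passed through Select-Xml's -Content parameter instead of -LiteralPath so that the example is self-contained, and the variable names are made up.

```powershell
# Hypothetical inline sample, trimmed from the file in the question.
$xmlText = @'
<?xml version="1.0" encoding="UTF-8"?>
<CustomObject xmlns="http://soap.force.com/2006/04/metadata">
<fields>
    <fullName>ActivityDate</fullName>
</fields>
<fields>
    <fullName>Guid</fullName>
</fields>
</CustomObject>
'@

# 'ns' is an arbitrary key; it only has to match the prefix used in the query.
$namespaces = @{ ns = 'http://soap.force.com/2006/04/metadata' }

# Unprefixed names refer to elements in no namespace, so nothing matches.
$unprefixed = @(Select-Xml -Content $xmlText -XPath '/CustomObject/fields/fullName')

# Prefixed names are resolved through the hashtable to the document's namespace.
$prefixed = @(Select-Xml -Content $xmlText -XPath '/ns:CustomObject/ns:fields/ns:fullName' -Namespace $namespaces)

# Explicit loop instead of member enumeration, so this also suits PSv2.
$names = $prefixed | ForEach-Object { $_.Node.InnerText }

"Unprefixed matches: $($unprefixed.Count)"   # Unprefixed matches: 0
"Prefixed matches: $($prefixed.Count)"       # Prefixed matches: 2
$names
```

An unprefixed query that silently returns nothing is the usual symptom of a missing namespace mapping, so comparing the two counts is a quick way to confirm that the prefix and the hashtable key line up.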
