Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

xml - Handling long edit lists in XMLStarlet

Versions of XMLStarlet found in current Linux distributions have a limit of 128 operations per xmlstarlet ed invocation, and all versions are limited by the operating system's maximum command-line length. How can this be worked around?
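As a quick illustration of the second limit, the command-line ceiling on a given machine can be inspected with the POSIX `getconf` utility. This is only a sketch for orientation; the variable name `arg_max` is an assumption introduced here, and the value reported covers arguments plus environment combined, so the usable space is somewhat smaller.

```shell
# Upper bound, in bytes, on the combined size of the argument list and
# environment that the system accepts for a single exec() call.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX on this system: $arg_max bytes"
```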


1 Reply

The following breaks long xmlstarlet edit lists into a pipeline of shorter operations:

xmlstarlet_max_commands=100 # max per instance; see http://sourceforge.net/tracker/?func=detail&aid=3488240&group_id=66612&atid=515106
shopt -s extglob # enable +([0-9]) as an equivalent to the regex ^[[:digit:]]+$

xmlstarlet_ed() {
  declare -a global_parameters
  declare -a parameters
  declare -i num_commands
  declare -i cmd_len

  global_parameters=( )
  parameters=( )
  num_commands=0

  local global_parameters_remaining=$1; shift  # local, so it does not leak into the caller

  while (( global_parameters_remaining )); do
    global_parameters+=( "$1" ); shift
    (( global_parameters_remaining-- ))
  done

  while (( "$#" )) ; do
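    # validate before assigning: cmd_len is declared with -i, so assigning a
    # non-numeric value would be evaluated as arithmetic rather than kept as-is
    if ! [[ $1 = +([0-9]) ]] ; then
      echo "ERROR: xmlstarlet_ed commands must be prefixed by run length" >&2
      return 1
    fi
    cmd_len=$1; shift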

    if (( num_commands < xmlstarlet_max_commands )) ; then
      parameters+=( "${@:1:$cmd_len}" )
      num_commands+=1
      shift $cmd_len
    else
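      # run the current batch, then hand the remaining commands to a recursive call;
      # only the recursive call takes the global-parameter count, not xmlstarlet itself
      xmlstarlet ed "${global_parameters[@]}" "${parameters[@]}" \
        | xmlstarlet_ed "${#global_parameters[@]}" "${global_parameters[@]}" "$cmd_len" "$@"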
      return 0
    fi
  done

  if (( ${#parameters[@]} > 0 )) ; then
    xmlstarlet ed "${global_parameters[@]}" "${parameters[@]}"
  else
    cat
  fi
}
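To see where the split points fall without running xmlstarlet at all, the same length-prefixed walk can be done in a dry run. The following is a minimal sketch, not part of the answer's function: `show_batches` is a hypothetical helper, and its first argument (the per-batch limit) stands in for `xmlstarlet_max_commands`.

```shell
# Hypothetical helper: print each batch that a length-prefixed command list
# would be split into, without invoking xmlstarlet.
show_batches() {
  local max=$1; shift            # maximum number of commands per batch
  local -a batch=( )
  local -i count=0 len
  while (( $# )); do
    len=$1; shift                # length prefix of the next command
    if (( count == max )); then  # batch is full: emit it and start a new one
      printf 'batch: %s\n' "${batch[*]}"
      batch=( ); count=0
    fi
    batch+=( "${@:1:$len}" )     # copy the command's words as one unit
    count+=1
    shift "$len"
  done
  if (( count )); then           # emit the final, possibly partial, batch
    printf 'batch: %s\n' "${batch[*]}"
  fi
  return 0
}

# Three commands of different lengths, at most two per batch
show_batches 2  2 -d //a  2 -d //b  4 -u //c -v x
```

With a limit of two, the first two commands land in one batch and the third in another; a command is never split across batches, because the length prefix tells the loop how many words belong together.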

It can be invoked like so:

# first list passed is global parameters; first the count, then the values
# pass only a 0 if no global parameters are desired
global_parameters=( 2 -N "xhtml=http://www.w3.org/1999/xhtml" )

# build up the parameter list as length/command pairs; the lengths are used
# to determine the potential split points between subprocesses
parameters=( )
while IFS= read -r; do  # -r keeps backslashes in the input lines intact
  parameters+=( 8 -s /xhtml:html/xhtml:body -t elem -n line -v "$REPLY" )
done

# ...and actually invoke:
xmlstarlet_ed "${global_parameters[@]}" "${parameters[@]}" \
  <<<"<html xmlns='http://www.w3.org/1999/xhtml'><body/></html>"
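Counting the words of each command by hand (the `8` above) is easy to get wrong. One possible way around that, sketched here under the assumption that commands are appended one at a time, is a small helper that computes the prefix itself. `add_cmd` is a hypothetical name, not something the function above provides.

```shell
parameters=( )

# Hypothetical helper: append one edit command, prefixed by its own word count
add_cmd() {
  parameters+=( "$#" "$@" )
}

add_cmd -s /xhtml:html/xhtml:body -t elem -n line -v "first line"
add_cmd -d '//xhtml:line[1]'

# Show the resulting length/command list, one word per line
printf '%s\n' "${parameters[@]}"
```

Because `"$#"` is evaluated inside the helper, a value containing spaces still counts as a single word, so the prefix stays in step with the words that follow it.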
