Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


regex - I need to convert XML to JSON in JavaScript on Parse.com's Cloud Code

PLEASE DO NOT COMMENT SAYING YOU CANNOT PARSE XML WITH REGEX. IT CAN BE DONE; IT'S JUST NOT THE BEST WAY. AND PLEASE DON'T DOWNVOTE THIS QUESTION FOR NO REASON.

On Parse.com's Cloud Code you currently cannot convert XML to JSON without major coding effort. I found the following code at: http://killzonekid.com/worlds-smallest-fastest-xml-to-json-javascript-converter/

xml = xml.replace(/\s/g, ' ')
    .replace(/< *\?[^>]*?\? *>/g, '')
    .replace(/< *!--[^>]*?-- *>/g, '')
    .replace(/< *(\/?) *(\w+):(\w+)/g, '<$1$2_$3')
    .replace(/< *(\w+)([^>]*?)\/ *>/g, '< $1$2>')
    .replace(/(\w+):(\w+) *= *"([^>]*?)"/g, '$1_$2="$3"')
    .replace(/< *(\w+)((?: *\w+ *= *" *[^"]*?")+ *)>( *[^< ]*?.*?)< *\/ *\1 *>/g, '< $1$2 value="$3">')
    .replace(/ *(\w+) *= *"([^>]*?)" */g, '< $1>$2')
    .replace(/< *(\w+) *</g, '<$1>< ')
    .replace(/> *>/g, '>')
    .replace(/< *\/ *(\w+) *> *< *\1 *>/g, '')
    .replace(/"/g, '\\"')
    .replace(/< *(\w+) *>([^<>]*?)< *\/ *\1 *>/g, '"$1":"$2",')
    .replace(/< *(\w+) *>([^<>]*?)< *\/ *\1 *>/g, '"$1":{$2},')
    .replace(/< *(\w+) *>(?=.*?< \/\1\},\{)/g, '"$1":[{')
    .split(/\},\{/).reverse().join('},{')
    .replace(/< *\/ *(\w+) *>(?=.*?"\1":\[\{)/g, '}],')
    .split(/\},\{/).reverse().join('},{')
    .replace(/< \/(\w+)\},\{\1>/g, '},{')
    .replace(/< *(\w+)[^>]*?>/g, '"$1":{')
    .replace(/< *\/ *\w+ *>/g, '},')
    .replace(/\} *,(?= *(\}|\]))/g, '}')
    .replace(/\] *,(?= *(\}|\]))/g, ']')
    .replace(/" *,(?= *(\}|\]))/g, '"')
    .replace(/ *, *$/g, '');

It actually does quite a good job of converting XML to JSON.
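To make the chain less opaque, here is a sketch of only its first three steps (whitespace normalization, then removal of the XML declaration and of comments) run on a small input. It assumes the escapes in those patterns read \s and \?; the function name stripPrologAndComments and the sample string are made up for illustration.

```javascript
// Sketch of the first three steps of the chain only; the function name is hypothetical.
function stripPrologAndComments(xml) {
  return xml
    .replace(/\s/g, ' ')                 // every whitespace character becomes a plain space
    .replace(/< *\?[^>]*?\? *>/g, '')    // drop <?xml ... ?> declarations
    .replace(/< *!--[^>]*?-- *>/g, '');  // drop <!-- ... --> comments that contain no '>'
}

const sample = '<?xml version="1.0" encoding="UTF-8" ?>\n<api><!-- note --><sku>5938</sku></api>';
const stripped = stripPrologAndComments(sample);
console.log(stripped); // ' <api><sku>5938</sku></api>'
```

The leading space in the result is what remains of the newline that followed the declaration.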

There are a few quirks with the code:

  1. It messes up the attributes.
  2. It doesn't like names with hyphens in them. To fix the hyphens I changed all the \w+ to \w[\w'-]. Is this the best way?

Here is an example XML document:

<?xml version="1.0" encoding="UTF-8" ?>
<api>
    <products total-matched="1618" records-returned="1" page-number="1">
        <product>
            <ad-id>1234</ad-id>
            <supplier-name>Window World</supplier-name>
            <supplier-category>3703703</supplier-category>
            <buy-url>http://website.com</buy-url>
            <currency>USD</currency>
            <description>Window</description>
            <image-url>http://website.com/windowa/80x80.jpg</image-url>
            <in-stock>yes</in-stock>
            <manufacturer-name>Window World</manufacturer-name>
            <name>Half Pain Glass</name>
            <price>31.95</price>
            <retail-price>87.60</retail-price>
            <sale-price>29.95</sale-price>
            <sku>5938</sku>
            <upc></upc>
        </product>
    </products>
</api>

Example output:

{
    "api": {
        "products": {
            "total-matched": {
                1618 "records-returned": {
                    1 "page-number": {
                        1 >
                            "product": {
                            "adid": "1234",
                            "suppliername": "Window World",
                            "suppliercategory": "3703703",
                            "buyurl": "http://website.com",
                            "currency": "USD",
                            "description": "Window",
                            "imageurl": "http://website.com/windowa/80x80.jpg",
                            "instock": "yes",
                            "manufacturername": "Window World",
                            "name": "Half Pain Glass",
                            "price": "31.95",
                            "retailprice": "87.60",
                            "saleprice": "29.95",
                            "sku": "5938",
                            "upc": ""
                        }
                    }
                }
            }
        }
    }
}

1 Reply


Judging by the structure of the resulting JSON, my guess is that the converter is not meant to handle attributes at all. To support them, you would need to change quite a few things, including the way the nested JSON is built...
Wouldn't it be possible to just change:

<products total-matched="1618" records-returned="1" page-number="1">

to

<products>
  <total-matched>1618</total-matched>
  <records-returned>1</records-returned>
  <page-number>1</page-number>
  <product>...

...as it would give you what you expect to have with attributes (I guess).

As for the hyphens, your idea is good: just change the \w to [\w-] and it should work (I'll gladly admit I didn't look into all the regexes, so it's just a guess once more). \w+ would become [\w-]+ and so on.
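The difference is easy to see on a single step of the chain. The sketch below takes the leaf-element pattern (assuming its escapes read \w, \/ and \1) and runs it once with \w+ and once with [\w-]+ as the name part; the variable names and the short input are illustrative only.

```javascript
// Leaf-element step from the chain, once with \w+ and once with [\w-]+ for tag names.
const withWord = /< *(\w+) *>([^<>]*?)< *\/ *\1 *>/g;
const withHyphen = /< *([\w-]+) *>([^<>]*?)< *\/ *\1 *>/g;

const leaf = '<ad-id>1234</ad-id><sku>5938</sku>';

// \w does not match '-', so <ad-id> is left untouched by the first pattern.
const before = leaf.replace(withWord, '"$1":"$2",');
const after = leaf.replace(withHyphen, '"$1":"$2",');

console.log(before); // '<ad-id>1234</ad-id>"sku":"5938",'
console.log(after);  // '"ad-id":"1234","sku":"5938",'
```

Placing the hyphen last inside the character class keeps it literal, so it needs no escape.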

Edit:

You can add a preliminary step that rewrites your XML first. This regex should do that part:

/(<\w+[^<]*?)\s+([\w-]+)="([^"]+)">/
// assuming there is no " in your attributes' values (would be more complicated...)

Test:

var string = '<api><products total-matched="1618" records-returned="1" page-number="1">';
var regex = /(<\w+[^<]*?)\s+([\w-]+)="([^"]+)">/;
while (string.match(regex)) string = string.replace(regex, '$1><$2>$3</$2>');

Result:

"<api><products><total-matched>1618</total-matched><records-returned>1</records-returned><page-number>1</page-number>"
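If this step is going to run before the converter, it may be convenient to wrap the loop in a function. The following is a minimal sketch under the same assumption as above (no " inside attribute values); the name attributesToElements and the shortened input are hypothetical, not part of the original converter.

```javascript
// Hypothetical helper wrapping the loop above: each pass moves the LAST attribute
// of an opening tag into a child element, until no attributes are left.
function attributesToElements(xml) {
  const lastAttribute = /(<\w+[^<]*?)\s+([\w-]+)="([^"]+)">/;
  let out = xml;
  while (lastAttribute.test(out)) {
    out = out.replace(lastAttribute, '$1><$2>$3</$2>');
  }
  return out;
}

const input =
  '<products total-matched="1618" records-returned="1" page-number="1">' +
  '<product><ad-id>1234</ad-id></product></products>';

const lifted = attributesToElements(input);
console.log(lifted);
// <products><total-matched>1618</total-matched><records-returned>1</records-returned>
// <page-number>1</page-number><product><ad-id>1234</ad-id></product></products>
// (printed as one line)
```

Because the pattern requires the tag to end in ">, self-closing tags (ending in "/>) and attributes quoted with single quotes are left unchanged by this sketch.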
