Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
965 views
in Technique[技术] by (71.8m points)

mongodb - How to change datatype of nested field in Mongo document?

My Mongo structure as below,

"topProcesses" : [
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "1",
            "memoryUtilizationPercent" : "0.1",
            "command" : "init",
            "user" : "root"
        },
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "2",
            "memoryUtilizationPercent" : "0.0",
            "command" : "kthreadd",
            "user" : "root"
        },
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "3",
            "memoryUtilizationPercent" : "0.0",
            "command" : "ksoftirqd/0",
            "user" : "root"
        },
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "5",
            "memoryUtilizationPercent" : "0.0",
            "command" : "kworker/0:+",
            "user" : "root"
        },
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "6",
            "memoryUtilizationPercent" : "0.0",
            "command" : "kworker/u3+",
            "user" : "root"
        },
        {
            "cpuUtilizationPercent" : "0.0",
            "processId" : "8",
            "memoryUtilizationPercent" : "0.0",
            "command" : "rcu_sched",
            "user" : "root"
        } 
    ]

Now in above documents topProcesses.cpuUtilizationPercent is in string and I wanted to change topProcesses.cpuUtilizationPercent data type to Float. For this I tried below but it did not work

db.collectionName.find({
   "topProcesses":{"$exists":true}}).forEach(function(data){
    for(var ii=0;ii<data.topProcesses.length;ii++){
   db.collectionName.update({_id: data._id},{$set:{"topProcesses.$.cpuUtilizationPercent":parseFloat(data.topProcesses[ii].cpuUtilizationPercent)}},false,true);
  }
})

Can any one help how to changed string to float in nested Mongo documents

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You are doing this the right way but you did not include the array element to match in the query portion of the .update():

db.collectionName.find({
   "topProcesses":{"$exists":true}}).forEach(function(data){
    for(var ii=0;ii<data.topProcesses.length;ii++) {
      db.collectionName.update(
         { 
             "_id": data._id, 
             "topProcesses.processId": data.topProcesses[ii].processId // corrected
         },
         {
             "$set": {
               "topProcesses.$.cpuUtilizationPercent":
                   parseFloat(data.topProcesses[ii].cpuUtilizationPercent)
             }
         }
      );
  }
})

So you need to match something in the array in order for the positional $ operator to have any effect.

You also could have just used the "index" value in the notation, since you are producing that in a loop anyway:

db.collectionName.find({
   "topProcesses":{"$exists":true}}).forEach(function(data){
    for(var ii=0;ii<data.topProcesses.length;ii++) {

      var updoc =  { 
          "$set": {}
      };

      var myKey = "topProcesses." + ii + ".cpuUtilizationPercent";
      updoc["$set"][myKey] = parseFloat(data.topProcesses[ii].cpuUtilizationPercent);

      db.collectionName.update(
         { 
             "_id": data._id
         },
         updoc
      );
  }
})

Which just uses the matching index and is handy where there is no unique identifier of the array element.

Also note that neither the "upsert" or "multi" options should apply here due to the nature of how this is processes existing documents.


Just as a "postscript" note to this, it is also worthwhile to consider the Bulk Operations API of MongoDB in versions from 2.6 and greater. Using these API methods you can significantly reduce the amount of network traffic between your client application and the database. The obvious improvement here is in the overall speed:

var bulk = db.collectionName.initializeOrderedBulkOp();
var counter = 0;

db.collectionName.find({
   "topProcesses":{"$exists":true}}
).forEach(function(data){
    for(var ii=0;ii<data.topProcesses.length;ii++) {

      var updoc =  { 
          "$set": {}
      };

      var myKey = "topProcesses." + ii + ".cpuUtilizationPercent";
      updoc["$set"][myKey] = parseFloat(data.topProcesses[ii].cpuUtilizationPercent);

      // queue the update
      bulk.find({ "_id": data._id }).update(updoc);
      counter++;

      // Drain and re-initialize every 1000 update statements
      if ( counter % 1000 == 0 ) {
          bulk.execute();
          bulk = db.collectionName.initializeOrderedBulkOp();
      }
  }
})

// Add the rest in the queue
if ( counter % 1000 != 0 )
    bulk.execute();

This basically reduces the amount of operations statements sent to the sever to only sending once every 1000 queued operations. You can play with that number and how things are grouped but it will give a significant increase in speed in a relatively safe way.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...