Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
425 views
in Technique[技术] by (71.8m points)

for loop - Google Apps Script - Label Email Based On Email Body [Optimize Code]

I aim to do the following:

  • An email is received that matches the criteria be placed in Label A folder.
  • A new email is received.
  • This new email's body is a duplicate of the first received.
  • This new email skips the inbox and goes to Label B folder.

Here is how I implemented it:

  • All new emails are labeled as "Original"
  • When the script runs, it compares the email body with all previous ones.
  • If the body is a duplicate, it is labeled "Duplicate", moved to "Archive" and "Original" label is removed.

The code is as follow:

function emailLabeling() {


var DUPLICATE = _getLabel();
  var labels = GmailApp.getUserLabelByName("Original");
  if(labels != null){
    var threads = labels.getThreads();
    for (var i = 0; i < threads.length; i++){
      var messages = threads[i].getMessages();
      for (var j = 0; j < messages.length; j++){
        var message = messages[j];
        for (var k = i; k < threads.length; k++){
        var messages_check = threads[k].getMessages();
          for (var l = j; l < messages_check.length; l++){
            var message_check = messages_check[l];
            if(message_check.getPlainBody() == message.getPlainBody()){
              if(i !=  k || j != l){
                Logger.log(i +""+ j +""+ k +""+ l);
                DUPLICATE.addToThread(threads[i]);
                labels.removeFromThread(threads[i]);
                GmailApp.moveThreadToArchive(threads[i]);
              }
            }
          }
        }
      }
    }
  }
  else{
    Logger.log("Label Not Found!");
  }
}

function _getLabel() {
  var label_text = "Duplicates";
  var label = GmailApp.getUserLabelByName(label_text);
  if (label == null) {
    var label = GmailApp.createLabel(label_text);
  }
  return label;
}

The code works fine. The problem lies in 4 nested loop, which exponentially increases the runtime as the number of "Original" emails increase.

Is there a way to optimize this code? Is there a smarter logic to implement this idea?

Any help would be appreciated.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

One method of improving performance in nested loop situations - especially duplicate identification - is to store a record of traversed content, rather than repeatedly comparing. For example, you could hash the message body (given the right hash function) and store the hashes as object properties. Note that there is no formal limit on the length of an object property so you may be able to skip hashing it yourself (to obtain a fixed length property) and just let Google Apps Script do it for you. It's probably wise to test how large a message can be before using such an assumption in production, naturally.

function updateEmailLabels() {
  // Use an Object to associate a message's plaintext body with the
  // associated thread/message IDs (or other data as desired).
  var seenBodies = {}, // When a message is read, its plaintext body is stored.
      DUPLICATE = _getLabel("SO_Duplicates"),
      ORIGINAL = _getLabel("SO_Original");

  // getThreads() returns newest first. Start with the oldest by reversing it.
  ORIGINAL.getThreads().reverse().forEach(function (thread) {
    thread.getMessages().forEach(function (message, messageIndex) {
      // Use this message's body for fast lookups.
      // Assumption: Apps Script has no reachable limit on Object property length.
      var body = message.getPlainBody();

      // Compare this message to all previously seen messages:
      if (!seenBodies[body]) {
        seenBodies[body] = {
          count: 1,
          msgIndices: [ messageIndex ],
          threads: [ thread ],
          threadIds: [ thread.getId() ]
        };
      } else {
        // This exact message body has been observed previously.
        // Update information about where the body has been seen (or perform
        // more intricate checks, i.e. compare threadIds and message indices,
        // before treating this thread and message as a duplicate).
        seenBodies[body].count += 1;
        seenBodies[body].msgIndices.push(messageIndex);
        seenBodies[body].threads.push(thread);
        seenBodies[body].threadIds.push(thread.getId());
      }
    }); // End for-each-message. 
  }); // End for-each-thread.

  // All messages in all threads have now been read and checked against each other.
  // Determine the unique threads to be modified.
  var threadsToChange = {};
  for (var body in seenBodies) {
    if (seenBodies[body].count === 1)
      continue;
    var data = seenBodies[body];
    for (var threadIndex = 1; threadIndex < data.threads.length; ++threadIndex)
      threadsToChange[data.threadIds[threadIndex]] = data.threads[threadIndex];
  }
  // Update their labels and archive status.
  for (var id in threadsToChange) {
    var thread = threadsToChange[id];
    DUPLICATE.addToThread(thread);
    ORIGINAL.removeFromThread(thread);
    GmailApp.moveThreadToArchive(thread);
  }
}

function _getLabel(labelText) {
  var label = GmailApp.getUserLabelByName(labelText);
  return label ? label : GmailApp.createLabel(labelText);
}

You'll definitely want to tweak the duplicate detection bits, since I don't exactly have qualifying emails just laying around ;) I suspect what I've written will classify a thread as duplicate if at least 2 messages are the same, even if that thread is the only thread with that particular message body.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...