Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
430 views
in Technique[技术] by (71.8m points)

algorithm - How can I iterate a vector once and insert/remove/modify multiple elements along the way?

I want to iterate an array/vector once and modify multiple elements on the way as this is the most optimal solution. I don't want to scan it again and again just because Rust is unhappy about borrows.

I store a list of intervals represented as [start;stop] in a sorted vector and I want to add a new interval. It might be overlapping, so I want to remove all elements that are not needed anymore. I want to do it all in one go. Algorithm can look something like this (I cut some parts):

use std::cmp::{min, max};

#[derive(Debug, PartialEq, Clone, Copy)]
struct Interval {
    start: usize,
    stop: usize,
}

impl Interval {
    fn new(start: usize, stop: usize) -> Interval {
        Interval {
            start: start,
            stop: stop,
        }
    }
    pub fn starts_before_disjoint(&self, other: &Interval) -> bool {
        self.start < other.start && self.stop < other.start
    }
    pub fn starts_before_non_disjoint(&self, other: &Interval) -> bool {
        self.start <= other.start && self.stop >= other.start
    }
    pub fn starts_after(&self, other: &Interval) -> bool {
        self.start > other.start
    }
    pub fn starts_after_disjoint(&self, other: &Interval) -> bool {
        self.start > other.stop
    }
    pub fn starts_after_nondisjoint(&self, other: &Interval) -> bool {
        self.start > other.start && self.start <= other.stop
    }
    pub fn disjoint(&self, other: &Interval) -> bool {
        self.starts_before_disjoint(other)
    }
    pub fn adjacent(&self, other: &Interval) -> bool {
        self.start == other.stop + 1 || self.stop == other.start - 1
    }
    pub fn union(&self, other: &Interval) -> Interval {
        Interval::new(min(self.start, other.start), max(self.stop, other.stop))
    }
    pub fn intersection(&self, other: &Interval) -> Interval {
        Interval::new(max(self.start, other.start), min(self.stop, other.stop))
    }
}

fn main() {
    //making vectors
    let mut vec = vec![
        Interval::new(1, 1),
        Interval::new(2, 3),
        Interval::new(6, 7),
    ];
    let addition = Interval::new(2, 5); // <- this will take over interval @ 2 and will be adjacent to 3, so we have to merge
    let (mut i, len) = (0, vec.len());
    while i < len {
        let r = &mut vec[i];
        if *r == addition {
            return; //nothing to do, just a duplicate
        }
        if addition.adjacent(r) || !addition.disjoint(r) {
            //if they are next to each other or overlapping
            //lets merge
            let mut bigger = addition.union(r);
            *r = bigger;
            //now lets check what else we can merge
            while i < len - 1 {
                i += 1;
                let next = &vec[i + 1];
                if !bigger.adjacent(next) && bigger.disjoint(next) {
                    //nothing to merge
                    break;
                }
                vec.remove(i); //<- FAIL another mutable borrow
                i -= 1; //lets go back
                vec[i] = bigger.union(next); //<- FAIL and yet another borrow
            }
            return;
        }
        if addition.starts_before_disjoint(r) {
            vec.insert(i - 1, addition); // <- FAIL since another refence already borrowed @ let r = &mut vec[i]
        }
        i += 1;
    }
}

It fails in a couple of places because of the borrowing rules. Is there a way to either

  1. do it with iterators in one go
  2. work around the borrowing

There is borrow splitting available, but I don't see how I can apply it here.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

In general, you can't because this is exactly a class of bug that Rust prevents. Examine this sequence of events where I've replaced i with unique variables since the compiler doesn't know what values will be used anyway.

let r = &mut vec[i1];
let next = &vec[i2];
vec.remove(i3);
vec[i4] = bigger.union(next);         
vec.insert(i5, addition);
  • If you remove any value before i1 or i2 when you called vec.remove(i3), then the references in next and r would be invalidated as all the values would move over.
  • If i5 is before i1 or i2, then the same thing would happen, just in the other direction.
  • If i4 were equal to i2, then the immutable value of next would be changed.
  • If i4 were equal to i1, then modifications to r would happen through another path than the single owner of the mutable reference.

Note how each of these correspond to the points the compiler told you about.

It's possible that some of these cases might be fixed via non-lexical lifetimes, if the compiler becomes sufficiently smart to understand that you no longer need to have a reference around. It will not help with the cases of changing the vector through an array index; the compiler is not smart enough to track your math and prove that you never touch the "wrong" index, nor will it be smart enough to realize that two references into an array are disjoint if the indices are.


In this specific case, since your type implements Copy, make use of that to get a value out. Write directly back to the vector when you need to. You can't have borrowing errors if you never borrow.

fn main() {
    let mut vec = vec![
        Interval::new(1, 1),
        Interval::new(2, 3),
        Interval::new(6, 7),
    ];
    let addition = Interval::new(2, 5);
    let (mut i, len) = (0, vec.len());
    while i < len {
        let r = vec[i];
        if r == addition {
            return;
        }
        if addition.adjacent(&r) || !addition.disjoint(&r) {
            let mut bigger = addition.union(&r);
            vec[i] = bigger;
            while i < len - 1 {
                i += 1;
                let next = vec[i + 1];
                if !bigger.adjacent(&next) && bigger.disjoint(&next) {
                    break;
                }
                vec.remove(i); 
                i -= 1;
                vec[i] = bigger.union(&next);
            }
            return;
        }
        if addition.starts_before_disjoint(&r) {
            vec.insert(i - 1, addition);
        }
        i += 1;
    }
}

Realistically, I'd do as loganfsmyth suggests and change the algorithm to take a slice of intervals and return a new Vec. If you were doing this a lot, you could flip writing back and forth between two pre-allocated Vecs, but at that point there's probably a far better data structure than a vector; perhaps an interval tree.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...