Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Given edges, how can I find routes that consist of two edges in a vectorised way?

I have an array of towns and their neighbours. I want to get the set of all pairs of towns that are connected by at least one route consisting of exactly two different edges. Is there a vectorized way to do this? If not, why not? For example, the edges [3,0], [0,4], [5,0] share the incident node 0, so it's guaranteed that [3,4], [4,5], [3,5] are pairs of towns that can be connected by routes like so: 3-0-4, 4-0-5 and 3-0-5. Each of these routes consists of two edges.

Example of input: np.array([[3,0], [0,4], [5,0], [2,1], [1,4], [2,3], [5,2]])

Expected output: array([[4,3], [4,5], [3,5], [4,2], [1,3], [1,5], [3,5], [0,2], [0,1], [0,2]]) (It's fine if the order differs, if any edge directions are reversed, or if there are duplicates.)
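
For checking any vectorised version against the expected output, a plain-Python baseline can help. The sketch below is not vectorised, and the helper name `two_edge_pairs` is made up for illustration. It assumes there are no self-loops, groups the neighbours of each town, and pairs them up, so every returned pair shares a middle town. Duplicates are collapsed and each pair is sorted.

```python
from collections import defaultdict
from itertools import combinations

def two_edge_pairs(edges):
    """Return sorted unique pairs (a, b) joined by a route a-m-b (hypothetical helper)."""
    neighbours = defaultdict(set)
    for u, v in edges:
        # Roads are undirected, so record both directions.
        neighbours[u].add(v)
        neighbours[v].add(u)
    pairs = set()
    for middle, towns in neighbours.items():
        # Any two distinct neighbours of `middle` are joined by two edges.
        pairs.update(combinations(sorted(towns), 2))
    return sorted(pairs)

edges = [[3, 0], [0, 4], [5, 0], [2, 1], [1, 4], [2, 3], [5, 2]]
print(two_edge_pairs(edges))
# [(0, 1), (0, 2), (1, 3), (1, 5), (2, 4), (3, 4), (3, 5), (4, 5)]
```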

Here is what I have done so far:

import numpy as np
from itertools import chain, combinations

def get_incidences(roads):
    roads = np.vstack([roads, roads[:,::-1]])
    roads_sorted = roads[np.argsort(roads[:,0])]
    marker_idx = np.flatnonzero(np.diff(roads_sorted[:,0]))+1
    source = roads_sorted[np.r_[marker_idx-1,-1],0]
    target = np.split(roads_sorted[:,1], marker_idx)
    return source, target

def get_combinations_chain(target):
    #I know this could be improved with `np.fromiter`
    return np.array(list(chain(*[combinations(n,2) for n in target])))

def get_combinations_triu(target):
    def combs(t):
        x, y = np.triu_indices(len(t),1)
        return np.transpose(np.array([t[x], t[y]]))
    return np.concatenate([combs(n) for n in target])

roads = np.array([[3,0], [0,4], [5,0], [2,1], [1,4], [2,3], [5,2]])

>>> get_incidences(roads)
(array([0, 1, 2, 3, 4, 5]),
 [array([4, 3, 5]),
  array([4, 2]),
  array([1, 3, 5]),
  array([0, 2]),
  array([0, 1]),
  array([0, 2])])
>>> get_combinations_chain(get_incidences(roads)[1])
array([[4, 3], [4, 5], [3, 5], [4, 2], [1, 3], [1, 5], [3, 5], [0, 2], [0, 1], [0, 2]])
>>> get_combinations_triu(get_incidences(roads)[1])
array([[4, 3], [4, 5], [3, 5], [4, 2], [1, 3], [1, 5], [3, 5], [0, 2], [0, 1], [0, 2]])

The last two give the expected output, but they require a list comprehension. Is it possible to vectorize this calculation:

np.concatenate([combs(n) for n in target])

Update: I ended up with a possible way of vectorizing this, but I needed to reorganize the input data (the output of get_incidences):

INPUT:
target: [array([4, 3, 5]), array([4, 2]), array([1, 3, 5]), array([0, 2]), array([0, 1]), array([0, 2])]
stream: [4 3 5 4 2 1 3 5 0 2 0 1 0 2]
lengths: [3 2 3 2 2 2]
OUTPUT:
array([[3, 4], [4, 5], [3, 5], [2, 4], [1, 3], [1, 5], [3, 5], [0, 2], [0, 1], [0, 2]])
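
The flattened layout helps because, inside one group, every ordered pair of neighbours can be produced by tiling and repeating the same array. Keeping only the pairs whose first value is smaller than the second then leaves each combination exactly once. Here is a small sketch for the first group only; the variable names are illustrative, and it assumes the values within a group are distinct.

```python
import numpy as np

group = np.array([4, 3, 5])            # neighbours of town 0
n = len(group)

first = np.tile(group, n)              # 4 3 5 4 3 5 4 3 5
second = np.repeat(group, n)           # 4 4 4 3 3 3 5 5 5

# Each unordered pair appears twice among the n*n ordered pairs;
# the mask keeps the copy where first < second and drops (x, x).
pairs = np.column_stack([first, second])[first < second]
print(pairs)
# [[3 4]
#  [4 5]
#  [3 5]]
```

These are the first three rows of the OUTPUT above. Applying the same idea to all groups at once is what requires the stream and lengths arrays.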

The vectorized version also appears to be faster than straightforward concatenation of all the combinations:

def get_incidences(roads):
    roads = np.vstack([roads, roads[:,::-1]])
    roads_sorted = roads[np.argsort(roads[:,0])]
    marker_idx = np.flatnonzero(np.diff(roads_sorted[:,0]))+1
    lengths = np.diff(marker_idx, prepend=0, append=len(roads_sorted))
    stream = roads_sorted[:,1]
    target = np.split(stream, marker_idx)
    return target, stream, lengths

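def get_combinations_vectorized(data):
    target, stream, lengths = data
    # np.repeat(target, lengths) needs a ragged array, which recent NumPy versions
    # reject, so tile each group through index arithmetic on the flat stream instead
    sq = lengths * lengths
    pos = np.arange(sq.sum()) - np.repeat(np.cumsum(sq) - sq, sq)
    idx1 = stream[np.repeat(np.cumsum(lengths) - lengths, sq) + pos % np.repeat(lengths, sq)]
    idx2 = np.repeat(stream, np.repeat(lengths, lengths))
    return np.array([idx1, idx2]).T[idx1 < idx2]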

def get_combinations_triu(data):
    target, stream, lengths = data
    def combs(t):
        x, y = np.triu_indices(len(t),1)
        return np.transpose(np.array([t[x], t[y]]))
    return np.concatenate([combs(n) for n in target])

def get_combinations_chain(data):
    target, stream, lengths = data
    return np.array(list(chain(*[combinations(n,2) for n in target])))

def get_combinations_scott(data):
    target, stream, lengths = data
    return np.array([x for i in target for x in combinations(i,2)])

def get_combinations_index(data):
    target, stream, lengths = data
    index = np.fromiter(chain.from_iterable(chain.from_iterable(combinations(n,2) for n in target)), 
                        dtype=int, count=np.sum(lengths*(lengths-1)))
    return index.reshape(-1,2)

roads = np.array([[64, 53], [94, 90], [24, 60], [45, 44], [83, 17], [10, 88], [14, 6], [56, 93], [98, 93], [86, 77], [12, 85], [58, 2], [19, 80], [48, 26], [11, 51], [16, 83], [45, 96], [35, 54], [47, 23], [81, 57], [52, 34], [88, 11], [18, 4], [35, 90], [41, 45], [2, 7], [58, 68], [58, 11], [46, 38], [32, 93], [44, 41], [26, 39], [20, 58], [44, 4], [8, 96], [74, 71], [34, 35], [91, 72], [28, 58], [53, 73], [66, 5], [84, 97], [24, 29], [43, 63], [96, 63], [20, 57], [1, 74], [4, 89], [10, 89], [98, 22]])
data = get_incidences(roads)

%timeit get_combinations_vectorized(data)
%timeit get_combinations_chain(data)
%timeit get_combinations_triu(data)
%timeit get_combinations_scott(data)
%timeit get_combinations_index(data)

92 μs ± 18.3 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
123 μs ± 3.67 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
1.8 ms ± 9.44 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
126 μs ± 2.45 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
140 μs ± 4.48 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

However, the result depends heavily on the data. Here are the timings, in the same order, for roads = np.array(list(combinations(range(100),2))):

44.2 ms ± 4.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
277 ms ± 8.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
21.2 ms ± 1.84 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
369 ms ± 17.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
43.2 ms ± 911 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
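
For dense inputs such as the complete graph above, a different technique may be worth trying: squaring the adjacency matrix. This is not the approach used in the code above. It is a sketch that assumes node labels are small non-negative integers, that there are no self-loops, and that duplicate pairs are not needed. Entry (i, j) of adj @ adj counts the routes i-m-j, so the nonzero entries above the diagonal are the wanted pairs. The name `pairs_via_adjacency` is made up for illustration.

```python
import numpy as np

def pairs_via_adjacency(roads):
    """Unique pairs (i < j) joined by a two-edge route (hypothetical helper)."""
    n = roads.max() + 1
    adj = np.zeros((n, n), dtype=np.int64)
    adj[roads[:, 0], roads[:, 1]] = 1
    adj[roads[:, 1], roads[:, 0]] = 1   # undirected: mirror every edge
    two_step = adj @ adj                # two_step[i, j] = number of routes i-m-j
    # Keep i < j only, which also drops the diagonal (routes i-m-i).
    return np.argwhere(np.triu(two_step, k=1) > 0)

roads = np.array([[3, 0], [0, 4], [5, 0], [2, 1], [1, 4], [2, 3], [5, 2]])
print(pairs_via_adjacency(roads).tolist())
# [[0, 1], [0, 2], [1, 3], [1, 5], [2, 4], [3, 4], [3, 5], [4, 5]]
```

The matrix needs memory quadratic in the number of nodes, so this trade-off probably only pays off when the graph is dense or the node count is modest.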

1 Reply

You can use the networkx library:

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from itertools import combinations

a = np.array([[3,0], [0,4], [5,0], [2,1], [1,4], [2,3], [5,2]])

G = nx.Graph()

G.add_edges_from(a)

# Draw the network
nx.draw_networkx(G)

# Create pairs of all nodes in network
c = combinations(G.nodes, 2)

# For each pair, take the first simple path found with at most two edges
# (the [0] raises IndexError if a pair is more than two edges apart)
routes = [list(nx.all_simple_paths(G, i, j, cutoff=2))[0] for i, j in c]

# Select only routes with three nodes (two edges), then show the first and last node
paths_2_edges = [(i[0], i[-1]) for i in routes if len(i) == 3]
print(paths_2_edges)

Output:

[(3, 4), (3, 5), (3, 1), (0, 2), (0, 1), (4, 5), (4, 2), (5, 1)]

Per the comments

Regarding vectorizing this statement: np.concatenate([combs(n) for n in target])

With t = get_incidences(roads)[1] (the list of neighbour arrays returned by the first version of get_incidences):

s2 = get_combinations_triu(t)

Output s2:

array([[4, 3],
       [4, 5],
       [3, 5],
       [4, 2],
       [1, 3],
       [1, 5],
       [3, 5],
       [0, 2],
       [0, 1],
       [0, 2]])

%timeit get_combinations_triu(t)

96.9 μs ± 3.44 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


Then

s1 = np.array([x for i in t for x in combinations(i,2)])

Output s1:

array([[4, 3],
       [4, 5],
       [3, 5],
       [4, 2],
       [1, 3],
       [1, 5],
       [3, 5],
       [0, 2],
       [0, 1],
       [0, 2]])

And, (s1 == s2).all()

True

Timeit:

%timeit np.array([x for i in t for x in list(combinations(i,2))])

14.7 μs ± 577 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

