Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
555 views
in Technique[技术] by (71.8m points)

R expand.grid() function in Python

Is there a Python function similar to the expand.grid() function in R ? Thanks in advance.

(EDIT) Below are the description of this R function and an example.

Create a Data Frame from All Combinations of Factors

Description:

     Create a data frame from all combinations of the supplied vectors
     or factors.  

> x <- 1:3
> y <- 1:3
> expand.grid(x,y)
  Var1 Var2
1    1    1
2    2    1
3    3    1
4    1    2
5    2    2
6    3    2
7    1    3
8    2    3
9    3    3

(EDIT2) Below is an example with the rpy package. I would like to get the same output object but without using R :

>>> from rpy import *
>>> a = [1,2,3]
>>> b = [5,7,9]
>>> r.assign("a",a)
[1, 2, 3]
>>> r.assign("b",b)
[5, 7, 9]
>>> r("expand.grid(a,b)")
{'Var1': [1, 2, 3, 1, 2, 3, 1, 2, 3], 'Var2': [5, 5, 5, 7, 7, 7, 9, 9, 9]}

EDIT 02/09/2012: I'm really lost with Python. Lev Levitsky's code given in his answer does not work for me:

>>> a = [1,2,3]
>>> b = [5,7,9]
>>> expandgrid(a, b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in expandgrid
NameError: global name 'itertools' is not defined

However the itertools module seems to be installed (typing from itertools import * does not return any error message)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Just use list comprehensions:

>>> [(x, y) for x in range(5) for y in range(5)]

[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4)]

convert to numpy array if desired:

>>> import numpy as np
>>> x = np.array([(x, y) for x in range(5) for y in range(5)])
>>> x.shape
(25, 2)

I have tested for up to 10000 x 10000 and performance of python is comparable to that of expand.grid in R. Using a tuple (x, y) is about 40% faster than using a list [x, y] in the comprehension.

OR...

Around 3x faster with np.meshgrid and much less memory intensive.

%timeit np.array(np.meshgrid(range(10000), range(10000))).reshape(2, 100000000).T
1 loops, best of 3: 736 ms per loop

in R:

> system.time(expand.grid(1:10000, 1:10000))
   user  system elapsed 
  1.991   0.416   2.424 

Keep in mind that R has 1-based arrays whereas Python is 0-based.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...