Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


memory - maximum size of a matrix in R

I am using igraph to do some network analysis. As part of that, I have to create a matrix with 2 columns and as many rows as there are links. I have a large network (several million links) and creating this matrix didn't work after 3 hours of run time (no errors, just no result, and it shows "not responding").

What is the maximum size of such a character matrix? How long does it take to run?

I am running 64-bit R 2.14.1 on a Windows 7 machine with 4 GB of memory and a 2.67 GHz processor.

thanks

ADDED: Thanks for the quick responses. They convinced me that the size of the matrix was not the problem; the cause turned out to be a mistake in which columns of another matrix I was using to create that matrix.


1 Reply


The theoretical limit on the length of a vector in R 2.14.1 is 2147483647 (2^31 - 1) elements. A matrix is stored as a single vector, so with 2 columns that allows about 1 billion rows.

But that amount of data does not fit in 4 GB of memory, and certainly not as strings in a character vector. Each string takes at least 96 bytes (object.size('a') == 96), and each element of your matrix is a pointer (8 bytes) to such a string (there is only one instance of each unique string, though).
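To put rough numbers on this, here is a small sketch of the arithmetic. It assumes 64-bit R with 8-byte pointers; `pointer_gib` is a hypothetical helper name, and 5 million links is just a stand-in for "several million". It only counts the pointers, so the strings themselves come on top.

```r
# Back-of-the-envelope memory estimate for a 2-column character matrix.
# Assumption: 64-bit R, where each matrix cell is an 8-byte pointer to a string.
max_len  <- .Machine$integer.max   # 2147483647, the vector length limit
max_rows <- max_len %/% 2          # rows available to a 2-column matrix

# Hypothetical helper: lower bound on the pointer storage alone, in GiB.
pointer_gib <- function(n_rows, n_cols = 2, bytes_per_pointer = 8) {
  n_rows * n_cols * bytes_per_pointer / 2^30
}

pointer_gib(max_rows)  # roughly 16 GiB at the theoretical limit
pointer_gib(5e6)       # well under 0.1 GiB for 5 million links (assumed size)

# Measure a small matrix directly. Every string here is unique, which is
# the worst case; the exact figure depends on the R version.
m <- matrix(as.character(seq_len(2e5)), ncol = 2)
print(object.size(m), units = "Mb")
```

Under these assumptions, a few million links is far from the theoretical limit, which fits the point that the result matrix alone is unlikely to be the problem.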

So what typically happens is that the machine runs out of physical memory, falls back on virtual memory, and starts swapping. Heavy swapping typically kills all hope of ever finishing in this century, especially on Windows.

But if you are using a package (igraph?) and you're asking it to produce the matrix, it probably does a lot of internal work and creates lots of auxiliary objects. So even if you're nowhere near the memory limit for the single result matrix, the algorithm used to produce it can run out of memory. It can also be non-linear (quadratic or worse) in time, which would again kill all hope of ever finishing in this century...

A good way to investigate could be to time it on a small graph (e.g. using system.time), and then again after doubling the graph size a couple of times. Then you can see whether the time grows linearly or quadratically, and you can estimate how long it will take to complete your big graph. If the prediction says a week, well, then you know ;-)
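A minimal sketch of that timing experiment, in base R only. `build_edge_matrix` is a hypothetical stand-in for whatever call is actually slow (it is not an igraph function), and the sizes and the 5 million target are assumptions; swap in the real step and graph sizes you care about.

```r
# Hypothetical stand-in for the slow step: build a 2-column character
# matrix with n rows. Replace the body with the real call being investigated.
build_edge_matrix <- function(n) {
  from <- sample.int(n, n, replace = TRUE)
  to   <- sample.int(n, n, replace = TRUE)
  cbind(as.character(from), as.character(to))
}

# Double the problem size a few times and record the elapsed time of each run.
sizes <- 1e5 * 2^(0:4)   # 100k, 200k, 400k, 800k, 1.6M links
elapsed <- sapply(sizes, function(n) {
  system.time(build_edge_matrix(n))[["elapsed"]]
})

# Ratio of each timing to the previous one:
# about 2 suggests linear growth, about 4 suggests quadratic growth.
ratios <- elapsed[-1] / elapsed[-length(elapsed)]
print(data.frame(links = sizes, seconds = elapsed, ratio = c(NA, ratios)))

# Extrapolate from the largest run with an assumed growth exponent k
# (k = 1 for linear, k = 2 for quadratic).
predict_seconds <- function(n_target, k) {
  tail(elapsed, 1) * (n_target / tail(sizes, 1))^k
}
predict_seconds(5e6, k = 1)
predict_seconds(5e6, k = 2)
```

Very short timings are noisy, so the ratios only mean something once each run takes a noticeable fraction of a second.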

