Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
2.8k views
in Technique[技术] by (71.8m points)

list - Memory leakage issue in python tuples

The identities list contains a big array of approximately 57000 images. Now, I am creating a negative list with the help of itertools.product. This whole list store in RAM which is very costly and my system will be hanged after 4 minutes. How I can optimize the below code and avoid saving in RAM?

for i in range(0, len(idendities) - 1):
    for j in range(i + 1, len(idendities)):
        cross_product = itertools.product(samples_list[i], samples_list[j])
        cross_product = list(cross_product)

        for cross_sample in cross_product:
            negative = []
            negative.append(cross_sample[0])
            negative.append(cross_sample[1])
            negatives.append(negative)
            print(len(negatives))

negatives = pd.DataFrame(negatives, columns=["file_x", "file_y"])
negatives["decision"] = "No"

negatives = negatives.sample(positives.shape[0])

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The product from itertools is a generator so naturally it dose not store the whole list in memory, but in the next line, cross_product = list(cross_product) you convert it to list object which store the whole data in your memory.

The idea of a generator is that you don't do all the calculation at the same time, as you do with your call list(itertools.product(samples_list[i], samples_list[j])). So what you want to do is generate the results one by one:

Try something like this:

for i in range(len(idendities) - 1):
    for j in range(i + 1, len(idendities)):
        for cross_sample in itertools.product(samples_list[i], samples_list[j]):
            # do something ...

Some credits to @9mat, @cybot and this question How to get Cartesian product in Python using a generator?


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...