
python - Memory Issues Using Keras Convolutional Network

I am very new to ML with big data. I have played with the generic Keras convolutional examples for dog/cat classification before; however, when I apply a similar approach to my own set of images, I run into memory issues.

My dataset consists of very long images that are 10048 x 1687 pixels in size. To work around the memory issues, I am using a batch size of 1 and feeding the model one image at a time.

The model has two convolutional layers, each followed by max pooling, which together leave the flattened layer with roughly 290,000 inputs right before the fully-connected layer.
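For reference, a minimal sketch of a model shaped like the one described above (two Conv2D + MaxPooling2D blocks, then Flatten and a Dense output), written against the Keras 2 Sequential API. The filter counts, kernel sizes and pool sizes are placeholders I chose for illustration, not values from the question, and the exact flattened size depends on them:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
# input_shape assumes 1687-pixel-high, 10048-pixel-wide RGB images
model.add(Conv2D(8, (3, 3), activation='relu', input_shape=(1687, 10048, 3)))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(4, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Flatten())                       # the post reports ~290,000 inputs here
model.add(Dense(1, activation='sigmoid'))  # binary output, as in the dog/cat example
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()                            # prints the exact flattened size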

Immediately after running, however, memory usage chokes at its limit (8 GB).
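As a rough sanity check (assuming 3-channel images stored as float32, which is an assumption on my part, not something stated in the question), the input tensor alone is already large, and the first convolution's activation maps multiply it by the number of filters:

# Back-of-the-envelope memory footprint (assumes RGB, float32)
width, height, channels = 10048, 1687, 3
bytes_per_value = 4                          # float32
image_mb = width * height * channels * bytes_per_value / 1024**2
print(image_mb)                              # ~194 MB for one input image
# A conv layer with, say, 32 filters and 'same' padding keeps the spatial size,
# so its activations take roughly 32/3 times that again, before gradients.
print(image_mb * 32 / 3)                     # ~2 GB of activations per image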

So my questions are the following:

1) What is the best approach to processing computations of this size in Python locally (no cloud utilization)? Are there additional Python libraries that I need to use?



1 Reply


Check out what yield does in Python and the idea of generators. You do not need to load all of your data at the beginning. You should make your batch_size just small enough that you do not get memory errors. Your generator can look like this:

def generator(fileobj, labels, batch_size, memory_one_pic=1024):
    # memory_one_pic is the number of bytes one picture occupies in the file
    start = 0
    end = start + batch_size
    while True:
        X_batch = fileobj.read(memory_one_pic * batch_size)
        y_batch = labels[start:end]
        start += batch_size
        end += batch_size
        if not X_batch:
            break
        if start >= amount_of_datasets:  # amount_of_datasets: total number of samples
            # wrap around so the generator can keep serving batches each epoch
            start = 0
            end = batch_size
        yield (X_batch, y_batch)

...later, once you have your architecture ready...

train_generator = generator(open('traindata.csv', 'rb'), labels, batch_size)
train_steps = amount_of_datasets // batch_size + 1

model.fit_generator(generator=train_generator,
                    steps_per_epoch=train_steps,
                    epochs=epochs)

You should also read about batch normalization, which generally helps the network learn faster and reach better accuracy.
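In Keras, batch normalization is available as a layer. A minimal sketch of how it might be slotted in after a convolution, before the non-linearity (the filter count and input shape here are placeholders, not values from the question):

from keras.models import Sequential
from keras.layers import Conv2D, BatchNormalization, Activation, MaxPooling2D

model = Sequential()
model.add(Conv2D(16, (3, 3), input_shape=(1687, 10048, 3)))
model.add(BatchNormalization())        # normalize activations before the activation function
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))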

