Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Keras flowFromDirectory get file names as they are being generated

Is it possible to get the file names that were loaded using flow_from_directory? I have:

from keras.preprocessing.image import ImageDataGenerator

# `path` and `batch_size` are defined elsewhere in the script.
datagen = ImageDataGenerator(
    rotation_range=3,
    # featurewise_std_normalization=True,
    fill_mode='nearest',
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
)

train_generator = datagen.flow_from_directory(
    path + '/train',
    target_size=(224, 224),
    batch_size=batch_size,
)

I have a custom generator for my multi-output model, like this:

import numpy as np

a = np.arange(8).reshape(2, 4)  # placeholder for the bounding-box output
# print(a)

print(train_generator.filenames)

def generate():
    while True:
        x, y = train_generator.next()
        yield [x], [a, y]

Note that at the moment I am filling a with dummy numbers, but for real training I wish to load a JSON file that contains the bounding box coordinates for my images. For that I need the file names of the images returned by each call to train_generator.next(). Once I have them, I can load the file, parse the JSON and pass the result instead of a. The order of the file names must also match the order of the images in x.


1 Reply


Yes, it is possible, at least with version 2.0.4 of Keras (I don't know about earlier versions).

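The iterator returned by ImageDataGenerator().flow_from_directory(...) has a filenames attribute, which is a list of all the files it found, and a batch_index attribute. The list matches the order in which the generator yields images only when shuffle=False is passed to flow_from_directory; with the default shuffle=True the images are drawn in a random order, so slicing filenames by batch position no longer lines up. With shuffling turned off you can do it like this: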

datagen = ImageDataGenerator()
# Include shuffle=False among the arguments so batches follow the order of gen.filenames.
gen = datagen.flow_from_directory(...)

On every iteration of the generator you can then get the corresponding filenames like this:

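while True:  # the generator never stops on its own; break when you are done
    # Read the offset first: fetching a batch advances batch_index,
    # and it wraps back to 0 after the last batch of an epoch.
    idx = gen.batch_index * gen.batch_size
    x, y = next(gen)
    print(gen.filenames[idx : idx + len(x)])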

This will give you the filenames of the images in the current batch.
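
To connect this back to the question, here is a minimal sketch of a wrapper that yields each batch together with its file names and then looks up one bounding box per file. It reads the offset before fetching the batch, so the last batch of an epoch (after which batch_index returns to 0) is handled too. Everything here rests on assumptions: the iterator was created with shuffle=False and is consumed from a single thread, and the names FakeDirectoryIterator, batches_with_filenames, load_boxes and boxes_by_file are hypothetical. The fake class only mimics the index bookkeeping so the example runs without Keras or image files; with a real iterator you would pass train_generator instead.

```python
import json


class FakeDirectoryIterator:
    """Hypothetical stand-in that mimics the unshuffled batch bookkeeping
    of a Keras directory iterator, so the sketch runs without images."""

    def __init__(self, filenames, batch_size):
        self.filenames = list(filenames)
        self.batch_size = batch_size
        self.batch_index = 0

    def __iter__(self):
        return self

    def __next__(self):
        n = len(self.filenames)
        start = (self.batch_index * self.batch_size) % n
        stop = min(start + self.batch_size, n)
        # Advance, wrapping to 0 after the last (possibly shorter) batch.
        self.batch_index = self.batch_index + 1 if stop < n else 0
        x = list(range(start, stop))  # placeholder "images"
        y = [0] * (stop - start)      # placeholder labels
        return x, y


def batches_with_filenames(gen):
    """Yield (x, y, names) where names[i] is the file behind x[i]."""
    while True:
        # Read the offset before fetching: next() advances batch_index.
        start = gen.batch_index * gen.batch_size
        x, y = next(gen)
        yield x, y, gen.filenames[start:start + len(x)]


def load_boxes(names, boxes_by_file):
    """Look up one bounding box per file name, in batch order."""
    return [boxes_by_file[name] for name in names]


if __name__ == "__main__":
    # Hypothetical annotations, assumed to be keyed by the same
    # relative paths that appear in gen.filenames.
    boxes_by_file = json.loads(
        '{"cats/a.jpg": [0, 0, 10, 10],'
        ' "cats/b.jpg": [5, 5, 20, 20],'
        ' "dogs/c.jpg": [1, 2, 3, 4]}'
    )
    gen = FakeDirectoryIterator(sorted(boxes_by_file), batch_size=2)
    batches = batches_with_filenames(gen)
    for _ in range(2):  # one full pass over the three files
        x, y, names = next(batches)
        print(names, load_boxes(names, boxes_by_file))
```

In the question's generate() function, the list returned by load_boxes would take the place of a, converted to an array of whatever shape the model's bounding-box output expects.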

