Dealing with a “ResourceExhaustedError” in Keras

I’m using Keras (version 2.2.0) with TensorFlow (version 1.8.0) to train artificial neural networks (ANNs) with many input neurons (204800, to be precise) and a rather small dataset.

I recently started to get the following error message every once in a while:

> ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[204800,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

> [[Node: dense_58/kernel/Assign = Assign[T=DT_FLOAT, use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](dense_58/kernel, dense_58/random_uniform)]]

> Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

If you run into the same issue, there are two things to try:

1.) Clear the Keras backend session between training runs by adding the following to your code:

from keras import backend as K
[…]
K.clear_session()
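To make the placement concrete, here is a minimal sketch of calling `K.clear_session()` after each model in a loop. The `build_model` factory and `train_many` helper are illustrative names, not Keras API, and the `try`/`except` stub only exists so the sketch stays self-contained where Keras is not installed:

```python
try:
    from keras import backend as K
except ImportError:
    # Stand-in stub so the sketch runs without Keras installed;
    # with Keras present, the real backend is used instead.
    class K:
        @staticmethod
        def clear_session():
            pass

def train_many(model_factories):
    """Train several models in sequence, resetting the backend
    session between them so old graphs don't pile up on the GPU."""
    trained = 0
    for build_model in model_factories:
        model = build_model()   # construct a fresh model
        # ... model.fit(...) would go here ...
        trained += 1
        K.clear_session()       # drop the old graph and free its memory
    return trained
```

The important detail is that `clear_session()` runs once per iteration, after you are done with the current model; otherwise each new model adds its layers to the same ever-growing graph.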

2.) Save each model to disk (resulting in roughly 240 MB per model in my case) and then delete it, freeing memory for the next model in each iteration. You can delete a model by telling Python to drop the reference and run the garbage collector:

import gc
[…]
del model
gc.collect()
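As a quick illustration of why this frees memory, the snippet below uses a plain stand-in class (not a real Keras model) and a `weakref` to observe that the object is actually reclaimed once the last reference is deleted and a collection pass runs:

```python
import gc
import weakref

class Model:
    """Stand-in for a large model object (illustration only)."""
    pass

model = Model()
ref = weakref.ref(model)  # track the object without keeping it alive

del model      # drop the last strong reference, as in the tip above
gc.collect()   # force a garbage-collection pass

print(ref() is None)  # → True: the object has been reclaimed
```

In practice you would call `model.save(...)` first, so the trained weights survive on disk while the in-memory copy is released.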