Anyone who has worked with Python for a few years has probably run into this problem at least once: Python processes get killed by the operating system's Out Of Memory (OOM) Killer. The OOM Killer does this to reclaim memory for the OS when there is not enough RAM left for critical OS operations. When the OOM Killer targets a Python process, the cause is usually not Python itself but how the programmer is using Python to get things done.

Python is a fully object-oriented, dynamically typed language. Its internal memory manager does a lot of optimization to make programs run faster while using memory effectively. Python gives the coder a lot of flexibility, and all of that flexibility has a runtime cost, which Python tries to reduce as much as possible. Programmers who know this will not blame Python when its memory management is less efficient than that of a statically typed language.

A sneak peek into Python's memory manager

Python's memory manager keeps its own pool of memory (a private heap) for Python objects. This pool is claimed from the OS through raw malloc calls at the low level. The memory manager abstracts the malloc call and provides its own specialized allocation routines for different types of Python objects. Python does not immediately release memory back to the OS when an object goes out of scope in a Python program. Instead, it keeps this memory for future reuse.

Python does return memory to the OS when certain criteria are met. Please read this article for more information about Python's memory allocation policies.

We won't come across this problem if the system has enough RAM and the program's peak memory usage stays below the RAM size. We usually start worrying about Python's memory problems only once the OOM Killer starts terminating the process. Immediate steps to solve this problem are:

  1. Optimize the Python program by using generators wherever possible.
  2. Learn good Python programming practices. Newcomers usually run into this problem because of bad coding practices.
  3. Know how much memory your application needs at peak and add that much RAM, or increase the swap space. Relying on swap may not be a good idea, because performance drops if the program is constantly swapping data between RAM and disk.
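The first and third points can be tried out together. The snippet below is only a minimal sketch: it uses the standard library's tracemalloc module to compare the peak Python-level allocation of a list-based function with a generator-based one. The function names and the input size are made up for illustration, and tracemalloc only counts memory allocated by Python itself, so treat the numbers as a rough guide rather than the exact figure the OS sees.

```python
import tracemalloc


def squares_list(n):
    """Build the full list in memory before it is consumed."""
    return [i * i for i in range(n)]


def squares_gen(n):
    """Yield one value at a time; nothing is kept around."""
    for i in range(n):
        yield i * i


def peak_bytes(func, n):
    """Return (sum of values, peak traced bytes) for sum(func(n))."""
    tracemalloc.start()
    try:
        result = sum(func(n))
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


# 200_000 is an arbitrary size chosen for this illustration.
list_total, list_peak = peak_bytes(squares_list, 200_000)
gen_total, gen_peak = peak_bytes(squares_gen, 200_000)
print(f"list peak: {list_peak} bytes, generator peak: {gen_peak} bytes")
```

Both versions compute the same total, but the list version has to hold every element at once, which is exactly the kind of peak that gets a long-running process into trouble.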

I have encountered this problem while working with long-running Python programs whose peak memory usage is much higher than their average memory usage.

With Celery

Celery is a popular distributed task queue for Python. It is used for asynchronous task execution and other, more complex scenarios. Those who use Celery for simple tasks may never have faced the memory issue, but in our case we use Celery very heavily.

RabbitMQ is the AMQP message queue that acts as the backbone for Celery.

In our case we called Celery tasks with kwargs that held very large objects with no defined structure. Celery passes this data through RabbitMQ's queue to Celery's job runner, which actually runs the task with the given kwargs. Since the kwargs were very large and their structure varied each time, Celery's main process kept increasing its memory footprint. Obviously, with enough RAM we could have handled the peak memory requirement, but that much RAM wasn't affordable for us. We had to find a way to prevent Celery's main process from acquiring so much memory.

The solution to this problem was to use kwargs (or messages) that have a fixed structure and are small in size. Pass only ids or indexes through the message so that Celery's main process doesn't acquire more memory. Celery also has an option to kill its worker processes and respawn them after some time. This can be used so that the worker processes also release all of their memory after a while and start fresh.
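The "pass only ids" idea can be sketched without Celery at all. In the snippet below, PAYLOAD_STORE is a hypothetical stand-in for whatever database or object store the producer and the workers share, and build_message and process_task are made-up names standing in for the code that calls a task and the task body. It is an illustration of the message shape under those assumptions, not our actual code.

```python
import json

# Hypothetical stand-in for a database or object store shared by the
# producer and the workers; a real system would not use a dict.
PAYLOAD_STORE = {}


def save_payload(payload_id, payload):
    """Producer side: persist the large object outside the message."""
    PAYLOAD_STORE[payload_id] = payload


def build_message(payload_id):
    """Build a message with a fixed, small structure: only the id."""
    return json.dumps({"payload_id": payload_id})


def process_task(message):
    """Worker side: look the payload up by id, then do the work."""
    payload_id = json.loads(message)["payload_id"]
    payload = PAYLOAD_STORE[payload_id]
    return len(payload["rows"])


# A made-up large payload, similar in spirit to oversized kwargs.
big_payload = {"rows": [{"n": i, "text": "x" * 50} for i in range(10_000)]}
save_payload(42, big_payload)

inline_message = json.dumps({"payload": big_payload})  # large kwargs style
id_message = build_message(42)                         # id-only style
row_count = process_task(id_message)
print(len(inline_message), len(id_message), row_count)
```

For the respawn option, Celery has a worker_max_tasks_per_child setting (also available as the --max-tasks-per-child worker flag) that replaces a worker process after it has run a given number of tasks. The right value depends on the workload, so it is worth experimenting.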

RabbitMQ also recommends using smaller messages to get better performance out of the system. This doesn't mean RabbitMQ can't handle large messages. Smaller messages simply help reduce the I/O overhead on RabbitMQ.

I hope this brief explanation will help you tackle similar problems. If you have any questions or suggestions, please leave them in the comment section below.

This post is cross-posted from my blog with minor corrections.