Sunday, June 24, 2007

Linux Memory Blues

You might often have noticed that the memory utilization of your Linux box
is nearly full, and you get irritated and tempted to reboot the system. 'top'
and 'ps' are of no help, and you see no process eating up the memory. The
result is that you spend hours cursing the developer for not testing for
memory leaks when actually there are none.

The Linux kernel doesn't like to waste RAM: it caches disk I/O in otherwise
unused memory, which in theory leads to much better efficiency.

Hmm, the efficiency can very well be demonstrated.. for eg:
type this command..

for i in 1 2; do free -o; time grep -r foo /usr/bin >/dev/null 2>/dev/null; done

What the command does is search for "foo" in the /usr/bin directory, twice,
resulting in heavy disk I/O. Now let's check the output..

             total     used     free   shared  buffers   cached
Mem:       1019400   255496   763904        0    15852   131700
Swap:      1044184        0  1044184

real    2m2.018s
user    0m20.930s
sys     0m8.110s

             total     used     free   shared  buffers   cached
Mem:       1019400   476712   542688        0    19292   326480
Swap:      1044184        0  1044184

real    0m23.644s
user    0m21.000s
sys     0m0.560s

What does this output signify?
Look at the first iteration .. you can see that the system has 1 GB of RAM,
of which 763 MB is free and 131 MB is used by the cache, before the grep
command runs. The grep command executes and, not surprisingly, it takes a
little more than 2 minutes .. quite a good job, considering the amount of
disk I/O it has to do.
Now look at the second iteration .. you may have noticed the increase in
cached memory .. about a 200 MB rise in cache and a matching 200 MB decrease
in free memory. Now look at the grep timing .. WOW, only 23 secs!! That's
quite a reduction, because the second run is served mostly from the cache
instead of the disk .. kudos to the developers :) .. it seems to be the
right way to go about it..

But wait... is it the right time to rejoice?? .. hmm, will there be any
problems with this?? After all, I have been facing memory issues for the
past few months even after increasing my RAM. So what may be the cause? If
cached memory is a boon, then why does my machine slow down dramatically?
Maybe the caching is causing thrashing on my machine.. ok, let's not jump
to conclusions .. a quick glance through the internet revealed to me:

"When an application needs memory and all the RAM is fully occupied, the
kernel has two ways to free some memory at its disposal: it can either
reduce the disk cache in the RAM by eliminating the oldest data or it may
swap some less used portions (pages) of programs out to the swap partition
on disk. It is not easy to predict which method would be more efficient.
The kernel makes a choice by roughly guessing the effectiveness of the two
methods at a given instant, based on the recent history of activity."

"Based on the recent history of activity"?? That seems pretty vague .. the
worst cases cannot be predicted..

Ok, let's check what happens when I use up all the memory available for
caching .. i.e. I am going to use my entire RAM, leaving no free space.
Let's scan a huge file. Scanning a 4.4 GB file on my machine ate up the
entire RAM in no time..

                     total   used   free  shared  buffers  cached
Mem:                   995    986      8       0       23     831
-/+ buffers/cache:     132    863
Swap:                 1019      2   1017
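The "-/+ buffers/cache" row is the key one here: it subtracts reclaimable buffers and cache from "used", showing that applications are really using only about 132 MB. You can derive roughly the same figure yourself from /proc/meminfo; a minimal sketch:

```shell
# Sum MemFree + Buffers + Cached from /proc/meminfo (values are in kB);
# this approximates the memory that is free or reclaimable for applications.
awk '/^(MemFree|Buffers|Cached):/ { sum += $2 }
     END { printf "%d kB free or reclaimable\n", sum }' /proc/meminfo
```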

vmstat 1 2
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in   cs  us sy wa id
 4  1   2232   8844  23780  850824    0    1   796    29  274  631  11  7 15 66
 1  1   2232   8528  23784  851120    0    0  1076     0  249  342   7  5 20 68

Looks good, as the system still isn't swapping.. but performance has
degraded drastically: I am unable even to move my mouse freely :)
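As a side note: on kernels 2.6.16 and later, you can verify that this cached memory really is reclaimable by asking the kernel to drop its clean caches via /proc/sys/vm/drop_caches. A minimal sketch (writing to that file needs root, and only clean pages are discarded, so it is safe):

```shell
# Flush dirty pages to disk first, so the caches are clean
sync
# Writing to drop_caches needs root: 1 = page cache,
# 2 = dentries and inodes, 3 = both
if [ -w /proc/sys/vm/drop_caches ]; then
    echo 3 > /proc/sys/vm/drop_caches
fi
# The cached figure should now be far smaller
grep '^Cached:' /proc/meminfo
```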

Now what may be the reason .. is the system so busy moving pages in and out
of the cache that the processor is kept occupied? Let's rerun our first
example now that the cache is heavily used..

for i in 1 2; do free -o; time grep -r foo /usr/bin >/dev/null 2>/dev/null; done

             total     used     free   shared  buffers   cached
Mem:       1019400  1010088     9312        0    24136   850864
Swap:      1044184     2232  1041952

real    3m2.794s
user    0m23.110s
sys     0m12.930s

             total     used     free   shared  buffers   cached
Mem:       1019400  1008400    11000        0    24516   849568
Swap:      1044184     3464  1040720

real    2m52.996s
user    0m21.980s
sys     0m8.860s

Hmm, not much of a difference between the two iterations.. but the
significant thing is the increase in system time. That is CPU time spent by
the kernel managing the large cache it had built up.

Ok, till now we have been dealing with read operations, where cached pages
can simply be discarded when memory is needed.. now what happens if there
are also heavy write operations? Dirty pages first have to be written back
to disk before their memory can be reclaimed.
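On 2.6 kernels, the write side of the cache is governed by a separate pair of sysctls, vm.dirty_background_ratio and vm.dirty_ratio, which set (as a percentage of RAM) the point where background writeback starts and the point where a writing process is itself blocked and forced to flush. A quick way to inspect them:

```shell
# Dirty-page thresholds, as a percentage of total RAM:
# background writeback kicks in at dirty_background_ratio ...
cat /proc/sys/vm/dirty_background_ratio
# ... and at dirty_ratio the writing process itself must block and flush
cat /proc/sys/vm/dirty_ratio
# As root, you can lower them so writes are flushed sooner and in
# smaller bursts, e.g.: sysctl -w vm.dirty_ratio=10
```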

What should the tradeoff be in allowing the kernel to cache I/O .. how much
memory should I allow for the disk cache to achieve optimal performance on
my system? Luckily, if you have a 2.6 kernel, there is a parameter called
swappiness.

Here's what a quick glance over the net revealed..
Since 2.6, there has been a way to tune how much Linux favors swapping out
to disk compared to shrinking the caches when memory gets full.

Before the 2.6 kernels, the user had no possible means to influence the
calculations and there could happen situations where the kernel often made
the wrong choice, leading to thrashing and slow performance. The addition
of swappiness in 2.6 changes this.

Swappiness takes a value between 0 and 100 to change the balance between
swapping applications and freeing cache. At 100, the kernel will always
prefer to find inactive pages and swap them out; in other cases, whether a
swapout occurs depends on how much application memory is in use and how
poorly the cache is doing at finding and releasing inactive items.

The default swappiness is 60. A value of 0 gives something close to the
old behavior where applications that wanted memory could shrink the cache
to a tiny fraction of RAM. For laptops which would prefer to let their
disk spin down, a value of 20 or less is recommended.

As a sysctl, swappiness can be set at runtime with either of the
following commands:

sysctl -w vm.swappiness=30
echo 30 >/proc/sys/vm/swappiness
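Both commands take effect immediately but are lost on reboot. On most distributions the setting can be made permanent in /etc/sysctl.conf, which is read at boot (and can be reloaded with `sysctl -p`); for example:

```
# /etc/sysctl.conf -- applied at boot; reload with "sysctl -p"
vm.swappiness = 30
```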

Personally, I have found that keeping the swappiness value at 10 gives better performance on my desktop Linux system: I have only a single IDE slot, and my CDROM and HDD are connected to the same bus, so it is better to keep disk utilization down.

More on how to achieve this with kernel 2.4 .. shortly :)