The kernel was trying to free pages from the low/normal zone, but it did so through repeated calls to a generic page-freeing function that simply kept freeing pages from high memory, destroying our cache. Apparently the low zone was exhausted.
There are other configurations for this kludge, such as changing the low/high split, but we decided to just move to x86-64 instead.
See the next installment for the new graphs. Hello, x86-64 Linux.
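As a rough sanity check on the "low zone was exhausted" reading, the arithmetic can be done on the first /proc/meminfo sample quoted in the addendum (16:02:24). This is only a sketch: the function name `zone_pressure` is made up for illustration, and it assumes, as is generally the case on 32-bit highmem kernels, that slab memory is allocated from the low zone.

```python
# Values copied from the "Tue Jun 13 16:02:24" /proc/meminfo sample, in kB.
SAMPLE = {
    "LowTotal": 854952,
    "LowFree": 9476,
    "HighTotal": 5373696,
    "HighFree": 113212,
    "Slab": 819624,
}


def zone_pressure(sample):
    """Return percentages describing how full the low and high zones are."""
    low_used = sample["LowTotal"] - sample["LowFree"]
    return {
        # Share of the low zone that is in use.
        "low_used_pct": 100.0 * low_used / sample["LowTotal"],
        # Slab size relative to the low zone (assumes slab lives in low memory).
        "slab_pct_of_low": 100.0 * sample["Slab"] / sample["LowTotal"],
        # Share of high memory that is still free.
        "high_free_pct": 100.0 * sample["HighFree"] / sample["HighTotal"],
    }


if __name__ == "__main__":
    for name, value in zone_pressure(SAMPLE).items():
        print(f"{name}: {value:.1f}%")
```

For this sample the low zone is about 99% in use, and Slab alone is about 96% of LowTotal, which is at least consistent with the low zone being the one under pressure.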
It should be noted that during one of these periods the buffers increase by about 10MB and hold steady at the increased size until the period ends, at which point they drop back down to normal. This is not visible in the graphs due to the large scale.
We cannot identify what causes this period to be entered. It happens pretty randomly; we've even observed it on a Sunday, when the server is relatively idle.

The effect this has on NFS (fewer cache hits; note the spikes):

Longer timeline memory graph contrasting normal operating conditions against the eviction hell mode (this graph is from a different day BTW):

We have spent quite some time profiling and monitoring the system and tuning various knobs in /proc/sys/vm, to no avail. I have also spent extensive time tracing the running applications to see whether they are responsible by doing something silly, like operating on 2GB worth of filesystem data (a gigantic file, say) and then throwing it out all at once, but nothing of the sort is occurring.
The kernel version is 2.6.14.3 and the machine is a Dell 2850 with 6GB RAM. The workload is a combined mail delivery and POP server for ~300,000 users, using NFSv3 for the mailstore. The application software is custom, written by me; it makes heavy use of POSIX threads and also uses a stock MySQL replica running on the local machine.
The graphs are generated from data gathered at 1 Hz, so the data is reasonably high resolution.
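For anyone who wants to gather comparable data, here is a minimal sketch of 1 Hz sampling of /proc/meminfo, plus a helper that flags a sudden rise in one field. It is not the collector used for these graphs; the function names and the 500,000 kB threshold are assumptions picked for illustration.

```python
import re
import time

# Matches lines such as "HighFree:        113212 kB".
MEMINFO_LINE = re.compile(r"^([^:\s]+):\s+(\d+)")


def parse_meminfo(text):
    """Turn /proc/meminfo text into a {field: value} dict (values in kB)."""
    fields = {}
    for line in text.splitlines():
        match = MEMINFO_LINE.match(line.strip())
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def find_jumps(samples, key="HighFree", threshold_kb=500_000):
    """Return indexes of samples where `key` rose by more than threshold_kb
    compared with the previous sample. The threshold is an arbitrary guess."""
    return [
        i
        for i in range(1, len(samples))
        if samples[i][key] - samples[i - 1][key] > threshold_kb
    ]


def sample_forever(path="/proc/meminfo", interval=1.0):
    """Yield one parsed sample per `interval` seconds (1 Hz by default)."""
    while True:
        with open(path) as handle:
            yield parse_meminfo(handle.read())
        time.sleep(interval)
```

Feeding the HighFree values from the 16:02:29 through 16:02:32 samples in the addendum into `find_jumps` flags the 16:02:31 sample, where HighFree goes from 224068 kB to 1409756 kB in one second.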
The primary purpose of arranging the system like this is to have a large buffer cache shielding our NAS from as much I/O as possible. When these cache evictions start occurring, our NFS server's average call time begins to climb, because the server is essentially losing 2GB of its RAM, or something to that effect. We have 6GB of RAM for a reason ;) Most of it is for cache; the apps don't use much, just a few hundred megs.
If anyone can tell me what causes this I would love to know.
Vito Caputo - <vito at hostway dawt com>
Hostway Linux Systems Developer
Addendum:
I've added some samples, more detailed than the graphs, capturing one of the bulk eviction events, and appended a dump of /proc/slabinfo after the /proc/meminfo samples. Jump to "Tue Jun 13 16:02:27": HighFree reaches a low there (LowFree is pretty low throughout the sampling). Now watch HighFree over the next few samples: it increases slightly for two samples, then boom, it shoots up. Nothing in the application accounts for this.

Tue Jun 13 16:02:24 CDT 2006
MemTotal: 6228648 kB
MemFree: 122688 kB
Buffers: 15456 kB
Cached: 4848380 kB
SwapCached: 0 kB
Active: 2901152 kB
Inactive: 2368616 kB
HighTotal: 5373696 kB
HighFree: 113212 kB
LowTotal: 854952 kB
LowFree: 9476 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 11612 kB
Writeback: 0 kB
Mapped: 411820 kB
Slab: 819624 kB
CommitLimit: 4166572 kB
Committed_AS: 1377112 kB
PageTables: 3688 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:25 CDT 2006
MemTotal: 6228648 kB
MemFree: 112068 kB
Buffers: 15476 kB
Cached: 4858696 kB
SwapCached: 0 kB
Active: 2906356 kB
Inactive: 2373720 kB
HighTotal: 5373696 kB
HighFree: 102796 kB
LowTotal: 854952 kB
LowFree: 9272 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 13456 kB
Writeback: 16 kB
Mapped: 411728 kB
Slab: 819840 kB
CommitLimit: 4166572 kB
Committed_AS: 1376992 kB
PageTables: 3688 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:26 CDT 2006
MemTotal: 6228648 kB
MemFree: 108740 kB
Buffers: 15476 kB
Cached: 4862096 kB
SwapCached: 0 kB
Active: 2909968 kB
Inactive: 2373392 kB
HighTotal: 5373696 kB
HighFree: 99448 kB
LowTotal: 854952 kB
LowFree: 9292 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 13964 kB
Writeback: 16 kB
Mapped: 411700 kB
Slab: 819788 kB
CommitLimit: 4166572 kB
Committed_AS: 1377000 kB
PageTables: 3688 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:27 CDT 2006
MemTotal: 6228648 kB
MemFree: 101576 kB
Buffers: 15480 kB
Cached: 4869232 kB
SwapCached: 0 kB
Active: 2914288 kB
Inactive: 2376296 kB
HighTotal: 5373696 kB
HighFree: 92380 kB
LowTotal: 854952 kB
LowFree: 9196 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 13860 kB
Writeback: 4 kB
Mapped: 411796 kB
Slab: 819956 kB
CommitLimit: 4166572 kB
Committed_AS: 1377080 kB
PageTables: 3692 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:28 CDT 2006
MemTotal: 6228648 kB
MemFree: 104940 kB
Buffers: 15480 kB
Cached: 4865696 kB
SwapCached: 0 kB
Active: 2907048 kB
Inactive: 2380048 kB
HighTotal: 5373696 kB
HighFree: 95852 kB
LowTotal: 854952 kB
LowFree: 9088 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 13060 kB
Writeback: 28 kB
Mapped: 411772 kB
Slab: 820004 kB
CommitLimit: 4166572 kB
Committed_AS: 1377092 kB
PageTables: 3688 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:29 CDT 2006
MemTotal: 6228648 kB
MemFree: 111424 kB
Buffers: 15480 kB
Cached: 4859100 kB
SwapCached: 0 kB
Active: 2898784 kB
Inactive: 2381636 kB
HighTotal: 5373696 kB
HighFree: 102300 kB
LowTotal: 854952 kB
LowFree: 9124 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 4112 kB
Writeback: 24 kB
Mapped: 411752 kB
Slab: 819988 kB
CommitLimit: 4166572 kB
Committed_AS: 1377052 kB
PageTables: 3692 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:30 CDT 2006
MemTotal: 6228648 kB
MemFree: 233520 kB
Buffers: 15120 kB
Cached: 4736788 kB
SwapCached: 0 kB
Active: 2848088 kB
Inactive: 2309964 kB
HighTotal: 5373696 kB
HighFree: 224068 kB
LowTotal: 854952 kB
LowFree: 9452 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 5828 kB
Writeback: 8 kB
Mapped: 412080 kB
Slab: 819804 kB
CommitLimit: 4166572 kB
Committed_AS: 1377436 kB
PageTables: 3696 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:31 CDT 2006
MemTotal: 6228648 kB
MemFree: 1420452 kB
Buffers: 15132 kB
Cached: 3551672 kB
SwapCached: 0 kB
Active: 2585440 kB
Inactive: 1387496 kB
HighTotal: 5373696 kB
HighFree: 1409756 kB
LowTotal: 854952 kB
LowFree: 10696 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 10628 kB
Writeback: 0 kB
Mapped: 411984 kB
Slab: 818260 kB
CommitLimit: 4166572 kB
Committed_AS: 1377260 kB
PageTables: 3692 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:32 CDT 2006
MemTotal: 6228648 kB
MemFree: 1410328 kB
Buffers: 15148 kB
Cached: 3562400 kB
SwapCached: 0 kB
Active: 2590912 kB
Inactive: 1392736 kB
HighTotal: 5373696 kB
HighFree: 1399216 kB
LowTotal: 854952 kB
LowFree: 11112 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 11332 kB
Writeback: 28 kB
Mapped: 411964 kB
Slab: 818204 kB
CommitLimit: 4166572 kB
Committed_AS: 1377292 kB
PageTables: 3692 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:33 CDT 2006
MemTotal: 6228648 kB
MemFree: 1401136 kB
Buffers: 15152 kB
Cached: 3569060 kB
SwapCached: 0 kB
Active: 2597740 kB
Inactive: 1395164 kB
HighTotal: 5373696 kB
HighFree: 1389916 kB
LowTotal: 854952 kB
LowFree: 11220 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 13240 kB
Writeback: 0 kB
Mapped: 414584 kB
Slab: 818124 kB
CommitLimit: 4166572 kB
Committed_AS: 1380016 kB
PageTables: 3704 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:34 CDT 2006
MemTotal: 6228648 kB
MemFree: 1397776 kB
Buffers: 15160 kB
Cached: 3574424 kB
SwapCached: 0 kB
Active: 2598112 kB
Inactive: 1397528 kB
HighTotal: 5373696 kB
HighFree: 1386568 kB
LowTotal: 854952 kB
LowFree: 11208 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 16304 kB
Writeback: 0 kB
Mapped: 411928 kB
Slab: 818124 kB
CommitLimit: 4166572 kB
Committed_AS: 1377248 kB
PageTables: 3692 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:35 CDT 2006
MemTotal: 6228648 kB
MemFree: 1391084 kB
Buffers: 15168 kB
Cached: 3581964 kB
SwapCached: 0 kB
Active: 2598776 kB
Inactive: 1404320 kB
HighTotal: 5373696 kB
HighFree: 1379748 kB
LowTotal: 854952 kB
LowFree: 11336 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 13420 kB
Writeback: 8 kB
Mapped: 411872 kB
Slab: 818132 kB
CommitLimit: 4166572 kB
Committed_AS: 1377180 kB
PageTables: 3692 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:36 CDT 2006
MemTotal: 6228648 kB
MemFree: 1398128 kB
Buffers: 15168 kB
Cached: 3574620 kB
SwapCached: 0 kB
Active: 2590084 kB
Inactive: 1405716 kB
HighTotal: 5373696 kB
HighFree: 1386816 kB
LowTotal: 854952 kB
LowFree: 11312 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 6272 kB
Writeback: 16 kB
Mapped: 411876 kB
Slab: 818020 kB
CommitLimit: 4166572 kB
Committed_AS: 1377172 kB
PageTables: 3696 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

Tue Jun 13 16:02:37 CDT 2006
MemTotal: 6228648 kB
MemFree: 1393432 kB
Buffers: 15168 kB
Cached: 3579108 kB
SwapCached: 0 kB
Active: 2591740 kB
Inactive: 1408456 kB
HighTotal: 5373696 kB
HighFree: 1381980 kB
LowTotal: 854952 kB
LowFree: 11452 kB
SwapTotal: 1052248 kB
SwapFree: 1052248 kB
Dirty: 8736 kB
Writeback: 16 kB
Mapped: 411844 kB
Slab: 818016 kB
CommitLimit: 4166572 kB
Committed_AS: 1377220 kB
PageTables: 3696 kB
VmallocTotal: 118776 kB
VmallocUsed: 976 kB
VmallocChunk: 117612 kB

--------------------------------------------------------------------------------

/proc/slabinfo:
slabinfo - version: 2.1
# name: tunables : slabdata
rpc_buffers 8 8 2048 2 1 : tunables 24 12 8 : slabdata 4 4 0
rpc_tasks 509 525 256 15 1 : tunables 120 60 8 : slabdata 35 35 292
rpc_inode_cache 594 595 512 7 1 : tunables 54 27 8 : slabdata 85 85 0
UNIX 1433 1442 512 7 1 : tunables 54 27 8 : slabdata 206 206 0
ipt_hashlimit 0 0 40 92 1 : tunables 120 60 8 : slabdata 0 0 0
tcp_bind_bucket 463 2436 16 203 1 : tunables 120 60 8 : slabdata 12 12 0
inet_peer_cache 0 0 64 59 1 : tunables 120 60 8 : slabdata 0 0 0
ip_fib_alias 29 113 32 113 1 : tunables 120 60 8 : slabdata 1 1 0
ip_fib_hash 29 113 32 113 1 : tunables 120 60 8 : slabdata 1 1 0
ip_dst_cache 24870 24870 256 15 1 : tunables 120 60 8 : slabdata 1658 1658 0
arp_cache 30 30 256 15 1 : tunables 120 60 8 : slabdata 2 2 0
RAW 5 7 512 7 1 : tunables 54 27 8 : slabdata 1 1 0
UDP 6 21 512 7 1 : tunables 54 27 8 : slabdata 3 3 0
tw_sock_TCP 14058 16530 128 30 1 : tunables 120 60 8 : slabdata 551 551 180
request_sock_TCP 283 590 64 59 1 : tunables 120 60 8 : slabdata 10 10 60
TCP 977 1200 1024 4 1 : tunables 54 27 8 : slabdata 300 300 11
uhci_urb_priv 0 0 44 84 1 : tunables 120 60 8 : slabdata 0 0 0
scsi_cmd_cache 108 150 384 10 1 : tunables 54 27 8 : slabdata 15 15 38
cfq_ioc_pool 0 0 48 78 1 : tunables 120 60 8 : slabdata 0 0 0
cfq_pool 0 0 96 40 1 : tunables 120 60 8 : slabdata 0 0 0
crq_pool 0 0 48 78 1 : tunables 120 60 8 : slabdata 0 0 0
deadline_drq 0 0 52 72 1 : tunables 120 60 8 : slabdata 0 0 0
as_arq 322 413 64 59 1 : tunables 120 60 8 : slabdata 7 7 180
mqueue_inode_cache 1 6 640 6 1 : tunables 54 27 8 : slabdata 1 1 0
udf_inode_cache 0 0 412 9 1 : tunables 54 27 8 : slabdata 0 0 0
nfs_write_data 295 511 512 7 1 : tunables 54 27 8 : slabdata 73 73 200
nfs_read_data 133 161 512 7 1 : tunables 54 27 8 : slabdata 23 23 11
nfs_inode_cache 685686 780552 640 6 1 : tunables 54 27 8 : slabdata 130092 130092 0
nfs_page 855 3422 64 59 1 : tunables 120 60 8 : slabdata 58 58 0
isofs_inode_cache 0 0 384 10 1 : tunables 54 27 8 : slabdata 0 0 0
ext2_inode_cache 335 592 468 8 1 : tunables 54 27 8 : slabdata 74 74 0
journal_handle 84 169 20 169 1 : tunables 120 60 8 : slabdata 1 1 0
journal_head 235 432 52 72 1 : tunables 120 60 8 : slabdata 6 6 0
revoke_table 10 254 12 254 1 : tunables 120 60 8 : slabdata 1 1 0
revoke_record 16 203 16 203 1 : tunables 120 60 8 : slabdata 1 1 0
ext3_inode_cache 308 568 504 8 1 : tunables 54 27 8 : slabdata 71 71 0
ext3_xattr 0 0 48 78 1 : tunables 120 60 8 : slabdata 0 0 0
reiser_inode_cache 0 0 436 9 1 : tunables 54 27 8 : slabdata 0 0 0
dnotify_cache 0 0 20 169 1 : tunables 120 60 8 : slabdata 0 0 0
dquot 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
eventpoll_pwq 0 0 36 101 1 : tunables 120 60 8 : slabdata 0 0 0
eventpoll_epi 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
inotify_event_cache 0 0 28 127 1 : tunables 120 60 8 : slabdata 0 0 0
inotify_watch_cache 0 0 36 101 1 : tunables 120 60 8 : slabdata 0 0 0
kioctx 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0 0
kiocb 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
fasync_cache 0 0 16 203 1 : tunables 120 60 8 : slabdata 0 0 0
shmem_inode_cache 6 16 452 8 1 : tunables 54 27 8 : slabdata 2 2 0
posix_timers_cache 0 0 100 39 1 : tunables 120 60 8 : slabdata 0 0 0
uid_cache 4 59 64 59 1 : tunables 120 60 8 : slabdata 1 1 0
sgpool-128 32 33 2560 3 2 : tunables 24 12 8 : slabdata 11 11 0
sgpool-64 32 33 1280 3 1 : tunables 24 12 8 : slabdata 11 11 0
sgpool-32 32 36 640 6 1 : tunables 54 27 8 : slabdata 6 6 0
sgpool-16 116 140 384 10 1 : tunables 54 27 8 : slabdata 14 14 27
sgpool-8 243 285 256 15 1 : tunables 120 60 8 : slabdata 19 19 60
blkdev_ioc 1711 1778 28 127 1 : tunables 120 60 8 : slabdata 14 14 0
blkdev_queue 28 40 396 10 1 : tunables 54 27 8 : slabdata 4 4 0
blkdev_requests 217 384 160 24 1 : tunables 120 60 8 : slabdata 16 16 60
biovec-(256) 256 256 3072 2 2 : tunables 24 12 8 : slabdata 128 128 0
biovec-128 256 260 1536 5 2 : tunables 24 12 8 : slabdata 52 52 0
biovec-64 256 260 768 5 1 : tunables 54 27 8 : slabdata 52 52 0
biovec-16 479 555 256 15 1 : tunables 120 60 8 : slabdata 37 37 44
biovec-4 256 295 64 59 1 : tunables 120 60 8 : slabdata 5 5 0
biovec-1 424 812 16 203 1 : tunables 120 60 8 : slabdata 4 4 60
bio 502 660 128 30 1 : tunables 120 60 8 : slabdata 22 22 120
file_lock_cache 10 42 92 42 1 : tunables 120 60 8 : slabdata 1 1 0
sock_inode_cache 2375 2723 512 7 1 : tunables 54 27 8 : slabdata 389 389 27
skbuff_fclone_cache 2309 2840 384 10 1 : tunables 54 27 8 : slabdata 284 284 135
skbuff_head_cache 1050 1050 256 15 1 : tunables 120 60 8 : slabdata 70 70 232
acpi_operand 1373 1564 40 92 1 : tunables 120 60 8 : slabdata 17 17 0
acpi_parse_ext 0 0 44 84 1 : tunables 120 60 8 : slabdata 0 0 0
acpi_parse 0 0 28 127 1 : tunables 120 60 8 : slabdata 0 0 0
acpi_state 0 0 48 78 1 : tunables 120 60 8 : slabdata 0 0 0
proc_inode_cache 45 50 372 10 1 : tunables 54 27 8 : slabdata 5 5 0
sigqueue 0 0 148 26 1 : tunables 120 60 8 : slabdata 0 0 0
radix_tree_node 255795 315014 276 14 1 : tunables 54 27 8 : slabdata 22501 22501 0
bdev_cache 9 28 512 7 1 : tunables 54 27 8 : slabdata 4 4 0
sysfs_dir_cache 3008 3128 40 92 1 : tunables 120 60 8 : slabdata 34 34 0
mnt_cache 319 330 128 30 1 : tunables 120 60 8 : slabdata 11 11 0
inode_cache 1065 1100 356 11 1 : tunables 54 27 8 : slabdata 100 100 0
dentry_cache 639271 753956 140 28 1 : tunables 120 60 8 : slabdata 26927 26927 0
filp 3234 4215 256 15 1 : tunables 120 60 8 : slabdata 281 281 300
names_cache 125 125 4096 1 1 : tunables 24 12 8 : slabdata 125 125 72
idr_layer_cache 91 116 136 29 1 : tunables 120 60 8 : slabdata 4 4 0
buffer_head 13514 38160 52 72 1 : tunables 120 60 8 : slabdata 530 530 0
mm_struct 95 120 640 6 1 : tunables 54 27 8 : slabdata 20 20 0
vm_area_struct 9330 9492 92 42 1 : tunables 120 60 8 : slabdata 226 226 60
fs_cache 81 177 64 59 1 : tunables 120 60 8 : slabdata 3 3 0
files_cache 82 98 512 7 1 : tunables 54 27 8 : slabdata 14 14 0
signal_cache 125 150 384 10 1 : tunables 54 27 8 : slabdata 15 15 0
sighand_cache 119 130 1408 5 2 : tunables 24 12 8 : slabdata 26 26 0
task_struct 3635 3636 1280 3 1 : tunables 24 12 8 : slabdata 1212 1212 0
anon_vma 1144 1524 12 254 1 : tunables 120 60 8 : slabdata 6 6 0
pgd 95 339 32 113 1 : tunables 120 60 8 : slabdata 3 3 0
pmd 195 195 4096 1 1 : tunables 24 12 8 : slabdata 195 195 0
size-131072(DMA) 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-131072 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-65536(DMA) 0 0 65536 1 16 : tunables 8 4 0 : slabdata 0 0 0
size-65536 0 0 65536 1 16 : tunables 8 4 0 : slabdata 0 0 0
size-32768(DMA) 0 0 32768 1 8 : tunables 8 4 0 : slabdata 0 0 0
size-32768 275 280 32768 1 8 : tunables 8 4 0 : slabdata 275 280 0
size-16384(DMA) 0 0 16384 1 4 : tunables 8 4 0 : slabdata 0 0 0
size-16384 28 32 16384 1 4 : tunables 8 4 0 : slabdata 28 32 0
size-8192(DMA) 0 0 8192 1 2 : tunables 8 4 0 : slabdata 0 0 0
size-8192 3650 3650 8192 1 2 : tunables 8 4 0 : slabdata 3650 3650 0
size-4096(DMA) 0 0 4096 1 1 : tunables 24 12 8 : slabdata 0 0 0
size-4096 808 808 4096 1 1 : tunables 24 12 8 : slabdata 808 808 48
size-2048(DMA) 0 0 2048 2 1 : tunables 24 12 8 : slabdata 0 0 0
size-2048 2216 2288 2048 2 1 : tunables 24 12 8 : slabdata 1144 1144 24
size-1024(DMA) 0 0 1024 4 1 : tunables 54 27 8 : slabdata 0 0 0
size-1024 196 196 1024 4 1 : tunables 54 27 8 : slabdata 49 49 0
size-512(DMA) 0 0 512 8 1 : tunables 54 27 8 : slabdata 0 0 0
size-512 1871 1952 512 8 1 : tunables 54 27 8 : slabdata 244 244 79
size-256(DMA) 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0 0
size-256 318 330 256 15 1 : tunables 120 60 8 : slabdata 22 22 0
size-128(DMA) 0 0 128 30 1 : tunables 120 60 8 : slabdata 0 0 0
size-128 8063 8460 128 30 1 : tunables 120 60 8 : slabdata 282 282 0
size-64(DMA) 0 0 64 59 1 : tunables 120 60 8 : slabdata 0 0 0
size-32(DMA) 0 0 32 113 1 : tunables 120 60 8 : slabdata 0 0 0
size-64 222843 222843 64 59 1 : tunables 120 60 8 : slabdata 3777 3777 0
size-32 8885 12091 32 113 1 : tunables 120 60 8 : slabdata 107 107 344
kmem_cache 136 150 128 30 1 : tunables 120 60 8 : slabdata 5 5 0
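
To see which caches dominate the slabinfo dump above, the rows can be ranked by the memory their slabs occupy. This is a sketch only: it reads the columns as slabinfo 2.1 lays them out (name, active_objs, num_objs, objsize, objperslab, pagesperslab, then the tunables, then active_slabs, num_slabs, sharedavail after "slabdata"), and it assumes 4 kB pages. The function names are made up for illustration.

```python
PAGE_KB = 4  # assumption: 4 kB pages, as on 32-bit x86


def parse_slabinfo(text):
    """Parse slabinfo 2.1 rows into (name, pagesperslab, num_slabs) tuples."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blanks, the version line, the header, and anything malformed.
        if not tokens or tokens[0] in ("slabinfo", "#") or "slabdata" not in tokens:
            continue
        pages_per_slab = int(tokens[5])
        num_slabs = int(tokens[tokens.index("slabdata") + 2])
        rows.append((tokens[0], pages_per_slab, num_slabs))
    return rows


def rank_by_memory(text, top=5):
    """Return the `top` caches as (name, kB) pairs, largest first."""
    usage = [
        (name, num_slabs * pages_per_slab * PAGE_KB)
        for name, pages_per_slab, num_slabs in parse_slabinfo(text)
    ]
    return sorted(usage, key=lambda item: item[1], reverse=True)[:top]


if __name__ == "__main__":
    with open("/proc/slabinfo") as handle:  # usually needs root
        for name, kilobytes in rank_by_memory(handle.read()):
            print(f"{name:24s} {kilobytes:10d} kB")
```

Applied to the dump above, nfs_inode_cache comes out on top at 130092 slabs x 1 page x 4 kB = 520368 kB, which is well over half of the Slab figure reported by /proc/meminfo.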