mm/readahead: break read-ahead loop if filemap_add_folio return -ENOMEM
author Liu Shixin <liushixin2@huawei.com>
Fri, 22 Mar 2024 09:35:54 +0000 (17:35 +0800)
committer Andrew Morton <akpm@linux-foundation.org>
Fri, 26 Apr 2024 03:56:07 +0000 (20:56 -0700)
commit 0fd44ab213bcfb26c47eedaa0985e4b5dbf0a494
tree 4853a560e63685b3ccfa305d0c37a95a801b2c16
parent f238b8c33c6738f146bbfbb09b78870ea157c2b7

Patch series "Fix I/O high when memory almost met memcg limit", v2.

Recently, when installing a package in a Docker container that had almost
reached its memory limit, the installer became unresponsive for more than
15 minutes.  During this period, I/O stayed high (~1G/s) and affected the
whole machine.  I've constructed a reproducer as follows:

  1. Create a Docker container:

$ cat test.sh
#!/bin/bash

docker rm centos7 --force

docker create --name centos7 --memory 4G --memory-swap 6G centos:7 /usr/sbin/init
docker start centos7
sleep 1

docker cp ./alloc_page centos7:/
docker cp ./reproduce.sh centos7:/

docker exec -it centos7 /bin/bash

  2. Try to reproduce the problem inside the container:

$ cat reproduce.sh
#!/bin/bash

while true; do
    flag=$(ps -ef | grep -v grep | grep alloc_page | wc -l)
    if [ "$flag" -eq 0 ]; then
        /alloc_page &
    fi

    sleep 30

    start_time=$(date +%s)
    yum install -y expect > /dev/null 2>&1

    end_time=$(date +%s)

    elapsed_time=$((end_time - start_time))

    echo "$elapsed_time seconds"
    yum remove -y expect > /dev/null 2>&1
done

$ cat alloc_page.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define SIZE (1 * 1024 * 1024) /* 1M */

int main(void)
{
	void *addr = NULL;
	int i;

	/* Allocate and touch 6G - 50M; never freed, so it stays charged. */
	for (i = 0; i < 1024 * 6 - 50; i++) {
		addr = malloc(SIZE);
		if (!addr)
			return -1;

		memset(addr, 0, SIZE);
	}

	sleep(99999);
	return 0;
}

We found that this problem is caused by a lot of meaningless read-ahead.
Since the container has almost reached its memory limit, pages are
reclaimed immediately after being read ahead and are then read ahead again
right away.  The program runs slowly and wastes a lot of I/O resources.

These two patches aim to break out of the read-ahead loop in the above
scenario.

[1] https://lore.kernel.org/linux-mm/c2f4a2fa-3bde-72ce-66f5-db81a373fdbc@huawei.com/T/
[2] https://lore.kernel.org/all/20240201100835.1626685-1-liushixin2@huawei.com/
[3] https://lore.kernel.org/all/20240201173130.frpaqpy7iyzias5j@quack3/

This patch (of 2):

When filemap_add_folio() returns -ENOMEM, break out of the read-ahead loop,
as is already done when filemap_alloc_folio() fails.

Link: https://lkml.kernel.org/r/20240322093555.226789-1-liushixin2@huawei.com
Link: https://lkml.kernel.org/r/20240322093555.226789-2-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Liu Shixin <liushixin2@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/readahead.c