acrn-kernel/mm
Zach Brown 65b8291c40 [PATCH] dio: invalidate clean pages before dio write
This patch fixes a user-triggerable oops that was reported by Leonid
Ananiev as archived at http://lkml.org/lkml/2007/2/8/337.

dio writes invalidate clean pages that intersect the written region so that
subsequent buffered reads go to disk to read the new data.  If this fails
the interface tries to tell the caller that the cache is inconsistent by
returning EIO.

Before this patch we had the problem where this invalidation failure would
clobber -EIOCBQUEUED as it made its way from fs/direct-io.c to fs/aio.c.
Both fs/aio.c and bio completion call aio_complete() and we reference freed
memory, usually oopsing.

This patch addresses this problem by invalidating before the write so that
we can cleanly return -EIO before ->direct_IO() has had a chance to return
-EIOCBQUEUED.

There is a compromise here.  During the dio write we can fault in mmap()ed
pages which intersect the written range with get_user_pages() if the user
provided them for the source buffer.  This is a crazy thing to do, but we
can make it mostly work in most cases by trying the invalidation again.
The compromise is that we won't return an error if this second invalidation
fails if it's an AIO write and we have -EIOCBQUEUED.

This was tested by having two processes race performing large O_DIRECT and
buffered ordered writes.  Within minutes ext3 would see a race between
ext3_releasepage() and jbd holding a reference on ordered data buffers and
would cause invalidation to fail, panicing the box.  The test can be found
in the 'aio_dio_bugs' test group in test.kernel.org/autotest.  After this
patch the test passes.

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: Leonid Ananiev <leonid.i.ananiev@linux.intel.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:04 -07:00
..
Kconfig [PATCH] Set CONFIG_ZONE_DMA for arches with GENERIC_ISA_DMA 2007-02-11 10:51:19 -08:00
Makefile
allocpercpu.c
backing-dev.c
bootmem.c
bounce.c [PATCH] blktrace: only add a bounce trace when we really bounce 2007-01-12 10:46:49 -08:00
fadvise.c
filemap.c [PATCH] dio: invalidate clean pages before dio write 2007-03-16 19:25:04 -07:00
filemap.h
filemap_xip.c [PATCH] mm: mremap correct rmap accounting 2007-01-30 08:33:32 -08:00
fremap.c
highmem.c [PATCH] Use ZVC for free_pages 2007-02-11 10:51:17 -08:00
hugetlb.c [PATCH] hugetlb: preserve hugetlb pte dirty state 2007-02-09 09:25:46 -08:00
internal.h
madvise.c [PATCH] mm: fix madvise infinine loop 2007-03-16 19:25:04 -07:00
memory.c [PATCH] Add NOPFN_REFAULT result from vm_ops->nopfn() 2007-02-12 09:48:27 -08:00
memory_hotplug.c
mempolicy.c [PATCH] Page migration: Fix vma flag checking 2007-03-05 07:57:51 -08:00
mempool.c [PATCH] Numerous fixes to kernel-doc info in source files. 2007-02-11 10:51:32 -08:00
migrate.c [PATCH] Page migration: Fix vma flag checking 2007-03-05 07:57:51 -08:00
mincore.c [PATCH] mincore: vma crossing fix 2007-02-15 09:57:03 -08:00
mlock.c
mmap.c [PATCH] Bug in MM_RB debugging 2007-03-01 14:53:38 -08:00
mmzone.c
mprotect.c
mremap.c [PATCH] mm: mremap correct rmap accounting 2007-01-30 08:33:32 -08:00
msync.c
nommu.c
oom_kill.c
page-writeback.c [PATCH] throttle_vm_writeout(): don't loop on GFP_NOFS and GFP_NOIO allocations 2007-03-01 14:53:38 -08:00
page_alloc.c [PATCH] Rename PG_checked to PG_owner_priv_1 2007-03-01 14:53:37 -08:00
page_io.c
pdflush.c
prio_tree.c
readahead.c [PATCH] Drop __get_zone_counts() 2007-02-11 10:51:18 -08:00
rmap.c [PATCH] adapt page_lock_anon_vma() to PREEMPT_RCU 2007-03-01 14:53:39 -08:00
shmem.c [PATCH] shmem and simple const super_operations 2007-03-05 07:57:51 -08:00
shmem_acl.c
slab.c [PATCH] kernel-doc fixes for 2.6.20-git15 (non-drivers) 2007-03-01 14:53:37 -08:00
slob.c
sparse.c
swap.c
swap_state.c
swapfile.c
thrash.c
tiny-shmem.c [PATCH] mm/{,tiny-}shmem.c cleanups 2007-03-01 14:53:35 -08:00
truncate.c [PATCH] VM: invalidate_inode_pages2_range() should not exit early 2007-03-01 14:53:39 -08:00
util.c
vmalloc.c [PATCH] Numerous fixes to kernel-doc info in source files. 2007-02-11 10:51:32 -08:00
vmscan.c [PATCH] throttle_vm_writeout(): don't loop on GFP_NOFS and GFP_NOIO allocations 2007-03-01 14:53:38 -08:00
vmstat.c [PATCH] optional ZONE_DMA: optional ZONE_DMA in the VM 2007-02-11 10:51:18 -08:00