Re: FIO enter dead-loop to look for new file

From: Zhang, Yanmin <yanmin_zhang_at_linux.intel.com>
Date: Mon, 25 Feb 2008 14:25:30 +0800

On Mon, 2008-02-25 at 06:28 +0100, Jens Axboe wrote:
> On Mon, Feb 25 2008, Zhang, Yanmin wrote:
> > On Wed, 2008-02-13 at 16:58 +0800, Zhang, Yanmin wrote:
> > > On Tue, 2008-02-05 at 10:06 +0100, Jens Axboe wrote:
> > > > On Tue, Feb 05 2008, Jens Axboe wrote:
> > > > > On Tue, Feb 05 2008, Zhang, Yanmin wrote:
> > > > > > On Mon, 2008-02-04 at 11:01 +0100, Jens Axboe wrote:
> > > > > > > On Mon, Feb 04 2008, Zhang, Yanmin wrote:
> > > > > > > > On Mon, 2008-02-04 at 17:03 +0800, Zhang, Yanmin wrote:
> > > > > > > > > When I used below job file to test, it hangs. I used gdb to check it and found
> > > > > > > > > thread_main keeps calling clear_io_state over and over again. Every sub-process
> > > > > > > > > has one file, but it doesn't finish its task after it finishes the file, so it
> > > > > > > > > calls do_io again and again although it has no more file.
> > > > > > > > >
> > > > > > > > > If I change it to bsrange=4k-4k, it does work. If it's 2k-4k, it also doesn't work.
> > > > > > > > If I use bs=2k to replace bsrange, it looks like it does work although
> > > > > > > > my testing is still running.
> > > > > > >
> > > > > > > Can you try the current version, I fixed some bugs in this area on
> > > > > > > friday? Either use git to download it, or just use
> > > > > > >
> > > > > > > http://brick.kernel.dk/snaps/fio-git-latest.tar.gz
> > > > > > I tried it. With bsrange=2k-4k, it doesn't hang. However, there is
> > > > > > another issue. I used 9 disks and every disk has a 1GB file. Every 2
> > > > > > threads do I/O on one file, so there are 18 threads and 9 groups. With
> > > > > > the new fio-git, the status shows there are just 4 threads working on
> > > > > > I/O. The result also showed that 5 groups have no results.
> > This issue appears again in the latest tarball of Feb. 21st.
>
> That's odd, and you state that the fix is missing. Perhaps my git tar
> ball script is broken. What does git log say in the directory of the
> downloaded tar ball?

1) When I untarred the tarball fio-git-latest.tar.gz, tar reported:
fio/HOWTO

gzip: stdin: decompression OK, trailing garbage ignored
fio/blktrace_api.h
tar: Child returned status 2
tar: Error exit delayed from previous errors

But I could still compile and use it.
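
The "trailing garbage ignored" message from gzip suggests that extra bytes follow the end of the compressed stream. A minimal sketch of how that could be checked independently of tar, assuming the file is meant to be a single gzip member (a multi-member file would be counted as trailing data here). The function name gzip_trailing_bytes is only illustrative and is not part of fio or of the snapshot script:

```python
import gzip
import zlib


def gzip_trailing_bytes(data: bytes) -> int:
    """Return the number of bytes that follow the first complete gzip member.

    Assumption: the input should be exactly one gzip member, so anything
    after it is treated as trailing garbage.
    """
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(zlib.MAX_WBITS | 16)
    decomp.decompress(data)
    if not decomp.eof:
        # The end-of-stream marker was never reached.
        raise ValueError("gzip stream is truncated")
    # Bytes left over after the end of the compressed stream.
    return len(decomp.unused_data)


if __name__ == "__main__":
    clean = gzip.compress(b"fio/HOWTO\n" * 100)
    print(gzip_trailing_bytes(clean))            # 0
    print(gzip_trailing_bytes(clean + b"junk"))  # 4
```

For the real tarball, the bytes would be read with open("fio-git-latest.tar.gz", "rb").read(); a non-zero result would match what gzip reported above.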

2) git log output of the latest entries. It looks like the fix is not there.

commit cc9159c3f6f6a650cb973a636c35b41b8be34dbf
Author: Jens Axboe <jens.axboe_at_oracle.com>
Date: Mon Feb 4 15:58:24 2008 +0100

    sync engine: missing fsync check in vsync
    
    Signed-off-by: Jens Axboe <jens.axboe_at_oracle.com>

commit 1d2af02a16fc3c3561c994be4de887b926b2b774
Author: Jens Axboe <jens.axboe_at_oracle.com>
Date: Mon Feb 4 10:59:07 2008 +0100

    Add vsync io engine
    
    It uses readv/writev to transfer the data and coalesces adjacent
    data into a single system call (emulating queueing).
    
    Signed-off-by: Jens Axboe <jens.axboe_at_oracle.com>

commit 163f849eea2b0ce443825fa510a1cb311092a234
Author: Jens Axboe <jens.axboe_at_oracle.com>
Date: Mon Feb 4 10:56:26 2008 +0100

    Improve depth marking
    
    Signed-off-by: Jens Axboe <jens.axboe_at_oracle.com>

commit e4f54adb2aa0aec54f92f3e67eb7353e229bef95
Author: Jens Axboe <jens.axboe_at_oracle.com>
Date: Mon Feb 4 10:56:07 2008 +0100

    Decrement io_issue count when requeuing an io_u
    
    Signed-off-by: Jens Axboe <jens.axboe_at_oracle.com>

commit b271fe62101e84cd6ca2a78c92299beba251db24
Author: Jens Axboe <jens.axboe_at_oracle.com>
Date: Mon Feb 4 10:49:41 2008 +0100

    Add debug trace for io_u_queue_complete()
    
    Signed-off-by: Jens Axboe <jens.axboe_at_oracle.com>

commit 1e3c09afa64eb5b6ce5c5cc14a7a13f3c564f197
Received on Mon Feb 25 2008 - 07:25:30 CET
