non-stop thread in fio-1.16.9

From: ljzhang,Yaxin Hu,Jianchao Tang <nonggia_at_sjtu.edu.cn>
Date: Thu, 26 Jul 2007 13:40:04 +0800

1. The job file causing the problem:
-------------------rw-------------------
[global]
directory=./temp
nrfiles=1
rw=randread:8
norandommap
size=32k
bs=2k
thread
[rw]
description="Offset over real file size."
-----------------------------------------

The job file can keep fio running with the eta increasing and the
progress pausing like this:
----------------------------------------
rw: (g=0): rw=randread, bs=2K-2K/2K-2K, ioengine=sync, iodepth=1
Starting 1 thread
Jobs: 1 (f=1): [r] [92.3% done] [ 0/ 0 kb/s] [eta 00m:01s]]

----------------------------------------

If we force it to stop, it prints like this:
--------------------------------------------
fio: terminating on signal 2

rw: (groupid=0, jobs=1): err= 0: pid=6124
  Description : ["Offset over real file size."]
  read : io=28KiB, bw=0KiB/s, iops=0, runt= 61900msec
    clat (usec): min= 5, max= 436, avg=87.79, stdev=123.71
  cpu : usr=2.01%, sys=93.48%, ctx=13644
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
>=64=0.0%
     issued r/w: total=14/0, short=0/0
     lat (usec): 10=42.86%, 100=35.71%, 250=7.14%, 500=14.29%

Run status group 0 (all jobs):
   READ: io=28KiB, aggrb=0KiB/s, minb=0KiB/s, maxb=0KiB/s,
mint=61900msec, maxt=61900msec

Disk stats (read/write):
  sda: ios=8/45, merge=0/40, ticks=4/368, in_queue=372, util=0.27%
--------------------------------------------
>From above we can see an uncompleted job with a io 28KiB less than what
is specified by 'size'.

2. Reason for problem:
After looking into the source codes,We guess the problem comes from the
function get_next_offset().If the io reaches the end of the file, the
function fails, and the file will be marked as DONE.We wonder if that is
the planned behavior.Isn't it very common to reach the file's end when
doing a random io?
 
3. Fix suggestion:
Here is the patch:
-----------------------------------------------------
diff -Nraup fio-7.25/io_u.c fio-7.25-rw/io_u.c
--- fio-7.25/io_u.c 2007-07-25 14:25:05.000000000 +0800
+++ fio-7.25-rw/io_u.c 2007-07-26 09:55:22.000000000 +0800
@@ -162,10 +162,12 @@ static int get_next_offset(struct thread
if (get_next_rand_offset(td, f, ddir, &b))
return 1;
} else {
- if (f->last_pos >= f->real_file_size)
- return 1;
-
- b = f->last_pos / td->o.min_bs[ddir];
+ if (f->last_pos >= f->real_file_size) {
+ if (!td_random(td) || get_next_rand_offset(td, f, ddir, &b))
+ return 1;
+ } else {
+ b = f->last_pos / td->o.min_bs[ddir];
+ }
}

io_u->offset = (b * td->o.min_bs[ddir]) + f->file_offset;
------------------------------------------------------

After applying that, we got the job file run to the end normally:
-------------------------------------------------------
nonggia_at_nonggia-desktop:~/fio$ ./rw_fio rw
rw: (g=0): rw=randread, bs=2K-2K/2K-2K, ioengine=sync, iodepth=1
Starting 1 thread

rw: (groupid=0, jobs=1): err= 0: pid=6519
  Description : ["Offset over real file size."]
  read : io=32KiB, bw=1560KiB/s, iops=761, runt= 21msec
    clat (usec): min= 5, max=14251, avg=1293.50, stdev=3510.37
  cpu : usr=0.00%, sys=19.05%, ctx=7
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
>=64=0.0%
     issued r/w: total=16/0, short=0/0
     lat (usec): 10=37.50%, 20=12.50%, 250=12.50%, 750=6.25%, 1000=6.25%
     lat (msec): 2=18.75%, 20=6.25%

Run status group 0 (all jobs):
   READ: io=32KiB, aggrb=1560KiB/s, minb=1560KiB/s, maxb=1560KiB/s,
mint=21msec, maxt=21msec

Disk stats (read/write):
  sda: ios=8/0, merge=0/0, ticks=20/0, in_queue=20, util=13.89%
-------------------------------------------------------

Received on Thu Jul 26 2007 - 07:40:04 CEST

This archive was generated by hypermail 2.2.0 : Thu Jul 26 2007 - 09:30:01 CEST