Re: [PATCH] Fail on bad verify.

From: Shawn Lewis <shawnlewis_at_google.com>
Date: Thu, 2 Aug 2007 13:34:30 -0700

On 8/2/07, Jens Axboe <jens.axboe_at_oracle.com> wrote:
>
> On Thu, Aug 02 2007, Shawn Lewis wrote:
> > On 8/2/07, Jens Axboe <jens.axboe_at_oracle.com> wrote:
> > >
> > > On Wed, Aug 01 2007, Shawn Lewis wrote:
> > > > diff --git a/verify.c b/verify.c
> > > > index 5bdfb76..fadf1c0 100644
> > > > --- a/verify.c
> > > > +++ b/verify.c
> > > > @@ -349,7 +349,7 @@ int verify_io_u(struct thread_data *td,
> > > > struct verify_header *hdr;
> > > > unsigned int hdr_inc, hdr_num = 0;
> > > > void *p;
> > > > - int ret;
> > > > + int ret = 0;
> > > >
> > > > if (td->o.verify == VERIFY_NULL || io_u->ddir != DDIR_READ)
> > > > return 0;
> > > > @@ -401,7 +401,7 @@ int verify_io_u(struct thread_data *td,
> > > > hdr_num++;
> > > > }
> > > >
> > > > - return 0;
> > > > + return ret;
> > > > }
> > > >
> > > > static void fill_meta(struct verify_header *hdr, struct thread_data
> > > *td,
> > >
> > > Hmm, I think it's useful, but perhaps we also need an option to keep
> > > going? It may be interesting to continue verifying and see if the rest
> > > of the blocks are ok, or if there are other failures in that data set.
> > > Something like a
> > >
> > > verify_failure_fatal
> > >
> > > or whatever, I'm not very good with names. If you have a better option
> > > name, please tell me :-)
> >
> >
> > Actually I'd rather continue if there is a verify failure but it looks
> like
> > the original intention was to die, and it got lost somewhere. ret is now
> not
> > used. Maybe we should just always continue?
>
> Continue is what it does now. So I'm obviously fine with that behaviour,
> as long as it's clearly noted that it failed (and the exit status shows
> that).

Hmm, exit status doesn't indicate failure if a checksum is bad currently:

[root_at_ipww25 fio]# ./fio helloworld.fio
file: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=sync, iodepth=1
Starting 1 thread
crc32: verify failed at 2560/512
crc32: wanted 213c02ee, got e51fc835

file: (groupid=0, jobs=1): err= 0: pid=26601
  read : io=8KiB, bw=8,192KiB/s, iops=2,000, runt= 1msec
    clat (usec): min= 7, max= 13, avg=10.00, stdev= 4.24
  cpu : usr=0.00%, sys=0.00%, ctx=4
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=
0.0%
     issued r/w: total=2/0, short=0/0
     lat (usec): 10=50.00%, 20=50.00%

Run status group 0 (all jobs):
   READ: io=8KiB, aggrb=8,192KiB/s, minb=8,192KiB/s, maxb=8,192KiB/s,
mint=1msec, maxt=1msec

Disk stats (read/write):
  sda: ios=0/3, merge=0/0, ticks=0/3, in_queue=3, util=0.79%
[root_at_ipww25 fio]# echo $?
0

verify_io_u is a bit inconsistent about what it does for various types of
errors. After you apply the verify_fill_pattern I can try to make this a
little more consistent.

--
> Jens Axboe
>
>
Received on Thu Aug 02 2007 - 22:34:30 CEST

This archive was generated by hypermail 2.2.0 : Thu Aug 02 2007 - 23:00:01 CEST