[NBLUG/talk] ext3 performance

Scott Doty scott at sonic.net
Sun Nov 23 08:48:00 PST 2003


[ This is part of a conversation I've been having on another mailing list;
I thought I'd bring it here... -sd ]

On Sat, Nov 22, 2003 at 11:32:04PM -0800, Someone wrote:
> On Sat, Nov 22, 2003 at 12:27:29PM -0800, Scott Doty wrote:
> > Would I see better performance for large file copies with a larger journal?
> 
> No, I don't think so.  There are at least two modes for the journal, data
> and ordered.  One of them (the default, I think) puts data into the
> journal, the other only the metadata.  They perform substantially
> differently, most filesystems are tuned for small file I/O since that is
> what happens 99% of the time.
> 
> Of course, this is assuming that disk io is actually what's limiting the
> copy.  If you haven't already, looking at 'iostat -x' could be helpful.

Here's a result from top:
Mem:  1030284k av, 1021128k used,    9156k free,       0k shrd,  193928k buff
                    767936k actv,   87484k in_d,   21536k in_c

"in_d" (inactive dirty pages) is what I find worrisome.

The behavior is that kjournald spends a lot of time in the "DW" state (D is
uninterruptible sleep, usually waiting on disk I/O), and it pretty much nails
the first CPU.  Interactive performance suffers, and if I don't renice -19
the liveice/icecast processes, they have trouble keeping the pipes full.
(Audio stalls for listeners, and the liveice->icecast connection will even
time out.)  Also, there's the curious effect that top reports other
processes using 80-90% of their CPUs, unless they're reniced -19.
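
To put the in_d figure from the top output in proportion, here is a minimal
Python sketch (an illustration only, not something from the original
measurements) that parses the two Mem lines and reports in_d as a share of
total memory.  It assumes in_d means inactive dirty pages, and the names
MEM_LINE and parse_mem are made up for the example.

```python
import re

# The two memory lines from the top output above, joined into one string.
MEM_LINE = ("Mem:  1030284k av, 1021128k used,    9156k free,       0k shrd,  "
            "193928k buff 767936k actv,   87484k in_d,   21536k in_c")


def parse_mem(line):
    """Return {label: kilobytes} for each '<number>k <label>' pair."""
    return {label: int(kb) for kb, label in re.findall(r"(\d+)k\s+(\w+)", line)}


mem = parse_mem(MEM_LINE)

# Share of total memory ("av") that is sitting in the in_d list.
dirty_share = 100.0 * mem["in_d"] / mem["av"]
print("in_d: %d kB (%.1f%% of %d kB total)" % (mem["in_d"], dirty_share, mem["av"]))
```

On the numbers above this comes to roughly 8.5% of RAM.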

This isn't just the RedHat kernel, as I've noticed the same behavior at home
when copying large video files between journalled filesystems.

# iostat -x
Linux 2.4.20-8smp (rock.disinfotainment.com)    11/23/2003

avg-cpu:  %user   %nice    %sys   %idle
           4.68    0.00    1.07   94.25

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
/dev/hda    10.37   2.71  1.30  0.78   93.07   27.90    46.54    13.95    58.20     0.02    1.97   3.09   0.64
/dev/hda1    0.04   0.00  0.00  0.00    0.08    0.00     0.04     0.00    72.14     0.00   69.35  37.79   0.00
/dev/hda2    0.00   0.01  0.01  0.04    0.11    0.40     0.05     0.20     9.84     0.03   48.24  94.22   0.49
/dev/hda3    1.06   0.90  0.54  0.57   12.82   11.76     6.41     5.88    22.24     0.01    9.97   2.98   0.33
/dev/hda5    9.26   1.79  0.74  0.18   80.06   15.74    40.03     7.87   104.14     0.01   11.03   5.96   0.55
/dev/hdb     0.24  10.31  0.03  0.75    2.16   88.48     1.08    44.24   117.11     0.08    9.08   8.23   0.64
/dev/hdb1    0.24  10.31  0.03  0.75    2.16   88.48     1.08    44.24   117.11     0.01    9.08   6.55   0.51

This is with tar reniced 19, which seems to have improved interactive
performance.
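
One caveat on reading the report above: a single "iostat -x" run with no
interval prints averages since boot, so those rates are not a snapshot of the
copy itself.  As a rough cross-check of the columns, the Python sketch below
converts the sectors/s figures to kB/s, assuming 512-byte sectors (an
assumption, though the reported kB/s columns agree with it).  The names ROWS
and sectors_to_kb are hypothetical.

```python
# Assumption: iostat counts 512-byte sectors.
SECTOR_BYTES = 512

# (device, rsec/s, wsec/s, rkB/s, wkB/s) copied from the report above.
ROWS = [
    ("/dev/hda", 93.07, 27.90, 46.54, 13.95),
    ("/dev/hdb", 2.16, 88.48, 1.08, 44.24),
]


def sectors_to_kb(sectors_per_sec):
    """Convert a sectors/s rate to kB/s."""
    return sectors_per_sec * SECTOR_BYTES / 1024.0


for dev, rsec, wsec, rkb, wkb in ROWS:
    # Print the computed rate next to the rate iostat reported.
    print("%-9s read %6.2f kB/s (reported %6.2f)  write %6.2f kB/s (reported %6.2f)"
          % (dev, sectors_to_kb(rsec), rkb, sectors_to_kb(wsec), wkb))
```

The computed write rate for /dev/hdb matches the reported 44.24 kB/s, which
is far below what a large file copy would sustain and fits the since-boot
averaging.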

For those curious, I've appended the journal options from mount(8).  I plan
on trying the "writeback" option, unless folks have had bad experiences
with it.

 -Scott

       data=journal / data=ordered / data=writeback
              Specifies the journalling  mode  for  file  data.   Metadata  is
              always journaled.

              journal
                     All  data  is  committed  into the journal prior to being
                     written into the main file system.

              ordered
                     This is the default mode.  All data  is  forced  directly
                     out  to  the main file system prior to its metadata being
                     committed to the journal.

              writeback
                     Data ordering is not preserved - data may be written into
                     the  main file system after its metadata has been commit-
                     ted to the journal.  This is rumoured to be the  highest-
                     throughput  option.   It  guarantees internal file system
                     integrity, however it can allow old  data  to  appear  in
                     files after a crash and journal recovery.

/sd


