


How can I benchmark my HDD?















50 votes















I've seen commands to benchmark one's HDD, such as this one using dd:



$ time sh -c "dd if=/dev/zero of=ddfile bs=8k count=250000 && sync"


Are there better methods to do so than this?

























linux hard-disk benchmark






asked Jan 11 '14 at 3:50 by slm; edited Apr 24 '14 at 19:09 by Braiam

• related: askubuntu.com/questions/87035/… – Ciro Santilli 新疆改造中心996ICU六四事件, Dec 20 '18 at 16:54





























7 Answers


















60 votes














I usually use hdparm to benchmark my HDDs. You can benchmark both direct reads and cached reads. You'll want to run the commands a couple of times to establish an average value.



Examples



Here's a direct read.



$ sudo hdparm -t /dev/sda2

/dev/sda2:
Timing buffered disk reads: 302 MB in 3.00 seconds = 100.58 MB/sec


And here's a cached read.



$ sudo hdparm -T /dev/sda2

/dev/sda2:
Timing cached reads: 4636 MB in 2.00 seconds = 2318.89 MB/sec
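
Since the numbers vary from run to run, a small shell loop can do the repetition and averaging for you. This is a minimal sketch assuming GNU awk and /dev/sda2 as the target; adjust both as needed:

# run the buffered-read test three times and average the MB/sec figures
$ for i in 1 2 3; do sudo hdparm -t /dev/sda2; done |
    awk '/MB\/sec/ { sum += $(NF-1); n++ } END { print sum/n, "MB/sec average" }'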


Details



-t     Perform  timings  of  device reads for benchmark and comparison 
purposes. For meaningful results, this operation should be repeated
2-3 times on an otherwise inactive system (no other active processes)
with at least a couple of megabytes of free memory. This displays
the speed of reading through the buffer cache to the disk without
any prior caching of data. This measurement is an indication of how
fast the drive can sustain sequential data reads under Linux, without
any filesystem overhead. To ensure accurate measurements, the
buffer cache is flushed during the processing of -t using the
BLKFLSBUF ioctl.

-T Perform timings of cache reads for benchmark and comparison purposes.
For meaningful results, this operation should be repeated 2-3
times on an otherwise inactive system (no other active processes)
with at least a couple of megabytes of free memory. This displays
the speed of reading directly from the Linux buffer cache without
disk access. This measurement is essentially an indication of the
throughput of the processor, cache, and memory of the system under
test.


Using dd



I've also used dd for this type of testing. One modification I would make to the command above is to append ; rm ddfile to it.



$ time sh -c "dd if=/dev/zero of=ddfile bs=8k count=250000 && sync"; rm ddfile


This will remove ddfile after the command has completed. NOTE: ddfile is a transient file that you don't need to keep; it's the file that dd writes to (of=ddfile) while it's putting your HDD under load.



Going beyond



If you need more rigorous testing of your HDDs, you can use Bonnie++.



References




  • How to use 'dd' to benchmark your disk or CPU?

  • Benchmark disk IO with DD and Bonnie++






answered Jan 11 '14 at 3:54 by slm; edited Jan 11 '14 at 4:13

  • I like hdparm as well, for quick benchmarks. The only downside is that it only benchmarks read bandwidth, and the performance of many types of block devices (e.g. RAID, iSCSI) can be very asymmetrical. For comparing ‘before’ and ‘after’ performance on the same box, dd works well too. – Alexios, Jan 11 '14 at 10:02













  • @Alexios - yes, thanks for mentioning that. You typically have to use at least hdparm + dd, or just bonnie++, or all 3. – slm, Jan 11 '14 at 14:14











  • Instead of sync, which is questionable, use iflag=direct oflag=direct where it is supported (e.g. Linux with a filesystem that supports direct I/O). – user112865, May 3 '15 at 3:30



















22 votes














This is a very popular question (you can see variations of it on https://stackoverflow.com/q/1198691, https://serverfault.com/q/219739/203726 and https://askubuntu.com/q/87035/740413).




Are there better methods [than dd] to [benchmark disks]?




Yes, but they will take longer to run and require knowledge of how to interpret the results. There's no single number that will tell you everything in one go, because all of the following influence the type of test you should run:




  • Are you interested in the performance of I/O that is random, sequential or some mix of the two?

  • Are you reading from or writing to the disk (or some mixture of the two)?

  • Are you concerned about latency, throughput or both?

  • Are you trying to understand how different parts of the same hard disk perform (on spinning disks, speeds are generally faster at the start of the device, which maps to the outer tracks)?

  • Are you interested in raw numbers or how a given filesystem will perform when using your disk?

  • Are you interested in how a particular size of I/O performs?

  • Are you submitting the I/O synchronously or asynchronously?

  • How much I/O are you submitting (submit too little, or submit it the wrong way, and all the I/O can be cached, so you wind up testing the speed of your RAM rather than the speed of the disk)?

  • How compressible is the content of the data you are writing (e.g. zero-only data is highly compressible, and some filesystems/disks even have a special fast path for zero-only data, leading to numbers that are unobtainable with other content)?


And so on.



Assuming you're interested in raw, non-filesystem benchmarks, here's a short list of tools, with the easiest to run at the top and the more difficult/better/more thorough nearer the bottom:




  1. dd (sequential, only shows throughput, but configured correctly it can be made to bypass the block cache and wait for I/O to really complete; see the dd sketch after this list)

  2. hdparm (sequential, only tests reads, only shows throughput, only works with ATA devices, doesn't account for filesystem overhead but configured correctly it can be made to bypass the block cache)

  3. GNOME Disk Utility's benchmark (easy to run, graphical but requires a full GNOME install, gives latency and throughput numbers for different types of I/O).


  4. fio (can do nearly anything and gives detailed results but requires configuration and an understanding of how to interpret said results). Here's what Linus says about it:


    Greg - get Jens' FIO code. It does things right, including writing actual pseudo-random contents, which shows if the disk does some "de-duplication" (aka "optimize for benchmarks"):



    [ https://github.com/axboe/fio/ ]



    Anything else is suspect - forget about bonnie or other traditional tools.





Source: comment left on Google Plus to Greg Kroah-Hartman by Linus Torvalds.
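
As a concrete illustration of point 1, dd can be configured so the numbers reflect the disk rather than the page cache. This is a sketch, not a definitive recipe; testfile, the block size and the count are placeholders:

# write test: direct I/O, and fdatasync makes dd flush before reporting its timing
$ dd if=/dev/zero of=testfile bs=1M count=1024 oflag=direct conv=fdatasync

# read test: direct I/O so the reads bypass the page cache
$ dd if=testfile of=/dev/null bs=1M iflag=direct

Keep the compressibility caveat above in mind: /dev/zero produces highly compressible data, so write numbers can be optimistic on devices that special-case zeros.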






answered Jun 22 '14 at 7:59 by Anon

12 votes














With the IOPS tool



If you can't be bothered to read all this, I'd just recommend the IOPS tool. It will tell you the real-world speed depending on block size.





Otherwise, when doing an I/O benchmark I would look at the following things:




    • blocksize/cache/IOPS/direct vs buffered/async vs sync

    • read/write

    • threads

    • latency

    • CPU utilization



  • Which block size will you use: If you want to read/write 1 GB from/to disk, this will be quick if you do one I/O operation. But if your application needs to write in 512-byte chunks all over the hard disk in non-sequential pieces (called random I/O, although it is not truly random), this will look different. Now, databases will do random I/O for the data volume and sequential I/O for the log volume due to their nature. So first you need to become clear about what you want to measure. If you want to copy large video files, that's different than if you want to install Linux.

      This block size affects the number of I/O operations you do. If you do e.g. 8 sequential read (or write, just not mixed) operations, the I/O scheduler of the OS will merge them. If it does not, the controller's cache will do the merge. There is practically no difference between reading 8 sequential blocks of 512 bytes and reading one 4096-byte chunk. One exception: if you manage to do direct sync I/O and wait for each 512 bytes before you request the next 512 bytes. In this case, increasing the block size is like adding cache.

      Also you should be aware that there is sync and async I/O: with sync I/O you will not issue the next I/O request before the current one returns. With async I/O you can request e.g. 10 chunks of data and then wait as they arrive. Distinct database threads will typically use sync I/O for the log and async I/O for the data. The IOPS tool takes care of that by measuring all relevant block sizes starting from 512 bytes.




  • Will you read or write: Usually reading is faster than writing. But note that caching works quite differently for reads and writes:

      • For writes, the data will be handed over to the controller, and if it caches, it will acknowledge before the data is on disk, unless the cache is full. Using the tool iozone you can draw beautiful graphs of plateaus of cache effects (CPU cache effect and buffer cache effect). The caches become less efficient the more data has been written.

      • For reads, read data is held in cache after the first read. The first reads take longest, and caching becomes more and more effective during uptime. Notable caches are the CPU cache, the OS's file system cache, the I/O controller's cache and the storage's cache. The IOPS tool only measures reads. This allows it to "read all over the place", and you do not want it to write instead of read.




  • How many threads will you use: If you use one thread (as when using dd for disk benchmarks), you will probably get much worse performance than with several threads. The IOPS tool takes this into account and reads on several threads.


    • How important is latency for you: Looking at databases, IO latency becomes enormously important. Any insert/update/delete SQL command will be written into the database journal ("log" in database lingo) on commit before it is acknowledged. This means the complete database may be waiting for this IO operation to be completed. I show here how to measure the average wait time (await) using the iostat tool.


  • How important is CPU utilization for you: Your CPU may easily become the bottleneck for your application's performance. In this case you must know how many CPU cycles get burned per byte read/written and optimize in that direction. This can mean deciding for/against PCIe flash memory, depending on your measurement results. Again, the iostat tool can give you a rough estimate of the CPU utilization caused by your I/O operations.
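
For the latency and CPU utilization points, a minimal iostat invocation (from the sysstat package; sda is a placeholder device) looks like this:

# extended per-device statistics for sda every 2 seconds: the await columns show
# the average time (ms) a request spends queued plus serviced, and the CPU lines
# at the top include %iowait
$ iostat -x sda 2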








    • The iops script is nice. I was really confused that it wasn't on apt or pip, but it does work. – ThorSummoner, Jan 23 '16 at 5:53











    • The iops tool seems to be abandoned. Also, it just measures reads and doesn't print any statistical figures (e.g. stddev/quantitative ones). – maxschlepzig, Apr 8 '17 at 10:42











    • The iops tool is simple, and that is what you need to accomplish comparability. It is basically a wrapper around the read syscall, done at random offsets on a file (everything is a file). Believe it or read the source; it is finished and the code does not need an update. Think about it: do you really want another tool like IOMeter, with 1000s of lines of code where each one is debatable? And what do you do with a new version? Will you have to re-do all benchmarks? – Thorsten Staerk, Apr 10 '17 at 18:55



















8 votes














    If you have installed PostgreSQL, you can use their excellent pg_test_fsync benchmark. It basically tests your write sync performance.



    On Ubuntu you find it here: /usr/lib/postgresql/9.5/bin/pg_test_fsync



    The great thing about it is that this tool will show you why enterprise SSDs are worth the extra $.
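
    Running it is a single command. This is a sketch assuming the Ubuntu path above; -f (pg_test_fsync's test-file option) places the scratch file on the disk you want to measure:

    # run the fsync test suite against a file on the target disk
    $ /usr/lib/postgresql/9.5/bin/pg_test_fsync -f /mnt/target/pg_test_fsync.out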







    • On Debian it is available in the postgresql-contrib package. – TranslucentCloud, Mar 15 '17 at 18:27



















3 votes














    You can use fio, the multithreaded I/O generation tool. It is packaged by several distributions, e.g. Fedora 25, Debian and OpenCSW.



    The fio tool is very flexible; it can easily be used to benchmark various I/O
    scenarios, including concurrent ones. The package comes with some example
    configuration files (cf. e.g. /usr/share/doc/fio/examples). It measures things properly, i.e. it also prints the
    standard deviation and quantitative statistics for some figures, which some
    other popular benchmarking tools don't care about.



    A simple example (a sequence of simple scenarios: sequential/random X read/write):



    $ cat fio.cfg
    [global]
    size=1g
    filename=/dev/sdz

    [randwrite]
    rw=randwrite

    [randread]
    wait_for=randwrite
    rw=randread
    size=256m

    [seqread]
    wait_for=randread
    rw=read

    [seqwrite]
    wait_for=seqread
    rw=write


    The call:



    # fio -o fio-seagate-usb-xyz.log fio.cfg
    $ cat fio-seagate-usb-xyz.log
    [..]
    randwrite: (groupid=0, jobs=1): err= 0: pid=11858: Sun Apr 2 21:23:30 2017
    write: io=1024.0MB, bw=16499KB/s, iops=4124, runt= 63552msec
    clat (usec): min=1, max=148280, avg=240.21, stdev=2216.91
    lat (usec): min=1, max=148280, avg=240.49, stdev=2216.91
    clat percentiles (usec):
    | 1.00th=[ 2], 5.00th=[ 2], 10.00th=[ 2], 20.00th=[ 7],
    | 30.00th=[ 10], 40.00th=[ 11], 50.00th=[ 11], 60.00th=[ 12],
    | 70.00th=[ 14], 80.00th=[ 16], 90.00th=[ 19], 95.00th=[ 25],
    | 99.00th=[ 9408], 99.50th=[10432], 99.90th=[21888], 99.95th=[38144],
    | 99.99th=[92672]
    bw (KB /s): min= 7143, max=371874, per=45.77%, avg=15104.53, stdev=32105.17
    lat (usec) : 2=0.20%, 4=15.36%, 10=6.58%, 20=69.35%, 50=6.07%
    lat (usec) : 100=0.49%, 250=0.07%, 500=0.01%, 750=0.01%
    lat (msec) : 4=0.01%, 10=1.20%, 20=0.54%, 50=0.08%, 100=0.03%
    lat (msec) : 250=0.01%
    cpu : usr=1.04%, sys=4.79%, ctx=4977, majf=0, minf=11
    IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
    submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    issued : total=r=0/w=262144/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
    latency : target=0, window=0, percentile=100.00%, depth=1
    randread: (groupid=0, jobs=1): err= 0: pid=11876: Sun Apr 2 21:23:30 2017
    read : io=262144KB, bw=797863B/s, iops=194, runt=336443msec
    [..]
    bw (KB /s): min= 312, max= 4513, per=15.19%, avg=591.51, stdev=222.35
    [..]


    Note that the [global] section has global defaults that can be overridden by
    other sections. Each section describes a job; the section name is the job name
    and can be freely chosen. By default, different jobs are started in parallel,
    thus the above example explicitly serializes the job execution with the
    wait_for key. Also, fio uses a block size of 4 KiB, which can be changed as
    well. The example directly uses the raw device for read and write jobs, so
    make sure that you use the right device. The tool also supports using a
    file/directory on existing filesystems.
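
    For instance, the two defaults just mentioned can be overridden in the [global] section. This is a sketch along the lines of the Anon comment below; /dev/sdz as before:

    [global]
    filename=/dev/sdz
    size=1g
    ; bypass the page cache so only the device speed is measured
    direct=1
    ; override fio's 4 KiB default block size
    bs=64k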



    Other Tools



    The hdparm utility provides a very simple read benchmark, e.g.:



    # hdparm -t -T /dev/sdz


    It's not a replacement for a state-of-the-art benchmarking tool like fio; it
    should just be used as a first plausibility check. For example, it can reveal
    whether an external USB 3 drive is wrongly recognized as a USB 2 device (you
    would then see ~ 30 MiB/s instead of ~ 100 MiB/s rates).







    • This answer is essentially a different version of the summary answer unix.stackexchange.com/a/138516/134856 (but with an expanded fio section). I'm torn, because it does provide an fio summary, but it's quite long and you may be able to get away with linking to fio.readthedocs.io/en/latest/fio_doc.html#job-file-format ... – Anon, Jun 30 '17 at 3:28











    • PS: I'd recommend adding direct=1 to the global section of your job so you bypass Linux's page cache and only see the disk's speed (but since your iodepth is only 1... [insert discussion about submitting disk I/O]). It's also easier to use stonewall (fio.readthedocs.io/en/latest/…) globally to make all jobs run sequentially. – Anon, Jun 30 '17 at 3:41



















1 vote














    As pointed out here, you can use gnome-disks (if you use GNOME).

    Click on the drive that you want to test, then click on "Additional partition options" (the gears icon), then Benchmark Partition. You'll get the average read/write rate in MB/s and average access times in milliseconds. I found that very convenient.







1 vote














      It's a little crude, but this works in a pinch:



      find <path> -type f -print0 | cpio -0o >/dev/null


      You can do some interesting things with this technique, including caching all the /lib and /usr/bin files. You can also use this as part of a benchmarking effort:



      find / -xdev -type f -print0 | 
      sort -Rz |
      timeout "5m" cpio -0o >/dev/null


      All filenames on the root filesystem are found, sorted randomly, and copied into cache for up to 5 minutes. The output from cpio tells you how many blocks were copied. Repeat 3 times to get an average of blocks per minute. (Note: the find/sort operation may take a long time, much longer than the copy. It would be better to cache the find/sort output and use split to get a sample of files.)
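
      A sketch of that caching idea (paths and the sample size are arbitrary, head stands in for the split suggestion, and the -z flags assume GNU coreutils):

      # build the shuffled NUL-separated file list once...
      find / -xdev -type f -print0 | sort -Rz > /tmp/filelist
      # ...then benchmark repeatedly against the same random sample of it
      head -z -n 10000 /tmp/filelist | timeout "5m" cpio -0o >/dev/null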







        Your Answer








        StackExchange.ready(function() {
        var channelOptions = {
        tags: "".split(" "),
        id: "106"
        };
        initTagRenderer("".split(" "), "".split(" "), channelOptions);

        StackExchange.using("externalEditor", function() {
        // Have to fire editor after snippets, if snippets enabled
        if (StackExchange.settings.snippets.snippetsEnabled) {
        StackExchange.using("snippets", function() {
        createEditor();
        });
        }
        else {
        createEditor();
        }
        });

        function createEditor() {
        StackExchange.prepareEditor({
        heartbeatType: 'answer',
        autoActivateHeartbeat: false,
        convertImagesToLinks: false,
        noModals: true,
        showLowRepImageUploadWarning: true,
        reputationToPostImages: null,
        bindNavPrevention: true,
        postfix: "",
        imageUploader: {
        brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
        contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
        allowUrls: true
        },
        onDemand: true,
        discardSelector: ".discard-answer"
        ,immediatelyShowMarkdownHelp:true
        });


        }
        });














        draft saved

        draft discarded


















        StackExchange.ready(
        function () {
        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f108838%2fhow-can-i-benchmark-my-hdd%23new-answer', 'question_page');
        }
        );

        Post as a guest















        Required, but never shown

























        7 Answers
        7






        active

        oldest

        votes








        7 Answers
        7






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes









        60














        I usually use hdparm to benchmark my HDD's. You can benchmark both the direct reads and the cached reads. You'll want to run the commands a couple of times to establish an average value.



        Examples



        Here's a direct read.



        $ sudo hdparm -t /dev/sda2

        /dev/sda2:
        Timing buffered disk reads: 302 MB in 3.00 seconds = 100.58 MB/sec


        And here's a cached read.



        $ sudo hdparm -T /dev/sda2

        /dev/sda2:
        Timing cached reads: 4636 MB in 2.00 seconds = 2318.89 MB/sec


        Details



        -t     Perform  timings  of  device reads for benchmark and comparison 
        purposes. For meaningful results, this operation should be repeated
        2-3 times on an otherwise inactive system (no other active processes)
        with at least a couple of megabytes of free memory. This displays
        the speed of reading through the buffer cache to the disk without
        any prior caching of data. This measurement is an indication of how
        fast the drive can sustain sequential data reads under Linux, without
        any filesystem overhead. To ensure accurate measurements, the
        buffer cache is flushed during the processing of -t using the
        BLKFLSBUF ioctl.

        -T Perform timings of cache reads for benchmark and comparison purposes.
        For meaningful results, this operation should be repeated 2-3
        times on an otherwise inactive system (no other active processes)
        with at least a couple of megabytes of free memory. This displays
        the speed of reading directly from the Linux buffer cache without
        disk access. This measurement is essentially an indication of the
        throughput of the processor, cache, and memory of the system under
        test.


        Using dd



        I too have used dd for this type of testing as well. One modification I would make to the above command is to add this bit to the end of your command, ; rm ddfile.



        $ time sh -c "dd if=/dev/zero of=ddfile bs=8k count=250000 && sync"; rm ddfile


        This will remove the ddfile after the command has completed. NOTE: ddfile is a transient file that you don't need to keep, it's the file that dd is writing to (of=ddfile), when it's putting your HDD under load.



        Going beyond



        If you need more rigorous testing of your HDD's you can use Bonnie++.



        References




        • How to use 'dd' to benchmark your disk or CPU?

        • Benchmark disk IO with DD and Bonnie++






        share|improve this answer





















        • 1





          I like hdparm as well, for quick benchmarks. The only downside is it only benchmarks read bandwidth and the performance of many types of block devices (e.g. RAID, iSCSI) can be very asymmetrical. For comparing ‘before’ and ‘after’ performance on the same box, dd works well too.

          – Alexios
          Jan 11 '14 at 10:02













        • @Alexios - yes thanks for mentioning that. Yes you typically have to use at least hdparm + dd or just bonnie++ or all 3.

          – slm
          Jan 11 '14 at 14:14











        • Instead of sync which is questionable use iflag=direct oflag=direct when it is supposed (eg linux with a filesystem that supports direct io).

          – user112865
          May 3 '15 at 3:30
















        60














        I usually use hdparm to benchmark my HDD's. You can benchmark both the direct reads and the cached reads. You'll want to run the commands a couple of times to establish an average value.



        Examples



        Here's a direct read.



        $ sudo hdparm -t /dev/sda2

        /dev/sda2:
        Timing buffered disk reads: 302 MB in 3.00 seconds = 100.58 MB/sec


        And here's a cached read.



        $ sudo hdparm -T /dev/sda2

        /dev/sda2:
        Timing cached reads: 4636 MB in 2.00 seconds = 2318.89 MB/sec


        Details



        -t     Perform  timings  of  device reads for benchmark and comparison 
        purposes. For meaningful results, this operation should be repeated
        2-3 times on an otherwise inactive system (no other active processes)
        with at least a couple of megabytes of free memory. This displays
        the speed of reading through the buffer cache to the disk without
        any prior caching of data. This measurement is an indication of how
        fast the drive can sustain sequential data reads under Linux, without
        any filesystem overhead. To ensure accurate measurements, the
        buffer cache is flushed during the processing of -t using the
        BLKFLSBUF ioctl.

        -T Perform timings of cache reads for benchmark and comparison purposes.
        For meaningful results, this operation should be repeated 2-3
        times on an otherwise inactive system (no other active processes)
        with at least a couple of megabytes of free memory. This displays
        the speed of reading directly from the Linux buffer cache without
        disk access. This measurement is essentially an indication of the
        throughput of the processor, cache, and memory of the system under
        test.


        Using dd



        I too have used dd for this type of testing as well. One modification I would make to the above command is to add this bit to the end of your command, ; rm ddfile.



        $ time sh -c "dd if=/dev/zero of=ddfile bs=8k count=250000 && sync"; rm ddfile


        This will remove the ddfile after the command has completed. NOTE: ddfile is a transient file that you don't need to keep, it's the file that dd is writing to (of=ddfile), when it's putting your HDD under load.



        Going beyond



        If you need more rigorous testing of your HDD's you can use Bonnie++.



        References




        • How to use 'dd' to benchmark your disk or CPU?

        • Benchmark disk IO with DD and Bonnie++






        share|improve this answer





















        • 1





          I like hdparm as well, for quick benchmarks. The only downside is it only benchmarks read bandwidth and the performance of many types of block devices (e.g. RAID, iSCSI) can be very asymmetrical. For comparing ‘before’ and ‘after’ performance on the same box, dd works well too.

          – Alexios
          Jan 11 '14 at 10:02













        • @Alexios - yes thanks for mentioning that. Yes you typically have to use at least hdparm + dd or just bonnie++ or all 3.

          – slm
          Jan 11 '14 at 14:14











        • Instead of sync which is questionable use iflag=direct oflag=direct when it is supposed (eg linux with a filesystem that supports direct io).

          – user112865
          May 3 '15 at 3:30














        60












        60








        60







        I usually use hdparm to benchmark my HDD's. You can benchmark both the direct reads and the cached reads. You'll want to run the commands a couple of times to establish an average value.



        Examples



        Here's a direct read.



        $ sudo hdparm -t /dev/sda2

        /dev/sda2:
        Timing buffered disk reads: 302 MB in 3.00 seconds = 100.58 MB/sec


        And here's a cached read.



        $ sudo hdparm -T /dev/sda2

        /dev/sda2:
        Timing cached reads: 4636 MB in 2.00 seconds = 2318.89 MB/sec


        Details



        -t     Perform  timings  of  device reads for benchmark and comparison 
        purposes. For meaningful results, this operation should be repeated
        2-3 times on an otherwise inactive system (no other active processes)
        with at least a couple of megabytes of free memory. This displays
        the speed of reading through the buffer cache to the disk without
        any prior caching of data. This measurement is an indication of how
        fast the drive can sustain sequential data reads under Linux, without
        any filesystem overhead. To ensure accurate measurements, the
        buffer cache is flushed during the processing of -t using the
        BLKFLSBUF ioctl.

        -T Perform timings of cache reads for benchmark and comparison purposes.
        For meaningful results, this operation should be repeated 2-3
        times on an otherwise inactive system (no other active processes)
        with at least a couple of megabytes of free memory. This displays
        the speed of reading directly from the Linux buffer cache without
        disk access. This measurement is essentially an indication of the
        throughput of the processor, cache, and memory of the system under
        test.


        Using dd



        I too have used dd for this type of testing as well. One modification I would make to the above command is to add this bit to the end of your command, ; rm ddfile.



        $ time sh -c "dd if=/dev/zero of=ddfile bs=8k count=250000 && sync"; rm ddfile


        This will remove the ddfile after the command has completed. NOTE: ddfile is a transient file that you don't need to keep, it's the file that dd is writing to (of=ddfile), when it's putting your HDD under load.



        Going beyond



        If you need more rigorous testing of your HDD's you can use Bonnie++.



        References




        • How to use 'dd' to benchmark your disk or CPU?

        • Benchmark disk IO with DD and Bonnie++






        share|improve this answer















        I usually use hdparm to benchmark my HDD's. You can benchmark both the direct reads and the cached reads. You'll want to run the commands a couple of times to establish an average value.



        Examples



        Here's a direct read.



        $ sudo hdparm -t /dev/sda2

        /dev/sda2:
        Timing buffered disk reads: 302 MB in 3.00 seconds = 100.58 MB/sec


        And here's a cached read.



        $ sudo hdparm -T /dev/sda2

        /dev/sda2:
        Timing cached reads: 4636 MB in 2.00 seconds = 2318.89 MB/sec


        Details



        -t     Perform  timings  of  device reads for benchmark and comparison 
        purposes. For meaningful results, this operation should be repeated
        2-3 times on an otherwise inactive system (no other active processes)
        with at least a couple of megabytes of free memory. This displays
        the speed of reading through the buffer cache to the disk without
        any prior caching of data. This measurement is an indication of how
        fast the drive can sustain sequential data reads under Linux, without
        any filesystem overhead. To ensure accurate measurements, the
        buffer cache is flushed during the processing of -t using the
        BLKFLSBUF ioctl.

        -T Perform timings of cache reads for benchmark and comparison purposes.
        For meaningful results, this operation should be repeated 2-3
        times on an otherwise inactive system (no other active processes)
        with at least a couple of megabytes of free memory. This displays
        the speed of reading directly from the Linux buffer cache without
        disk access. This measurement is essentially an indication of the
        throughput of the processor, cache, and memory of the system under
        test.


        Using dd



        I too have used dd for this type of testing as well. One modification I would make to the above command is to add this bit to the end of your command, ; rm ddfile.



        $ time sh -c "dd if=/dev/zero of=ddfile bs=8k count=250000 && sync"; rm ddfile


        This will remove the ddfile after the command has completed. NOTE: ddfile is a transient file that you don't need to keep, it's the file that dd is writing to (of=ddfile), when it's putting your HDD under load.



        Going beyond



        If you need more rigorous testing of your HDD's you can use Bonnie++.



        References




        • How to use 'dd' to benchmark your disk or CPU?

        • Benchmark disk IO with DD and Bonnie++







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Jan 11 '14 at 4:13

























        answered Jan 11 '14 at 3:54









        slmslm

        265k73 gold badges573 silver badges716 bronze badges




        265k73 gold badges573 silver badges716 bronze badges








        • 1





          I like hdparm as well, for quick benchmarks. The only downside is it only benchmarks read bandwidth and the performance of many types of block devices (e.g. RAID, iSCSI) can be very asymmetrical. For comparing ‘before’ and ‘after’ performance on the same box, dd works well too.

          – Alexios
          Jan 11 '14 at 10:02













        • @Alexios - yes thanks for mentioning that. Yes you typically have to use at least hdparm + dd or just bonnie++ or all 3.

          – slm
          Jan 11 '14 at 14:14











        • Instead of sync which is questionable use iflag=direct oflag=direct when it is supposed (eg linux with a filesystem that supports direct io).

          – user112865
          May 3 '15 at 3:30














        • 1





          I like hdparm as well, for quick benchmarks. The only downside is it only benchmarks read bandwidth and the performance of many types of block devices (e.g. RAID, iSCSI) can be very asymmetrical. For comparing ‘before’ and ‘after’ performance on the same box, dd works well too.

          – Alexios
          Jan 11 '14 at 10:02













        • @Alexios - yes thanks for mentioning that. Yes you typically have to use at least hdparm + dd or just bonnie++ or all 3.

          – slm
          Jan 11 '14 at 14:14











        • Instead of sync which is questionable use iflag=direct oflag=direct when it is supposed (eg linux with a filesystem that supports direct io).

          – user112865
          May 3 '15 at 3:30








        1




        1





        I like hdparm as well, for quick benchmarks. The only downside is it only benchmarks read bandwidth and the performance of many types of block devices (e.g. RAID, iSCSI) can be very asymmetrical. For comparing ‘before’ and ‘after’ performance on the same box, dd works well too.

        – Alexios
        Jan 11 '14 at 10:02







        I like hdparm as well, for quick benchmarks. The only downside is it only benchmarks read bandwidth and the performance of many types of block devices (e.g. RAID, iSCSI) can be very asymmetrical. For comparing ‘before’ and ‘after’ performance on the same box, dd works well too.

        – Alexios
        Jan 11 '14 at 10:02















        @Alexios - yes thanks for mentioning that. Yes you typically have to use at least hdparm + dd or just bonnie++ or all 3.

        – slm
        Jan 11 '14 at 14:14





        @Alexios - yes thanks for mentioning that. Yes you typically have to use at least hdparm + dd or just bonnie++ or all 3.

        – slm
        Jan 11 '14 at 14:14













        Instead of sync which is questionable use iflag=direct oflag=direct when it is supposed (eg linux with a filesystem that supports direct io).

        – user112865
        May 3 '15 at 3:30





        Instead of sync which is questionable use iflag=direct oflag=direct when it is supposed (eg linux with a filesystem that supports direct io).

        – user112865
        May 3 '15 at 3:30













        22














        This is a very popular question (you can see variations of it on https://stackoverflow.com/q/1198691 , https://serverfault.com/q/219739/203726 and https://askubuntu.com/q/87035/740413 ).




        Are there better methods [than dd] to [benchmark disks]?




        Yes but they will take longer to run and require knowledge of how to interpret the results - there's no single number that will tell you everything in one go because the following influence the type of test you will run:




        • Are you interested in the performance of I/O that is random, sequential or some mix of the two?

        • Are you reading from or writing to the disk (or some mixture of the two)?

        • Are you concerned about latency, throughput or both?

        • Are you trying to understand how different parts of the same hard disk perform (generally speeds a faster closer to the centre of the disk)?

        • Are you interested in raw numbers or how a given filesystem will perform when using your disk?

        • Are you interested in how a particular size of I/O performs?

        • Are you submitting the I/O synchronously or asynchronously?

        • How much I/O are you submitting (submit too little the wrong way and all the I/O can be cached so you wind up testing the speed of your RAM rather than the speed of the disk)?

        • How compressible is the content of the data you are writing (e.g. zero only data is highly compressible and some filesystems/disks even have a special fast-path for zero only data leading to numbers that are unobtainable with other content)?


        And so on.



        Assuming you're interested in raw, non-filesystem benchmarks here's a short list of tools with easiest to run at the top and difficult/better/more thorough nearer the bottom:




        1. dd (sequential, only shows throughput but configured correctly it can be made to bypass the block cache/wait for I/O to be really completed)

        2. hdparm (sequential, only tests reads, only shows throughput, only works with ATA devices, doesn't account for filesystem overhead but configured correctly it can be made to bypass the block cache)

        3. GNOME Disk Utility's benchmark (easy to run, graphical but requires a full GNOME install, gives latency and throughput numbers for different types of I/O).


        4. fio (can do nearly anything and gives detailed results but requires configuration and an understanding of how to interpret said results). Here's what Linus says about it:


          Greg - get Jens' FIO code. It does things right, including writing actual pseudo-random contents, which shows if the disk does some "de-duplication" (aka "optimize for benchmarks):



          [ https://github.com/axboe/fio/ ]



          Anything else is suspect - forget about bonnie or other traditional tools.





        Source: comment left on Google Plus to Greg Kroah-Hartman by Linus Torvalds.






        share|improve this answer






























          22














          This is a very popular question (you can see variations of it on https://stackoverflow.com/q/1198691 , https://serverfault.com/q/219739/203726 and https://askubuntu.com/q/87035/740413 ).




          Are there better methods [than dd] to [benchmark disks]?




          Yes but they will take longer to run and require knowledge of how to interpret the results - there's no single number that will tell you everything in one go because the following influence the type of test you will run:




          • Are you interested in the performance of I/O that is random, sequential or some mix of the two?

          • Are you reading from or writing to the disk (or some mixture of the two)?

          • Are you concerned about latency, throughput or both?

          • Are you trying to understand how different parts of the same hard disk perform (generally speeds a faster closer to the centre of the disk)?

          • Are you interested in raw numbers or how a given filesystem will perform when using your disk?

          • Are you interested in how a particular size of I/O performs?

          • Are you submitting the I/O synchronously or asynchronously?

          • How much I/O are you submitting (submit too little the wrong way and all the I/O can be cached so you wind up testing the speed of your RAM rather than the speed of the disk)?

          • How compressible is the content of the data you are writing (e.g. zero only data is highly compressible and some filesystems/disks even have a special fast-path for zero only data leading to numbers that are unobtainable with other content)?


          And so on.



          Assuming you're interested in raw, non-filesystem benchmarks here's a short list of tools with easiest to run at the top and difficult/better/more thorough nearer the bottom:




          1. dd (sequential, only shows throughput but configured correctly it can be made to bypass the block cache/wait for I/O to be really completed)

          2. hdparm (sequential, only tests reads, only shows throughput, only works with ATA devices, doesn't account for filesystem overhead but configured correctly it can be made to bypass the block cache)

          3. GNOME Disk Utility's benchmark (easy to run, graphical but requires a full GNOME install, gives latency and throughput numbers for different types of I/O).


          4. fio (can do nearly anything and gives detailed results but requires configuration and an understanding of how to interpret said results). Here's what Linus says about it:


            Greg - get Jens' FIO code. It does things right, including writing actual pseudo-random contents, which shows if the disk does some "de-duplication" (aka "optimize for benchmarks):



            [ https://github.com/axboe/fio/ ]



            Anything else is suspect - forget about bonnie or other traditional tools.





          Source: comment left on Google Plus to Greg Kroah-Hartman by Linus Torvalds.






          share|improve this answer




























            22












            22








            22







            This is a very popular question (you can see variations of it on https://stackoverflow.com/q/1198691 , https://serverfault.com/q/219739/203726 and https://askubuntu.com/q/87035/740413 ).




            Are there better methods [than dd] to [benchmark disks]?




            Yes but they will take longer to run and require knowledge of how to interpret the results - there's no single number that will tell you everything in one go because the following influence the type of test you will run:




            • Are you interested in the performance of I/O that is random, sequential or some mix of the two?

            • Are you reading from or writing to the disk (or some mixture of the two)?

            • Are you concerned about latency, throughput or both?

            • Are you trying to understand how different parts of the same hard disk perform (generally speeds a faster closer to the centre of the disk)?

            • Are you interested in raw numbers or how a given filesystem will perform when using your disk?

            • Are you interested in how a particular size of I/O performs?

            • Are you submitting the I/O synchronously or asynchronously?

            • How much I/O are you submitting (submit too little the wrong way and all the I/O can be cached so you wind up testing the speed of your RAM rather than the speed of the disk)?

            • How compressible is the content of the data you are writing (e.g. zero only data is highly compressible and some filesystems/disks even have a special fast-path for zero only data leading to numbers that are unobtainable with other content)?


            And so on.



            Assuming you're interested in raw, non-filesystem benchmarks here's a short list of tools with easiest to run at the top and difficult/better/more thorough nearer the bottom:




            1. dd (sequential, only shows throughput but configured correctly it can be made to bypass the block cache/wait for I/O to be really completed)

            2. hdparm (sequential, only tests reads, only shows throughput, only works with ATA devices, doesn't account for filesystem overhead but configured correctly it can be made to bypass the block cache)

            3. GNOME Disk Utility's benchmark (easy to run, graphical but requires a full GNOME install, gives latency and throughput numbers for different types of I/O).


            4. fio (can do nearly anything and gives detailed results but requires configuration and an understanding of how to interpret said results). Here's what Linus says about it:


              Greg - get Jens' FIO code. It does things right, including writing actual pseudo-random contents, which shows if the disk does some "de-duplication" (aka "optimize for benchmarks):



              [ https://github.com/axboe/fio/ ]



              Anything else is suspect - forget about bonnie or other traditional tools.





            Source: comment left on Google Plus to Greg Kroah-Hartman by Linus Torvalds.






            share|improve this answer















            This is a very popular question (you can see variations of it on https://stackoverflow.com/q/1198691 , https://serverfault.com/q/219739/203726 and https://askubuntu.com/q/87035/740413 ).




            Are there better methods [than dd] to [benchmark disks]?




            Yes but they will take longer to run and require knowledge of how to interpret the results - there's no single number that will tell you everything in one go because the following influence the type of test you will run:




            • Are you interested in the performance of I/O that is random, sequential or some mix of the two?

            • Are you reading from or writing to the disk (or some mixture of the two)?

            • Are you concerned about latency, throughput or both?

            • Are you trying to understand how different parts of the same hard disk perform (generally speeds a faster closer to the centre of the disk)?

            • Are you interested in raw numbers or how a given filesystem will perform when using your disk?

            • Are you interested in how a particular size of I/O performs?

            • Are you submitting the I/O synchronously or asynchronously?

            • How much I/O are you submitting (submit too little the wrong way and all the I/O can be cached so you wind up testing the speed of your RAM rather than the speed of the disk)?

            • How compressible is the content of the data you are writing (e.g. zero only data is highly compressible and some filesystems/disks even have a special fast-path for zero only data leading to numbers that are unobtainable with other content)?


            And so on.



            Assuming you're interested in raw, non-filesystem benchmarks here's a short list of tools with easiest to run at the top and difficult/better/more thorough nearer the bottom:




            1. dd (sequential, only shows throughput but configured correctly it can be made to bypass the block cache/wait for I/O to be really completed)

            2. hdparm (sequential, only tests reads, only shows throughput, only works with ATA devices, doesn't account for filesystem overhead but configured correctly it can be made to bypass the block cache)

            3. GNOME Disk Utility's benchmark (easy to run, graphical but requires a full GNOME install, gives latency and throughput numbers for different types of I/O).


            4. fio (can do nearly anything and gives detailed results but requires configuration and an understanding of how to interpret said results). Here's what Linus says about it:


              Greg - get Jens' FIO code. It does things right, including writing actual pseudo-random contents, which shows if the disk does some "de-duplication" (aka "optimize for benchmarks):



              [ https://github.com/axboe/fio/ ]



              Anything else is suspect - forget about bonnie or other traditional tools.





            Source: comment left on Google Plus to Greg Kroah-Hartman by Linus Torvalds.







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited 20 mins ago

























            answered Jun 22 '14 at 7:59









            AnonAnon

            1,65414 silver badges21 bronze badges




            1,65414 silver badges21 bronze badges























                12














                with the IOPS tool



                If you can't be bothered to read all this I'd just recommend the IOPS tool. It will tell you real-world speed depending on block size.





                Otherwise - when doing an IO benchmark I would look at the following things:




                • blocksize/cache/IOPS/direct vs buffered/async vs sync

                • read/write

                • threads

                • latency

                • CPU utilization



                • Which blocksize will you use: If you want to read/write 1 GB from/to disk this will be quick if you do one I/O operation. But if your application needs to write in 512 byte chunks all over the harddisk in non-sequential pieces (called random I/O although it is not random) this will look differently. Now, databases will do random I/O for the data volume and sequential I/O for the log volume due to their nature. So, first you need to become clear what you want to measure. If you want to copy large video files that's different than if you want to install Linux.



                  This blocksize is effecting the count of I/O operations you do. If you do e.g. 8 sequential read (or write, just not mixed) operations the I/O scheduler of the OS will merge them. If it does not, the controller's cache will do the merge. There is practically no difference if you read 8 sequential blocks of 512 bytes or one 4096 bytes chunk. One exception - if you manage to do direct sync IO and wait for the 512 bytes before you request the next 512 bytes. In this case, increasing the block size is like adding cache.



                  Also you should be aware that there is sync and async IO: With sync IO you will not issue the next IO request before the current one returns. With async IO you can request e.g. 10 chunks of data and then wait as they arrive. Disctinct database threads will typically use sync IO for log and async IO for data. The IOPS tool takes care of that by measuring all relevant block sizes starting from 512 bytes.




                • Will you read or write: Usually reading is faster than writing. But note that caching works quite a different way for reads and writes:




                  • For writes, the data will be handed over to the controller and if it caches, it will acknowledge before the data is on disk unless the cache is full. Using the tool iozone you can draw beautiful graphs of plateaus of cache effects (CPU cache effect and buffer cache effect). The caches becomes less efficient the more has been written.


                  • For reads, read data is held in cache after the first read. The first reads take longest and caching becomes more and more effective during uptime. Noteable caches are the CPU cache, the OS' file system cache, the IO controller's cache and the storage's cache. The IOPS tool only measures reads. This allows it to "read all over the place" and you do not want it to write instead of read.




                • How many threads will you use: If you use one thread (using dd for disk benchmarks) you will probably get a much worse performance than with several threads. The IOPS tool takes this into account and reads on several threads.


                • How important is latency for you: Looking at databases, IO latency becomes enormously important. Any insert/update/delete SQL command will be written into the database journal ("log" in database lingo) on commit before it is acknowledged. This means the complete database may be waiting for this IO operation to be completed. I show here how to measure the average wait time (await) using the iostat tool.


                • How important is CPU utilization for you: Your CPU may easily become the bottleneck for your application's performance. In this case you must know how much CPU cycles get burned per byte read/written and optimize into that direction. This can mean to decide for/against PCIe flash memory depending on your measurement results. Again the iostat tool can give you a rough estimation on CPU utilization by your IO operations.







share|improve this answer
edited Jan 19 '14 at 14:50

answered Jan 19 '14 at 11:06

Thorsten Staerk
2,402 • 1 gold badge • 14 silver badges • 24 bronze badges
                • 1





The iops script is nice; I was confused that it isn't on apt or pip, but it does work.

                  – ThorSummoner
                  Jan 23 '16 at 5:53











• The iops tool seems to be abandoned. Also, it only measures reads and doesn't print any statistical figures (e.g. stddev or quantiles).

                  – maxschlepzig
                  Apr 8 '17 at 10:42











• The iops tool is simple, and that is what you need for comparability. It is basically a wrapper around the read syscall, done at random offsets on a file (everything is a file). Believe it or read the source: it is finished and the code does not need an update. Think about it - do you really want another tool like IOMeter, with thousands of lines of code where each one is debatable? And what do you do with a new version - will you have to redo all your benchmarks?

                  – Thorsten Staerk
                  Apr 10 '17 at 18:55














                8














If you have installed PostgreSQL, you can use their excellent pg_test_fsync benchmark. It basically tests your write sync performance.

On Ubuntu you find it here: /usr/lib/postgresql/9.5/bin/pg_test_fsync

The great thing about it is that this tool will show you why enterprise SSDs are worth the extra money.
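A minimal usage sketch - the -s (seconds per test) and -f (test file) flags are standard pg_test_fsync options, and the path is a placeholder for a file on the disk under test:

# Run each fsync method test for 2 seconds against the disk under test.
/usr/lib/postgresql/9.5/bin/pg_test_fsync -s 2 -f /mnt/testdisk/pgtest.out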






share|improve this answer
answered Nov 17 '16 at 22:47

Olav Grønås Gjerde
181 • 1 silver badge • 3 bronze badges








                • 2





On Debian it is available in the postgresql-contrib package.

                  – TranslucentCloud
                  Mar 15 '17 at 18:27














                3














You can use fio - the Multithreaded IO generation tool. It is packaged by several distributions, e.g. Fedora 25, Debian and OpenCSW.

The fio tool is very flexible: it can easily be used to benchmark various IO
scenarios, including concurrent ones. The package comes with some example
configuration files (cf. e.g. /usr/share/doc/fio/examples). It measures things
properly, i.e. it also prints the standard deviation and quantile statistics
for some figures - things some other popular benchmarking tools don't care
about.



                A simple example (a sequence of simple scenarios: sequential/random X read/write):



                $ cat fio.cfg
                [global]
                size=1g
                filename=/dev/sdz

                [randwrite]
                rw=randwrite

                [randread]
                wait_for=randwrite
                rw=randread
                size=256m

                [seqread]
                wait_for=randread
                rw=read

                [seqwrite]
                wait_for=seqread
                rw=write


                The call:



                # fio -o fio-seagate-usb-xyz.log fio.cfg
                $ cat fio-seagate-usb-xyz.log
                [..]
                randwrite: (groupid=0, jobs=1): err= 0: pid=11858: Sun Apr 2 21:23:30 2017
                write: io=1024.0MB, bw=16499KB/s, iops=4124, runt= 63552msec
                clat (usec): min=1, max=148280, avg=240.21, stdev=2216.91
                lat (usec): min=1, max=148280, avg=240.49, stdev=2216.91
                clat percentiles (usec):
                | 1.00th=[ 2], 5.00th=[ 2], 10.00th=[ 2], 20.00th=[ 7],
                | 30.00th=[ 10], 40.00th=[ 11], 50.00th=[ 11], 60.00th=[ 12],
                | 70.00th=[ 14], 80.00th=[ 16], 90.00th=[ 19], 95.00th=[ 25],
                | 99.00th=[ 9408], 99.50th=[10432], 99.90th=[21888], 99.95th=[38144],
                | 99.99th=[92672]
                bw (KB /s): min= 7143, max=371874, per=45.77%, avg=15104.53, stdev=32105.17
                lat (usec) : 2=0.20%, 4=15.36%, 10=6.58%, 20=69.35%, 50=6.07%
                lat (usec) : 100=0.49%, 250=0.07%, 500=0.01%, 750=0.01%
                lat (msec) : 4=0.01%, 10=1.20%, 20=0.54%, 50=0.08%, 100=0.03%
                lat (msec) : 250=0.01%
                cpu : usr=1.04%, sys=4.79%, ctx=4977, majf=0, minf=11
                IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
                submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
                complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
                issued : total=r=0/w=262144/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
                latency : target=0, window=0, percentile=100.00%, depth=1
                randread: (groupid=0, jobs=1): err= 0: pid=11876: Sun Apr 2 21:23:30 2017
                read : io=262144KB, bw=797863B/s, iops=194, runt=336443msec
                [..]
                bw (KB /s): min= 312, max= 4513, per=15.19%, avg=591.51, stdev=222.35
                [..]


Note that the [global] section has global defaults that can be overridden by
other sections. Each section describes a job; the section name is the job name
and can be freely chosen. By default, different jobs are started in parallel,
so the above example explicitly serializes the job execution with the
wait_for key. Also, fio uses a default block size of 4 KiB, which can be
changed as well. The example uses the raw device directly for the read and
write jobs, so make sure you specify the right device - its contents will be
overwritten. The tool also supports using a file or directory on an existing
filesystem.
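For a quick one-off run you can also pass the job parameters on the command line instead of writing a config file. A sketch, with /dev/sdz again as a placeholder; direct=1 bypasses the page cache (see also the comments below):

# Random 4 KiB reads, 256 MiB total, bypassing the page cache.
fio --name=randread --filename=/dev/sdz --rw=randread --bs=4k --size=256m --direct=1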



                Other Tools



                The hdparm utility provides a very simple read benchmark, e.g.:



                # hdparm -t -T /dev/sdz


It's not a replacement for a state-of-the-art benchmarking tool like fio; it
should just be used as a first plausibility check. For example, to check
whether an external USB 3 drive is wrongly recognized as a USB 2 device (you
would then see rates of ~ 30 MiB/s instead of ~ 100 MiB/s).






share|improve this answer
edited Apr 8 '17 at 11:14

answered Apr 8 '17 at 10:37

maxschlepzig
35.8k • 35 gold badges • 146 silver badges • 221 bronze badges








                • 1





This answer is essentially a different version of the summary answer unix.stackexchange.com/a/138516/134856 (but with an expanded fio section). I'm torn because it does provide an fio summary, but it's quite long and you might be able to get away with linking to fio.readthedocs.io/en/latest/fio_doc.html#job-file-format ...

                  – Anon
                  Jun 30 '17 at 3:28











                • PS: I'd recommend adding direct=1 to the global section of your job so you bypass Linux's page cache and only see the disk's speed (but since your iodepth is only 1... [insert discussion about submitting disk I/O]). It's also easier to use stonewall (fio.readthedocs.io/en/latest/… ) globally to make all jobs run sequentially.

                  – Anon
                  Jun 30 '17 at 3:41














                1














As pointed out here, you can use gnome-disks (if you use GNOME).

Click the drive that you want to test, then click "Additional partition options" (the gear icon) and choose "Benchmark Partition". You'll get average read/write rates in MB/s and average access times in milliseconds. I found that very convenient.
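If you don't find it in your menu, the tool can usually also be launched from a terminal (assuming the standard binary name used by current distributions):

gnome-disks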






share|improve this answer
edited Apr 13 '17 at 12:22

Community
1

answered Dec 22 '16 at 15:24

Sören
129 • 6 bronze badges























                        1














                        It's a little crude, but this works in a pinch:



                        find <path> -type f -print0 | cpio -0o >/dev/null


                        You can do some interesting things with this technique, including caching all the /lib and /usr/bin files. You can also use this as part of a benchmarking effort:



# sort -R -z shuffles the NUL-separated name list, keeping it intact for cpio -0
find / -xdev -type f -print0 |
sort -R -z |
timeout "5m" cpio -0o >/dev/null


All filenames on the root filesystem are found, sorted randomly, and copied into cache for up to 5 minutes (the timeout above). The output from cpio tells you how many blocks were copied. Repeat 3 times to get an average of blocks per minute. (Note: the find/sort operation may take a long time -- much longer than the copy. It would be better to cache the result of the find/sort and take a sample of files from it; see the sketch below.)
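A minimal sketch of that suggestion, assuming GNU coreutils (head -z needs coreutils 8.25 or newer); the list path and sample size are arbitrary:

# Build the shuffled, NUL-separated file list once (slow, but reusable).
find / -xdev -type f -print0 | sort -R -z > /tmp/filelist0

# Each benchmark pass then reads a 1000-file sample from the cached list.
head -z -n 1000 /tmp/filelist0 | timeout "5m" cpio -0o >/dev/null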






share|improve this answer
answered Mar 16 '17 at 11:20

Otheus
3,634 • 11 silver badges • 34 bronze badges





























