Why doesn't an NVMe connection on an SSD make non-sequential access faster?NVMe ssd: Why is 4k writing faster...

How to create large inductors (1H) for audio use?

Why Is Sojdlg123aljg a Common Password?

Temporarily simulate being offline programmatically

"syntax error near unexpected token" after editing .bashrc

Professor refuses to write a recommendation letter to students who haven't written a research paper with him

How do German speakers decide what should be on the left side of the verb?

Remaining in the US beyond VWP admission period

Notation: grace note played on the beat with a chord

Sinning and G-d's will, what's wrong with this logic?

Balm of the Summer Court fey energy dice usage limits

convenient Vector3f class

Male viewpoint in an erotic novel

In-universe, why does Doc Brown program the time machine to go to 1955?

Add builder hat to other people with tikzpeople

Was the lunar landing site always in the same plane as the CM's orbit?

Do I need to declare engagement ring bought in UK when flying on holiday to US?

How do I make my fill-in-the-blank exercise more obvious?

Dissuading my girlfriend from a scam

Friend is very nit picky about side comments I don't intend to be taken too seriously

Can taking my 1-week-old on a 6-7 hours journey in the car lead to medical complications?

How could a planet have one hemisphere way warmer than the other without the planet being tidally locked?

Was Rosie the Riveter sourced from a Michelangelo painting?

Why there is no wireless switch?

Are there mathematical concepts that exist in the fourth dimension, but not in the third dimension?



Why doesn't an NVMe connection on an SSD make non-sequential access faster?


NVMe ssd: Why is 4k writing faster than reading?How can I use my small SSD as a cache for a larger hard disk?ExpressCache vs. Intel's Rapid Storage and Smart Response TechnologiesDo I need to connect the jumper pins in a SATA hard drive to anything?NVMe ssd: Why is 4k writing faster than reading?Identical SSDs on same port: Why is one SATA/600 and the other SATA/150?NVMe SSD and Windows NTFS compression - effects on performance?Different rpm but same speed?






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ margin-bottom:0;
}







1















Why would it matter if you’re transferring a 1GB file at quintuple the speed (of SATA SSDs) or transferring 1,000 1MB files at quintuple the speed? It should always amount to being quintuple the speed of the SSD.



But in real life non-sequential access turns out to have only little benefit over a SATA SSD.



EDIT



Since the answers are concentrating on the difference between large and small files, let me clarify my question:




  • Yes, small files will have overhead.

  • And yes, they will waste time by reading data that will be ignored.

  • But this is irrelevant to my question since every read and write (including those pesky little MFT writes etc…) will (or rather should) see the x5 speed gain.


Saying that there is wasted drive access doesn't change that. I'm not asking why is 1GB not as fast as 1000 1MBs. I'm asking why:




(1GB_NVMe / 1GB_SSD) != (1000x1MB_NVMe / 1000x1MB_SSD)











share|improve this question






















  • 1





    in real life... You say that like it's a universal truth. Can you substantiate this claim? Please do not only respond in the comments. Instead, edit the post with this information.

    – Twisty Impersonator
    8 hours ago













  • @TwistyImpersonator I've searched the web for a contradictory source and only found one lead which turned out to be useless. It seems like there's practically no disagreement about that. That's why I omitted that from the question. Just like I omitted the fact that SSDs are faster than HDDs.

    – ispiro
    7 hours ago











  • Seek latency becomes a real problem with random reads and all the speed benefits of SSDs are lost with tiny reads: superuser.com/a/1168029/19943 This question feels like a slightly differently phrased duplicate of that one...

    – Mokubai
    7 hours ago













  • @Mokubai Every seek should see the speed difference. Your answer there explains the difference between small and large files. Not the difference between different types of drives. Every seek, every write amplification, every part should see the speed gain.

    – ispiro
    7 hours ago






  • 1





    @JakeGould Thanks. Point taken. I edited that line. Did you mean there was another part where the tone was harsh?

    – ispiro
    7 hours ago


















1















Why would it matter if you’re transferring a 1GB file at quintuple the speed (of SATA SSDs) or transferring 1,000 1MB files at quintuple the speed? It should always amount to being quintuple the speed of the SSD.



But in real life non-sequential access turns out to have only little benefit over a SATA SSD.



EDIT



Since the answers are concentrating on the difference between large and small files, let me clarify my question:




  • Yes, small files will have overhead.

  • And yes, they will waste time by reading data that will be ignored.

  • But this is irrelevant to my question since every read and write (including those pesky little MFT writes etc…) will (or rather should) see the x5 speed gain.


Saying that there is wasted drive access doesn't change that. I'm not asking why is 1GB not as fast as 1000 1MBs. I'm asking why:




(1GB_NVMe / 1GB_SSD) != (1000x1MB_NVMe / 1000x1MB_SSD)











share|improve this question






















  • 1





    in real life... You say that like it's a universal truth. Can you substantiate this claim? Please do not only respond in the comments. Instead, edit the post with this information.

    – Twisty Impersonator
    8 hours ago













  • @TwistyImpersonator I've searched the web for a contradictory source and only found one lead which turned out to be useless. It seems like there's practically no disagreement about that. That's why I omitted that from the question. Just like I omitted the fact that SSDs are faster than HDDs.

    – ispiro
    7 hours ago











  • Seek latency becomes a real problem with random reads and all the speed benefits of SSDs are lost with tiny reads: superuser.com/a/1168029/19943 This question feels like a slightly differently phrased duplicate of that one...

    – Mokubai
    7 hours ago













  • @Mokubai Every seek should see the speed difference. Your answer there explains the difference between small and large files. Not the difference between different types of drives. Every seek, every write amplification, every part should see the speed gain.

    – ispiro
    7 hours ago






  • 1





    @JakeGould Thanks. Point taken. I edited that line. Did you mean there was another part where the tone was harsh?

    – ispiro
    7 hours ago














1












1








1








Why would it matter if you’re transferring a 1GB file at quintuple the speed (of SATA SSDs) or transferring 1,000 1MB files at quintuple the speed? It should always amount to being quintuple the speed of the SSD.



But in real life non-sequential access turns out to have only little benefit over a SATA SSD.



EDIT



Since the answers are concentrating on the difference between large and small files, let me clarify my question:




  • Yes, small files will have overhead.

  • And yes, they will waste time by reading data that will be ignored.

  • But this is irrelevant to my question since every read and write (including those pesky little MFT writes etc…) will (or rather should) see the x5 speed gain.


Saying that there is wasted drive access doesn't change that. I'm not asking why is 1GB not as fast as 1000 1MBs. I'm asking why:




(1GB_NVMe / 1GB_SSD) != (1000x1MB_NVMe / 1000x1MB_SSD)











share|improve this question
















Why would it matter if you’re transferring a 1GB file at quintuple the speed (of SATA SSDs) or transferring 1,000 1MB files at quintuple the speed? It should always amount to being quintuple the speed of the SSD.



But in real life non-sequential access turns out to have only little benefit over a SATA SSD.



EDIT



Since the answers are concentrating on the difference between large and small files, let me clarify my question:




  • Yes, small files will have overhead.

  • And yes, they will waste time by reading data that will be ignored.

  • But this is irrelevant to my question since every read and write (including those pesky little MFT writes etc…) will (or rather should) see the x5 speed gain.


Saying that there is wasted drive access doesn't change that. I'm not asking why is 1GB not as fast as 1000 1MBs. I'm asking why:




(1GB_NVMe / 1GB_SSD) != (1000x1MB_NVMe / 1000x1MB_SSD)








hard-drive ssd sata nvme






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 6 hours ago







ispiro

















asked 8 hours ago









ispiroispiro

7123 gold badges13 silver badges34 bronze badges




7123 gold badges13 silver badges34 bronze badges











  • 1





    in real life... You say that like it's a universal truth. Can you substantiate this claim? Please do not only respond in the comments. Instead, edit the post with this information.

    – Twisty Impersonator
    8 hours ago













  • @TwistyImpersonator I've searched the web for a contradictory source and only found one lead which turned out to be useless. It seems like there's practically no disagreement about that. That's why I omitted that from the question. Just like I omitted the fact that SSDs are faster than HDDs.

    – ispiro
    7 hours ago











  • Seek latency becomes a real problem with random reads and all the speed benefits of SSDs are lost with tiny reads: superuser.com/a/1168029/19943 This question feels like a slightly differently phrased duplicate of that one...

    – Mokubai
    7 hours ago













  • @Mokubai Every seek should see the speed difference. Your answer there explains the difference between small and large files. Not the difference between different types of drives. Every seek, every write amplification, every part should see the speed gain.

    – ispiro
    7 hours ago






  • 1





    @JakeGould Thanks. Point taken. I edited that line. Did you mean there was another part where the tone was harsh?

    – ispiro
    7 hours ago














  • 1





    in real life... You say that like it's a universal truth. Can you substantiate this claim? Please do not only respond in the comments. Instead, edit the post with this information.

    – Twisty Impersonator
    8 hours ago













  • @TwistyImpersonator I've searched the web for a contradictory source and only found one lead which turned out to be useless. It seems like there's practically no disagreement about that. That's why I omitted that from the question. Just like I omitted the fact that SSDs are faster than HDDs.

    – ispiro
    7 hours ago











  • Seek latency becomes a real problem with random reads and all the speed benefits of SSDs are lost with tiny reads: superuser.com/a/1168029/19943 This question feels like a slightly differently phrased duplicate of that one...

    – Mokubai
    7 hours ago













  • @Mokubai Every seek should see the speed difference. Your answer there explains the difference between small and large files. Not the difference between different types of drives. Every seek, every write amplification, every part should see the speed gain.

    – ispiro
    7 hours ago






  • 1





    @JakeGould Thanks. Point taken. I edited that line. Did you mean there was another part where the tone was harsh?

    – ispiro
    7 hours ago








1




1





in real life... You say that like it's a universal truth. Can you substantiate this claim? Please do not only respond in the comments. Instead, edit the post with this information.

– Twisty Impersonator
8 hours ago







in real life... You say that like it's a universal truth. Can you substantiate this claim? Please do not only respond in the comments. Instead, edit the post with this information.

– Twisty Impersonator
8 hours ago















@TwistyImpersonator I've searched the web for a contradictory source and only found one lead which turned out to be useless. It seems like there's practically no disagreement about that. That's why I omitted that from the question. Just like I omitted the fact that SSDs are faster than HDDs.

– ispiro
7 hours ago





@TwistyImpersonator I've searched the web for a contradictory source and only found one lead which turned out to be useless. It seems like there's practically no disagreement about that. That's why I omitted that from the question. Just like I omitted the fact that SSDs are faster than HDDs.

– ispiro
7 hours ago













Seek latency becomes a real problem with random reads and all the speed benefits of SSDs are lost with tiny reads: superuser.com/a/1168029/19943 This question feels like a slightly differently phrased duplicate of that one...

– Mokubai
7 hours ago







Seek latency becomes a real problem with random reads and all the speed benefits of SSDs are lost with tiny reads: superuser.com/a/1168029/19943 This question feels like a slightly differently phrased duplicate of that one...

– Mokubai
7 hours ago















@Mokubai Every seek should see the speed difference. Your answer there explains the difference between small and large files. Not the difference between different types of drives. Every seek, every write amplification, every part should see the speed gain.

– ispiro
7 hours ago





@Mokubai Every seek should see the speed difference. Your answer there explains the difference between small and large files. Not the difference between different types of drives. Every seek, every write amplification, every part should see the speed gain.

– ispiro
7 hours ago




1




1





@JakeGould Thanks. Point taken. I edited that line. Did you mean there was another part where the tone was harsh?

– ispiro
7 hours ago





@JakeGould Thanks. Point taken. I edited that line. Did you mean there was another part where the tone was harsh?

– ispiro
7 hours ago










2 Answers
2






active

oldest

votes


















3
















The problem here is that while NVMe and SSDs in general are faster than spinning rust due to using flash memories, the ability of NVMe to transfer multiple gigabytes of data per second is due to the way the flash memory is arranged around the controller.



Fast flash devices effectively use a scheme similar to RAID0 across what are (on their own) simply fast flash memory chips. Each chip on its own can handle a certain speed, but tied together with its siblings can achieve a much higher aggregate speed by having data written to and read from multiple devices simultaneously.



Effectively large transfers can take advantage of transfer parallelism and request multiple blocks from multiple chips and so reduce what would be 8 seeks times down to a single seek (across multiple chips) along with a larger transfer. The controller will have buffering and queueing to be able to sequentially stream out the data in whichever direction is required.



The individual flash chips themselves may also be configured to read ahead a few blocks for future requests and (for writes) cache it in a small internal buffer to further reduce delays for future requests.



The problem with working with lots of small files is that it ends up defeating all of the smarts used to achieve a single massive transfer. The controller has to operate in a queue going between flash devices requesting a block of data, waiting for a response, looking at the next item in the queue, requesting that data and so on.



If the data being read or written is on another chip then it might be able to use multiple channels but if a lot of the requests end up on the same chip for a period, as it could for lots of small writes, then what you end up seeing is the performance of a single flash chip rather than the full performance of an array of chips.



So thousands of small reads or writes could actually show you the performance of only a small part of your NVMe device, rather than what the device is fully capable of under so-called "perfect" conditions.






share|improve this answer


























  • There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.

    – harrymc
    7 hours ago













  • Thanks. This makes sense.

    – ispiro
    6 hours ago











  • @harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?

    – ispiro
    6 hours ago








  • 1





    @harrymc that could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency, the wear leveller or "flash transition layer" that is wholly within the NVMe controller. For large reads and writes again this becomes a single check and then burst out to the memory devices, for a queue of things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not meaning to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance wise.

    – Mokubai
    6 hours ago






  • 1





    @ispiro performance SATA SSDs would have effectively the same internals as NVMe, you just don't really notice it because it is bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but for best case the NVMe can go way ahead.

    – Mokubai
    6 hours ago



















1
















Copying many files involves the overhead of creating their entries in the
Master File Table (MFT)
which is an integral component of the NTFS file system
(with equivalents for other file-systems than NTFS).



This means that creating a file entails first searching the MFT for the name,
in order to avoid duplicates, then allocating the space, copying the file,
and finally completing the entry in the MFT.



It is this bookkeeping overhead that slows down dramatically the copying of
many files.
The overhead involves matching work in the operating system,
updating RAM tables, computer interrupts, system calls etc etc.
Closing a file also causes the operating system to flush it to the disk,
and that takes time and disturbs the smooth copying of the data,
not letting NVMe achieve performance that is closer to its potential.



Note: This problem of the slow copy of many files is not unique to NVMe.
We see exactly the same problem for any kind of fast disk, either mechanical
or SSD, SATA or NVMe.
This to my way of thinking proves that the problem is with inefficient OS
handling of this case, perhaps because of inefficient cache memory
algorithms and/or disk-driver.






share|improve this answer




























  • This shouldn't matter because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would, indeed, be the same, is negligible.)

    – ispiro
    8 hours ago













  • I added more info to the answer.

    – harrymc
    8 hours ago











  • updating RAM tables shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for computer interrupts. Closing a file also causes the operating system to flush it to the disk - this should be 5 times (or whatever) as fast. system calls - they contain 2 parts: IO - which should be with that speed gain, and non-IO which should be negligible. Everything breaks up into 2: IO - where we should see the full speed gain, and non-IO which should be negligible. Unless: NVMe is really reaching near RAM speed.

    – ispiro
    8 hours ago













  • It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.

    – harrymc
    6 hours ago














Your Answer








StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "3"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});















draft saved

draft discarded
















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsuperuser.com%2fquestions%2f1479346%2fwhy-doesnt-an-nvme-connection-on-an-ssd-make-non-sequential-access-faster%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























2 Answers
2






active

oldest

votes








2 Answers
2






active

oldest

votes









active

oldest

votes






active

oldest

votes









3
















The problem here is that while NVMe and SSDs in general are faster than spinning rust due to using flash memories, the ability of NVMe to transfer multiple gigabytes of data per second is due to the way the flash memory is arranged around the controller.



Fast flash devices effectively use a scheme similar to RAID0 across what are (on their own) simply fast flash memory chips. Each chip on its own can handle a certain speed, but tied together with its siblings can achieve a much higher aggregate speed by having data written to and read from multiple devices simultaneously.



Effectively large transfers can take advantage of transfer parallelism and request multiple blocks from multiple chips and so reduce what would be 8 seeks times down to a single seek (across multiple chips) along with a larger transfer. The controller will have buffering and queueing to be able to sequentially stream out the data in whichever direction is required.



The individual flash chips themselves may also be configured to read ahead a few blocks for future requests and (for writes) cache it in a small internal buffer to further reduce delays for future requests.



The problem with working with lots of small files is that it ends up defeating all of the smarts used to achieve a single massive transfer. The controller has to operate in a queue going between flash devices requesting a block of data, waiting for a response, looking at the next item in the queue, requesting that data and so on.



If the data being read or written is on another chip then it might be able to use multiple channels but if a lot of the requests end up on the same chip for a period, as it could for lots of small writes, then what you end up seeing is the performance of a single flash chip rather than the full performance of an array of chips.



So thousands of small reads or writes could actually show you the performance of only a small part of your NVMe device, rather than what the device is fully capable of under so-called "perfect" conditions.






share|improve this answer


























  • There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.

    – harrymc
    7 hours ago













  • Thanks. This makes sense.

    – ispiro
    6 hours ago











  • @harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?

    – ispiro
    6 hours ago








  • 1





    @harrymc that could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency, the wear leveller or "flash transition layer" that is wholly within the NVMe controller. For large reads and writes again this becomes a single check and then burst out to the memory devices, for a queue of things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not meaning to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance wise.

    – Mokubai
    6 hours ago






  • 1





    @ispiro performance SATA SSDs would have effectively the same internals as NVMe, you just don't really notice it because it is bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but for best case the NVMe can go way ahead.

    – Mokubai
    6 hours ago
















3
















The problem here is that while NVMe and SSDs in general are faster than spinning rust due to using flash memories, the ability of NVMe to transfer multiple gigabytes of data per second is due to the way the flash memory is arranged around the controller.



Fast flash devices effectively use a scheme similar to RAID0 across what are (on their own) simply fast flash memory chips. Each chip on its own can handle a certain speed, but tied together with its siblings can achieve a much higher aggregate speed by having data written to and read from multiple devices simultaneously.



Effectively large transfers can take advantage of transfer parallelism and request multiple blocks from multiple chips and so reduce what would be 8 seeks times down to a single seek (across multiple chips) along with a larger transfer. The controller will have buffering and queueing to be able to sequentially stream out the data in whichever direction is required.



The individual flash chips themselves may also be configured to read ahead a few blocks for future requests and (for writes) cache it in a small internal buffer to further reduce delays for future requests.



The problem with working with lots of small files is that it ends up defeating all of the smarts used to achieve a single massive transfer. The controller has to operate in a queue going between flash devices requesting a block of data, waiting for a response, looking at the next item in the queue, requesting that data and so on.



If the data being read or written is on another chip then it might be able to use multiple channels but if a lot of the requests end up on the same chip for a period, as it could for lots of small writes, then what you end up seeing is the performance of a single flash chip rather than the full performance of an array of chips.



So thousands of small reads or writes could actually show you the performance of only a small part of your NVMe device, rather than what the device is fully capable of under so-called "perfect" conditions.






share|improve this answer


























  • There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.

    – harrymc
    7 hours ago













  • Thanks. This makes sense.

    – ispiro
    6 hours ago











  • @harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?

    – ispiro
    6 hours ago








  • 1





    @harrymc that could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency, the wear leveller or "flash transition layer" that is wholly within the NVMe controller. For large reads and writes again this becomes a single check and then burst out to the memory devices, for a queue of things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not meaning to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance wise.

    – Mokubai
    6 hours ago






  • 1





    @ispiro performance SATA SSDs would have effectively the same internals as NVMe, you just don't really notice it because it is bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but for best case the NVMe can go way ahead.

    – Mokubai
    6 hours ago














3














3










3









The problem here is that while NVMe and SSDs in general are faster than spinning rust due to using flash memories, the ability of NVMe to transfer multiple gigabytes of data per second is due to the way the flash memory is arranged around the controller.



Fast flash devices effectively use a scheme similar to RAID0 across what are (on their own) simply fast flash memory chips. Each chip on its own can handle a certain speed, but tied together with its siblings can achieve a much higher aggregate speed by having data written to and read from multiple devices simultaneously.



Effectively large transfers can take advantage of transfer parallelism and request multiple blocks from multiple chips and so reduce what would be 8 seeks times down to a single seek (across multiple chips) along with a larger transfer. The controller will have buffering and queueing to be able to sequentially stream out the data in whichever direction is required.



The individual flash chips themselves may also be configured to read ahead a few blocks for future requests and (for writes) cache it in a small internal buffer to further reduce delays for future requests.



The problem with working with lots of small files is that it ends up defeating all of the smarts used to achieve a single massive transfer. The controller has to operate in a queue going between flash devices requesting a block of data, waiting for a response, looking at the next item in the queue, requesting that data and so on.



If the data being read or written is on another chip then it might be able to use multiple channels but if a lot of the requests end up on the same chip for a period, as it could for lots of small writes, then what you end up seeing is the performance of a single flash chip rather than the full performance of an array of chips.



So thousands of small reads or writes could actually show you the performance of only a small part of your NVMe device, rather than what the device is fully capable of under so-called "perfect" conditions.






share|improve this answer













The problem here is that while NVMe and SSDs in general are faster than spinning rust due to using flash memories, the ability of NVMe to transfer multiple gigabytes of data per second is due to the way the flash memory is arranged around the controller.



Fast flash devices effectively use a scheme similar to RAID0 across what are (on their own) simply fast flash memory chips. Each chip on its own can handle a certain speed, but tied together with its siblings can achieve a much higher aggregate speed by having data written to and read from multiple devices simultaneously.



Effectively large transfers can take advantage of transfer parallelism and request multiple blocks from multiple chips and so reduce what would be 8 seeks times down to a single seek (across multiple chips) along with a larger transfer. The controller will have buffering and queueing to be able to sequentially stream out the data in whichever direction is required.



The individual flash chips themselves may also be configured to read ahead a few blocks for future requests and (for writes) cache it in a small internal buffer to further reduce delays for future requests.



The problem with working with lots of small files is that it ends up defeating all of the smarts used to achieve a single massive transfer. The controller has to operate in a queue going between flash devices requesting a block of data, waiting for a response, looking at the next item in the queue, requesting that data and so on.



If the data being read or written is on another chip then it might be able to use multiple channels but if a lot of the requests end up on the same chip for a period, as it could for lots of small writes, then what you end up seeing is the performance of a single flash chip rather than the full performance of an array of chips.



So thousands of small reads or writes could actually show you the performance of only a small part of your NVMe device, rather than what the device is fully capable of under so-called "perfect" conditions.







share|improve this answer












share|improve this answer



share|improve this answer










answered 7 hours ago









MokubaiMokubai

60.5k16 gold badges140 silver badges161 bronze badges




60.5k16 gold badges140 silver badges161 bronze badges
















  • There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.

    – harrymc
    7 hours ago













  • Thanks. This makes sense.

    – ispiro
    6 hours ago











  • @harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?

    – ispiro
    6 hours ago








  • 1





    @harrymc that could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency, the wear leveller or "flash transition layer" that is wholly within the NVMe controller. For large reads and writes again this becomes a single check and then burst out to the memory devices, for a queue of things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not meaning to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance wise.

    – Mokubai
    6 hours ago






  • 1





    @ispiro performance SATA SSDs would have effectively the same internals as NVMe, you just don't really notice it because it is bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but for best case the NVMe can go way ahead.

    – Mokubai
    6 hours ago



















  • There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.

    – harrymc
    7 hours ago













  • Thanks. This makes sense.

    – ispiro
    6 hours ago











  • @harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?

    – ispiro
    6 hours ago








  • 1





    @harrymc that could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency, the wear leveller or "flash transition layer" that is wholly within the NVMe controller. For large reads and writes again this becomes a single check and then burst out to the memory devices, for a queue of things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not meaning to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance wise.

    – Mokubai
    6 hours ago






  • 1





    @ispiro performance SATA SSDs would have effectively the same internals as NVMe, you just don't really notice it because it is bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but for best case the NVMe can go way ahead.

    – Mokubai
    6 hours ago

















There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.

– harrymc
7 hours ago







There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.

– harrymc
7 hours ago















Thanks. This makes sense.

– ispiro
6 hours ago





Thanks. This makes sense.

– ispiro
6 hours ago













@harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?

– ispiro
6 hours ago







@harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?

– ispiro
6 hours ago






1




1





@harrymc that could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency, the wear leveller or "flash transition layer" that is wholly within the NVMe controller. For large reads and writes again this becomes a single check and then burst out to the memory devices, for a queue of things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not meaning to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance wise.

– Mokubai
6 hours ago





@harrymc that could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency, the wear leveller or "flash transition layer" that is wholly within the NVMe controller. For large reads and writes again this becomes a single check and then burst out to the memory devices, for a queue of things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not meaning to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance wise.

– Mokubai
6 hours ago




1




1





@ispiro performance SATA SSDs would have effectively the same internals as NVMe, you just don't really notice it because it is bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but for best case the NVMe can go way ahead.

– Mokubai
6 hours ago





@ispiro performance SATA SSDs would have effectively the same internals as NVMe, you just don't really notice it because it is bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but for best case the NVMe can go way ahead.

– Mokubai
6 hours ago













1
















Copying many files involves the overhead of creating their entries in the
Master File Table (MFT)
which is an integral component of the NTFS file system
(with equivalents for other file-systems than NTFS).



This means that creating a file entails first searching the MFT for the name,
in order to avoid duplicates, then allocating the space, copying the file,
and finally completing the entry in the MFT.



It is this bookkeeping overhead that slows down dramatically the copying of
many files.
The overhead involves matching work in the operating system,
updating RAM tables, computer interrupts, system calls etc etc.
Closing a file also causes the operating system to flush it to the disk,
and that takes time and disturbs the smooth copying of the data,
not letting NVMe achieve performance that is closer to its potential.



Note: This problem of the slow copy of many files is not unique to NVMe.
We see exactly the same problem for any kind of fast disk, either mechanical
or SSD, SATA or NVMe.
This to my way of thinking proves that the problem is with inefficient OS
handling of this case, perhaps because of inefficient cache memory
algorithms and/or disk-driver.






share|improve this answer




























  • This shouldn't matter because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would, indeed, be the same, is negligible.)

    – ispiro
    8 hours ago













  • I added more info to the answer.

    – harrymc
    8 hours ago











  • updating RAM tables shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for computer interrupts. Closing a file also causes the operating system to flush it to the disk - this should be 5 times (or whatever) as fast. system calls - they contain 2 parts: IO - which should be with that speed gain, and non-IO which should be negligible. Everything breaks up into 2: IO - where we should see the full speed gain, and non-IO which should be negligible. Unless: NVMe is really reaching near RAM speed.

    – ispiro
    8 hours ago













  • It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.

    – harrymc
    6 hours ago
















1
















Copying many files involves the overhead of creating their entries in the
Master File Table (MFT)
which is an integral component of the NTFS file system
(with equivalents for other file-systems than NTFS).



This means that creating a file entails first searching the MFT for the name,
in order to avoid duplicates, then allocating the space, copying the file,
and finally completing the entry in the MFT.



It is this bookkeeping overhead that slows down dramatically the copying of
many files.
The overhead involves matching work in the operating system,
updating RAM tables, computer interrupts, system calls etc etc.
Closing a file also causes the operating system to flush it to the disk,
and that takes time and disturbs the smooth copying of the data,
not letting NVMe achieve performance that is closer to its potential.



Note: This problem of the slow copy of many files is not unique to NVMe.
We see exactly the same problem for any kind of fast disk, either mechanical
or SSD, SATA or NVMe.
This to my way of thinking proves that the problem is with inefficient OS
handling of this case, perhaps because of inefficient cache memory
algorithms and/or disk-driver.






share|improve this answer




























  • This shouldn't matter because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would, indeed, be the same, is negligible.)

    – ispiro
    8 hours ago













  • I added more info to the answer.

    – harrymc
    8 hours ago











  • updating RAM tables shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for computer interrupts. Closing a file also causes the operating system to flush it to the disk - this should be 5 times (or whatever) as fast. system calls - they contain 2 parts: IO - which should be with that speed gain, and non-IO which should be negligible. Everything breaks up into 2: IO - where we should see the full speed gain, and non-IO which should be negligible. Unless: NVMe is really reaching near RAM speed.

    – ispiro
    8 hours ago













  • It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.

    – harrymc
    6 hours ago














1














1










1









Copying many files involves the overhead of creating their entries in the
Master File Table (MFT)
which is an integral component of the NTFS file system
(with equivalents for other file-systems than NTFS).



This means that creating a file entails first searching the MFT for the name,
in order to avoid duplicates, then allocating the space, copying the file,
and finally completing the entry in the MFT.



It is this bookkeeping overhead that slows down dramatically the copying of
many files.
The overhead involves matching work in the operating system,
updating RAM tables, computer interrupts, system calls etc etc.
Closing a file also causes the operating system to flush it to the disk,
and that takes time and disturbs the smooth copying of the data,
not letting NVMe achieve performance that is closer to its potential.



Note: This problem of the slow copy of many files is not unique to NVMe.
We see exactly the same problem for any kind of fast disk, either mechanical
or SSD, SATA or NVMe.
This to my way of thinking proves that the problem is with inefficient OS
handling of this case, perhaps because of inefficient cache memory
algorithms and/or disk-driver.






share|improve this answer















Copying many files involves the overhead of creating their entries in the
Master File Table (MFT)
which is an integral component of the NTFS file system
(with equivalents for other file-systems than NTFS).



This means that creating a file entails first searching the MFT for the name,
in order to avoid duplicates, then allocating the space, copying the file,
and finally completing the entry in the MFT.



It is this bookkeeping overhead that slows down dramatically the copying of
many files.
The overhead involves matching work in the operating system,
updating RAM tables, computer interrupts, system calls etc etc.
Closing a file also causes the operating system to flush it to the disk,
and that takes time and disturbs the smooth copying of the data,
not letting NVMe achieve performance that is closer to its potential.



Note: This problem of the slow copy of many files is not unique to NVMe.
We see exactly the same problem for any kind of fast disk, either mechanical
or SSD, SATA or NVMe.
This to my way of thinking proves that the problem is with inefficient OS
handling of this case, perhaps because of inefficient cache memory
algorithms and/or disk-driver.







share|improve this answer














share|improve this answer



share|improve this answer








edited 6 hours ago

























answered 8 hours ago









harrymcharrymc

283k16 gold badges300 silver badges615 bronze badges




283k16 gold badges300 silver badges615 bronze badges
















  • This shouldn't matter because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would, indeed, be the same, is negligible.)

    – ispiro
    8 hours ago













  • I added more info to the answer.

    – harrymc
    8 hours ago











  • updating RAM tables shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for computer interrupts. Closing a file also causes the operating system to flush it to the disk - this should be 5 times (or whatever) as fast. system calls - they contain 2 parts: IO - which should be with that speed gain, and non-IO which should be negligible. Everything breaks up into 2: IO - where we should see the full speed gain, and non-IO which should be negligible. Unless: NVMe is really reaching near RAM speed.

    – ispiro
    8 hours ago













  • It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.

    – harrymc
    6 hours ago



















  • This shouldn't matter because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would, indeed, be the same, is negligible.)

    – ispiro
    8 hours ago













  • I added more info to the answer.

    – harrymc
    8 hours ago











  • updating RAM tables shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for computer interrupts. Closing a file also causes the operating system to flush it to the disk - this should be 5 times (or whatever) as fast. system calls - they contain 2 parts: IO - which should be with that speed gain, and non-IO which should be negligible. Everything breaks up into 2: IO - where we should see the full speed gain, and non-IO which should be negligible. Unless: NVMe is really reaching near RAM speed.

    – ispiro
    8 hours ago













  • It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.

    – harrymc
    6 hours ago

















This shouldn't matter because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would, indeed, be the same, is negligible.)

– ispiro
8 hours ago







This shouldn't matter because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would, indeed, be the same, is negligible.)

– ispiro
8 hours ago















I added more info to the answer.

– harrymc
8 hours ago





I added more info to the answer.

– harrymc
8 hours ago













updating RAM tables shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for computer interrupts. Closing a file also causes the operating system to flush it to the disk - this should be 5 times (or whatever) as fast. system calls - they contain 2 parts: IO - which should be with that speed gain, and non-IO which should be negligible. Everything breaks up into 2: IO - where we should see the full speed gain, and non-IO which should be negligible. Unless: NVMe is really reaching near RAM speed.

– ispiro
8 hours ago







updating RAM tables shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for computer interrupts. Closing a file also causes the operating system to flush it to the disk - this should be 5 times (or whatever) as fast. system calls - they contain 2 parts: IO - which should be with that speed gain, and non-IO which should be negligible. Everything breaks up into 2: IO - where we should see the full speed gain, and non-IO which should be negligible. Unless: NVMe is really reaching near RAM speed.

– ispiro
8 hours ago















It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.

– harrymc
6 hours ago





It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.

– harrymc
6 hours ago



















draft saved

draft discarded



















































Thanks for contributing an answer to Super User!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsuperuser.com%2fquestions%2f1479346%2fwhy-doesnt-an-nvme-connection-on-an-ssd-make-non-sequential-access-faster%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Taj Mahal Inhaltsverzeichnis Aufbau | Geschichte | 350-Jahr-Feier | Heutige Bedeutung | Siehe auch |...

Baia Sprie Cuprins Etimologie | Istorie | Demografie | Politică și administrație | Arii naturale...

Nicolae Petrescu-Găină Cuprins Biografie | Opera | In memoriam | Varia | Controverse, incertitudini...