Why doesn't an NVMe connection on an SSD make non-sequential access faster?
Why would it matter whether you're transferring one 1 GB file at quintuple the speed (of a SATA SSD) or 1,000 1 MB files at quintuple the speed? Either way it should amount to quintuple the speed of the SATA SSD.
But in real life, non-sequential access turns out to have only a little benefit over a SATA SSD.
EDIT
Since the answers are concentrating on the difference between large and small files, let me clarify my question:
- Yes, small files will have overhead.
- And yes, they will waste time by reading data that will be ignored.
- But this is irrelevant to my question, since every read and write (including those pesky little MFT writes, etc.) will, or rather should, see the 5x speed gain.
Saying that there is wasted drive access doesn't change that. I'm not asking why 1 GB isn't as fast as 1,000 × 1 MB. I'm asking why:
(1GB_NVMe / 1GB_SSD) != (1000x1MB_NVMe / 1000x1MB_SSD)
hard-drive ssd sata nvme
"in real life..." You say that like it's a universal truth. Can you substantiate this claim? Please do not respond only in the comments; instead, edit the post with this information.
– Twisty Impersonator (8 hours ago)
@TwistyImpersonator I've searched the web for a contradictory source and only found one lead, which turned out to be useless. It seems like there's practically no disagreement about that. That's why I omitted it from the question, just like I omitted the fact that SSDs are faster than HDDs.
– ispiro (7 hours ago)
Seek latency becomes a real problem with random reads, and all the speed benefits of SSDs are lost with tiny reads: superuser.com/a/1168029/19943 This question feels like a slightly differently phrased duplicate of that one...
– Mokubai♦ (7 hours ago)
@Mokubai Every seek should see the speed difference. Your answer there explains the difference between small and large files, not the difference between different types of drives. Every seek, every write amplification, every part should see the speed gain.
– ispiro (7 hours ago)
@JakeGould Thanks. Point taken. I edited that line. Did you mean there was another part where the tone was harsh?
– ispiro (7 hours ago)
asked 8 hours ago by ispiro, edited 6 hours ago
2 Answers
The problem here is that while NVMe drives, and SSDs in general, are faster than spinning rust because they use flash memory, the ability of NVMe to transfer multiple gigabytes of data per second comes from the way the flash memory is arranged around the controller.
Fast flash devices effectively use a scheme similar to RAID0 across what are, on their own, simply fast flash memory chips. Each chip by itself can handle a certain speed, but tied together with its siblings it can achieve a much higher aggregate speed by having data written to and read from multiple devices simultaneously.
Effectively, large transfers can take advantage of this parallelism and request multiple blocks from multiple chips, reducing what would be eight seek times down to a single seek (across multiple chips) plus one larger transfer. The controller has buffering and queueing so it can stream the data sequentially in whichever direction is required.
The individual flash chips may also be configured to read ahead a few blocks for future requests and (for writes) cache them in a small internal buffer, further reducing delays for future requests.
The problem with working with lots of small files is that it defeats all of the smarts used to achieve a single massive transfer. The controller has to work through a queue, going between flash devices: requesting a block of data, waiting for a response, looking at the next item in the queue, requesting that data, and so on.
If the data being read or written is on another chip, the controller may be able to use multiple channels, but if many of the requests end up on the same chip for a period, as they can for lots of small writes, then what you end up seeing is the performance of a single flash chip rather than the full performance of an array of chips.
So thousands of small reads or writes could actually show you the performance of only a small part of your NVMe device, rather than what the device is fully capable of under so-called "perfect" conditions.
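The striping argument can be sketched numerically. The channel count and per-chip throughput below are made-up illustrative numbers: a large transfer striped across every channel runs at the aggregate speed, while a stream of small requests finishes only when the busiest channel drains its queue, and in the worst case (everything on one chip) runs at single-chip speed.

```python
import random

CHANNELS = 8      # assumed number of flash channels, illustrative only
CHIP_MB_S = 400   # assumed per-chip throughput in MB/s, illustrative only

def sequential_time(total_mb):
    # A large transfer is striped RAID0-style across every channel.
    return total_mb / (CHANNELS * CHIP_MB_S)

def random_time(request_sizes_mb, rng):
    # Each small request lands on whichever channel holds its block;
    # each channel serves its own queue serially.
    per_channel = [0.0] * CHANNELS
    for size in request_sizes_mb:
        per_channel[rng.randrange(CHANNELS)] += size / CHIP_MB_S
    return max(per_channel)  # done when the busiest channel drains

rng = random.Random(0)
seq = sequential_time(1024)          # one 1 GB transfer, all channels busy
rnd = random_time([1] * 1024, rng)   # 1,024 scattered 1 MB requests
worst = 1024 / CHIP_MB_S             # pathological case: one chip does it all
print(seq, rnd, worst)
```

The random workload sits between the perfectly parallel and single-chip extremes, which is why small-file benchmarks show only a slice of the drive's headline speed.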
There is a problem with that: the disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is the OS being too slow with many files, not the way NVMe works.
– harrymc (7 hours ago)
Thanks. This makes sense.
– ispiro (6 hours ago)
@harrymc Are you saying that the OS should feed the drive more but doesn't, or that it does, but the loss comes from the OS wasting time elsewhere?
– ispiro (6 hours ago)
@harrymc That could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency: the wear leveller, or "flash translation layer", which lives wholly within the NVMe controller. For large reads and writes this again becomes a single check followed by a burst out to the memory devices; for a queue of small things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not saying mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance-wise.
– Mokubai♦ (6 hours ago)
@ispiro High-performance SATA SSDs have effectively the same internals as NVMe drives; you just don't really notice it because they are bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but in the best case NVMe can go way ahead.
– Mokubai♦ (6 hours ago)
Copying many files involves the overhead of creating their entries in the Master File Table (MFT), an integral component of the NTFS file system (other file systems have their equivalents). This means that creating a file entails first searching the MFT for the name, in order to avoid duplicates, then allocating the space, copying the file, and finally completing the entry in the MFT. It is this bookkeeping overhead that dramatically slows down the copying of many files.
The overhead involves matching work in the operating system: updating RAM tables, interrupts, system calls, and so on. Closing a file also causes the operating system to flush it to the disk, which takes time, disturbs the smooth copying of the data, and keeps NVMe from achieving performance closer to its potential.
Note: This problem of slowly copying many files is not unique to NVMe. We see exactly the same problem with any kind of fast disk, whether mechanical or SSD, SATA or NVMe. To my way of thinking, this proves that the problem is inefficient OS handling of this case, perhaps because of inefficient cache-memory algorithms and/or the disk driver.
This shouldn't matter, because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would indeed be the same, is negligible.)
– ispiro (8 hours ago)
I added more info to the answer.
– harrymc (8 hours ago)
"updating RAM tables" shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for "computer interrupts". "Closing a file also causes the operating system to flush it to the disk" - this should be 5 times (or whatever) as fast. "system calls" - they contain two parts: IO, which should see that speed gain, and non-IO, which should be negligible. Everything breaks up into two parts: IO, where we should see the full speed gain, and non-IO, which should be negligible. Unless NVMe is really reaching near-RAM speed.
– ispiro (8 hours ago)
It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.
– harrymc (6 hours ago)
answered 7 hours ago by Mokubai♦
There is a problem with that: The disk knows nothing of files, only of sectors. So if the OS fed it enough data it wouldn't need to slow down. This means that the bottleneck is with the OS being too slow on many files, not with the way NVMe works.
– harrymc
7 hours ago
Thanks. This makes sense.
– ispiro
6 hours ago
@harrymc Are you saying that the OS should have fed the drive more but doesn't, or that it does, but that the loss is by the OS wasting other time?
– ispiro
6 hours ago
1
@harrymc that could be the case if a flash device were a dumb slab of disk like spinning rust, but there is another source of seek latency, the wear leveller or "flash transition layer" that is wholly within the NVMe controller. For large reads and writes again this becomes a single check and then burst out to the memory devices, for a queue of things it becomes yet another bottleneck of "where's this" and "where's that" within the drive itself. I'm not meaning to say mine is the one true answer, but the drive itself, being a more complicated device, holds a lot of the cards performance wise.
– Mokubai♦
6 hours ago
1
@ispiro performance SATA SSDs would have effectively the same internals as NVMe, you just don't really notice it because it is bottlenecked at the interface. Both SATA and NVMe get the same "worst case" performance, but for best case the NVMe can go way ahead.
– Mokubai♦
6 hours ago
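The "flash translation layer" bottleneck mentioned in this comment thread can be sketched as a toy mapping model. This is purely illustrative (real FTLs track erase blocks, garbage collection, and much more); it only shows that every small scattered write costs the controller a per-request lookup and map update:

```python
# Toy flash-translation-layer model: the drive keeps a logical-to-physical
# page map, and each write to a logical page is redirected to a fresh
# physical page (wear levelling), so every small write pays a map lookup
# plus a map update inside the controller.

class ToyFTL:
    def __init__(self):
        self.page_map = {}   # logical page -> physical page
        self.next_free = 0   # next fresh physical page
        self.lookups = 0

    def write(self, logical_page):
        self.lookups += 1                        # per-request "where's this?"
        self.page_map[logical_page] = self.next_free  # redirect to fresh page
        self.next_free += 1

ftl = ToyFTL()
# 10,000 scattered 4 KB writes -> 10,000 separate map operations,
# versus a single large write that could be handled as one range update.
for page in range(0, 80_000, 8):
    ftl.write(page)
print(ftl.lookups)
```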
Copying many files involves the overhead of creating their entries in the
Master File Table (MFT),
an integral component of the NTFS file system
(other file systems have equivalent structures).
Creating a file entails first searching the MFT for the name,
in order to avoid duplicates, then allocating the space, copying the data,
and finally completing the entry in the MFT.
It is this bookkeeping overhead that dramatically slows down the copying of
many files.
The overhead involves matching work in the operating system:
updating in-RAM tables, handling interrupts, servicing system calls, and so on.
Closing a file also causes the operating system to flush it to the disk,
which takes time and disturbs the smooth copying of the data,
preventing NVMe from achieving performance closer to its potential.
Note: this problem of slowly copying many files is not unique to NVMe.
We see exactly the same problem with any kind of fast disk, mechanical
or SSD, SATA or NVMe.
To my way of thinking this proves that the problem lies in inefficient OS
handling of this case, perhaps due to inefficient cache-memory
algorithms and/or the disk driver.
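A rough way to observe the per-file bookkeeping cost this answer describes is to time writing the same total amount of data as one file versus many small files. This is only a sketch; the absolute numbers and the size of the gap depend entirely on your OS, filesystem, page cache, and drive:

```python
# Benchmark sketch: same total bytes, one file vs. many small files.
# Each small file pays its own create / name-lookup / close overhead.
import os
import tempfile
import time

def write_one_file(directory, total_bytes):
    with open(os.path.join(directory, "big.bin"), "wb") as f:
        f.write(b"\0" * total_bytes)

def write_many_files(directory, count, size):
    for i in range(count):
        # each iteration pays the per-file metadata overhead
        with open(os.path.join(directory, f"f{i:05d}.bin"), "wb") as f:
            f.write(b"\0" * size)

with tempfile.TemporaryDirectory() as d:
    t0 = time.perf_counter()
    write_one_file(d, 1000 * 4096)          # 4 MB as one file
    one = time.perf_counter() - t0

    t0 = time.perf_counter()
    write_many_files(d, 1000, 4096)         # 4 MB as 1000 x 4 KB files
    many = time.perf_counter() - t0

print(f"one file: {one:.4f}s, many files: {many:.4f}s")
```

On most systems the many-files case is noticeably slower even though the payload is identical, which is the bookkeeping overhead the answer is pointing at.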
This shouldn't matter because every access to the MFT should be 5 times as fast. The bottom line is we're doing the same amount of work, and every part of it is 5 times as fast. (And I assume that the CPU work here, which would, indeed, be the same, is negligible.)
– ispiro
8 hours ago
I added more info to the answer.
– harrymc
8 hours ago
"updating RAM tables" shouldn't be a bottleneck compared with IO (unless you're saying that NVMe is that fast). Same goes for "computer interrupts". "Closing a file also causes the operating system to flush it to the disk" - this should be 5 times (or whatever) as fast. "system calls" - they contain 2 parts: IO, which should see that speed gain, and non-IO, which should be negligible. Everything breaks up into 2: IO, where we should see the full speed gain, and non-IO, which should be negligible. Unless NVMe is really reaching near-RAM speed.
– ispiro
8 hours ago
It is a problem when the OS operates in an inefficient manner that cannot drive the NVMe at full speed when there are many files.
– harrymc
6 hours ago
edited 6 hours ago
answered 8 hours ago
harrymcharrymc
283k16 gold badges300 silver badges615 bronze badges
1
in real life... You say that like it's a universal truth. Can you substantiate this claim? Please do not only respond in the comments. Instead, edit the post with this information.
– Twisty Impersonator
8 hours ago
@TwistyImpersonator I've searched the web for a contradictory source and only found one lead which turned out to be useless. It seems like there's practically no disagreement about that. That's why I omitted that from the question. Just like I omitted the fact that SSDs are faster than HDDs.
– ispiro
7 hours ago
Seek latency becomes a real problem with random reads and all the speed benefits of SSDs are lost with tiny reads: superuser.com/a/1168029/19943 This question feels like a slightly differently phrased duplicate of that one...
– Mokubai♦
7 hours ago
@Mokubai Every seek should see the speed difference. Your answer there explains the difference between small and large files, not the difference between different types of drives. Every seek, every write amplification, every part should see the speed gain.
– ispiro
7 hours ago
1
@JakeGould Thanks. Point taken. I edited that line. Did you mean there was another part where the tone was harsh?
– ispiro
7 hours ago