
Suspicion one drive is faulty in zpool, but showing four?

















I have a server that I am running for myself and my friends. We host games using Ubuntu 18.04 LTS Server on the primary boot drive, and use a RAIDZ2 pool for storing backups of those games, our music, movies, etc.



Every week to two weeks I get a faulted pool and a lot of read/write errors.



    me@server:/$ zpool status NAS
      pool: NAS
     state: ONLINE
    status: One or more devices are faulted in response to IO failures.
    action: Make sure the affected devices are connected, then run 'zpool
            clear'.
       see: http://zfsonlinux.org/msg/ZFS-8000-HC
      scan: scrub repaired 0B in 3h19m with 0 errors on Sun Aug 11 07:14:28 2019
    config:

            NAME        STATE     READ WRITE CKSUM
            NAS         ONLINE       0   511     0
              raidz2-0  ONLINE       0   200     0
                sdc     ONLINE       0     0     0
                sdd     ONLINE       0     0     0
                sde     ONLINE       3   224     0
                sdf     ONLINE      12   225     0
                sdg     ONLINE       3   226     0
                sdh     ONLINE       3   227     0
            spares
              sdb       AVAIL
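To make the error pattern easier to compare between incidents, the per-device counters can be pulled out of the status text. The following is only a minimal sketch: `SAMPLE` is the config table from the output above, the function names are made up for illustration, and the regular expression assumes plain integer counters (it does not handle abbreviated counts such as `1.2K`).

```python
import re

# Config table copied from the `zpool status NAS` output in the question.
SAMPLE = """\
        NAME        STATE     READ WRITE CKSUM
        NAS         ONLINE       0   511     0
          raidz2-0  ONLINE       0   200     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0
            sde     ONLINE       3   224     0
            sdf     ONLINE      12   225     0
            sdg     ONLINE       3   226     0
            sdh     ONLINE       3   227     0
        spares
          sdb       AVAIL
"""

# A data row is: name, state, then three integer counters.
ROW = re.compile(r"^\s*(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_counters(text):
    """Return {name: (read, write, cksum)} for every row with three counters."""
    counters = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, _state, read, write, cksum = match.groups()
            counters[name] = (int(read), int(write), int(cksum))
    return counters


def leaf_disks_with_errors(counters):
    """Names that look like sdX disks and have at least one non-zero counter."""
    return sorted(
        name
        for name, values in counters.items()
        if re.fullmatch(r"sd[a-z]+", name) and any(values)
    )


if __name__ == "__main__":
    print(leaf_disks_with_errors(parse_counters(SAMPLE)))
```

The header row, the `spares` line, and the spare `sdb` are skipped because they do not carry three numeric counters.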


These errors never cause any data loss, and a scrub never has to repair any bytes. I always have to restart the machine to get the pool to mount again. I've had this same pattern for months now. This suggests to me that either one disk (sdf) is bad, or that sde through sdh are all on their way to failure and showing pre-failure signs. S.M.A.R.T. self-tests always come back fine, showing no issues on the drives, when I run them after resetting the machine. I assigned a hot spare, hoping that it might be of use in case of a failure. At this point I am thinking I should replace drive sdf with sdb and see if this resolves the issue.



So, my question essentially is, when I see errors in a pool on multiple drives like this, is it always the case that all of the drives are pre-failure, or can the redundancy algorithm cause one bad disk to "spread" errors across other drives?



EDIT: Added in a comment, but repeated here for visibility.
I bought all of these drives used. All of them are plugged directly into the motherboard. I cannot recall the exact setup, but I think two chips on the motherboard handle 2/3 of the ports and the Intel southbridge handles the rest. I don't have a hardware RAID controller. I never get errors on sd[cd], only on the other four, and always in the same pattern: sdf has the most, and sd[egh] have fewer, all around the same count.
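Since the exact port layout is uncertain, one way to check which controller each disk hangs off is to resolve `/sys/block/<disk>` on Linux and look at the PCI address in the resulting path. The sketch below assumes that layout; the helper names and the two example paths in the comments are hypothetical and not taken from this machine.

```python
import os
import re
from collections import defaultdict

# PCI addresses look like 0000:00:1f.2 (domain:bus:device.function).
PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def controller_of(sysfs_path):
    """Return the last PCI address in a resolved sysfs path, or None."""
    matches = PCI_ADDR.findall(sysfs_path)
    return matches[-1] if matches else None


def group_by_controller(paths):
    """Map each controller address to the sorted list of disks behind it."""
    groups = defaultdict(list)
    for disk, path in paths.items():
        groups[controller_of(path)].append(disk)
    return {ctrl: sorted(disks) for ctrl, disks in groups.items()}


def resolve(disks):
    """Resolve /sys/block/<disk> symlinks; only meaningful on a Linux host."""
    return {d: os.path.realpath(os.path.join("/sys/block", d)) for d in disks}


if __name__ == "__main__":
    # Hypothetical shape of a resolved path, for orientation only:
    #   /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/.../block/sdc
    #   /sys/devices/pci0000:00/0000:00:1c.0/0000:03:00.0/ata7/.../block/sdf
    disks = ["sdc", "sdd", "sde", "sdf", "sdg", "sdh"]
    for ctrl, members in group_by_controller(resolve(disks)).items():
        print(ctrl, members)
```

If sd[e-h] turn out to share one controller address and sd[cd] sit on another, that would fit the error pattern described above; it would not by itself prove the controller is at fault.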










asked 2 days ago by Matt Bucklew (new contributor), edited 2 days ago






















  • either you have a bad batch of drives or (IMO, more likely) your disk controller is faulty. what are the drives plugged into? motherboard sata ports? an SAS controller? USB (no!!!)?

    – cas
    2 days ago











  • from the errors shown, i'd guess that maybe sdc and sdd are plugged into motherboard sata ports, and sd[e-h] are on some kind of 4-port card? do you ever get errors on sd[cd] or only on sd[e-h]?

    – cas
    2 days ago













  • I bought all of these drives used, so there's a good chance of that. All of them are plugged directly into the motherboard. I cannot recall the exact setup, but I think two chips on the motherboard handle 2/3 of the ports and the Intel southbridge handles the rest. I don't have a hardware RAID controller. I never get errors on sd[cd], only on the other four, and always in the same pattern: sdf has the most, and sd[egh] have fewer, all around the same count.

    – Matt Bucklew
    2 days ago











  • IME extra sata ports on motherboards provided by 3rd party chips tend to be cheap and nasty, not as reliable as those provided by the CPU's main chipset. not always, but often enough that I tend to avoid using them. What brand/model of motherboard is it? BTW, the fact that you don't have a hardware raid controller is a good thing - you don't want to use hardware raid with ZFS, give it the raw drives so ZFS can manage the redundancy.

    – cas
    2 days ago













  • try replacing sdf with sdb (and physically remove sdf from the system). I still suspect a controller error because of the results of your smart tests and the fact that scrubbing the pool doesn't result in any errors or repairs.

    – cas
    2 days ago
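The swap suggested in the last comment could be laid out as a dry-run plan before anything is executed. This is a sketch only: `replacement_plan` is a made-up helper, the pool and device names come from the question, and it prints the commands rather than running them. `zpool replace` and `zpool status` are standard subcommands, but whether this order suits this pool is an assumption to check against the installed man pages.

```python
import shlex


def replacement_plan(pool, suspect, spare):
    """Return the commands, in order, as argument lists (nothing is executed)."""
    return [
        # Swap the suspect disk for the spare; ZFS then resilvers onto it.
        ["zpool", "replace", pool, suspect, spare],
        # Watch resilver progress and the per-device error counters.
        ["zpool", "status", pool],
    ]


if __name__ == "__main__":
    for argv in replacement_plan("NAS", "sdf", "sdb"):
        print(shlex.join(argv))
```

If errors keep appearing on sde, sdg, and sdh after sdf is physically removed, that would point away from sdf as the single cause.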

















