


Failing to untar


I am transferring large files (600 GB+) from one machine to another. I tar them up using:



tar -cpvzf file.tar.gz -C PATH_TO_DIR DIR


Once the archive has been created, I split it:



split -d -b 2G file.tar.gz file_part_


This creates file_part_00, file_part_01, ... until the whole file is split into 2 GB chunks. Before transferring anything, I loop through each part in the directory where the archive was split and collect its md5 hash using an equivalent of:



md5sum PART_NAME >> list_md5.start


Once each part has been hashed, I run:



sort -u list_md5.start


(This sorts the entries and removes duplicates, just to be safe.)



The parts are then transferred one by one, in the order in which they appear in list_md5.start. Once they arrive on the other computer, their md5 hashes are collected using the same method, but into a different list; let's call it list_md5_2.start. After the transfer, and before putting the parts back together, I run:



diff list_md5.start list_md5_2.start
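
One detail worth checking in the steps above: `sort -u list_md5.start` on its own only prints the sorted list and leaves the file unchanged unless `-o` (or a redirect to another file) is used. As a minimal sketch of an alternative check, assuming the manifest can travel alongside the parts, `md5sum -c` re-hashes every listed file on the receiving side and fails if any part differs. The scratch directories, the 1 KiB part size and the `verify_status` variable are made up for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directories standing in for the two machines.
work=$(mktemp -d)
src="$work/src"
dst="$work/dst"
mkdir -p "$src" "$dst"

# Small stand-in for file.tar.gz, split into 1 KiB parts instead of 2G.
head -c 5000 /dev/urandom > "$src/file.tar.gz"
( cd "$src" && split -d -b 1K file.tar.gz file_part_ )

# Sender: write the manifest in one pass, so there are no duplicates to remove.
( cd "$src" && md5sum file_part_* > list_md5.start )

# "Transfer": copy the parts and the manifest to the receiving side.
cp "$src"/file_part_* "$src/list_md5.start" "$dst/"

# Receiver: re-hash every listed part; the exit status is non-zero on any mismatch.
if ( cd "$dst" && md5sum -c list_md5.start > /dev/null ); then
    verify_status=ok
else
    verify_status=failed
fi
echo "verification: $verify_status"
```

Because the manifest stores relative file names, the check has to run from the directory that holds the parts.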


If no difference is found, I continue to the next part of the process. Otherwise, I give up and delete all the parts. To put the parts back together, I run:



cat file_part_* > file.tar.gz.incomplete


(The .incomplete suffix is there because a watchdog is waiting to untar any .tar.gz file it comes across.)
Once the cat is done, the file is renamed using:



mv file.tar.gz.incomplete file.tar.gz


At this point, the watchdog detects it and untars it using:



tar -C DEST -xzv file.tar.gz --totals --unlink-first --recursive-unlink


At this point, I get an error I can't debug:



Tar Failed 2
gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
/PATH/TO/DEST


After the untar attempt, the archive is removed whether or not extraction succeeded (there is no point in keeping large files that failed to untar).
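
For the extraction step, here is a hedged sketch that tests the gzip stream before unpacking, assuming GNU tar and gzip. Note the explicit `-f`: in the watchdog command as written, the archive name is not attached to an `-f` option, so GNU tar would treat it as a member name and read the archive from its default device instead (standard input on many builds). The `extract_if_intact` helper and the scratch paths are hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area with a tiny archive to extract.
work=$(mktemp -d)
mkdir -p "$work/DIR" "$work/DEST"
echo "payload" > "$work/DIR/sample.txt"
tar -czf "$work/file.tar.gz" -C "$work" DIR

# A deliberately truncated copy, to show how a short archive is handled.
head -c 20 "$work/file.tar.gz" > "$work/truncated.tar.gz"

extract_if_intact() {
    archive=$1
    dest=$2
    # gzip -t reads the whole stream and fails on truncation or CRC errors.
    if gzip -t "$archive" 2>/dev/null; then
        # -f names the archive explicitly; other options can be appended here.
        tar -C "$dest" -xzf "$archive"
        echo "extracted"
    else
        echo "rejected"
    fi
}

good_result=$(extract_if_intact "$work/file.tar.gz" "$work/DEST")
bad_result=$(extract_if_intact "$work/truncated.tar.gz" "$work/DEST")
echo "intact archive: $good_result, truncated archive: $bad_result"
```

Running the integrity test first means a damaged archive can be kept for inspection rather than deleted after a failed extraction.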



It is worth noting that sometimes the md5 sums don't match, which also stops the process (this is checked before the cat assembly step).



I have tried ensuring that the names were not invalid. I have tried smaller part sizes. I have also gone through the process manually, and still got either an md5sum mismatch or the EOF error.



This is all done on Ubuntu machines, both of which are fully updated (no updates pending).



Does anyone have an idea as to how to solve this issue?










  • It is worth noting that sometimes the md5sum don't match up. At this point you can start over.

    – Cyrus
    2 days ago













  • Sorry, missed that. Will edit that in. Thank you!

    – Siewiei
    2 days ago






  • The environment does not allow using sftp/scp. We go through a diode. They're split up because of intermediary storage restrictions.

    – Siewiei
    2 days ago











  • Thank you for explaining why you split the file and not transfer it via sftp.

    – Cyrus
    2 days ago











  • Have you considered calculating the master hash of file.tar.gz before the split, and comparing to the hash of file.tar.gz.incomplete before doing the rename?

    – Jim L.
    2 days ago
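
The whole-file check suggested in the last comment could be sketched roughly as follows, assuming the sender's hash can be carried across together with the parts. The `sender` and `receiver` directories, the 1 KiB part size and the variable names are illustrative only.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directories standing in for the two machines.
work=$(mktemp -d)
mkdir -p "$work/sender" "$work/receiver"

# Sender: hash the whole archive before splitting it.
head -c 5000 /dev/urandom > "$work/sender/file.tar.gz"
master_hash=$(md5sum < "$work/sender/file.tar.gz" | cut -d ' ' -f 1)
( cd "$work/sender" && split -d -b 1K file.tar.gz file_part_ )

# "Transfer" the parts.
cp "$work/sender"/file_part_* "$work/receiver/"

# Receiver: reassemble, hash the result, and rename only if the hashes agree.
cd "$work/receiver"
cat file_part_* > file.tar.gz.incomplete
rebuilt_hash=$(md5sum < file.tar.gz.incomplete | cut -d ' ' -f 1)

if [ "$master_hash" = "$rebuilt_hash" ]; then
    mv file.tar.gz.incomplete file.tar.gz
    result=match
else
    rm -f file.tar.gz.incomplete
    result=mismatch
fi
echo "whole-file check: $result"
```

This catches problems that per-part hashes cannot, such as a missing part or parts concatenated in the wrong order.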


















ubuntu tar split hashsum large-files






asked Aug 16 at 17:24 by Siewiei (edited 2 days ago)




1 Answer
Rsync is a free software utility for Unix- and Linux-like systems that copies files and directories from one host to another.



Use rsync to transfer the file from one system to the other.
You can start rsync inside a screen session and then detach from it.



Rsync is considered a lightweight application because file transfers are incremental: after the initial full transfer, only the parts of files that have changed are transferred. Rsync is often used to provide offsite backups by syncing data to a remote machine outside a firewall. It is also used for mirroring websites.






  • Although all of this is true it does not in any way address the issue with the seemingly corrupt tar archive in the question. Also note that the user mentions that scp and sftp can't be used (in comments), which likely means that the SSH transport (which rsync uses) probably can't be used at all.

    – Kusalananda
    2 days ago
















answered 2 days ago by user367435



