Given a list of files, some duplicates, some not, show checksum of only the duplicates

There must be an "easy" way to do this, but I can't figure out what it is.

Assume you have a plain text "file.txt" which has lines in this format (md5 sums followed by filenames):

5ee434a2ebcf4c3c98ee07e9c1efddc0 foo.txt
365a6d8b18cab348d92db610dfc46264 bar.txt
ae42d992bf622bdc425d37b04ec9c2d5 mini.txt
b8e9ff5502d5dbe38b3fd5e3363caacf tyrion.txt
5ee434a2ebcf4c3c98ee07e9c1efddc0 imac.txt
542ed609dfc4d0cae44c4b7be6d66382 mba.txt
310ee92ebc69ed79c1837fc53983b7f8 mini luoma.txt
542ed609dfc4d0cae44c4b7be6d66382 tyrion final.txt

I would like to sort file.txt so that the output:

1. only shows lines whose md5 sum indicates the files are duplicates, and
2. puts a blank line between each "group" of duplicates,

so it would look like this:

542ed609dfc4d0cae44c4b7be6d66382 mba.txt
542ed609dfc4d0cae44c4b7be6d66382 tyrion final.txt

5ee434a2ebcf4c3c98ee07e9c1efddc0 foo.txt
5ee434a2ebcf4c3c98ee07e9c1efddc0 imac.txt

(In the real case, it could be 2 duplicates or 10, or more.)

I'm guessing there might be a Ruby or Python guru out there who can figure this one out, but I'm open to pretty much any practical solution.

asked 1 hour ago

TJ Luoma

shell-script text-processing python

2 Answers

$ grep -f <(cut -d' ' -f1 file.txt | sort | uniq -d) file.txt \
    | awk 'last && last != $1 { printf "\n" }; { last=$1 ; print}'
5ee434a2ebcf4c3c98ee07e9c1efddc0 foo.txt
5ee434a2ebcf4c3c98ee07e9c1efddc0 imac.txt

542ed609dfc4d0cae44c4b7be6d66382 mba.txt
542ed609dfc4d0cae44c4b7be6d66382 tyrion final.txt

(Thanks to "cas" for the awk suggestion.)

edited 14 mins ago

answered 47 mins ago

Ray Butterworth

+1. And just pipe the output to awk 'last && last != $1 { printf "\n" }; { last=$1 ; print}'. This prints a blank line whenever the variable last is not empty and is different from the current $1. Feel free to add this to your answer :)

– cas
28 mins ago

@cas, thanks. I'd just come up with something similar, but not quite as nice: awk 'BEGIN { x="" }; { if (x != $1) print(""); print; x=$1 }'. It's amazing how much one forgets in only a few years, and even more amazing how quickly one relearns it.

– Ray Butterworth
11 mins ago
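
Since the question is tagged python, here is a rough Python sketch of the same three steps as the pipeline above: find the checksums that occur more than once (the `sort | uniq -d` part), keep only the lines carrying them (the `grep -f` part), and emit a blank line whenever the checksum changes (the awk part). The function names and the embedded sample are my own, not from the answer. One deliberate difference: the awk step assumes lines with the same checksum arrive next to each other, which holds for the sample but not for arbitrary input, so this sketch sorts by checksum first. That is also why the 542... group comes out before the 5ee... group.

```python
from collections import Counter


def checksum_of(line):
    """Return the first whitespace-separated field (the md5 sum)."""
    return line.split(None, 1)[0]


def duplicate_lines(lines):
    """Keep lines whose checksum occurs more than once, blank line between groups."""
    lines = [ln.rstrip("\n") for ln in lines if ln.strip()]
    # Rough equivalent of `cut -d' ' -f1 | sort | uniq -d`.
    counts = Counter(checksum_of(ln) for ln in lines)
    # Rough equivalent of `grep -f`, followed by a stable sort on the checksum
    # so that members of a group are adjacent and keep their file order.
    kept = sorted((ln for ln in lines if counts[checksum_of(ln)] > 1),
                  key=checksum_of)
    out, last = [], None
    for ln in kept:
        key = checksum_of(ln)
        # Same test as the awk one-liner: separator when the checksum changes.
        if last is not None and key != last:
            out.append("")
        out.append(ln)
        last = key
    return out


# Sample data copied from the question.
SAMPLE = """\
5ee434a2ebcf4c3c98ee07e9c1efddc0 foo.txt
365a6d8b18cab348d92db610dfc46264 bar.txt
ae42d992bf622bdc425d37b04ec9c2d5 mini.txt
b8e9ff5502d5dbe38b3fd5e3363caacf tyrion.txt
5ee434a2ebcf4c3c98ee07e9c1efddc0 imac.txt
542ed609dfc4d0cae44c4b7be6d66382 mba.txt
310ee92ebc69ed79c1837fc53983b7f8 mini luoma.txt
542ed609dfc4d0cae44c4b7be6d66382 tyrion final.txt
"""

if __name__ == "__main__":
    print("\n".join(duplicate_lines(SAMPLE.splitlines())))
```

To run it against a real file instead of the sample, passing an open file object (for example `duplicate_lines(open("file.txt"))`) should work, since the function only iterates over lines.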

With a perl Hash of Arrays:

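$ perl -alne '
    # collect every line under its checksum (first whitespace-separated field)
    push @{ $h{$F[0]} }, $_
    }{  # closes the implicit -n loop; the rest runs once after the last line
    for $k (sort keys %h) {
      @a = @{ $h{$k} };
      # print the group plus a blank line only if it has more than one entry
      print join "\n", @a, "" if $#a > 0
    }
' file.txt
542ed609dfc4d0cae44c4b7be6d66382 mba.txt
542ed609dfc4d0cae44c4b7be6d66382 tyrion final.txt

5ee434a2ebcf4c3c98ee07e9c1efddc0 foo.txt
5ee434a2ebcf4c3c98ee07e9c1efddc0 imac.txt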

Note that this prints a trailing blank line after the last record. The sort is optional.

answered 6 mins ago

steeldriver
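
The hash-of-arrays idea in the Perl answer above maps fairly directly onto a Python dict of lists. The following is only a sketch under my own assumptions: the names `group_duplicates`, `format_groups` and `SAMPLE_LINES` are made up for illustration, and the checksum is taken to be the first whitespace-separated field, as with `-a` in the Perl version. Unlike the Perl one-liner, joining the groups with an empty line avoids the trailing blank line.

```python
from collections import defaultdict


def group_duplicates(lines):
    """Return lists of lines that share a checksum, ordered by checksum."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore empty lines in the input
        checksum = line.split(None, 1)[0]
        groups[checksum].append(line)
    # Keep only checksums seen more than once; sorting is optional here too.
    return [groups[key] for key in sorted(groups) if len(groups[key]) > 1]


def format_groups(groups):
    """Join each group with newlines and separate groups by one blank line."""
    return "\n\n".join("\n".join(group) for group in groups)


# Sample data copied from the question.
SAMPLE_LINES = [
    "5ee434a2ebcf4c3c98ee07e9c1efddc0 foo.txt",
    "365a6d8b18cab348d92db610dfc46264 bar.txt",
    "ae42d992bf622bdc425d37b04ec9c2d5 mini.txt",
    "b8e9ff5502d5dbe38b3fd5e3363caacf tyrion.txt",
    "5ee434a2ebcf4c3c98ee07e9c1efddc0 imac.txt",
    "542ed609dfc4d0cae44c4b7be6d66382 mba.txt",
    "310ee92ebc69ed79c1837fc53983b7f8 mini luoma.txt",
    "542ed609dfc4d0cae44c4b7be6d66382 tyrion final.txt",
]

if __name__ == "__main__":
    print(format_groups(group_duplicates(SAMPLE_LINES)))
```

Because the whole line is stored rather than just the filename, names containing spaces (such as "tyrion final.txt") come through unchanged.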
