Concatenate lines by first column by awk or sed


How can I use awk in the following situation?

I want to concatenate all lines that share the same first column into a single line. The first column should appear only once in each joined line (in this case aaa, www, hhh).

The file may be space- or tab-separated.

Example input:

aaa bbb ccc ddd NULL NULL NULL
aaa NULL NULL NULL NULL NULL NULL
aaa bbb ccc NULL NULL NULL NULL
www yyy hhh NULL NULL NULL NULL
hhh 111 333 yyy ooo hyy uuuioooy
hhh 111 333 yyy ooo hyy NULL

Desired output:

aaa bbb ccc ddd NULL NULL NULL NULL NULL NULL NULL NULL NULL bbb ccc NULL NULL NULL NULL
www yyy hhh NULL NULL NULL NULL
hhh 111 333 yyy ooo hyy uuuioooy 111 333 yyy ooo hyy NULL

The background to this is that I want to set up a very simple file-based database, where the first column is always the identifier for the entity. All lines based on the same identifier column are concatenated.

asked Sep 11 '12 at 6:42 by tiny
edited Jun 24 '15 at 6:10 by Volker Siegel

Where did the uuu line come from (in the output)?
– saeedn, Sep 11 '12 at 6:45

Sorry, my bad. I'll edit it.
– tiny, Sep 11 '12 at 7:05


4 Answers

To get the first column of each line with awk, you can do the following:

< testfile awk '{print $1}'
aaa
aaa
aaa
www
hhh
hhh

These are the keys for the rest of each line. You can build a hash table (an awk associative array) that uses the first column as the key and, as a first step, only the second column as the value:

< testfile awk '{table[$1]=table[$1] $2;} END {for (key in table) print key " => " table[key];}'
www => yyy
aaa => bbbNULLbbb
hhh => 111111

To get the whole rest of the line, starting with column 2, you need to collect all columns:

< testfile awk '{line="";for (i = 2; i <= NF; i++) line = line $i " "; table[$1]=table[$1] line;} END {for (key in table) print key " => " table[key];}'
www => yyy hhh NULL NULL NULL NULL
aaa => bbb ccc ddd NULL NULL NULL NULL NULL NULL NULL NULL NULL bbb ccc NULL NULL NULL NULL
hhh => 111 333 yyy ooo hyy uuuioooy 111 333 yyy ooo hyy NULL

answered Sep 11 '12 at 7:26 by binfalse
edited 5 hours ago by αғsнιη

Hi, yeah, it really needed to be broken down into hash tables. Thank you!
– tiny, Sep 11 '12 at 7:34

@tiny - I was assuming the ordering needed to be preserved. Is this not the case (this answer produces ordering corresponding to the hashing mechanism, not your original order)?
– ire_and_curses, Sep 11 '12 at 7:36
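On the ordering point raised in the comment above: if lines sharing a key are always adjacent in the input, as they are in the question's example, the merge can be done in a single pass that emits keys in input order. The following is a minimal Python sketch under that assumption. The function name `merge_adjacent` is made up for illustration, and with non-adjacent duplicates the same key would be printed more than once.

```python
from itertools import groupby


def merge_adjacent(lines):
    """Yield one merged line per run of consecutive lines sharing a first column."""
    # Split on any whitespace (spaces or tabs) and drop blank lines.
    rows = (fields for fields in (line.split() for line in lines) if fields)
    # groupby only groups *consecutive* rows that have the same key.
    for key, group in groupby(rows, key=lambda fields: fields[0]):
        rest = [field for fields in group for field in fields[1:]]
        yield " ".join([key] + rest)


# Sample data copied from the question.
sample = [
    "aaa bbb ccc ddd NULL NULL NULL",
    "aaa NULL NULL NULL NULL NULL NULL",
    "aaa bbb ccc NULL NULL NULL NULL",
    "www yyy hhh NULL NULL NULL NULL",
    "hhh 111 333 yyy ooo hyy uuuioooy",
    "hhh 111 333 yyy ooo hyy NULL",
]
for merged in merge_adjacent(sample):
    print(merged)
```

Because only one group is held in memory at a time, this variant may suit large files better than building a full table, provided the adjacency assumption holds.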

Someone else can answer in awk or sed, but a Python version is straightforward and might be helpful to you.

#!/usr/bin/env python3

input_file = 'input.dat'

input_order = []   # keys in order of first appearance
seen        = {}   # key -> concatenated remainder of its lines

with open(input_file, 'r') as in_fh:
    for line in in_fh:
        # Split on any whitespace, so both spaces and tabs work...
        fields = line.split()
        if not fields:
            continue  # skip blank lines

        # Separate the first column from the rest of the line...
        key_col = fields[0]
        rest_of_line = ''.join(' ' + field for field in fields[1:])

        # If we've seen this key already, concatenate the line...
        if key_col in seen:
            seen[key_col] += rest_of_line
        # ...otherwise, record the ordering, and store the new info
        else:
            input_order.append(key_col)
            seen[key_col] = rest_of_line

# Dump the ordered output to stdout
for unique_col in input_order:
    print(unique_col + seen[unique_col])

answered Sep 11 '12 at 7:19 by ire_and_curses
edited Sep 11 '12 at 7:25

Very cool. Even with zero Python experience I managed to edit the script so that it takes the input file name as its first argument :)
– tiny, Sep 11 '12 at 7:30
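The change mentioned in the comment, reading the file name from the first command-line argument, could look roughly like this. It is a sketch, not the commenter's actual edit: the names `merge_by_first_column` and `main` are invented here, and the code relies on Python 3.7+ dicts preserving insertion order instead of keeping a separate `input_order` list.

```python
#!/usr/bin/env python3
import sys


def merge_by_first_column(lines):
    """Return merged lines, one per key, in order of first appearance."""
    seen = {}  # key -> list of remaining fields (dicts keep insertion order)
    for line in lines:
        fields = line.split()  # any run of spaces or tabs separates columns
        if not fields:
            continue  # ignore blank lines
        seen.setdefault(fields[0], []).extend(fields[1:])
    return [" ".join([key] + rest) for key, rest in seen.items()]


def main(path):
    """Print the merged form of the file at the given path."""
    with open(path) as in_fh:
        for merged in merge_by_first_column(in_fh):
            print(merged)


# Only run when a file name was actually given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Saved under a hypothetical name such as merge.py, it would be called as `python3 merge.py input.dat`.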

This is more of an interesting application of coreutils; I suspect it's not very efficient with large input, as it invokes join once for each line of the input.

touch outfile
while read -r; do
  join -a1 -a2 outfile <(echo "$REPLY") > tmp
  mv tmp outfile
done < infile

To improve its efficiency, keeping outfile and tmp on a RAM disk might help.

Edit

Or without temporary files:

out=""
while read -r; do
  out=$(join -a1 -a2 <(echo -n "$out") <(echo -n "$REPLY"))
done < infile

echo "$out"

answered Sep 11 '12 at 11:37 by Thor
edited Sep 11 '12 at 12:14
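To make the efficiency concern in the join-based answer concrete, here is a rough Python model of what each loop iteration asks `join -a1 -a2` to do: merge one new line into the accumulated, key-sorted result. It is only an illustration, not how coreutils implements `join`. The name `join_step` is invented, and plain string comparison stands in for the locale-dependent collation that `join` uses.

```python
def join_step(acc, line):
    """Merge one input line into acc, a list of (key, fields) sorted by key."""
    fields = line.split()
    if not fields:
        return acc
    key, rest = fields[0], fields[1:]
    out = []
    placed = False
    for k, v in acc:
        if not placed and key == k:
            out.append((k, v + rest))  # paired: append the new fields
            placed = True
        elif not placed and key < k:
            out.append((key, rest))    # unpaired new line sorts before k
            out.append((k, v))
            placed = True
        else:
            out.append((k, v))         # unpaired accumulated line, kept as-is
    if not placed:
        out.append((key, rest))        # new key sorts after everything so far
    return out


acc = []
for line in ["aaa bbb", "www yyy", "hhh 111", "aaa NULL", "hhh 333"]:
    acc = join_step(acc, line)
for key, fields in acc:
    print(key, *fields)
```

Every step walks the whole accumulated list, so in this model the total work grows with the number of input lines times the number of distinct keys, and the result ends up ordered by key rather than in input order.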

And here's a Perl one-liner:

$ perl -e 'my %h; while(<>){chomp; @a=split(/\s+/); $k=shift(@a); $h{$k}.=join(" ", @a) . " "; } map{$h{$_}=~s/\s*$//; print "$_ $h{$_}\n"}keys(%h);' infile

answered Sep 11 '12 at 12:17 by terdon♦
