How to generate URL+title from URL list automatically? (using bash or other tool)
Using Linux Bash, how can one turn a text file with:
http://example.org/
https://en.wikipedia.org/wiki/Main_Page
https://www.youtube.com/watch?v=mGQFZxIuURE
into:
http://example.org/ Example Domain
https://en.wikipedia.org/wiki/Main_Page Wikipedia, the free encyclopedia
https://www.youtube.com/watch?v=mGQFZxIuURE Mike Perry - The Ocean (ft. Shy Martin) - YouTube
or into:
http://example.org/
Example Domain
https://en.wikipedia.org/wiki/Main_Page
Wikipedia, the free encyclopedia
https://www.youtube.com/watch?v=mGQFZxIuURE
Mike Perry - The Ocean (ft. Shy Martin) - YouTube
?
How can one
1. pull a URL from a list of URLs in a file,
2. load the page,
3. extract its page title,
4. add that page title following that URL, on the same line as the URL or on the line immediately following,
then perform those steps 1-4 for each subsequent URL in that list?
If not using Linux Bash, what other way is there?
text-processing url
edited 25 mins ago by K7AAY
asked 8 hours ago by neverMind9
bash isn't (much of) a text processing tool – Jeff Schaller♦ 8 hours ago
@JeffSchaller How can it be done then? How can one turn an extremely long URL list (e.g. a list of YouTube videos) into URL + title? – neverMind9 8 hours ago
I'm sure you'll have several good answers momentarily; just because bash is your shell doesn't mean it has to do everything. If you can spell out exactly how you want the transformation to happen, that would help answerers. How did you get "Example Domain" out of "example.org", for example(!)? Are you sending a request to that URL and extracting an HTML tag? – Jeff Schaller♦ 8 hours ago
That should go ^^^ up in your Question as an edit, please & thank you! – Jeff Schaller♦ 8 hours ago
I'd recommend using Perl rather than bash scripting. Text processing is Perl's speciality. – David Yockey 8 hours ago
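The comments above pin down what the transformation means: request each URL and take the text of the page's HTML title element. As a rough sketch of the extraction step alone, assuming only standard tools (tr, grep, head, sed) and a hypothetical function name, extract_title, something like the following may be enough for simple pages. It is a regex approximation rather than a real HTML parser: it does not decode entities such as &amp;amp; and unusual markup can fool it.

```shell
# extract_title (hypothetical helper): read HTML on stdin and print the
# text of the first <title> element, using only tr, grep, head and sed.
extract_title() {
    tr '\n' ' ' |                          # join lines so a title split across lines still matches
        grep -o -i '<title[^>]*>[^<]*' |   # keep "<title ...>text" for each title element
        head -n 1 |                        # only the first match
        sed -e 's/^<[^>]*>//' -e 's/^ *//' -e 's/ *$//'   # drop the tag and surrounding spaces
}
```

It could be fed from curl, for example `curl -sL http://example.org/ | extract_title`.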
1 Answer
With curl and pup:
    while IFS= read -r url
    do
        printf "%s " "$url"
        curl -sL "$url" |       # fetch the page
            pup 'title text{}'  # get the text of the title tag
    done < input
answered 8 hours ago by muru
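The loop in this answer produces the first layout from the question, with URL and title on one line. For the second layout, with the title on the line after its URL, a minimal sketch could change only the printf format. The function names fetch_title and titles_below are placeholders chosen here and are not part of the answer; curl and pup are called exactly as the answer calls them.

```shell
# fetch_title (hypothetical name): the answer's curl + pup pipeline for one URL.
fetch_title() {
    curl -sL "$1" | pup 'title text{}'
}

# titles_below (hypothetical name): for each URL in the file "$1",
# print the URL on one line and its title on the next.
titles_below() {
    while IFS= read -r url
    do
        printf '%s\n' "$url"    # URL on its own line
        fetch_title "$url"      # title on the following line
    done < "$1"
}
```

With the URL list in a file named input, as in the answer, this would be run as `titles_below input`.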
Works but seems to require a blank line at the end of the input file. Otherwise, the last URL in the file isn't processed. – David Yockey 8 hours ago
Not a blank line, just a newline at the end. That's the definition of a line. – muru 8 hours ago
You're right: a newline. I just pointed it out since the problem manifests if you paste a bunch of URLs into a text editor and neglect to hit Enter at the end of the last URL in the list. – David Yockey 8 hours ago
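As these comments note, the loop skips a last line that has no terminating newline. A common shell idiom for tolerating such files is to add `|| [ -n "$url" ]` to the loop condition. The sketch below applies it to the answer's loop; the wrapper function name add_titles is an assumption made here so the file name can be passed as an argument.

```shell
# add_titles (hypothetical name): the answer's loop wrapped in a function,
# reading URLs from the file "$1".
add_titles() {
    # read returns non-zero at end of file even when it has just filled
    # $url from an unterminated last line; the extra test keeps the loop
    # body running for that line.
    while IFS= read -r url || [ -n "$url" ]
    do
        printf '%s ' "$url"
        curl -sL "$url" |       # fetch the page
            pup 'title text{}'  # get the text of the title tag
    done < "$1"
}
```

Called as `add_titles input`, it should behave like the original loop on well-formed files and also process a final URL that was pasted without pressing Enter.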