Need help: How to download embedded audio from a web site
Hello! I was wondering if any of you know how to download the embedded audio from this website:
http://www.speakenglish.co.uk/phrases/at_a_restaurant
I tried but I couldn't. I think that page was created using Drupal. Your help is welcome. Thanks!
At 19:18 +0100 on 26.02.2014 (Wed), name at domain wrote:
> Hello! I was wondering if any of you know how to download embedded audio
> from this website:
> http://www.speakenglish.co.uk/phrases/at_a_restaurant
With a little bit of black Bash magic :) (and Abrowser's developer
console to find the track URLs):
wget http://www.speakenglish.co.uk/phrases/at_a_restaurant -O - | \
grep -E "soundManager\.play.*[a-zA-Z0-9_\\-]+'" -o | \
cut -d"'" -f2 | \
sed -e 's@^@http://speaklanguages.cachefly.net/sound/english/mp3/@g' \
-e 's/$/.mp3/g' | wget -i -
This will download all the audio files on the page into the current directory.
Better use the attached bash script, because my mail client might have
mangled the one-liner. If you want a different page, in theory you
only need to change the first URL (at_a_restaurant). If you consider this to
be software, then consider that I've released it into the public domain.
Enjoy! :)
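For anyone who wants to check the last stage of that pipeline before downloading anything, here is a small dry run. It is only a sketch: the two track names are made up for illustration (they are not taken from the page) and build_urls is a hypothetical helper name. It applies the same two sed substitutions as the one-liner and prints the resulting URLs instead of handing them to wget.

```shell
# Base URL taken from the one-liner above.
base='http://speaklanguages.cachefly.net/sound/english/mp3/'

# build_urls (hypothetical helper): read track names on stdin, one per line,
# prepend the base URL and append ".mp3", like the two sed expressions above.
build_urls() {
    sed -e "s@^@${base}@" -e 's/$/.mp3/'
}

# Made-up track names, for illustration only.
printf '%s\n' 'good_evening' 'the_bill_please' | build_urls
```

Once the printed URLs look right, piping the same output into wget -i - should download them.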
Wow! It works wonders! Thank you so much again ;)
Just for the pleasure of showing a simpler solution:
$ wget http://www.speakenglish.co.uk/phrases/at_a_restaurant -O - | awk -F \' '/play\(/ { print $2 ".mp3" }' | wget -B http://speaklanguages.cachefly.net/sound/english/mp3/ -i -
;-)
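To see what the awk stage does on its own, here is a sketch run on a single made-up line. The markup is an assumption that merely mimics the soundManager.play('...') calls found on the page; it is not copied from the site, and to_mp3_names is a hypothetical helper name. With -F \' the single quote becomes the field separator, so the track name ends up in field 2.

```shell
# Hypothetical sample line, for illustration only.
line="<a onclick=\"soundManager.play('good_evening')\">Good evening</a>"

# to_mp3_names (hypothetical helper): keep only lines containing "play(" and
# print the second quote-delimited field followed by ".mp3".
to_mp3_names() {
    awk -F \' '/play\(/ { print $2 ".mp3" }'
}

printf '%s\n' "$line" | to_mp3_names
```

The final wget then reads these relative names from standard input (-i -) and, as far as the wget manual describes the -B option, resolves each of them against the given base URL.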
At 00:01 +0100 on 27.02.2014 (Thu), name at domain wrote:
> Just for the pleasure of showing a simpler solution:
> ;-)
:)
Hi guys, I'm sorry to bother you again, but I tried downloading MP3s from this website again, and I wasn't able to:
http://www.speakenglish.co.uk/vocab/countries_and_nationalities
On this page (countries and nationalities), the script only downloads the countries. I don't get the Adjectives, Nationalities or Inhabitants.
Any suggestion or help?
Thanks again in advance.
That is because our solutions only look at the first "play(" string on each line of the page. On the page you now show, there are several "play(" per line. We just need an additional 'tr' command to break those lines:
$ wget http://www.speakenglish.co.uk/vocab/countries_and_nationalities -O - | tr '.' '\n' | awk -F \' '/^play\(/ { print $2 ".mp3" }' | wget -B http://speaklanguages.cachefly.net/sound/english/mp3/ -i -
Amazing! It works perfectly! You all are very smart. Thanks again!! :)
You do not need to be smart to use those commands, merely to learn them. The 'tr' command I added simply turns the dots into newlines (in the general case, it translates each character of the first set given as argument into the corresponding character of the second set).
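A quick sketch of both points, using made-up input. The sample line only mimics the page's markup with two "play(" calls; it is an assumption, not a copy of the real page.

```shell
# General case: each character of the first set becomes the character at the
# same position in the second set (e -> i, l -> p), so this prints "hippo".
printf 'hello\n' | tr 'el' 'ip'

# Hypothetical line with two "play(" calls, for illustration only.
line="<a onclick=\"soundManager.play('france')\">France</a> <a onclick=\"soundManager.play('french')\">French</a>"

# Without tr, awk sees a single line and prints only its second field:
# france.mp3
printf '%s\n' "$line" | awk -F \' '/play\(/ { print $2 ".mp3" }'

# With tr, every dot starts a new line, so each "play(" begins its own line:
# france.mp3 and french.mp3
printf '%s\n' "$line" | tr '.' '\n' | awk -F \' '/^play\(/ { print $2 ".mp3" }'
```

Every other dot on the page also starts a new line, but those lines are dropped because the awk pattern only keeps lines beginning with "play(". A track name that itself contained a dot would be cut short, so this trick assumes the names have none.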
If you want to learn those text processing commands (in your problem, the web page is the text), you can take a look at the sets of slides numbered 2 to 7 on this page: http://dcc.ufmg.br/~lcerf/en/mda.html#slides
The data for the exercises are still available at the mentioned addresses and solutions are included (try to achieve the desired results without looking at the solutions!).
I will take a look at it... It seems interesting ;)