Need help: How to download embedded audio from a web site

xilixi
Offline
Joined: 11/28/2013

Hello! I was wondering if any of you know how to download embedded audio from this website:

http://www.speakenglish.co.uk/phrases/at_a_restaurant

I tried but I couldn't. I think that page was created using Drupal. Your help is welcome. Thanks!

ivaylo
Offline
Joined: 07/26/2010

At 19:18 +0100 on 26.02.2014 (Wed), name at domain wrote:
> Hello! I was wondering if any of you know how to download embedded audio
> from this website:

> http://www.speakenglish.co.uk/phrases/at_a_restaurant

With a little bit of black Bash magic :) (and Abrowser's developer
console to find the track URLs)

wget http://www.speakenglish.co.uk/phrases/at_a_restaurant -O - | \
grep -E "soundManager\.play.*[a-zA-Z0-9_\\-]+'" -o | \
cut -d"'" -f2 | \
sed -e 's@^@http://speaklanguages.cachefly.net/sound/english/mp3/@g' \
-e 's/$/.mp3/g' | wget -i -

This will download all the audio files on the page into the current directory.
Better to use the attached bash script, because my mail client might have
mangled the one-liner. If you want a different page, theoretically you
only need to change the first URL (at_a_restaurant). If you consider this to
be software, then consider that I've released it into the public domain.
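For anyone who wants to see what each stage of the pipeline does without touching the network, here is a minimal sketch run on a single made-up line. The sample markup and the track name good_evening are assumptions for illustration; the real page may look different.

```shell
# Hypothetical sample line; the real page markup may differ.
sample="<a href=\"#\" onclick=\"soundManager.play('good_evening');\">Good evening</a>"

# Stage 1 (grep -o): keep only the text from soundManager.play up to a closing quote.
# Stage 2 (cut): take what sits between the first pair of single quotes (the track name).
# Stage 3 (sed): prepend the base URL and append the .mp3 extension.
url=$(printf '%s\n' "$sample" |
  grep -E -o "soundManager\.play.*[a-zA-Z0-9_-]+'" |
  cut -d"'" -f2 |
  sed -e 's@^@http://speaklanguages.cachefly.net/sound/english/mp3/@' -e 's/$/.mp3/')

printf '%s\n' "$url"
```

In the real one-liner the resulting list of URLs is piped into wget -i - instead of being printed.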

Enjoy! :)

xilixi
Offline
Joined: 11/28/2013

Wow! It works wonders! Thank you so much again ;)

Magic Banana

I am a member!

Offline
Joined: 07/24/2010

Just for the pleasure of showing a simpler solution:
$ wget http://www.speakenglish.co.uk/phrases/at_a_restaurant -O - | awk -F \' '/play\(/ { print $2 ".mp3" }' | wget -B http://speaklanguages.cachefly.net/sound/english/mp3/ -i -

;-)

ivaylo
Offline
Joined: 07/26/2010

At 00:01 +0100 on 27.02.2014 (Thu), name at domain wrote:
> Just for the pleasure of showing a simpler solution:

> ;-)

:)

xilixi
Offline
Joined: 11/28/2013

Hi guys, I'm sorry to bother you again, but I tried downloading MP3s from this website again, and I wasn't able to.

http://www.speakenglish.co.uk/vocab/countries_and_nationalities

On this page (countries and nationalities), the script only downloads the countries. I don't get the Adjectives, Nationalities or Inhabitants.

Any suggestions or help?

Thanks again in advance.

Magic Banana

I am a member!

Offline
Joined: 07/24/2010

That is because our solutions only look at the first "play(" string in every line of the page. In the page you now show, there are several "play(" per line. We just need an additional 'tr' command to break those lines:
$ wget http://www.speakenglish.co.uk/vocab/countries_and_nationalities -O - | tr '.' '\n' | awk -F \' '/^play\(/ { print $2 ".mp3" }' | wget -B http://speaklanguages.cachefly.net/sound/english/mp3/ -i -
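To make the difference visible offline, here is a small sketch on one made-up line containing two "play(" calls. The markup and the track names france and french are assumptions, not copied from the real page.

```shell
# Hypothetical line with two tracks; the real page markup may differ.
line="<td onclick=\"soundManager.play('france')\">France</td><td onclick=\"soundManager.play('french')\">French</td>"

# Without tr: awk sees a single line, so $2 is only the first quoted name.
without_tr=$(printf '%s\n' "$line" | awk -F \' '/play\(/ { print $2 ".mp3" }')

# With tr: every dot becomes a newline, so each play( starts its own line.
with_tr=$(printf '%s\n' "$line" | tr '.' '\n' | awk -F \' '/^play\(/ { print $2 ".mp3" }')

printf 'without tr:\n%s\nwith tr:\n%s\n' "$without_tr" "$with_tr"
```

One caveat of this approach: since every dot is turned into a newline, a track name that itself contains a dot would be cut in two.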

xilixi
Offline
Joined: 11/28/2013

Amazing! It works perfectly! You all are very smart. Thanks again!! :)

Magic Banana

I am a member!

Offline
Joined: 07/24/2010

You do not need to be smart to use those commands. Merely to learn them. The 'tr' command I added simply turns the dots into newlines (in the general case, it turns each character of the set given as first argument into the character at the same position in the set given as second argument).
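A tiny sketch of that behaviour, using made-up input strings:

```shell
# General case: each character of the first set is replaced by the character
# at the same position in the second set (a->x, b->y, c->z).
mapped=$(printf 'aabbcc\n' | tr 'abc' 'xyz')
printf '%s\n' "$mapped"

# The usage from this thread: a one-character set, '.' -> newline.
split=$(printf 'one.two.three\n' | tr '.' '\n')
printf '%s\n' "$split"
```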

If you want to learn those text processing commands (the web page is the text in your problem), you can take a look at the sets of slides numbered 2 to 7 on that page: http://dcc.ufmg.br/~lcerf/en/mda.html#slides

The data for the exercises are still available at the mentioned addresses and solutions are included (try to achieve the desired results without looking at the solutions!).

xilixi
Offline
Joined: 11/28/2013

I will take a look at it... It seems interesting ;)