youtube-dl through tor?

tonlee
Offline
Joined: 09/08/2014

Can you let youtube-dl go through tor?
Thank you.

zigote
Offline
Joined: 03/04/2019

Yes.

Better: use avideo and an onion instance of invidious with command line options like:

--proxy socks5://user:pass@127.0.0.1:9050

and HTTP headers like those of Tor browser:

--user-agent <...>
--add-header <...>
--add-header <...>
--add-header <...>

Read 'avideo --help'.
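For illustration, here is a minimal Python sketch that assembles such a command line. The flags `--proxy`, `--user-agent` and `--add-header` are the ones named above; the function name `build_command`, the example URL and the header value are hypothetical placeholders, and the default proxy address assumes Tor's SOCKS port is listening on 127.0.0.1:9050.

```python
import shlex


def build_command(url, proxy="socks5://127.0.0.1:9050",
                  user_agent=None, headers=None, program="youtube-dl"):
    """Return an argv list for youtube-dl (or a fork such as avideo)."""
    argv = [program, "--proxy", proxy]
    if user_agent:
        argv += ["--user-agent", user_agent]
    # --add-header takes a single FIELD:VALUE argument per header
    for field, value in (headers or {}).items():
        argv += ["--add-header", f"{field}:{value}"]
    argv.append(url)
    return argv


if __name__ == "__main__":
    # Hypothetical URL and header value, for illustration only
    cmd = build_command(
        "http://example.onion/watch?v=VIDEO_ID",
        headers={"Accept-Language": "en-US,en;q=0.5"},
    )
    # shlex.join quotes each argument so the line can be pasted into a shell
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` would avoid shell quoting issues altogether.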

tonlee
Offline
Joined: 09/08/2014

Is this what you write?
youtube-dl --proxy "socks5://127.0.0.1:9050/"

I read you can utilize proxychains.

https://github.com/WWBN/AVideo
Is avideo software you have to compile?

zigote
Offline
Joined: 03/04/2019

pip install avideo

chaosmonk

Offline
Joined: 07/07/2017

Why avideo? That fork hasn't been maintained in years. The [version pip will install][1] is based on youtube-dl 2017.9.27.

[1]: https://pypi.org/project/avideo/

zigote
Offline
Joined: 03/04/2019

> Why avideo?

Because (AFAIK) it does not execute JavaScript. youtube-dl does.

> The [version pip will install][1] is based on youtube-dl 2017.9.27.

That is the latest version. Of course - avideo itself is based on youtube-dl with "improvements"

https://notabug.org/GPast/avideo

> That fork hasn't been maintained in years.

It works though.

Considering the above - what would you choose and why? Personally I don't like some script downloading other scripts (especially non-free ones) and running them on my machine.

chaosmonk

Offline
Joined: 07/07/2017

> Because (AFAIK) it does not execute JavaScript. youtube-dl does.

There has been some confusion around youtube-dl's JS "interpreter". It is not a full, Turing-complete JS interpreter like web browsers have. It's closer to a scraper. It parses JS files in order to find and calculate video URLs. I've read several analyses as to what youtube-dl actually does with JS files. Here's the first one I was able to find just now.

https://lists.nongnu.org/archive/html/gnu-linux-libre/2017-07/msg00000.html
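To make the "closer to a scraper" distinction concrete, here is a toy Python sketch. It is not youtube-dl's actual code, and the JS snippet and the operation names `rev`, `drop` and `swap` are invented for the example. The idea it illustrates is that the program reads the JS as text, recognizes a small fixed set of string operations, and applies its own implementation of each one. Nothing from the downloaded file is executed, and anything unrecognized is rejected.

```python
import re

# Invented stand-in for a downloaded JS function (assumption, not real player code)
TOY_JS = ('function sig(a){a=a.split("");ops.swap(a,3);'
          'ops.rev(a,0);ops.drop(a,2);return a.join("")}')


def apply_toy_ops(js_source, text):
    """Apply the operations named in js_source to text without running any JS."""
    chars = list(text)
    # Scrape the sequence of calls, e.g. ("swap", "3"), ("rev", "0"), ("drop", "2")
    for name, arg in re.findall(r"ops\.(\w+)\(a,(\d+)\)", js_source):
        n = int(arg)
        if name == "rev":
            chars.reverse()
        elif name == "drop":
            del chars[:n]
        elif name == "swap":
            i = n % len(chars)
            chars[0], chars[i] = chars[i], chars[0]
        else:
            # Unknown operations are refused rather than interpreted
            raise ValueError(f"unsupported operation: {name}")
    return "".join(chars)


if __name__ == "__main__":
    print(apply_toy_ops(TOY_JS, "abcdefg"))
```

Whether the real implementation is as restrictive as this sketch is exactly the open question discussed in this thread.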

> That is the latest version. Of course - avideo itself is based on youtube-dl with "improvements"

avideo removes youtube-dl's interpreter, and for the (many) situations in which this causes YouTube downloads to fail adds a misleading message incorrectly attributing the issue to DRM. It does not otherwise "improve" youtube-dl. It can be seen in the commit history that after making these changes, ongoing maintenance consisted of merging new upstream versions, until September 2017 when the developer disappeared. The latest version is not a palemoon-esque fork of an old youtube-dl version. It's just a very out-of-date youtube-dl version with some functionality removed.

> It works though.

Not really. I used to use it a couple years ago, and it did not work for most YouTube videos because it would not extract video URLs from JS files. That was back when it was still reasonably up-to-date with youtube-dl. Considering that old youtube-dl releases tend not to work either, I presume that avideo's performance is more likely worse, not better, than it was a few years ago.

> Considering the above - what would you choose and why? Personally I don't like some script downloading other scripts (especially non-free ones) and running them on my machine.

If this were an accurate description of what youtube-dl does, I would choose avideo, and I indeed did choose avideo back when I believed that this was what youtube-dl did. After learning more, I changed my mind and now choose youtube-dl.

zigote
Offline
Joined: 03/04/2019

Thanks for this info. Although it clarifies some points I think the issue with JS requires a more in-depth analysis. I have not been digging into the python code of youtube-dl's JS interpreter because its readability is not good enough for my eye. Here is another comment/question, you may want to have a look at the links:

https://notabug.org/GPast/avideo/issues/12#issuecomment-11299

Generally my thoughts are:

1. Anything that downloads a script and attempts to work depending on "directions" inside that script may open the door to security issues. E.g. you don't know what input may result in an exploit. To know you need the in-depth analysis. All this doesn't need Turing completeness or any reputed person's affirmations of license compatibility.

2. It may not be ideal but if something works without 1 - it is surely safer

3. youtube-dl has had commits after the date of that mailing list link you give

If the issues above need to be resolved I would rather write my own downloader than relying on the dated comments of people on a mailing list which is not even part of the actual software development.

However as I said avideo in combination with invidious (Tor instance) works for me and I haven't had an issue with any video. You say "not really" - can you show an example of video which cannot be downloaded with avideo but only with youtube-dl?

chaosmonk

Offline
Joined: 07/07/2017

> I think the issue with JS requires a more in-depth analysis

I'd be happy to see more in-depth analysis. So far I have only seen numerous repetitions of (1) someone claiming that youtube-dl executes non-free JS, (2) someone else clarifying that it actually parses the JS for the information it needs to calculate video URLs, and (3) the conversation fizzling out with no additional information provided. I'm open to new information, but in the absence of new information I see the youtube-dl debate as a similar situation to the debate over whether Chromium is non-free. It's just the same rumors being repeated over and over and when you press people for evidence the conversation ends. (I'm not referring to you. I believe that you're actually interested in applying further scrutiny and determining the truth.)

> However as I said avideo in combination with invidious (Tor instance) works for me and I haven't had an issue with any video. You say "not really" - can you show an example of video which cannot be downloaded with avideo but only with youtube-dl?

I remember it being an issue for nearly any video containing a studio recording of a song. Live versions of songs and covers sometimes worked. A home video by some random person would usually work. A video published by a well-known organization usually would not. It's been a while. I would need to reinstall avideo to be able to point to a specific video that doesn't work.

If you haven't run into this issue, my guess is that it is because you are using Invidious instead of YouTube as the video source, and Invidious is calculating the video URL for you, shifting the task of handling the video URL calculation from your machine via youtube-dl to the Invidious server via Invidious. When I have a chance I will install avideo and try to determine whether or not this is the case.

> If the issues above need to be resolved I would rather write my own downloader than relying on the dated comments of people on a mailing list which is not even part of the actual software development.

As far as I've been able to find, none of the people involved in the actual software development are of the opinion that this is an issue either. The only "evidence" I've seen that youtube-dl's handling of JS files *is* a problem is unsubstantiated rumors on mailing lists.

The kind of analysis I'd be interested in is one that specifically investigates whether youtube-dl's handling of the function that calculates the video URLs is secure or not, and, if it is not secure, filing an issue on youtube-dl's bug tracker.

I would also be interested to know how Invidious handles this function. I have no familiarity with the syntax of Crystal and have a hard time understanding Invidious's code, but either (a) Invidious handles it similarly, in which case shifting this task from youtube-dl to Invidious simply moves any potential security issues from the client side to the server side, or (b) Invidious handles this function in a more secure way than youtube-dl does, in which case someone should implement Invidious's approach for youtube-dl (that is if youtube-dl's approach is found to be insecure). However, threads like this

> https://notabug.org/GPast/avideo/issues/12#issuecomment-11299

don't provide any new information. It follows the same format as every other discussion I've seen of this, except that (1) and (2) are reversed. Also, the commenter refers to "DRM", which leads me to think that they have based their understanding of the situation on avideo's inaccurate error message rather than their own observations.

chaosmonk

Offline
Joined: 07/07/2017

I believe that [this][1] is where Invidious does the equivalent of what youtube-dl does to extract the video URLs. At first glance, it looks to me like it similarly parses a JS file in order to determine a function (named "decrypt_function" in this case) that calculates the video URL, but I don't really understand it and am not qualified to say whether Invidious's approach and/or youtube-dl's approach is secure.

[1]: https://github.com/omarroth/invidious/blob/02d4186b110bb5cf8cc07672a4ff45f7189eb3e6/src/invidious/helpers/signatures.cr

zigote
Offline
Joined: 03/04/2019

As a whole I agree with what you wrote.
Some comments:

> I'd be happy to see more in-depth analysis.

The problems with that are:

1. Such analysis needs a lot of time. Trying to understand what code written by others does (especially when it has poor readability which I believe is the case with both youtube-dl and invidious) is tedious.

2. The code is not static, so continuous monitoring would be necessary. Not really a work-for-free task.

That's why I say it would be easier to write one's own downloader. If the only goal is to download YouTube videos with invidious it should be quite simple. The URL of the video is on the lines containing type="video/mp4" and the title is in the title tag. Zero JS required.

3. It is important who makes the analysis and with what purpose. Merely confirming "OK, it does not run non-free JS" is not an evaluation of security (or privacy). From discussions seen so far it seems the only concern is "freedom" which is too weak a requirement (at least to my mind).
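A rough sketch of the "own downloader" idea from the paragraph above (title from the title tag, video URL from the element carrying type="video/mp4"), using only Python's standard library. The sample markup is an assumption about what such a page looks like, not copied from a real Invidious instance, and the class and function names are made up for the example.

```python
from html.parser import HTMLParser


class VideoPageParser(HTMLParser):
    """Collect the page title and the src of every video/mp4 <source> tag."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.sources = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "source":
            # The type attribute may carry a codecs suffix after "video/mp4"
            mime = attrs.get("type") or ""
            if mime.startswith("video/mp4") and attrs.get("src"):
                self.sources.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(html):
    """Return (title, list of mp4 source URLs) found in an HTML string."""
    parser = VideoPageParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.sources


if __name__ == "__main__":
    # Made-up sample page (assumption about the markup)
    sample = (
        "<html><head><title>Some video - Invidious</title></head><body>"
        '<video><source src="/videoplayback?id=abc&amp;itag=18" '
        'type="video/mp4" label="medium"></video></body></html>'
    )
    print(extract(sample))
```

Fetching the page (through Tor or otherwise) and saving the file are left out on purpose. The point is only that the parsing step needs no JS.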

> similar situation to the debate over whether Chromium is non-free

Except that with Chromium it is not a discussion of security.

> I remember it being an issue for nearly any video containing a studio recording of a song.

Before posting my previous reply I explicitly checked popular VEVO musical videos with many millions of views and I could download them without problems. Invidious obviously does a good job and as long as it is doing its thing on the server and not on my machine that's fine with me.

chaosmonk

Offline
Joined: 07/07/2017

> Trying to understand what code written by others does (especially when it has poor readability which I believe is the case with both youtube-dl and invidious) is tedious.

Yes, with enough time I could probably understand how the relevant code works in both projects, but even then would not be qualified to confirm that either approach is secure. So this is something I would be "happy" to see, but I'm not in a position to make happen. Since I can't meaningfully advance this conversation myself, I'm only interested in having it again if/when there is new information.

> Invidious obviously does a good job and as long as it is doing its thing on the server and not on my machine that's fine with me.

I use my own Invidious instance, so for me a vulnerability in youtube-dl would not be worse than a similar vulnerability in Invidious. Even if I relied on someone else's server to run Invidious, I would still care whether or not their server was secure. My goals are much broader than just securing my own device, though.

You might be interested in [hypervideo][1]. It's a fork of youtube-dl similar to avideo, but it is more up-to-date with youtube-dl and it uses an API call to Invidious to get YouTube video URLs. This approach is not for me, and it sounds like your avideo+Invidious setup is working well for now, but you might keep hypervideo in mind in case avideo ever gives you problems.

[1]: https://libregit.org/heckyel/hypervideo

zigote
Offline
Joined: 03/04/2019

Thanks for the info about hypervideo. It might be a better alternative as it seems kept up to date with the upstream version.

As for your usage of your own invidious instance - I don't know why you need this. I don't deliver this web service to users, so I just don't need it. My comments are mainly on the client side which seems to be the topic.

zigote
Offline
Joined: 03/04/2019

OK, so I tried hypervideo.
Results (compared to avideo):

1. Trying to download a video from an invidious .onion instance (without --proxy switch)

hypervideo: detects that it is Invidious and starts downloading through Invidious API *without* proxy, i.e. not through Tor. This is wrong. onion instances must not be resolved without using Tor and accessing anything on them should not result in de-anonymization of the connection.

avideo: does not resolve the .onion domain (as expected)

2. Same as 1 but using --proxy socks5://127.0.0.1:9050/

hypervideo:

- shows: "ERROR: Unable to download webpage: HTTP Error 429: Too Many Requests (caused by )" for videos which show up on home page of invidious. Using alternative .onion instance seems to be a workaround

- shows: "ERROR: Unable to download webpage: HTTP Error 403: Forbidden (caused by )" for VEVO videos

avideo: downloads all videos successfully (no need for workaround)

chaosmonk

Offline
Joined: 07/07/2017

> As for your usage of your own invidious instance - I don't know why you need this.

High-traffic instances get blocked by Google and need humans to complete CAPTCHAs in order for them to remain functional. The main Invidious instances have dealt with this by using virtual sweatshop labor to solve the CAPTCHAs. I am concerned about privacy, but I am also concerned about globalization and economic exploitation, so I prefer to avoid this dilemma by running my own, low-traffic instance that does not run into these CAPTCHAs.

> OK, so I tried hypervideo.
> Results (compared to avideo):

It sounds like avideo is better for your use case. I wonder how much work would be involved in producing an up-to-date version of avideo. The commits involved in merging new upstream versions are pretty small, but rather opaque, so it would be a matter of learning to understand the build system, which seems a little convoluted: https://notabug.org/GPast/avideo/wiki/Developer+Documentation#avideo-patching-system

zigote
Offline
Joined: 03/04/2019

I have never had an issue with Invidious instances (except rare glitches of the site for a short time, e.g. channels not working or sth similar).

To my mind running an invidious instance would make sense if the instance serves many people. A single person running a single instance for the same single client is simply creating bloat on one's own machine (and still being connected to Google). If your goal is to simply avoid Youtube's JS perhaps you could rather use https://github.com/trizen/youtube-viewer as a client.

> It sounds like avideo is better for your use case.

What do you mean "my" use case? youtube-dl, avideo, hypervideo all do the same: download videos. What other use cases are there?

> producing an up-to-date version of avideo

Considering it works flawlessly I don't know why it may need an update. Is it missing any features? Does it have any bugs which are fixed upstream? IOW: let's first evaluate the needs and the purpose before estimating potential effort.

chaosmonk

Offline
Joined: 07/07/2017

> I have never had an issue with Invidious instances

That is because there are human workers solving CAPTCHAs for pennies to keep the large Invidious instances working.

> To my mind running an invidious instance would make sense if the instance serves many people. A single person running a single instance for the same single client is simply creating bloat on one's own machine (and still being connected to Google).

It's not for a single client. It runs on my VPS, and I use it with multiple devices. I am also beginning to share it with some other people, though I want to increase the amount of traffic slowly to see how much it is possible to get away with without getting flagged by Google.

> What do you mean "my" use case? youtube-dl, avideo, hypervideo all do the same: download videos. What other use cases are there?

By "your use case" I mean downloading YouTube videos from a specific site (an Invidious onion instance) that is not supported by youtube-dl and probably only works at all with avideo as a side effect of YouTube support. Other use cases are downloading videos from YouTube or from the numerous other sites that youtube-dl supports. youtube-dl support for YouTube and these other sites breaks all the time due to API changes. Only up-to-date versions of youtube-dl work reliably, so avideo is not better for these use cases.

> If your goal is to simply avoid Youtube's JS perhaps you could rather use https://github.com/trizen/youtube-viewer as a client.

My goal is a more free and just society. Avoiding YouTube's JS and tracking and (equally importantly) helping other people to do the same is a very small part of that. I used to use and recommend youtube-viewer, but when Invidious was created switched to using and recommending that because it has fewer obstacles to adoption. Exploiting virtual sweatshop labor does not result in a more free and just society, so when I learned that the large Invidious instances rely on this I began using my own instance.

> Is it missing any features? Does it have any bugs which are fixed upstream?

Yes, many. You can check the last several years of youtube-dl's commit history to see what has changed. It sounds like these missing features and bugs have not impacted the way you use avideo though. I only brought up the idea of reviving avideo because you said "Thanks for the info about hypervideo. It might be a better alternative as it seems kept up to date with the upstream version", which led me to think you might be interested in a more up-to-date avideo given that hypervideo did not work for you.

I'll check this thread again to see if you have any other comments, but I otherwise don't see a need to continue this discussion. Your first comment answered the OP's question, and I only replied in order to understand why you recommended avideo, which you have now explained. Thanks.

zigote
Offline
Joined: 03/04/2019

> Only up-to-date versions of youtube-dl work reliably, so avideo is not better for these use cases.

Actually I have used avideo with various other sites and those that are not supported are very very few (less than 10%). I don't know if youtube-dl supports those same sites which don't work with avideo.

> I mean downloading YouTube videos from a specific site (an Invidious onion instance)

Yes, I replied in that context because that is what the OP asked for. But it is nice that you provided extra info.

> My goal is a more free and just society.

Unfortunately that is not going to happen only through the software we (do not) use. But that is a different discussion.

> You can check the last several years of youtube-dl's commit history to see what has changed.

As soon as I finish the 324059812059 other things I am working on :))

> I otherwise don't see a need to continue this discussion

Yep. Thanks for sharing your thoughts.

chaosmonk

Offline
Joined: 07/07/2017

> Unfortunately that is not going to happen only through the software we (do not) use. But that is a different discussion.

Agreed.

> As soon as I finish the 324059812059 other things I am working on :))

:)

chaosmonk

Offline
Joined: 07/07/2017

I finally got around to installing avideo and testing some things.

First, I used avideo to attempt to download a music video in three different ways: via youtube.com, via an Invidious clearnet instance, and via an Invidious onion instance. avideo failed to download from youtube.com directly, but was able to download from both Invidious instances. This was what I anticipated. However, I then compared avideo's handling of the Invidious onion instance to that of youtube-dl, and the results were not what I expected.

As zigote described, without "--proxy socks5://127.0.0.1:9050/", avideo does not download the video, and with "--proxy" it downloads the video from the Invidious onion domain. youtube-dl on the other hand, with or without "--proxy", downloads the video from a "googlevideo.com" domain. Without "--proxy", the IP address contained in the googlevideo.com URL is that of my VPN. With "--proxy", the IP address is an unfamiliar one, presumably a Tor exit node. So youtube-dl does appear to go through Tor when told to, but whereas avideo only connects to the onion domain, youtube-dl downloads the video from a clearnet domain.

andyprough
Offline
Joined: 02/12/2015

So which is the better way? If you knew upfront that avideo only connects through the onion domain, and that is the feature you wanted, I guess that would be better. If you wanted to control how it connects, I guess yt-dl is better.

chaosmonk

Offline
Joined: 07/07/2017

> So which is the better way?

There are two questions: (1) what should happen when the user attempts to download from an onion domain without Tor, and (2) when downloading a video through Invidious (whether via Tor or not) should Invidious proxy the video or provide a googlevideo.com URL for yt-dl to access (via Tor if appropriate)?

In the case of (1) I prefer avideo's behavior. I consider yt-dl's behavior dangerous. If a user is bothering to use an Invidious onion instance, they likely want to download the video anonymously. If they accidentally forget to use the "--proxy" option with yt-dl then Google will receive their IP address along with the ID of the video they are downloading. It's safer for the program to fail like avideo does, and force the user to either specify "--proxy" if they want to be anonymous or specify a non-onion URL if they don't.
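The fail-closed behavior preferred here can be sketched as a small pre-flight check that a wrapper script might run before calling the downloader. This is only an illustration in Python. The function name `missing_tor_proxy` is hypothetical, and treating any `socks*` proxy scheme as "goes through Tor" is a simplifying assumption.

```python
from urllib.parse import urlsplit


def missing_tor_proxy(url, proxy=None):
    """Return True if url points at a .onion host but no SOCKS proxy is set."""
    host = (urlsplit(url).hostname or "").rstrip(".")
    if not host.endswith(".onion"):
        return False  # clearnet URL: nothing to enforce
    if not proxy:
        return True   # onion URL without any proxy
    # Assumption: only a SOCKS proxy (such as Tor's SOCKS port) is acceptable
    return not urlsplit(proxy).scheme.lower().startswith("socks")


if __name__ == "__main__":
    # Hypothetical onion address, for illustration only
    url = "http://example.onion/watch?v=VIDEO_ID"
    if missing_tor_proxy(url, proxy=None):
        raise SystemExit("refusing to fetch a .onion URL without --proxy")
```

With a check like this the forgotten "--proxy" case ends in an error message instead of a clearnet request.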

In the case of (2) I can see a case for either behavior. Proxying puts a large load on the Invidious server, so if possible the ideal behavior would probably be to respect the default set by the instance admin unless overridden with "local=true" or "local=false" in the URL as per https://github.com/omarroth/invidious/wiki/List-of-URL-parameters
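As a small illustration of the URL parameter mentioned above, this Python sketch sets or replaces "local" on a watch URL. The helper name `set_local` and the host `invidious.example` are made up. The only thing taken from the thread is that the parameter is spelled "local" and takes "true" or "false".

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_local(url, local):
    """Return url with its 'local' query parameter set to true or false."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous 'local'
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "local"]
    query.append(("local", "true" if local else "false"))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(set_local("https://invidious.example/watch?v=abc", True))
```

A downloader that respected the instance default would simply leave the URL untouched unless the user asked for one of the two values.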

andyprough
Offline
Joined: 02/12/2015

Also it sounds like avideo's method is usually going to result in a slower download. Since you and I are already using VPNs, I'm thinking yt-dl's method might be faster and better for most instances. For those not using a VPN, then the avideo method may be better for the typical case.