For a few domains, my .htaccess file isn't effective at blocking them from access by my Abrowser

12 replies [Last post]
amenex
Offline
Joined: 01/04/2015

After viewing my Recent Visitors file in Cpanel, I set about blocking a couple of image pirates.

Adding their IP addresses (gleaned with nslookup) to my .htaccess file (located under my domain in which the images are stored) did not block their use of my images. Similarly, adding the actual domain names to my .htaccess file also failed to achieve any blocking effect.

Then I submitted a support ticket to my ISP ... shortly thereafter, I saw in the recent visitors file that they had tried to access those images and had gotten the desired 403 response ("access denied"). Nevertheless, still no luck with my installation of Abrowser ... apparently still not blocked. Blocked for other users, though.

Just checking, I tried the offending URLs in another installation of Abrowser on a USB-connected hard drive on which Trisquel is also installed ... Whoopee! Images gone from the offending website ... seemingly, a success.

Not so lucky; images still appear in my Abrowser. I deleted the History from this reluctant-to-block Abrowser ... no luck.
Icecat also shows the blocked images; it's in the same Trisquel installation.

This has only happened with a couple of domains. All the rest are immediately blocked and return "access denied" right after I've added their domain name or its IP address to my .htaccess file.

Something about my installations of Abrowser and Icecat is over-riding the blocked status of those domains. What could that be ?

George Langford
amenex

Magic Banana

Offline
Joined: 07/24/2010
jxself
Offline
Joined: 09/13/2010

For sure.

amenex
Offline
Joined: 01/04/2015

Magic Banana: Thanks for the heads-up. FSF doesn't like mis-appropriation either. It pollutes the code.

The first two images in question are part of an online course, open to all without any payment, which I participated in and then preserved fifty years later, all at my own expense. Folks access that course from all over the world, every day, so their education continues even after my own university has long since abandoned it.

Similarly, the third image is one which I made with a long-obsolete microscope, an early digital camera, and an early computer, but with a modern imaging technique that makes use of the unique abilities of the polarizing illumination system in the obsolete microscope, demonstrating that the obsolete microscope can produce an image every bit as good as a modern microscope, but with the flexibility of the old microscope.

George Langford
amenex

jxself
Offline
Joined: 09/13/2010

"Something about my installations of Abrowser and Icecat is over-riding the blocked status of those domains. What could that be ?"

It's nothing to do with the browser. At all. The determination of which HTTP response code to send (either "200 OK" along with the requested resource, or "403 Forbidden" instead) is entirely the decision of the web server. Entirely.
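One way to see the server's decision without any browser (or browser cache) in the way is to request the image directly and read the status code. The following is only a minimal sketch, assuming the block is keyed on the Referer header; the function names are made up for illustration, and any URL passed in would have to be the real image and hotlinking page, which are not given in this thread.

```python
import urllib.error
import urllib.request


def build_request(image_url, referer=None):
    """Build a HEAD request for image_url, optionally with a Referer header."""
    headers = {"User-Agent": "status-check-sketch"}
    if referer is not None:
        # Assumption: the server's hotlink rule looks at this header.
        headers["Referer"] = referer
    return urllib.request.Request(image_url, headers=headers, method="HEAD")


def fetch_status(image_url, referer=None, timeout=10):
    """Return the HTTP status code the server sends for this request."""
    request = build_request(image_url, referer)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code (e.g. 403) is on the exception.
        return err.code
```

Calling `fetch_status` once without a referer and once with the hotlinking page as referer, from each of the two installations, would show whether the server itself answers differently or whether the difference lies elsewhere.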

It'll be almost impossible for us to determine exactly what is wrong without in-depth access to the full (and I mean full) web server configuration. Since you mention cPanel (a proprietary program.... eek!) you will probably want to get in touch with your hosting company to discuss the problem in more detail.

Sorry, but this is not a Trisquel problem.

amenex
Offline
Joined: 01/04/2015

jxself properly states that the decision of whether to block (403) or allow (200) is up to the server, which is where my .htaccess file resides. All the server can do is allow or deny access to the data that I am paying them to make available to the public. It's data which I created with my own efforts and expertise ... and it's copyrighted according to U.S. law. A lot is lost when that data is presented out of its original context. I do not mind when the source is displayed along with the data; some folks actually do ask for permission before re-publishing that data.

That said, what I am seeing is that when my ISP's service technician opens the same URL that is publishing my copyrighted material without permission, attribution, or paying for its storage, that material produces a 403 response. When I open that URL from the same laptop and wireless router as the one about which this discussion centers, but from the Trisquel installation that resides on a USB-connected hard drive, it produces the same 403 response as that which my ISP technician gets. Therefore, the rephrased question to be answered is: What is different between these two Trisquel installations, and how do I find that difference ? It happens for both Abrowser and Icecat. And it happens even after I have deleted the browsing history from Abrowser.

An aside: I belatedly discovered that on the very same webpage on which the other two images are displayed, there is a third image of mine which has had its URL "laundered" through a third party who actually did make attribution, but who fails to protect its own presentation from being appropriated by the present two domains. All I can do for that image is to block the third party's access to the image, because my domain is not in the URL that points to my image from the offending webpage.

George Langford
amenex

jxself
Offline
Joined: 09/13/2010

It's not a browser problem. Sorry, but you're barking up the wrong tree.

quantumgravity
Offline
Joined: 04/22/2013

I think it was already pointed out, but all the .htaccess file does is basically change the config of the server within a limited scope, without touching the server config files.
That means that there is no way Abrowser is to blame for that.

amenex
Offline
Joined: 01/04/2015

quantumgravity: When I started using the .htaccess file to block copyright usurpers back in May of 2017, the effect was immediate for all but one domain. Now it's two more who are escaping that block, but for only one of my two Trisquel installations; both have the same IP address because they access the internet through the same router. For everyone else, access to my images through those domain names is blocked. My own ISP showed that, as well as a couple other IP addresses.

Again: _Something_ is different about these two Trisquel installations, and how do I discover that difference ?
Keep in mind that the effect happens for _both_ Icecat and Abrowser.

quantumgravity
Offline
Joined: 04/22/2013

Can you check out the access.log of the Apache server, or does your provider deny you permission to do so?
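For what it's worth, here is a rough sketch of what to look for in that log: every request whose Referer mentions the hotlinking site, together with the status the server returned. It assumes the host writes Apache's "combined" log format, which may not be the case; the sample line and the names in it are made up, not taken from a real log.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


def requests_from_referer(lines, referer_fragment):
    """Yield (host, status, request) for entries whose Referer contains the fragment."""
    for line in lines:
        entry = parse_line(line)
        if entry and referer_fragment in entry["referer"]:
            yield entry["host"], entry["status"], entry["request"]


# Made-up sample entry using documentation-only addresses and names.
SAMPLE = (
    '203.0.113.7 - - [10/Dec/2017:13:55:36 -0500] '
    '"GET /images/sample.jpg HTTP/1.1" 403 199 '
    '"http://hotlinker.example/page.html" "Mozilla/5.0"'
)
```

If the installation that still shows the images never appears in the log at the moment the page is loaded, the images are not being fetched from the server at that time.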

amenex
Offline
Joined: 01/04/2015

quantumgravity asked:
[QUOTE] Can you check out the access.log of the Apache server, or does your provider deny you permission to do so? [/QUOTE]

Yes; that said, if I download the .GZ visitor log file mid-month, somehow that truncates the end-of-month version, and so I hesitate to do so because I'm keeping track of a bunch of malevolents who frequently run "HEAD / HTTP/1.1" queries (netting them nothing) and never look at any other URLs in my domain. That's why I try to preserve the whole-month logs.

However, I _can_ ask for the Latest Visitors access log, but I can view only the most recent thousand visits. The monthly logs are typically of 200,000 to 350,000 visits, but considerably more information is saved in them. I have been keeping track of the malevolents since September 2016. Their visits peaked in November 2016 and have continued at almost the same pace ever since.
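The bookkeeping described above could be sketched roughly as follows: count, per client address, the log entries whose request line is exactly "HEAD / HTTP/1.1". This is only an illustration, assuming each log line begins with the client address (as in Apache's common and combined formats); the function names are hypothetical.

```python
import gzip
from collections import Counter


def count_head_root(lines):
    """Count, per client address, the entries whose request is 'HEAD / HTTP/1.1'."""
    counts = Counter()
    for line in lines:
        if '"HEAD / HTTP/1.1"' in line:
            # Assumption: the client address is the first space-separated field.
            counts[line.split(" ", 1)[0]] += 1
    return counts


def count_head_root_in_gz(path):
    """Run the same count over a gzip-compressed monthly log file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return count_head_root(handle)
```

Running `count_head_root_in_gz` on a local copy of a saved monthly log leaves the file on the server untouched.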

I've been keeping Trisquel up-to-date on both hard drives - the external one as well as the internal one. If there's a place that the offending code might be stored on one hard drive and not the other, I'd very much like to know about it.

George Langford
amenex

gslima
Offline
Joined: 11/23/2017

First of all, it would only block them if they use your site via a proxy. There's no way of blocking image access from other sites unless people use only the HTTPS protocol and a high-security browser configuration (with cross-site protection). Otherwise, it would be nice if some foundation had the hashes of all images published on the net. On the server side, there's a way to configure HTTPS to accept only queries from within the domain; you could configure that with nginx (OK, it's badly complicated). But someone here is right: there's a way to track down which website the request originated from, especially if the site uses some known server system to provide the images via a proxy that can log and track visitors, so the author of this action can be properly sued. I have no knowledge of a technology that can simultaneously track visitors and deny queries from another domain.
For the difference between systems, DNS, maybe?
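To test the DNS guess, a small sketch like the one below could be run on both installations and the results compared. It only uses the system resolver; the function name is made up, and the hostname to pass in would be whichever domain serves the images or the hotlinking page.

```python
import socket


def resolve_addresses(hostname):
    """Return the sorted, de-duplicated IP addresses this system resolves hostname to."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

If the two installations print different address lists for the same hostname, they are not talking to the same server, which would be one candidate explanation for the different results.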

amenex
Offline
Joined: 01/04/2015

Continuing the saga ...

When I list an offending (hotlinking) domain in my .htaccess files as blocked, upon reload after updating the .htaccess file, the domain is about 95% of the time promptly denied access, though I always have to reload the hotlinking domain's URL in order to see the effect.

Turns out that the [nonfree] domain management software on my ISP's server lists the domains that I have explicitly blocked in my .htaccess file as explicitly _allowed_ access, exactly the opposite to what I intend. That has been a known bug for a long time, dating back to 2010.

Therefore, the few domains that have been successfully bypassing hotlink protection when their URLs are accessed from _only my computer_ (and no one else's!) are accessing my domain's images through my server's domain management software and not through my domain's .htaccess file. When my ISP's support team looks at the hotlinking URLs, they get a 403 error, just as I intend. When I look at the recent visitors access log for my domain, my own router's IP address shows a 403 error for every attempt to load those domains' URLs from whichever Trisquel installation I am running at the time.

Nevertheless, for these few hotlinking domains, mine appears to be the only computer that cannot see the effect of denying access from those domains to my domain's image files. I have actually blocked _all_ access to my domain from those hotlinking sites by listing their servers' IP addresses in the "deny from" section of my .htaccess file.

On a first-order level, this shouldn't bother me, but if the hotlinking domains are loading malevolent script from the webpage that contains the hotlink to my computer when I look at my domain's hotlinked image, that could threaten the security of my computer.

George Langford
amenex