comparison of search engines

15 replies
chaosmonk

I am a member!

I am a translator!

Offline
Joined: 07/07/2017

I would like to see a search engine that meets the following criteria:

(1a) Basic functionality is accessible without proprietary JavaScript. I think it's obvious why this is important.

(1b) All functionality is accessible without proprietary JavaScript. If users sacrifice functionality by having JavaScript disabled, they will be tempted to enable it. We should avoid guiding users into a situation like this.

(1c) All JavaScript is free. Otherwise, the search engine can only be recommended with the caveat "but don't use it with JavaScript enabled." Even when explicitly recommending a search engine it is easy to forget to say this, or for the user to forget that you said it. Implicitly recommending a search engine (by using it while someone is watching, or by a distro/browser setting it as the default) also guides the user toward non-free JavaScript.

(2) Server-side software is free. A search engine provider can claim to not track its users, but without the ability to run your own instance you cannot really know that this is the case.

(3a) Search results are good enough that the user rarely or never needs to use another search engine, as the other search engine might be less free.

(3b) The search engine does not have severe bugs that require the user to frequently use another search engine, as the other search engine might be less free.

(4) Does not rely on a proprietary search engine. In the long term, a free search engine cannot replace a proprietary search engine if it relies on the proprietary search engine in order to work.

I consider (1a-c) to be the bare minimum needed for FSDG compatibility. ("A free system distribution must not steer users towards obtaining any nonfree information for practical use, or encourage them to do so.") I consider (2) to be good for freedom and important for privacy, but not strictly necessary for FSDG compatibility. (3a-b) are not freedom or privacy issues, but they are important for adoption. (4) is not critical for a short-term solution, but it is important in the long term.

Here are some commonly recommended search engines and how well they meet each criterion:

Searx
(1a) Yes
(1b) Yes
(1c) Yes
(2) Yes
(3a) Yes
(3b) Yes/No (still a little buggy, but improving)
(4) No (relies on several non-free search engines)

YaCy
(1a) Yes
(1b) Yes
(1c) Yes
(2) Yes (users are encouraged to run YaCy locally for decentralization)
(3a) No
(3b) Yes
(4) Yes (relies on users)

StartPage
(1a) Yes
(1b) Yes
(1c) No
(2) No
(3a) Yes
(3b) Yes
(4) No (relies on Google)

DuckDuckGo
(1a) Yes
(1b) No (the HTML-only site does not have feature parity with the regular site)
(1c) No
(2) No
(3a) Yes
(3b) Yes
(4) No (relies on several non-free search engines)
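
For anyone who wants to query the comparison rather than read it, here is a minimal Python sketch that stores the ratings above as data. The ratings are copied from the lists above. The criterion keys, the "partial" value (standing in for the "Yes/No" entry), and the helper names are made up for illustration.

```python
# Ratings copied from the lists above. "partial" stands for "Yes/No".
RATINGS = {
    "Searx": {"1a": "yes", "1b": "yes", "1c": "yes", "2": "yes",
              "3a": "yes", "3b": "partial", "4": "no"},
    "YaCy": {"1a": "yes", "1b": "yes", "1c": "yes", "2": "yes",
             "3a": "no", "3b": "yes", "4": "yes"},
    "StartPage": {"1a": "yes", "1b": "yes", "1c": "no", "2": "no",
                  "3a": "yes", "3b": "yes", "4": "no"},
    "DuckDuckGo": {"1a": "yes", "1b": "no", "1c": "no", "2": "no",
                   "3a": "yes", "3b": "yes", "4": "no"},
}

# The criteria described above as the bare minimum for FSDG compatibility.
FSDG_MINIMUM = ("1a", "1b", "1c")


def engines_meeting(criteria, ratings=RATINGS):
    """Return the engines rated "yes" on every criterion in `criteria`."""
    return [name for name, rated in ratings.items()
            if all(rated[c] == "yes" for c in criteria)]


def failed_criteria(name, ratings=RATINGS):
    """Return the criteria on which `name` is rated anything other than "yes"."""
    return [c for c, value in ratings[name].items() if value != "yes"]


print(engines_meeting(FSDG_MINIMUM))  # ['Searx', 'YaCy']
print(failed_criteria("YaCy"))        # ['3a']
```

Treating "partial" as a failure is a deliberately strict choice; a softer scoring scheme would be just as easy to write.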

YaCy is fully free and seems like the best long-term solution, but it suffers from a chicken-and-egg problem: it needs more users before it can start providing useful search results, but until it provides useful search results it will be difficult to gain users.

Searx is also fully free, and although it is a little buggy it is much more usable than YaCy. I sometimes need to refresh the page a couple of times, and every now and then it stops working for a few minutes unless I use a different instance. This is fine for someone who is willing to endure minor inconvenience for their freedom and privacy, but a normal person will go back to using Google the moment they encounter a bug.

StartPage and DuckDuckGo have some freedom issues, although StartPage is a little better. Both claim to not track their users, but without the ability to run an instance on our own server we have no choice but to just take them at their word.

Can anyone recommend other search engines that equally or better meet these criteria?

zangisharp
Offline
Joined: 01/08/2019

For DuckDuckGo you can use a non-JavaScript version: https://addons.mozilla.org/en-US/firefox/addon/duckduckgo-lite/
or use https://duckduckgo.com/lite/

chaosmonk
Offline
Joined: 07/07/2017

This satisfies (1a), but (1b) and (1c) are still problems.

zigote
Offline
Joined: 03/04/2019

Perhaps we should also add:

(5) Is it hosted by a malicious company? (DDG is on Amazon.)

(6) How does it perform on:

https://webbkoll.dataskydd.net/en
https://www.ssllabs.com/ssltest/
https://securityheaders.io/

(7) Does it have an .onion instance? (Searx does.)

SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014

>(DDG is on Amazon)

The owner is also unreliable (this is my opinion, not a general conclusion), as he was previously the owner of this indecent shit --> https://archive.is/9wR4O

andyprough
Offline
Joined: 02/12/2015

Ecosia.org works on IceCat with JavaScript disabled (looks horrible though), has pretty good search results, and has many projects on its GitHub page. I'm not sure to what degree they are running libre software server side or otherwise, but they clearly have a lot of open projects. It is kind of hard to find much information on how their various software bits are cobbled together and licensed. They claim to be privacy respecting.

They purchase their search results from Bing, which may be a deal breaker for some.

andyprough
Offline
Joined: 02/12/2015

Also, they plant trees based on revenue from ads they serve up to their users, so they've got a save-the-planet angle going for them.

Masaru Suzuqi -under review-
Offline
Joined: 06/06/2018

I liked that duck, though...

Dmitry Alexandrov
Offline
Joined: 03/07/2019

chaosmonk wrote:
> Here are some commonly recommended search engines and how well they meet each criterion:
> ...
> DuckDuckGo
> ...

> Can anyone recommend other search engines that equally or better meet these criteria?

Sure, that’s quite easy to do!

*Google*

(1a)
> Basic functionality is accessible without proprietary JavaScript.

Yes.

(1b)
> All functionality is accessible without proprietary JavaScript.

I am not aware of any that is not accessible. At least image, video, and news search work, so it’s better than DuckDuckGo.

(1c)
> All JavaScript is free.

No, of course not.

(2)
> Server-side software is free.

This paragraph is unsuitable for a policy. We can never know that for sure.

Speaking loosely, though, unless you are using some unconventional definition of ‘free’, I bet, yes, all software used by Google is free.

But judging from the given objective, you are using ‘free’ unconventionally to mean something like ‘open source’ in a literal sense of these words.

(3a)
> Search results are good enough that the user rarely or never needs to use another search engine, as the other search engine might be less free.

Well... I do sometimes have to fall back to Yandex and in rare cases even to DuckDuckGo, both of which indeed meet these criteria worse than Google.

(4)
> Does not rely on [another] proprietary search engine.

Same as (2): we cannot know that for sure.

I believe, it does not, though. :-)

chaosmonk
Offline
Joined: 07/07/2017

> I am not aware of any that is not accessible. At least image, video, and news search work, so it’s better than DuckDuckGo.

Yes, Google is better than DuckDuckGo in this respect.

> (2)
> > Server-side software is free.

> This paragraph is unsuitable for a policy. We can never know that for sure.

> Speaking loosely, though, unless you are using some unconventional definition of ‘free’, I bet, yes, all software used by Google is free.

> But judging from the given objective, you are using ‘free’ unconventionally to mean something like ‘open source’ in a literal sense of these words.

Sorry, I worded (2) poorly. I mean that the server-side software is distributed as free software to users of the search engine. This is the case with Searx and YaCy. Google's server-side software is likely free, but they don't distribute it to their users. Google does not have an obligation to do this (unless some of that software is under the AGPL, but it probably isn't). Nor do StartPage or DuckDuckGo. However, when StartPage and DuckDuckGo claim to not track their users (Google does not claim this, of course) we have only their word to go off of. With Searx and YaCy, you don't have to trust the search provider not to track you, because you can run your own instance on a machine you control.

> (4)
> > Does not rely on [another] proprietary search engine.

> Same as (2): we cannot know that for sure.

In Google's case we cannot. With YaCy and Searx we do know because of (2). With StartPage and DuckDuckGo we can't know for sure either, but it would be a very strange thing for them to lie about.

> I believe, it does not, though. :-)

Lol. Yeah, probably not.

nadebula.1984
Offline
Joined: 05/01/2018

The best search engine usable in China is the Chinese version of Microsoft Bing.

GrevenGull
Offline
Joined: 12/18/2017

I don't understand YaCy. Is it a program one needs to download?

PS. I like posts like this :)

chaosmonk
Offline
Joined: 07/07/2017

> I don't understand YaCy. Is it a program one needs to download?

Search engines like Google/Bing/Yahoo have large servers that "crawl" (follow links from page to page) and "index" (store information about the pages) the web, generating a large index of a great many web pages. When the user submits a search query, this index is used to find and point them toward relevant web pages. As you can imagine, it takes a lot of resources to crawl through and index enough web pages to provide useful results for every possible search query.
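
To make "crawl" and "index" concrete, here is a toy Python sketch. It assumes a made-up three-page "web" held in memory (the `TOY_WEB` dictionary and the `example` URLs are hypothetical), and it is nowhere near how a real search engine is built; it only shows the two steps in miniature.

```python
from collections import deque

# Hypothetical three-page "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "http://a.example/": ("free software search engine",
                          ["http://b.example/", "http://c.example/"]),
    "http://b.example/": ("peer to peer search", ["http://a.example/"]),
    "http://c.example/": ("free software distribution", []),
}


def crawl(seed, web):
    """Follow links breadth-first from `seed`; return URLs in visit order."""
    seen, queue, order = {seed}, deque([seed]), []
    while queue:
        url = queue.popleft()
        order.append(url)
        _, links = web[url]
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return order


def build_index(urls, web):
    """Map each word to the set of crawled URLs whose text contains it."""
    index = {}
    for url in urls:
        text, _ = web[url]
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of `query`, sorted."""
    words = query.lower().split()
    if not words:
        return []
    return sorted(set.intersection(*(index.get(w, set()) for w in words)))


pages = crawl("http://a.example/", TOY_WEB)
index = build_index(pages, TOY_WEB)
print(search(index, "free software"))
# ['http://a.example/', 'http://c.example/']
```

A page that is never crawled never enters the index, which is why coverage costs so much.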

StartPage, DuckDuckGo, and Searx don't crawl and index the web, so they rely on other search engines that do have an index of the web. Searx is a good free software replacement for the *front end* of several proprietary search engines, but it can never actually be a full replacement for Google/Bing/etc., because it would be useless without them.

The idea behind YaCy is, instead of one organization devoting a lot of resources to crawling and indexing the entire web, each YaCy user devotes a small amount of resources to crawling and indexing just part of the web. When you do a YaCy search, your search query is sent to the "freeworld" (the network of all YaCy users). If another YaCy user has indexed a page that is relevant to your search query, then that page will show up in your search results. This way, no one person has to index the *entire* web. As long as *somebody* has indexed a web page, that page can show up in everyone's search results.
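
The "everyone indexes a little" idea can be sketched in a few lines of Python. This is a toy model under my own assumptions, not YaCy's actual protocol or API: the `Peer` class, `network_search`, and the `example` URLs are all hypothetical, and every peer is simply asked in turn.

```python
class Peer:
    """A hypothetical node that indexes only the pages it has crawled."""

    def __init__(self, name, pages):
        self.name = name
        self.index = {}  # word -> set of URLs
        for url, text in pages.items():
            for word in text.lower().split():
                self.index.setdefault(word, set()).add(url)

    def lookup(self, word):
        """Return the URLs this peer knows for `word` (possibly none)."""
        return self.index.get(word.lower(), set())


def network_search(peers, query):
    """Pool every peer's hits per word, then keep URLs matching all words."""
    words = query.lower().split()
    if not words:
        return []
    per_word = []
    for word in words:
        hits = set()
        for peer in peers:
            hits |= peer.lookup(word)
        per_word.append(hits)
    return sorted(set.intersection(*per_word))


alice = Peer("alice", {"http://a.example/": "free software news"})
bob = Peer("bob", {"http://b.example/": "free search engine"})

print(network_search([alice], "search"))       # []
print(network_search([alice, bob], "search"))  # ['http://b.example/']
```

Alone, alice finds nothing for "search"; once bob joins, his page shows up in her results. With few peers, most queries look like the first case.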

The problem is that there aren't very many YaCy users yet, so collectively they have not indexed as many pages as Google/Bing/etc., and the search results you get often aren't very helpful. If more people start using YaCy then the situation will improve, but until the situation improves it will not be easy to persuade people to start using YaCy. This is the chicken-and-egg problem I referred to in my original post.

You can install and start YaCy with these instructions.[1] While it is running, you can search with it by going to http://localhost:8090/index.html in your browser. To set it as a search engine in Abrowser/IceCat, click the "..." toward the right of the URL bar and select "Add search engine".

[1] http://wiki.yacy.net/index.php/En:DebianInstall

GrevenGull
Offline
Joined: 12/18/2017

When doing the first line of the install instruction I get permission denied. Even when including sudo.

PS. So YaCy grows bigger by each search? Or it grows bigger by people manually adding something to somewhere?

chaosmonk
Offline
Joined: 07/07/2017

> When doing the first line of the install instruction I get permission denied. Even when including sudo.

Try editing /etc/apt/sources.list.d/yacy.list manually, with sudo, to contain "deb http://debian.yacy.net ./" (without the quotes).

> PS. So YaCy grows bigger by each search? Or it grows bigger by people manually adding something to somewhere?

You don't have to do anything manually. YaCy works in the background while you have it running. When you install YaCy you will be prompted to specify a maximum amount of disk space that YaCy is allowed to use to store its data.

GrevenGull
Offline
Joined: 12/18/2017

So I "donate" a portion of my disk which YaCy can use to store information about the Internet? So YaCy just gathers information about the sites I visit and communicates with my disk?