Searx metasearch engine

46 replies
SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014
Mzee
Offline
Joined: 07/10/2013

Why do people vote this down? Thanks a lot for sharing this link. I'll give it a try. Or is there anything wrong with this project?

vinzv

I am a member!

Offline
Joined: 10/11/2014

Maybe it gets downvoted just because there's no further information in his post?

quantumgravity
Offline
Joined: 04/22/2013

Isn't this some kind of rule on the internet?
There are always a few jerks who downvote stuff without any reason whatsoever.

Mampir
Offline
Joined: 12/16/2009

I downvoted, because giving a non-descriptive link like this is very annoying and inconsiderate.

danieru
Offline
Joined: 01/06/2013

Exactly! People should leave the solitary & non-descriptive links for spammers.

Mangy Dog

I am a member!

I am a translator!

Offline
Joined: 03/15/2015

looks good Tramp :)

it's very fast and the layout is agreeable.

Seeks is a cool search engine I used a bit
I've been wanting to use YaCy for a while but I lack the spare RAM

http://yacy.net/en/index.html

It is fully decentralized, all users of the search engine network are equal, the network does not store user search requests and it is not possible for anyone to censor the content of the shared index. We want to achieve freedom of information through a free, distributed web search which is powered by the world's users.

Dave_Hunt

I am a member!

Offline
Joined: 09/19/2011

I set searx as my browser's default engine, and tried a bunch of searches. I'd say the results were comparable to the ones I'd get by hitting Bing or Google, but the result presentation is wonderfully clean and easy to read. Thanks for the suggestion.

davidnotcoulthard
Offline
Joined: 02/28/2014

"I set searx as my browser's default engine" How does one go about doing that (i.e. what's the keyword thingie?)

SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014

David - I'm not quite sure I understand your question correctly. Are you asking how to add and select searx in the search toolbar (or whatever that is called) - like when you select a word and right click the mouse "search with searx"?

davidnotcoulthard
Offline
Joined: 02/28/2014

I was asking about adding it as the default search engine on Abrowser.

SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014

Oooh I see!?? Contact Ruben and ask.

Edited: Added !??

Magic Banana

I am a member!

Offline
Joined: 07/24/2010

To add searx to the search engines, just go to https://searx.me, click on the arrow at the left of Abrowser's search box and choose "Add searx". From the same drop-down menu, you can "manage the search engines" (for instance to put searx first in the list).

As for searching from the address bar, you indeed need to modify keyword.URL from about:config. I am not sure what its new value should be. Maybe "https://searx.me/?q=" (without the quotes).
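A minimal sketch of how such a prefix gets used, assuming the instance accepts the query as a `q` GET parameter (the prefix value and the function name below are assumptions for illustration, not something confirmed in this thread): the search terms are URL-encoded and appended to the prefix.

```python
from urllib.parse import quote_plus

# Assumption: the instance accepts the query as a "q" GET parameter.
SEARCH_PREFIX = "https://searx.me/?q="


def address_bar_search_url(terms: str, prefix: str = SEARCH_PREFIX) -> str:
    """Append the URL-encoded search terms to the configured prefix."""
    return prefix + quote_plus(terms)


print(address_bar_search_url("free software"))
```

If the resulting URL opens a results page in the browser, the prefix is probably usable as the keyword.URL value.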

SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014

Bananna Magique - I asked him if he meant this.. I think my senility is getting worse..

By the way the procedure you describe is the common one found in all major search engines. It requires allowing javascript. You can do that without javascript..

davidnotcoulthard
Offline
Joined: 02/28/2014

Wow, didn't know FF could do that. Thanks!

@Supertramp83 Yes....this is what I actually meant (I didn't know what "Adding to abrowser" would end up implying.....but I'll ask Ruben anyway now that you've mentioned it :)

tomlukeywood
Offline
Joined: 12/05/2014

from a user interface point of view this search engine is the best I ever used!
and they managed to make it all js free!

and it's libre software
bye bye ixquick!

but do you need a powerful web server to run the search engine yourself?

a_slacker_here
Offline
Joined: 06/30/2013

Oh my! Searx is fast, very fast. I think I'll be using it as my default search engine and see if the results are as good.

I've looked at the preferences and they work much as ixquick's did in the old days: you can select which search engines to use.

Magic Banana

I am a member!

Offline
Joined: 07/24/2010

> Searx appreciates your concern regarding logs, so take the code and run it yourself!

Let's say I do not want to run it myself for ecological reasons. What does searx have in its logs? Did one of you take a look at the code? DuckDuckGo, Ixquick and Startpage promise that they neither store the IP address of their users nor their user agents. We cannot verify because we do not have the source code... however nobody can know whether searx.me really runs the code it pretends to run. I would feel better if searx.me would make the same promise.

Tirifto
Offline
Joined: 02/19/2015

Looking at the paragraph right above that one, it says:

> It provides basic privacy by mixing your queries with searches on other platforms without storing search data. Queries are made using a POST request on every browser (except chrome*). Therefore they show up in neither our logs, nor your url history.

As for other info, it doesn't seem to say. However, considering that it focuses on privacy and, unlike similar services (cough cough DuckDuckGo), is free software, I'm going to trust it and most likely use it in the near future.
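The difference the quoted paragraph relies on can be sketched as follows. This is only an illustration with a hypothetical instance address, not searx's own code: with GET the query is part of the URL (which is what typical access logs and browser history record), while with POST the URL stays the same and the query travels in the request body. A server operator could of course still choose to log the body.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://searx.example/"  # hypothetical instance address


def build_get(query: str) -> Request:
    # GET: the query becomes part of the URL itself.
    return Request(BASE + "?" + urlencode({"q": query}), method="GET")


def build_post(query: str) -> Request:
    # POST: the URL is unchanged; the query is sent in the request body.
    body = urlencode({"q": query}).encode("ascii")
    return Request(BASE, data=body, method="POST")


# The requests are only constructed here, never sent.
get_req = build_get("cat pictures")
post_req = build_post("cat pictures")

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url, post_req.data)
```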

J.B. Nicholson-Owens
Offline
Joined: 06/09/2014

name at domain wrote:
> Looking at the paragraph right above that one, it says:
>
> > It provides basic privacy by mixing your queries with searches on
> > other platforms without storing search data. Queries are made using a
> > POST request on every browser (except chrome*). Therefore they show up
> > in neither our logs, nor your url history.

Right, but that doesn't address anything substantively so it doesn't
answer anything at all.

That's just a description of what any service has to do ("mixing your
queries") with an unverifiable claim of privacy ("without storing search
data"). All services have to keep some track of who made which query in
order to get them the right results back (we don't want people asking
for cat pictures to get non-cat pictures back). Either the user's
computer does this on their computer alone (which isn't feasible given
the size of the DB needed to do this search in a thoroughgoing way) or
the user submits the query to another computer (which makes it SaaSS, with
the privacy-busting that goes with all SaaSS).

> As for other info, it doesn't seem to say. However, considering that it
> focuses on privacy, and unlike similar services (cough cough DuckDuckGo)
> it's free software, I'm going to trust it and most likely use it in
> the near future.

Startpage/Ixquick are not foreseeably worse metasearch engines, and
being free software SaaSS doesn't mean anything for the end user because
one doesn't do the entire job on one's own computer.

Tirifto
Offline
Joined: 02/19/2015

> being free software SaaSS doesn't mean anything for the end user because
> one doesn't do the entire job on one's own computer.

While this is true, usability-wise, I also appreciate the contribution to free software! :)

quantumgravity
Offline
Joined: 04/22/2013

"All services have to keep some track of who made which query in
order to get them the right results back (we don't want people asking
for cat pictures to get non-cat pictures back)."

Why should this be the case?
It's perfectly possible to perform a search for some user and delete all data about the process afterwards.

J.B. Nicholson-Owens
Offline
Joined: 06/09/2014

name at domain wrote:
> Why should this be the case?
> It's perfectly possible to perform a search for some user and delete all
> data about the process afterwards.

Which means while the search is going on (the queries have been issued
to the dependent search engines but the replies have not yet been
received) the metasearch engine keeps track of where the query came from
so it knows where to send the aggregated response. This is in the nature
of having another computer do one's computing instead of using one's own
computer to do the job.

With SaaSS there's no way to verify whether the server deletes data about
the search; only the service owner really knows. With metasearch engines
there's an additional problem if users are identifiable using search
query data or a combination of query data and hosted resource logs (for
those search engine organizations that also own hosting resources, as
Google, Microsoft, and Yahoo each do) because it would be possible to
correlate search queries and hits on hosted resources.

We could increase privacy with a search service where the client could
do searches against a copy of a database hosted on the user's computer.
Thus the user could search that database without revealing those
searches to anyone unless the user wants to reveal what they're
searching for. The database files could contain a wide variety of data
on indexed websites so even if someone knows what database files someone
has it's not clear what a user is looking for. And the database files
should be freely shared so one doesn't have to get updates from any one
particular place. So long as each database file is signed with a trusted
key, one should be able to get database files from anywhere.

Research is needed to determine if most users' searches could work well
enough with a small database to make this a reasonably private search
system that is sufficiently effective for providing good results. I
expect new additions to search engines would take longer to get to some
users, but that could be helped with increased database file sharing.
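A minimal sketch of that idea, under loud assumptions: the one-line-per-entry file format is invented for illustration, and a SHA-256 digest checked against a trusted list merely stands in for a real signature check (a real system would verify an actual cryptographic signature, for example with GPG). The point is only that the lookup itself never leaves the user's machine.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def load_index(data: bytes, trusted_digests: set) -> dict:
    """Accept a database file only if its digest is on the trusted list.

    Assumed (hypothetical) format: one "<term>\t<url>" entry per line.
    """
    if digest(data) not in trusted_digests:
        raise ValueError("untrusted database file")
    index = {}
    for line in data.decode("utf-8").splitlines():
        term, url = line.split("\t", 1)
        index.setdefault(term.lower(), []).append(url)
    return index


def search(index: dict, query: str) -> list:
    """Return URLs matching every query term, using only the local index."""
    results = None
    for term in query.lower().split():
        urls = set(index.get(term, []))
        results = urls if results is None else results & urls
    return sorted(results or [])


db = (
    b"cat\thttps://example.org/cats\n"
    b"pictures\thttps://example.org/cats\n"
    b"cat\thttps://example.org/pets\n"
)
trusted = {digest(db)}  # stand-in for digests published by a trusted signer
index = load_index(db, trusted)
print(search(index, "cat pictures"))
```

Because the file is checked before it is parsed, it can come from any mirror or peer; whether a database small enough to share this way gives good results is the open question raised above.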

J.B. Nicholson-Owens
Offline
Joined: 06/09/2014

name at domain wrote:
> Searx appreciates your concern regarding logs, so take the code and run
> it yourself!

I don't see how running a metasearch engine yourself helps you keep your
privacy so long as queries are made from your computer to the search
engines you depend on.

> Let's say I do not want to run it myself for ecological reasons. What
> does searx have in its logs? Did one of you take a look at the code?
> DuckDuckGo, Ixquick and Startpage promise that they neither store the IP
> address of their users nor their user agents. We cannot verify because
> we do not have the source code... however nobody can know whether
> searx.me really runs the code it pretends to run. I would feel better if
> searx.me would make the same promise.

You wouldn't know what any SaaSS runs regardless of whether that SaaSS is
released as free software. That's why SaaSS software freedom concerns are
a moot point for users (but not SaaSS providers) and why SaaSS is such a
danger to users. This is also why people like Richard Stallman are careful
to distinguish the freedom concerns of SaaSS from the freedom concerns of
programs running on one's own computer.

Magic Banana

I am a member!

Offline
Joined: 07/24/2010

> This is also why people like Richard Stallman are clear to distinguish the freedom concerns of SaaSS from the freedom concerns of programs running on one's own computer.

Stallman's definition of SaaSS excludes search engines. See https://www.gnu.org/philosophy/who-does-that-server-really-serve.html :

Services such as search engines collect data from around the web and let you examine it. Looking through their collection of data isn't your own computing in the usual sense—you didn't provide that collection—so using such a service to search the web is not SaaSS.

J.B. Nicholson-Owens
Offline
Joined: 06/09/2014

name at domain wrote:
> Stallman's definition of SaaSS excludes search engines. See
> https://www.gnu.org/philosophy/who-does-that-server-really-serve.html :

Yes, and thanks for pointing that out. I find that article to be a good
description for a lot of today's computing. But I don't find that to be
terribly detailed about how things could be organized better for
society. I don't expect such articles to cover foreseeable futures in
that degree of detail; articles like these are valuable because they're
written for a mass audience of novices. The issues articles like these
need to explain are complex enough (particularly for anyone new to
thinking about such issues) without having to also explain how services
could be better organized.

In other words, that is a good description of how things have been and
are now but not what could be. The privacy implications of "looking
through their collection of data" are profound, particularly for things
one doesn't intend to publish (which would distinguish editing Wikipedia
articles from searching Wikipedia articles) which could very well be
done more privately than it is today (as I explain elsewhere in this
thread). This change in how searching is done places searching more
squarely in the realm of service as a software substitute (SaaSS).

Also, as another poster points out in this thread, even today's searches
(where your queries are fed to another computer) could be done more
privately than they are today with the most popular search engines.
Unfortunately, as far as I know, such privacy claims are unverifiable
because they involve "looking through their collection of data". No
matter what claims to privacy searx, Startpage/Ixquick, or any other
organization offers, I can't be sure my searches are being used only
to provide me with hit lists and are completely inaccessible to anyone
for any reason without my explicit per-use consent. The only way to be
sure is to never provide that exploitation opportunity in the first place.

quantumgravity
Offline
Joined: 04/22/2013

"In other words, that is a good description of how things have been and
are now but not what could be. The privacy implications of "looking
through their collection of data" are profound, particularly for things
one doesn't intend to publish (which would distinguish editing Wikipedia
articles from searching Wikipedia articles) which could very well be
done more privately than it is today (as I explain elsewhere in this
thread). This change in how searching is done places searching more
squarely in the realm of service as a software substitute (SaaSS)."

It does not.
You can't browse through a search engine's database with your own computer only. Therefore, it is not service as a software substitute.
Just because something is bad for privacy doesn't immediately make it SaaSS.
There's a specific definition for this term, and it refers to services that do some job you could do on your own computer if you had the right piece of free software.
The central topic here is control and freedom;
sure, connecting to websites raises all kinds of privacy issues and we should take care of them.
But let's not get confused with terms.

J.B. Nicholson-Owens
Offline
Joined: 06/09/2014

name at domain wrote:
> It does not.
> You can't browse through a search engine's database with your own
> computer only. Therefore, it is not service as a software substitute.

And here I thought it did because I've already implemented the idea and
it seems to be working. I guess I didn't think to run this past you
first. Perhaps you should be more interested in what's good for
improving privacy than dogmatically sticking to someone's definitions of
terminology.

quantumgravity
Offline
Joined: 04/22/2013

"Perhaps you should be more interested in what's good for
improving privacy than dogmatically sticking to someone's definitions of
terminology."

I think you still don't get it: you can be interested in "what's good for improving your privacy" _without_ calling the use of search engines SaaSS.

SaaSS =! things that are bad for privacy

And it's far from acting dogmatically if you stick to the definition of a term.

tomlukeywood
Offline
Joined: 12/05/2014

char SaSS;
char things_that_are_bad_for_privacy;
int main(void) {
    SaSS =! things_that_are_bad_for_privacy;
    return 0;
}

don't know of the =! operator but it compiles...

Magic Banana

I am a member!

Offline
Joined: 07/24/2010

> SaSS =! things_that_are_bad_for_privacy;

You assign to the variable SaSS (a char, which can encode a Boolean) "not" the variable things_that_are_bad_for_privacy (again: a char, which can encode a Boolean). There is no "=!" operator, just the binary operator "=" followed by the unary operator "!".

For the discussion, that would mean that SaSS is things that are good for privacy. An obvious bug!

quantumgravity
Offline
Joined: 04/22/2013

"For the discussion, that would mean that SaSS is things that are good for privacy. An obvious bug!"

Not in the new quantumgravity programming language I'm using;
go and build up some knowledge about computers, banana! You're completely outdated.

tomlukeywood
Offline
Joined: 12/05/2014

what compiler/assembler/interpreter are you using for the quantumgravity programming language?

t3g
Offline
Joined: 05/15/2011

You do make an interesting point. Let's say they release the code for this under the AGPL and we are all free to use it and hope it is just like the code run on the site. I say hope because they, as the copyright holder of the code, can change the license at any point and could be running a custom version of the code under a different (potentially non-free) license.

J.B. Nicholson-Owens
Offline
Joined: 06/09/2014

name at domain wrote:
> https://searx.me/about

I'm assuming that you wanted people to discuss this, hence you posted
about it. I've participated elsewhere in this thread but I thought I'd
try to more clearly explain my position on this to gain some clarity
about how searx works.

I don't understand how the searx metasearch engine lives up to its claim
of being "Tracking free"[1] or "privacy respecting"[2].

There's nothing inherently "tracking free" about using a metasearch
engine. Quite the contrary, if one submits queries to multiple search
engines from one's own computer (whether manually or through a program
running on one's own computer), one now is being tracked by multiple
search engines. Depending on the details, the metasearch engine and/or
the dependent search engines can record time/datestamps, IP addresses,
queries, and returned results at a minimum. So if your computer runs
searx and searx contacts Google, Yahoo, and Bing on your behalf, those 3
search engines can record the aforementioned information. If this is how
searx works, you'd be better off using just one of these search engines
yourself in a typical way because then you wouldn't be directly handing
information to the other search engines. In this case searx makes the
privacy problem worse, not better, by spreading privacy-busting
information to additional search engines.

If your computer runs searx and searx relays your request to a central
searx server which then relays your query to 3 search engines from
there, the central searx server can record the aforementioned
information (even if it strips away anything identifiable before passing
the query on to the subordinate search engines).

Either way, your privacy is not respected and your queries can be
correlated by time, IP address, and possibly other data.

Unfortunately neither the link you gave nor the project's wiki is clear
about how searx respects one's privacy. While searx claims that it "never
shares anything with a third party"[3] it's not clear how this promise
is kept. As a result, I'm not sure searx is any better than using a VPN
plus a search engine that claims to not keep logs and not host its
services on VMs which are hosted at privacy-busting providers (such as
DuckDuckGo using Amazon.com's VMs). Startpage/Ixquick would appear to be
the best of the lot here but unfortunately I can't verify if
Startpage/Ixquick is lying in their claims because Startpage/Ixquick are
essentially SaaSS. A free software SaaSS doesn't help here because the
moment you submit your job to be run by another computer you have no
control over what that computer does.

Does anyone have a clear description of how searx queries are routed to
its search engine dependents in such a way that the user's information
is not able to be kept?

Thanks.

[1] https://github.com/asciimoo/searx
[2] https://github.com/asciimoo/searx/wiki/Contribution-Guide
[3] https://searx.me/about

tomlukeywood
Offline
Joined: 12/05/2014

here's a list of sites running the software:
https://github.com/asciimoo/searx/wiki/Searx-instances

tomlukeywood
Offline
Joined: 12/05/2014

I was trying to run searx, and after installing all the dependencies it wanted me to

I get these errors:
tom@trisquel7-gnu-linux:~/searx-master$ python searx/webapp.py
Traceback (most recent call last):
  File "searx/webapp.py", line 53, in <module>
    from searx.engines import (
  File "/home/tom/searx-master/searx/engines/__init__.py", line 209, in <module>
    engine = load_engine(engine_data)
  File "/home/tom/searx-master/searx/engines/__init__.py", line 51, in load_engine
    engine = load_module(engine_name + '.py')
  File "/home/tom/searx-master/searx/engines/__init__.py", line 44, in load_module
    module = load_source(modname, filepath)
  File "/home/tom/searx-master/searx/engines/wikidata.py", line 3, in <module>
    from searx.poolrequests import get
  File "/home/tom/searx-master/searx/poolrequests.py", line 48, in <module>
    http_adapters = cycle((HTTPAdapterWithConnParams(pool_connections=100), ))
  File "/home/tom/searx-master/searx/poolrequests.py", line 14, in __init__
    self.max_retries = requests.adapters.Retry(0, read=False)
AttributeError: 'module' object has no attribute 'Retry'

anyone have any ideas?
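The last line of the traceback says that the installed `requests.adapters` module has no `Retry` attribute. One plausible cause (an assumption, not confirmed here) is that the system-wide requests package is older than the one searx expects. A small check along those lines, with a helper name made up for illustration:

```python
import importlib


def module_has_attribute(module_name, attribute):
    """Return True if the module can be imported and exposes the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed at all.
        return False
    return hasattr(module, attribute)


# False here would match the AttributeError in the traceback above.
print(module_has_attribute("requests.adapters", "Retry"))
```

If this prints False, installing the project's dependencies into a virtualenv instead of relying on the distribution's packages may be worth trying.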

tomlukeywood
Offline
Joined: 12/05/2014

I compiled it successfully
run this command to save you some headaches!
sudo apt-get install git build-essential libxslt-dev python-dev python-virtualenv python-pybabel zlib1g-dev

here's a screenshot, it runs really really really fast btw
but one thing to note is it only seems to work when I am connected to the internet
so I am a bit confused as to where the search results are coming from...:

Screenshot from 2015-05-02 21:04:33.png

t3g
Offline
Joined: 05/15/2011

Are you running with the Python that comes with Trisquel (CPython) or something like PyPy?

tomlukeywood
Offline
Joined: 12/05/2014

just python 2.7

SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014

ugh what a tasteless awful radioactive green terminal color you got there ay ay ay!!
:P

tomlukeywood
Offline
Joined: 12/05/2014

REMOVED DUE TO SHOULD OF BEEN IN TROLL HOLE

Martago
Offline
Joined: 01/11/2015

Thanks for the link, it looks really good.

SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014

yes. da fastest best looking search engine I've ever used :)

Mangy Dog

I am a member!

I am a translator!

Offline
Joined: 03/15/2015

Yes it's Speedy Gonzales.......fffft!
thanks Tramp ;)

Mampir
Offline
Joined: 12/16/2009

J.B. Nicholson-Owens:
> I don't understand how the searx metasearch engine lives up to its claim
> of being "Tracking free"[1] or "privacy respecting"[2].

Metasearch engines are useful as proxies. A given Searx instance can still track you by logging your search queries, but it's still much better than everyone sending all their queries directly to Google, Yahoo, Bing or DuckDuckGo, which allows them to accumulate tons-and-tons of information about everyone.

Searx instances provide decentralization, in the sense that not all search queries of everyone are logged in one place. That's why Searx is much better for freedom and privacy compared to Google or DuckDuckGo.

Starting your own Searx instance is mostly useful for other people to use, not you, at least not directly. It's better for your privacy to use other people's instances, while providing a Searx instance for others.

J.B. Nicholson-Owens:
> Depending on the details, the metasearch engine and/or the dependent search engines can record time/datestamps, IP addresses, queries, and returned results at a minimum.

When you use other people's instances, the dependent search engines can't get your IP address or any useful tracking information about you.
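The proxy argument can be sketched like this. It is an illustration only, not searx's actual code, and every name and value in it is made up: the instance receives the client's address and headers, but the requests it builds for the upstream engines carry only the query text and a generic header set. The instance itself still sees everything, which is why using someone else's instance matters.

```python
from dataclasses import dataclass, field


@dataclass
class IncomingRequest:
    client_ip: str
    headers: dict
    query: str


@dataclass
class OutgoingRequest:
    engine: str
    query: str
    headers: dict = field(default_factory=dict)


# Hypothetical generic headers, identical for every user of the instance.
GENERIC_HEADERS = {"User-Agent": "metasearch-sketch"}


def fan_out(request, engines):
    """Build one upstream request per engine, carrying only the query text."""
    return [
        OutgoingRequest(engine=name, query=request.query, headers=dict(GENERIC_HEADERS))
        for name in engines
    ]


incoming = IncomingRequest(
    client_ip="203.0.113.7",  # documentation-range address
    headers={"User-Agent": "Abrowser", "Cookie": "id=42"},
    query="cat pictures",
)
outgoing = fan_out(incoming, ["engine-a.example", "engine-b.example"])
for request in outgoing:
    print(request)
```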

Magic Banana:
> DuckDuckGo, Ixquick and Startpage promise that they neither store the IP address of their users nor their user agents. [...] I would feel better if searx.me would make the same promise.

Promises are worthless. Especially worthless when given by a commercial interest, such as DuckDuckGo. Searx instances provide a much more real and practical solution for privacy, even if a lot more work could be done.

Mampir
Offline
Joined: 12/16/2009

> When you use other people's instances dependent search engines can't get your IP address or any useful tracking information about you.

Although that's not really true if you are using some of the other Searx search types, like images and video. Searching for images will make you download images from other search engines, and this can be used to track you by Google, Bing and others.

You can use the RequestPolicy Continued add-on to protect against a site connecting you to 3rd party sites. There are other similar add-ons too, like uMatrix, but I haven't used it.