client side search engine?

48 replies [Last post]
tomlukeywood
Offline
Joined: 12/05/2014

there are many privacy concerns with search engines like google, bing, etc
and there are search engines like startpage or duckduckgo that state they don't data mine
but the data is still going to their servers so you can't be completely sure whether that data
is being mined or not

but i have thought of a possible solution and i am surprised that i can't find a program that does this

why not have a client side search engine?
the only data you would have to store is text
so i would not imagine that it would be more than a few gigabytes

and when you need to update the list of websites
all you need to download from a server is the list of
websites that have changed so you will only download a
very small amount of data every time you update the list
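
A minimal sketch of the "only download what changed" idea, assuming the local index is a plain mapping of URL to page text and that the server sends a delta with `changed` and `removed` entries. The delta format, the function name and the example URLs are all made up for illustration; no existing program is being described.

```python
def apply_delta(index, delta):
    """Return a copy of the local index with a changes-only update applied.

    index: dict mapping URL -> page text (the locally stored data)
    delta: dict with optional keys "changed" (URL -> new text) and
           "removed" (list of URLs); this format is an assumption.
    """
    updated = dict(index)
    for url, text in delta.get("changed", {}).items():
        updated[url] = text       # new page, or new text for a known page
    for url in delta.get("removed", []):
        updated.pop(url, None)    # page disappeared; ignore if unknown
    return updated


# Hypothetical local index and a small update received from the server.
local_index = {
    "example.org/a": "free software search",
    "example.org/b": "old text",
    "example.org/c": "page that will disappear",
}
delta = {
    "changed": {"example.org/b": "new text", "example.org/d": "brand new page"},
    "removed": ["example.org/c"],
}
local_index = apply_delta(local_index, delta)
```

The update only carries the entries that differ, so its size depends on how much of the web changed since the last update, not on the size of the whole index.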

and since everything is done on your own computer
it is completely within your control and there will be no data mining

has someone already done this?
if not is there a reason?

lembas
Offline
Joined: 05/13/2010

The closest to that is a p2p search https://en.wikipedia.org/wiki/YaCy

Unfortunately it's quite a bit more than a few GBs. Still it's a great idea.

SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014

Installed yacy a few days ago. I think it's great. The problem is that on my slow netbook it is slow slow slow. The main problem is the huge amount of RAM used by yacy. :(
Other than that, and the fact that you don't always find what you're looking for, yacy is fantastic!

tomlukeywood
Offline
Joined: 12/05/2014

i have 16gb ram i will try yacy

Casey Parker
Offline
Joined: 02/06/2015

The fundamental problem is understanding how search works in the first
place. Google and other similar search engines have thousands upon
thousands of servers working quite hard to find what you're searching for.
Even with all of that, they still have to optimize the hell out of it. The
data itself isn't really available without crawling the whole web. If
you're searching without doing that part, you're searching an index. If
you're searching an index, it's *somebody's* index.
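
For readers unsure what "searching an index" means here, a toy sketch: an inverted index maps each word to the set of pages containing it, so a query is answered by set intersection instead of reading every page. The page data and function names below are invented for illustration, and real engines add ranking, stemming and much more.

```python
from collections import defaultdict


def build_index(pages):
    """Build a word -> set of URLs mapping from a URL -> text mapping."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical crawled pages; somebody still has to crawl them first.
pages = {
    "a.example": "free software search engine",
    "b.example": "proprietary search engine",
    "c.example": "free software operating system",
}
index = build_index(pages)
```

With this data, `search(index, "free software")` yields the two pages that mention both words. The crawl that produces `pages` is the expensive part the post is talking about.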

On Tue Feb 17 2015 at 6:59:49 AM <name at domain> wrote:

> i have 16gb ram i will try yacy
>

tomlukeywood
Offline
Joined: 12/05/2014

"If you're searching an index, it's *somebody's* index."

what's wrong with that? you just download the text file
you do not have to let that someone know what you're searching for
and any program being run is run on your computer under
your control

why would searching a text index take a long time?

"The
data itself isn't really available without crawling the whole web"

it could take time but once the list is complete
it could be copied

ssdclickofdeath
Offline
Joined: 05/19/2013

Unfortunately, the index will become outdated even as it is being collected, so it could never be "complete."

tomlukeywood
Offline
Joined: 12/05/2014

it would still be enough for a good search engine
and you could update it every few weeks

ssdclickofdeath
Offline
Joined: 05/19/2013

True...

Ishamael
Offline
Joined: 08/29/2014

Doesn't installing the OpenJDK framework represent a security threat?

SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014

Casey Parker - we need educated and nice people here so we really don't need you. Consider posting your brilliant shit on another forum in the future. And that future starts now.

Casey Parker
Offline
Joined: 02/06/2015

Call me not nice, fine. Don't go calling yourself educated. Learn your
history. Learn why things are how they are. Learn why your ideas make no
sense. This is a forum for users of an open-source operating system, not a
conspiracy theorist forum. I'll stay, thanks.

On Thu Feb 19 2015 at 11:04:48 AM <name at domain> wrote:

> Casey Parker - we need educated and nice people here so we really don't
> need
> you. Consider posting your brilliant shit on another forum in the future.
> And
> that future starts now.
>

onpon4
Offline
Joined: 05/30/2012

The only conspiracy theorist on this forum I'm aware of is Fernando_Negro, and he's not active right now. I'm sure even an open source community wouldn't take you seriously if you compare security concerns to conspiracy theories.

You raise some valid points, but the spin you put on them makes it look like you're trying to spread FUD. In this particular case, you're claiming that any sort of search that isn't done by a centralized giant is impractical, which is sort of true (the Web is extremely difficult to index and search, and being a giant like Google makes an extremely difficult task easier), but as you can plainly see, YaCy does exist, even if it's not as good as the centralized search engines. The fact that the task is difficult to do in a decentralized way doesn't mean it's impossible or not worth trying.

Casey Parker
Offline
Joined: 02/06/2015

I agree wholeheartedly, onpon4. What I've been attempting to say is that
you can't (successfully) simply assume that the reasons for something being
the way it is are purely ideological. In the case of search technology, it
incrementally became what it is today over a long history of improving or
changing to adapt to the way things actually work in reality. Without
understanding that history, you're doomed to simply recreate the problems
that got us here in the first place. "This is free, so it's better" makes
no sense, it's like supporting an Anarchist movement simply because you
don't like how the government works - it's a failing proposition that has
been superseded with good reason. Another prime example is the general
feeling of people in the US that we should have a straight democracy -
people like the feeling of saying that, but they lack the historical
knowledge of why the electoral college system exists. If you don't know
already, the system became the way it is because of population densities.
If we had a straight vote, direct democracy, it would be immediately ruled
over by NYC and LA. Every other place would suffer for it. Politics and
history are very, very important in the free software movement, and it's
worthless to attempt anything without knowing why the previous person to
try it failed.

On Thu Feb 19 2015 at 1:29:48 PM <name at domain> wrote:

> The only conspiracy theorist on this forum I'm aware of is Fernando_Negro,
> and he's not active right now. I'm sure even an open source community
> wouldn't take you seriously if you compare security concerns to conspiracy
> theories.
>
> You raise some valid points, but the spin you put on them makes it look
> like
> you're trying to spread FUD. In this particular case, you're claiming that
> any sort of search that isn't done by a centralized giant is impractical,
> which is sort of true (the Web is extremely difficult to index and search,
> and being a giant like Google makes an extremely difficult task easier),
> but
> as you can plainly see, YaCy does exist, even if it's not as good as the
> centralized search engines. The fact that the task is difficult to do in a
> decentralized way doesn't mean it's impossible or not worth trying.
>

onpon4
Offline
Joined: 05/30/2012

> you can't (successfully) simply assume that the reasons for
> something being the way it is are purely ideological.

I didn't see you say anything remotely like that, but in that case, who are you trying to prove that to? I don't think anyone disagrees with that point. If anything, many bad outcomes we've seen are a result of a distinct lack of ideological drive (or, ideologies that focus on rejecting ideologies in favor of so-called pragmatism, like open source).

> "This is free, so it's better" makes no sense

If one program is libre and the other is proprietary, the libre program is better in the freedom dimension. If, like most of us, you support the libre software movement, that makes the program better, because practical benefits are secondary to freedom.

Casey Parker
Offline
Joined: 02/06/2015

Anarchist, eh? You're using a network designed by a government defense
agency. You drive on roads built by organized governmental efforts. The
laws that keep you somewhat safe (ie, driving laws, etc) are enforced
governmentally. Anarchism is the sociopolitical equivalent of autism. You
might be against current governmental and societal structure - I know I am
- but I strongly doubt you have fully thought out your anarchist ideals.
Tragedy of the masses is a thing. As is tyranny. There's a middle ground,
as with all things. Extremism of any kind has never once worked out well
for anybody involved ... please argue that if you can.

Libre software is Not inherently better. It's an option, which is
important. Nobody needs to be forced into 'freedom', because that's not
freedom.

On Thu Feb 19 2015 at 2:09:48 PM <name at domain> wrote:

> > you can't (successfully) simply assume that the reasons for
> > something being the way it is are purely ideological.
>
> I didn't see you say anything remotely like that, but in that case, who are
> you trying to prove that to? I don't think anyone disagrees with that
> point.
> If anything, many bad outcomes we've seen are a result of a distinct lack
> of
> ideological drive (or, ideologies, that focus on rejecting ideologies in
> favor of so-called pragmatism, like open source).
>
> > "This is free, so it's better" makes no sense, it's like
> > supporting an Anarchist movement simply because you don't
> > like how the government works - it's a failing proposition
> > that has been superseded with good reason.
>
> If one program is libre and the other is proprietary, the libre program is
> better in the freedom dimension. If, like most of us, you support the libre
> software movement, that makes the program better, because practical
> benefits
> are secondary to freedom.
>
> I'm not sure what your anarchism analogy is supposed to mean, but I happen
> to
> be an anarchist, and one thing I'd like to point out is that there is not a
> single "Anarchist movement"; several different ideas fall under the
> umbrella
> of anarchism.
>

tomlukeywood
Offline
Joined: 12/05/2014

"Extremism of any kind has never once worked out well"
please define extremism

Casey Parker
Offline
Joined: 02/06/2015

Extremism: belief in and support for ideas that are very far removed from what the majority of
people consider either correct or reasonable.

Unfortunately, it's gotten a worse image than it deserves due to religious
extremism's ties to violence.

On Thu Feb 19 2015 at 3:19:49 PM <name at domain> wrote:

> "Extremism of any kind has never once worked out well"
> please define extremism
>
>

onpon4
Offline
Joined: 05/30/2012

For anyone confused, I originally had a comment about anarchism in the post above this; I removed it because I noticed that I had misread one of Casey Parker's statements, but it's still visible on the mailing list.

> You're using a network designed by a government defense
> agency.

I never said I was against taking advantage of infrastructure created by bad or imperfect entities. If that were the case, I would have to go out and live in the woods. Keep in mind that our societies build on each other, and some past societies have been rather barbaric empires.

> The laws that keep you somewhat safe (ie, driving laws,
> etc) are enforced governmentally.

You're making an assumption which happens to be incorrect. Namely, you're assuming that I oppose government. I don't. I just support a different kind of government: one run not by a state, but by the collective of citizens. As far as I know, most anarchist philosophies are like this.

> Extremism of any kind has never once worked out well
> for anybody involved ... please argue that if you can.

You are using argument to moderation, which is a fallacy. See:

https://en.wikipedia.org/wiki/Argument_to_moderation

Casey Parker
Offline
Joined: 02/06/2015

I'm not using argument to moderation. I'm asking you to debate what I've
said because I'd like to hear you out. That's all. I have no interest in
making you look any particular way, or myself for that matter. So, let's
set that aside. This isn't a win/lose argument and you know it.

Unfortunately, pedantics will get in the way right now. Anarchism is the
theory (doctrine) that 'all forms of government are oppressive and
undesirable and should be abolished'. Yet, what you've said is effectively
communist, and not anarchist. Incredibly different ideals. I may even agree
with what you're thinking, just ... not your words.

On Thu Feb 19 2015 at 3:54:49 PM <name at domain> wrote:

> For anyone confused, I originally had a comment about anarchism in the post
> above this; I removed it because I noticed that I had misread one of Casey
> Parker's statements, but it's still visible on the mailing list.
>
> > You're using a network designed by a government defense
> agency.
>
> I never said I was against taking advantage of infrastructure created by
> bad
> or imperfect entities. If that were the case, I would have to go out and
> live
> in the woods. Keep in mind that our societies build on each other, and some
> past societies have been rather barbaric empires.
>
> > The laws that keep you somewhat safe (ie, driving laws,
> > etc) are enforced governmentally.
>
> You're making an assumption which happens to be incorrect. Namely, you're
> assuming that I oppose government. I don't. I just support a different kind
> of government: one run not by a state, but by the collective of citizens.
> As
> far as I know, most anarchist philosophies are like this.
>
> > Extremism of any kind has never once worked out well
> > for anybody involved ... please argue that if you can.
>
> You are using argument to moderation, which is a fallacy. See:
>
> https://en.wikipedia.org/wiki/Argument_to_moderation
>

onpon4
Offline
Joined: 05/30/2012

> Anarchism is the theory (doctrine) that 'all forms of
> government are oppressive and undesirable and should be
> abolished'.

No, that's not right. Anarchism is a broad category of political ideologies whose only similarity is opposition to the state. The state is not the only possible form of government.

As a social anarchist, I reject the idea that we need a state to force us to follow social rules, and I also reject the idea that we need capitalism to force us to cooperate. We humans are naturally adapted to cooperation; we don't need to be forced to do so.

> Yet, what you've said is effectively communist, and not
> anarchist. Incredibly different ideals.

There isn't much distinction between Karl Marx's definition of "communism" and social anarchy, or anarcho-communism. Keep in mind that Marx defined communism as a classless, stateless society. State socialism is completely different. There is some overlap between left authoritarianism and left libertarianism, however.

Magic Banana

I am a member!

I am a translator!

Offline
Joined: 07/24/2010

We are discussing P2P search engines. Not anarchy or driving laws. I doubt you can convince anybody on the feasibility of P2P search engines with general statements on unrelated topics (that belong to the troll hole and not to this thread).

Magic Banana

I am a member!

I am a translator!

Offline
Joined: 07/24/2010

The client-server model (now marketed as "cloud computing") is the simplest one and the first tried for every problem. It does not mean it is the only one. Not even that it is the best one. Even if it is used as the sole option for decades.

Take file sharing as an example. FTP was published as RFC 114 on 16 April 1971 (thank you Wikipedia). More than 28 years later (June 1999), Napster was born and it became clear that P2P was a more efficient way (in terms of bandwidth at least) to distribute files.

Casey Parker
Offline
Joined: 02/06/2015

You're misusing the word anarchism, it has a definition. That's exactly
like saying you're an atheist because you aren't Christian. Which people do
say, but it's not correct at all.

That said, P2P search could definitely work. If everyone used it, and it
was federated in some way. Distributed and essentially P2P mechanisms are
how the big search engines are powered on the backend in the first place.
I'm saying that an individual's home PC isn't capable of indexing the web
alone, or holding the index if it could, much less effectively searching it.
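
To make "distributed" concrete, here is a rough sketch of one way to split an index so that no single machine holds all of it: each word is assigned to a peer by hashing. This is plain modulo sharding chosen for brevity, not a real DHT and not a description of how YaCy or any big search engine works internally; the peer names and helper functions are hypothetical.

```python
import hashlib
from collections import defaultdict


def peer_for_term(term, peers):
    """Pick the peer responsible for a term by hashing it (modulo sharding)."""
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return peers[int.from_bytes(digest[:8], "big") % len(peers)]


def distribute(index, peers):
    """Split a term -> URLs index into one smaller index per peer."""
    shards = defaultdict(dict)
    for term, urls in index.items():
        shards[peer_for_term(term, peers)][term] = urls
    return shards


def lookup(term, shards, peers):
    """Ask only the peer that owns the term."""
    return shards[peer_for_term(term, peers)].get(term, set())


peers = ["peer-a", "peer-b", "peer-c"]  # hypothetical peer names
index = {
    "trisquel": {"trisquel.info"},
    "yacy": {"yacy.net"},
    "gnu": {"gnu.org"},
}
shards = distribute(index, peers)
```

Every peer stores only the terms that hash to it, and a query for a term contacts a single peer. Adding or removing peers reshuffles almost everything with modulo sharding, which is why real systems tend to use consistent hashing instead.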

On Thu Feb 19 2015 at 4:39:48 PM <name at domain> wrote:

> The client-server model (now marketed as "cloud computing") is the simplest
> one and the first tried for every problem. It does not mean it is the only
> one. Not even that it is the best one. Even if it is used as the sole
> option
> for decades.
>
> Take file sharing as an example. FTP was published as RFC 114 on 16 April
> 1971 (thank you Wikipedia). More than 28 years (June 1999), Napster was
> born
> and it became clear that P2P was a more efficient way (in terms of
> bandwidth
> at least) to distribute files.
>

Magic Banana

I am a member!

I am a translator!

Offline
Joined: 07/24/2010

> I'm saying that an individual's home PC isn't capable of indexing the web alone, or holding the index if it could, much less effectively searching it.

YaCy is not about "an individual's home PC". It has never been. http://yacy.net is pretty clear about it. The second sentence is:

> When contributing to the world-wide peer network, the scale of YaCy is limited only by the number of users in the world and can index billions of web pages.

Yet you wrote that we "are batshit" in answer to lembas and SuperTramp83 writing that YaCy is a great idea. Just admit you were wrong, give up the patronizing tone and start over on a better base.

Casey Parker
Offline
Joined: 02/06/2015

Read the discussion. I have nothing against YaCy and I think it's a great
effort. I said you can't index the whole thing on your own. The batshit
comment, which I stand by, was because of the tinfoil-hat-paranoia-induced
concept that you could simply download and search an index of the entire
web, which was what was suggested. I get that you think you're doing
something worthwhile right now - you're not.

Stop trying to 'prove' me wrong, start proving your own ideas. It never
makes sense to disprove something. I'm challenging the assumptions that
have been made in this thread so far, and it'd be nice to see some defense
beyond "you're not nice". I'm a misanthrope, and I hate you for being
human. Unfortunate, I'll give you that, but it also has no bearing on my
arguments or yours.

On Thu Feb 19 2015 at 5:19:49 PM <name at domain> wrote:

> I'm saying that an individual's home PC isn't capable of indexing the web
> alone, or holding the index if it could, much less effectively searching
> it.
>
> YaCy is not about "an individual's home PC". It has never been.
> http://yacy.net is pretty clear about it. The second sentence is:
>
> When contributing to the world-wide peer network, the scale of YaCy is
> limited only by the number of users in the world and can index billions of
> web pages.
>
> Yet you wrote that we "are batshit" in answer to lembas and SuperTramp83
> writing that YacY is a great idea. Just admit you were wrong, give up the
> patronizing tone and start over on a better base.
>

Magic Banana

I am a member!

I am a translator!

Offline
Joined: 07/24/2010

> Read the discussion. I have nothing against YaCy and I think it's a great effort.

Read it yourself. The tree structure of the forum makes it pretty clear: you wrote that we "are batshit" in answer to lembas and SuperTramp83 writing that YaCy is a great idea.

Anyway, writing that we "are batshit" is against the community guidelines: https://trisquel.info/en/wiki/trisquel-community-guidelines

onpon4
Offline
Joined: 05/30/2012

Looks to me like it was a response to a question about the security of OpenJDK (a misunderstanding of the insecurity of Java applets, as being insecurity of Java).

Magic Banana

I am a member!

I am a translator!

Offline
Joined: 07/24/2010

You are right. Funny that Casey Parker cannot point that out by himself (herself?).

However I do not believe Casey Parker's post has anything to do with "a misunderstanding of the insecurity of Java applets, as being insecurity of Java".

My understanding of that message now is that "your data" actually means "the OpenJDK 7 framework", "another source" actually means "Trisquel's repository" (since "openjdk-7-jre-headless" is in it) and the subsequent insult was not only against the community guidelines but also completely gratuitous (because, obviously, "Trisquel's repository" is not "another source").

onpon4
Offline
Joined: 05/30/2012

I meant that the person Casey was responding to misunderstood Java applet insecurity as Java insecurity. It looked to me like Casey was mocking the idea that Java could be insecure.

Casey Parker
Offline
Joined: 02/06/2015

Wow. You know, Icerf, people do other things in life than only replying to
threads here. Given where the conversation has gone in my absence, I'm ...
confused. Just confused. I said nothing about the security OR insecurity of
Java, and certainly never mocked it being insecure.

Carry on. I'll just watch from over here.

On Thu Feb 19 2015 at 7:14:48 PM <name at domain> wrote:

> I meant that the person Casey was responding to misunderstood Java applet
> insecurity as Java insecurity. It looked to me like Casey was mocking the
> idea that Java could be insecure.
>

Magic Banana

I am a member!

I am a translator!

Offline
Joined: 07/24/2010

Over the past two days, someone has sown much confusion in the forum. You. Answering random posts (so that the conversation cannot be followed) with general statements (usually on politics), never backing anything with facts, writing that we are ignorant as much as you can (even when it does not apply at all), even insulting us. You have repeatedly mocked people for using GMail when you are heavily using Google yourself. That is how one can easily discover that you "strongly recommend" anyone to go grab Windows 10: https://trisquel.info/forum/casey-parkers-passion-about-freedom-privacy-and-open-source-software

That does not match the profile of somebody "passionate about freedom, privacy, and open-source software" (as written on https://trisquel.info/users/casey-parker). More that of a troll. Good thing we now have a well-working "-1" button to silently deal with trolls: https://trisquel.info/forum/hiding-down-voted-posts-and-everything-below#comment-63664

Casey Parker
Offline
Joined: 02/06/2015

a *troll* (/ˈtroʊl/, /ˈtrɒl/) is a person who sows
discord on the Internet by disagreeing with Icerf. Apparently.

On Fri Feb 20 2015 at 4:59:48 AM <name at domain> wrote:

> Along the past two days, someone has put much confusion in the the forum.
> You. Answering to random posts (so that the conversation cannot be
> followed)
> with general statements (usually on politics), never backing anything with
> facts, writing that we are ignorant as much as you can (even when it does
> not
> apply at all), even insulting us. You have repeatedly mocked people for
> using
> GMail when you are heavily using Google yourself. That is how one can
> easily
> discover that you "strongly recommend" anyone to go grab Windows 10:
> https://trisquel.info/forum/casey-parkers-passion-about-
> freedom-privacy-and-open-source-software
>
> That does not match the profile of somebody "passionate about freedom,
> privacy, and open-source software" (as written on
> https://trisquel.info/users/casey-parker). More that of a troll. Good
> thing
> we now have a well-working "-1" button to silently deal with trolls:
> https://trisquel.info/forum/hiding-down-voted-posts-and-
> everything-below#comment-63664
>

Ishamael
Offline
Joined: 08/29/2014

*Posted to wrong comment.

Magic Banana

I am a member!

I am a translator!

Offline
Joined: 07/24/2010

> This is a forum for users of an open-source operating system.

This is a forum for users of a *free* operating system. Learn the difference (to use the same patronizing tone as yours): https://www.gnu.org/philosophy/open-source-misses-the-point.html

I did not see any "conspiracy theory" in this thread. Only privacy concerns that are very real.

Casey Parker
Offline
Joined: 02/06/2015

The point missed here is in scale. A home computer isn't capable of
indexing the web.

On Thu Feb 19 2015 at 4:19:49 PM <name at domain> wrote:

> This is a forum for users of an open-source operating system.
>
> This is a forum for users of a *free* operating system. Learn the
> difference
> (to use the same patronizing tone as yours):
> https://www.gnu.org/philosophy/open-source-misses-the-point.html
>
> I did not see any "conspiracy theory" in this thread. Only privacy concerns
> that are very real.
>

Ishamael
Offline
Joined: 08/29/2014

Thank you Banana. :)

*Damn nested comments. Why doesn't this show up where I post it?

onpon4
Offline
Joined: 05/30/2012

No. There's a vulnerability presented by Java applets (via the Java plugin), but I don't know whether or not that applies to IcedTea anyway.

Ishamael
Offline
Joined: 08/29/2014

Thank you onpon, I'll have to try it then.

fabio

I am a member!

I am a translator!

Offline
Joined: 08/02/2010

I'm trying yacy now... I'm very impressed by the concept! I'm thinking it would be nice to have it in Trisquel repos (even if there is already one for Debian-based distros http://debian.yacy.net)

Casey Parker
Offline
Joined: 02/06/2015

YaCy is pretty cool, and desktop search is definitely a good thing ... but
it can't ever really replace a search engine's legion of spiders. Please,
please prove me wrong on that. One huge hurdle is that a simple list of
websites and pages - without even containing the full text to search
through - is terabytes in size. Worse, when a page changes you'd have to
download a whole index again. I just can't see anybody doing that ...

On Tue Feb 17 2015 at 3:59:49 PM <name at domain> wrote:

> I'm trying yacy now... I'm very impressed by the concept! I'm thinking it
> would be nice to have it in Trisquel repos, although there is already one
> for
> Debian-based distros http://debian.yacy.net
>

tomlukeywood
Offline
Joined: 12/05/2014

" Worse, when a page changes you'd have to
download a whole index again."

why?

if you download it from a server the server could check which
pages have changed and only send those to the text file

you definitely would not have to download the whole thing again

"without even containing the full text to search
through - is terabytes in size."

this may be a problem but it doesn't make it impossible, as you could go and buy a 4tb HDD

but i guess a solution could be dividing the web pages into
10mb or 20mb blocks
and when you want to visit a certain website you query the
server for that block. it would still be anonymous
as it would be hundreds of thousands of websites
you're searching through, but you would not have to store a huge amount of data
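
A rough sketch of the block idea, assuming the server groups its word index into fixed buckets and the client asks for a whole bucket by number, then does the actual lookup locally. The bucket count, function names and sample data are invented for illustration. The server still learns which bucket was requested, so this hides the query among everything else in that bucket rather than hiding it completely.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 16  # hypothetical; a real system would size buckets to ~10-20 MB


def bucket_id(word, num_buckets=NUM_BUCKETS):
    """Map a word to a bucket number with a stable hash."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets


def build_buckets(index, num_buckets=NUM_BUCKETS):
    """Server side: group the word -> URLs index into fixed buckets."""
    buckets = defaultdict(dict)
    for word, urls in index.items():
        buckets[bucket_id(word, num_buckets)][word.lower()] = urls
    return buckets


def private_search(word, fetch_bucket):
    """Client side: request a whole bucket, then look the word up locally."""
    bucket = fetch_bucket(bucket_id(word))  # the server only sees this number
    return bucket.get(word.lower(), [])


# Stand-in for the server: records what it was asked for.
index = {"yacy": ["yacy.net"], "trisquel": ["trisquel.info"]}
buckets = build_buckets(index)
requests_seen = []


def fetch_bucket(bid):
    requests_seen.append(bid)
    return buckets.get(bid, {})
```

After a call such as `private_search("yacy", fetch_bucket)`, `requests_seen` holds only a bucket number, never the word itself. Fewer, larger buckets leak less about the query but cost more bandwidth per search.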

but ideally we just wait until 8tb HDDs get cheap

Casey Parker
Offline
Joined: 02/06/2015

You're saying you want to download The Web. That's kind of ... using the
optical drive tray for your coffee cup. As for hundreds of thousands of
websites - there are at least 4.5 billion - and that's just the ones that
have already been indexed by the insanely difficult work of search engines.
That number grows constantly. You could barely even store a list of them on
a 4TB drive, much less a searchable index. These things are done by huge
corporate entities for a pretty good reason. Insane costs.
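
The storage question is easy to put rough numbers on. The sketch below uses the 4.5 billion figure quoted above; the average URL length and the average amount of text per page are guesses made up for this estimate, so the results are only as good as those assumptions.

```python
PAGES = 4_500_000_000      # page count quoted in the thread
AVG_URL_BYTES = 80         # assumption: average length of one URL
AVG_TEXT_BYTES = 10_000    # assumption: about 10 KB of plain text per page

TB = 1e12                  # decimal terabyte, as used on drive labels

url_list_tb = PAGES * AVG_URL_BYTES / TB    # bare list of addresses
full_text_tb = PAGES * AVG_TEXT_BYTES / TB  # uncompressed page text

print(f"URL list:  {url_list_tb:.2f} TB")
print(f"Full text: {full_text_tb:.0f} TB")
```

Under these assumptions the bare list of addresses comes to a few hundred gigabytes, while the page text itself runs to tens of terabytes before any index structures are added. Different averages would shift both numbers proportionally.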

On Wed Feb 18 2015 at 3:14:49 AM <name at domain> wrote:

> " Worse, when a page changes you'd have to
> download a whole index again."
>
> why?
>
> if you download it from a server the server could check which
> pages have changed and only send those to the text file
>
> you definitely would not have to download the hole thing again
>
> "without even containing the full text to search
> through - is terabytes in size."
>
> this may be a probelem it dosent make it impossible as you could go and
> buy a
> 4tb HDD
>
> but i guess a solution could be dividing the web pages into
> 10mb or 20mb blocks
> and when you want to visit a certain website you query the
> server for that data it would still be anonymous
> as it would be hundreds of thousands of websites
> your searching though but you would not have to store a huge amount of data
>
> but ideally we just wait until 8tb HDD's get cheap
>

fabio

I am a member!

I am a translator!

Offline
Joined: 08/02/2010

Sorry, I'm not much of an expert in this, but I don't see much difference between a yacy server and, for instance, a googlebot, except for their number. Which means that yacy is unlikely to replace (let's say) google as the most widespread and efficient search engine, but still the concept is great and if it spreads it can become a good alternative, even with much more limited global storage space...

Casey Parker
Offline
Joined: 02/06/2015

If you need only to search through sites you already know about, and that
you index deliberately, it works okay. It definitely does not and could not
replace a powerhouse search engine - do any of you remember before Search?
You either knew about a page, or you didn't.

On Wed Feb 18 2015 at 7:34:48 AM <name at domain> wrote:

> Sorry, I'm not so expert in this, but I don't see so much difference
> between
> a yacy server and for instance a googlebot except for their number. Which
> means that is unlucky to replace (let's say) google with yacy as the most
> diffused and efficient search engine, but still the concept is great and if
> it will spread it can become a good alternative, even with a much more
> limited global store space...
>

islander
Offline
Joined: 05/28/2013

The Internet Archive holds several such databases. I suppose you could download and save that which interests you. Scroll this page to see available libraries.
https://archive.org/about/

Over time you will probably collect multiple terabytes of data. Save it to external HDDs that can be mounted and searched. We use community sharing of collections that are hard-wired, but are now looking into tower nodes to spread the info. https://thefnf.org/

SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014

Today I found out about this - https://github.com/HelloZeroNet/ZeroNet

fabio

I am a member!

I am a translator!

Offline
Joined: 08/02/2010

It looks nice! It reminds me of this https://freenetproject.org/