Do you know of a list of AI software that is free software?

tonlee

Privacy guy recommends ollama.com/library/llama3 to get a
local AI. Is llama3 free software?

Is there a list of AI software that is free software?
Thank you.

eric23

Magic Banana


> Is llama3 free software?

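No: https://llama.meta.com/llama3/use-policy/ restricts what the model may be used for, which violates freedom 0 (the freedom to run the program as you wish, for any purpose).

Mistral's open-weight models look better: https://docs.mistral.ai/getting-started/open_weight_models/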

GNUser

As for software to run models, I have tested ollama (www.ollama.com), which worked, and jan (https://jan.ai/), which I got running from the AppImage, but the models would download and then fail to run.

As for the models themselves, I think the licensing is a bit of a mess: while Llama does not seem to be free, TinyDolphin (based on TinyLlama, itself based on Llama 3) is under the Apache license. So... I guess maybe you can re-license? I don't know, I'm no expert.

I have tried a few models from ollama.com, all of them under the Apache license (I think I saw some under the MIT license as well). However, you also depend on your hardware: I can't run just any model I want, which makes it harder to stick to a well-licensed model if my machine can't handle it.

Honestly, I gave up using it locally (the best model I could get to run was still slow and not very good anyway), and I have been using it less and less altogether.

If you don't mind me asking, what is your use-case? What do you plan on using it for?

tonlee

> what is your use-case?

No use case. I want to test AI software. I do not want to install nonfree software or use a service as a software substitute.

FreedomForFreedom

I believe we should start addressing the fact that the LLMs currently considered free software are not entirely so.

At present, some LLMs are labeled as free software because their code is free, even though their weights are not. This is a serious misconception, and the Free Software Foundation should take a stance on this issue.

Right now, the companies behind LLMs that claim to be free software are trying to teach users that the weights should be treated as a resource (part of the non-free culture world) and not as software. This is done so that those of us who care about software freedom don't realize the problem.

The weights SHOULD also be free software. They are not just a resource—they are SOFTWARE. They are binaries for which we don't have the source code or a free license. These companies are trying to disguise this massive blob as something trivial.

The LLM weights should be considered part of the software and not just a resource. Treating them as mere artifacts separates them from the software freedom principles we uphold. Just like binaries require their source code to be considered free, weights should also have their full training data and processes available to be reproducible and modifiable. Without this, the control and transparency we expect from free software are lost, giving too much power to companies behind these models. Weights are an essential part of the software and should be treated as such under free software principles.

Magic Banana


The weights of the "free models" I linked to are distributed under the Apache 2.0 license, which is a free software license: https://docs.mistral.ai/getting-started/models/models_overview/#free-models

FreedomForFreedom

I would like to clarify my concerns regarding the construction of weights, which I referred to in my previous message. The weights themselves can be viewed as binaries; their construction relies on either free or proprietary software. The real issue arises when we declare these binaries/weights as free without providing insight into how they were constructed.

If we consider the output of training to be free without disclosing the construction process and the dataset used, we risk treating a precompiled binary as free software even though its source code is not provided and its construction method remains undisclosed.

For it to be considered free, there must be information about its construction so that it can be replicated, modified, and improved.

In the link you shared, where can I find information about the construction of the weights? Specifically, what dataset was used for their creation?

Magic Banana


> The weights themselves can be viewed as binaries; their construction relies on either free or proprietary software. The real issue arises when we declare these binaries/weights as free without providing insight into how they were constructed.

RMS argues for the exact opposite in https://audio-video.gnu.org/video/richard-stallman-university-of-pisa-2023-06-07.webm from 49:14 to 51:53:
> One question raised by machine learning systems that used a trained neural net, which is how they generally work, is "what is the source code?". If you want to release one with source code, so it's free software, the question is "what constitutes the source code for a bunch of data that access a program that has been produced by some sort of calculation and doesn't come from any kind of source code that any people have written?". It turns out that you can modify the trained neural net by running a training program some more and giving it different examples so you can train it to act differently. What this means is that the neural network itself, the values of the nodes in the network, act as source code, because you can modify it the way you want to. So my conclusion is: the trained neural network, that is the data points that go in the nodes of the neural network, they are the source code for themselves, because it's practical, feasible, to modify them to do a different thing. Well, that solves that problem. You don't need to have the original training dataset that would allow you to reproduce the training and get an equivalent neural network. You don't need that in order to make a slightly different one. You just need the original trained neural network. You don't have to repeat the original training. You just do the incremental training to make it different. And this makes it possible to release trained neural network systems as free software. But not if the neural network is so enormous that ordinary users couldn't possibly do retraining on it. I don't have a solution for that, but it may be that those are so dangerous that it's better if they don't exist.
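
To make the "incremental training" in that quote concrete, here is a minimal sketch, assuming a toy one-neuron linear model in plain Python rather than a real neural network. The names `predict` and `fine_tune` and the "pretrained" weights are hypothetical, made up for illustration. The point is only that the script never sees the original training data: it starts from the released weights and a few new examples.

```python
# Toy illustration only: a single linear neuron y = w * x + b.
# The "pretrained" weights stand in for a released model; the data that
# originally produced them is deliberately absent from this script.

def predict(weights, x):
    """Evaluate the model for input x."""
    w, b = weights
    return w * x + b

def fine_tune(weights, examples, lr=0.05, epochs=500):
    """Continue training from existing weights.

    Plain stochastic gradient descent on squared error, one example
    at a time. Returns a new (w, b) pair; the input is not modified.
    """
    w, b = weights
    for _ in range(epochs):
        for x, y in examples:
            error = (w * x + b) - y
            w -= lr * error * x
            b -= lr * error
    return (w, b)

# Hypothetical released weights: the model computes y = 2x.
pretrained = (2.0, 0.0)

# New examples supplied by the user, who wants y = 2x + 1 instead.
new_examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

tuned = fine_tune(pretrained, new_examples)

print("before:", predict(pretrained, 3.0))
print("after: ", round(predict(tuned, 3.0), 3))
```

The weights alone were enough to obtain a model that behaves differently. For a real LLM the same idea applies in principle, but, as the quote itself warns, the retraining may be out of reach on ordinary hardware.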

Magic Banana


The Free Software Foundation does not follow RMS' opinion (unless he changed it): https://www.fsf.org/news/fsf-is-working-on-freedom-in-machine-learning-applications