Freedom verification and auditing guide

gap - 10 months ago -

I propose we write a freedom verification and auditing guide, either in the wiki, or a Markdown or LaTeX document.

In this way, we will have a resource we can point to, and sign-off freedom issue reports by; instead of saying something wishy-washy like "I checked XYZ" or "I audited XYZ", we can say "I verify XYZ is compliant with the Parabola FVaAG (or whatever the name of the guide is) version x.y.z".
When a package is non-compliant in some way, we can point to a specific point of the specification and say "XYZ fails point a.b.c", instead of repeating ourselves ad nauseam, eg. "XYZ is not acceptable for inclusion in Parabola due to..."
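
As a rough illustration of the sign-off idea, here is a minimal Python sketch of what a machine-readable audit record might look like. Everything in it is hypothetical: the guide name "FVaAG", the guide version string, the point numbering, and the `AuditRecord` class are placeholders for discussion, not an existing Parabola tool or format.

```python
from dataclasses import dataclass, field


@dataclass
class AuditRecord:
    """One audit of one specific package version against one guide version."""
    package: str
    package_version: str
    guide_version: str  # hypothetical guide version, e.g. "0.1.0"
    failed_points: list[str] = field(default_factory=list)  # e.g. ["2.3.1"]

    @property
    def compliant(self) -> bool:
        # A record with no failed points counts as compliant.
        return not self.failed_points

    def sign_off(self) -> str:
        """Build the sentence a reviewer would put in a freedom issue report."""
        subject = f"{self.package} {self.package_version}"
        guide = f"the Parabola FVaAG version {self.guide_version}"
        if self.compliant:
            return f"I verify {subject} is compliant with {guide}"
        points = ", ".join(self.failed_points)
        return f"{subject} fails point(s) {points} of {guide}"
```

A reviewer could then paste `AuditRecord("XYZ", "1.2-3", "0.1.0").sign_off()` into a report instead of a free-form "I checked XYZ".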

We need to include the various freedom traps, mistakes, and pitfalls:
- SaaSS
- connection to third-party repos
- embedded blobs
- not built entirely from source at build-time (includes all files, including multimedia), a partly technical issue
- mislabelled license in the PKGBUILD

We should reach out to upstream, both Arch and individual projects, recommending full licensing documentation, eg. in the form of a Debian copyright file, an AUTHORS log, or an equivalent log.


We also need a way to make this a reality.

I propose we keep Parabola intact, so as to prevent the enormous workload in blacklisting and re-blacklisting everything, which could be prone to error, and only blacklist packages which turn out to be non-compliant.

The root cause of this issue is that projects do not know or care about freedom.
If this is not fixed, auditing packages is a fool's errand because every project is constantly mutating.
In other words, this would require us to audit every package from time to time, which is a burden we are not numerous enough to handle.
I fear this approach would lead to there being very, very few packages in Parabola, or we would be forced to switch to the LTS paradigm in which our packages are ancient, but at least we are sure they are free.

We need to offload trust onto the project developers themselves, but if we could do that, then this would not be an issue in the first place.
Teaching people about freedom would make this more of a reality, although we should still remain skeptical of a package's freedom status before it has been verified/audited.

Replies (5)

RE: Freedom verification and auditing guide - bill-auger - 10 months ago -

your enthusiasm is commendable; but that proposal here is useless - it is only making this ticket drift off-topic - this is essentially a private discussion between you and i; but i do not have the time to accomplish any of it - if there were such a plan, the first thing i would do is to blacklist all games, then the work "TODO", would be to audit them and re-introduce them one-by-one

if you are serious about it, the 'dev' mailing list and the forum would be the best ways to propose it, and to gather more volunteers - i would not make such an important decision alone - if you would like to do that, i will re-post this reply to it - either way, these off-topic comments should be deleted from the ticket - it is unnecessarily confusing to discuss multiple topics on the same thread

> I am not a lawyer, nor a professional auditor; I simply check if license files are present, if those licenses are libre, and if all the source files (including non-source code data) can be built from their corresponding source form, so as to ensure all works are genuinely libre.

that is not enough to ensure that all works are genuinely libre - i'm sure that most upstreams do at least that much; and none of them are lawyers either - the license file is almost always present - except for new packaging requests, we already know that it compiles; because it is already in the repos - so that statement is the same as "if we blindly trust the upstream's licensing practices, then what i am doing here is pointless"

the fact that you have found so many examples in the past few weeks demonstrates that game devs do a lousy job at licensing - it seems to me that most distros simply look for a license file; and conclude based on only that - parabola should be more thorough

> A large-scale audit has not been done on a distro before

it does not need to be done for any distro - a distro is only a collection of other software - perhaps it could be done for all of the "other software"; but the result of that, would be equivalent to auditing all distros

> So as not to spend thousands of hours on auditing, I do not dive into the history of each file, nor contact the contributors in question to ask if the licensing information is correct and up-to-date, nor make sure every single file has a license header.

blobs do not have license headers - those files require some external file to explicitly associate each blob with a license file
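
To illustrate what such an external association could look like in practice, here is a minimal Python sketch. The manifest format (a glob pattern mapped to a license name) is an assumption, loosely modelled on the idea of a Debian copyright file; the function name `unlicensed_files` and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase


def unlicensed_files(files, manifest):
    """Return the files that no manifest pattern associates with a license.

    files    -- iterable of relative paths in the source tree
    manifest -- dict mapping glob patterns to license names
                (hypothetical format, not an existing standard)
    """
    return [
        path for path in files
        if not any(fnmatchcase(path, pattern) for pattern in manifest)
    ]


# Example: the sound file is not covered by any pattern, so it is reported.
files = ["src/main.c", "data/logo.png", "data/music/theme.ogg"]
manifest = {"src/*": "GPL-3.0-or-later", "data/*.png": "CC-BY-SA-4.0"}
print(unlicensed_files(files, manifest))
```

This only shows that a file has *some* declared license; whether the declaration is true still needs a human to check.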

also, it is important to note that no graphical programs are essential - there is no imperative for anyone to audit them; except that they are in the repos now - for new packaging requests, i spend as much time as necessary, to do a thorough job on each one; and am not concerned about how many get in or are pushed out - it would be best to have them all excluded, until each is proven fit to be included - of course, it would be a long time before any graphical programs got that attention - the proposal would require the essential packages to be audited first; because those can not be excluded

> In other words, unless we can train someone to become a professional auditor, or hire one, we have to rely on projects, trusting they do a good job in terms of licensing.

if we simply trusted the project maintainers, no one would need to double-check for anything - the decision would be utterly trivial in all cases - if the declared license is a libre one, it would be acceptable, otherwise it would not be acceptable

unfortunately, most project maintainers do not know (or want to know) about proper licensing, especially WRT the project's blobs - for every package i add to the repos, i do check the history as best as it is evident; and i do contact the project maintainers if i have doubts - i would not add it until i had some confidence - i have rejected several packages, because the upstream maintainer insisted that the project was licensed properly and refused to do anything

> One such example of trust is assuming that a license file in the root of the project applies to every file in the project, without specific documentation of every file, such as a Debian copyright file or an AUTHORS log.

that is "blind trust"; because regardless what the upstream thinks, the license file in the root of the project does not apply to every file in the project, in many (maybe even most) cases - if the license is a MIT-like, it applies only to "software and documentation files" - images and sounds are not "software" nor "documentation files" - if the license is a GPL, it applies to no files whatsoever (that is why we rejected 'gmnisrv' last week, regardless that the author believes otherwise) - the GPL is not even required to be present in the code-base - it makes no difference whether a GPL file is present or not present

> I'd like to propose as a long-term goal that every single package must have a full copyright log of every single file in the project.

sounds great; but that can not happen if no one is willing to be as thorough as i am

anyone who can read plain english could learn to do thorough license audits, in a short time - that sort of grand plan would be best suited as an FSDG or FSD project, so that all FSDG distros could benefit from the results; but it is not likely to happen - regardless of any plans or guidelines, very few people actually will take the time to be thorough - we can not even get the FSDG distro maintainers (the people who are responsible for it) to cooperate on licensing audits, nor to agree on the few common conclusions, nor any advice from the FSF on either matter

ie: we need not more plans; but more volunteers

> I'd also like to mention that the license tag in PKGBUILD files is virtually useless.
> Virtually all packages either have an incorrect license specified, or include an aggregation of many files licensed under many different licenses.

the PKGBUILD license=() is an array - it is supposed to specify all of the licenses for all source code

> Generalising about an entire package with a single license tag does not work most of the time.

we are in agreement then

> Therefore, I'd also like to propose that this tag is deprecated and replaced with a pointer to the aforementioned license log.

parabola could not do that; because it would make the PKGBUILDs incompatible with arch

> I understand these two proposals are currently a wish, although if we formally contacted Arch with a proposal, coming from the standpoint of a full distro project, we might start to make progress in this direction.

maybe, if arch agreed; but again, more plans or guidelines would not accomplish anything, without many more volunteers

RE: Freedom verification and auditing guide - bill-auger - 10 months ago -

to recap:

if this proposal were to happen, the best approach would be:
1. blacklist everything in [extra], [community], and [pcr] immediately,
and also anything in [libre], [nonprism], and [nonsystemd], which replaces something in [extra] or [community]
2. audit everything in [core], blacklisting AND liberating any which do not pass scrutiny
3. audit everything in [extra] and [community], restoring only those which pass scrutiny
4. treat all new packaging requests with the same rigor

in that way, the proposed goal would be reached, as soon as phase 2 were completed - i think that is the best approach; because otherwise, there is little assurance that the goal would ever be met in practice - i am not in favor of any grand plans that will never be completed
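
The first two phases of the recap can be sketched as a small Python function, purely to make the rules explicit. The function name `initial_action`, its return strings, and the `replaces_arch_package` flag are hypothetical; cases the recap does not mention are reported as "unspecified" rather than guessed.

```python
# Repos blacklisted wholesale in phase 1.
BLACKLIST_FIRST = {"extra", "community", "pcr"}
# Parabola repos whose packages are blacklisted only if they replace
# something from [extra] or [community].
PARABOLA_REPOS = {"libre", "nonprism", "nonsystemd"}


def initial_action(repo, replaces_arch_package=False):
    """Return what phases 1 and 2 of the recap would do with a package.

    repo                  -- repo name without brackets, e.g. "core"
    replaces_arch_package -- True if the package replaces something in
                             [extra] or [community] (hypothetical flag)
    """
    if repo in BLACKLIST_FIRST:
        return "blacklist"
    if repo in PARABOLA_REPOS and replaces_arch_package:
        return "blacklist"
    if repo == "core":
        return "audit"
    # The recap does not say what happens to the remaining packages.
    return "unspecified"
```

Phases 3 and 4 are not modelled here; they amount to moving a package from "blacklist" back into the repos only after it passes an audit.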

note that if phase 2 can not be completed, parabola may need to go defunct - it may be impossible for any 100% libre GNU+linux distro to exist

and note that, after all that work is done, it still leaves all software installable by TPPMs (third-party package managers), completely unaccounted for - it is a tremendous amount of work, only for the benefit of one distro - that is what the FSD and FSDG work-groups are for

RE: Freedom verification and auditing guide - gap - 10 months ago -

I've just spotted an issue that's relevant to this, which concerns the Parabola Social Contract; see

RE: Freedom verification and auditing guide - gap - 10 months ago -

When we say "audit" something, what we really mean is that we have done something we described as an "audit" to a particular version of the package.

As I mentioned above, we need a spec so as to formalise the meaning of "audit" to mean "version x.y.z of package XYZ is in compliance with the spec".
Then, we can be sure a particular version of a particular package is free, but we'd have to switch to an LTS paradigm and only allow new versions once audited.

This is why I say the root cause of the issue is that we need to teach projects to care about freedom; we cannot do this by ourselves without massive amounts of time and most likely massive amounts of funding.
We need each project to self-audit.

I agree the trust I mentioned is blind, as you pointed out, but without the resources, the primitive checking I did is all I can muster, and it was fruitful at that, yielding over 30 problematic packages.
If such a primitive audit done in spare time yields a high ratio of problematic packages in the category of video games packaged in Parabola, I agree that points to video game developers being sloppy with licensing, although I maintain this stems from the root cause of not knowing or caring about freedom.

RE: Freedom verification and auditing guide - gap - 9 months ago -

As is customary, I came up with a funny name for the auditing project: the Parabola Auditing Packages and Freedom/Liberty Authentication Project (PAPAFLAP).