Science and technology

Avoid this widespread open supply scanning error

Pete Townshend, legendary guitar participant for British rock band The Who, is well-known for enjoying suspended chords. Suspended chords add musical pressure to a tune. For these piano gamers studying this who (like me) like to play in the important thing of C, merely play a C main chord (the notes C, E, and G) and exchange the E observe with both an F or a D. You are actually in your solution to changing into a British rock star!1

Music is usually stuffed with mixtures of chords, like suspended chords, that present pressure, then launch. Although including pressure to a musical composition is fascinating, including pressure to scanning software program with open supply instruments is actually unwelcome.

An subject at Red Hat involving scanning software program led me to put in writing this text.

Recently, an essential buyer raised a priority after scanning a few of our software program’s supply code. As you could know, Red Hat offers the supply code of its software program. The buyer’s scanning software reported that a sure software program file was licensed underneath the GPLv3, which was not anticipated. In reality, the software program file was explicitly and solely marked as being licensed underneath the Apache 2.zero license. The buyer requested that we glance into this subject, and we have been pleased to take action.

After our in-depth analysis, we concluded that their scanning software program is clearly incorrect. We developed a speculation that explains the anomaly, which I’ll now clarify.

A preferred kind of open supply software program scanning software compares the software program being scanned to huge repositories of preexisting open supply software program and stories any matches. For instance, assume there may be an open supply file named MIT.c that returns an integer one larger than the integer handed to it. In different phrases, it’s a easy adder. It might seem like this:

Copyright 2021 Jeffrey R. Kaufman

Permission is hereby granted, freed from cost, to any particular person acquiring a duplicate of this software program and related documentation recordsdata (the "Software"), to deal within the Software with out restriction, together with with out limitation the rights to make use of, copy, modify, merge, publish, distribute, sublicense, and/or promote copies of the Software, and to allow individuals to whom the Software is furnished to take action, topic to the next situations:

The above copyright discover and this permission discover shall be included in all copies or substantial parts of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

int foo(int x)

For this hypothetical instance, assume that MIT.c was positioned in a GitHub repository named The Simple Maths Project as an open supply community-based venture for fixing easy arithmetic issues. This venture comprises many different related C language recordsdata, all underneath the identical MIT License.

Since this hypothetical instance operate is so helpful (in fact it is not, however stick with me right here), it was included to supply easy arithmetic utility in lots of different open supply tasks on GitHub. Also, assume that certainly one of these different tasks, named The Sustained Chord Calculator, makes use of this MIT.c supply file from The Simple Maths Project to assist calculate the musical formulation for suspended chords.

The hypothetical Suspended Chord Calculator venture, along with utilizing MIT.c, additionally consists of a number of supply recordsdata licensed underneath GPLv2. When The Suspended Chord Calculator venture is compiled, you possibly can assume the ensuing executable will include each GPLv2-licensed software program and the MIT-licensed MIT.c as one mixed work in such a means that MIT.c can’t be fairly thought-about impartial and a separate work in itself. That ensuing executable would rightly be thought-about licensed underneath the GPLv2, and the obligations of the GPLv2 have to be complied with. Compliance means offering (or providing to supply for 3 years) the entire sources used to create the binary or executable, together with all of the software program recordsdata licensed underneath GPLv2 and MIT.c.

Moving again to our drawback…

Suppose certainly one of your software program merchandise makes use of MIT.c, along with your individual authored software program. Since MIT.c is solely underneath the MIT License, this may obligate you to adjust to solely the MIT License phrases, which is straightforward to do. Typically, individuals comply by offering a duplicate of the MIT License together with the copyright discover with their software program distribution. Or, if you’re an organization like Red Hat, offering the supply code that comprises the license textual content can also be a technique of compliance—and my really helpful method. (See An economically efficient model for open source software license compliance.)

If you determine to scan the supply code of your software program product utilizing a source-code scanner of the sort that references repositories of open supply tasks, your scanner might seemingly report that MIT.c is licensed underneath the GPLv2! Why? Because it can see MIT.c, in source-code type, related to The Suspended Chord Calculator venture licensed underneath the GPLv2 and assume, naively, that MIT.c additionally have to be topic to GPLv2 phrases. This is however that the MIT.c supply file is clearly marked with an MIT License, and also you copied it solely from the unique MIT-licensed The Simple Maths Project.

This is an unlucky consequence of utilizing all these scanning methods. In this instance, such methods will erroneously report usually each single open supply venture in its repository that makes use of MIT.c. There may very well be tens, a whole lot, and even hundreds of packages that use MIT.c, all underneath completely different licenses, and you can be supplied with a large stack of tasks to overview indicating that MIT.c may very well be MIT licensed, BSD licensed, GPLv2 licensed, or, frankly, carry another open supply license underneath the solar from a venture that simply occurs to make use of MIT.c. And ultimately, you’ll uncover that the file was solely underneath MIT.c. In my expertise, there are only a few conditions the place this sort of scanning is warranted and, even when it’s justifiable, that the file license outcomes are one thing apart from you anticipated. It occurs, however it’s uncommon.

There is one other kind of software program scanning system that stories on licensing by trying just for matches to recognized license texts within the supply recordsdata of the venture. This kind of scanner would detect the MIT License textual content within the supply code and accurately report that the software program is topic to the phrases of the MIT License, however the truth that MIT.c could also be utilized in many different open supply tasks underneath various license phrases. Although this sort of source-code scanner may have false positives, in my expertise, source-code scanners of the sort that reference repositories of open supply tasks have considerably larger charges of misreporting for the explanations mentioned beforehand.

Frankly, source-code scanners that reference repositories of open supply software program to establish license information may be helpful in sure conditions, similar to when you could be hyper involved that an engineer has inadvertently copied and pasted supply code from an unacceptable license with out additionally copying over the relevant license textual content. In that state of affairs, a source-code scanner of the sort that appears just for matches to license texts wouldn’t detect that inclusion. However, as I acknowledged earlier than, this case is exceptionally uncommon, making repository-matching source-code scanners susceptible to errors and a waste of assets for monitoring down the reality. This is time and assets that may very well be dedicated to extra problems with substance. You may handle this case by coaching your builders to by no means copy software program from one other supply with out additionally copying over any relevant license.

A scanner that stories the inaccurate license is doing an incredible disservice to your group by requiring you to resolve a false constructive. Countless hours of wasted assets are devoted to those wild-goose chases…as our buyer skilled.

We will not be fooled once more!


I wish to thank my colleague Richard Fontana for suggesting the title of this text. Read a few of his nice articles on Opensource.com underneath the Law section.

1. If you wish to be taught extra about music principle and suspended chords, try Rick Beato’s evaluation of one other nice monitor from The Who at What Makes This Song Great? Ep. 96 The Who.

Most Popular

To Top