Tom Alrich's Blog: Will version ranges ever work?

I’ve come to realize that version ranges are perhaps the biggest problem in vulnerability management. Here’s why I say that:

1. Vulnerabilities usually are found in a range of versions of a software product, not just in one version or a small number of separate versions. This is because a coding error that leads to a vulnerability often won’t be found until many versions later. In other words, if the vulnerability was introduced in version 1.2 and was discovered and fixed in version 1.8, all versions (including patches) from 1.2 up to, but not including, 1.8 are likely to be vulnerable.

2. When a software developer (either a commercial developer or an open source project) notifies their users of a new vulnerability that applies to a range of versions, they will normally describe the range in regular English (or whatever the vernacular language is), e.g. “Versions 1.2 through 1.7 are vulnerable.”

3. However, when the developer creates a CVE record to report the vulnerability (if they are a CVE Numbering Authority or CNA) or works with a separate CNA to report it, they seldom include a version range in the CPE name, even though they may have described the range in the text of the CVE record. And if a NIST employee or contractor adds a CPE name to the record, they will seldom include a version range in the CPE, even though the CNA described the range in the text of the CVE record.

In other words, while many CVE records include a textual description of a range of affected versions for one or more products, the range is seldom reflected in the machine-readable CPE name included in the record (if one is included at all; today, fewer than half of new CVE records include a CPE name). If an end user organization’s vulnerability management tool reads such a record, it won’t normally flag the range as vulnerable, unless the record contains a separate CPE name for each affected version.

Why aren’t CPE version ranges being included in new CVE records? Even though the CPE spec is complex, I doubt that is the main reason. My guess is the reason has much more to do with the fact that few end users have tools that can make intelligent use of a machine-readable version range, so the users aren’t asking for them.

The question now becomes, “Given that probably most software vulnerabilities occur in a range of versions, why aren’t there many end user tools that can make intelligent use of machine-readable version ranges?” To answer that question, we need to look at end user use cases. There are two main types of these.

The first use case

The first use case answers the question, “Given a version range, does version X fall inside or outside that range?” Depending on the versioning scheme used by the supplier of the software (i.e., the rules followed to create a label for a version, such as 2.2.3), this will be an easy or difficult question to answer. There are two main types of versioning schemes, one mostly followed by open source software communities and the other mostly followed by commercial developers.

The first type of versioning scheme is one that is completely numerical and follows simple arithmetical rules. For example, there is a versioning scheme in which the version is represented as X.Y, where X refers to the major version and Y refers to the minor version. Thus, version 2.3 means major version 2 and minor version 3. If a minor version change were introduced next, the version number would be 2.4. If a major version were introduced next, the new number would be 3.0, since the minor version number reverts to zero with a new major version.

One popular all-numeric scheme is semantic versioning. This follows the model X.Y.Z, where X is the major version, Y the minor version, and Z the patch version. The rules for semantic versioning are not much more complex than those for the major/minor scheme just described.
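For all-numeric schemes like these, the first use case reduces to comparing tuples of integers. Here is a minimal Python sketch under that assumption; the function names are mine, and it deliberately ignores semantic versioning's pre-release and build-metadata suffixes (such as "1.8.0-rc.1"), which need extra rules.

```python
def parse_numeric(version: str) -> tuple[int, ...]:
    """Split an all-numeric version such as '1.10.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, first_affected: str, first_fixed: str) -> bool:
    """True if first_affected <= version < first_fixed (fixed version excluded)."""
    return (
        parse_numeric(first_affected)
        <= parse_numeric(version)
        < parse_numeric(first_fixed)
    )


print(in_range("1.5.3", "1.2.0", "1.8.0"))   # True
print(in_range("1.10.0", "1.2.0", "1.8.0"))  # False

# Comparing the raw strings instead gets the order wrong:
print("1.10.0" < "1.8.0")                    # True, although 1.10.0 is the later version
```

The last line shows why the fields have to be compared as numbers rather than as text.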

The purl community has developed a “mostly universal” means of specifying a version range, called VERS, that works with almost any all-numerical versioning scheme. Since an identifier for the versioning scheme is part of the range specification, a user tool that supports VERS will be able to ingest a version string from one of the supported versioning schemes and respond whether that version is within a range that was specified using VERS. Thus, it is accurate to state that the question in this first use case can be answered for almost any all-numerical version range.
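To make this concrete, here is a rough Python sketch of how a tool might check a version against a range written in the style of VERS, e.g. "vers:npm/>=1.2.0|<1.8.0". It is not an implementation of the specification: it handles only a range with exactly one lower and one upper bound, and the SCHEME_KEYS registry is a hypothetical stand-in that treats "npm" versions as plain dotted integers.

```python
import operator

OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}


def numeric_key(version: str) -> tuple[int, ...]:
    """Comparison key for a dotted all-numeric version."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical registry: maps a scheme identifier to a comparison-key function.
# Assumption for this sketch: "npm" versions are plain dotted integers.
SCHEME_KEYS = {"npm": numeric_key}


def parse_simple_vers(spec: str) -> tuple[str, list[tuple[str, str]]]:
    """Parse 'vers:<scheme>/<constraint>|<constraint>' into scheme and constraints."""
    prefix, _, rest = spec.partition(":")
    scheme, _, constraints = rest.partition("/")
    if prefix != "vers" or not scheme or not constraints:
        raise ValueError(f"not a vers string: {spec!r}")
    parsed = []
    for item in constraints.split("|"):
        for symbol in (">=", "<=", ">", "<"):  # two-character comparators first
            if item.startswith(symbol):
                parsed.append((symbol, item[len(symbol):]))
                break
        else:
            raise ValueError(f"unsupported constraint in this sketch: {item!r}")
    return scheme, parsed


def satisfies(version: str, spec: str) -> bool:
    """True if version lies between the single lower and upper bound in spec."""
    scheme, constraints = parse_simple_vers(spec)
    if sorted(symbol[0] for symbol, _ in constraints) != ["<", ">"]:
        raise ValueError("this sketch needs exactly one lower and one upper bound")
    key = SCHEME_KEYS[scheme]
    return all(OPS[symbol](key(version), key(bound)) for symbol, bound in constraints)


print(satisfies("1.5.3", "vers:npm/>=1.2.0|<1.8.0"))  # True
print(satisfies("1.8.0", "vers:npm/>=1.2.0|<1.8.0"))  # False
```

The point to notice is that the scheme identifier in the string selects the comparison routine, so the tool does not have to guess how to order the versions.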

However, commercial developers often utilize complex versioning schemes that are not all-numerical and/or do not follow the simple rules that the all-numerical schemes follow. Two ways in which complex schemes differ from all-numerical schemes are:

a. They might include letters as fields, not just numbers; and

b. The order in which fields are incremented isn’t self-evident. If, for example, the fields are always incremented moving from left to right, the incremented version will be very different than if the fields are incremented from right to left, or even according to some other plan.

For example, the latest version of Cisco IOS is “15.9(3)M11”. Suppose someone tells you that IOS versions 15.5(1)M8 through 15.9(3)M11 are affected by a serious new vulnerability; you want to know whether the version you’re using, 15.5(5)M9, falls within that range. You won’t be able to answer the question until you have been given three pieces of information:

1. Is ‘M’ a field or an unchanging part of the specification?

2. If M is a field, is it related in any way to one of the numerical fields? For example, is “M11” a unit that could be replaced with another letter/number combination like “N14”? Or does the number vary but not the letter?

3. In what order are the fields incremented? If the fields are incremented moving from left to right (so that 15.9 is incremented first), the incremented version string will be very different than it would be if the fields were incremented moving in the opposite direction (so that 11 was incremented first). It’s also possible that the integer within the parentheses is incremented first, which of course will yield a very different incremented version string.
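The Python sketch below illustrates why these questions matter. It parses the IOS-style strings from the example and compares them under two different assumed field orders. Both orders are hypothetical, and neither is claimed to be Cisco's actual rule. The sketch also assumes the letter is an unchanging part of the string and ignores it.

```python
import re

# Assumed layout: <major>.<minor>(<paren>)<letters><build>, e.g. "15.9(3)M11"
IOS_PATTERN = re.compile(r"(\d+)\.(\d+)\((\d+)\)([A-Z]+)(\d+)")


def parse_ios(version: str) -> dict:
    """Split an IOS-style version string into named fields."""
    match = IOS_PATTERN.fullmatch(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, paren, letters, build = match.groups()
    return {
        "major": int(major),
        "minor": int(minor),
        "paren": int(paren),
        "letters": letters,
        "build": int(build),
    }


def sort_key(version: str, significance: tuple) -> tuple:
    """Order the fields from most to least significant under the given rule."""
    fields = parse_ios(version)
    return tuple(fields[name] for name in significance)


def in_range(version: str, low: str, high: str, significance: tuple) -> bool:
    """True if low <= version <= high under the given field order."""
    return (
        sort_key(low, significance)
        <= sort_key(version, significance)
        <= sort_key(high, significance)
    )


# Two hypothetical ordering rules, most significant field first.
RULE_A = ("major", "minor", "paren", "build")
RULE_B = ("paren", "major", "minor", "build")

print(in_range("15.5(5)M9", "15.5(1)M8", "15.9(3)M11", RULE_A))  # True
print(in_range("15.5(5)M9", "15.5(1)M8", "15.9(3)M11", RULE_B))  # False
```

The same three version strings give opposite answers under the two assumed rules, so a tool cannot answer the question without knowing which rule the supplier actually follows.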

In a post in April, I introduced a term that a large software developer had brought to my attention: “ordering rule”. This is a rule (rather, a set of rules) that describes how the versions in a complex versioning scheme like the one behind IOS are ordered, beyond the simple rule that an integer n is followed by n+1 (which is the basic rule behind the all-numeric schemes). Since commercial software suppliers, especially large ones, often don’t follow simple ordering rules, a tool vendor – on either the developer or consumer side – will need to have an ordering rule for each commercial supplier whose products they support.

Not only that, but there will need to be a standard notation for documenting an ordering rule, plus a “rule interpreter” that can be incorporated into a tool and will interpret each ordering rule for the tool. Thus, a tool for vulnerability or asset management (and other tools as well) would be able to ingest and utilize version ranges based on versioning schemes for many commercial suppliers, if it had previously ingested an ordering rule for each of those suppliers.
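Since no standard notation exists yet, what follows is purely illustrative. The Python sketch invents a small JSON notation for an ordering rule (a regular expression with named fields, the fields listed from most to least significant, and which fields are numeric) and a tiny "rule interpreter" that applies it. The supplier, the notation and the version strings are all hypothetical.

```python
import json
import re

# Hypothetical ordering rule for an imaginary supplier whose versions look like
# "4.2b": release number, update number, and a one-letter "train".
RULE_JSON = r"""
{
  "supplier": "ExampleCorp (hypothetical)",
  "pattern": "(?P<release>\\d+)\\.(?P<update>\\d+)(?P<train>[a-z])",
  "significance": ["release", "train", "update"],
  "numeric_fields": ["release", "update"]
}
"""


class RuleInterpreter:
    """Applies one ordering rule to version strings from one supplier."""

    def __init__(self, rule_json: str):
        rule = json.loads(rule_json)
        self.pattern = re.compile(rule["pattern"])
        self.significance = rule["significance"]
        self.numeric = set(rule["numeric_fields"])

    def key(self, version: str) -> tuple:
        """Build a comparison key, most significant field first."""
        match = self.pattern.fullmatch(version)
        if match is None:
            raise ValueError(f"{version!r} does not match the rule's pattern")
        return tuple(
            int(match[name]) if name in self.numeric else match[name]
            for name in self.significance
        )

    def in_range(self, version: str, low: str, high: str) -> bool:
        """True if low <= version <= high under this ordering rule."""
        return self.key(low) <= self.key(version) <= self.key(high)


interpreter = RuleInterpreter(RULE_JSON)
print(interpreter.in_range("4.9a", "4.2b", "5.1a"))  # False: train "a" sorts before "b"
print(interpreter.in_range("4.1c", "4.2b", "5.1a"))  # True: train "c" sorts after "b"
```

Because the rule is data rather than code, a tool could ingest one such rule per supplier and reuse the same interpreter for all of them. Note that a naive numeric reading would put 4.9 inside the range 4.2 to 5.1 and 4.1 outside it, which is the opposite of what this hypothetical rule says.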

Of course, the standard notation for documenting ordering rules, and the code for the rule interpreter, will need to be developed by some organization, preferably a software security nonprofit like OWASP or OASIS Open. An individual software developer might also take this on if they agree to make all the code developed available to the general public.

I don’t think anything like ordering rules or rule interpreters exists today, but if you know differently, please email me.

The second use case

The second use case answers the question, “Given the versioning scheme used by the supplier of product XYZ as well as the supplier’s ordering rule, what are the versions that fall within this range?” One example I can think of for why this question might need to be answered is the case in which a serious new vulnerability has been discovered that applies to a range of versions of a commercial product; the supplier needs to know every version that falls within the range so they can patch each one. This includes major versions, minor versions, patched versions, new builds, etc. My guess is that a lot of commercial suppliers would find it very difficult to answer this question for at least some of their products.

If I were on a product security team and I had to answer this question, my best hope would be to find that somebody has been keeping meticulous records of new versions all along. That is, they have maintained a single list that includes every version within the range, in an order that strictly follows the ordering rule. However, in a lot of organizations this is too much to hope for. Not only will there not be a comprehensive list of versions available, but it is quite possible that nobody will even be able to state how many versions fall within the range.

However, that’s not the biggest problem. The biggest problem is that I not only have to identify versions that fall within the range, but I must also be able to demonstrate that my list is complete – i.e., there are no other versions that fall within the range.

The best way I can think of to create a provably complete list of versions within a range is to take the steps listed below. I’m breaking this problem into two sub-cases: 2a) The product follows an all-numeric versioning scheme such as Semver, for which the rules are well defined in the Semver specification; and 2b) The product doesn’t follow an all-numeric versioning scheme, but the product’s supplier has documented an ordering rule for the product that makes it clear, for every version string, what the next version string will be[i].

In both sub-cases, the goal is to start with the first version in the range and then run the ordering rule to predict what the next version will be in each possible scenario. Here is an example using sub-case 2a and the simple “X.Y” versioning scheme described earlier:

1. Start with the first version in the range. If that is v2.3, the next version will be either 2.4 (if it is a minor version) or 3.0 (if it is a major version). In some versioning schemes like semver, there will be three or more possible next versions.

2. Determine whether v2.4 or v3.0 was released (if they were both released, somebody made a mistake, unless the product was deliberately “forked”. And if neither was released but subsequent versions were released, somebody made a different mistake).

3. Start over at the first step, this time using whichever subsequent version(s) was released. Continue doing this, while maintaining a list of each released version that has been “discovered”.

4. When you reach the last version in the range, stop.
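The four steps above can be sketched in Python for the simple “X.Y” scheme. This is only an illustration: the was_released callback stands in for whatever release records the supplier can actually query (build logs, tags, download pages), and the sample data is made up. The sketch assumes the first version in the range was released.

```python
from typing import Callable

Version = tuple[int, int]  # (major, minor)


def successors(version: Version) -> list[Version]:
    """Possible next versions in the X.Y scheme: next minor or next major."""
    major, minor = version
    return [(major, minor + 1), (major + 1, 0)]


def enumerate_range(
    first: Version, last: Version, was_released: Callable[[Version], bool]
) -> list[Version]:
    """Walk from first to last, collecting every released version reached."""
    found = []
    seen = set()
    frontier = [first]
    while frontier:
        current = frontier.pop()
        if current in seen or current > last:
            continue  # already handled, or beyond the end of the range
        seen.add(current)
        found.append(current)
        if current == last:
            continue  # step 4: stop when the last version is reached
        # Steps 1-3: predict each possible next version, keep those released.
        frontier.extend(nxt for nxt in successors(current) if was_released(nxt))
    return sorted(found)


# Made-up release records for illustration.
RELEASED = {(2, 3), (2, 4), (2, 5), (3, 0), (3, 1), (3, 2)}

versions = enumerate_range((2, 3), (3, 1), lambda v: v in RELEASED)
print([f"{major}.{minor}" for major, minor in versions])
# ['2.3', '2.4', '2.5', '3.0', '3.1']
```

The walk only finds versions that are reachable by valid increments, so a release that skipped a number would be missed; that is the second kind of mistake mentioned in step 2.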

Of course, sub-case 2b will be more complex than the above, since the ordering rule will be complex. However, the process will not differ in principle from the one just described. In both sub-cases, following the process will generate a list of versions that are vulnerable because they fall within the vulnerable version range.

Lessons learned

There are probably other end use cases for version ranges, besides the two I’ve described. It would be nice if there were a single algorithm that could address all possible end use cases. However, the fact that the algorithms for the one use case and two sub-cases just described are so different points to the fact that there is probably no single algorithm.

What this means is that developers of end user tools for vulnerability management, asset management, software testing, etc. will need to take most of the responsibility for developing the algorithms to make use of the version ranges provided by software developers, vulnerability researchers, etc. – since these algorithms will need to be closely tailored to whatever their tool does.

However, the software developers aren’t off the hook. They are responsible for:

i. Describing the range of affected versions in the text of the CVE record;

ii. If possible, including the range in the CPE name for the affected product and including that in the CVE record;

iii. If one or more of their products doesn’t follow an all-numeric versioning scheme like semantic versioning, preparing an ordering rule for the product; and

iv. When purl becomes an alternative identifier for affected open source products (which I believe will happen by later this year), including the VERS specification of the range in the purl for the affected product.

As you can see, a lot of tools and standards development will be needed before automated use of version ranges in vulnerability management becomes a real possibility. However, there is no really good alternative to taking these steps. Will suppliers suddenly start including tens or even hundreds of CPE names in CVE records to identify every version in a range? They’ve always known they can do that, but they understandably don’t think it’s a good use of their time. So I think it’s only a matter of time before these steps are taken.

My blog is more popular than ever, but I need more than popularity to keep it going. I’ve often been told that I should either accept advertising or put up a paywall and charge a subscription fee, or both. However, I really don’t want to do either of these things. It would be great if everyone who appreciates my posts could donate a $20-$25 (or more) “subscription fee” once a year. Will you do that today?

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

[i] It is also possible that the product doesn’t follow an all-numeric versioning scheme, but the supplier has not provided an ordering rule. In this case, there is no algorithmic method available to answer the question for the second use case.

Monday, July 14, 2025
