Raw Numerical Security

When writing software, it is common practice (at least today) to use numbers to control how the program runs. Numbers define loop counters, error codes, program options, and much more. Spend enough time with code and you begin to apply numbers as easily as breathing. Walking through a list of values or interpreting input from databases, networks, files, and so on becomes second nature. Pulling out parts of a word or a sequence of letters, or arranging the placement of text, pictures, and form fields through relative distances, becomes routine through exposure. Much of this, though, creates a security blind spot.

Numbers Have Flaws Too

As I passed into chapter 6 of Robert Seacord’s book (see the prior posts, Raw Data Security and Raw Code Security), I came away with a higher awareness of the potential security issues in the casual use of numbers that comes naturally to so many who have written software. While the concept of the buffer overflow has become more widely known, there is more to understand about numbers themselves. Rather than repeat what has been written in Robert Seacord’s book, I would summarize by saying that the use of numbers in computer programs has to be handled carefully.

In Range versus Out of Range

Chapter 5 of Secure Coding in C and C++ describes many considerations when dealing with numbers in computer programs. It continues with mitigation practices that include evaluating numbers as either out of range or in range. Of the two approaches, in-range evaluation is usually the better one. In-range evaluation accepts a numerical value only if it falls within the set of numbers between a defined minimum and maximum. Robert Seacord describes this in more detail, but suffice it to say that in-range evaluation can often have better performance and can read better in terms of source code maintainability.
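As a rough sketch of what in-range evaluation might look like in C (the constant names and bounds here are my own illustration, not taken from the book), the value is accepted only when it lies inside a known-good interval and rejected by default otherwise:

```c
#include <stdbool.h>

#define MIN_QUANTITY 1      /* hypothetical lower bound for this example */
#define MAX_QUANTITY 1000   /* hypothetical upper bound for this example */

/* In-range evaluation: accept the value only if it falls inside the
 * known-good interval; everything outside it is rejected by default. */
bool quantity_in_range(long quantity)
{
    return quantity >= MIN_QUANTITY && quantity <= MAX_QUANTITY;
}
```

The out-of-range alternative would instead enumerate the bad values to reject, which is easier to get wrong because any case not listed slips through.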

Between MIN and MAX

Robert Seacord goes on to describe a concept called Saturation Semantics. Based on my reading, it is a means of capping the values of numbers at either end of a scale and eliminating undefined behavior in software code. It is a smart strategy that will not work in every case, and I can think of some cases in financial software where it would not fit so well. However, it is a good baseline for taking in-range evaluation further. The main point is that in some cases it is the writer of the software, and not the compilers and tools, who determines how well numbers will be checked for the out-of-range conditions that could lead to security issues.
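A minimal sketch of saturating addition in C, as I understand the idea (this is my own illustration rather than code from the book): instead of letting a signed sum overflow, the result is pinned to the type's limits.

```c
#include <limits.h>

/* Saturating signed addition: rather than wrapping around or invoking
 * undefined behavior on overflow, the result is clamped to INT_MAX
 * or INT_MIN. The overflow test happens before the addition. */
int sat_add(int a, int b)
{
    if (b > 0 && a > INT_MAX - b)
        return INT_MAX;   /* sum would exceed the maximum: clamp high */
    if (b < 0 && a < INT_MIN - b)
        return INT_MIN;   /* sum would fall below the minimum: clamp low */
    return a + b;         /* safe to add */
}
```

For something like a brightness level or a progress counter, clamping is usually what you want; for financial amounts, silently pinning a result to a limit is exactly the kind of case where, as noted above, it would not fit so well.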

Perimeter Based Coding

During the early 2000s I had an approach to programming that rigorously checked inputs into the program but relaxed cross-checking of data originating within the program. The idea I settled upon was that, from an engineering standpoint, subsystems operating within and across the program stack could be trusted. That afforded excellent system performance and systems development productivity. I have never had a security exploit that I know of, but that does not mean the systems I created were free of security loopholes. While my approach was efficient and benefited from a corporate environment in which the operating systems and configurations were standardized, I can see issues in what I did.

The primary error I made was the assumption of proverbially unlimited computer resources and the implicit durability of program structure. At a time when computer performance, capacity, and features were growing each year, I bought into the common sentiment that to significantly improve software, you do not refine the program but throw more hardware at it. This is intuitive, since the cost of an additional machine, network link, or storage mechanism may be far less than the cost involved in reducing a program's footprint in time and space.

As to the second error, I believed that if external inputs were addressed with great scrutiny and the code itself was fast, accurate, and free of operational errors, I had created a great program. In a highly trustworthy environment that approach is sound, but in a network-connected environment you need something more. That something more may be an effective set of layered perimeters within the program. This of course makes program development slightly more complex, but it could offer a more robust solution that holds up better in the face of malware while maintaining acceptable performance.
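One way a layered perimeter might look in C (a sketch of my own; the function and parameter names are hypothetical) is an internal routine that re-validates its arguments rather than trusting that its caller already did:

```c
#include <stddef.h>
#include <string.h>

/* An internal routine that forms its own perimeter: it re-checks the
 * buffers and length before copying instead of assuming the caller
 * validated them. */
int copy_record(char *dest, size_t dest_size,
                const char *src, size_t src_len)
{
    if (dest == NULL || src == NULL)
        return -1;                 /* reject missing buffers */
    if (src_len >= dest_size)
        return -1;                 /* reject data that cannot fit */

    memcpy(dest, src, src_len);
    dest[src_len] = '\0';          /* room guaranteed by the check above */
    return 0;
}
```

The extra checks cost a few instructions per call, which is the performance trade-off mentioned above, but a bad value that slips past the outer perimeter no longer travels unchecked through the rest of the program.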

Intuition versus Structure

A commonly held belief in software security is that you cannot add security to a computer program after the fact and expect the security measures to be fully effective. I agree with that, but I have rarely practiced it myself. Writing a computer program can proceed either systematically or intuitively, and I would argue that the large majority of computer programs are written intuitively. However, applying security means one of two things. 1.) You methodically design the software from the beginning in alignment with the security standards you observe (from low to high risk mitigation). 2.) You write the program intuitively to flesh out the implementation of the ideas, the functions, and the overall orientation, and then rewrite it from scratch with security in mind. Both approaches are expensive, but the second option seems more likely to produce good results, since the intuitive approach may be more likely to lead to innovation and a stronger alignment of the software to the goals of the producer and audience.

Chapter 5 Makes the Case for Integer Checking

The fifth chapter talks at length about the security implications of numbers, but it is worthwhile. The case has to be made that, under the right conditions, a number like 2147483647 can become 4294967295. When numbers are not adequately range checked, the program behaves in ways the software writer did not anticipate. That unexpected behavior can be the opening malware needs to gain further opportunities with data and outcomes in the real world. In other cases, improperly managed use of numbers can end a business, as in the case of Knight Capital, which lost $440 million. Or consider the analysis presented on Forbes.com that certain kinds of financial software on Wall Street may be a tragedy in waiting. Not all of it has to do with basic numerical checking, but much of it does.
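Two standard illustrations in C of how an unchecked number can silently become something very different (my own examples, not necessarily the exact scenario the book uses): a negative signed value converted to unsigned, and unsigned arithmetic wrapping around.

```c
#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* Signed-to-unsigned conversion: a negative count silently becomes
     * a huge positive value (4294967295 when int is 32 bits). */
    int requested = -1;
    unsigned int as_size = (unsigned int)requested;
    printf("%d becomes %u\n", requested, as_size);

    /* Unsigned wraparound: one past UINT_MAX is 0, with no error raised. */
    unsigned int counter = UINT_MAX;   /* 4294967295 on a 32-bit int */
    counter = counter + 1u;
    printf("UINT_MAX + 1 = %u\n", counter);

    return 0;
}
```

If the converted or wrapped value then feeds a memory allocation or a loop bound, the program is operating on a number the writer never intended.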

Managed Code Never Looked Better


While computer solutions based on Java, Microsoft .NET, and Apple Swift have stronger built-in safeguards that make them immensely more desirable than C and C++, they are not appropriate for every use case. Sometimes software in these newer environments must talk to software defined in C and C++, directly or indirectly. While programs in these newer languages and runtimes may be less susceptible to the kinds of range issues described by Mr. Seacord for C and C++, the data they manage may occasionally get passed into software processes defined in those languages. In that case, what would be a managed condition in Microsoft .NET, for example, could elevate into a potential software vulnerability during the transition from managed to unmanaged code. Insights such as this are part of the value of works like Secure Coding in C and C++. Even if you never create code in C or C++, knowing the issues that can affect the environments we operate in that are indirectly defined by those languages (even 2 – 4 levels removed) can perhaps contribute to more prudent decisions in what we apply and how we evaluate.
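On the native side of such a boundary, the defensive posture might look like the sketch below (my own illustration; the function name, limit, and the idea that it is reached via something like P/Invoke or JNI are assumptions, not from the book): the length that arrives from managed code is range checked again before it is used.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PAYLOAD 4096   /* hypothetical limit for this sketch */

/* A native entry point that might be reached from managed code, for
 * example through P/Invoke or JNI. The length is range checked again
 * on this side of the boundary rather than trusted. */
int process_payload(const unsigned char *data, size_t length)
{
    unsigned char local[MAX_PAYLOAD];

    if (data == NULL || length > sizeof(local))
        return -1;             /* reject out-of-range input at the boundary */

    memcpy(local, data, length);
    /* ... further processing of the local copy would go here ... */
    return 0;
}
```

Whatever guarantees the managed runtime made about that data end at the boundary, so the unmanaged side has to re-establish them.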


By Michael Gautier
