There are five abstraction levels at which you can define a GUI, and your task is to choose one of them. Each level has pros and cons that affect the process of creating a UI, the effort involved, and the quality of the final result. The levels described here are not the only ones in the graphics output ecosystem, but they represent the most common choices available. A given level may be more useful in some situations than in others, but each provides useful capabilities for creating GUIs, videos, photo apps, reports, and various kinds of image recognition and visualization solutions.
Coordinating Visual Output
Whenever you see a program, a photo, or a video on a display screen, there are generally 5 different ways that output is produced. All of them use some form of a function. What is a function? A function is a named or numbered area in a computer’s memory that takes one or more values, computes on them using a processor (CPU, GPU, or other type), and returns a value. At an abstract level, one of those values is always the location of the function itself, so that entering and exiting the function forms a path from input to output. Every graphics output starts with a call to a function. That function could simply put a dot at an X, Y position on the computer’s monitor. Another function could simply color that dot red, green, blue, or any variation thereof. Yet another function may take 2 pairs of X, Y coordinates and draw a series of dots from the first pair to the second in order to form a line. Anything you see on a screen was fundamentally produced by functions like these.
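To make the dot and line functions a little more concrete, here is a minimal sketch in Python. It assumes the "screen" is just a small grid held in memory, and the names put_dot and draw_line are my own illustrations rather than functions from any real graphics library.

```python
# A tiny in-memory "screen": a grid of color names, one entry per dot.
# (Illustrative only: a real display is driven by hardware, not a Python list.)
WIDTH, HEIGHT = 8, 4
screen = [["black"] * WIDTH for _ in range(HEIGHT)]

def put_dot(x, y, color="white"):
    """Color the dot at column x, row y."""
    screen[y][x] = color

def draw_line(x1, y1, x2, y2, color="white"):
    """Draw a series of dots from (x1, y1) to (x2, y2)."""
    steps = max(abs(x2 - x1), abs(y2 - y1))
    if steps == 0:
        put_dot(x1, y1, color)
        return
    for i in range(steps + 1):
        # Walk from the first pair to the second in equal fractions.
        x = round(x1 + (x2 - x1) * i / steps)
        y = round(y1 + (y2 - y1) * i / steps)
        put_dot(x, y, color)

draw_line(0, 0, 7, 3, "red")
for row in screen:
    print("".join("#" if c == "red" else "." for c in row))
```

The line function does nothing more than call the dot function repeatedly, which is the same layering idea the rest of this discussion builds on.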
The concept of these functions is very straightforward, and if that were all there was to it, I would have nothing more to say. The clarity of the concepts just presented is where easy comes to an end. Every graphical operating system has functions for drawing lines, coloring dots, and detecting the X, Y movements of a mouse, finger, or stylus. Android, iOS, Microsoft Windows, macOS, Linux, and UNIX all have a library of functions you can hook your program into to draw GUIs, graphics, photos, animations, and video. The challenge is that the same function has a different name in each library and often takes a different sequence of values to produce graphics output. That is the origin of the 5 GUI library levels.
Why 5 GUI/Graphics Library Levels?
The 5 GUI/Graphics library levels exist because many of the roughly 1 million people (or more, depending on who you cite) who create GUIs would rather spend less time creating them. They use a generalized method that helps them avoid substantial time spent stitching together numerous graphics, geometry, and GUI function calls, and that shortens their effort when adapting a large GUI code base to another operating system.
A second reason they exist in at least 5 levels is that the tastes, preferences, and imaging and interactivity requirements for programs differ widely. Even an operating system’s own graphics/GUI offerings can differ across versions of that operating system. That amends the definition of cross-platform to include not just operating systems that differ based on the organization that provides them (e.g. Apple, Microsoft, or Red Hat), but also different versions of an operating system from the same organization. The groups within an organization that work on these graphics/GUI libraries and frameworks evolve them over time, which eventually leads to significant changes.
Short Description of Graphics/GUI Libraries and APIs
We established that a graphics layer contains a collection of functions to draw images on a screen. That image can be a photo, a button, a text entry field, a list of data, scrollbars, or icons. Let’s say I have a graphics library named Graphics Library #1. A function within Graphics Library #1 is named Dot. The function named Dot takes 4 values named X1, Y1, X2, Y2 and returns true or false, where true indicates the dot was printed on the screen. The value false is important because the values passed in must equate to a perfect square, and false indicates that the values we passed to the function did not meet that requirement. You use the Dot function many times to draw lines.
Later on, you realize that you can build GUIs faster if you have functions that use the Dot function to make standard shapes such as circles, squares, rectangles, triangles, and various polygons. You conclude that such a library would be useful, and you create Graphics Library #2, which has those shape functions. They, in turn, call the Dot function in specific ways to produce the shape each function names. For example, the Square function in Graphics Library #2 will call the Dot function in Graphics Library #1 a sufficient number of times and in the correct sequence to establish a square on the screen. The same goes for the functions for triangles, circles, and general polygons.
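Here is a rough sketch of that layering in Python. To keep it short, I assume a simplified dot function that takes a single X, Y pair and records it in a set standing in for the screen. Both the simplification and the names dot and square are illustrative assumptions, not the exact functions described above.

```python
# --- "Graphics Library #1": one primitive that lights a single dot ---
drawn = set()  # stands in for the screen: the set of (x, y) dots that are lit

def dot(x, y):
    """Light the dot at (x, y)."""
    drawn.add((x, y))

# --- "Graphics Library #2": shapes built only from calls to dot() ---
def square(left, top, side):
    """Outline a square whose top-left corner is (left, top)."""
    right = left + side - 1
    bottom = top + side - 1
    for x in range(left, right + 1):   # top and bottom edges
        dot(x, top)
        dot(x, bottom)
    for y in range(top, bottom + 1):   # left and right edges
        dot(left, y)
        dot(right, y)

square(1, 1, 4)
print(len(drawn), "dots lit")
```

The person calling square never touches dot directly, yet every lit dot still comes from the lower library.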
The various shape functions in Graphics Library #2 contain a lot of code sequences and instructions that apply the Dot function in Graphics Library #1 to achieve the results indicated by a given shape function. However, the person using Graphics Library #2 is oblivious to all this activity and simply calls a function for a square, circle, triangle, or polygon. What they are using is referred to as an API: a collection of functions in which each function contains a sufficient number of steps to accomplish a result based on the data passed to it. Those steps are the function’s implementation. In a related way, you can use the same graphics/GUI library on multiple operating systems, where the API for the library is the same but the implementation underneath is adapted to each operating system’s own API for graphics/GUIs. The person using the API does not know the difference and has achieved cross-platform portability of the program. That is a primary reason for a cross-platform API.
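The split between an API and its implementation can be sketched as follows. Everything here is hypothetical: os_a_plot_pixel and os_b_set_point stand in for the native drawing calls of two made-up operating systems, and CanvasOnA and CanvasOnB are illustrative names for two implementations of one portable API.

```python
# Hypothetical "native" drawing calls for two made-up operating systems.
# They do the same job but differ in name and in argument order.
os_a_pixels = []
os_b_pixels = []

def os_a_plot_pixel(x, y, color):
    os_a_pixels.append((x, y, color))

def os_b_set_point(color, row, column):
    os_b_pixels.append((column, row, color))

class CanvasOnA:
    """The portable API, implemented on top of operating system A."""
    def dot(self, x, y, color):
        os_a_plot_pixel(x, y, color)

class CanvasOnB:
    """The same portable API, implemented on top of operating system B."""
    def dot(self, x, y, color):
        os_b_set_point(color, y, x)

def draw_row(canvas, y, length, color="red"):
    """Program code: written once, against the portable API only."""
    for x in range(length):
        canvas.dot(x, y, color)

draw_row(CanvasOnA(), 2, 3)
draw_row(CanvasOnB(), 2, 3)
print(os_a_pixels == os_b_pixels)
```

The draw_row function never mentions either operating system, so moving the program means swapping the canvas object, not rewriting the drawing code.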
In the real world, APIs have more than a single Dot function and usually (though not always) more than 5 or 6 shape functions. Substantial APIs used to set up graphics/GUI programs may have anywhere from a few hundred to thousands of functions. That is why a programmer with accumulated knowledge of APIs in one context often has to start from ground zero in another context in terms of muscle memory in the use of an API, its nuances, limitations, undocumented flaws, shortcuts, and best practices. That also applies when going forwards or backwards in time within the same API. Some APIs exist to manage this by taking on and going beyond the functionality found in the native API for specific environments. Use of these more generalized APIs can substantially reduce the effort needed to maintain the technical aspects of a program, extending its longevity and greatly inoculating it against obsolescence when a more specific API is retired.
The 5 Graphics/GUI Library Levels
A GUI shows on the screen. The screen is some kind of physical display sitting on your wrist, in your hand, or across from you on a vertical piece of glass-like material. That display is not the place where the GUI sprang into existence. The GUI came from somewhere else and merely appears on the display screen for human eyes (cats, dogs, and other critters too). The visuals for the GUI came from a graphics chip that defined the geometric data in a format the display could understand and use to show the forms, lines, text, and colors on its surface.
Computer code defined the instruction sequences and data for the graphics chip so it could translate it to a display. It took many such instructions and data in a coherent form that the graphics chip could formulate into the requisite signals for the display. Those instructions were, in turn, formed by sequences of functions that simplified, at least for some programmer working at that level, the task of emitting reams and reams of short instructions and data that form a composite output on a display. Preceding that were functions that consolidated the calls to the granular functions just mentioned. The 5 Graphics/GUI Library Levels represent consolidations of function calls at the preceding level until we arrive at a hierarchy of consolidated function groups.
What follows is a description of the 5 Graphics/GUI library levels:
- Direct Graphics Card Control
- System Level Graphics/GUI Control
- Graphics Driver Abstraction
- Graphics/GUI Frameworks
- Automatic GUI Processes
Side Note: The vast majority of people who create GUIs in some form are using capabilities at levels 4 and 5. The most prominent example of a Graphics/GUI framework that works well across operating systems is Qt which is defined at level 4. That is a toolkit that reduces the need to immerse yourself in the intricacies of APIs for specific operating systems while enabling you to create world-class, highly polished, and highly reliable GUI applications. In earlier years, Adobe used Qt for some of its flagship applications that worked on Windows and macOS.
Direct Graphics Card Control
Old school video game consoles, many old school GUIs from the 1980s, and some present-day GUI programs directly access the graphics chip. Not all do this by manipulating memory addresses; APIs exist that accomplish this with varying levels of detailed control over the pixels on a screen. While this may seem antiquated, hold on … not so fast. Some embedded devices such as smart watches, wearable displays, appliance displays, industrial equipment, and military or aerospace displays may require this level of hardware access to present information and provide interactive feedback. The pro is that control over the graphics is absolute, but the downsides in terms of implementation time, preparation, validation, maintenance, and portability can be substantial. In some cases, a team with the right skills that is less concerned with portability can produce superior graphics/GUI solutions with this level of hardware access.
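To give a feel for what working at this level involves, here is a small sketch that treats a block of bytes as display memory and writes one pixel into it. The 16-bit RGB565 pixel format, the little-endian byte order, and the screen size are all assumptions chosen for illustration. Real hardware defines its own layout, and the bytearray only simulates memory that would normally belong to the display.

```python
# Simulated display memory: 2 bytes per pixel, rows stored one after another.
WIDTH, HEIGHT = 16, 8
BYTES_PER_PIXEL = 2
STRIDE = WIDTH * BYTES_PER_PIXEL          # bytes in one row of pixels
framebuffer = bytearray(STRIDE * HEIGHT)  # starts out all zero (black)

def pack_rgb565(r, g, b):
    """Squeeze 8-bit red, green, blue into 5, 6, and 5 bits of one 16-bit value."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def write_pixel(x, y, r, g, b):
    """Compute the pixel's byte offset and store its color there."""
    offset = y * STRIDE + x * BYTES_PER_PIXEL
    value = pack_rgb565(r, g, b)
    framebuffer[offset] = value & 0xFF    # low byte first (assumed little-endian)
    framebuffer[offset + 1] = value >> 8  # then the high byte

write_pixel(3, 2, 255, 0, 0)  # a pure red pixel at column 3, row 2
```

Even this toy version shows why the effort adds up: the programmer owns the offset arithmetic, the color packing, and the byte order, and all three can change from one device to the next.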
System Level Graphics/GUI Control
This is the operating system’s API for providing access to graphics/GUI capabilities. Most of these APIs date from the late 1970s to mid 1980s, and while they offer the most complete synchronization between the GUI and the operating system, they can be nearly as granular and intricate to use, apply, debug, validate, troubleshoot, reason about, and maintain as the preceding layer of hardware access. Major software applications such as Adobe Photoshop, Microsoft Excel, and AutoCAD use various parts of these APIs to achieve a more capable GUI. In other words, you are not required to use the operating system’s graphics/GUI API exclusively, and many larger applications mix APIs to achieve an overall result.
While DirectX for Microsoft Windows may appear to be the API for Windows at this level, it is not. The primary API for Microsoft Windows is still Win32. APIs such as DirectX enhance the available APIs offered by Microsoft for graphics, highly interactive, and GUI applications but are not full substitutes for the main API and indeed could not exist without it. Win32 is a large API that covers many areas, and the subset of Win32 dealing with graphics and GUIs is GDI. Microsoft Windows 10, for example, has alternatives to GDI, but you can still use GDI, and many of the big time programs for Microsoft Windows produced over the last 30 years or so still use it. The traditional equivalent for Linux is the X Window System (often called X-Windows or X11). GTK+ on Linux is built atop X-Windows, whereas the same GTK+ on Windows may use GDI.
Again, use of the operating system’s graphics/GUI API is recommended if you want your GUI to appear and work in lock-step with the design and nuanced visual behaviors of the operating system, as well as integrate more seamlessly with various parts of it. Since the makers of operating systems validate their core API (for example, the Windows team’s tools for debugging Windows itself may use GDI within Win32), you can be sure of the best fit and function when you use this API. A downside is that when the API is replaced, stagnates for a long while, or does not maintain compatibility within the operating system’s lineage or the industry at large, you have no choice but to follow suit. Also, unless the function names and behaviors of the operating system API are carried over to a different operating system you may be interested in, your GUI-based software is stuck on the operating system you built it for.
Graphics Driver Abstraction
The 3rd level of graphics/GUI API is Graphics Driver Abstractions, which are distinct from those built into the operating system. APIs such as Vulkan, OpenGL, DirectX, and Metal manage and enhance access to the graphics chip. Often, they will provide functions for drop shadows; anti-aliased fonts; very smooth lines and curves; processing of matrices defining composite graphics data; real-time animation; transparency; alpha-blending; lighting; texture painting; and much more. However, they can introduce their own complexities. That is why you will find additional APIs at this level, termed game engines, that further consolidate Graphics Driver APIs.
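As one example of what these APIs take off your hands, consider alpha-blending. The sketch below shows the common "source over" arithmetic for a single pixel in plain Python. It is only an illustration of the math and does not call Vulkan, OpenGL, DirectX, or Metal; those APIs perform this kind of work on the graphics chip, and the function names here are my own.

```python
def blend_channel(src, dst, alpha):
    """Mix one color channel: alpha=1.0 keeps src, alpha=0.0 keeps dst."""
    return round(src * alpha + dst * (1.0 - alpha))

def blend_pixel(src_rgb, dst_rgb, alpha):
    """Blend a source pixel over a destination pixel, channel by channel."""
    return tuple(blend_channel(s, d, alpha) for s, d in zip(src_rgb, dst_rgb))

# A half-transparent red pixel drawn over a solid blue background.
print(blend_pixel((255, 0, 0), (0, 0, 255), 0.5))
```

Doing this for every pixel of every frame on the CPU would be slow, which is part of why this level of API exists.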
By this point, you should see a pattern. The more we consolidate API functions at a lower level or produce turnkey uses of them, the more we redefine the problem of complexity into a form appropriate for the level at which we are operating rather than solve it. The complexities involved in creating a GUI for real-time imaging of X-rays in a medical environment will differ from those involved in making video games, and both will differ from those involved in creating a data entry program with a few reports for a telemarketing department.
Graphics Driver Abstractions did not always exist, but their maturation and further consolidation of functionality in game engines led to their use in some cases as the foundation for the next level of API above them. While game engines serve a role in the video game industry, they can also be a substitute framework at the next higher level above Graphics Driver APIs. They can be used to derive a GUI that is more advanced, more visually compelling, and more effective in handling real-time visualization of network streamed data. Since they are often designed to work across operating systems, they can offer an alternate means to build a GUI that works on multiple operating systems. The downside is that they generally lack the widgets common to most GUI programs, such as buttons, text entry fields, data grids, drop downs, check boxes, radio buttons, menus, trees, and list boxes, so you have to create them yourself. In some cases that means you are recreating widgets that have existed in more established GUI frameworks for nearly half a century, along with the level of polish and interactive nuance people have come to expect.
Graphics/GUI Frameworks
The majority of the estimated 1 million programmers use this type of API for GUIs on desktops and mobile (this entire discussion excludes web applications, whose windowing occurs in a program called a web browser). Microsoft Windows Forms, Microsoft WPF, Microsoft UWP, Qt, wxWidgets, Java Swing, Apple SwiftUI, GTK+, and Xamarin are but a few of the available options to productively create a GUI program that contains common functions expressed through drop downs, lists, buttons, data entry fields, text output within shapes (primarily squares and rectangles), and tabular grids. Most of these frameworks consolidate functions at the lower levels in such a way as to enable a person to build a GUI program more quickly and reliably.
They are often accompanied by a visual screen design tool that further speeds up the process of defining a screen. The resultant screen is still accessible to code which customizes and adapts the GUI screen when it actually runs. Some persons skilled in the API itself may forgo the visual tools to achieve even greater control over the resultant GUI output. Whichever route is taken (tool assisted or hand crafted), the overall productivity and certainty of result is orders of magnitude higher than what is available with APIs at a lower level. Note this does not mean the result is superior to those produced using the lower level API but just that the time to create and revise the GUI is faster and the operational outcomes are highly consistent.
Some APIs at this level are more portable than others. Qt works just about everywhere, as do Java Swing and wxWidgets. GTK+ works on Windows and Linux. SwiftUI only works on Apple platforms. WPF only works on Microsoft Windows. Windows Forms works 100% on Windows and about 70% to 80% on Linux. You have to choose your GUI framework carefully because while some of them work well on Windows, work well on Linux, or have solid execution in mobile environments, the way they work, evolve, and are used can conflict with other requirements you have. You really do not have to speculate about this, as extensive comparison charts exist on the Web for many Graphics/GUI Frameworks. For example, Qt has great functionality but may not keep perfectly in sync with the primary programming language used with Qt, while GTK+ may have the opposite condition.
Defining the requirements for which GUI level to use can often start with determining what you are attempting to accomplish. Sometimes this is the wrong starting point, and it can be very useful to decide your primary deployment preferences first. Placing the emphasis on deployment first can help you consider the right design constraints, gauge feasibility and level of effort, and streamline the process of translating general goals into an implementation. If you truly know you want to be cross-platform, it makes little sense to establish design criteria that confine the solution to a single operating environment or a single version of one.
Automatic GUI Processes
The final level is automatic GUI processes. APIs are involved, but in a far more constrained way from the perspective of the person customizing GUIs at this level. You typically use a visual screen design tool at this level, and no other discrete mechanism for defining a screen is available. Alternatively, a system may generate GUI screens wholesale based on data in a database, a file, or another source of data. Many tools are available on the Web that achieve this. Sometimes programmers create such generators themselves, to varying levels of intricacy, to speed up their own work.
HTML should be mentioned in the context of automatic GUIs. HTML is not a programming language but a language to specify screen geometry, widget layout, visual attributes, and data category. A web browser is a program that interprets the specified HTML to determine what to draw on the screen. It then uses APIs at the lower levels to produce the equivalent graphical output: borders around tables, drop downs, and other visual elements and effects. Adobe Flash, Microsoft Access, and other screen builders work in similar ways, with varying levels of control available to the programmer to customize the behavior and output of the GUI.
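As a small illustration of generating a screen from data, the sketch below builds an HTML form from a list of field descriptions. The field list, the build_form name, and the choice of a "Save" button are all assumptions made for the example. A real generator would typically read its field definitions from a database schema or a file.

```python
from html import escape

def build_form(title, fields):
    """Return HTML for a simple data entry form, one input per field."""
    lines = ["<form>", f"  <h1>{escape(title)}</h1>"]
    for name, kind in fields:
        # Turn a column-style name such as "signup_date" into "Signup Date".
        label = escape(name.replace("_", " ").title())
        lines.append(
            f'  <label>{label} '
            f'<input name="{escape(name)}" type="{escape(kind)}"></label>'
        )
    lines.append('  <button type="submit">Save</button>')
    lines.append("</form>")
    return "\n".join(lines)

# Hypothetical field descriptions standing in for a database table definition.
customer_fields = [("full_name", "text"), ("email", "email"), ("signup_date", "date")]
form_html = build_form("Customer", customer_fields)
print(form_html)
```

Add a field to the list and the screen gains an input with no further GUI work, which is both the appeal and the limitation of this level: you get exactly the kind of screen the generator knows how to produce.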
Major organizations have pursued decades-long initiatives to supplant the GUI building approaches enabled by the other levels, principally the 4th level, to no avail. It is a chicken and egg issue. Programming is required to automate GUI creation, but automated GUIs are often insufficiently equipped to sustain their own evolution. Couple that with the limitations of consolidation taken to a certain level, and you come full circle back to a lower level, depending on changing circumstances and needs.
Though the 5th level of GUI implementation sees tremendous use on a daily basis, part of the push for it has been to establish vendor lock-in. Widespread adoption of a single automated way of producing GUIs would tend to perpetuate an operating environment with less growth, slower advancement of the state of the art, lower flexibility, and higher costs, and would greatly impair the realization of alternatives that collectively contribute to the advancement of environments, tools, capabilities, and diverse opportunities.
Though interest in substantially expanding automatic GUI creation has not subsided, it has yet to encompass all areas in which GUIs are needed. Much has been written about why that is, and I will not summarize or restate it here. Instead, I will conclude this section by saying that automatic GUI processes definitely have their uses and are the fastest way to build a GUI. Moreover, they can be the fastest way to build and deploy a highly polished, extremely consistent, and very robust set of screens perfectly coordinated with the data they capture and modify. They can save an enormous amount of time and are not to be discounted.
What we see on the screen comes about through one or more of the 5 processes described. The one that is appropriate to use depends on your goals, skills, time, and general motivations. Many considerations go into the creation of GUI screens, photo visualization programs, video playback programs and apps, and any software that presents visual information.