When you approach database technology today, several options exist. Understanding the better ways to apply a specific form of database technology can help you use that technology to greater effect. The road to getting there is the first task. I am one of those who have knowledge, education, and experience in database technology of a specific kind (Oracle and SQL Server). Persons already predisposed to database technology can take a new relational database and explore it quickly. However, there is a limit to knowledge and experience as certain assumptions do not cross over from one form of a technology to another.
Those who create database technology, create each one a specific way while observing SQL standards that are to be common across database systems. Whether experienced or no, sometimes jumping in and exploring can be helpful but often, starting at the beginning can prove the most valuable exercise. As beginnings go, let’s look at PostgreSQL through the eyes of Neil Matthew and Richard Stones.
At the Beginning
Neil Matthew and Richard Stones published a second edition of their work in 2005 by the name of Beginning Databases with PostgreSQL. The book covers PostgreSQL 8.0 and starts with the history of data, continues through the various major areas of the PostgreSQL system, how it deals with data, how to manage and PostgreSQL and ends with ways to access PostgreSQL from several programming languages. The guide is thorough and complete and would equip anyone aspiring to better understand databases in general. It succeeds in imparting readers with new knowledge and a new aptitude in data necessary to succeed in applying relational databases for managing and accessing data.
General knowledge is necessary for defining the overall structure of a solution. It also comes to one’s aid in evaluating how well a solution meets the requirements for which it was established. Meanwhile, practical knowledge is what drives the creation of the actual solution and the means to conduct management and refitting of process to goal. The book, Beginning Databases with PostgreSQL does an excellent job in making the general and the practical available from beginning to end. Finishing the book at chapter 12 would easily inform you at a broad discussion of how to approach PostgreSQL at database level.
Certainly, if we could just use databases by themselves with SQL, that would be too easy. Databases do a great job of maintaining and managing access to data but they are not generally straightforward in terms of how the information is presented. Usually, you need a spreadsheet, web page, mobile or desktop screen to present the information in a more understandable way. Spreadsheets are discussed in chapter 2 by way of comparison to databases in general. In chapter 5, Microsoft Access and Microsoft Excel are described as possible mechanisms to “sit in front of” (hence the term, front-end) PostgreSQL to show data in a way that is more familiar and easier to understand.
While chapters 2 and 5 addresses the matter of spreadsheets somewhat, chapters 13 – 18 finishes off the work by showing how to write computer programs that access the database. These programs may simply be a way to produce visual reports that can be printed or shared electronically. They may be a way to enter information into the database in a controlled and structured way. They may also be a means to use the database to execute workflow triggers and other kinds of automated data processing. Chapters 13 – 18 gives you the foundation for these kinds of solutions by showing you the fundamental ways of accessing a database information through a computer program. Those chapters cover C, PHP, Perl, Java, and C#.
SQL vs Programming
In my blog post about the upcoming C# 6.0, I shared how SQL is not a programming language which is obvious in that you cannot create technology with SQL. SQL is a way to retrieve and change data in a database but you need a programming language to create computer screens, access networks, and sensor input. Programming languages typically communicate with databases through formal mechanisms called API. Years ago, the documentation for a database API would often show how to access a specific aspect of the API but not how multiple parts come together to form an overall solution for receiving, processing, and posting data regarding databases. The pages from chapter 13 onward shows how this is done.
Most surprising to me was seeing my own patterns for coding represented in chapter 18 in the section, “Extracting Data with NpgsqlDataReader Class”. As I read the pages that followed, I saw a style and structure of code that was all too common (with the exception of the embedded connection string). You can substitute SqlDataReader for NpgsqlDataReader and between the pages of chapter 18 you get the idea of what a large majority of my past coding work consisted of in a corporate IT environment.
Although I know the process backwards and forwards as represented in chapters 17 and 18, I did learn something new in chapters 13 – 16. There is something quite intriguing about seeing the same concepts addressed in different ways in different programming languages. I believe I also learned what I wouldn’t want to touch as the examples provided a good window into how well given languages applied certain conventions. Hints are laid out as to time and effort to remediate certain issues that could emerge in a given language.
You can use PostgreSQL without the involvement of programming languages. I effectively demonstrated that in an earlier post in which I explored PostgreSQL using only my experience before I had read the book. You may find PostgreSQL suitable for data only. However, should you need to create a front-end or automatically running process that accesses PostgreSQL, it is good to know that the fundamental methods for doing so are well presented in the same work that presents the data and administrative methods for the technology. Which language you pursue is a matter of taste, but I do offer some high level analysis of languages that could be of help.
PostgreSQL Scalability and MongoDB
Relational databases are not for everyone as I mention in a recent post. At a global Web level, sometimes, you do better with a solution like MongoDB. In 2006, I once wrote a web-based layer atop SQL Server that I called a content management system. It provided all the software programs I made for a company the ability to access data in a very flexible and highly available way in theory. It worked but was always destined to be limited by the underlying SQL Server technology of the time. That company has since moved on to a third-party content management system.
While I was proud of what I had accomplished at the time, creating an Internet Filesystem, I did not have it in me to repeat the effort. Fortunately, another database technology has emerged that is very similar to what I had striven towards. MongoDB looks a lot like the concept I had defined in a relational context (minus the DropBox style interface) but does so with the support of a broad development community and in a way that has been vetted against high volume web systems. I have recently read the book about MongoDB by Kristina Chodorow and find that it makes an excellent case for NoSQL and document oriented data systems. However, sometimes you want more of the relational than the document orientation.
PostgreSQL team has announced they are adding more of this to future versions of PostgreSQL. That will be good, however, it does little to address the large difference between PostgreSQL and MongoDB. MongoDB is defined from the ground up for the web whereas PostgreSQL was defined pre-web and requires additional configuration and oversight versus MongoDB to have it run well for highly scalable web scenarios. At least that is the general perception that remains debatable. Sarah Mei offers a real world counter scenario on her blog. Regardless, they both have a lot going for them in that they represent managed data platforms (versus plain files) and are suitable fits in an environment inclined towards managed code platforms.
Rationale for PostgreSQL and Relational Databases
When relational databases often become the default choice and the only choice that many know about, it can be unclear why they exist in the first place. The immediate answers seems to be to manage data. However, they are not the only way to manage data. Several mechanisms existed before relational databases. The authors of the book alludes to this when they discuss flat files.
The problem was how to manage data in a way that did not require the reinvention of data management for every report and program that needed to access data. That is the first benefit of relational databases in that they achieve this in a consistent, time and cost-effective way. However, relational databases need not be centralized technology. Other solutions like SQLite can bring this kind of capability down to the program level and there is nothing wrong with having both central and local relational data engine in operation. That just improves the situation even more.
The benefit of PostgreSQL, given that it is open source and actively maintained is that it could serve as an indefinitely available and supported managed data technology when on a Linux or Unix based system. Though you may not elect to use it in an enterprise, it can be very useful for personal pursuits. Those pursuits could involve the maintenance of data or they could be the ongoing update of database skills. In this regard, it seems the book, Beginning Databases with PostgreSQL has found the right tool with which to inform.