About a month ago I decided to revamp the RSS reader API. I am satisfied with the user interface, but I wanted to revisit the underlying code for pulling and saving RSS feed data. The previous API was okay. The list of feeds was read from a file, and the headlines from each feed’s website were pulled based on the website address for the feed. It worked. However, I found that not every website that publishes feeds uses the same format. That means that in some cases you get a description for a headline, but in other cases you do not.
I decided to implement the ability to pull the actual news article content. While it is true that I could have simply added this to the previous API, I had another goal as well. Most websites limit how often you can connect to them for things like RSS feeds. As an example, Slashdot allows one connection every 30 minutes; if you connect more often than that, you could be blocked. I realized I had an API that pulled every feed in the feeds list, with associated headline data, every time the RSS program ran. That was a problem.
The goal then is to get to the point where time limits apply to the API. I am not there yet, because time-stamp data will not be consistent between websites. I will have to generalize an API to handle different date/time representations in connection with the C++ / C time functions. At the same time (the puns are unavoidable) I will need to plan the right approach to introduce time limit control in the program. I will either allow the user interface to drive this, or I will pre-program a limit into the API (say an hour). I wanted to use time values based on when the program runs, but that is not a good approach in this case. Rather, the time will have to be based on the time of the file that contains headline data.
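To make the file-time idea concrete, here is a minimal sketch of the kind of check I have in mind, using standard C++17 `std::filesystem` rather than the final API. The function name, the cache file name, and the 30-minute limit are all illustrative assumptions, not the actual design.

```cpp
#include <chrono>
#include <filesystem>

// Hypothetical check: only pull a feed again if the cached headline
// file is older than a minimum interval. If no cache file exists yet,
// the feed has never been pulled, so pulling is allowed.
bool feed_is_stale(const std::filesystem::path& headline_file,
                   std::chrono::minutes limit)
{
    namespace fs = std::filesystem;
    if (!fs::exists(headline_file)) {
        return true; // no cached data yet; safe to pull the feed
    }
    // Age of the cache file = now minus its last modification time.
    const auto age = fs::file_time_type::clock::now()
                     - fs::last_write_time(headline_file);
    return age >= limit;
}
```

A caller would do something like `feed_is_stale("slashdot_headlines.xml", std::chrono::minutes(30))` before connecting, which keeps the connection-frequency rule tied to the data file rather than to when the program happens to run.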
The past month I was able to commit about one to three nights a week to the endeavor. I would start around 10 PM or 11 PM and finish around 3 AM. The available time was rare, and it was tough staying motivated enough to make progress; the normal daytime experience afterwards could be quite rough. However, I stuck with it. The biggest breakthrough came when I had a couple of hours on a day off where I could sit down in a coffee shop and, instead of writing code, plan everything on notebook paper. I thought deeply through the API expression, the next-level goals, and the foundation for future progress. I needed an approach that would give me the biggest gains in quality in a short amount of time (infrequent late-night/early-morning sprints).
After I finished writing and planning the new API on notebook paper, I paused. That same hour, I dove right in and wrote the first code for the API: just the function definitions, without the implementation. Some people call that “stubbing”. I was never a fan of stub functions in the past, for no real reason, but in this case I found that writing them provided progress and maintained commitment to the end goal. Over several weeks, when fatigue was low and motivation arose, I incrementally worked the new API into existence.
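For readers unfamiliar with the technique, stubbing looks something like the following: complete function signatures with empty or default-returning bodies, so the whole program compiles and the API shape is locked in before any real logic exists. These names are illustrative guesses at the sort of operations described above, not the actual API.

```cpp
#include <string>
#include <vector>

// Illustrative headline record; the real API's fields may differ.
struct FeedHeadline {
    std::string title;
    std::string description;
    std::string article_url;
};

// Stub: will eventually read the feed list from a file.
std::vector<std::string> load_feed_urls(const std::string& /*file_path*/)
{
    return {}; // placeholder result
}

// Stub: will eventually pull and parse headlines for one feed.
std::vector<FeedHeadline> pull_headlines(const std::string& /*feed_url*/)
{
    return {}; // placeholder result
}

// Stub: will eventually retrieve the full article content.
std::string pull_article_content(const FeedHeadline& /*headline*/)
{
    return {}; // placeholder result
}
```

The payoff is exactly what the paragraph describes: each late-night session can fill in one body at a time while the overall structure stays compilable.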
When I ran the command-line version of the program a day or two ago, it was funny how the output was the same while the underlying implementation was different. That is a good result, since re-integration with the user interface will be fairly straightforward. The “meat” of the API is done, but it will take several revisions to get the time-based limit control just right. The POCO C++ Libraries may prove useful in that regard: POCO has all the functions I need to do the time-based limit control in a cross-platform, universal manner. It will come down to the best abstraction, from the perspective of the user interface and command-line programs, for how the time values are retrieved, stored, and evaluated.
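One possible shape for that abstraction (an assumption on my part, not the committed design) is to keep the evaluation step separate from retrieval and storage: the front ends ask a single question, and where the last-pull time comes from (a file timestamp, POCO's time classes, or something else) stays hidden behind it. A minimal sketch in standard C++:

```cpp
#include <chrono>

// Hypothetical limiter: holds the configured limit and evaluates
// whether enough time has passed since the last pull. Retrieval and
// storage of the last-pull time are deliberately left to the caller.
class RetrievalLimiter {
public:
    explicit RetrievalLimiter(std::chrono::minutes limit) : limit_(limit) {}

    // Evaluate: may the feed be pulled, given when it was last pulled?
    bool may_pull(std::chrono::system_clock::time_point last_pull,
                  std::chrono::system_clock::time_point now) const
    {
        return (now - last_pull) >= limit_;
    }

private:
    std::chrono::minutes limit_;
};
```

With this split, the user interface could supply the limit while the command-line program uses a pre-programmed default, and either one can back `last_pull` with whatever time source ends up winning.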
The most recent updates described here are in the 6/29/2018 commit to GitHub. More updates will follow. As I work through this, I realize that to get the RSS reader to a pristine level of functionality, even for a simple process, the final fit requires some strong reworking of the article presentation. I have a raw understanding of where I would like to go with that, but it is too early to describe it at length, as I am still deciding between an easy approach and a more extensive effort.