Scripting is the Glue

Software architecture isn't about writing code; it's about creating solutions, and those can come in many forms.

I grew up in the 80s and 90s, got into computers in the time of DOS, came of age at the time of Windows 3.1 and 95, and was in college for Windows 98 and Windows 2000 (and Windows ME). Windows 2000 is still my favorite Windows. I was definitely into windowed apps and GUIs. My first real exposure to doing things on the command line was running telnet to my university's servers to check my email with some terminal email application whose name I can't recall. That was also my first exposure to the Lynx web browser.

My other exposure came with MP3s. I know that's probably not a traditional route. But I was in college during the real birth of MP3s and digital music, and I was FULLY on board immediately. I had CD cases and a spinning CD tower that could hold like 80 CDs or something. (I would later adopt digital movies the same way.) I had a program that would rip the audio from the CD, then another program called LAME would encode it to MP3 (I remembered the name as a recursive acronym for "LAME is not An MP3 Encoder", even though it really was one). LAME ran on the command line. If I had been into scripting back then, I would have had my entire catalog converted much more quickly. (Edit: I looked into it and learned that LAME stands for "LAME Ain't an MP3 Encoder".)

After college, I had a period where I was developing a lot of Java in my free time. Although my full-time job was with C# and Microsoft tools, I wasn't going to adopt those for personal use because of the expense. Visual Studio was like $2000, hosting was more expensive and less available, and it didn't boast a huge open source trove like Apache Jakarta. Ah, my Jakarta days. I had done some Java in college, and it was generally a neat language. So I got more into Java. With hosting, you could just get a Linux server for like $20 a month or something. Pretty cheap. And having to log into Linux and do everything there, I quickly became comfortable with the terminal. I couldn't tell you the correct syntax for tar, but no one needs to memorize anything anymore :)

I got into scripting for deployments. I carried this with me across all of the platforms I would develop on over the next many years, from Java, to Node.js, to Go. Deployment to a Linux server involved bundling the code, copying it over, and restarting the server in some way to reload the latest code. Maybe applying some SQL scripts as well. So, in Go in particular, you had to find the process you wanted to kill, send a signal to it, deploy, and start the new service. This also involved setting up nginx and, in my earlier days, Tomcat. A small digression... in my Node days, I had written the web server, which was deployed with each application, so just restarting the whole service did the job. I love nginx though.

I've written plenty of scripts, but only now am I getting around to writing about their importance. The operating system provides a ton of useful tools. It's important for a developer to know about them. Any time you don't have to write code to solve a problem is best. No code == no bugs. So, less code == fewer bugs. I always say the idea of coding is a lot like the idea of golfing. The object of golf is to golf as little as possible. Same with coding.

So this week, I touched on this a little bit in my Weather API bonus content post. Scripting is the glue. It absolutely depends on the problem, of course. Sometimes the solution is one application, or one website, or even a single script. Or it could be human effort. It's a massive skill, I think, to be able to look at a problem, see the parts, and delegate. Some solutions will be a combination of many applications, and the order of their execution can and should be scripted.

I'll get a little technical here. With my Weather API geocoding solution, the problem was this:

I want to automate my process of creating the cities.json and countries.json files from geonames.org, cross referencing the locations that I have in my database. These are the steps:

  • Given a list of locations in CSV in the format of city,state,country
    • For each one, find it in the conglomeration of cities500.txt and my custom adds/edits, and add it to the output.
    • Ignore any that are not in my locations.
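
The steps above can be sketched in Go roughly like this. It's a minimal sketch, not the actual geocodecsv code: the names key and crossReference are made up for illustration, rows are assumed to already be split into city, state, country (plus whatever extra columns follow), and the sample values are invented.

```go
package main

import (
	"fmt"
	"strings"
)

// key normalizes a city/state/country triple so lookups ignore case and
// stray whitespace. (Hypothetical helper, not from the original program.)
func key(city, state, country string) string {
	return strings.ToLower(strings.TrimSpace(city)) + "|" +
		strings.ToLower(strings.TrimSpace(state)) + "|" +
		strings.ToLower(strings.TrimSpace(country))
}

// crossReference keeps only the candidate rows whose city/state/country
// appears in wanted, and reports the wanted locations no candidate matched.
// Rows with fewer than three columns are skipped. If several candidates
// share a key, all of them are kept.
func crossReference(wanted, candidates [][]string) (matched [][]string, missing []string) {
	found := make(map[string]bool, len(wanted))
	order := make([]string, 0, len(wanted)) // preserves input order for reporting
	for _, w := range wanted {
		if len(w) < 3 {
			continue
		}
		k := key(w[0], w[1], w[2])
		if _, seen := found[k]; !seen {
			found[k] = false
			order = append(order, k)
		}
	}
	for _, c := range candidates {
		if len(c) < 3 {
			continue
		}
		k := key(c[0], c[1], c[2])
		if _, ok := found[k]; ok {
			matched = append(matched, c)
			found[k] = true
		}
	}
	for _, k := range order {
		if !found[k] {
			missing = append(missing, k)
		}
	}
	return matched, missing
}

func main() {
	// Invented sample data: locations from the database, then candidate rows
	// (downloaded cities plus custom adds/edits) with two extra columns.
	wanted := [][]string{{"Austin", "TX", "US"}, {"Springfield", "ZZ", "US"}}
	candidates := [][]string{
		{"Austin", "TX", "US", "30.27", "-97.74"},
		{"Boston", "MA", "US", "42.36", "-71.06"},
	}
	matched, missing := crossReference(wanted, candidates)
	fmt.Println("matched:", matched)
	fmt.Println("not found:", missing)
}
```

Building the lookup from the smaller list (my locations) and streaming the big cities file past it keeps memory use tied to the number of locations rather than the size of the download.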

I did not know of a good way to get the CSV list out of a database from the command line, so this part I had to write myself in my quick sqlcsv application. It was pretty easy, and it leads to fun-looking code like *(v.(*interface{})). That's just gorgeous.
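
For anyone wondering where an expression like that comes from, here is a minimal sketch of one way it can show up with Go's database/sql package. This is a guess at the general shape, not the real sqlcsv source: formatRow and writeCSV are hypothetical names, and a real program would also need a database driver imported and a connection opened.

```go
package main

import (
	"database/sql"
	"encoding/csv"
	"fmt"
	"io"
	"os"
)

// formatRow converts one scanned row into CSV fields. Each element of dest
// is a *interface{} stored inside an interface{}, hence the double unwrap.
func formatRow(dest []interface{}) []string {
	record := make([]string, len(dest))
	for i, v := range dest {
		switch val := (*(v.(*interface{}))).(type) {
		case nil:
			record[i] = "" // SQL NULL becomes an empty field
		case []byte:
			record[i] = string(val) // some drivers return text as bytes
		default:
			record[i] = fmt.Sprint(val)
		}
	}
	return record
}

// writeCSV streams every row of a result set to out without knowing the
// column types up front. It does not write a header row.
func writeCSV(rows *sql.Rows, out io.Writer) error {
	cols, err := rows.Columns()
	if err != nil {
		return err
	}
	dest := make([]interface{}, len(cols))
	for i := range dest {
		dest[i] = new(interface{}) // Scan accepts *interface{} targets
	}
	w := csv.NewWriter(out)
	for rows.Next() {
		if err := rows.Scan(dest...); err != nil {
			return err
		}
		if err := w.Write(formatRow(dest)); err != nil {
			return err
		}
	}
	if err := rows.Err(); err != nil {
		return err
	}
	w.Flush()
	return w.Error()
}

func main() {
	// Stand-ins for what rows.Scan would fill in, so the demo needs no database.
	var city, state, country interface{} = []byte("Austin"), "TX", nil
	w := csv.NewWriter(os.Stdout)
	if err := w.Write(formatRow([]interface{}{&city, &state, &country})); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
	w.Flush()
}
```

The pointer-to-interface trick is what lets the same loop handle any query, since the column count and types are only known at run time.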

So, that's line one of my script:

./sqlcsv -c connstr.json -q "select distinct city, state, country from tournaments" -out unique_locations.csv

(column and table names changed to protect the innocent)

The next thing is I wanted to have updated data from GeoNames. The files I need sit at pretty stable locations on the internet. Here comes the operating system. In my early Linux days, I'd download stuff all the time with curl. It's just how things are done. Windows also has curl (when and how, I don't know).

Write-Host Downloading country file
curl.exe -L -o countryInfo.txt https://download.geonames.org/export/dump/countryInfo.txt

The cities500 file comes in a zip file because it's 34 MB when extracted.

Write-Host Downloading cities file
curl.exe -L -o cities500.zip https://download.geonames.org/export/dump/cities500.zip

So the next thing is that my program doesn't read zip files. Do I update it to read zip files? Heck no. As with the tar command on Linux, unzipping isn't a core operating system function (like process scheduling or networking), but it is definitely something that comes with operating systems, because they'd be silly not to provide it. And in this case, PowerShell just provides a command that I assume hooks into the .NET libraries for dealing with archives (or maybe there's a compression executable; the details don't matter).

Write-Host Extracting cities500.zip
Expand-Archive cities500.zip -Force -DestinationPath .

I know this archive just has the cities500.txt file inside of it, but they could one day mess with that and this won't work. But today is not that day.

I then run my geocodecsv program with the unique locations csv file, the latest cities500.txt and countryInfo.txt, and my adds/edits. It then outputs the cities it couldn't locate. At this point, it's done.
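
To give a feel for what a program like this has to do with each row of cities500.txt, here is a minimal sketch in Go. It assumes the layout described in the GeoNames dump readme as I understand it: 19 tab-separated columns, with name at index 1, latitude at 4, longitude at 5, country code at 8, and admin1 code at 10. The City struct, its JSON field names, and parseCity are hypothetical, and the sample row is invented; none of this is the real geocodecsv code or the real cities.json format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// City holds the handful of columns a geocoder would need.
// The JSON field names are made up for illustration.
type City struct {
	Name    string  `json:"name"`
	Admin1  string  `json:"admin1"`
	Country string  `json:"country"`
	Lat     float64 `json:"lat"`
	Lon     float64 `json:"lon"`
}

// parseCity splits one tab-separated row and picks out the columns above.
// It returns an error if the row is too short or a coordinate is not a number.
func parseCity(line string) (City, error) {
	f := strings.Split(line, "\t")
	if len(f) < 19 {
		return City{}, fmt.Errorf("expected 19 tab-separated columns, got %d", len(f))
	}
	lat, err := strconv.ParseFloat(f[4], 64)
	if err != nil {
		return City{}, fmt.Errorf("latitude: %w", err)
	}
	lon, err := strconv.ParseFloat(f[5], 64)
	if err != nil {
		return City{}, fmt.Errorf("longitude: %w", err)
	}
	return City{Name: f[1], Admin1: f[10], Country: f[8], Lat: lat, Lon: lon}, nil
}

func main() {
	// Invented sample row in the assumed column layout.
	line := "1\tExampleville\tExampleville\t\t12.5\t-45.25\tP\tPPL\tXX\t\t01\t\t\t\t1000\t\t10\tEtc/UTC\t2024-01-01"
	city, err := parseCity(line)
	if err != nil {
		fmt.Println("skipping row:", err)
		return
	}
	out, err := json.Marshal(city)
	if err != nil {
		fmt.Println("encoding failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

Checking the column count up front means a changed file layout fails loudly instead of quietly producing wrong coordinates.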

The real world presents these types of problems all the time, where it's not just a simple "oh, there's an application written to do that specific thing". Almost never. The efficient programmer can reuse things, step in where necessary, and introduce something that completes the puzzle.

I could have been like "I need an application that downloads a file given a specific url" and written it. It's very easy. Some applications I've written do that, but in those cases downloading is core functionality alongside whatever else the application provides. The judgment call is where the skill comes in. Do I want to turn a program that geocodes a list of locations into bloatware? No. It should just do that, not also download files or run queries against a database. I was looking into making this a website where someone could log in and update it, and add or edit cities. That may become a thing, but for now, this was the best approach.

I'm ecstatic when I see something that can be solved with something that someone has already written, whether it be me or anyone who has software running on my computer. Tons of people took tons of hours to write tons of useful applications. Use them :)

Happy coding!