Budget App

I'm a software developer; if it's not your first time here, or you know me in real life, then you already know this. If I have a problem that can be solved with software, I write software. I was having a hard time keeping track of my money and projecting when hard times would stop being hard. So I made a thing. I named it "ABIDE" after the 5 things it tracks, in order: "adjustments", "balances", "incomes", "debts", and "expenses". It's a very simple app: you enter the data, and it computes out as far as you want to go. I've gone as far into the future as my projected retirement year, 2045.
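The core of it is really just arithmetic. As a rough sketch (hypothetical types and names, not the app's actual code), projecting forward means starting from a known balance and repeatedly applying the recurring amounts:

package main

import "fmt"

// Hypothetical shapes: the real app tracks adjustments, balances,
// incomes, debts, and expenses with their own schedules.
type Entry struct {
	Name   string
	Amount float64 // positive for incomes, negative for expenses and debts
}

// project applies each monthly entry for the given number of months.
func project(balance float64, monthly []Entry, months int) float64 {
	for m := 0; m < months; m++ {
		for _, e := range monthly {
			balance += e.Amount
		}
	}
	return balance
}

func main() {
	monthly := []Entry{
		{Name: "paycheck", Amount: 2000},
		{Name: "rent", Amount: -1200},
	}
	fmt.Println(project(1500, monthly, 12)) // 11100, a year out
}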

Currently it's not open source, but it's on GitHub, so I'll just have to flip the private repo to public.

The premise is easy. Enter data, compute data. Occasionally enter in updated amounts since you can't necessarily track every single dollar in your life. Here are a bunch of screenshots that show it in action.

Adding and updating balances, your starting amounts for savings / checking accounts

Balances

Modifying expenses

Modifying expenses

The Calendar

Calendar

Reconcile, which updates the calendar with actual amounts without modifying your past numbers. Although you can still do that.

Reconcile

And the month summary will tell you the starting and ending amounts for each of your balances.

Month Balances

The project uses Docker with a MongoDB container, so your data is stored in the container, and containers can get wiped. It's a good idea to grab a backup occasionally if you've changed the data! The Data Transfer screen lets you import and export your data.

Data Transfer


I will have more later, but for now, it's helping me a lot! It's written in Go and Angular 7, with MongoDB for storage. Three containers are spun up with docker-compose, and both the Go service and the website are multi-stage builds. It's neat. Code to come! Here's the docker-compose file.

docker compose

It's not the best Angular; I was getting really lazy with it :P Anyway, check back soon, or hit me up on Twitter or something (above) if you want to use it, and I can send you the latest.

SCGen Update

With some client Sitecore work coming up, I've had to think about how to get rid of TDS for that specific client. I haven't had to do much Sitecore template work on that project at all, so I've been able to exclude TDS projects from the solution pretty much from the beginning.
 
However, the TDS code generation within that project created a Model.cs file that is nearly 53,000 lines. 2.4 MB!! It is monumentally important that scgen can generate pretty much the same code, but without the enormous overhead of TDS. Not exactly the same code, though, since much of the code in that Model.cs is repeated, like the same "using" statements duplicated for every single type... (ugh). I can probably get it down to half the size, or even better!
 
The TDS code generator was generating things that weren't covered by scgen, like field wrapper properties along with field wrapper value getters. It was also generating Solr index attributes; index field attributes would use the name of the property, both lowercased and underscore-separated and in the C# property naming style.
 
The project Model.cs needed a lot of new things that just aren't covered by scgen, and I didn't want to add dedicated properties to scgen that only this project needs.
 
So, my solution...
 
Long story short, Go lets you deserialize JSON to a concrete structure much like Newtonsoft.Json does, and it allows a field's type to be json.RawMessage. So the FieldType type looks like this now:
type FieldType struct {
	TypeName      string          `json:"typeName"`
	CodeType      string          `json:"codeType"`
	Suffix        string          `json:"suffix"`
	PropertiesRaw json.RawMessage `json:"properties"`
	Properties    FieldTypePropertyMap
}

type FieldTypePropertyMap map[string]string

To define these properties, you can just add a "properties" JSON property to the field type. Here's an example for the "checkbox" field type:

        { "typeName": "Checkbox", "codeType": "bool", "suffix": "", "properties": {"SpecialProperty": "SpecialPropertyValue"} },

Referencing these properties in the text template is a little verbose. It's not so bad if you can always depend on a property being there, but if you need to check for it first, you can use the "index" function from Go's text/template package, like so:

{{ if ne (index $field.Properties "propertyName") "" }}   The property exists, use it here.  {{ end }}

If the map doesn't include that property, it will just return a blank string. And to write it out in the template, simply do this, again with the index function:

{{ index $field.Properties "propertyName" }}
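Here's a tiny runnable demonstration of both uses of index against a map (not scgen itself, just the text/template mechanics, with a simple struct standing in for the field):

package main

import (
	"os"
	"text/template"
)

func main() {
	// index returns the zero value ("" for a string map) when the key is
	// missing, so the ne check guards the whole block
	t := template.Must(template.New("f").Parse(
		`{{ if ne (index .Properties "SpecialProperty") "" }}value: {{ index .Properties "SpecialProperty" }}{{ end }}`))

	data := struct{ Properties map[string]string }{
		Properties: map[string]string{"SpecialProperty": "SpecialPropertyValue"},
	}
	t.Execute(os.Stdout, data) // prints: value: SpecialPropertyValue
}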

Feel free to browse the code at github.com/jasontconnell/scgen.  There was also an update to the configuration helper that I use for nearly every project I create in Go, located at github.com/jasontconnell/conf.

The Mind of a 20+ Year Developer

I'm not sure how long I've been coding. Let me see... I started with GW-BASIC in my senior year of high school, approximately Fall of 1996, so 22 years in full, but I didn't know jack then. I don't know jack still, so by that measure, I've not been developing at all :)

But computer science started in college in Spring of 1998, so it's been over 20 years since I actually started to learn some things. Those were exciting times! Anyway...

After 20 years as a developer, really into learning everything I could, having dealt with many different languages, coding paradigms, frameworks, ideas, projects, and protocols, at some point a lot of solutions started to seem obvious to me. For a recent example, I've been working on my Go Sitecore API. Sitecore is a hierarchical system for storing content, and "content" is a broad term in the Sitecore world. As they say, "Sitecore is built on Sitecore". It stores all of your data structures (templates), content, ways to render the content (layouts and renderings, for all intents and purposes), system data, etc. Everything is a node. A node is an instance of a "Template". The "Template" node is itself a node. Etc. And it's all tree based.

In writing the Sitecore API, you don't necessarily want to deal with the entire tree. Being a tree, the nodes have a "Path" property, like /sitecore/templates/User Defined/My Template. I wrote a simple way to build the tree from the database, then filter the tree by a set of paths (paths []string). This simply goes through the tree, and the result is a map of GUID -> Item (a node), where the nodes returned all reside within the paths specified. You can provide a path like "-/sitecore/system" (beginning with a hyphen) to say that you want to exclude those items. That code is here. So I found myself needing the opposite.

Give me all nodes in these paths, then give me all nodes NOT in these paths. You could write a set operation, an XOR or something like that. But I needed to do it by path. Knowing I had path operations like "-/sitecore" (starting with a hyphen) to exclude items, I quickly said to myself, "Why not use the same paths, and where one starts with -, remove it, and where it doesn't, prepend it, then filter with those paths?" So that's what I did. You can see that code here.
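The repo has the real thing, but the idea is small enough to sketch (an illustration, not the actual code): toggle the leading hyphen on each path to flip includes and excludes.

package main

import (
	"fmt"
	"strings"
)

// invert flips the include/exclude sense of each path filter:
// "-/sitecore/system" becomes "/sitecore/system" and vice versa.
func invert(paths []string) []string {
	out := make([]string, 0, len(paths))
	for _, p := range paths {
		if strings.HasPrefix(p, "-") {
			out = append(out, p[1:])
		} else {
			out = append(out, "-"+p)
		}
	}
	return out
}

func main() {
	fmt.Println(invert([]string{"/sitecore/templates", "-/sitecore/system"}))
	// [-/sitecore/templates /sitecore/system]
}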

Of course, now I'm thinking the set operation might be a better idea! Get all of the nodes in those paths, then loop through the base tree and add any node whose ID is not in the resulting filtered tree... that might be a little better, I think... although since it results in two loops through the entire tree contents, my original idea may actually be the better one.
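For reference, the set-based alternative would look something like this (with a hypothetical Item type standing in for the API's node type): filter once, then keep everything whose ID isn't in the filtered map.

// complement returns every node in all that did not survive the filter.
func complement(all, filtered map[string]*Item) map[string]*Item {
	out := map[string]*Item{}
	for id, item := range all {
		if _, ok := filtered[id]; !ok {
			out[id] = item
		}
	}
	return out
}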

So you can see how the mind of a 20-year developer works. Also, I'm not afraid to say "Oh, yeah, that can be done much better, I'm going to rewrite it."

An Uncanny Memory

Another thing that I've noticed is that I know how pretty much everything I wrote works. For every project, if someone asked me, "Hey, there's a component in this project you worked on 5 years ago, I have to make an update to it, can you tell me how you wrote it?" "Sure! It works like this", and then I'd spout off all the things it does, and anything to watch out for. Sometimes I'm amazed at how much of that information I've retained through everything I've worked on in the past 20 years.

I might benchmark those two methods and see which one is faster. Happy Coding!

CRUDGEON

I thought that was a word when I first wrote it. But I was thinking about other words like curmudgeon or something. Anyway...


A set of big projects came along at work that, at a high level, consisted of some of the same pieces: get data from a service, store it in a database, generate HTML based on the data. I guess it doesn't matter that HTML is generated. Generate formatted output based on the data. That's better.

The services are not consistent in their details. One is a WSDL web service, one is a JSON service, and two are just schemaless XML. This part was pretty annoying. Schemaless XML and JSON need to go away. It's 2018; the dynamic typing experiment is over ;) (that's sure to ruffle some feathers).

When looking over the data that was coming back, 2 responses returned types that have 130+ fields in them. Each of those would have to be represented in a SQL table, in stored procedures, and in a C# class and interface. Looking over 130+ fields, I immediately thought: there's no way I'm typing all of that by hand.

A really lazy person (like me) would probably use a text editor with regexp find/replace: copy in a list of the fields and run find/replace to format it the way I'd need it at that given moment. Like, as a list of parameters to a stored procedure, or as a list of parameters to a constructor, or as generated properties on a POCO (plain old C# object). I am definitely lazy, but I'm also too lazy to do that each time.

A championship caliber lazy person (like me) would probably write CRUDGEON. I also don't know why I keep doing it in all caps. "crudgeon" is equally acceptable for those whose caps lock and shift keys are just too far away.

So what is it?

Basically, you give it a fake C# list of properties, and it'll generate whatever you need it to. Right now it'll generate:

  1. Database Table
  2. Stored procedures for Get, Update, and Delete
  3. C# objects with appropriate attributes for pulling data from services, like XML attributes, JSON "DataMember" and "DataContract" attributes, and so on.
  4. A convenience "map" script which does the copy from a service object to your POCO, in the case of WSDL objects where you don't want the WSDL-generated type available to everyone, and hence depended on by anything outside what you control (I always do this, by the way... never expose WSDL types, they should only be internal. But I digress).

The README.md has a lot of info on using crudgeon, and there are example inputs and outputs within the git repository itself. I wrote it with these specific projects in mind, and the generated C# code has references to things I wrote specifically for these projects, but if I come across any other project that needs to courier data and store it locally, I will definitely be breaking open the ol' VS Code again. I wrote sqlrun in conjunction with crudgeon because I needed a quick way to run all those SQL files it was generating; I've used it hundreds of times in the week since. After testing, I'd find that I needed a new attribute, or a column needed to allow null, or something. And I'd regenerate all of the code, point sqlrun at the SQL that was generated, and begin importing the code again, all within like 10 seconds of each other.

Maybe you'll find some use for it. I know I definitely will. Like I said, it was definitely written with this set of projects in mind, but with little modification, maybe it can be used more broadly. Or maybe with no modification! I'll know later when I find an excuse to use it again :)

Happy Coding!

Sitecore Doc

Sitecore solutions can become large and unwieldy. Recently I was tasked with the following: find out on which pages each of these service methods (40-50) is called. With how .NET and Sitecore applications (generally all good applications) are written, a service method call would be written within a component, but that component could be put on any page!

Luckily these components manifest themselves in Sitecore as "renderings". They can have data sources and parameters. And code, which is what leads us to this mess in the first place ;)

First we'd need a way to map these renderings to service calls. I came up with a generic "info" data field to do that, in a JSON file which defines all of the renderings we're interested in. On a side note, I only include the renderings we're interested in; if it included everything, this current project would yield a 4-5 MB result file, which would be ridiculous. That JSON looks like this:


{
    "includeUndefined": false,
    "renderings": [
        {
            "id": "6A8BB729-E186-45E7-A72E-E752FDEC2F48",
            "name": "AccountBalancePurgeSublayout",
            "info": [
                "CustomerInformationService.GetCardStatusV4(authHeader, accountId)",
                "CustomerInformationService.GetPurgeInfo(authHeader, accountID)"
            ]
        }
    ]
}
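Reading that config from Go is straightforward. Here's a sketch of what the loader could look like; the JSON property names match the file above, but the struct shapes and file name are my guesses, not necessarily scdoc's:

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
)

type RenderingInfo struct {
	ID   string   `json:"id"`
	Name string   `json:"name"`
	Info []string `json:"info"`
}

type Config struct {
	IncludeUndefined bool            `json:"includeUndefined"`
	Renderings       []RenderingInfo `json:"renderings"`
}

func main() {
	f, err := os.Open("renderings.json") // hypothetical file name
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	var cfg Config
	if err := json.NewDecoder(f).Decode(&cfg); err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(cfg.Renderings), "renderings configured")
}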

Using my (recently updated to accommodate this request) Go Sitecore API, I was able to take that information, map it against every page's renderings (or standard values renderings), and produce a file listing every page and its (eventual) calls into service methods. These aren't directly called within the page code (usually), and there's heavy caching going on as well. Here's what the output looks like:


    Name:     booking-calendar
    ID:       f1837270-6aca-4115-94bc-08d1a4ed43ad
    Path:     /sitecore/content/REDACTED/booking-calendar
    Url:      https://www.REDACTED.com/booking-calendar
    Renderings:
            Default
                CalendarBooking   
                    Path:         /layouts/REDACTED2013/REDACTED/SubLayouts/Booking/Calendar/CalendarBooking.ascx
                    Placeholder:  content 
                    Info:
                                  ReservationService.GetAllRoomTypesV2()
                                  ReservationService.GetCashCalendarV3(GetAuthHeader(),promoCode,startDate,endDate,isHearingAccess,isMobilityAccess, isWeb)
                                  ReservationService.GetCashCalendarWithArrivalV3(GetAuthHeader(), promoCode, roomType, arrivalDt, numNights, isWeb)
            Mobile
                CalendarBookingMobile   
                    Path:         /layouts/REDACTED2013/REDACTEDMobile/SubLayouts/Booking/Calendar/CalendarBookingMobile.ascx
                    Placeholder:  content 
                    Info:
                                  ReservationService.GetAllRoomTypesV2()
                                  ReservationService.GetCashCalendarV3(GetAuthHeader(),promoCode,startDate,endDate,isHearingAccess,isMobilityAccess, isWeb)
                                  ReservationService.GetCashCalendarWithArrivalV3(GetAuthHeader(), promoCode, roomType, arrivalDt, numNights, isWeb)

This was very useful for this specific task; however, it's written in a way that will also be useful going forward, to provide insight into our Sitecore implementations and how the content is structured.

This app will see updates (sorry, the code isn't available for now) so that it will show usages among different renderings, unused renderings, broken ones (a rendering that exists in a renderings field but not as an actual item in Sitecore, because it was deleted or never imported), and other stuff that I can think of. The binary is named "scdoc", as I like to keep my names short :) The Sitecore code generation tool I wrote is simply "scgen".

Check out that Go Sitecore API though if you want to easily query your Sitecore database!  Happy Coding :)

Google App Engine on Windows

For the life of me, I couldn't get the Google App Engine "Quick Start" to start up the development server for Go. The command listed as "dev_appserver.py app.yaml" is the one I'm referring to; Windows kept asking me which program I wanted to use to run it.

Trying to avoid installing another copy of Python (one comes with the Google Cloud SDK), and knowing the "gcloud" command used Python, I decided to look for that gcloud.cmd. And I found it. And I copied it, modifying it to run dev_appserver.py instead of gcloud.py :)

Here it is. Just put it alongside the dev_appserver.py and it'll run fine.



@echo off

SETLOCAL

SET "CLOUDSDK_ROOT_DIR=%~dp0.."
SET "PATH=%CLOUDSDK_ROOT_DIR%\bin\sdk;%PATH%"

SETLOCAL EnableDelayedExpansion

IF "%CLOUDSDK_PYTHON%"=="" (
  SET BUNDLED_PYTHON=!CLOUDSDK_ROOT_DIR!\platform\bundledpython\python.exe
  IF EXIST !BUNDLED_PYTHON! (
    SET CLOUDSDK_PYTHON=!BUNDLED_PYTHON!
  ) ELSE (
    SET CLOUDSDK_PYTHON=python.exe
  )
)
IF "%CLOUDSDK_PYTHON_SITEPACKAGES%" == "" (
  IF "!VIRTUAL_ENV!" == "" (
    SET CLOUDSDK_PYTHON_SITEPACKAGES=
  ) ELSE (
    SET CLOUDSDK_PYTHON_SITEPACKAGES=1
  )
)
SET CLOUDSDK_PYTHON_ARGS_NO_S=!CLOUDSDK_PYTHON_ARGS:-S=!
IF "%CLOUDSDK_PYTHON_SITEPACKAGES%" == "" (
  IF "!CLOUDSDK_PYTHON_ARGS!" == "" (
    SET CLOUDSDK_PYTHON_ARGS=-S
  ) ELSE (
    SET CLOUDSDK_PYTHON_ARGS=!CLOUDSDK_PYTHON_ARGS_NO_S! -S
  )
) ELSE IF "!CLOUDSDK_PYTHON_ARGS!" == "" (
  SET CLOUDSDK_PYTHON_ARGS=
) ELSE (
  SET CLOUDSDK_PYTHON_ARGS=!CLOUDSDK_PYTHON_ARGS_NO_S!
)


SETLOCAL DisableDelayedExpansion


"%COMSPEC%" /C ""%CLOUDSDK_PYTHON%" %CLOUDSDK_PYTHON_ARGS% "%~dp0\dev_appserver.py"" %*

Advent of Code 2017 - Day 13

Day 13 reveals itself as a sort of lock picking exercise. Part one is a simple tumble (get it) through the series of inputs they give you, to figure out if you would be caught on a certain layer, and if so, do some multiplication to get your answer. Simple.

The puzzle could also be thought of as the scene in The Rock (the movie about Alcatraz with Nicolas Cage and Sean Connery), where, to get into the prison, Sean Connery has memorized the timing for the fires that blaze, and rolls through them unscathed.

Except, the timings way down the line don't match up since they themselves are on their own timers. And there's like 40+ of them.

The sample input looks like this:

0: 3
1: 2
4: 4
6: 4

So, layer 0 has a scanner with range 3. On "second 0" the scanner is at 0, on second 1 it's at 1, on second 2 it's at 2, on second 3 it goes back to 1, and on second 4 it's back to the start, blocking any would-be attackers.

Layer 1 only has range 2, so it fluctuates back and forth between 0 and 1.

Since the puzzle input may include gaps, and it's easier (probably) to complete the puzzle with no gaps, the first step is to fill them in! As usual, I'm writing my answers in Go:

func fill(scanners []*Scanner) []*Scanner {
	max := scanners[len(scanners)-1].Depth
	for i := 0; i < max; i++ {
		if scanners[i].Depth != i {
			s := &Scanner{Depth: i, Range: 0, Nil: true}
			scanners = append(scanners[:i], append([]*Scanner{s}, scanners[i:]...)...)
		}
	}
	return scanners
}

That line, append(scanners[:i], append([]*Scanner{s}, scanners[i:]...)...), with all the dots!! What it's doing is pretty simple, though.

If we don't have a scanner at depth i, insert a scanner with depth i at the current position. "scanners[:i]" is all scanners before i, and "scanners[i:]" is all scanners at and after i (the colon syntax is very subtle). So we want to insert the new scanner between those two. That's all it's doing. The ellipsis confusion is just because "append" takes a variadic list of parameters, and you can expand a slice into variadic parameters with the ellipsis. Done!
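If the double-append still looks odd, here's the same idiom in isolation, runnable on its own:

package main

import "fmt"

func main() {
	s := []int{1, 2, 4, 5}
	i := 2
	// insert 3 at index 2: everything before i, the new element, then the
	// rest; the inner append builds a fresh slice first, so nothing from
	// the original backing array gets clobbered before it's copied
	s = append(s[:i], append([]int{3}, s[i:]...)...)
	fmt.Println(s) // [1 2 3 4 5]
}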

We'll need a method to move all of the scanners every step. That's pretty straightforward. I called this method "tick". The Scanner is just a struct with Depth, Range, Current, Dir (for telling which direction the thing is moving), and a Nil flag for the filled-in gaps.

func tick(scanners []*Scanner) {
	for _, s := range scanners {
		if s.Nil {
			continue
		}

		if s.Current == s.Range-1 {
			s.Dir = -1
		} else if s.Current == 0 {
			s.Dir = 1
		}

		s.Current += s.Dir
	}
}

Part 1 was to just send a "packet" (or Sean Connery) through, and every time you're caught by a security scanner (an explosion, going back to the movie), multiply the depth times the range at that scanner and add it to the running total. That part was fine, and you could do it by simulating the physical motion of passing the Sean Connery through :)

So, on to part 2, which is "you need to get through without getting caught this time". Getting caught is simply being at depth "d" when the scanner at "d" is at position 0. You could brute force this.

For brute force, you'd start at delay 0, then try to figure out if you can make it all the way through. If not, move to delay 1 and try again. For each delay, you have to run the tick method; to delay 100 seconds, tick has to run 100 times just to get the puzzle into the correct starting state. So this becomes computationally intense!

This is a fine solution in most instances. In this instance, though, I let it run over lunch and checked in with it 44 minutes later, and it wasn't complete yet! So, back to the drawing board.

But wait!!  Math is a thing. And it's very useful!  I'm actually pretty certain that I don't even need to check the final answer by actually traversing the sequence of layers, it's just the answer. Due to math!

So, to get through safely, the scanner at a particular depth has to not be at position 0 when we're passing through it. I wrote a method to figure this out, called "possible". It's pretty much the key to the puzzle, and to solving it insanely fast.

func possible(scanners []*Scanner, delay int) bool {
	p := true
	for _, s := range scanners {
		blocking := (s.Range*2 - 2)
		p = p && (s.Nil || ((delay+s.Depth)%blocking != 0))
	}
	return p
}

A "Nil" scanner is a scanner that was filled in due to gaps. This one has 0 range and can't block anything. So if it's one of these, it can pass through this depth at this time.

The (s.Range * 2) - 2. Call this "blocking" or whatever. I called it blocking since the scanner is at position 0 of its range every "blocking" number of steps; a scanner with range 4 is back at 0 every 6 steps (4 * 2 - 2). To determine if a delay is possible, the scanner at a layer 7 steps in cannot be blocking at second delay + 7, otherwise the packet gets caught. Hence (delay + depth) % blocking: after the delay, the scanner at depth "depth" has to not be at the start (mod blocking == 0). "p" accumulates, for each layer, whether we can pass through everything up to and including that layer. You could possibly shave off some iterations here by checking p, and if it's false, breaking out of the loop. I might actually update it to do that and report back! It takes about 1 second to run currently. (CONFIRMED: it runs in about half a second now.)
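With possible in hand, the part 2 driver is just a counting loop. This is a sketch of the shape of it (the real loop is in the full solution linked below), reusing the Scanner slice and possible function from above:

func findDelay(scanners []*Scanner) int {
	delay := 0
	for !possible(scanners, delay) {
		delay++
	}
	return delay
}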

So, all that's left is to brute force the delays to find one where it's possible to get through the sequence without getting caught, except now you don't have to actually simulate it, which speeds it up somewhere on the order of a million percent :)

Check out the full solution here - https://github.com/jasontconnell/advent/blob/master/2017/13.go

Happy coding, and happy holidays!!

Go Dep

As of this moment I've updated all of my Go code to use "dep". It's useful, and it works. It's slow, but that'll be fixed. Phew! It was a slight pain updating all of my repos to it, but man, it'll make life easier going forward.

First of all, I had to get all of my dependencies, code libraries that I've written to be shared, into my public GitHub account. After that was complete, I had to update all of the imports. For instance, an import like "conf", a folder directly inside my Go path, becomes "github.com/jasontconnell/conf". The old setup made things interesting, for a few reasons.

  1. I had to create a repo (in my private server) for each dependency.
  2. If I didn't clone that repo on another machine, the build would fail.
  3. If I forgot to commit the dependency repo but committed the repo that was depending on it, the build would fail on my other computer.

These are problems. Here are a few solutions...

  1. For dependency repos, I may only have them in GitHub going forward, at least until it's easy to go get repositories from a private server. All of my repos are literally on the same server this site is running on.
  2. Doing dep ensure will get all of my dependencies, at the versions they were locked at.
  3. Using dep and the dep paths (github.com/jasontconnell/____) will ensure that the project will only build if it's in GitHub.

You'll see all of my dependency repos within GitHub. There are a few full-fledged projects out there as well (including scgen :). It is just a matter of updating the code to use the GitHub URL instead of the local file path, running dep init, and waiting :)

One tricky thing is I have a bash script to automatically deploy websites on that server. It looked like this:

#!/bin/bash

cd src/$1
git fetch

deploybranch=`git branch -a | grep deploy`

if [ -z "$deploybranch" ]; then
   echo no branch named deploy. exiting.
   exit
fi

git checkout deploy
git pull
cd ..

echo building $1

go build -o bin/$1 $1

outdir=/var/www/$1

echo $outdir

PID=`pgrep $1`

echo found pid $PID
if [ -n "$PID" ]; then
    echo killing process $PID
    sudo kill  $PID
fi

sudo cp bin/$1 $outdir/

if [ -d "$PWD/$1/content" ]; then
    echo copying content
    sudo cp -R $PWD/$1/content/ $outdir/content
fi

if [ -d "$PWD/$1/site" ]; then
   echo copying site
   sudo cp -R $PWD/$1/site/ $outdir/site
fi

cd $outdir

sudo nohup ./$1 > $1.out 2> $1.err < /dev/null & > /dev/null

echo $1 started with pid $!

exit

I'm very noobish when it comes to shell scripting. Anyway, this will check out the deploy branch if it exists, pull latest, run go build, kill the current process, copy the binary and site contents over, and start the process back up. A simple build and deploy script.

It is named "deploy.sh". It lives in /home/jason/go and is run just like this: "./deploy.sh jtccom". It finds the folder "jtccom" inside of src and does all of the operations there. However, since I'm now using dep, and none of the dependency files exist within the "vendor" folder (you really shouldn't commit that... dep creates reproducible builds), I had to modify it to run dep ensure first. This has to happen after the pull. I've included the entire contents of the new deploy.sh here:

#!/bin/bash

cd src/$1
git fetch

deploybranch=`git branch -a | grep deploy`

if [ -z "$deploybranch" ]; then
   echo no branch named deploy. exiting.
   exit
fi

git checkout deploy
git pull

if [ -f Gopkg.toml ]; then
   echo Running dep ensure
   dep=`which dep`
   $dep ensure
fi

cd ..

echo building $1

GOCMD=`which go`
$GOCMD build -o bin/$1 $1

outdir=/var/www/$1

echo $outdir

PID=`pgrep $1`

echo found pid $PID
if [ -n "$PID" ]; then
    echo killing process $PID
    sudo kill  $PID
fi

sudo cp bin/$1 $outdir/

if [ -d "$PWD/$1/content" ]; then
    echo copying content
    sudo cp -R $PWD/$1/content/ $outdir/content
fi

if [ -d "$PWD/$1/site" ]; then
   echo copying site
   sudo cp -R $PWD/$1/site/ $outdir/site
fi

cd $outdir

sudo nohup ./$1 > $1.out 2> $1.err < /dev/null & > /dev/null

echo $1 started with pid $!

exit

I've also updated how it calls go and dep, since calling just "go" didn't work anymore for some reason (command not found). Anyway, here's the output:

[jason@Setzer go]$ ./deploy.sh jtccom
remote: Counting objects: 3, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 2), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From ssh://myserver/~git/jtccom
   03999eb..7c49dc6  deploy     -> origin/deploy
   03999eb..7c49dc6  develop    -> origin/develop
Already on 'deploy'
Your branch is behind 'origin/deploy' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
Updating 03999eb..7c49dc6
Fast-forward
 Gopkg.lock | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Running dep ensure
building jtccom
/var/www/jtccom
found pid 29767
killing process 29767
copying content
jtccom started with pid 30260

That Gopkg.lock was updated because I had to update a dependency to also use the GitHub version of the dependency it was importing, since I deleted all dependencies on this server. So that was it. dep is very easy to use and will make my life a lot easier, albeit a little more tedious. BUT! I really can't complain, because the old way of doing things was painful: forget to commit a dependency, and now my code doesn't build on my work computer, so I have to wait until I get home :P Plus, everyone can look at the little dumb Go code I use across multiple projects! Enjoy.

Go and Sitecore Interlude

This is part of the same program I'm developing to generate, serialize, and deserialize items, but it's a general helper method that I found very useful, and it can be used in any Go program. It can be expanded to be more complete; I pretty much wrote it for the things I'm currently working with. You'll see what I mean.

The application is written as one application that does it all (generation, serialization, deserialization). So, it's looking for configuration settings for all aspects of those pieces of functionality. You generally don't want to serialize the same Sitecore paths that you want to generate code against. However, having the configuration in one file is not what I wanted. Here are the drawbacks.

If the configuration is in one file, you'd have to edit it between consecutive runs if you wanted to serialize, git fetch and merge, then deserialize. Your configuration would be committed and would be set for the next person who wants to run the program. And you couldn't write bat files to run updates.

You could use the flag package to control the program pieces. Of course. But I set out to have multiple configs. For instance, if you wanted to serialize from a shared database, then deserialize to your local database. You could also make the config file a flag and point it at different huge files that each differ only by the connection string.

You could.

But then I wouldn't have this cool piece of code :)

Basically, when you run the program, you call it with a "-c" flag which takes a comma-separated list of config files. The program reads them in order and merges them, with configuration values later in the chain overwriting values from earlier ones. I do this using Go's reflect package, as follows:

func Join(destination interface{}, source interface{}) interface{} {
    if source == destination {
        return destination 
    }
    td := reflect.TypeOf(destination)
    ts := reflect.TypeOf(source)

    if td != ts || td.Kind() != reflect.Ptr {
        panic("Can't join different types OR non pointers")
    }

    tdValue := reflect.ValueOf(destination)
    tsValue := reflect.ValueOf(source)


    for i := 0; i < td.Elem().NumField(); i++ {
        fSource := tsValue.Elem().Field(i)
        fDest := tdValue.Elem().Field(i)

        if fDest.CanSet(){
            switch fSource.Kind() {
                case reflect.Int:
                    if fDest.Int() == 0 {
                        fDest.SetInt(fSource.Int())
                    }
                case reflect.Bool: 
                    if fDest.Bool() == false {
                        fDest.SetBool(fSource.Bool())    
                    }
                case reflect.String: 
                    if fDest.String() == "" && fSource.String() != "" {
                        fDest.SetString(fSource.String())
                    }
                case reflect.Slice:
                    fDest.Set(reflect.AppendSlice(fDest, fSource))
                case reflect.Map:
                    if fDest.IsNil(){
                        fDest.Set(reflect.MakeMap(fDest.Type()))
                    }
                    for _, key := range fSource.MapKeys() {
                        fDest.SetMapIndex(key, fSource.MapIndex(key))
                    }
                default:
                    fmt.Println(fSource.Kind())
            }
        } else {
            fmt.Println("Can't set", tdValue.Field(i))
        }
    }

    return destination
}

So, you can see what I mean when I said it can be expanded. I'm only doing strings, bools, ints, slices and maps. The slice handling is different in that it adds values to the current slice. Map handling will add entries or overwrite if the key exists. Strings will only overwrite if the existing string is blank and the source isn't blank. So that's probably different from how I described the code in the beginning :)
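To make the merge order concrete, here's a hedged sketch of folding a chain of config files together with Join (assuming the usual encoding/json, io/ioutil, and log imports). The Configuration type here is a stand-in with a few illustrative fields, not the real one from the conf package:

type Configuration struct {
	ConnectionString string `json:"connectionString"`
	Serialize        bool   `json:"serialize"`
	Generate         bool   `json:"generate"`
}

func loadChain(paths []string) *Configuration {
	merged := &Configuration{}
	for _, p := range paths {
		c := &Configuration{}
		b, err := ioutil.ReadFile(p)
		if err != nil {
			log.Fatal(err)
		}
		if err := json.Unmarshal(b, c); err != nil {
			log.Fatal(err)
		}
		// the freshly read (later) config is the destination: for strings,
		// ints, and bools its values win, and only its zero-valued fields
		// get filled in from what came before
		merged = Join(c, merged).(*Configuration)
	}
	return merged
}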

Go is very useful. There's like, nothing you can't do :)

So the program is called like this:

scgen -c scgen.json,project.json,serialize.json

scgen.json has the template IDs for "template" and "template field", stuff that's pretty OK to hard code. If Sitecore were to change those template IDs, I'm fairly positive a lot of existing code out there would break.

project.json has the connection string, the field type map, serialization path (since it's used for serialization and deserialization), and base paths for serialization.

serialize.json, in this instance, only has { "serialize" : true }  as its entire contents. Files like "generate.json" have "generate": true  as well as the file mode, output path, the Go text template, and template paths to generate.

So these files can be combined to build up an entire configuration. The bools like "serialize" and "generate" control program execution. The settings can live in separate files, and different files can be used depending on the environment, like a continuous integration server or a project pre-build step. I foresee this being used with bat files: create a "generate.bat" which passes generate.json in the config paths, etc. for each program mode. Or a bat file to serialize, git commit, git pull, and deserialize. Enjoy!

Go and Sitecore, Part 3

In parts 1 and 2, so far, we've covered code generation with Go against a Sitecore template tree, and serializing items from the database to disk. Part 3 takes that serialized form and updates the database with items and fields that are missing or different, then clears out any items or fields that were orphaned in the process.

It probably doesn't clear orphaned fields completely correctly, as it will only clear fields where the item doesn't exist anymore. It won't clear fields that no longer belong to the new template if the item's template changed. That'll probably be an easy change, though, as it could probably be done with a single (albeit advanced) query.

Deserializing involves the following steps.

  1. Load all items (already done at the beginning every time the program runs)
  2. Load all field values. This happens if you are serializing or deserializing.
  3. Read the contents from disk, mapping serialized items with items in the database.
  4. Compare items and fields.
    1. If an item exists on disk but not in the database, it needs an insert
    2. If an item exists on the database but not on disk, it needs a delete (and all fields and children and children's fields, all the way down its lineage)
    3. Case 2 above works even if an item was moved, because deletes happen after moves.
    4. Do the same thing for fields... update if it changed, delete if in the db but not on disk, insert if on disk but not in the db.
  5. This can, in some cases, cause thousands of inserts or updates, so we'll do batch updates concurrently.

Deserialization code just involves 2 regular expressions, and filepath.Walk to get all serialized files. Read the files, build the list, map them to items where applicable, decide whether to insert / update / delete / ignore, and pass the whole list of updates to the data access layer to run the updates.

I love the path and filepath packages. Here's my filepath.Walk method.

func getItemsForDeserialization(cfg conf.Configuration) []data.DeserializedItem {
	list := []data.DeserializedItem{}
	filepath.Walk(cfg.SerializationPath, func(path string, info os.FileInfo, err error) error {
		if strings.HasSuffix(path, "."+cfg.SerializationExtension) {
			bytes, _ := ioutil.ReadFile(path)
			contents := string(bytes)
			if itemmatches := itemregex.FindAllStringSubmatch(contents, -1); len(itemmatches) == 1 {
				m := itemmatches[0]
				id := m[1]
				name := m[2]
				template := m[3]
				parent := m[4]
				master := m[5]

				item := data.DeserializedItem{ID: id, TemplateID: template, ParentID: parent, Name: name, MasterID: master, Fields: []data.DeserializedField{}}

				if fieldmatches := fieldregex.FindAllStringSubmatch(contents, -1); len(fieldmatches) > 0 {
					for _, m := range fieldmatches {
						id := m[1]
						name := m[2]
						version, _ := strconv.ParseInt(m[3], 10, 64)
						language := m[4]
						source := m[5]
						value := m[6]

						item.Fields = append(item.Fields, data.DeserializedField{ID: id, Name: name, Version: version, Language: language, Source: source, Value: value})
					}
				}
				list = append(list, item)
			}
		}

		return nil
	})

	return list
}

I did a quick and crude "kick off a bunch of update processes to cut the time down" method.

func update(cfg conf.Configuration, items []data.UpdateItem, fields []data.UpdateField) int64 {
	var updated int64 = 0
	var wg sync.WaitGroup
	wg.Add(6)
	itemGroupSize := len(items)/2 + 1
	fieldGroupSize := len(fields)/4 + 1

	// items - 2 processes
	for i := 0; i < 2; i++ {
		// clamp the bounds; the group sizes round up, so the last
		// group would otherwise slice past the end of the slice
		start, end := i*itemGroupSize, (i+1)*itemGroupSize
		if start > len(items) {
			start = len(items)
		}
		if end > len(items) {
			end = len(items)
		}
		grp := items[start:end]
		go func() {
			// atomic add: += from multiple goroutines is a data race
			// (requires the sync/atomic import)
			atomic.AddInt64(&updated, updateItems(cfg, grp))
			wg.Done()
		}()
	}

	// fields - 4 processes
	for i := 0; i < 4; i++ {
		start, end := i*fieldGroupSize, (i+1)*fieldGroupSize
		if start > len(fields) {
			start = len(fields)
		}
		if end > len(fields) {
			end = len(fields)
		}
		grp := fields[start:end]
		go func() {
			atomic.AddInt64(&updated, updateFields(cfg, grp))
			wg.Done()
		}()
	}

	wg.Wait()

	return updated
}

Very unclever. Take all of the update items and fields, break them into a set number of chunks, and kick off six goroutines, allocating twice as many for fields as for items. Each call to the respective update methods opens its own connection to SQL Server. This could be done much better, but it does accomplish what I set out to accomplish: utilize Go's goroutines, and where something can be done concurrently, do it concurrently to cut down the time required. This is the only part of the program that uses Go's concurrency constructs.

That's it for part 3! Part 4 will come more quickly than part 3 did. I had some things going on, a one-year anniversary with my girlfriend, lots of stuff :)

Series:
Part 1 - Generation
Part 2 - Serialization
Part 3 - Deserialization