Quarantine - Code-id 19

For two months now, we've been social distancing. Luckily, I have a job where I can work from home, which saves me at least an hour a day in commute time. Except now I have to homeschool too. That's fun, though; it just takes time.

In order to be more efficient, I try to automate everything. Everything I write that isn't directly for a work project is there to make me more efficient or to limit the amount of grunt work I'd otherwise have to do.

Recent Problems

Dev content is out of sync! 

Solution: write a dbsync tool. Luckily, the databases we deal with mostly have no identity-insert, auto-incremented integer ID fields, which makes this easier. What makes it harder is that the databases I wrote this tool for have millions of rows, are frequently updated, and contain blobs that are multiple megabytes in size. This one I haven't made public. It takes a config with two connection strings, one for the source and one for the destination, plus table definitions along with a timestamp field used to tell which rows have been updated since the last time the tool ran. When it reads all of the rows from the source table, it updates the stored timestamp, so it pulls much less data on the next run. I just run these occasionally for all of the different databases I have to manage. Having up-to-date content for development is a headache reducer.

Configuration is complex!!

Sitecore uses about a hundred configuration files. We have to keep track of different settings across multiple environments: development content management (CM) and a separate content delivery (CD), the same pair for UAT, and then a more complex configuration set for production, typically with one CM and two CD servers. I solved this during the build: a build process takes all of the base Sitecore configs, copies over application-specific configuration, then finally copies over environment-specific configuration. A separate step deletes any file that has a .deleted file alongside it, like something.config and something.config.deleted. Another step merges updates into the Web.config in the root. This is pretty useful.

The main problem arises when we inherit a site, or want to convert an older site that didn't use this configuration process. So I made a config split tool! You have a folder with all of the Sitecore configuration files, plus each environment's configuration files in other folders, and you run cfgsplit with the proper parameters. It removes all the files that are identical across all of the folders. Then you run another pass with slightly different parameters, and it moves the configs that are identical across the environment folders into a separate application-specific configuration folder. After that you're pretty much done.

The other aspect of this is the Web.config. There's a similar process for it, except it works on XML rather than folder structures. The idea is the same, though. However, some XML nodes are extremely tricky to find a unique identifier for. Take <x><y><a id="unique" /> <b x="y" /></y></x>, where the b node is the same across many different instances of y, and where y never has any attributes but multiple y nodes appear in succession. You might think using indexes would be a good idea. But we're comparing whether these nodes exist in the other file, and identifying them by index would not match across files, so I had to match by content.

For a node like x or y, I grab a hash of the children's contents. That makes a good unique identifier. For a node like b, I had to grab a hash of the siblings. If there are two identical y nodes under the same x, my code will fail, but you can argue how useful two identical sibling nodes in XML would ever be. I won't share this code either. It is nearly complete.

A File is too big to copy over RDP!

This one was fun and only took me an hour. I was heavily inspired by my memories of downloading "warez" in college, where you'd get a RAR archive as a bunch of small files instead of one large one. This code I can share; it is here. I simply call it parts. It takes a file and splits it into chunks whose size is passed in via a parameter. There's no check on that size, now that I think about it, but I could compute how many chunks would result and throw an error if that's over an arbitrary limit, like 100 or something. But if you want to create 2 billion files, go ahead.

Happy coding!

The Mind of a 20+ Year Developer

I'm not sure how long I've been coding. Let me see... I started with GWBasic in my senior year of high school, so approximately Fall of 1996, so 22 years fully, but I didn't know jack then. I don't know jack still, so by that measure, I've not been developing at all :)

But computer science started in college in Spring of 1998, so it's been over 20 years since I actually started to learn some things. Those were exciting times! Anyway...

As a developer of 20 years who was really into learning everything I could, having dealt with many different languages, coding paradigms, frameworks, ideas, projects, protocols, etc., at some point a lot of solutions started to seem obvious to me. For a recent example, I've been working on my Go Sitecore API. Sitecore is a hierarchical system for storing content, and content is a broad term in the Sitecore world. As they say, "Sitecore is built on Sitecore". It stores all of your data structures (templates), content, ways to render the content (layouts and renderings, for all intents and purposes), system data, etc. Everything is a node. A node is an instance of a "Template". The "Template" node is itself a node. Etc. And it's all tree based.

So in writing the Sitecore API, you don't necessarily want to deal with the entire tree. Being a tree, the nodes have a "Path" property, like /sitecore/templates/User Defined/My Template. I wrote a simple way to build the tree from the database, then filter the tree by a set of paths (paths []string). This goes through the tree and produces a map of Guid -> Item (a node), where the returned nodes all reside within the specified paths. You can provide a path like "-/sitecore/system" (beginning with a hyphen) to say that you want to exclude those items. That code is here. Then I found myself needing the opposite.

Give me all nodes in these paths, then give me all nodes NOT in these paths. You could write a set operation, an XOR or something like that. But I needed to do it by path. Knowing I had the path operations like "-/sitecore" (starting with hyphen) to exclude items, I quickly said to myself, "why not use the same paths, and where it starts with -, remove it, and if it doesn't start with -, then prepend it, and use those paths?"  So that's what I did. You can see that code here.

Of course, now I'm thinking the XOR operation might be a better idea! Give me all of the nodes in those paths, then loop through the base tree and add any nodes whose ID is not in the resulting filtered tree... that might be a little better, I think... although since it results in two loops through the entire tree contents, my original idea may actually be the better one.

So you can see how the mind of a 20-year developer works. Also, I'm not afraid to say, "Oh yeah, that can be done much better; I'm going to rewrite it."

An Uncanny Memory

Another thing I've noticed is that I know how pretty much everything I wrote works. For every project, if someone asked me, "Hey, there's a component in this project you worked on 5 years ago, I have to make an update to it, can you tell me how you wrote it?" I'd reply, "Sure! It works like this," and then I'd spout off all the things it does, and anything to watch out for. Sometimes I'm amazed at how much of that information I have retained through everything I've worked on in the past 20 years.

I might benchmark those two methods and see which one is faster. Happy Coding!

SQL Server, A Million Updates, Multithreading and Queues

In the past month or two, with work, I've had two projects that have involved massive updates of data. Pulling data from a source, processing it, and updating SQL Server in both instances coincidentally. I've learned a lot.

First, SQL Server does not respond well to multiple threads doing thousands of updates each. I did not know this. I've seen the error message, "Transaction (Process ID XX) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction." more times than I'd like to admit. I've done multiple threads doing SQL updates many times before, but I guess never with tens of thousands of rows.

I wrote two apps that process hundreds of thousands of rows of data each. One was written in C#, the other was in Python. I'm not quite as adept in Python but I've learned some tricks.

The approach I took in each language was almost identical. Both involve a shared queue that holds all of the SQL statements that need to run; the statements are just stored procedure calls. One process goes through the queue, batches the statements into 15-30 statement chunks, and executes each chunk.

The two solutions, Python and C#, were slightly different though. In Python, the multiple threads would add to the queue, and after all the threads were done processing, it would process everything. The C# solution involved creating a singleton object (per connection string) that held the queue and contained its own thread, which would constantly process the queue. But with just one thread, it wasn't overwhelming SQL Server in any way. Here's a little bit of code. In each language, I used the built-in Queue from the standard library, although in C# I used ConcurrentQueue.

C# pseudo code

        multiple threads collecting data
            call service
            add all data to be updated to the service

        service data collection
            get sql for object (orm)
            add to the shared queue

        sql execution thread - run
            while (true)
                open connection
                while queue has stuff
                    create a string builder, add 15 or so commands to the batch, separated by ;
                    build a database command and execute it
                close connection
                sleep for a second

Python pseudo code

        multiple threads collecting data
             download data (in this case, it's just downloading CSVs)
             process data
             get sql statement
             add sql statement to the shared queue

        main thread
             collect all data (fill the queue) across multiple threads
             process the queue, calling each batch of 35 in this case, in a single thread

So as you can see, the C# version processes the queue as the data is being collected, while the Python version waits until the end and then processes the queue. I think the C# approach is better; as I said, I'm a bit more adept with C#, so I'm more comfortable doing things like that there. Hopefully this helps someone out there processing loads of data.

Of course, in Go I would have just used a channel! Just kidding, there'd be the same amount of complexity in Go, but definitely the end result would be a lot better looking!

Happy Coding!


CRUDGEON

I thought that was a word when I first wrote it. But I was thinking of other words, like curmudgeon or something. Anyway...


A set of big projects came along at work that consisted of some of the same pieces, at a high level: get data from a service, store it in a database, generate HTML based on the data. I guess it doesn't matter that the output is HTML. Generate formatted output based on the data. That's better.

The services are not consistent in their details. One is a WSDL web service, one is a JSON service, and two are just schemaless XML. This part was pretty annoying. Schemaless XML and JSON need to go away. We are in 2018; the dynamic typing experiment is over ;)  (that's sure to ruffle some feathers).

When looking over the data that was coming back, 2 responses returned types that have 130+ fields in them. This would have to be represented in a SQL Table, in stored procedures, in a C# class and interface. Looking over 130+ fields, I immediately thought there's no way I'm typing all of that by hand.

A really lazy person (like me) would probably try to use a text editor with regexp find/replace functionality, copy a list of the fields in and run regexp find/replace to format it in the way that I would need it at that given moment. Like, in a list of parameters to a stored procedure, or as a list of parameters to a constructor, or generating properties on a POCO (plain old C# object). I am definitely lazy, but I'm also too lazy to do that each time.

A championship-caliber lazy person (like me) would probably write CRUDGEON. I also don't know why I keep doing it in all caps. "crudgeon" is equally acceptable for those whose Caps Lock and Shift keys are just too far away.

So what is it?

Basically, you give it a fake C# list of properties, and it'll generate whatever you need it to. Right now it'll generate:

  1. Database Table
  2. Stored procedures for Get, Update, and Delete
  3. C# objects with appropriate attributes for pulling data from services, like XML attributes, JSON "DataMember" and "DataContract" attributes, and so on.
  4. A convenience "map" script which does the copy from a service object to your POCO, in the case of WSDL objects where you don't want to have the WSDL generated type available to everyone, and hence depended on by anything except what you control (I always do this by the way... never expose WSDL types, they should only be internal. But I digress).

The README.md has a lot of info on using crudgeon. The git repository also has example inputs and outputs. I wrote it with these specific projects in mind, and the generated C# code references things I wrote specifically for them, but if I come across any other project that needs to courier data and store it locally, I will definitely be breaking open the ol' VS Code again. I wrote sqlrun in conjunction with crudgeon because I needed a quick way to run all the SQL files it was generating. I've used it hundreds of times in the week since. After testing, I'd find that I needed a new attribute, or a column needed to allow null, or something. And I'd regenerate all of the code, point sqlrun at the SQL that was generated, and begin importing the code again, all within like 10 seconds.

Maybe you'll find some use for it. I know I definitely will. Like I said, it was definitely written with these sets of projects in mind, but with little modification, maybe they can be used more broadly. Or maybe with no modification! I'll know later when I find an excuse to use it again :)

Happy Coding!

Grunt Work Principle

One word that I've used to describe my work style, but not really my programming style, is "lazy". This can be a word that describes behavior that is often considered detrimental. Non-flattering.

However, in the programming world, it is a very good trait if paired with other proper attributes. Being a bad programmer and lazy isn't good. But being a decent programmer with a good work ethic who is also lazy is actually pretty good!

Lazy definitely doesn't describe my work ethic. Lazy describes how I am when confronted with grunt work. I have been trying to describe it with a principle or some other short definition. Out of pure laziness and lack of creativity in general, I'll call it the "Grunt Work Principle". You can feel free to put my name in front of that. I couldn't be bothered to.

Grunt Work Principle

If the amount of grunt work presented exceeds 1 minute, and the grunt work can be automated, no matter how long the process to automate, it will be automated.

In practice, this takes forms ranging from my last post on Sitecore Doc to something simple like taking 100 lines of short text and compacting them to 100 characters per line. For the compacttext project in particular, each instance probably wouldn't exceed 1 minute, but add up all the times I've had to do it and it's easily in the 3-4 minute range :)  That project took me maybe an hour total to create, but it can be used indefinitely.

There is no upper limit on the amount of time automating may take. The only restriction is whether the task can be automated at all; for instance, it may require human interaction or judgments that aren't easily made by a computer alone. Even then, I'd find a way to automate as much as I could. With the Sitecore Doc project, I could automate getting items and renderings from Sitecore and generating output, but at the time (and I plan to integrate source-d into my workflow) I could not easily figure out a way to map method calls to renderings. So that part I had to do manually, which was a few hours' worth of grunt work. Oh, believe me, tracking calls to methods is grunt work when all you are producing is documentation!!

Benefits of Automation

Future re-use: The top reason to always automate the current task at hand is for future re-use. You may initially very specifically automate for the task at hand, but then in the future find a way that you can re-use that but with small modifications. Or even a complete rewrite. Or completely as is. This is all fine.

Consistency: Automation produces consistent results. In my compacttext example, the output is predictable: if I specify a line length of 100, the same input will produce the same output 100% of the time. If a human were doing it, there's no guarantee, as eyeballing line length can be skewed by things like screen size, font size, caffeine consumed, etc.

It is usually too soon to optimize, but wtf are you waiting for with automation?!  Get to it!!

Goals for the Summer

Goals for the summer... I've written a few of these in my life, it'd be nice to not have to do them again.

#1 - Code generator that generates code generators.
#2 - ORM which, based on inputs, will map the appropriate choice of ORM to my current needs. An ORM Mapper.


More null than null

// no value which is more null than null in this case.

A comment in my code just now. I highly enjoy implying degrees of things which have no degrees. Binaries. It is now more perfect code with this comment in there.


Go and Sitecore Interlude

This is part of the same program I'm developing to generate, serialize and deserialize items, but it's a general helper method that I found very useful, and can be used in any Go program. It can be expanded to be more complete, I pretty much did it for things that I am currently working with. You'll see what I mean.

The application is written as one application that does it all (generation, serialization, deserialization). So, it's looking for configuration settings for all aspects of those pieces of functionality. You generally don't want to serialize the same Sitecore paths that you want to generate code against. However, having the configuration in one file is not what I wanted. Here are the drawbacks.

If the configuration were in one file, you would have to update it between consecutive runs if you wanted to serialize, git fetch and merge, then deserialize. Your configuration would be committed and would be set for the next person who runs the program. You couldn't write bat files to run updates.

You could use the flag package to control the program's pieces. Of course. But I set out to have multiple configs; for instance, serializing from a shared database and then deserializing to your local database. You could also make the config file a flag and point it at different huge files that each differ only by connection string.

You could.

But then I wouldn't have this cool piece of code :)

Basically, when you run the program, you call it with a "-c" flag which takes a csv list of config files. The program reads them in order and merges them, having configuration values later in the chain overwrite values in the previous versions. I do this using Go's reflect package. As follows:

func Join(destination interface{}, source interface{}) interface{} {
	if source == destination {
		return destination
	}

	td := reflect.TypeOf(destination)
	ts := reflect.TypeOf(source)

	if td != ts || td.Kind() != reflect.Ptr {
		panic("Can't join different types OR non pointers")
	}

	tdValue := reflect.ValueOf(destination)
	tsValue := reflect.ValueOf(source)

	for i := 0; i < td.Elem().NumField(); i++ {
		fSource := tsValue.Elem().Field(i)
		fDest := tdValue.Elem().Field(i)

		if fDest.CanSet() {
			switch fSource.Kind() {
			case reflect.Int:
				// ints only overwrite while the destination is still zero
				if fDest.Int() == 0 {
					fDest.SetInt(fSource.Int())
				}
			case reflect.Bool:
				// bools only overwrite while the destination is still false
				if fDest.Bool() == false {
					fDest.SetBool(fSource.Bool())
				}
			case reflect.String:
				// strings only overwrite when empty and the source isn't
				if fDest.String() == "" && fSource.String() != "" {
					fDest.SetString(fSource.String())
				}
			case reflect.Slice:
				// slices append rather than overwrite
				fDest.Set(reflect.AppendSlice(fDest, fSource))
			case reflect.Map:
				// maps add entries, overwriting existing keys
				if fDest.IsNil() {
					fDest.Set(reflect.MakeMap(fDest.Type()))
				}
				for _, key := range fSource.MapKeys() {
					fDest.SetMapIndex(key, fSource.MapIndex(key))
				}
			}
		} else {
			fmt.Println("Can't set", td.Elem().Field(i).Name)
		}
	}

	return destination
}

So, you can see what I mean when I said it can be expanded. I'm only doing strings, bools, ints, slices and maps. The slice handling is different in that it adds values to the current slice. Map handling will add entries or overwrite if the key exists. Strings will only overwrite if the existing string is blank and the source isn't blank. So that's probably different from how I described the code in the beginning :)

Go is very useful. There's like, nothing you can't do :)

So the program is called like this:

scgen -c scgen.json,project.json,serialize.json

scgen.json will have the template IDs for "template" and "template field", stuff that's pretty OK to hard code. If Sitecore were to change those template IDs, I'm fairly positive there's a lot of existing code out there that would break.

project.json has the connection string, the field type map, serialization path (since it's used for serialization and deserialization), and base paths for serialization.

serialize.json, in this instance, only has { "serialize": true } as its entire contents. Files like generate.json have "generate": true as well as the file mode, output path, the Go text template, and the template paths to generate against.

So these files can be combined to build up an entire configuration. The bools like "serialize" and "generate" control program execution. Settings can live in separate files, and different files can be combined depending on the environment, like a continuous integration server or a project pre-build step. I foresee this being used with bat files: create a generate.bat that passes generate.json in the config paths, and so on for each program mode. Or a bat file to serialize, git commit, git pull, and deserialize. Enjoy!

Go and Sitecore, Part 3

In parts 1 and 2, so far, we've covered code generation with Go against a Sitecore template tree, and serializing items from the database to disk. Part 3 takes that serialized form and updates the database with items and fields that are missing or different, then clears out any items or fields that were orphaned in the process.

It probably doesn't clear orphaned fields completely correctly, as I only clear fields where the item no longer exists. It won't clear fields that no longer belong to the new template if the item's template changed. That will probably be an easy change, though, as it could likely be done with a single (albeit advanced) query.

Deserializing involves the following steps.

  1. Load all items (already done at the beginning every time the program runs)
  2. Load all field values. This happens if you are serializing or deserializing.
  3. Read the contents from disk, mapping serialized items with items in the database.
  4. Compare items and fields.
    1. If an item exists on disk but not in the database, it needs an insert
    2. If an item exists on the database but not on disk, it needs a delete (and all fields and children and children's fields, all the way down its lineage)
    3. #2 works if an item was moved because delete happens after moves.
    4. Do the same thing for fields... update if it changed, delete if in the db but not on disk, insert if on disk but not in the db.
  5. This can, in some cases, cause thousands of inserts or updates, so we'll do batch updates concurrently.

Deserialization code just involves 2 regular expressions, and filepath.Walk to get all serialized files. Read the files, build the list, map them to items where applicable, decide whether to insert / update / delete / ignore, and pass the whole list of updates to the data access layer to run the updates.

I love the path and filepath packages. Here's my filepath.Walk method.

func getItemsForDeserialization(cfg conf.Configuration) []data.DeserializedItem {
	list := []data.DeserializedItem{}
	filepath.Walk(cfg.SerializationPath, func(path string, info os.FileInfo, err error) error {
		if strings.HasSuffix(path, "."+cfg.SerializationExtension) {
			bytes, _ := ioutil.ReadFile(path)
			contents := string(bytes)
			if itemmatches := itemregex.FindAllStringSubmatch(contents, -1); len(itemmatches) == 1 {
				m := itemmatches[0]
				id := m[1]
				name := m[2]
				template := m[3]
				parent := m[4]
				master := m[5]

				item := data.DeserializedItem{ID: id, TemplateID: template, ParentID: parent, Name: name, MasterID: master, Fields: []data.DeserializedField{}}

				if fieldmatches := fieldregex.FindAllStringSubmatch(contents, -1); len(fieldmatches) > 0 {
					for _, m := range fieldmatches {
						id := m[1]
						name := m[2]
						version, _ := strconv.ParseInt(m[3], 10, 64)
						language := m[4]
						source := m[5]
						value := m[6]

						item.Fields = append(item.Fields, data.DeserializedField{ID: id, Name: name, Version: version, Language: language, Source: source, Value: value})
					}
				}
				list = append(list, item)
			}
		}
		return nil
	})

	return list
}

I did a quick and crude "kick off a bunch of update processes to cut the time down" method.

func update(cfg conf.Configuration, items []data.UpdateItem, fields []data.UpdateField) int64 {
	var updated int64 // added to atomically by the goroutines below
	var wg sync.WaitGroup
	itemGroupSize := len(items)/2 + 1
	fieldGroupSize := len(fields)/4 + 1

	// items - 2 goroutines
	for i := 0; i < 2; i++ {
		grp := items[clamp(i*itemGroupSize, len(items)):clamp((i+1)*itemGroupSize, len(items))]
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&updated, updateItems(cfg, grp))
		}()
	}

	// fields - 4 goroutines
	for i := 0; i < 4; i++ {
		grp := fields[clamp(i*fieldGroupSize, len(fields)):clamp((i+1)*fieldGroupSize, len(fields))]
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&updated, updateFields(cfg, grp))
		}()
	}

	wg.Wait()
	return updated
}

// clamp keeps slice bounds in range, since the last chunk may run past the end.
func clamp(n, max int) int {
	if n > max {
		return max
	}
	return n
}

Very unclever. Take all of the update items and fields, break them into a set number of chunks, and kick off six goroutines, allocating twice as many for fields as for items. Each call to the respective update method opens its own connection to SQL Server. This could be done much better, but it accomplishes what I set out to accomplish: utilize Go's goroutines, and where something can be done concurrently, do it concurrently to cut down the time required. This is the only process that uses Go's concurrency constructs.

That's it for part 3!  Part 4 will come more quickly than part 3 did. I had some things going on: a one-year anniversary with my girlfriend, lots of stuff :)

Part 1 - Generation
Part 2 - Serialization
Part 3 - Deserialization

Go and Sitecore, Part 2

In part 1, I covered how I'm now generating code from Sitecore templates, to a limited degree. I won't share the whole process and the whole program until the end, just the touch points until then.

For part 2, we'll cover Sitecore serialization. As for terminology, I'm not sure what TDS or other similar tools call these operations, but I will refer to them as serialization (writing Sitecore contents to disk) and deserialization (reading Sitecore contents from disk and writing them to the database).

For Sitecore serialization, I would say step 1 is to decide which fields you DON'T want to bring over. In the past, I've had loads of issues with serializing things like workflow state. And locks. So my approach is to ignore the existence of certain fields. Essentially, find all of the fields on the "Standard template", decide which ones are essential or useful, and remove those from a global "ignored fields" list. Then get your data. For the data, we use the same tree of items from part 1. When we build the tree, we get a root node and an item map (map[string]*data.Item). For serialization we need the item map; the root is only useful for building paths, after which we could most likely toss it. With the item map in hand, and a list of ignored fields, we can get the data.

        with FieldValues (ValueID, ItemID, FieldID, Value, Version, Language, Source)
        as
        (
            select ID, ItemId, FieldId, Value, 1, 'en', 'SharedFields'
                from SharedFields
            union all
            select ID, ItemId, FieldId, Value, Version, Language, 'VersionedFields'
                from VersionedFields
            union all
            select ID, ItemId, FieldId, Value, 1, Language, 'UnversionedFields'
                from UnversionedFields
        )
        select cast(fv.ValueID as varchar(100)) as ValueID, cast(fv.ItemID as varchar(100)) as ItemID, f.Name as FieldName, cast(fv.FieldID as varchar(100)) as FieldID, fv.Value, fv.Version, fv.Language, fv.Source
            from FieldValues fv
                join Items f
                    on fv.FieldID = f.ID
            where f.Name not in (%[1]v)
            order by f.Name;

With SQL Server, we're able to use common table expressions (CTEs), which makes this a single query that's pretty easy to read. We're getting all field values except those ignored. We get the version and language no matter what, and we get the source: which table the value comes from. ValueID is just the source table's ID, which could be useful as a unique identifier, but it's not actually used right now. We simply pull all of these values into another list of items to serialize, matching their ItemID against the item map to produce a new "serialized item" type, which will be serialized. SerializedItem only has a pointer to the Item and a list of field values. Field values have the field ID and name, the value, the version, the language, and the source (VersionedFields, UnversionedFields, SharedFields).

The item map is also trimmed down to items in paths that you specify, so you're not writing the entire tree. In SQL Server with the current database (12K items), the field value query with no field name filter takes 3 seconds and returns 190K values. That's a bit high for my liking, but when you're dealing with loads of data you have to be accepting of some longer load times.

The serialized file format is hard coded rather than a text template. I feel I could do the text template, since I've found out how to remove surrounding whitespace (e.g. {{- end }}; that left hyphen says remove whitespace to the left). However, putting it in a text template, as with code generation, implies that the format can be configured. This format needs to be read back in through deserialization, so it should be less configurable and 100% predictable.

func serializeItems(cfg conf.Configuration, list []*data.SerializedItem) error {
	sepstart := "__VALUESTART__"
	sepend := "___VALUEEND___"

	for _, item := range list {
		path := item.Item.Path
		path = strings.Replace(path, "/", "\\", -1)
		dir := filepath.Join(cfg.SerializationPath, path)

		if err := os.MkdirAll(dir, os.ModePerm); err == nil {
			d := fmt.Sprintf("ID: %v\r\nName: %v\r\nTemplateID: %v\r\nParentID: %v\r\nMasterID: %v\r\n\r\n", item.Item.ID, item.Item.Name, item.Item.TemplateID, item.Item.ParentID, item.Item.MasterID)
			for _, f := range item.Fields {
				d += fmt.Sprintf("__FIELD__\r\nID: %v\r\nName: %v\r\nVersion: %v\r\nLanguage: %v\r\nSource: %v\r\n%v\r\n%v\r\n%v\r\n\r\n", f.FieldID, f.Name, f.Version, f.Language, f.Source, sepstart, f.Value, sepend)
			}

			filename := filepath.Join(dir, item.Item.ID+"."+cfg.SerializationExtension)
			ioutil.WriteFile(filename, []byte(d), os.ModePerm)
		}
	}

	return nil
}

If you've looked into the TDS file format, you've noticed it records the length of each value so that parsing the field value is "easier???" or something. However, that makes for git conflicts on occasion. Additionally, you can't just go in there, update the text, and deserialize it. For instance, say you had to bulk update a path that ends up in the value for many items, like a domain name or URL in an external link field: with the TDS method you can't just do a find and replace (unless the length of the value doesn't change!). Without the length, you can find and replace across the whole tree of serialized objects. There are other future benefits to this. Imagine you need to generate a tree but you don't want to use the Sitecore API. You could generate this file structure and have it deserialize into Sitecore. The length doesn't help that scenario though; it just makes it a tiny bit less painful.

The idea for this was, first, "common sense", but second, it's been working for HTTP form posts for YEARS!! HTTP multipart forms just use a boundary property. My boundary isn't dynamic; it's just a marker. If that text were to show up in a Sitecore field, this program wouldn't work. Most likely I'd replace the underscores with some other value. I could generate a boundary at the start of serialization and put it in a file in the root of the serialization tree, like .sersettings with "boundary: __FIELDVALUE90210__", chosen at the start of serialization to be unique, with no occurrences in Sitecore field values. Anyway, I've gone on too long about this :)

Also, the path and path/filepath packages in Go are the best. So helpful.

In this format, here is what the "sitecore" root node looks like serialized.

ID: 11111111-1111-1111-1111-111111111111
Name: sitecore
TemplateID: C6576836-910C-4A3D-BA03-C277DBD3B827
ParentID: 00000000-0000-0000-0000-000000000000
MasterID: 00000000-0000-0000-0000-000000000000

__FIELD__
ID: 56776EDF-261C-4ABC-9FE7-70C618795239
Name: __Help link
Version: 1
Language: en
Source: SharedFields
__VALUESTART__

___VALUEEND___

__FIELD__
ID: 577F1689-7DE4-4AD2-A15F-7FDC1759285F
Name: __Long description
Version: 1
Language: en
Source: UnversionedFields
__VALUESTART__
This is the root of the Sitecore content tree.
___VALUEEND___

__FIELD__
ID: 9541E67D-CE8C-4225-803D-33F7F29F09EF
Name: __Short description
Version: 1
Language: en
Source: UnversionedFields
__VALUESTART__
This is the root of the Sitecore content tree.
___VALUEEND___

In part 3, we'll be looking into deserializing these items.


Part 1 - Generation
Part 2 - Serialization
Part 3 - Deserialization