Google Cloud Development - Data Access

Compile time dependency injection is still dependency injection!

I've been slowly writing code that uses Google Cloud, starting with this website. Along the way I changed how I do data access, which is the topic of this post.

The previous version of this site was written knowing that the backend would be MongoDB. It had helper code that dealt only with MongoDB connections and read/write operations. When moving to Google Cloud, though, MongoDB is no longer an option, at least for me.

I didn't want to always write code against the live Cloud Datastore or Firebase database, and I didn't want to install the simulated versions of these either, so I needed some kind of dependency injection.

Now, dependency injection and I have an interesting past. I'm not typically a fan of it, because it's frequently overused, and I'm more of a dependency injection purist, meaning I want it there for mock objects used during testing. Most DI you find in .NET projects is there without any testing being done, so it doesn't gain you anything. Anyway, with that aside...

Dependency injection, as it's meant to be used, is necessary when writing code that would write to a Google Cloud Datastore in production, but to MongoDB or even a JSON file during development. I currently use a JSON file as my backend during testing.

For this site, there's a "Posts Repository" which you create to access the posts written by me. This is an interface:

type PostRepo interface {
	GetPosts(page, pagesize int, tag Tag) ([]Post, Pages, error)
	GetDatePosts(page, pagesize, year, month int) ([]Post, Pages, error)
	GetPost(year, month, day int, key string) (Post, error)
	SavePost(p Post) error
	GetCalendar() ([]CalendarEntry, error)
	GetTagCloud() ([]TagCloudEntry, error)

	// support old site
	GetPostByKey(key string) (Post, error)
}

These methods return instances of structs, not interfaces. I don't see the need to have an interface for a "Post" when it's just things like a Title, Text, Date, etc. There are no methods on a Post. Well, there is one, a .Url() method that returns the formatted URL for a Post, but that could just as easily be written as a plain function that takes a Post.

The old code for this site, overall, really can't be compared because it was so bad. But consider it was using this repo, https://github.com/jasontconnell/server, and consider now that I can basically delete that whole repository.

What was required with that repository was a server state. So I could store things like the connection string to MongoDB in the server state, then in handlers, grab the connection string to connect to the database. Now with a JSON file or with Cloud Datastore, there's no need for a connection string (in Cloud Datastore, you have a creds.json file in your root and the library uses that and a project ID that you pass in). However, even if I was connecting to something that requires a connection string, you'll see I can easily still use this new method.

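So, to do away with the server state, and the "server" repository linked above completely, I simply have a struct that embeds the http.Handler interface. Let me explain. Embedding is the closest thing Go has to object inheritance: the methods of the embedded type are promoted to the outer struct, so the struct can be used as an http.Handler itself. Read more about it here.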

Here's how I wrote a handler so I could get rid of the "server state" and the "server" repository:

type SiteHandler struct {
	http.Handler
	tmpl            *template.Template
	Posts           data.PostRepo
	Users           data.UserRepo
	CanonicalDomain string
	Key             string
}

Now my main func looks like this. Before, main was where all of the routes were defined. Like I said, the old code can't really be compared since it was so bad, but here's the new main func:

func main() {
	h := handlers.GetHandler()
	h.CanonicalDomain = os.Getenv("CanonicalDomain")
	port := os.Getenv("PORT")
	if port == "" {
		port = "6040"
	}

	log.Fatal(http.ListenAndServe(":"+port, h))
}

And my GetHandler func. I've clipped all of the routes, not for security or anything, just for brevity. It uses the Gorilla Toolkit Mux library.

func GetHandler() *SiteHandler {
	h := new(SiteHandler)

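	// keyfile is read once at startup; fail fast if it's missing
	key, err := ioutil.ReadFile("keyfile")
	if err != nil {
		log.Fatalf("reading keyfile: %v", err)
	}
	h.Key = string(key)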

	m := mux.NewRouter()
	m.StrictSlash(true)

	// posts
	m.HandleFunc("/", h.Index)
	// removed other routes for brevity ... 

	h.Handler = m
	h.Posts = &data.CloudRepo{}
	h.Users = &data.CloudRepo{}
	return h
}

While this is compile time dependency injection, it is still technically dependency injection :) I could have a setting or try to figure out if we're running in Google Cloud's App Engine, but this way works for this project since I'm not going to accidentally delete everything. I would definitely find a more solid solution if there was any risk. The alternative here would be to pass it &data.JsonRepo{} for the Posts and Users to use the JSON file. This would also work on the Google Cloud instance, except for when trying to save a post. It might work, but the file would get blown away with any update.
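For comparison, the "setting" approach could look roughly like the sketch below. It is only an assumption of how that might be done: the POSTS_BACKEND environment variable and the pickRepo function are hypothetical, and the two repo types are empty stand-ins for the real ones.

```go
package main

import (
	"fmt"
	"os"
)

// PostRepo is a tiny stand-in for the real interface.
type PostRepo interface {
	Name() string
}

// CloudRepo and JsonRepo are empty stand-ins for the real backends.
type CloudRepo struct{}
type JsonRepo struct{}

func (*CloudRepo) Name() string { return "cloud" }
func (*JsonRepo) Name() string  { return "json" }

// pickRepo chooses a backend from a setting. It falls back to the JSON
// file so that a missing setting can never touch live data.
func pickRepo(backend string) PostRepo {
	if backend == "cloud" {
		return &CloudRepo{}
	}
	return &JsonRepo{}
}

func main() {
	// POSTS_BACKEND is a hypothetical variable name.
	repo := pickRepo(os.Getenv("POSTS_BACKEND"))
	fmt.Println("using", repo.Name())
}
```

The trade-off is one more thing to configure per environment in exchange for not having to recompile to switch backends.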

You can see the benefit of this method. If you wanted to, for some odd reason, write two "providers" for your repository, one being MongoDB and the other being MySQL for instance, both of which require a connection string, you could simply add a ConnectionString string field to the SiteHandler, and create instances of MongoDBRepo or MySQLRepo, providing the corresponding connection string.
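A sketch of that two-provider idea, under stated assumptions: the repos below don't talk to any database (the bodies just return canned data), the NewSiteHandler constructor and its driver argument are hypothetical, and the interface is cut down to one method.

```go
package main

import "fmt"

// Post and PostRepo are cut-down stand-ins for the real types.
type Post struct{ Title string }

type PostRepo interface {
	GetPostByKey(key string) (Post, error)
}

// MongoDBRepo and MySQLRepo each keep the connection string they need.
// A real implementation would connect using r.ConnectionString.
type MongoDBRepo struct{ ConnectionString string }
type MySQLRepo struct{ ConnectionString string }

func (r *MongoDBRepo) GetPostByKey(key string) (Post, error) {
	return Post{Title: "mongodb:" + key}, nil // canned data, no real query
}

func (r *MySQLRepo) GetPostByKey(key string) (Post, error) {
	return Post{Title: "mysql:" + key}, nil // canned data, no real query
}

// SiteHandler holds the connection string and the injected repo.
type SiteHandler struct {
	ConnectionString string
	Posts            PostRepo
}

// NewSiteHandler is a hypothetical constructor that picks a provider
// and passes the connection string along to it.
func NewSiteHandler(driver, conn string) (*SiteHandler, error) {
	h := &SiteHandler{ConnectionString: conn}
	switch driver {
	case "mongodb":
		h.Posts = &MongoDBRepo{ConnectionString: conn}
	case "mysql":
		h.Posts = &MySQLRepo{ConnectionString: conn}
	default:
		return nil, fmt.Errorf("unknown driver %q", driver)
	}
	return h, nil
}

func main() {
	h, err := NewSiteHandler("mongodb", "mongodb://localhost:27017")
	if err != nil {
		fmt.Println(err)
		return
	}
	p, _ := h.Posts.GetPostByKey("hello")
	fmt.Println(p.Title)
}
```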

This is what a handler method looks like, in this case, Index:

func (s *SiteHandler) Index(w http.ResponseWriter, r *http.Request) {
	pageIndex, pagesize := getPaging(r.URL.Query())
	posts, pages, err := s.Posts.GetPosts(pageIndex, pagesize, "")
	...
}

Here, with the injected Posts repository, there's no need for knowing where the data is coming from, which is definitely the main idea behind dependency injection. The very core idea :)

Anyway, that's how I've replaced an entire repository and what I considered "bad" code in my old website, by embedding http.Handler into my site and using dependency injection!
