Borrow the good, discard the bad

In my many years of web development, I've come across a lot of good ways that platform authors did stuff, and a lot of bad ways. So I'm writing my version of a web platform on Node.js, and I decided to keep the good stuff, and get rid of what I didn't like. It wasn't easy but I'm pretty much finished by now.

As with most things I develop, I decide on an architecture that allows changes to be made in a way that makes sense, but I start with what I want the code to look like. When I wrote my ORM, I started with the simple line db.save(obj); (it turns out that's how you do it in MongoDB, so I didn't have to write an ORM for Mongo :) ). When starting a web platform, I started out the same way.

I wanted to write:

<list value="${page.someListVariable}" var="item">
Details for ${item.name}
<include value="/template/item-template.html" item="item" />
</list>


The obvious features here are separation of code and presentation, server-side includes (SSIs), and simple variable replacement with ${} syntax.
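To give a feel for the ${} replacement, here is a minimal sketch of how a placeholder like ${item.name} could be resolved against an object. This is not the platform's actual code: renderVariables is a made-up name, and rendering unknown variables as an empty string is my assumption.

```javascript
// Hypothetical helper: replaces ${a.b.c} placeholders using values from `scope`.
function renderVariables(template, scope) {
  return template.replace(/\$\{([^}]+)\}/g, function (match, path) {
    var value = scope;
    var parts = path.trim().split(".");
    for (var i = 0; i < parts.length; i++) {
      if (value == null) break; // stop walking once the path runs out
      value = value[parts[i]];
    }
    // Assumption: unknown variables render as an empty string.
    return value == null ? "" : String(value);
  });
}

var html = renderVariables("Details for ${item.name}", { item: { name: "Widget" } });
// html is "Details for Widget"
```

A real implementation would also have to think about HTML-escaping the values, which this sketch skips.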

There aren't a lot of tags in my platform. There's an if, which you can use to decide whether to output something. There's an include, to which you can pass variables from the main page so you can reuse it on many pages. The one above takes an "item" object, which it will refer to in its own code with ${item}.

Recently I added a layout concept. So you can have your layout html in another file, and just put things into the page in the page's actual html. For instance, you might reach the file index.html, which would look like this:

<layout name="main">
<content name="left-column">
<include value="/template/navigation.html" />
</content>
<content name="main-column">
<include value="/template/home-content.html" />
</content>
</layout>


Java Server Faces used a two way data binding mechanism which was really helpful. But then you need controls, like input[type=text] or whatever. My pages will not have two way data binding, but you can use plain html. Which I like better. (However, those controls were very simple to swap due to the generous use of interfaces by Java, and their documentation pretty much mandating their use. e.g. using ValueHolder in Java instead of TextBox, and if you were to make it a "select" or input[type=hidden], your Java code would not have to change, which is one thing I absolutely hate about ASP.NET).

I borrow nothing from PHP.

ASP.NET pretty much does nothing that I like, other than it's easy to keep track of what code gets run when you go to /default.aspx: the code in /default.aspx.cs, plus whatever Page class it inherits from and whatever master page it's on. In Java Server Faces you're scrounging through xml files to see which session bean got named "mybean".

My platform is similar to ASP.NET in that for /index.html there's a /site/pages/index.js (have I mentioned that it's built on node.js?) that can optionally exist and can implement one or two functions, "load" and "handlePost", if your page is so inclined to handle posts. Another option is to have this file exist, implement neither load nor handlePost, and just have properties in it. It's up to you (well, me).
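Roughly, the decision about which function to run could look like the sketch below. This is only an illustration under assumptions: runPage is a made-up name, and I'm assuming handlePost takes the same arguments as load.

```javascript
// Hypothetical dispatcher: decides which page function to run for a request.
// Assumption: handlePost receives the same (site, query, finishedCallback) as load.
function runPage(pageModule, method, site, query, finishedCallback) {
  if (method === "POST" && typeof pageModule.handlePost === "function") {
    pageModule.handlePost(site, query, finishedCallback);
  } else if (typeof pageModule.load === "function") {
    pageModule.load(site, query, finishedCallback);
  } else {
    // A page with only properties has nothing to wait for.
    finishedCallback({});
  }
}

// Example page module that only implements load.
var indexPage = {
  title: "Home",
  load: function (site, query, finishedCallback) {
    this.greeting = "Hello from " + site.hostUrl;
    finishedCallback({ contentType: "text/html" });
  }
};

var result;
runPage(indexPage, "GET", { hostUrl: "example.com" }, {}, function (options) {
  result = options;
});
```

Calling the functions as methods of the page module keeps "this" pointing at the page, so properties set in load are visible to the template.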

Here's a sample sitemap page for generating a Google Sitemap xml file:

Html:

<?xml version="1.0" encoding="UTF-8"?>

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://${config.hostUrl}/index</loc>
<lastmod>2011-06-16</lastmod>
<changefreq>monthly</changefreq>
<priority>0.5</priority>
</url>
<jsn:foreach value="${page.entries}" var="entry">
<url>
<loc>${entry.loc}</loc>
<lastmod>${entry.lastmod}</lastmod>
<changefreq>${entry.changefreq}</changefreq>
<priority>${entry.priority}</priority>
</url>
</jsn:foreach>
</urlset>


I use the jsn prefix, which just stands for (now, anyway) Javascript Node. I wasn't creative. I guess I can call it "Jason's Site N..." I can't think of an N.

And the javascript:

var date = require("dates"), common = require("../common");

this.entries = [];

this.load = function(site, query, finishedCallback){
    var self = this;
    var now = new Date();
    // midnight at the start of the previous day
    var yesterday = new Date(now.getFullYear(), now.getMonth(), now.getDate() - 1);
    var yesterdayFormat = date.formatDate("YYYY-MM-dd", yesterday);
    common.populateCities(site.db, function(states){
        for (var i = 0; i < states.length; i++){
            states[i].cities.forEach(function(city){
                var entry = {
                    loc: "http://" + site.hostUrl + "/metro/" + city.state.toLowerCase() + "/" + city.key,
                    lastmod: yesterdayFormat,
                    changefreq: "daily",
                    priority: "1"
                };
                self.entries.push(entry);
            });
        }
        finishedCallback({contentType: "text/xml"});
    });
};


The object passed to finishedCallback can carry more properties. To answer a JSON request, for example, I could pass {contentType: "text/plain", content: JSON.stringify(obj)}.

That's about all there is to it! It's pretty easy to work with so far :) My site will launch soon!

The Non-Blocking Nature of Node.js

This can lead to some pretty sweet code. For one thing, always add a callback as a parameter to functions you create, to keep with the non-blocking nature. The next thing you need to know is that you will back yourself into a corner!

Take the following code:
collection.find(search, {sort: sort}, function(err, cursor){
    cursor.toArray(function(err, messages){
        for (var i = 0; i < messages.length; i++){
            db.dereference(messages[i].from, function(err, result){
                messages[i].from_deref = result;
            });
        }
        callback(messages);
    });
});


Backstory: I'm using MongoDB as the backend, I have a message collection, a user collection, and messages have a "from" property that is a DBRef to a user.

You would run this code and find that, with any number of messages greater than zero, you will probably get a null "from_deref" object, which means the callback at the end was called before the dereferences finished. That is if you're lucky enough to not get an error stating that the code "can't set the property from_deref of undefined", which usually means that "i" had already reached the length of the array by the time the callback for db.dereference ran, so messages[i] is undefined. If it's not obvious, I'm dereferencing the user's DBRef and storing it in the message's from_deref property.

This is because of the non-blocking nature of Node.js. It's interesting because it makes me think in new ways. Anything that makes you think differently is good in my opinion. So how do we accomplish this and not break anything? Consider the following code as a solution:

collection.find(search, {sort: sort}, function(err, cursor){
    cursor.toArray(function(err, messages){
        var remaining = messages.length;
        for (var i = 0; i < messages.length; i++){
            (function(messages, index){
                db.dereference(messages[index].from, function(err, result){
                    messages[index].from_deref = result;

                    // dereferences can finish in any order, so count completions
                    if (--remaining == 0)
                        callback(messages);
                });
            })(messages, i);
        }

        if (messages.length == 0) callback(messages);
    });
});


Javascript is awesome. The loop body is an anonymous function that I define and call in the same block. The definition is everything inside (function(x,y){}) and the call is in the parentheses following: (messages, i); So each dereference callback sees the value of i that I'm hoping it will (or rather than hoping, I'm confident it will!). The dereferences can come back in any order, so instead of checking the index I count them down: remaining starts at messages.length, every callback decrements it, and whichever callback brings it to zero knows that all of them are done and calls the callback.

Of course, this doesn't take advantage of the nextObject function on the cursor object in the node-mongodb-native library. That would totally solve this without javascript magic:

cursor.nextObject(function(err, message){
    db.dereference(message.from, function(err, result){
        message.from_deref = result;
    });
});
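The snippet above only handles one message. To walk the whole cursor you would keep asking for the next object until there isn't one. Here is a rough sketch of that loop, assuming nextObject hands back a null item once the cursor is exhausted. walkCursor and fakeCursor are made-up names, and the fake cursor is only there so the sketch runs without a database.

```javascript
// Walks a cursor one document at a time, calling `each` for every document
// and `done` once the cursor reports no more documents (a null item).
function walkCursor(cursor, each, done) {
  cursor.nextObject(function (err, item) {
    if (err) return done(err);
    if (item === null) return done(null);
    each(item, function (err) {
      if (err) return done(err);
      walkCursor(cursor, each, done); // only move on when this item is finished
    });
  });
}

// Stand-in for a real cursor (hypothetical, for illustration only).
function fakeCursor(items) {
  var position = 0;
  return {
    nextObject: function (callback) {
      callback(null, position < items.length ? items[position++] : null);
    }
  };
}

var seen = [];
var finished = false;
walkCursor(fakeCursor([{ from: "alice" }, { from: "bob" }]), function (message, next) {
  message.from_deref = { name: message.from }; // db.dereference would go here
  seen.push(message);
  next();
}, function (err) {
  finished = !err; // every message has been processed at this point
});
```

The trade-off is that the messages are processed one after another instead of all at once, so it is simpler to reason about but not as fast as firing off every dereference together.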


However, I like the Array...

So there you have it.

Task Notification with Google Talk via XMPP

I wrote a post for work about using XMPP to send task notifications through Google Talk.

Compose Debugging

I have a method of debugging that has worked since college. I call it "compose debugging". As I'm writing an email to ask a question, either to the professor who is teaching the course, or since then, the author of a library I'm trying to incorporate into my project (or whose code my project relies solely upon), I find my answer. Sometimes I'll have already sent the email, so I have to send a short "nevermind" email. Basically, that's the content and the subject. "Figured it out..."

Sometimes I pretend that I sent the "nevermind" email without sending the question email, and the recipient is all "WTF?"

Obviously the secret is, I get all of my ideas and problems on paper, which is a million times more organized than my head.

Computers are fun.

Image Processing with Node.js

Image processing is very important if you are going to allow anyone, even if it's only you, to upload images to your site. The fact of the matter is, not everyone knows how to resize an image, and forcing users to do it will mean less user submitted content on your site, since it's a pain, and other sites let you upload non-processed images. If it's just you uploading images, laziness will take over and you will stop doing it because it's not easy. So get with the times!

I found a Node.js plugin for GraphicsMagick. I learned about both the plugin and GraphicsMagick itself simultaneously. GraphicsMagick is pretty sweet, minus setting it up. After you get it set up though, you can perform many operations on any image format that you configured within GraphicsMagick.

This post will not cover setting up GraphicsMagick (although I will point out that setting LDFLAGS=-L/usr/local/include in my instance saved me from problems with missing LIBPNGxxx.so files), and I couldn't get it to work on my Mac. Here's how I'm using GraphicsMagick, via the excellent Node.js module, gm.

var gm = require("gm"), fs = require("fs");

var basePath = "/path/to/images/";
var maxDimension = 800;
var maxThumbDimension = 170;
var thumbQuality = 90;

var processImages = function(images){
    console.log("processing : " + images.length + " image(s)");
    images.forEach(function (image, imageIndex){
        var fullPath = basePath + image;
        var newFilename = basePath + "scaled/" + image;
        gm(fullPath).size(function(err, value){
            if (err) return console.log("Error: " + err);
            var newWidth = value.width, newHeight = value.height, ratio = 1;
            // scale by the longer side so both dimensions fit within maxDimension
            var longest = Math.max(value.width, value.height);
            if (longest > maxDimension){
                ratio = maxDimension / longest;
                newWidth = Math.round(value.width * ratio);
                newHeight = Math.round(value.height * ratio);
            }

            if (newWidth != value.width){
                console.log("resizing " + image + " to " + newWidth + "x" + newHeight);
                gm(fullPath).resize(newWidth, newHeight).write(newFilename, function(err){
                    if (err) return console.log("Error: " + err);
                    console.log("resized " + image + " to " + newWidth + "x" + newHeight);
                });
            }
            else copyTheFileToScaledFolder(); // ?? how do you do this?!? :P (placeholder, not defined here)
        });
    });
};


I run this as a service instead of putting it in the web application. It is on an interval, and you just let node.js handle it! That part was simple:

var interval = setInterval(function(){ getImages(); }, 4000);

Your getImages function might look like this:

var getImages = function(){
    fs.readdir(basePath, function(err, files){
        if (err) return console.log("Error: " + err);
        // untested: keep only image files, which also skips folders such as "scaled";
        // files that have already been processed still need to be filtered out
        var images = files.filter(function(file){
            return /\.(jpe?g|png|gif)$/i.test(file);
        });
        processImages(images);
        // maybe delete these images so you don't have to keep track of previously processed images
    });
};


This is not how my code works, since my images are in a MongoDB database and my document has a "resizeImages" boolean property on it, to trigger this to get images to resize. So I don't know if it will work, or what the fs.readdir sends in its files argument on the callback! But you can try :)

With GraphicsMagick, you could also change the format of the image, if you were a stickler and wanted only PNG or JPG files. You can apply filters like motion blur, or transforms like rotation, add text, etc. It is pretty magical...

File System Operations

Somewhat related, how do you simply copy a file in Node.js? I found a method that uses util.pump, but it didn't work for me. Also, deleting a file in Node.js is "unlink", since it works on both symlinks and files. That one did work, but I found that I was deleting files too soon, thanks to the non-blocking nature of Node.js, and had to take it out.
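One way to copy a file that fits the non-blocking style is to pipe a read stream into a write stream and only report back once the write stream has closed. This is a sketch, not something I've run inside the image service. copyFile is a made-up helper name, and the paths in the usage comment are placeholders.

```javascript
var fs = require("fs");

// Copies a file with streams and reports back through a Node-style callback.
function copyFile(source, target, callback) {
  var finished = false;
  function done(err) {
    if (finished) return; // report only once, even if both streams complain
    finished = true;
    callback(err || null);
  }

  var reader = fs.createReadStream(source);
  var writer = fs.createWriteStream(target);
  reader.on("error", done);
  writer.on("error", done);
  writer.on("close", function () { done(); }); // data is fully written at this point
  reader.pipe(writer);
}

// Usage (placeholder paths): delete the original only after the copy has finished.
// copyFile("/path/to/images/a.png", "/path/to/images/scaled/a.png", function (err) {
//   if (err) return console.log("Error: " + err);
//   fs.unlink("/path/to/images/a.png", function (err) { /* ... */ });
// });
```

Putting the unlink inside the copy's callback is what avoids the "deleted too soon" problem: nothing gets removed until the copy says it is done.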

git: Because I always forget

And it takes me 20 minutes to figure it out again. This is how you specify a URL for a remote repository that lives on a Linux box under a user's (git's) home directory, using ssh...

git remote add [name] ssh://git@ipaddress/~git/reponame

I usually name them reponame.git.

That should turn the process into a one minute one in the future :)

GeoSpatial Indexing in MongoDB

Is DIRT simple! Consider the code, which searches an item collection for items located near a specific location, designated in longitude/latitude (according to the GeoJSON spec, longitude first, latitude second):
this.findItems = function(db, type, longitude, latitude, index, pageSize, callback){
    db.collection("items", function (err, collection){
        if (err) throw err;
        collection.ensureIndex(["location","2d"], false, function(err){
            var search = {}, paging = {};
            if (type) search.type = type;
            if (longitude != null && latitude != null)
                search.location = {"$near" : [parseFloat(longitude), parseFloat(latitude)]};

            if (index >= 0 && pageSize >= 0){
                paging.skip = index * pageSize;
                paging.limit = pageSize;
            }

            collection.find(search, paging, function(err, cursor){
                cursor.toArray(function(err, items){
                    callback(items);
                });
            });
        });
    });
};

It's that simple (minus hours of learning how to do that, scratching my head over stupid errors that I caused but didn't know I was causing [we computer scientists are quick to blame the other guy :) ]).
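If you want to see what actually gets handed to find, the query-building part can be pulled out and run without a database. This is just the same logic as above wrapped in a function. buildItemQuery is a made-up name, and the type and coordinates are sample values.

```javascript
// Builds the search and paging objects the way findItems does,
// without touching the database.
function buildItemQuery(type, longitude, latitude, index, pageSize) {
  var search = {}, paging = {};
  if (type) search.type = type;
  if (longitude != null && latitude != null) {
    // longitude first, latitude second
    search.location = { "$near": [parseFloat(longitude), parseFloat(latitude)] };
  }
  if (index >= 0 && pageSize >= 0) {
    paging.skip = index * pageSize;
    paging.limit = pageSize;
  }
  return { search: search, paging: paging };
}

var query = buildItemQuery("cafe", "-75.1652", "39.9526", 2, 10);
// query.search is { type: "cafe", location: { $near: [-75.1652, 39.9526] } }
// query.paging is { skip: 20, limit: 10 }
```

The documents being searched would store their location the same way, as a [longitude, latitude] pair in the "location" field.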

If you can decipher that, good for you. Onto step 3 of a million on my little code project.

MongoDB and Node.js

I am starting a joint venture with my very good friend, where I am doing the coding, and I decided that I will be doing it in Node.js with a MongoDB backend. I was thinking about why these two technologies, and how I would explain my choices to another techy. I think I have my explanation figured out...

First and foremost, Javascript. I have come to love everything about it! For my side projects, I used to use Java, and wrote a pretty decent web middle tier and rolled my own ORM also. You are witnessing it in action by reading this post. I could make an object like "Book", add properties to it, give them properties in an XML file (a series of them, if you know how Java web apps work ;), and my ORM would create the table, foreign keys, etc. It was particularly magical in how it handled foreign keys and loading referenced objects in one SQL query, knowing which type of join to apply and everything. However, this was all a daunting task, made even more so by the fact that Java allows abstraction to the Nth degree. If I could remember that code, I would be very much more specific on how it worked, but I had an object for connecting to the database, an object in another project for building SQL because I thought I could abstract that out and rewrite it if I had to, and swap in SQL building engines. That was the main cause of pain, I would think "If I just wanted to swap in another X, it would be easy." I initially made plans to have it either go to SQL or XML or JSON, whatever you wanted, and you would just swap in an engine in the config file. It was heavily reflection based :)

So, I've been writing Javascript a lot. Only lately, but before discovering Node.js, have I come to realize its power... anonymous methods and objects, JSON, functions as first class citizens. Of course there are the libraries, like jQuery, and Google APIs, like maps for instance. There's the sheer fact that you don't have to think ahead about every possibility for an object before creating it. Like, my Book object would have an author, title, etc. If later I wanted to add the ISBN or something, in Java I would have to update any IReadable interfaces (well, considering that you could read a cereal box that would undoubtedly NOT have an ISBN, this example is falling apart, but you know what I'm talking about :P ), then update the Book class, update the list to enable searching by an ISBN, etc. Tons of stuff. Javascript:

var Book = function(opt) { for (var i in opt) { this[i] = opt[i]; } }

Imagine I start calling it with
var book = new Book({title: "Brainiac", author: "Ken Jennings", ISBN: "some string of numbers"});

I can now just get the ISBN in other places by calling "book.ISBN".

Node.js

I've grown weary of writing server side code in Java and C#. ASP.NET is a CF with its control design, JSF was a CF with its configs and my ginormous web framework that is almost 6 years old now, and I never tried updating past JSF 1.1, so I don't know if it's gotten any better. But Java web development was nuts, you always needed 34 external libraries, most of which came from Apache's Jakarta project (which is awesome though), intimate knowledge of how to set up Tomcat or your favorite application server, you had to know how to write servlets and JSPs, JSTL, JSTL configs, and to read a Catalina.out file. I'm sure I'm forgetting something worth hours of learning...

Node.js is simple. Create a server, attach a listener to the request event. It's so barebones that requesting anything will not work at all on the server you just created. I researched some libraries for web app development, and I decided, F it, I'm not falling into that trap again. I've dealt with enough code in my life, I could write one. One that takes my favorite things from the frameworks I know, and works on that. It is nearly done...
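For anyone who hasn't seen it, this is about all it takes with Node's built-in http module. It's a bare sketch: the responses and the port in the comment are arbitrary choices of mine, and nothing gets served unless the listener writes it.

```javascript
var http = require("http");

// The listener runs for every request; anything you want served has to be written here.
function onRequest(request, response) {
  if (request.url === "/") {
    response.writeHead(200, { "Content-Type": "text/plain" });
    response.end("Hello from Node.js");
  } else {
    response.writeHead(404, { "Content-Type": "text/plain" });
    response.end("Not found: " + request.url);
  }
}

var server = http.createServer(onRequest);
// server.listen(8080); // uncomment to start accepting connections (port is arbitrary)
```

Everything else (routing, templates, static files) is up to you, which is exactly the gap a web platform fills.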

MongoDB

MongoDB is clearly the only choice for me on Node.js. JSON documents. That should about clear it up, if you were still wondering. If still... imagine we need an ISBN on a book. There's no updating 14 stored procedures that perform CRUD ops on Book to now also include ISBN. No alter table statements... There's simply, Book has ISBN now... db.books.save({title: "Brainiac", author: "Ken Jennings", ISBN: "some number"});

There'll be other books without ISBN, but that's simple... if (book.ISBN != null) (or simply if (book.ISBN) of course). So you think of the basic stuff you need to get off the ground running, and you run. You can run, and you can run fast and recklessly, because if you think of something to add, you put it in. There's very minimal pain, or slowing down, in change.

Node and MongoDB are built for scaling, but I'm not too concerned about that at the moment. Then why, you ask, am I using them? Simply because I do not know them! Although it is 10% for the learning factor, 90% for the cool factor. If it's not cool and I don't know it, then I won't go out of my way to learn it. You might ask, if you're trying to make something awesome, why don't you do it in what you know first, then convert it once you start making money? I'd rather do it in something awesome. Knowing these technologies will be more valuable to me than selling the site to Google for a billion dollars! Ok, that's not true. But if it hits that point, and I didn't do it in Node + MongoDB, then I would not have learned them, and I would not have much reason to NOW, being a billionaire, would I? :P

Oh yeah, Happy Birthday to me today :)

JSON and Two Smoking Stacks

JSON is something that, to me, is so simple to use in its native environment that I would never consider writing a parser for it. I would just use it in its natural environment, if I ever had to use it. Well, I found a case where that is not the case!

In my dealings with geocoding locations for clients, I've come across many instances where a limit on the amount of geocoding calls was reached, and I would have to wait until the next day to geocode some more locations. I could write a geocoding mega-program that abstractly geocodes addresses with all of the free services available until it reaches a limit, then moves on to the next service! Fun stuff.

The problem is, with the other geocoders out there, they do not let you specify which format you would like the response to be in. XML is easy to do in C#, however I hadn't researched a JSON parser, so I had tasked myself with writing one from scratch with no previous parsing experience.

JSON is a scary beast, for someone like me with limited syntax parsing experience, and no compiler courses taken in college... SCARY!


Although, it can only have a certain number of syntax elements. {}[],: as far as I know. Quotes ("") contain strings, curlies {} contain objects, squares [] contain arrays. Arrays can contain objects, objects can contain arrays. Objects have to be in field : value pairs. So here's the basic structure:


private Stack syntaxStack;
private Stack tokenStack;

StringBuilder sb = new StringBuilder(); // string builder for catching data

for (loop through json){
    switch (char){
        case '"':
            we're either in a string or just out of a string (set a boolean so we can check in the other cases)
            if it's a " preceded by a \, add it to the string buffer
            break;
        case '[':
            if we're not in a string
                push a '[' onto the syntax stack
                push an array token onto the token stack
            break;
        case '{':
            if we're not in a string
                push a '{' onto the syntax stack
                push an object token onto the token stack
            break;
        case ']':
            if we're not in a string
                get the last value in the array, if there is one
                (it could have also been an array of arrays, and we're closing the outer array, so there won't be a value)
                add to the children of the last token in the token stack
                pop from each stack, we're out of the current array
            break;
        case '}':
            if we're not in a string and we're in an object
                get the last value in the object, if there is one
                set the value in the last object
                pop from each stack, we are out of that object
            break;
        case ',':
            if we're not in a string
                if we're in an object, a value was just specified. if it's a string value, set the last field's value to the data in the string builder
                clear the string buffer
            break;
        case ':':
            if we're not in a string
                we should be in an object, and the previous string was the field, so create a field token and add it to the current node
                clear the string buffer
            break;
        default:
            append the character to the data string buffer
    }
}

bool InArray
= this.syntaxStack.Peek() == '['

bool InObject
= this.syntaxStack.Peek() == '{'
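To make the two-stack idea concrete, here is a small runnable sketch of it in JavaScript rather than C#. It is not the code from my project: parseJson is a made-up name, one stack holds the open arrays and objects, the other holds the pending field name for each of them, and it is deliberately lenient (it builds values but does not validate the input).

```javascript
// Lenient two-stack JSON reader: builds values, does not validate.
function parseJson(text) {
  var containers = []; // stack of the arrays/objects currently open
  var fields = [];     // parallel stack: pending field name for each open object
  var root;
  var escapes = { n: "\n", t: "\t", r: "\r", b: "\b", f: "\f" };

  function top() { return containers[containers.length - 1]; }

  // Attach a finished value to whatever is currently open.
  function place(value) {
    if (containers.length === 0) { root = value; return; }
    if (Array.isArray(top())) { top().push(value); return; }
    top()[fields[fields.length - 1]] = value;
    fields[fields.length - 1] = null; // the next string in this object is a field name
  }

  function open(container) {
    place(container);
    containers.push(container);
    fields.push(null);
  }

  var i = 0;
  while (i < text.length) {
    var ch = text.charAt(i);
    if (ch === "{") { open({}); i++; }
    else if (ch === "[") { open([]); i++; }
    else if (ch === "}" || ch === "]") { containers.pop(); fields.pop(); i++; }
    else if (ch === "," || ch === ":" || /\s/.test(ch)) { i++; }
    else if (ch === '"') {
      var str = "";
      i++; // skip the opening quote
      while (i < text.length && text.charAt(i) !== '"') {
        if (text.charAt(i) === "\\") {
          var e = text.charAt(i + 1);
          if (e === "u") {
            str += String.fromCharCode(parseInt(text.substr(i + 2, 4), 16));
            i += 6;
          } else {
            str += escapes.hasOwnProperty(e) ? escapes[e] : e;
            i += 2;
          }
        } else {
          str += text.charAt(i);
          i++;
        }
      }
      i++; // skip the closing quote
      var inObject = containers.length > 0 && !Array.isArray(top());
      if (inObject && fields[fields.length - 1] === null) {
        fields[fields.length - 1] = str; // remember the field name
      } else {
        place(str);
      }
    }
    else {
      // bare literal: number, true, false or null
      var start = i;
      while (i < text.length && !/[\s,\]\}:]/.test(text.charAt(i))) i++;
      var word = text.slice(start, i);
      if (word === "true") place(true);
      else if (word === "false") place(false);
      else if (word === "null") place(null);
      else place(Number(word));
    }
  }
  return root;
}

var parsed = parseJson('{"title": "Brainiac", "tags": ["trivia", {"rank": 1}], "inPrint": true}');
// parsed.tags[1].rank is 1
```

A stricter version would reject things like a missing colon or an unclosed bracket, which is where most of the remaining work in a real parser goes.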


Here is the Visual Studio 2008 project with C# code for parsing your own JSON. I think mine looks decent compared to others out there

Reverse Geocoding in Javascript

This was a neat thing I learned recently. We've all done geocoding... pass an address to a web service, like geocoder.us or through the Google Maps API, and get back a coordinate in latitude and longitude, which you can use to search for things nearby (using a database of things with lat/long coordinates and the trusty Haversine Distance formula), get directions, put a marker on a map, etc.
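In case the Haversine formula isn't in your toolbox yet, here is a small sketch of it. haversineKm is a made-up name, and the 6371 km Earth radius is the usual mean-radius approximation, so treat the results as estimates.

```javascript
// Great-circle distance in kilometres between two latitude/longitude points.
function haversineKm(lat1, lon1, lat2, lon2) {
  var R = 6371; // mean Earth radius in km (an approximation)
  var toRad = Math.PI / 180;
  var dLat = (lat2 - lat1) * toRad;
  var dLon = (lon2 - lon1) * toRad;
  var a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
          Math.cos(lat1 * toRad) * Math.cos(lat2 * toRad) *
          Math.sin(dLon / 2) * Math.sin(dLon / 2);
  // Math.min guards against rounding pushing the value just above 1
  return 2 * R * Math.asin(Math.min(1, Math.sqrt(a)));
}

// Rough distance between two sample points (Philadelphia and Camden city centers).
var km = haversineKm(39.9526, -75.1652, 39.9259, -75.1196);
```

To search "nearby", you compute this against each candidate's coordinates and keep the ones under your cutoff.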

Recently I've needed to find out a zip code for the person viewing the website. This won't be an exact science, for the obvious reason that desktop computers don't typically have GPS. IP address geolocating is pretty good for many people, but there are those cases where the ISP might assign an IP for Camden to a person in Philadelphia. I have no proof but go along with it. Either way, it will be close enough for what I'm doing.

Basically, all you do is use Google's Geocoder object, and pass in a google.maps.LatLng object instead of an address! Here's how:

var latLng = new google.maps.LatLng(position.coords.latitude, position.coords.longitude);
var coder = new google.maps.Geocoder();
coder.geocode({ 'latLng': latLng }, showLocaleCallback);


Here's all the code:

function initZip() {
    if (navigator.geolocation && typeof (navigator.geolocation.getCurrentPosition) == "function") {
        navigator.geolocation.getCurrentPosition(geoCodeCallback);
    }
}

function geoCodeCallback(position) {
    var latLng = new google.maps.LatLng(position.coords.latitude, position.coords.longitude);
    var coder = new google.maps.Geocoder();
    coder.geocode({ 'latLng': latLng }, showLocaleCallback);
}

function showLocaleCallback(results, status) {
    if (status == google.maps.GeocoderStatus.OK) {
        var zip = "";
        var res = results[0];
        for (var i = 0; i < res.address_components.length; i++) {
            if (res.address_components[i].types[0] == "postal_code") {
                zip = res.address_components[i].short_name;
            }
        }

        $("[id$=txtZip]").val(zip);
    }
}

$(document).ready(initZip);


Be sure to include Google's Map API Javascript and jQuery. These are two cool techs that I really like to work with.

Google's Map API: http://maps.google.com/maps/api/js?sensor=false

Also, if anyone has a pointer on how to more efficiently do this part, I'm all ears:

for (var i = 0; i < res.address_components.length; i++) {
    if (res.address_components[i].types[0] == "postal_code") {
        zip = res.address_components[i].short_name;
    }
}
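One possible tidy-up, offered as a sketch rather than the definitive answer: pull the loop into a function, stop at the first match, and check every entry in types instead of only the first. findPostalCode is a made-up name and the sample data is invented to match the shape of address_components.

```javascript
// Returns the short_name of the first component tagged "postal_code", or "".
function findPostalCode(components) {
  for (var i = 0; i < components.length; i++) {
    // a component can carry several types, so check the whole list
    if (components[i].types.indexOf("postal_code") !== -1) {
      return components[i].short_name; // stop at the first match
    }
  }
  return "";
}

// Sample data shaped like a geocoder result's address_components array.
var sampleComponents = [
  { short_name: "Philadelphia", types: ["locality", "political"] },
  { short_name: "19103", types: ["postal_code"] }
];
var zip = findPostalCode(sampleComponents);
```

Note the small behavior change: the original loop keeps the last matching component, while this returns the first one it finds.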

Experiments in Javascript: Multicast Delegate

There are quite a few blog posts out there about "doing window.onload the right way", albeit from 2007. I looked over them, and took issue with the hard-coded feel that they had. For instance, a function that returns a function, but takes two functions, and no more, as arguments. I came across an instance where this just wasn't acceptable, according to the new dogma of Javascript programmers out there, which is to write as few lines as possible. I explored new ways of accomplishing this.

First and foremost, don't use window.onload :) attachEvent and addEventListener are there for this exact purpose. Let's take a trip back 5 years and pretend this was relevant for window.onload. However it is still relevant for Javascript in general.
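For reference, the event-listener route looks something like the sketch below. addListener is a made-up helper name. It uses addEventListener where it exists and falls back to attachEvent for older Internet Explorer.

```javascript
// Registers fn for an event without touching any handlers that are already there.
function addListener(target, eventName, fn) {
  if (typeof target.addEventListener === "function") {
    target.addEventListener(eventName, fn, false); // standards-based browsers
  } else if (target.attachEvent) {
    target.attachEvent("on" + eventName, fn);       // older Internet Explorer
  } else {
    throw new Error("No way to attach a listener to this target");
  }
}

// In a page:
// addListener(window, "load", foo);
// addListener(window, "load", bar);
```

Both foo and bar run on load, and neither one overwrites the other.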

The problem of just outright setting window.onload to your function, is that it will overwrite whatever window.onload was set to previously. This can lead to malfunctioning pages that are very difficult to debug. That was the theory of all of the posts about "doing window onload the right way".

function foo(){ alert("foo"); }
window.onload = foo;

// somewhere else on the page, maybe 1000 lines below
function bar() { alert("bar"); }

window.onload = bar;

You can deduce that "foo" will not be alerted. This would be bad if our foo actually did something, like start a video of a cat doing silly things. Devastating. How can they both work?!

The first accepted method is to build a function inline that combines two functions:

window.onload = function(){ foo(); bar(); }
This will work if that's the only time window.onload is set. You can't call window.onload explicitly inside of window.onload unless you like infinite loops. However, storing the referenced function and updating window.onload to the new function is fair. With that knowledge, let's continue our investigation.

The current method out there to get the previous window.onload and the additional function to call correctly looks like this:

function doublecast(fn1, fn2){
    return function(){
        if (typeof(fn1) == "function")
            fn1();
        if (typeof(fn2) == "function")
            fn2();
    };
}

You would use this method to add to the window's onload functionality like so:

window.onload = doublecast(window.onload, foo);
If you needed to add more functions, you would do so in successive calls to "doublecast":

window.onload = doublecast(window.onload, foo);
window.onload = doublecast(window.onload, bar);

or through the ugly method of

window.onload = doublecast(window.onload, function(){ foo(); bar(); });
jQuery has made this acceptable, but it's still ugly! And the successive calls are not line-number friendly. Obviously, after reading John Resig's "Learning Advanced JavaScript", I feel I am ready to build a more appropriate and clean function.

function multicast(){
    if (arguments == null || arguments.length == 0) return function(){ };

    var fns = [], j = 0;
    for (var i = 0; i < arguments.length; i++){
        if (typeof(arguments[i]) == "function")
            fns[j++] = arguments[i];
    }

    return function(){
        for (var i = 0; i < fns.length; i++)
            fns[i]();
    };
}

Our new code looks like this:
window.onload = multicast(window.onload, foo, bar);
window.onload could have been set previously, and we don't care. We know we've defined two functions in the current module (say, an ASP.NET control or some other type of view) and it will work, so long as no one after us sets window.onload = myAwesomeLoadFunction; which would overwrite it... I guess no solution is completely future proof.
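One variation worth sketching: the combined function could hand its own "this" and arguments on to each function, so handlers that expect an event object still get one. multicastWithArgs is a made-up name, and this is a tweak to the idea above, not a replacement for it.

```javascript
// Variation on multicast: every function receives the same `this` and arguments.
function multicastWithArgs() {
  var fns = [];
  for (var i = 0; i < arguments.length; i++) {
    if (typeof arguments[i] === "function") fns.push(arguments[i]);
  }
  return function () {
    for (var j = 0; j < fns.length; j++) {
      fns[j].apply(this, arguments); // forward context and arguments unchanged
    }
  };
}

// In a page: window.onload = multicastWithArgs(window.onload, foo, bar);
```

Non-function arguments (like an unset window.onload) are skipped, same as before.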
Discuss in the comments, please.

PromoteJS again

JS Array .sort

Just need to do that occasionally because w3schools sucks.

I got an android

It's awesome! I'm using swype to enter this text! I really like it and typing on it is fun, I really am just looking for more opportunities to type with it! I am now married world, sorry ladies. ;) I was just at our company Christmas party, I got some champagne with our white elephant gift. I bought the movie Inception which my boss won. I think he'll enjoy it. Swype is amazing, it knows exactly what I want to say! I could type on this thing forever.

I need to work on this site

But I want to test this thing out first:



Updates since last post: I'm engaged to be married to my love Amanda on 11/13/2010!!! I have an iPad, I fixed my lawn up a little, and I am golfing tomorrow and Saturday. Plus I play StarCraft II a lot with Zatko. Well, not enough.

Precursory Maintenance

Precursory Maintenance: Yelling expletives at a problem before going about fixing it. It usually makes the problem 80% easier to fix.