The Internet: So good, yet so so bad

Oh internets, how I hate thee.

Some days I would just be better off without it. With all the blatant lies and corruption going on, I would love to just stay away. I should. Case in point: the OOXML standardization in ISO. Just google "OOXML ISO". Also, I can't stand whiny HD-DVD fans complaining about Blu-ray winning the HD format war. More on that later. Sites I hate include ZDNet... and that's about it.

However, there's some oh-so-goodness to it. Like xkcd. I'm not gonna link to my favorite sites, but some of them are Amazon, Slashdot, Wikipedia, and the iTunes science podcast directory.

OK, onto my ZDNet-hate-HD-DVD-Blu-ray-hate rant. So, this dude who works at Microsoft posts on ZDNet. He's complaining about Blu-ray today, or whenever, that "the more expensive format won". But then he goes on to call Blu-ray "more future proof". I've caught this dude in so many contradictions like this that it just makes me angry. He never responds to my posts, he keeps making asinine statements, I keep responding, and he keeps ignoring me. It's just a frustrating mess. If a format is "more future proof", it's less expensive in the long run, so the two claims don't square. It just points to the quality of worker they have at Microsoft.

Of course, the non-Microsoft world has its shining stars too. Like the writers of PHP. It's ridiculous. For instance, today I wanted to find the first paragraph in an HTML string. I'm fairly familiar with regular expressions, so I checked the syntax for matching a string against a regex and getting the matches (groups, in any other regular expression library's terminology) out of it. This is easy: /<p>(.*?)<\/p>/g. But not in PHP.


$matches = array();
// '#' as the delimiter, so the slash in </p> does not end the pattern early
preg_match_all('#<p>(.*?)</p>#', $htmlString, $matches);


OK, looks easy. $matches will then hold an array of matches. But that's where it gets tricky. The fourth parameter is optional, and it's called "flags". Here's the "flags" part of the docs:

PREG_PATTERN_ORDER - Orders results so that $matches[0] is an array of full pattern matches, $matches[1] is an array of strings matched by the first parenthesized subpattern, and so on.

PREG_SET_ORDER - Orders results so that $matches[0] is an array of first set of matches, $matches[1] is an array of second set of matches, and so on.

PREG_OFFSET_CAPTURE - If this flag is passed, for every occurring match the appendant string offset will also be returned. Note that this changes the value of matches in an array where every element is an array consisting of the matched string at offset 0 and its string offset into subject at offset 1.
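
To make the two orderings concrete, here is a rough sketch in Java (not PHP) that builds both shapes from the same matches. The class and method names (MatchOrders, patternOrder, setOrder) are made up for illustration, and this only imitates the layout the docs describe. It is not how PHP actually implements it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: imitates the two result layouts described above.
class MatchOrders {
    // "Pattern order": one inner list per group (index 0 = full match),
    // each holding that group's text from every match.
    static List<List<String>> patternOrder(Pattern p, String subject) {
        List<List<String>> result = new ArrayList<>();
        Matcher m = p.matcher(subject);
        for (int g = 0; g <= m.groupCount(); g++) {
            result.add(new ArrayList<>());
        }
        while (m.find()) {
            for (int g = 0; g <= m.groupCount(); g++) {
                result.get(g).add(m.group(g));
            }
        }
        return result;
    }

    // "Set order": one inner list per match, holding [full match, group 1, ...].
    static List<List<String>> setOrder(Pattern p, String subject) {
        List<List<String>> result = new ArrayList<>();
        Matcher m = p.matcher(subject);
        while (m.find()) {
            List<String> set = new ArrayList<>();
            for (int g = 0; g <= m.groupCount(); g++) {
                set.add(m.group(g));
            }
            result.add(set);
        }
        return result;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("<p>(.*?)</p>");
        String html = "<p>one</p><p>two</p>";
        System.out.println(patternOrder(p, html)); // [[<p>one</p>, <p>two</p>], [one, two]]
        System.out.println(setOrder(p, html));     // [[<p>one</p>, one], [<p>two</p>, two]]
    }
}
```

Same data, transposed: in one layout the first index picks the group, in the other it picks the match.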

Yes, the "matches" array will be different for every flag. In Java, this looks (half-hearted with no syntax check) like this:


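import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("<p>(.*?)</p>"); // no /.../ delimiters in Java
Matcher m = p.matcher(htmlString);
while (m.find()) {
    String paragraph = m.group(1); // always contains the first set of parentheses in the regular expression.
}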
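
And for the thing I actually wanted, the first paragraph of an HTML string, a self-contained version might look like the sketch below. The class name FirstParagraph and the sample HTML are invented for the example, and it assumes plain <p> tags with no attributes. A regex like this is a quick hack, not a real HTML parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class: pulls out the first <p>...</p> body.
class FirstParagraph {
    // DOTALL lets '.' cross line breaks; the reluctant .*? stops at the first </p>.
    private static final Pattern PARAGRAPH =
            Pattern.compile("<p>(.*?)</p>", Pattern.DOTALL);

    static Optional<String> firstParagraph(String html) {
        Matcher m = PARAGRAPH.matcher(html);
        // find() moves to the first match only; group(1) is the text between the tags.
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<h1>Title</h1><p>First one.</p><p>Second one.</p>";
        System.out.println(firstParagraph(html).orElse("(no paragraph)")); // prints: First one.
    }
}
```

Whatever else changes, group(1) means the same thing every time, which is the whole point of the comparison.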


I'm tired so I have to finish writing another time. Tune in for part two soon!
