I was thinking about PowerShell and how you can get it to do fantastic things. And I wondered how easily it could be used for scraping cricket scores.
So I threw together four lines of code to grab the cricket scoreboard from cricinfo and rip out the title.
$ret = (new-object Net.WebClient).DownloadString("http://content-aus.cricinfo.com/ausveng/engine/current/match/249226.html?view=live;wrappertype=mainframe")
$titlestart = [Regex]::Match($ret,"<title>","IgnoreCase").Index
$titleend = [Regex]::Match($ret,"</title>","IgnoreCase").Index
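The remaining step is to cut out the text that sits between those two positions. Here is a minimal sketch of that step, assuming `Substring` is used for the cut. The `$sample` string is a made-up stand-in for the downloaded page, not Cricinfo's actual markup.

```powershell
# Hypothetical sample page standing in for the downloaded HTML
$sample = '<html><head><TITLE>Australia v England, 1st Test</TITLE></head><body></body></html>'

# Find where the opening and closing title tags sit (case-insensitive)
$titlestart = [Regex]::Match($sample, "<title>", "IgnoreCase").Index
$titleend = [Regex]::Match($sample, "</title>", "IgnoreCase").Index

# "<title>" is 7 characters long, so the title text begins 7 characters after $titlestart
$title = $sample.Substring($titlestart + 7, $titleend - $titlestart - 7)
$title
```

With the real page, `$sample` would simply be `$ret`.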
Edited: This can be done easily in one line – Lars pointed out the use of Regex to grab the section between the title tags, which then means we don't need to store $ret at all. It can now be:
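A sketch of how such a one-liner might look, assuming a lazy capture group between the title tags. The exact pattern is an assumption rather than necessarily the one Lars suggested, and `Get-PageTitle` and `$url` are hypothetical names used only so the idea can be tried without hitting the network.

```powershell
# Hypothetical helper: pull the text between the title tags with a single regex.
# Singleline lets "." match newlines in case the title spans more than one line.
function Get-PageTitle([string]$html) {
    [Regex]::Match($html, "<title>(.*?)</title>", "IgnoreCase, Singleline").Groups[1].Value
}

# Against the live page this collapses to one line (needs network access; $url is a placeholder):
# [Regex]::Match((New-Object Net.WebClient).DownloadString($url), "<title>(.*?)</title>", "IgnoreCase, Singleline").Groups[1].Value

Get-PageTitle '<html><head><title>Sample scoreboard</title></head></html>'
```

Because the capture group holds the title directly, there is no index arithmetic and nothing needs to be stored in between.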
It's not particularly elegant, but it works nicely. I would've liked to have handled the HTML as XML instead, and just gone straight to the Title tag, but there's stuff in there that won't convert to XML, so I guess that option wasn't available.
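To illustrate why the XML route falls over, here is a small sketch using PowerShell's `[xml]` cast. It assumes a PowerShell version with `try`/`catch` (v2 or later). `Test-IsXml` is a hypothetical helper, and the two sample strings are invented rather than taken from the real page.

```powershell
# Hypothetical helper: report whether a string can be parsed as XML
function Test-IsXml([string]$text) {
    try { $null = [xml]$text; $true } catch { $false }
}

# Well-formed markup converts fine
Test-IsXml '<html><head><title>Scores</title></head></html>'

# An unclosed <br> (perfectly normal in HTML) is enough to make the cast fail
Test-IsXml '<html><head><title>Scores</title></head><body><br></body></html>'
```

When the cast does succeed, the title is available directly as `([xml]$text).html.head.title`, which is the shortcut that real-world HTML rarely allows.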
And the really nice thing about this is that I can put these four lines into PowerGadgets, and in all of 10 seconds have a floating gadget which I can use in XP as well as Vista, and (in Vista) put in the sidebar if I want. I've told it to refresh every minute, which isn't as often as some gadgets refresh, but hopefully it means the site won't cut it off too soon. It's not quite as nifty as Darren Neimke's gadget, but then again, this was really, really quick to throw together.
And of course, I've left the advert for Cricinfo in there. I wouldn't want to hide the source of the information. And if they ask me not to do this, then of course I'll stop. Cricinfo have a great site, and I really don't want to upset them.