I successfully refreshed the TrendingTopics website data today, using a sample of an updated Wikipedia dataset that I set up on Amazon S3. The updated data covers 1/1/2011-3/31/2011. Crunching through months of data would take too long, or cost too much if I used the EC2 cloud, so I didn't use the whole dataset. I sampled only one hour of weblogs from each day. You can check out my local version of TrendingTopics here. (CTRL-click or wheel-click to open in a new browser window.)
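For anyone curious how the one-hour-per-day sampling might look, here is a rough sketch in Python. It is not the exact script I ran. It assumes the hourly log files are named like `pagecounts-YYYYMMDD-HHMMSS.gz`, and the function name `sample_one_hour_per_day` and the default hour of 12 are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed filename pattern for hourly logs: pagecounts-YYYYMMDD-HHMMSS.gz
HOURLY_FILE = re.compile(r"pagecounts-(\d{8})-(\d{2})\d{4}\.gz$")


def sample_one_hour_per_day(filenames, hour=12):
    """Pick one hourly file per day.

    Prefers the file for `hour`; if that hour is missing for a day,
    falls back to the earliest hour available on that day.
    Returns the chosen filenames sorted by date.
    """
    by_day = defaultdict(dict)  # day string -> {hour: filename}
    for name in filenames:
        match = HOURLY_FILE.search(name)
        if match is None:
            continue  # skip anything that is not an hourly log file
        day, file_hour = match.group(1), int(match.group(2))
        by_day[day].setdefault(file_hour, name)

    sample = []
    for day in sorted(by_day):
        hours = by_day[day]
        sample.append(hours.get(hour, hours[min(hours)]))
    return sample
```

Given a listing of the S3 bucket contents, the returned list would be the only files that need to be fed into the processing job, which cuts the input to roughly 1/24 of the full dataset.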
In case the site is offline, you can see the more recent dates in the screenshot below:
The caveat with the refresh is that I can only load 100 records of the sample data. For some reason, the Rails app bombs when I try to load the full dataset.
* must figure out why
--more to come--