Thread: NYT River Overhaul.

A few weeks ago the NY Times firehose feed broke.

I emailed with a friend at the Times, and we were able to get it working again. But the new version of the firehose is a mere trickle compared to the former raging torrent.

This put me in a bad place because I depend on a gush of NYT headlines in my river. I could subscribe to all the feeds I could find, but that means that I’d get duplicate stories because the Times, like other pubs, runs many stories in multiple feeds.

I’ve long thought about writing a heuristic to fix this: keep track of the titles that have already appeared in a river and skip duplicates. Last night during the Giants game I gave it a shot, and it worked.
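For other aggregator developers, here is a rough sketch of what a title-based duplicate check could look like. It is not the actual river code. The function names, the dict-shaped items with a "title" key, and the whitespace and case normalization are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, Set


def normalize_title(title: str) -> str:
    """Collapse whitespace and case so trivially different titles match.

    This normalization is an assumption; an exact string match would also work.
    """
    return " ".join(title.split()).casefold()


def dedupe_by_title(items: Iterable[Dict[str, str]],
                    seen_titles: Set[str]) -> Iterator[Dict[str, str]]:
    """Yield only the items whose title has not yet appeared in this river.

    seen_titles is kept per river and updated in place, so it carries over
    from one feed (and one scan) to the next.
    """
    for item in items:
        key = normalize_title(item.get("title", ""))
        if not key:
            # No title to compare against, so let the item through.
            yield item
            continue
        if key in seen_titles:
            # Already shown via another feed: skip it.
            continue
        seen_titles.add(key)
        yield item


# Hypothetical example: the same story runs in two feeds.
seen: Set[str] = set()
world_feed = [{"title": "Storm Nears Coast", "feed": "world"}]
us_feed = [
    {"title": "Storm  nears coast", "feed": "us"},
    {"title": "Polls Tighten", "feed": "us"},
]
river = list(dedupe_by_title(world_feed + us_feed, seen))
```

In this example the river ends up with two items: the storm story from the first feed and the polls story. The second copy of the storm story is dropped. A real aggregator would probably want to cap or expire the set of seen titles so it doesn't grow forever.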

I wrote the change up in this worknote.

I added a huge number of feeds to the NYT river. And it’s starting to feel good again. I wanted to share this as a possible best-practice for other aggregator developers.

Updates

  • After running for a few hours — success. The NYT river is back to its rich flow, at a time when there’s lots going on — the presidential election and a hurricane. And there aren’t any duplicates. All is good. 🙂
  • It’s been a while since I really looked at the NYT river. They write such good descriptions. You have a pretty good idea what the article is about even without clicking. Much more useful than getting full text. Because I get a breadth of the news, and the experience is created by editors who know what they’re doing.

About Dave Winer

Dave Winer, 54, pioneered the development of weblogs, syndication (RSS), podcasting, outlining, and web content management software; former contributing editor at Wired Magazine, research fellow at Harvard Law School, entrepreneur, and investor in web media companies. A native New Yorker, he received a Master's in Computer Science from the University of Wisconsin, a Bachelor's in Mathematics from Tulane University and currently lives in Berkeley, California.