Yesterday Zillow released 7,000 neighborhood boundary definitions to the public in SHP format. Quality data sets for this type of geography are hard to come by; the boundaries aren't official and are often debated, so maintaining them is difficult. From my examination of the Seattle-area data, it looks like Zillow has done a really nice job. The resolution is more than adequate for the tasks folks are likely to use this data for, and the granularity of the neighborhoods is surprisingly good. [download]
But dealing with data sets of this size in a web-based application can be a beast. If you've used the DOM to create and destroy thousands of complex objects, you know how slow it can be, not to mention how memory-hungry. By coincidence (?), Jon Howell and the Microsoft Research MapCruncher team released a library yesterday that makes it much easier for developers to work with such massive vector data sets. This tiled vectors technique has been proven against the 60,000,000-segment TIGER/Line data set, so it should scale pretty well for most needs. From Jon:
Jon has supplied the basic toolkit here along with source code so that you can adapt it easily to meet your needs. For instance, in the readme he explains that he did not put a lot of effort into the summarization code; if you want your shapes to look better with the same vector budget, you can keep the client code Jon supplied and swap your own custom generalization code into the pre-processor.
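To give a feel for what "custom generalization code" might look like, here is a minimal sketch of Douglas-Peucker line simplification in Python. This is not Jon's summarization code and it does not use the toolkit's API; the function names (`point_segment_distance`, `simplify`) and the `tolerance` parameter are my own for illustration, and the sketch assumes the points are already in planar coordinates such as projected map units.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in planar units (an assumption of this sketch)


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Return the planar distance from point p to the segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment (a == b), e.g. a closed ring: distance to the point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points: List[Point], tolerance: float) -> List[Point]:
    """Douglas-Peucker: drop vertices that lie within tolerance of the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        # Every interior vertex is close enough to the chord; keep endpoints only.
        return [first, last]
    # Otherwise keep the farthest vertex and simplify both halves recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # avoid duplicating the shared vertex


line = [(0, 0), (1, 0.01), (2, 0), (3, 5), (4, 0)]
simplified = simplify(line, 0.1)
```

A larger `tolerance` removes more vertices, so it is the knob you would tune against whatever vector budget you have per tile.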
Here are a couple of screenshots of the Zillow data for Washington State loaded in Virtual Earth. As a bonus, when you load the data, VE will calculate the area and perimeter of each neighborhood for you.
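If you want to sanity-check numbers like these yourself, here is a rough sketch in Python. I don't know how VE computes its values, so this is only one reasonable approach: the perimeter sums haversine distances on a spherical Earth, and the area applies the shoelace formula after a local equirectangular projection, which is only a fair approximation for neighborhood-sized polygons. The names `haversine_m`, `ring_perimeter_m`, and `ring_area_m2` are made up for this example, and rings are assumed to be lists of (latitude, longitude) pairs in degrees.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical approximation

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(h)))


def ring_perimeter_m(ring: List[LatLon]) -> float:
    """Sum of edge lengths, including the closing edge back to the first vertex."""
    n = len(ring)
    return sum(haversine_m(ring[i], ring[(i + 1) % n]) for i in range(n))


def ring_area_m2(ring: List[LatLon]) -> float:
    """Approximate area: project around the mean latitude, then apply shoelace."""
    n = len(ring)
    lat0 = math.radians(sum(p[0] for p in ring) / n)
    xy = [(EARTH_RADIUS_M * math.radians(lon) * math.cos(lat0),
           EARTH_RADIUS_M * math.radians(lat)) for lat, lon in ring]
    twice_area = 0.0
    for i in range(n):
        x1, y1 = xy[i]
        x2, y2 = xy[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


# A 0.01 x 0.01 degree square at the equator, roughly 1.1 km on a side.
square = [(0.0, 0.0), (0.0, 0.01), (0.01, 0.01), (0.01, 0.0)]
perimeter = ring_perimeter_m(square)
area = ring_area_m2(square)
```

For larger shapes, or anywhere accuracy matters, an ellipsoidal (geodesic) calculation would be the safer choice.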