Add City Coverage to MapPoint using the GeoNames Database

01-31-2011, 09:18 AM
by Richard Marsden

Users of Microsoft MapPoint often request the ability to import their own data – especially for regions not covered by the North American and European editions. It is not possible to modify MapPoint’s road database or dataset structure, but it is possible to add annotation in the form of pushpins and shapes. Here I demonstrate how to write a Python script to add pushpins for cities in countries with limited MapPoint coverage. A frequent problem is that of finding a suitable source of data. I shall use the open source global GeoNames database of places.

The GeoNames database is available at GeoNames (http://www.geonames.org). The database contains over eight million place names, covers all of the world’s countries, and is distributed under the Creative Commons Attribution 3.0 License. The complete database or small extracts can be downloaded as text files, and it can also be accessed using a series of web services. Premium services are available for those who wish to provide financial support, or require professional-grade availability and a service level agreement.

Extracts include: countries, largest cities, highest mountains, capitals, and post codes. We shall use the “all cities with a population > 1000” subset. This can be found in the file cities1000.zip and can be downloaded from the Download Server at GeoNames (http://download.geonames.org/export/dump/).

This file unzips to give the text file cities1000.txt. This is a simple tab-separated table in the standard GeoNames format, which is described at the above URL. The available information includes place name, country, feature type (i.e. whether it is a town, mountain, etc.), latitude, longitude, population, administrative level (four levels available), elevation, time zone, and aliases.

Country codes use the ISO 3166 two-character codes. These are listed in their own GeoNames database, and on the ISO 3166 Maintenance Agency's list of English country names and code elements (http://www.iso.org/iso/english_country_names_and_code_elements). Feature types are detailed in the GeoNames documentation.
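As a rough illustration of the table format described above, the following sketch splits one tab-separated line into named fields. The labels in FIELD_NAMES and the sample record are illustrative assumptions (the sample values are made up, not taken from the database); the GeoNames documentation remains the authoritative description of the columns.

```python
# Column order follows the field indices (0-18) used later in this article.
# The names are illustrative labels, not official GeoNames identifiers.
FIELD_NAMES = [
    "id", "name", "ascii_name", "synonyms", "latitude", "longitude",
    "feature_class", "feature_code", "country", "alt_countries",
    "admin1", "admin2", "admin3", "admin4", "population",
    "elevation", "dem", "timezone", "modified",
]

def parse_line(line):
    """Split one tab-separated record into a dict keyed by field name."""
    values = line.rstrip("\n").split("\t")
    return dict(zip(FIELD_NAMES, values))

# A made-up record for demonstration only (not real GeoNames data)
sample = "\t".join([
    "1", "Exampleville", "Exampleville", "Example Town,Ex-ville",
    "-30.5", "25.25", "P", "PPL", "ZA", "", "00", "", "", "",
    "12345", "", "1200", "Africa/Johannesburg", "2011-01-01",
])

record = parse_line(sample)
print(record["ascii_name"], record["country"], int(record["population"]))
```

All values arrive as strings, so numeric fields such as latitude, longitude and population still need converting with float() or int().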

Python is ideal as a scripting language to read and interpret this data due to its text manipulation and list handling abilities.

I have covered the use of MapPoint and Python before (see "Using Python to Control MapPoint, Part 1" in MP2K Magazine: http://www.mp2kmag.com/a131--python.pythonwin.mappoint.html) and I will assume that you are familiar with the basics. I shall use Python 3 and the PythonWin extensions for COM support (see Python for Windows: http://www.python.org/download/windows/).

The Code
For this example, we shall display pushpins for all cities in South Africa with a population above 1000. The pins will be sized according to population. The pushpin ‘balloons’ will also include population and synonym (alias) information for each city. The code will be written so that it can work with multiple countries if desired.

The pushpins are a series of colored circle bitmaps kindly provided by Eric Frost. These are labeled SA1.bmp (small, yellow) through SA8.bmp (large, red).

Let’s get started! First we import the required libraries and define the name of the MapPoint application object we wish to use (the North American edition by default):

# Imports cities from the GeoNames database into MapPoint for the
# specified countries

# Import Libraries
import string
import codecs
import math
import time

# COM imports, and define the MapPoint object name
from win32com.client import constants, Dispatch
MAPPOINT = 'MapPoint.Application.NA'

Next we define the countries we wish to import. We do this here so that it is easy to change. We shall only import South Africa (ISO code ‘ZA’) for this example, but multiple countries can be used (see the commented example). We define these as a set to enable quick lookups of the form “Is this city in one of the requested countries?”

# List of countries that we require (2 letter ISO country codes)
# The ISO code for South Africa is ZA:
COUNTRY_LIST = set(["ZA"])

# We can list multiple countries if we wish, eg: the Central American isthmus countries:
#COUNTRY_LIST = set(["NI","CR","PA"])

We will define a CityInfo class to store each city’s information. This is a simple class definition, consisting of a number of data members and a constructor. The constructor creates a new CityInfo object by parsing a single line of the GeoNames database. The class contains a ‘valid’ flag. This is set if the place is a city with a population above 1000. The check is needed because not every place in the file meets this condition (e.g. Grytviken in South Georgia).

Here is the definition:

# This class stores and handles a city's information

class CityInfo:
    def __init__(self, sline):
        # Split the line into fields: fields are tab separated
        ln = sline.split('\t')

        # 0 = identifier
        # 1 = name, utf8

        # 2 = name, ASCII
        self.city = ln[2]

        # 3 = Synonyms (comma separated)
        self.synonyms = ln[3]

        # 4 = Latitude, 5 = Longitude
        self.latitude = float( ln[4] )
        self.longitude = float( ln[5] )

        # 6,7 are the feature class and feature code
        # These determine if it is a usable city or not
        # Translate into a simple 'valid' flag
        # See the GeoNames documentation for a full list of codes
        self.valid = False
        if (ln[6] == 'P'):
            if (ln[7]!='PPLX' and ln[7]!='PPLL' and ln[7]!='PPLQ' and ln[7]!='PPLW'):
                self.valid = True # a valid city!
        if (len(self.city) ==0):
            self.valid = False # null/bad name

        # ISO-3166 2 char country code
        self.country = ln[8]

        # 9 = alternative country codes (comma separated)
        # 10 = 1st level admin code (eg. State in the USA)
        self.state = ln[10]

        # 11,12,13 = admin levels 2,3,4

        # 14 = Population
        self.population = int( ln[14] )
        if (self.population <=1000 ):
            self.valid = False

        # 15 = elevation
        # 16 = av elevation from the digital elevation model (~900m grid)
        # 17 = timezone
        # 18 = modification date

That is all that is required to parse the GeoNames data. We just need to feed it each line, and to process the results.
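The validity rules in the class above can also be written as a small standalone function, which makes them easier to check in isolation. This is only a sketch: the function name is_valid_city and the constant EXCLUDED_CODES are hypothetical names introduced here, and the arguments are assumed to have already been extracted from fields 2, 6, 7 and 14 of a line.

```python
# Feature codes that the CityInfo class rejects
EXCLUDED_CODES = {"PPLX", "PPLL", "PPLQ", "PPLW"}

def is_valid_city(name, feature_class, feature_code, population):
    """Mirror the CityInfo rules: class 'P', usable code, a name, pop > 1000."""
    if feature_class != "P":
        return False          # not a populated place
    if feature_code in EXCLUDED_CODES:
        return False          # a populated-place code we choose to skip
    if len(name) == 0:
        return False          # null/bad name
    return population > 1000

# Made-up inputs for demonstration only
print(is_valid_city("Exampleville", "P", "PPL", 12345))
print(is_valid_city("Exampleville", "P", "PPLX", 12345))
print(is_valid_city("Tinytown", "P", "PPL", 1000))
```

Using a set for the excluded codes keeps the test to a single membership check, and makes it easy to add or remove codes later.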

We have now defined everything, so we can now start with the main code. First we initialize a few variables, log some information to the user, and open the cities1000.txt file:

# Main Code
listCities = []
print("Load GeoNames City Data into MapPoint")
print("Reading data...")
cFile = codecs.open('C:\\your\\path\\cities1000.txt', mode='r',encoding='utf-8')

listCities is simply a list of all the cities we wish to write to MapPoint. We could write them to MapPoint as we read them, but storing them into memory first allows us to calculate statistics or perform other operations if we wished to.

Next we read each line from cFile and parse a new CityInfo object from it. We then store it into listCities if it is valid and is in one of the requested countries:

city_line = cFile.readline()
while len(city_line) > 0 :
    thisCity = CityInfo( city_line )
    if (thisCity.valid):
        if (thisCity.country in COUNTRY_LIST):
            listCities.append(thisCity)

    # fetch next line (and loop to parse it)
    city_line = cFile.readline()

# File has been read => close it
cFile.close()

# Display some summary statistics
print("Total number of cities loaded=", len(listCities))
print("Lowest population="+str(min(c.population for c in listCities))+"; Highest="+str(max(c.population for c in listCities)))
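For comparison, here is a sketch of the same read-and-filter step written with a for loop over the file object instead of repeated readline() calls. It is not the article's script: load_rows and make_line are hypothetical helpers, the records are made up, and an in-memory io.StringIO stands in for cities1000.txt so that the example runs anywhere.

```python
import io

def load_rows(lines, countries):
    """Return (ascii_name, country, population) for rows in the wanted countries."""
    rows = []
    for line in lines:                      # a file object yields one line at a time
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 15:
            continue                        # skip blank or truncated lines
        if fields[8] in countries:
            rows.append((fields[2], fields[8], int(fields[14])))
    return rows

def make_line(name, country, population):
    """Build a made-up 19-column record for this demonstration."""
    fields = [""] * 19
    fields[2], fields[8], fields[14] = name, country, str(population)
    return "\t".join(fields) + "\n"

fake_file = io.StringIO(
    make_line("Alpha", "ZA", 5000)
    + make_line("Beta", "NI", 20000)
    + make_line("Gamma", "ZA", 150000)
)
rows = load_rows(fake_file, {"ZA"})
print(rows)
```

With a real file, the call would sit inside a with open(path, encoding="utf-8") as handle: block, which also closes the file automatically when the block ends.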

All of the city data has been read in, so now it is time to start MapPoint:

print("Starting MapPoint...")
myApp = Dispatch(MAPPOINT)
myApp.Visible = 1
myApp.UserControl = 1
myMap = myApp.ActiveMap

Next we need to load our custom pushpin symbols. The MapPoint Symbols.Add() method returns the index of the new symbol that is created. We store these in a new list, myPinList. We have a total of eight pushpins, but only use four of them. These are allocated on a log10 scale, e.g. the first pin is used for all cities with a population from 1,000 to 9,999. The final pushpin is used for all cities with a population of 1,000,000 or more. For some countries, this includes cities above 10,000,000. Here is the code that loads the pushpins and stores their indexes:

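# Load external pushpin symbols
print("Loading pushpin symbols...")

# Four of the eight bitmaps are loaded, ordered from smallest to largest so
# that the list index matches the pin index (0 = smallest population band).
# NOTE: the choice of SA8.bmp as the fourth (largest) symbol is an assumption;
# swap in any of the commented-out bitmaps if you prefer.
myPinList = []
myPinList.append(myMap.Symbols.Add("C:\\your\\path\\SA1.bmp"))
# myPinList.append(myMap.Symbols.Add("C:\\your\\path\\SA2.bmp"))
# myPinList.append(myMap.Symbols.Add("C:\\your\\path\\SA3.bmp"))
myPinList.append(myMap.Symbols.Add("C:\\your\\path\\SA4.bmp"))
# myPinList.append(myMap.Symbols.Add("C:\\your\\path\\SA5.bmp"))
myPinList.append(myMap.Symbols.Add("C:\\your\\path\\SA6.bmp"))
# myPinList.append(myMap.Symbols.Add("C:\\your\\path\\SA7.bmp"))
myPinList.append(myMap.Symbols.Add("C:\\your\\path\\SA8.bmp"))
MIN_PIN = 0
MAX_PIN = len(myPinList)-1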

After creating the pushpins, we create a series of corresponding pushpin sets. Doing this allows us to create a meaningful key. Also, by creating the pushpin sets in order from low to high, we automatically arrange for the higher population cities (=later pushpin sets) to be positioned over the lower population cities (=earlier pushpin sets). The pushpin sets are created using the DataSets.AddPushpinSet() method, and are stored in their own list, myPushpinSets. Finally the symbols are allocated to the pushpin sets using a simple loop:

# Create the pushpin sets
myPushpinSets = []
myPushpinSets.append(myMap.DataSets.AddPushpinSet("Population 1,000-9,999"))
myPushpinSets.append(myMap.DataSets.AddPushpinSet("Population 10,000-99,999"))
myPushpinSets.append(myMap.DataSets.AddPushpinSet("Population 100,000-999,999"))
myPushpinSets.append(myMap.DataSets.AddPushpinSet("Population 1,000,000+"))

for i in range(MIN_PIN,MAX_PIN+1):
    myPushpinSets[i].Symbol = myPinList[i]

We are finally ready to start drawing the pushpins. All we need to do is to loop over the cities, creating a pushpin for each city. The pin is chosen using a log10 formula arranged so that a population of 1,000-9,999 receives a pin index of 0; a population of 10,000-99,999 receives a pin index of 1; etc. The pin index is then used to choose both the pin symbol and the pushpin set for the pushpin.

The balloon text is then created from the city’s population, and the list of synonyms (if any). All balloons are set to “hide”, but the pushpin names are displayed for the largest cities with the largest pushpin symbol.

Here’s the code:

# Add the cities to the map
print("Drawing the pushpins...")

for thisCity in listCities:
    myLoc = myMap.GetLocation( thisCity.latitude, thisCity.longitude)
    myPin = myMap.AddPushpin( myLoc, thisCity.city+", "+thisCity.country)
    # Calc the pin index to determine the symbol and pushpinset
    ipin = int( math.log10(thisCity.population)) -3
    ipin = max(MIN_PIN,min(ipin,MAX_PIN))
    myPin.Symbol = myPinList[ ipin ]
    myPin.Note = "Population: "+str(thisCity.population)+"\nSynonyms: "+thisCity.synonyms
    myPin.MoveTo( myPushpinSets[ipin] )

    # display the city name for the pin if pop>=1million
    if (ipin==MAX_PIN):
        myPin.BalloonState = 1
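The log10 binning used in the loop above can be tried without MapPoint. The sketch below assumes four pin classes (MIN_PIN = 0, MAX_PIN = 3), as described in the text; pin_index is a hypothetical helper name and the populations are arbitrary test values.

```python
import math

MIN_PIN = 0
MAX_PIN = 3   # four symbols / pushpin sets, as described in the text

def pin_index(population):
    """Map a population (must be > 0) to a pin index with the log10 formula."""
    ipin = int(math.log10(population)) - 3   # 1,000s -> 0, 10,000s -> 1, ...
    return max(MIN_PIN, min(ipin, MAX_PIN))  # clamp into the available range

for pop in (1500, 25000, 400000, 3000000, 12000000):
    print(pop, "->", pin_index(pop))
```

The clamp is what places cities of ten million or more in the same class as those of one million, and it would also keep any population below 1,000 from producing a negative index.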

There we have it. All we need to do is to zoom to the imported data, and to tidy up:

# Zoom to the completed dataset
myMap.DataSets.ZoomTo()

print("Completed the map!")

# tidy up
myLoc = 0
myMap = 0
myApp = 0

Here are the results with the balloon open for Port Saint John’s to demonstrate the synonym information:

http://www.mapforums.com/images/geonames_south_africa.gif (http://www.mapforums.com/images/geonames_south_africa_large.gif)

Due to the scripting nature of Python and some of the design of the CityInfo class, the above code is easily modified to produce different plots, calculate various statistics, or apply other pre-processing to the data. By choosing different GeoNames datasets, it is possible to plot smaller cities or other types of places.
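As one example of the kind of statistics mentioned above, the following sketch summarizes a list of cities using only the standard library. The (name, country, population) tuples are made up for illustration; with the real script, the same expressions could be applied to the attributes of the CityInfo objects in listCities.

```python
from collections import Counter
from statistics import median

# Made-up (name, country, population) tuples for demonstration only
cities = [
    ("Alpha", "ZA", 5000),
    ("Beta", "ZA", 20000),
    ("Gamma", "ZA", 150000),
    ("Delta", "NI", 8000),
]

per_country = Counter(country for _, country, _ in cities)  # cities per country
median_pop = median(pop for _, _, pop in cities)            # middle population
largest = max(cities, key=lambda c: c[2])                   # most populous city

print(per_country)
print("Median population:", median_pop)
print("Largest:", largest[0])
```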

01-31-2011, 03:56 PM
The attached file includes the Python script, a copy of the GeoNames cities1000 database (January 2011), and the bitmap images that were used.

Ideally the latest version of the Geonames database should be downloaded - see instructions and URL in the article text.