MaxMind GeoLite on Rails with PostGIS

Posted October 15th, 2006 in Mapping, Programming, Rails and or Ruby, Tutorials

Ughhhh, many hours of frustration have been spent trying to push the MaxMind GeoLite City CSV files into a database configured with PostGIS. Anyway, I figured I would document the process that I went through incase someone else wants some free IP based geocoding in there Rails app.

MaxMind gives you two CSV files for the city database. One has an assload (2,783,434) of IP blocks that map to the id of a row in the other table containing location data. The location data is pretty rich. It contains a postal code, area code, dma code, country, region (state in US), and a latitude and longitude. Because I wanted to be cool and more efficient, I decided to store the lat / lon as a point in PostGIS format. Using PostGIS along with Guilhem Vellut’s Spatial Adaptor plugin for Rails, makes doing geo-spatial operations sexier and easier than ever. You can use the Spatial Adaptor with MySQL, but you’ll never be as cool as the guy using it with PostgreSQL.

First off I had to create my tables.

create_table "geo_ip_locations" do |t|
  t.column "country", :string, :limit => 2
  t.column "region", :string, :limit => 2
  t.column "city", :string
  t.column "postal_code", :string, :limit => 7
  t.column "dma_code", :integer, :limit => 3
  t.column "area_code", :integer, :limit => 3
  t.column "geom", :point, :null => false, :srid => 4269, :with_z => false
end

add_index :geo_ip_locations, :geom, :spatial => true

create_table "geo_ips" do |t|
  t.column "start_ip", :integer, :limit => 10
  t.column "end_ip", :integer, :limit => 10
  t.column "geo_ip_location_id", :integer
end

add_index :gep_ips, :start_ip
add_index :geo_ips, :end_ip
add_index :geo_ips, :geo_ip_location_id

A few things to note:

  • The geom column has the srid set to 4269, this threw me for a loop for way too long. Make sure you have the spatial_ref_sys table populated. I’m also assuming that the MaxMind data is mapped to the WGS 84 standard.
  • The limit’s on the start and end ip’s are set to 10. This creates bigint columns in my DB. I’m not sure what version of Rails started doing this (I’m on Edge), but the columns have to be bigints for the IP block information to be imported.
  • Remember to create a couple models for the tables
  • You may want to add an index on postal_code or anything else you’ll be using in a SQL WHERE clause

Now because the CSV file with the blocks of IP addresses in it is so large I went with an importing tool to get it into the table. I was going to use PostgreSQL’s COPY command, but it doesn’t seem to support CSV’s with double quotes. I went with Navicat and took advantage of my 30 day trial. The CSV with the locations in it is another story. We need to convert the lat / lon pairs into points for PostGIS. Not only that, but if we want to be really cool we need to convert the city names into UTF-8. The file that MaxMind gives us has ISO-8859-1 encoding. Iconv comes to the rescue. You can either convert the entire file from the command line, or just use the ruby library to do it in the import script below.

require "#{File.dirname(__FILE__)}/../../config/environment"
require 'fastercsv'
require 'iconv'
ICONV = Iconv.new( 'UTF-8', 'ISO-8859-1' )

FasterCSV::HeaderConverters[:underscore] = lambda { |h| h.underscore }

FasterCSV.foreach('geo_ip_locations.csv', {:headers => :first_row, :col_sep => ",", :header_converters => :underscore}) do |row|
  begin
    row['id'] = row.delete('loc_id')[1]
    row['geom'] = Point.from_x_y(row.delete('longitude')[1].to_f, row.delete('latitude')[1].to_f, 4326)
    row['city'] = ICONV.iconv(row['city'])
    cool = GeoIpLocation.new(row.to_hash)
    cool.id = row['id']
    cool.save!
    $stdout.print '.'
  rescue
    puts $!.message
    $stdout.print 'f'
  end
  $stdout.flush
end

Things to note:

  • You need FasterCSV for this to work gem install fastercsv
  • You need to delete the first line (with the copyright info in it) in the CSV for this script to work.
  • If you converted the file from the command line, comment out the three lines referencing iconv to improve performance.
  • I created a directory called transform in my db directory, renamed and moved the location CSV there, and created a file with the above code in it.
  • Always remember to put the srid in the Point.from_x_y call, otherwise you’ll be very very frustrated.
  • You’ll need to specify encoding: unicode Postgres, encoding: utf8 MySQL in database.yml to use UTF-8 with the DB. See the Rails wiki
  • The id is set this way to preserve it.

That’s it! There you have it, the possibilities are now endless. As I said before, I set this up on PostgreSQL, but this should work the same on MySQL. The one main difference will be the srid’s. MySQL doesn’t support them, so you don’t need to specify them. Goodluck!

Kinda like: | 5 comments »

Your own personal Geocoder... for FREE

Posted July 18th, 2005 in Mapping, Tutorials

After the release of the google maps api, I began working on custom applications. Because google doesn't allow address lookups in its api, you have to generate a latitude and longitude yourself for a given address. The immediate problem was the need for geocoding, preferably geocoding within php. This is where the US Census comes in. The easiest solution is to download a mysql dump of all the zips with there corresponding lat/longs. That mysql dump can be found here Well, this is great, but it's not detailed enough. I needed to geocode to the street level. geocoder.us offers a free web service for doing this but you can only make one request every 15 seconds. I have a database of 3500 addresses, so that wasn't going to work. geocoder.us uses the US census's Tiger Data, and the perl Geo::Coder::US module. The problem with this solution is the amount of time it would take compile. You have to download every zip file from every state, and than compile it. I'm not patient enough. Thats when I found Dan Egnor. Dan won the 2002 Google Programing contest with a program that does just what I needed it to. Read on to do it yourself. Ok, so the biggest issue is skipping out of the compile process. Dan offers a free download of a bz2'ed version (~300mb) of the compiled address location data (by the way, the README on his site is good reading material). Dans program does more than I needed it to, so I simplified his source code (Thanks Dan) to do simply what needs to be down. It can be downloaded here. Simply download my source, `bunzip2` the map data from Dans site (assuming Linux here), and tar -jxvf the source, run make on the source and issue a command like `./geo-coder (location of the map data ex. ../../all-map) '(the address you want to find)'`. This returns either SUCCESS or FAILURE. Some simple php code will execute and return the address within your web app:
1
2
3
4
5
6
7
8
9
10

$progress = array();
exec("./geo-coder ../all-map '$address'", $progress);
if(substr($progress[0],0,7) == "SUCCESS"){
 //The below returns an array with, $ll[0] = longitude and $ll[1] = latitude from Dan's program
 $ll = explode(", ",trim(substr(strstr($progress[3], '@'),1)));
}else{
 //use a zipcode (from the query or DB table in my case) to find the lat/long in the mysql zip database
 $zips =& $db->query('SELECT latitude, longitude FROM zipcodes WHERE zip = '.$row['zip']);
}
So there it is. A geocoder in no time at all. The geocoder is about 80% effective (according to Dan). If enough interest is created with this post, I will create a torrent of the most up to date Tiger address information, which should improve the accuracy. Don't hesitate to email me at vanpelt@gmail.com if you have questions or requests.
Kinda like: | 5 comments »

Google maps continues to amaze me

Posted June 8th, 2005 in Cool Tech, Mapping

I was browsing around today, and I found an interesting article on slashdot. It had to do with third party uses of google maps, and google cutting down on them. There turns out to be some amazing stuff being done google maps and third party software. As mentioned earlier, there was the gps in the car use but these applications are just too cool. My favorite was a tool that took craigslist rental listings and displayed them in your area dependent upon price. There is also a site that points out sites to see from the satellite imagery.

In order of coolnees:

Who knows what will pop up next???

Kinda like: | 1 comments »

Mi Casa... Google Maps

Posted April 11th, 2005 in Mapping

Google maps released it's new mapping program a few months ago. Last week they added satellite imagery. I thought this would be a great opportunity to share the unbelievable beauty that is Southern California. Click on the thumb for a detailed look. Life is good out here, but it's no Iowa. Try google maps and check out Southern California. Click on the map and drag, you can scroll wherever you like. It's amazing. Getting directions on the site is also very cool. Google bought Keyhole a couple weeks ago. I look forward to seeing what they do with the technology. Currently only urban areas have high resolution. It's like being in an airplane, and all using relatively simple built in technology. For more about Google Maps and all of Googles cool new features check this out. I'm sure they'll be getting a hold of me soon to create the next big thing :).
Kinda like: | 0 comments »