Tag Archives: GeoIP2

Reverse Geocoding for the Masses – Apache Nutch

The Apache Nutch community has been hard at work developing an open source web crawler. Nutch is a mature, production-ready web crawler that powers data acquisition, search, and discovery for a wide range of organizations and use cases. The Nutch 1.x branch enables fine-grained configuration and relies on Apache Hadoop™ data structures, which are well suited to batch processing.

This post documents how reverse geolocation features were added to Nutch using MaxMind’s GeoIP2-java API, drawing on the server IP addresses acquired during a Nutch crawl. Readers will take away:

  • insight into why geocoding is appealing in today’s markets,
  • practical code examples from the Nutch 1.x branch, showing how to use the GeoIP2-java API to geocode based on server IPs.


Who Has the Most Accurate IP Geolocation Data?


When choosing among IP geolocation data providers, our customers have told us they are most interested in one thing – accuracy. The question is, who provides the most accurate data?
