Every year, more than a billion consumers shop on e-commerce websites. In 2016, a new startup called Fomo set out to help merchants reach that audience. To do that, Fomo first needed a service partner with expertise in geolocation, and it chose MaxMind. “We’re a relatively new company, but our growth has been phenomenal since we added MaxMind’s GeoIP2 Precision services,” said Fomo’s co-founder Ryan Kulp.
The Apache Nutch community has been hard at work developing an open source web crawler. Nutch is a mature, production-ready web crawler that powers data acquisition, search, and discovery for a broad spectrum of organizations across an even broader range of use cases. The Nutch 1.x branch enables fine-grained configuration and relies on Apache Hadoop™ data structures, which are well suited to batch processing.
This post documents how reverse geolocation features were added to Nutch via MaxMind’s GeoIP2-java API, making good use of server IP addresses acquired within a Nutch crawl. Readers will take away:
- insight into why geocoding is appealing in today’s markets,
- practical code examples from the Nutch 1.x branch showing how to use the GeoIP2-java API to geocode based on server IPs.
When it comes to choosing between the multiple IP geolocation data providers out there, our customers have told us they are most interested in one thing – accuracy. The question is, who provides the most accurate data?