Calculate Distance Between Zip Codes Part 1: Geocode Database Entries with the Google Maps API

A few years back I looked into using the Google Maps API to calculate the distance between two zip codes. The proof of concept seemed to work well enough. But when I attempted to perform larger-scale calculations, the solution began falling apart. And it was slow… So I needed to develop an alternative solution.

Background

The goal of the project was to provide two input fields for a user to enter a zip code and to select a search radius. The script would then compare the user's zip code to all the zip codes in the database and see which ones fall into the user's search radius.

I planned to use Google's Distance Matrix Service to calculate all the distances on the fly. The solution seems to work fine with smaller data sets. But it didn't take long to run into the "too many elements … requested within a certain time period" limitation.

A delay could have been added to the script, but the process was already taking longer than most visitors would be willing to wait. So it was time for an alternative solution.

Haversine Formula

While searching, I noticed that many solutions referred to the Haversine formula. And apparently, most of the processing can happen in the MySQL database. But that will be the topic for the next post. For now, there is some prep work to be done with the database.

Geocode Zip Codes

Since we're querying a database, a connection needs to be established. Note that more information about the connection script used for this example can be found in an earlier post titled "End PHP Scripts Gracefully After a Failed Database Connection."

<?php
//CONNECT WITH DATABASE
require "{$_SERVER['DOCUMENT_ROOT']}/../database_connection.php";
$connect = new connect();
$mysqli = $connect->databaseObject;
?>

A couple of variables are also needed; one to hold the output so we can verify that Google is sending back the desired information. The other will contain a DOMDocument object that will help us process the XML code return by Google.

<?php
//CONNECT WITH DATABASE
require "{$_SERVER['DOCUMENT_ROOT']}/../database_connection.php";
$connect = new connect();
$mysqli = $connect->databaseObject;
 
//INITIALIZE VARIABLES
$output = '';
$xmlDoc = new DOMDocument();

?>

Now we can loop through the zip codes found in the database. With each iteration of the loop, we'll initialize a variable for the zip code / address before and after it's passed through Google. Let's also set an error variable which assumes the current iteration will fail.

<?php
//...
 
//INITIALIZE VARIABLES
$output = '';
$xmlDoc = new DOMDocument();
 
//GET ZIP CODES
$sql = "SELECT id, zipCode FROM profileInfo ORDER BY id";
$result = $mysqli->query($sql);
while($row = $result->fetch_assoc()) {
     //INITIALIZE VARIABLES
     $beforeAddress = $row['zipCode'];
     $afterAddress  = '';
     $errorMessage  = 'Unable to get Lat/Long';
}

?>

To look up the latitude and longitude for each zip code, we can use Google's Geocoding API. The request below indicates that result should be formatted as XML which is then loaded into the DOMDocument object created earlier.

<?php
//...
while($row = $result->fetch_assoc()) {
     //INITIALIZE VARIABLES
     $beforeAddress = $row['zipCode'];
     $afterAddress  = '';
     $errorMessage  = 'Unable to get Lat/Long';
 
     //GET LAT/LONG FROM GOOGLE
     if($xmlDoc->load('https://maps.googleapis.com/maps/api/geocode/xml?address=' . urlencode($beforeAddress))) {

          //...
     }
}
?>

If the XML results successfully loads into the DOMDocument object, we can attempt to extract the <location> tag which contains the <lat> and <lng> tags.

<?php
//...
while($row = $result->fetch_assoc()) {
     //INITIALIZE VARIABLES
     $beforeAddress = $row['zipCode'];
     $afterAddress  = '';
     $errorMessage  = 'Unable to get Lat/Long';
 
     //GET LAT/LONG FROM GOOGLE
     if($xmlDoc->load('https://maps.googleapis.com/maps/api/geocode/xml?address=' . urlencode($beforeAddress))) {
          //GET <location> TAG
          $locationTag = $xmlDoc->getElementsByTagName('location');
          if($locationTag->length != 0) {
               //GET <lat> AND <lng> TAGS
               $latTag = $locationTag->item(0)->getElementsByTagName('lat');
               $lngTag = $locationTag->item(0)->getElementsByTagName('lng');
               if($latTag->length!=0 && $lngTag->length!=0) {

                    //...
               }
          }

     }
}
?>

If we're able to extract the <lat> and <lng> tags, we'll save the corresponding values, get the zip code / address returned by Google, and clear the error message. Note that you'll also want to save the latitude and longitude back to your database, but I'll let you take care of that step.

<?php
//...
               if($latTag->length!=0 && $lngTag->length!=0) {
                    //STORE LAT / LONG
                    $currLat = $latTag->item(0)->nodeValue;
                    $currLong = $lngTag->item(0)->nodeValue;
 
                    //STORE ADDRESS FROM GOOGLE
                    $addressTag = $xmlDoc->getElementsByTagName('formatted_address');
                    if($addressTag->length!=0) { $afterAddress = $addressTag->item(0)->nodeValue; }
 
                    //CLEAR ERROR MESSAGE
                    $errorMessage = '';
 
                    //SAVE LAT / LONG TO DATABASE

                    //...
               }
 
//...
?>

Now we just need to prepare the output which will indicate what happened with the geocoding query. If it was successful, it displays the lat / long; otherwise it displays the error message.

<?php
//...
while($row = $result->fetch_assoc()) {
     //...
 
                    //SAVE LAT / LONG TO DATABASE
                    //...
               }
          }
     }
 
     //ADD CURRENT ENTRY TO THE OUTPUT
     $output .= '<tr>';
     $output .= "<td>{$row['id']}</td>";
     if($errorMessage == '') {
          $output .= "<td>$beforeAddress<br>$afterAddress</td>";
          $output .= "<td>$currLat | $currLong</td>";
     } else {
          $output .= "<td>$beforeAddress</td>";
          $output .= "<td style='color:red;'><strong>$errorMessage</strong></td>";
     }
     $output .= '</tr>';

}
?>

All that's left to do is display the output.

<?php
//...
 
     //ADD CURRENT ENTRY TO THE OUTPUT
     $output .= '<tr>';
     $output .= "<td>{$row['id']}</td>";
     if($errorMessage == '') {
          $output .= "<td>$beforeAddress<br>$afterAddress</td>";
          $output .= "<td>$currLat | $currLong</td>";
     } else {
          $output .= "<td>$beforeAddress</td>";
          $output .= "<td style='color:red;'><strong>$errorMessage</strong></td>";
     }
     $output .= '</tr>';
}
 
//DISPLAY OUTPUT
?>
<style type="text/css">
table, td { border:1px solid #CCC; border-collapse:collapse; }
td { padding:5px; }
</style>
<table>
<?php print $output; ?>
</table>

Final Code

To help give you a better sense on how the pieces fit together, here is the entire script.

<?php
//CONNECT WITH DATABASE
require "{$_SERVER['DOCUMENT_ROOT']}/../database_connection.php";
$connect = new connect();
$mysqli = $connect->databaseObject;
 
//INITIALIZE VARIABLES
$output = '';
$xmlDoc = new DOMDocument();
 
//GET ZIP CODES
$sql = "SELECT id, zipCode FROM profileInfo ORDER BY id";
$result = $mysqli->query($sql);
while($row = $result->fetch_assoc()) {
     //INITIALIZE VARIABLES
     $beforeAddress = $row['zipCode'];
     $afterAddress  = '';
     $errorMessage  = 'Unable to get Lat/Long';
 
     //GET LAT/LONG FROM GOOGLE
     if($xmlDoc->load('https://maps.googleapis.com/maps/api/geocode/xml?address=' . urlencode($beforeAddress))) {
          //GET <location> TAG
          $locationTag = $xmlDoc->getElementsByTagName('location');
          if($locationTag->length != 0) {
               //GET <lat> AND <lng> TAGS
               $latTag = $locationTag->item(0)->getElementsByTagName('lat');
               $lngTag = $locationTag->item(0)->getElementsByTagName('lng');
               if($latTag->length!=0 && $lngTag->length!=0) {
                    //STORE LAT / LONG
                    $currLat = $latTag->item(0)->nodeValue;
                    $currLong = $lngTag->item(0)->nodeValue;
 
                    //STORE ADDRESS FROM GOOGLE
                    $addressTag = $xmlDoc->getElementsByTagName('formatted_address');
                    if($addressTag->length!=0) { $afterAddress = $addressTag->item(0)->nodeValue; }
 
                    //CLEAR ERROR MESSAGE
                    $errorMessage = '';
 
                    //SAVE LAT / LONG TO DATABASE
                    //...
               }
          }
     }
 
     //ADD CURRENT ENTRY TO THE OUTPUT
     $output .= '<tr>';
     $output .= "<td>{$row['id']}</td>";
     if($errorMessage == '') {
          $output .= "<td>$beforeAddress<br>$afterAddress</td>";
          $output .= "<td>$currLat | $currLong</td>";
     } else {
          $output .= "<td>$beforeAddress</td>";
          $output .= "<td style='color:red;'><strong>$errorMessage</strong></td>";
     }
     $output .= '</tr>';
}
 
//DISPLAY OUTPUT
?>
<style type="text/css">
table, td { border:1px solid #CCC; border-collapse:collapse; }
td { padding:5px; }
</style>
<table>
<?php print $output; ?>
</table>

Final Thoughts

In case it isn't obvious, the Geocoding API can accept more than just zip codes; feel free to use real addresses. Google is fairly flexible with how the address is formatted, but you'll want to double check the results.

I have seen cases where Google will change "Avenue" to "Street". I even had an address appear in the wrong state. However you choose to proceed, I'll see you back here next time as we discuss how to execute the Haversine formula.

Related Posts

0 Comments

There are currently no comments.

Leave a Comment