I have a JSON API that I need to provide super fast access to my data through.

The JSON API makes a simple query against the database based on the GET parameters provided.

I've already optimized my database, so please don't recommend that as an answer.

I'm using PHP-APC, which helps PHP by saving the bytecode, BUT for a JSON API that is being called literally dozens of times per second (as indicated by my logs), I need to reduce PHP's massive RAM consumption ... as well as rewrite my JSON API in a language that executes much faster than PHP.

My code is below. As you can see, it is fairly straightforward.

<?php

define('ALLOWED_HTTP_REFERER', 'example.com');

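// the Referer header is optional, so check that it is set before matching it
if ( isset($_SERVER['HTTP_REFERER']) && stristr($_SERVER['HTTP_REFERER'], ALLOWED_HTTP_REFERER) ) {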

 try {
  $conn_str = DB . ':host=' . DB_HOST . ';dbname=' . DB_NAME;
  $dbh = new PDO($conn_str, DB_USERNAME, DB_PASSWORD);
  $params = array();

  $sql = 'SELECT  homes.home_id,
      address,
      city,
      state,
      zip
    FROM homes
    WHERE homes.display_status = true
    AND homes.geolat BETWEEN :geolatLowBound AND :geolatHighBound 
    AND homes.geolng BETWEEN :geolngLowBound AND :geolngHighBound';

  $params[':geolatLowBound'] = $_GET['geolatLowBound'];
  $params[':geolatHighBound'] = $_GET['geolatHighBound'];
  $params[':geolngLowBound'] = $_GET['geolngLowBound'];
  $params[':geolngHighBound'] = $_GET['geolngHighBound'];

  if ( isset($_GET['min_price']) && isset($_GET['max_price']) ) {
    $sql = $sql . ' AND homes.price BETWEEN :min_price AND :max_price ';
    $params[':min_price'] = $_GET['min_price'];
    $params[':max_price'] = $_GET['max_price'];
  }

  if ( isset($_GET['min_beds']) && isset($_GET['max_beds']) ) {
    $sql = $sql . ' AND homes.num_of_beds BETWEEN :min_beds AND :max_beds ';
    $params[':min_beds'] = $_GET['min_beds'];
    $params[':max_beds'] = $_GET['max_beds'];
  }

  if ( isset($_GET['min_sqft']) && isset($_GET['max_sqft']) ) {
    $sql = $sql . ' AND homes.sqft BETWEEN :min_sqft AND :max_sqft ';
    $params[':min_sqft'] = $_GET['min_sqft'];
    $params[':max_sqft'] = $_GET['max_sqft'];
  }

  $stmt = $dbh->prepare($sql);

  $stmt->execute($params);
  $result_set = $stmt->fetchAll(PDO::FETCH_ASSOC);

  /* output a JSON representation of the home listing data retrieved */
  ob_start("ob_gzhandler"); // compress the output
  header('Content-Type: application/json');

  array_walk_recursive($result_set, "cleanOutputFromXSS");
  // encode the wrapper too: a hand-printed {'homes' : ...} with single quotes is not valid JSON
  print json_encode( array('homes' => $result_set) );

  $dbh = null;
 } catch (PDOException $e) {
  die('Unable to retrieve home listing information');
 }

}


function cleanOutputFromXSS(&$value) {
 $value = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}


?>

How would I begin converting this PHP code over to C, since C is both better at memory management (since you do it yourself) and much, much faster to execute?

UPDATE:

Would Facebook's HipHop do all of this automatically for me?

A: 

You can write your own Apache module.

Here is a tutorial: http://threebit.net/tutorials/apache2_modules/tut1/tutorial1.html

edwin
Thanks, but this doesn't show me the following: 1) how to connect to my MySQL database in C, 2) how to retrieve the GET parameters from the URL in C, 3) how to output a JSON array in C, etc.
TeddyB
It is C; you don't get all those things for free, you have to write them yourself. This is why most web applications are written in high-level languages. Google around; there are C-based solutions for doing those things.
Byron Whitlock
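To give a sense of what "write them yourself" means for the GET parameters, here is a minimal sketch in plain C. It assumes you already have the raw query string as a NUL-terminated string (in an Apache 2 module that is normally the `args` field of the `request_rec`, but check the module documentation for your version). The helper names `query_param` and `query_param_double` are made up for this example, and no percent-decoding is done.

```c
#include <stdlib.h>
#include <string.h>

/* Find `key` in a query string such as "a=1&b=2" and copy its raw value
 * into `out`. Returns 1 on success, 0 if the key is missing or the value
 * does not fit in `cap` bytes. Hypothetical helper; no percent-decoding. */
static int query_param(const char *qs, const char *key, char *out, size_t cap)
{
    size_t klen = strlen(key);

    while (qs && *qs) {
        const char *end = strchr(qs, '&');
        size_t plen = end ? (size_t)(end - qs) : strlen(qs);

        /* A match is "key=" followed by the value, within this pair only. */
        if (plen > klen && strncmp(qs, key, klen) == 0 && qs[klen] == '=') {
            size_t vlen = plen - klen - 1;
            if (vlen >= cap)
                return 0;
            memcpy(out, qs + klen + 1, vlen);
            out[vlen] = '\0';
            return 1;
        }
        qs = end ? end + 1 : NULL;   /* move on to the next pair */
    }
    return 0;
}

/* Read a parameter as a double. Returns 0 if the parameter is missing,
 * empty, or followed by characters strtod() could not consume. */
static int query_param_double(const char *qs, const char *key, double *value)
{
    char buf[64];
    char *rest;

    if (!query_param(qs, key, buf, sizeof buf) || buf[0] == '\0')
        return 0;
    *value = strtod(buf, &rest);
    return *rest == '\0';
}
```

With the PHP version, `$_GET['geolatLowBound']` does all of this for you. In C you would call something like `query_param_double(qs, "geolatLowBound", &lat_low)` for each bound and reject the request when it returns 0.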
Does Facebook's HipHop have these libraries? http://github.com/facebook/hiphop-php
TeddyB
Compiling PHP to C would be pointless if you didn't get the PHP libs... Most of the PHP libs are written in C to begin with anyway.
Byron Whitlock
@Byron, exactly - which is why I'm now wondering if HipHop does all of this for me
TeddyB
The MySQL C API is documented here: http://dev.mysql.com/doc/refman/5.0/en/c.html
caf
To connect to MySQL from C, you can use the C connector (http://dev.mysql.com/doc/refman/5.0/en/connector-c-building.html). Getting the parameters from the URL should be covered in the Apache module documentation. Writing JSON is so easy that you can write your own function to do it. As Byron says: it isn't easy.
edwin
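As a rough illustration of the "write your own function" part, the sketch below escapes one C string as a JSON string literal. It is only a sketch: the name `json_escape` is made up here, the input is assumed to be valid UTF-8 (bytes of 0x80 and above are copied through unchanged), and it does not apply the HTML escaping that `cleanOutputFromXSS` does in the PHP version.

```c
#include <stdio.h>
#include <string.h>

/* Write `s` as a JSON string literal (including the surrounding quotes)
 * into `out`, which has room for `cap` bytes. Returns the number of bytes
 * written, not counting the terminating NUL, or -1 if `out` is too small.
 * Hypothetical helper, not part of any library. */
static int json_escape(const char *s, char *out, size_t cap)
{
    size_t n = 0;

    if (cap < 3)                 /* need room for two quotes and the NUL */
        return -1;
    out[n++] = '"';

    for (; *s; s++) {
        unsigned char c = (unsigned char)*s;
        char buf[7];             /* large enough for "\u001f" plus NUL */
        const char *rep;
        size_t len;

        switch (c) {
        case '"':  rep = "\\\""; break;
        case '\\': rep = "\\\\"; break;
        case '\n': rep = "\\n";  break;
        case '\r': rep = "\\r";  break;
        case '\t': rep = "\\t";  break;
        default:
            if (c < 0x20) {      /* other control characters become \u00XX */
                snprintf(buf, sizeof buf, "\\u%04x", (unsigned)c);
            } else {
                buf[0] = (char)c;
                buf[1] = '\0';
            }
            rep = buf;
        }

        len = strlen(rep);
        if (n + len + 2 > cap)   /* keep room for the closing quote and NUL */
            return -1;
        memcpy(out + n, rep, len);
        n += len;
    }

    out[n++] = '"';
    out[n] = '\0';
    return (int)n;
}
```

Building the full `{"homes": [...]}` response then means looping over the result rows and emitting keys, commas, and brackets around calls like this one, which is the work `json_encode` does in a single line of PHP.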
No, writing your own Apache module is not easy (compared to writing PHP). About HipHop: don't overestimate the improvement over cached PHP. A note: you can make the PHP code faster (in C), but then you have another bottleneck: MySQL itself.
edwin
+1  A: 

There are better solutions than rewriting this in C. Memcached, adding more memory, and tuning PHP all come to mind.

You need to profile your app to see how much memory goes to the PHP interpreter, how much to compressing the output, and how much to pulling the whole SQL result set into memory.

Remember that little site called Facebook? They use PHP and get way more traffic than your API. Keep that in mind.

Also think about the maintenance and stability hit you will take compiling this. Any change will now take orders of magnitude longer, and you have to take down the server to deploy. Maybe not an issue, but something to ponder.

I'd bet there are better ways to optimize than what you are thinking. Profiling is the key.

Byron Whitlock
Facebook also only uses PHP for DISPLAY. Not for their APIs; for those they use C and Erlang. Also, for all of their PHP used for display, they compile it down to C using a program called HipHop http://developers.facebook.com/blog/post/358
TeddyB
True, but without profiling there is no way to tell where your memory/cpu bottleneck is. I'd be interested to hear what you end up doing!
Byron Whitlock
My MySQL instance is on a different server than my PHP JSON API. I've profiled and PHP is getting insane on memory AND this JSON API *is the only place I use PHP on my web application*
TeddyB
If the JSON API is the only place you use PHP, why did you use PHP in the first place? o.0
Amber
Because it was dead simple to write the code above. Seriously, in what other language could I have accomplished this specific task in fewer lines of code?
TeddyB