views:

131

answers:

5

I'm working on a shortest path a* algorithm in java with a mysql db. I'm executing the following SQL Query approx 300 times in the program to find route connections from a database of 10,000 bus connections. It takes approx 6-7 seconds to execute the query 300 times. Any suggestions on how I can speed this up or any ideas on a different method i can use ? Thanks

private HashMap<Coordinate,Node> closedNodes;
private PriorityQueue<Node> openNodes;

..
private List<Coordinate> calculatePath() 
{
    //While there are nodes in the open list
    while (!openNodes.isEmpty()) 
    {
        //Get the node with the lowest gVal+hVal
        Node node = openNodes.poll();
        //Add it to the closed list
        closedNodes.put(node);
        //If it is not the goal node
        if (!node.equals(goal)) 
        {   
            //Get all the neighbours and Create neighbour node
            List<Node> neighbours = helper.getNeighbours(node, goal);
            //For each neighbour
            for (Node neighbourNode : neighbours) 
            {
                //Check if the neighbour is in the list of open nodes
                boolean isInOpen = checkOpenNodes(neighbourNode);
                //If it is not in the open nodes and not in the closed nodes
                if ((!closedNodes.containsKey(neighbourNode))&& (!isInOpen)) 
                {
                    //Add it to the list of open nodes
                    openNodes.add(neighbourNode);
                }
            }
        }
        else 
        {
            // We found the path
            path = backTrackPath(node);
            break;              
        }
    }
    return path;

/**
 * Gets the list of valid Nodes that are possible to travel to from <b>Node</b>
 * @param stopNode Node to find neighbours for
 * @param goal End Node
 * @return list of neighbour Nodes
 */
public ArrayList<Node> getNeighbours(Node stopNode, Node goal) 
{
    ArrayList<Node> neighbours = new ArrayList<Node>(); 
    Node neighbourNode;     
    //get neighbours connected to stop  
        try {
            ResultSet rs = stmt.executeQuery("select To_Station_id, To_Station_routeID, To_Station_stopID," +
                    "To_Station_lat, To_Station_lng, Time from connections  where Connections.From_Station_stopID ="
                    +stopNode.getCoord().getStopID()+" ORDER BY Connections.Time");

            rs = stmt.getResultSet();
            while (rs.next()) {
                int id = rs.getInt("To_Station_id");
                String routeID = rs.getString("To_Station_routeID");
                String stopID = rs.getString("To_Station_stopID");
                String stopName = rs.getString("To_Station_stopName");
                Double lat = rs.getDouble("To_Station_lat");
                Double lng = rs.getDouble("To_Station_lng");
                int time = rs.getInt("Time");
                neighbourNode = new Node(id, routeID, stopID, stopName, lat, lng);
                neighbourNode.prev = stopNode;
                neighbourNode.gVal = stopNode.gVal + time;
                neighbourNode.hVal = heuristic.calculateHeuristic(neighbourNode, goal);
                neighbours.add(neighbourNode);
            }
        }
    catch (SQLException e) {
        e.printStackTrace();
    }
    return neighbours;
}
+1  A: 

In general, if your query is slow and expensive, try caching the results somewhere, so on the next lookup it will be retrieved quickly from cache. So you would (expensively) compute connection between point A and B, store the entire result set in other (temporary=cache) table in the database with a defined lifetime, so for the next X hours/days (or until the routes change) you can retrieve route from A to B from this cache table.

Axarydax
thanks i have cached the all the connection data in a hashTable, this prevents me from continually connecting to the db
paddydub
A: 

You can use the IN clause in order to run the query only once - select * from connections where Connections.From_Station_stopID IN (value1, value2, ...).

Cornel Creanga
+1  A: 

As I understand you have a graph with Stations as nodes and Connections as edges.

Try to create some object which will represent that graph (it can be a matrix in the simplest case) and perform your search on that object. Then you will not need to do 300 calls to your database which are very costly from the performance point of view.

Roman
+1  A: 

For a start, you should be using a PreparedStatement, rather than a normal query, and just do stmt.setInt(1, StopId) each time through.

Also, better to select the specific fields you are interested in, rather than select *.

Those are just general JDBC tips, which will probably not have a huge impact on the runtime, but are worth doing.

After that, I would try to investigate the table indexes, to make sure that the query based on From_Station_stopID is indeed executing as fast as it can.

If it is, and the only overhead is the amount of separate calls to the database, the next step might be to try to combine the queries, perhaps by making it a select ... from connections where From_Station_stopID in (..., ..., ...).

Depending on the size of the table, you may simply want to load the whole thing in advance into memory (perhaps as a HashMap), and then you wouldn't need to go to the database on every iteration.

In short, it depends a lot on the different parameters of the problem, and you will need to check to see which solution works best for you.

Avi
+1  A: 
  1. Make sure you have an index on connections.From_Station_stopID
  2. Instead of SELECT *, only select the columns you need
  3. If only the constant in the WHERE clause for From_Station_stopID changes each time, use a parameterized, prepared query so that the database doesn't have to parse the query and build the execution path each time, or combine the queries into one using WHERE From_Station_stopID IN (value1, value2, ...)
  4. If you're repeating the same queries often, ensure that MySQL is using query caching

If you showed us the rest of the code, where it's looping to call the query 300 times we might be able to help further.

In general, I'd say that if you're computing the shortest path each time, instead, you should build a table that works like a grid, with route distances precalculated for each stop, or even entire routes precalculated from each stop.

Marcus Adams