views:

2478

answers:

5

Hi, I am new to programming language and I am using PHP and mysql. I got an assignment to do a hashtables in php. What I need to do is, store items that a user collected and then display it. After do some research over the internet, I will do the following steps when implement the hashtable, please correct me if I am wrong:

  1. Set up the tables:

    -> Users Table: uid(int[5]), username(varchar[128]), item_id(int[8], items_id_hash(int[50])

    -> Items Table: item_id(int[5]), item_name(varchar[128]), items_id_hash(int[50])

  2. Create a hash function (how to create a hash function? Create by myself or get from internet?) to convert a key into a hash value and then insert into database. E.g.: hash item_id = 001 into hash value = (e.g) 12345. Then insert into users table.

  3. To display/search. Retrieve the hash values from the user and then compare it to the items table and display it.

Questions:

  1. Is my steps correct?
  2. Where to find a good php hash functions? Can i use md5 or sha1 or salt?

Thanks

+4  A: 

I think your idea of a hashtable is a little [defunct]. Hashtables break down keys into lists that are alike. For example: hashtable based on first letter of name, so there would be 26 lists. Your hash is the first letter of the name, which then makes it quicker to search through.

md5, sha1 are used to derive hashes that are used to verify that data has not been tampered. they usually come in either 128-bit or 160-bit versions. So it takes X data and sends it through a hash to come up with a 128-bit alphanumeric string that should be the same no matter where it is done. This is usually a security thing.

EDIT: Expanding on Question of how to derive keys.

You can utilize a modulus of the data to create a key to use for the row. In the example data % X where X is the total number of keys you would like to have. The issue with this is that X is difficult to find; if you have 20 items, then making X into 20 is feasible and makes it a quick search as each item has it's own row. But if you have 1000 items, then doing % 1000 is NOT feasible. Doing something like X = 75 would work better for this.

Suroot
Do you mean that my hash table datashould be something like:key : axxxx1 Values : Shirtskey : axxxx2 Values : T-shirtskey : dxxxx1 Values : Clockkey : dxxxx2 Values : Watches
roa3
So let's say that you have 1000 different items, and each item is either a Longsleeve, T-Shirt, Clock, or Watch, You would create a list for each of these types (meaning that the type is the hashtable).
Suroot
what if the items are various, cannot categorized? Is there any good suggestion to make the keys name alike?? I am in various kind of items situation, very hard to categorized. Even I categorized, each category only have 1 to 5 items
roa3
You could do something like a modulus; so take the data % X (where X is the number of keys that you want to have) and that becomes the key to be used for that row of data. The problem with finding X is you're going to have to decide what a good number is.
Suroot
+2  A: 

You have to main problems :

1) The hashtable paradigm you want to choose (open|closed) hash table.

2) The Hashtable can be an simple array with key indexes and a array reference for collisions cases.

3) You have to study your hash key generation algorithm (the $hash = ord($string[$i]) + ($hash << 5) - $hash; can be enough), but you can choose md5/sha too. If you know your key space , maybe you can use unix gperf.

Here is my hash table implementation :

<?php

/**
        A brief but simple closed hash table class.
        Jorge Niedbalski R. <[email protected]>
**/

class   HashTable       {

        public  $HashTable = array();
        public  $HashTableSize;

        public  function        __construct($tablesize) 
        {
                if($tablesize) {
                        $this->HashTableSize = $tablesize;
                } else {
                        print "Unknown file size\n";
                        return -1;
                }
        }

        public  function        __destruct() 
        {
                unset($this->HashTable);
        }

        public  function        generate_bucket($string) 
        {
                for($i=0; $i <= strlen($string); $i++) {
                        $hash = ord($string[$i]) + ($hash << 5) - $hash;
                }
                print "".$this->HashTableSize."\n";
                return($hash%$this->HashTableSize);
        }
    public  function        add($string, $associated_array)
        {
                  $bucket = $this->generate_bucket($string);

                  $tmp_array = array();
                  $tmp_array['string'] = $string;
                  $tmp_array['assoc_array'] = $associated_array;                

                  if(!isset($this->HashTable[$bucket])) {
                                $this->HashTable[$bucket] = $tmp_array;
                  } else {
                        if(is_array($this->HashTable[$bucket])) {
                                array_push($this->HashTable[$bucket], $tmp_array);
                        } else {
                                $tmp = $this->HashTable[$bucket];
                                $this->HashTable[$bucket] = array();
                                array_push($this->HashTable[$bucket], $tmp);
                                array_push($this->HashTable[$bucket], $tmp_array);
                        }
                }

        }

        public  function        delete($string, $attrname, $attrvalue) 
        {       
                $bucket = $this->generate_bucket($string);

                if(is_null($this->HashTable[$bucket])) {
                                return -1;
                } else {
                        if(is_array($this->HashTable[$bucket])) {
                                for($x = 0; $x <= sizeof($this->HashTable[$bucket]); $x++) {
                                        if(($this->HashTable[$bucket][$x]['string'] == $string) && ($this->HashTable[$bucket][$x]['.$attrname.'] == $attrvalue)) {
                                                unset($this->HashTable[$bucket][$x]);   
                                        }
                                }
    } else {
                                unset($this->HashTable[$bucket][$x]);
                        }
                }       
                /** everything is OK **/                        
                return 0;
        }


        public  function        search($string) 
        {
                $resultArray = array();

                $bucket = $this->generate_bucket($string);

                if(is_null($this->HashTable[$bucket])) {
                        return -1;
                } else {
                        if(is_array($this->HashTable[$bucket])) {
                                for($x = 0; $x <= sizeof($this->HashTable[$bucket]); $x++) {
                                        if(strcmp($this->HashTable[$bucket][$x]['string'], $string) == 0) {
                                                array_push($resultArray,$this->HashTable[$bucket][$x]);
                                        }
                                 }
                        } else {
                                array_push($resultArray,$this->HashTable[$bucket]);
                        }
                }

                return($resultArray);
        }
}

        $hash = new HashTable(16);

        $arr = array('nombre' => "jorge niedbalski");

        $hash->add("astroza", $arr);
        $hash->add("astrozas", $arr);



        print_r($hash->search("astroza"));



?>
Jorge Niedbalski R.
what is the output?
roa3
+1  A: 

Do you mean a hash value (which you store in a table), rather than a hash table?

I don't see how you could store that data in a useful way, in a hash table. (Suroots answer explains hash tables).

To create a hash value using MD5 try

hash('md5', 'string to hash');

see http://au.php.net/function.hash for more details

Sophia
+1  A: 

Hi Suroot,

Thanks for enlighten me.

Okay, if i am handling a large amount of data, in that case, can I use first character of item's name + md5 values of items name? For example: item_id = 005 item_name = shirt item_hash = s + md5(shirt)

Would it avoid the collision?

Thanks Jorge for showing the hash table code, but how would you store and retrieve data? I mean, are you store into database table or temp array? And how are you going to retrieve and display the data?

roa3
+1  A: 

roa3

That code isn't using any persistent datastore backend, you can simple extend it to add mysql support.

You have to implement this using a one_to_many relation (bucket, entries) in a relational database.

Think how to extend this base class.

Good luck

Jorge Niedbalski R.