views:

130

answers:

3

Hello SO,

I'm currently programming an OCaml module defining a type corresponding to a CPU register. The interface of this module is the following :

(*
 * Defines a type which represents a R3000 register.
 *)

type t =
|   R0                                      (* Always 0 *)
|   AT                                      (* Assembler temporary *)
|   V0 | V1                                 (* Subroutine return values *)
|   A0 | A1 | A2 | A3                       (* Subroutine arguments *)
|   T0 | T1 | T2 | T3 | T4 | T5 | T6 | T7   (* Temporary registers *)
|   S0 | S1 | S2 | S3 | S4 | S5 | S6 | S7   (* Register variables *)
|   T8 | T9                                 (* Temporary registers *)
|   K0 | K1                                 (* Reserved for kernels *)
|   GP | SP | FP                            (* Global/Stack/Frame pointer *)
|   RA                                      (* Return address *)

(*
 * Conversion from/to [|0, 31|].
 *)
val of_int : int -> t
val to_int : t -> int

(*
 * Conversion to string for display.
 *)
val of_string : string -> t
val to_string : t -> string

However, I would like the implementation to be fast and not too repetitive. For example, I could code the of_int function like this :

let of_int = function
|    0 -> R0
|    1 -> AT
    (* ... *)

But it would be awful and unmaintainable. I do not want to do this as it conflicts with my programming religion. Moreover, I would need to do this kind of dirty code not only one time, but for the four functions.

The first solution I found would be to use a preprocessor (either Camlp4 or cpp) to generate the code I want. I find this to be overkill but would use this method if you can't help me with my second idea.

After a bit of thought, I thought I could do something like this :

type regdescr = {
    reg : t ;
    name : string ;
    index : int
}

let regs =
    let htbl = Hashtbl.create 32 in
    let li = [ (* regdescr defs here *) ] in
    List.iter (Hashtbl.add htbl) li ;
    htbl

However, in this case, I must choose what field I want to hash. Is there another solution than using three different hashtables in this case ? Maybe a data-structure I do not know about is able to hash over three fields and perform searches on the three of them.

Sorry for the long question for which the answer may be trivial :) .

+2  A: 

Just have three separate hash tables?

newacct
Well, I thought about the problem for the last 3 hours and did not found anything else than either the three hashtbls or the preprocessor trick, so I guess I'll go with the hashtables. Thanks for the confirmation that it is not a bad idea though.
Pierre Bourdon
+1  A: 

Instead of using a hashtable for going from one partial representation of a register to another, have you thought of forcing yourself to always manipulate only pointers to complete descriptions, so that you can access any aspect you like (index, string representation, ...) with just a pointer dereference?

You can use the representation (your type regdescr) as the register.

How often do you need to pattern-match a value of type register?

If you never do, you can even do away with the reg field completely.

module Register : 
sig
  type t = private { name : string ; index : int }
  val r0 : t
  val at : t
  val equal : t -> t -> bool
  val hash : t -> int
  val compare : t -> t -> int
end = 
struct
  type t = { name : string ; index : int }

  let r0 = { name = "R0" ; index = 0 }
  let at = { name = "AT" ; index = 1 }

  let equal r1 r2 = r1.index = r2.index
  let hash r1 = Hashtbl.hash (r1.index)
  let compare r1 r2 = Pervasives.compare r1.index r2.index
end

Note: you can make the whole thing more readable by using files register.ml and register.mli to define the Register module.

If you sometimes need pattern-matching, you can keep the constructor field so that it is possible to write nice pattern-matchings:

match r.reg with
  R0 -> ...
| AT -> ...

But force yourself to write only functions that accept (and pass their callees) the full Register.t.

EDIT: For indexing, first write the generic function below:

  let all_registers = [ r0 ; at ]

  let index projection =
    let htbl = Hashtbl.create 32 in
    let f r =
      let key = projection r in
      Hashtbl.add htbl key r
    in
    List.iter f all_registers ;
    Hashtbl.find htbl

Then pass it all the projections you need:

let of_int = index (fun r -> r.index)
let of_name = index (fun r -> r.name)
Pascal Cuoq
However, I don't see how I could implement my `of_int` or `of_string` function using this representation. It would perfectly solve the `to_int` and `to_string` problems though :) .
Pierre Bourdon
Yes, you still need those, I had forgotten about that aspect of the question. Now that I think about it, doesn't `Hashtbl.add` take one more argument than in the sample code you gave?
Pascal Cuoq
I edited my answer with code that, while being untested, at least applies `Hashtbl.add` completely :)
Pascal Cuoq
+3  A: 

Looks like a perfect fit for deriving.

(*
 * Defines a type which represents a R3000 register.
 *)

type t =
|   R0                                      (* Always 0 *)
|   AT                                      (* Assembler temporary *)
|   V0 | V1                                 (* Subroutine return values *)
|   A0 | A1 | A2 | A3                       (* Subroutine arguments *)
|   T0 | T1 | T2 | T3 | T4 | T5 | T6 | T7   (* Temporary registers *)
|   S0 | S1 | S2 | S3 | S4 | S5 | S6 | S7   (* Register variables *)
|   T8 | T9                                 (* Temporary registers *)
|   K0 | K1                                 (* Reserved for kernels *)
|   GP | SP | FP                            (* Global/Stack/Frame pointer *)
|   RA                                      (* Return address *)
deriving (Enum,Show)

let of_int x = Enum.to_enum<t>(x)
let to_int x = Enum.from_enum<t>(x)

let to_string x = Show.show<t>(x)

let pr = Printf.printf

let () =
  pr "%i %i %i\n" (to_int R0) (to_int RA) (to_int T8);
  pr "%s %s %s\n" 
    (to_string (of_int 0)) (to_string (of_int 31)) (to_string (of_int 24));
  pr "%s %s %s\n" 
    (to_string (Enum.pred<t>(A1))) (to_string A1) (to_string (Enum.succ<t>(A1)));
  ()

Output :

0 31 24
R0 RA T8
A0 A1 A2

Compile with :

 ocamlc -pp deriving -I ~/work/contrib/deriving/0.1.1-3.11.1-orig/lib deriving.cma q.ml -o q
ygrek
This is more than interesting ! I didn't know of the existance of this project, will have more than a look at this to see if I can customize it enough to fit my needs. Thanks a lot.
Pierre Bourdon