For my programming class we had a project to create a virtual machine, including a memory unit, CPU, input, output, instruction register, program counter, MAR, MDR, and so on. Now we need to write a compiler in Java that takes a .exe file written in a text editor, converts it to Java byte code, and runs the code. The code we will be writing in the .exe file is machine code along the lines of:

IN X
IN Y
ADD X
STO Y
OUT Y
STOP
DC X 0
DC Y 0

I am just a beginner, I only have 2 days to write this, and I am very lost and have no idea where to start. Any help will be much appreciated. Thanks.

OK, since no one really understands, I will clarify. I am in my first-year programming course and my teacher had us make a virtual machine, which I have done, and I will post the code for the Cpu and Computer classes. My teacher is very unorganized and we have run out of time for the last project, which is the compiler. The code above is just an example of the code that will be turned into byte code. Here is the code for Cpu and Computer in my virtual machine package:

class Cpu{
    private MemEl acc;
    private InstReg ir;
    private ProgCount pc;
    private Input in;
    private OutPut out;
    private MemEl mdr;
    private MemEl mar;
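    public Cpu()
    {
        pc = new ProgCount();
        ir = new InstReg();
        acc = new MemEl();
        // mdr and mar are used by fetch(), decode() and setMDR(), so they must exist too
        mdr = new MemEl();
        mar = new MemEl();
    }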
    public Boolean stop()
    {
        return ir.getOpcode() == 0;
    }
    public int getMAR()
    {
        // Return the memory address register, not the opcode
        return mar.read();
    }
    public int getMDR()
    {
        return mdr.read();
    }
    public void setMDR(int n)
    {
        mdr.write(n);
    }
    public boolean OutFlag()
    {
        return ir.getOpcode() == 8;
    }
    public boolean InFlag()
    {
        return ir.getOpcode() == 7;
    }
    public boolean StoreFlag()
    {
        return ir.getOpcode() == 2;
    }
    public void fetch()
    {
        mar.write(pc.getValue());
        pc.plus();
    }
    public void reset()
    {
        mar.write(0);
        pc.write(0);
        pc.write(1);
    }
    public void fetch2()
    {
        ir.write(mdr.read());
    }
    public void decode()
    {
        mar.write(ir.getOperand());
        mdr.write(acc.read());
    }
    public void execute()
    {

        switch(ir.getOpcode()){
        case 0:
            System.out.println("Complete");
            break;
        case 1:
            acc.write(mdr.read());
            break;
        case 2:
            acc.write(ir.getOperand());
            break;
        case 3:
            acc.write(acc.read() + mdr.read());
            break;
        case 4:
            acc.write(acc.read() - mdr.read());
            break;
        case 5:
            acc.write(acc.read() * mdr.read());
            break;
        case 6:
            acc.write(acc.read() / mdr.read());
            break;
        case 7:
            mar.write(ir.getOperand());
            break;
        case 8:
            System.out.println(getMDR());
            break;
        case 9:
            pc.write(getMDR());
            break;
        case 10:
            if(0 == acc.read())
                pc.write(getMDR());
            else
                fetch();
            break;
        case 11:
            if(0 < acc.read())
                pc.write(getMDR());
            else
                fetch();
            break;
        }

    }
}

Here is my Computer Class

import java.io.*;
class Computer{
    private Cpu cpu;
    private Input in;
    private OutPut out;
    private Memory mem;
    public Computer() throws IOException
    {
        // Assign the fields; declaring new locals here would leave the fields null
        mem = new Memory(100);
        in = new Input();
        out = new OutPut();
        cpu = new Cpu();
        System.out.println(in.getInt());
    }
    public void run() throws IOException
    {
        cpu.reset();
        cpu.setMDR(mem.read(cpu.getMAR()));
        cpu.fetch2();
        while (!cpu.stop())
            {
                cpu.decode();
                if (cpu.OutFlag())
                    OutPut.display(mem.read(cpu.getMAR()));
                if (cpu.InFlag())
                    mem.write(cpu.getMDR(),in.getInt());
                if (cpu.StoreFlag())
                    {
                        mem.write(cpu.getMAR(),in.getInt());
                        cpu.getMDR();
                    }
                else
                    {
                        cpu.setMDR(mem.read(cpu.getMAR()));
                        cpu.execute();
                        cpu.fetch();
                        cpu.setMDR(mem.read(cpu.getMAR()));
                        cpu.fetch2();
                    }
            }
    }
    public void load()
    {
        mem.write(0,799);
        mem.write(1,199);
        mem.write(2,1009);
        mem.write(3,398);
        mem.write(4,298);
        mem.write(5,199);
        mem.write(6,497);
        mem.write(7,299);
        mem.write(8,902);
        mem.write(9,898);
        mem.write(97,0);
        mem.write(98,0);
        mem.write(99,1);
    }

}

The load method is just temporary, to see if the machine works. What it will eventually load is the byte code produced by the compiler.
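Judging from the values in load(), each memory word appears to pack an opcode and an operand as opcode * 100 + operand, so 799 would be opcode 7 (the one InFlag() checks for) with operand address 99. That layout is inferred from the sample data and is not stated anywhere in the assignment. Under that assumption, a minimal sketch of encoding and decoding a word could look like this (the class name InstructionWord is made up for illustration):

```java
// Hypothetical helper; assumes words are encoded as opcode * 100 + operand,
// which is inferred from the sample values in load().
final class InstructionWord {
    private InstructionWord() {}

    // Pack an opcode and a two-digit operand address into one memory word.
    static int encode(int opcode, int operand) {
        if (opcode < 0 || operand < 0 || operand > 99) {
            throw new IllegalArgumentException("opcode must be >= 0 and operand must be 0..99");
        }
        return opcode * 100 + operand;
    }

    // Everything above the lowest two decimal digits is the opcode.
    static int opcodeOf(int word) {
        return word / 100;
    }

    // The lowest two decimal digits are the operand address.
    static int operandOf(int word) {
        return word % 100;
    }
}
```

With this layout, 1009 from load() splits into opcode 10 and operand 9.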

A: 

If you need to convert something into Java bytecode, you have to do the following (at least!):

  1. Learn the Java bytecode specification! (Unless it's mock-up bytecode invented for the class?)
  2. Parse the input ".exe" (not a good extension name, IMHO) using a StringTokenizer or similar class.
  3. Alternatively, use a lexical analyzer to determine what to write for the code.
  4. Format the output using what you have learnt from reading the Java bytecode specification.

And "that's about it" - but it sounds like your project could use a little more time, unless you are extremely experienced in this topic...
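As a rough illustration of the parsing step, here is a small sketch that splits one source line into tokens with java.util.StringTokenizer. The class and method names (LineSplitter, tokens) are made up for this example, and it assumes tokens are separated only by whitespace:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper: splits one source line into whitespace-separated tokens.
final class LineSplitter {
    private LineSplitter() {}

    static List<String> tokens(String line) {
        List<String> result = new ArrayList<String>();
        // With no delimiter argument, StringTokenizer splits on space, tab,
        // newline, carriage return and form feed.
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }
}
```

Calling tokens("DC X 0") gives the three tokens DC, X and 0; the first token is the instruction name and the rest are its operands. Trimming the line and calling String.split("\\s+") would do much the same job.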

Etamar L.
Actually, I'd say that anybody who could have done the previous project, or contributed significantly to it, could do the assignment as described in hours. The CPU simulator is a large class project, and a real assembler would be unreasonable to write that fast, but not this.
David Thornley
Yep. But I want to be polite about it.
Etamar L.
A: 

If I understood you correctly, you should:

  • Create a lexical analyzer. For a given text it produces a sequence of lexemes.
  • Create a syntax analyzer. It produces a syntax tree.
  • Create an interpreter that walks the tree and generates code.
  • Create a VM that runs the generated code.
Alex Stamper
+1  A: 

You're a beginner and you have two days to write a compiler?

Wow. Hope your last name is "Knuth".

You should certainly have read this. One of its links is to a list of Java bytecode instructions.

You'll need to know how your "machine code" file instructions map to that. Does "DC" equate to "double compare"? If so, is it dcmpg (hex 98) or dcmpl (hex 97)? And so on.

And seriously? Good luck.

CPerkins
+2  A: 

This seems pretty ambitious based on your question and the time you have to do it, but I'll try to put you on the right track. Obviously, since it's homework no one will just give you the answer ;)

The code in your example would more accurately be termed "assembly code". Look at it this way; you have to:

  • Read in each line
  • Look at the first "word" (instruction or operator) and equate that to a Java bytecode. Look here.
  • Figure out how many arguments (operands) should be read in for the operator.
  • Make sure the rest of the line contains the appropriate number of operands.
  • Write out the bytecode in the proper order according to the Java spec.
  • Load the bytecode into the VM and run it
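The operand-counting steps above can be sketched roughly as follows. The operand counts in the table are guessed from the sample program in the question (for instance, that DC takes a name and an initial value); the real numbers belong to the instructor's specification. OperandCheck and its method are hypothetical names:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical checker. The operand counts are guessed from the sample
// program in the question, not taken from the instructor's specification.
final class OperandCheck {
    private static final Map<String, Integer> EXPECTED = new HashMap<String, Integer>();
    static {
        EXPECTED.put("IN", 1);
        EXPECTED.put("OUT", 1);
        EXPECTED.put("ADD", 1);
        EXPECTED.put("STO", 1);
        EXPECTED.put("STOP", 0);
        EXPECTED.put("DC", 2); // assumed: variable name and initial value
    }

    private OperandCheck() {}

    // Returns the operands of a valid line, or throws with the line number.
    static String[] operands(String line, int lineNumber) {
        String[] parts = line.trim().split("\\s+");
        Integer expected = EXPECTED.get(parts[0]);
        if (expected == null) {
            throw new IllegalArgumentException(
                    "line " + lineNumber + ": unknown instruction '" + parts[0] + "'");
        }
        int actual = parts.length - 1;
        if (actual != expected) {
            throw new IllegalArgumentException("line " + lineNumber + ": " + parts[0]
                    + " expects " + expected + " operand(s), found " + actual);
        }
        return Arrays.copyOfRange(parts, 1, parts.length);
    }
}
```

Reporting the line number in the exception makes it much easier to find a mistake in the source file.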

The assembly code in your example looks like it has some explicit rules that the instructor probably gave to you. For instance, "ADD X" means add the contents of location X to -- what? Does "IN" mean "input" or "increment"? "STO Y" means store something to location Y -- what? It seems like maybe there's an implicit register that holds results. That should be part of the instructor's specification, too. Good luck! Get hacking!

Rob Heiser
+1  A: 

I don't think it is Java byte code that you are trying to create, right? It is actually byte code for your specific CPU, which greatly simplifies your project. It also looks like it has an accumulator architecture (your Cpu class keeps a single acc register), which is why each instruction takes only a single operand.

Since your input is text, written in the assembly language defined for your custom CPU, you should be able to read and parse it quite easily and write out the binary. Here's some pseudocode that should help.

initialize Map of instruction names (keys) to instruction op codes (ByteCodeInfo);
initialize empty bytecode-operations list;
open input text file;
while (more to read)
{
    read next line;
    split line by spaces;
    lookup ByteCodeInfo in the Map;
    if (num actual operands != num expected operands - from ByteCodeInfo)
        throw exception(parse failed on line ####);
    add new operation to list of operations (each element in the list is an address)
    if there is a variable reference (e.g. "X") add this to a symbol Map;
    if this is a variable declaration (DC...) update the symbol object with the address;
}
close input text file;

open output binary file (the byte-code file);
for each element in operation list
{
    write address, byte-code, operands (if any);
}
close byte-code file;

You will have to keep track of your storage addresses and instruction addresses.
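Here is one way the pseudocode might look in Java, as a sketch only. It makes several assumptions that are not confirmed by the question: the opcode numbers are read off the Cpu class (STOP 0, STO 2, ADD 3, IN 7, OUT 8), each word is laid out as opcode * 100 + address, every source line occupies one memory word, and DC reserves one cell with an initial value. It returns the words in an int array instead of writing a binary file, and TinyAssembler is a made-up name. Two passes are used because the sample program refers to X and Y before the DC lines that define them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical two-pass assembler sketch. Opcode numbers are read off the
// Cpu class in the question; the word layout opcode * 100 + address and the
// meaning of DC (reserve one cell with an initial value) are assumptions.
final class TinyAssembler {
    private static final Map<String, Integer> OPCODES = new HashMap<String, Integer>();
    static {
        OPCODES.put("STOP", 0);
        OPCODES.put("STO", 2);
        OPCODES.put("ADD", 3);
        OPCODES.put("IN", 7);
        OPCODES.put("OUT", 8);
    }

    private TinyAssembler() {}

    static int[] assemble(List<String> source) {
        // Pass 1: drop blank lines and record the address of every DC name.
        List<String[]> lines = new ArrayList<String[]>();
        Map<String, Integer> symbols = new HashMap<String, Integer>();
        for (String raw : source) {
            String trimmed = raw.trim();
            if (trimmed.isEmpty()) continue;
            String[] parts = trimmed.split("\\s+");
            if (parts[0].equals("DC")) {
                if (parts.length != 3) throw new IllegalArgumentException("bad DC: " + trimmed);
                symbols.put(parts[1], lines.size()); // address = position in the output
            }
            lines.add(parts);
        }

        // Pass 2: every symbol is known now, so each line becomes one word.
        int[] words = new int[lines.size()];
        for (int addr = 0; addr < words.length; addr++) {
            String[] parts = lines.get(addr);
            if (parts[0].equals("DC")) {
                words[addr] = Integer.parseInt(parts[2]); // initial value of the cell
                continue;
            }
            Integer opcode = OPCODES.get(parts[0]);
            if (opcode == null) throw new IllegalArgumentException("unknown instruction: " + parts[0]);
            int operand = 0;
            if (parts.length == 2) {
                Integer target = symbols.get(parts[1]);
                if (target == null) throw new IllegalArgumentException("undefined symbol: " + parts[1]);
                operand = target;
            } else if (parts.length != 1) {
                throw new IllegalArgumentException("too many operands: " + String.join(" ", parts));
            }
            words[addr] = opcode * 100 + operand;
        }
        return words;
    }
}
```

For the sample program in the question this yields 706, 707, 306, 207, 807, 0, 0, 0 under the assumptions above. A loop calling mem.write(i, words[i]) could then take the place of the hard-coded values in load().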

This is not an impossible task, so take courage. It is possible to do this in a day or so if you have experience creating the other classes you show.

EDIT: Added a ByteCodeInfo class, which represents information about your byte codes, such as the id, number of operands, expected types of operands, etc. This class could also be used to emit the byte code based on the parsed line information. This would provide a better abstraction than just storing an opcode int in the Map, as I had originally suggested.
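A possible shape for such a class is sketched below. The field names, the emit method, and the opcode * 100 + operand word layout are all assumptions made for illustration; the real layout depends on how your InstReg splits a word into opcode and operand.

```java
// Hypothetical shape for ByteCodeInfo; the field names and the
// opcode * 100 + operand layout are assumptions for illustration.
final class ByteCodeInfo {
    private final String mnemonic;
    private final int opcode;
    private final int operandCount;

    ByteCodeInfo(String mnemonic, int opcode, int operandCount) {
        this.mnemonic = mnemonic;
        this.opcode = opcode;
        this.operandCount = operandCount;
    }

    String mnemonic() {
        return mnemonic;
    }

    int operandCount() {
        return operandCount;
    }

    // Emit the machine word for this instruction from already resolved
    // operand addresses; rejects a wrong number of operands.
    int emit(int... operands) {
        if (operands.length != operandCount) {
            throw new IllegalArgumentException(mnemonic + " expects " + operandCount
                    + " operand(s), got " + operands.length);
        }
        int operand = operandCount == 0 ? 0 : operands[0];
        return opcode * 100 + operand;
    }
}
```

With this sketch, new ByteCodeInfo("ADD", 3, 1).emit(98) produces 398, the same value the question's load() method stores at address 3.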

Kevin Brock