Do the agents store their state on the device itself?
You can store data on or off device. Both are possible and both are done. The problem with an agent storing (cached) state information about a remote device is that the management system never really knows if the (cached) data in the agent is acceptably up to date. If you cannot count on it, you'll need to use the manager to trigger a synchronization or to poll the state of the remote device and/or the communication link between the agent and the remote device. Once you get into that game, it is often better just to put a subagent on the remote device, and use standard SNMP protocols to get the information.
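To make the staleness problem concrete, here is a minimal sketch of a proxy agent's cache that goes back to the remote device whenever an entry is older than a configured limit. Everything in it (the `ProxyCache` class, the `fetch` callback, the `max_age` parameter) is hypothetical and made up for illustration; it is not taken from any SNMP toolkit.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CachedValue:
    value: str
    fetched_at: float  # when the proxy agent last read the remote device


class ProxyCache:
    """Hypothetical proxy-agent cache that refuses to serve stale data."""

    def __init__(self, fetch: Callable[[str], str], max_age: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # assumed callback: reads one object from the remote device
        self._max_age = max_age  # seconds a cached value is considered acceptable
        self._clock = clock      # injectable so the logic can be tested without sleeping
        self._cache: Dict[str, CachedValue] = {}

    def get(self, name: str) -> str:
        now = self._clock()
        entry = self._cache.get(name)
        if entry is None or now - entry.fetched_at > self._max_age:
            # Cache miss or stale entry: go back to the remote device.
            entry = CachedValue(self._fetch(name), now)
            self._cache[name] = entry
        return entry.value
```

Note that this only bounds how old the data can be. The manager still cannot tell whether the device changed inside the `max_age` window, which is the argument for putting a subagent on the device.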
If there is a trap set on an agent, can you do a poll on the same OID to get the same information?
Most well-designed MIBs actually put the changed MIB object right into the trap. That way, your SNMP Manager does not have to poll the agent just to be sure.
Having said that, the trap on the Entity-MIB does not have any state variables. However, that MIB is used to describe physical inventory such as shelves, cards, and ports, and the trap is thrown only if the physical configuration changes. In this case you are expected to have your SNMP Manager walk the Entity-MIB again to get the full new physical configuration.
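The two cases above (a trap that carries the changed objects versus a bare "something changed" trap) can be sketched on the manager side roughly as follows. This is a simplified model, not a real trap receiver: `handle_trap`, the `(name, value)` varbind pairs, and the `resync` callback are all assumptions for illustration, and the housekeeping varbinds a real notification carries are left out.

```python
from typing import Callable, Dict, List, Tuple

VarBind = Tuple[str, str]  # simplified (object name, value) pair


def handle_trap(varbinds: List[VarBind],
                state: Dict[str, str],
                resync: Callable[[], Dict[str, str]]) -> str:
    """Update the manager's view of an agent after a trap arrives.

    If the trap carries the changed objects, apply them directly.
    If it carries none, fall back to re-reading everything via `resync`
    (for example, a fresh walk of the relevant MIB subtree).
    """
    if varbinds:
        state.update(varbinds)  # the trap already told us what changed
        return "applied"
    state.clear()               # bare trap: the old view may be wrong anywhere
    state.update(resync())
    return "resynced"
```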
Without using a MIB file, is there a way to query a device for all of its information at once?
Yes. Roll your own custom MIB and put whatever you want in it. You could put your entire device configuration into one MIB object. The downside is that you'll have to write a parser on your SNMP Manager to pull the structure apart, and whenever the structure changes you'll need to work out what the difference between the current value and the previous value means. In other words, you'll re-invent part of an SNMP MIB. However, for very small MIBs, this might be worth doing.
You are probably better off using SNMP GET-BULK (available in SNMPv2c and later), or just doing a MIB walk by calling SNMP GET-NEXT repeatedly until no more objects are returned.
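A MIB walk needs no MIB file because it relies only on the ordering of OIDs. Here is a minimal sketch of the loop, run against a toy in-memory agent rather than a real device. `ToyAgent` and `walk` are hypothetical names, and no SNMP library or network I/O is involved; the point is only the termination logic.

```python
from typing import Dict, Iterator, Optional, Tuple

Oid = Tuple[int, ...]


class ToyAgent:
    """In-memory stand-in for an agent; it only models GET-NEXT ordering."""

    def __init__(self, objects: Dict[Oid, str]) -> None:
        self._objects = objects
        # Python compares tuples element by element, which matches
        # the lexicographic ordering of OIDs.
        self._sorted = sorted(objects)

    def get_next(self, oid: Oid) -> Optional[Tuple[Oid, str]]:
        # Return the first object whose OID is strictly greater than `oid`,
        # or None when there is nothing left (end of the MIB view).
        for candidate in self._sorted:
            if candidate > oid:
                return candidate, self._objects[candidate]
        return None


def walk(agent: ToyAgent, root: Oid) -> Iterator[Tuple[Oid, str]]:
    """Yield every object under `root` by repeated GET-NEXT calls."""
    current = root
    while True:
        result = agent.get_next(current)
        if result is None:
            return  # the agent has no more objects
        oid, value = result
        if oid[:len(root)] != root:
            return  # the agent answered, but we have left the subtree
        yield oid, value
        current = oid
```

A walk rooted at `(1, 3, 6, 1, 2, 1, 1)` returns only the objects under that prefix; the manager never needs to know their names in advance.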
If not, and you're writing your own customized manager, do you have to know the structure of what it reports up front?
If you want to keep your "customized manager" simple, you'll have to know the structure up front. If you want flexibility, you'll need a structure-description language in which to encode your structure, and your manager will need to be able to decode agent data in that format to populate the manager, and to encode manager data in that format to send it to the agent(s). In other words, you'll re-invent SNMP/SMI, CMIP/CMISE, CIM, and a host of other management systems and protocols that have already been deployed.
If you're setting up an agent to report, is there usually a way to control the frequency of how often it sends a trap? Or does it usually send a trap as often as some condition is satisfied?
This is a good question, because you often get a trap storm congesting your network precisely when you need your network the most. That makes it hard to predict how much network capacity to provision.
Use traps judiciously. For example, the Entity-MIB has only one trap, and that one is worth using because it reports on physical structure changes. The Interfaces-MIB has potentially many traps per port. For this MIB, it is best to enable traps only for interfaces bound to a physical port, and not for interfaces stacked on top of lower-layer interfaces. For a large network, it is often best to use a combination of polling plus traps for physical equipment and physical interfaces. That way, you can predict how much of your network will be used for management traffic, whether during normal operation or during a network disaster.
Some standard MIBs specify how often or when you can throw a trap. If you are OK with that, then use it. You can always roll your own Enterprise MIB with configuration MIB objects that let your manager throttle particular traps.
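If you do roll your own throttle, the agent-side logic can be as small as the sketch below: at most one trap of each kind per configured interval, with a count of what was suppressed. `TrapThrottle`, `min_interval`, and `should_send` are hypothetical names for illustration; in a real agent `min_interval` would be backed by a writable object in your Enterprise MIB.

```python
import time
from typing import Callable, Dict


class TrapThrottle:
    """Allow at most one trap of each kind per `min_interval` seconds."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_interval = min_interval  # assumed to be settable by the manager
        self._clock = clock               # injectable so tests need not sleep
        self._last_sent: Dict[str, float] = {}
        self.suppressed: Dict[str, int] = {}  # traps dropped, per trap name

    def should_send(self, trap_name: str) -> bool:
        now = self._clock()
        last = self._last_sent.get(trap_name)
        if last is not None and now - last < self.min_interval:
            # Too soon after the previous trap of this kind: drop and count it.
            self.suppressed[trap_name] = self.suppressed.get(trap_name, 0) + 1
            return False
        self._last_sent[trap_name] = now
        return True
```

Exposing the suppressed count as a readable MIB object is one way to let the manager notice, on its next poll, that it missed something during a storm.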