Why do you need to eliminate the low-level locking? Do you have deadlock issues? Do you have performance problems? Or scaling issues? Are the locks generally contended or uncontended?
What environment are you using? The answers in C++ will be different to the ones in Java, for example. E.g. uncontended synchronization blocks in Java 6 are actually relatively cheap in performance terms, so simply upgrading your JRE might get you past whatever problem you are trying to solve. There might be similar performance boosts available in C++ by switching to a different compiler or locking library.
In general, there are several strategies that allow you to reduce the number of mutexes you acquire.
First, anything only ever accessed from a single thread doesn't need a mutex.
Second, anything immutable is safe provided it is 'safely published' (i.e. created in such a way that a partially constructed object is never visible to another thread).
Third, most platforms now support atomic writes - which can help when a single primitive type (including a pointer) is all that needs protecting. These work very similarly to optimistic locking in a database. You can also use atomic writes to create lock-free algorithms to replace more complex types, including Map implementations. However, unless you are very, very good, you are much better off borrowing somebody else's debugged implementation (the java.util.concurrent package contains lots of good examples) - it is notoriously easy to accidentally introduce bugs when writing your own algorithms.
Fourth, widening the scope of the mutex can help - either simply holding open a mutex for longer, rather than constantly locking and unlocking it, or taking a lock on a 'larger' item - the object rather than one of its properties, for example. However, this has to be done extremely carefully; you can easily introduce problems this way.