There are some different types of overhead that can occur when using a DataSet over a DataReader:
A DatSet contains DataTable objects, which contains DataRow object, that contain the data. There is a small overhead creating all the objects. Each DataRow treats all it's values as objects, so any value types are boxed which adds a bit of overhead for each field.
When you use a DataAdapter to populate a DataSet, it's easy to get a lot of data that you won't use. If you don't specify what fields you want, you get all the fields even if you won't use them all. If you don't filter the query, you get all the rows from the table. Even if you filter them later with a DataView on the DataTable, you still have fetched them from the database. With a DataReader you are closer to query that gets the data, so the connection to what you get in the result is more obvious.
If you fetch data into several DataTable objects in a DataSet and use relations to let the DataSet combine the data, you make the DataSet do work that you could have let the database do, which is more optimised for it.
If you use a DataSet well, the overhead is not that bad, rather 30% than 1000%.
You are correct to assume that a DataAdapter uses a DataReader. If you are careful how you use the DataAdapter, the database operations itself is the same as if you use the DataReader yourself.
A DataReader will fetch a record at a time from the underlying database driver, which in turn will fetch a buffer full of records at a time from the database. If the records are very large only one at a time might fit in the buffer, but usually there are tens of records in the buffer or even hundreds if they are really small.