With recv() on a nonblocking socket, the system gives you whatever bytes are currently available when you make the call (up to the size of the buffer you provide).
So if your sender sends 4K, for example, this may be split into smaller packets on the network, and if your call to recv() happens when only the first packet has arrived, you will get only that (the rest of the data will arrive milliseconds or seconds later, depending on the latency of the network). Subsequent calls to recv() will then get none, some or all of the remaining data (depending on network timing, buffering, etc.).
Also (with nonblocking IO) recv() can return -1 with errno set to EAGAIN (or EWOULDBLOCK), which means no data is available at this point. In that case you usually call select() to wait until something is available and then call recv() again.
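The EAGAIN/select() pattern described above could look roughly like this on a POSIX system. This is only a sketch: set_nonblocking() and recv_when_ready() are hypothetical helper names, and the return codes (-1 for error, -2 for timeout) are an arbitrary convention chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch a socket to nonblocking mode. */
int set_nonblocking(int skt)
{
    int flags = fcntl(skt, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(skt, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: try recv(); if nothing is there yet, wait in
 * select() for up to timeout_sec seconds and try again.
 * Returns bytes read (> 0), 0 if the peer closed the connection,
 * -1 on error, -2 on timeout. */
ssize_t recv_when_ready(int skt, void *buf, size_t len, int timeout_sec)
{
    assert(buf != NULL && len > 0);
    for (;;) {
        ssize_t n = recv(skt, buf, len, 0);
        if (n >= 0)
            return n;                  /* data, or 0 = orderly close */
        if (errno == EINTR)
            continue;                  /* interrupted, just retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                 /* a real error */

        /* No data available right now: wait until the socket is readable. */
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(skt, &readfds);
        struct timeval tv;
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        int rc = select(skt + 1, &readfds, NULL, NULL, &tv);
        if (rc == 0)
            return -2;                 /* timed out */
        if (rc < 0 && errno != EINTR)
            return -1;
    }
}
```

Note that a successful return still only means "some bytes arrived", not "the whole message arrived"; the caller has to keep calling until its own protocol says the message is complete.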
What may also help is this call:
int err, avail= 0;
err= soioctl(skt, FIONREAD, (char*)&avail);
This call checks upfront how many bytes are available for read, so when you subsequently call recv() you will be served at least this number of bytes (assuming the buffer you provide to recv() is large enough).
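If your platform does not have soioctl(), the same query is commonly available through plain ioctl(). A minimal sketch, assuming a Linux or BSD style system where FIONREAD is defined in <sys/ioctl.h>; bytes_available() is a hypothetical helper name:

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Hypothetical helper: number of bytes that can be read right now,
 * or -1 if the ioctl() call fails. */
int bytes_available(int skt)
{
    int avail = 0;
    assert(skt >= 0);
    if (ioctl(skt, FIONREAD, &avail) < 0)
        return -1;
    return avail;
}
```

The value is only a snapshot: more data may arrive between this call and the following recv().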
Regarding your question on how to stop:
Usually with data over an IP stream, there is a way to tell if the message is complete. E.g. in many RFC-defined protocols (such as talking to a mail server or an HTTP server) commands are terminated with \n (normally as part of \r\n), so if you receive data and there's no \n in it, you continue to read because there's supposed to be more.
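For a \n-terminated protocol, one way to do this is to keep a small buffer per connection, append whatever recv() delivers, and hand out a line only once the terminator has shown up. The sketch below is an illustration, not a reference implementation: line_buf, next_line(), the 1024-byte limit and the LINE_* codes are all made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

enum { LINE_INCOMPLETE = -1, LINE_CLOSED = -2, LINE_ERROR = -3 };

/* Per-connection buffer holding bytes received but not yet consumed. */
typedef struct {
    char   data[1024];   /* assumed maximum line length */
    size_t used;
} line_buf;

/* If the buffer holds a complete line, copy it (without the '\n') into
 * out as a C string and return its length; otherwise LINE_INCOMPLETE. */
static int take_line(line_buf *lb, char *out, size_t out_cap)
{
    char *nl = memchr(lb->data, '\n', lb->used);
    if (nl == NULL)
        return LINE_INCOMPLETE;
    size_t len = (size_t)(nl - lb->data);
    if (len >= out_cap)
        return LINE_ERROR;
    memcpy(out, lb->data, len);
    out[len] = '\0';
    lb->used -= len + 1;                     /* drop the line and its '\n' */
    memmove(lb->data, nl + 1, lb->used);     /* keep what came after it */
    return (int)len;
}

/* Return the next line if one is complete, doing at most one recv(). */
int next_line(int skt, line_buf *lb, char *out, size_t out_cap)
{
    assert(lb != NULL && out != NULL);
    int rc = take_line(lb, out, out_cap);
    if (rc != LINE_INCOMPLETE)
        return rc;
    if (lb->used == sizeof lb->data)
        return LINE_ERROR;                   /* line longer than the buffer */

    ssize_t n = recv(skt, lb->data + lb->used, sizeof lb->data - lb->used, 0);
    if (n == 0)
        return LINE_CLOSED;                  /* peer closed the connection */
    if (n < 0) {
        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
            return LINE_INCOMPLETE;          /* nothing new yet, try later */
        return LINE_ERROR;
    }
    lb->used += (size_t)n;
    return take_line(lb, out, out_cap);
}
```

On LINE_INCOMPLETE the caller would go back to select() and call next_line() again once the socket is readable. If the protocol uses \r\n, the returned line still ends in \r, which the caller has to strip.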
In other protocols the beginning of the data tells you how much data will follow. HTTP headers can carry a size field (Content-Length), and when making a connection over a SOCKS proxy, the request and reply packets have defined lengths. In such cases you can loop until you have that amount of data (or until the connection fails).
But that is really a matter of defining the protocol on the stream, i.e. the sender and receiver must agree on how data is sent. An IMAP server reads commands up to \n and processes them, a print server might read all data until the connection is terminated (recv() returning 0 when the peer closes, or failing with an rc<0), and another protocol may just use the first two bytes as a length field to tell how much data will follow.
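As an illustration of the length-field variant, here is a sketch that loops until a complete message has arrived. The framing is an assumption made up for the example (a 2-byte length in network byte order followed by the payload), and recv_exact() and recv_message() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Loop until exactly len bytes have been read.
 * Returns 0 on success, -1 if the connection closes or fails first. */
static int recv_exact(int skt, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = recv(skt, p, len, 0);
        if (n == 0)
            return -1;                       /* peer closed mid-message */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                /* Nonblocking socket with no data yet: wait for more. */
                fd_set rd;
                FD_ZERO(&rd);
                FD_SET(skt, &rd);
                select(skt + 1, &rd, NULL, NULL, NULL);
                continue;
            }
            return -1;
        }
        p   += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Assumed framing: 2-byte big-endian length, then that many payload bytes.
 * Returns the payload length, or -1 on error or if it exceeds cap. */
int recv_message(int skt, unsigned char *payload, size_t cap)
{
    unsigned char hdr[2];
    assert(payload != NULL);
    if (recv_exact(skt, hdr, sizeof hdr) != 0)
        return -1;
    size_t len = ((size_t)hdr[0] << 8) | hdr[1];
    if (len > cap)
        return -1;                           /* would overflow the buffer */
    if (recv_exact(skt, payload, len) != 0)
        return -1;
    return (int)len;
}
```

A real server would normally not sit inside recv_exact() for one connection but keep per-connection state and return to its select() loop; the blocking wait here just keeps the example short.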