When you plot some resource use (memory, time, disk space, network bandwidth) against concurrent users, you get a function that describes how the application works at different scale factors.
Small-scale -- a few users -- uses a few resources.
Large-scale -- a large number of users -- uses a large number of resources.
The critical question is "how close to linear is the scaling?" If it scales linearly, then serving 2,000 concurrent users costs 2 times as much as serving 1,000 users and 4 times as much as serving 500 users. This is a tool/framework/language/platform/os that scales well. It's predictable, and the prediction is linear.
If it does not scale linearly, then serving 4,000 users costs 1,000 times as much as serving 2,000 users which cost 100 times serving 500 users. This did not scale well. Something went wrong as usage went up; it does not appear predictable and it is not linear.