Big O refers to the worst case run-time order. It is used to show how well an algorithm scales based on the size of the data set (n->number of items).
Since we are only concerned with the order, constant multipliers are ignored, and any terms which increase less quickly than the dominant term are also removed. Some examples:
A single operation or set of operations is O(1), since it takes some constant time (does not vary based on data set size).
A loop is O(n). Each element in the data set is looped over.
A nested loop is O(n^2). A nested nested loop is O(n^3), and onward.
Things like binary tree searching are log(n), which is more difficult to show, but at every level in the tree, the possible number of solutions is halved, so the number of levels is log(n) (provided the tree is balanced).
Something like finding the sum of a set of numbers that is closest to a given value is O(n!), since the sum of each subset needs to be calculated. This is very bad.