>>14261747
Simple. Because we're shrinking the information space.
Remember, P(A|B) means we KNOW B happened, so we don't have to worry about any outcome outside of B. We're not subtracting P(B) either, because we aren't asking for the probability that B doesn't happen. We're saying we know B, so we shrink the space of possibilities down to B, and P(A AND B) takes up a larger proportion of what's left. That's why we divide by P(B).
Think of it like a pie chart that's evenly split into Red, Blue, and Purple (Red/Blue). If we're given that we're already in a region with red in it (Red or Purple), then the probability of blue is P(R AND B)/P(R) = (1/3)/(2/3) = 1/2. If we aren't given that information, then our probability of being in a region with blue is 2/3, since two of the three regions have blue in them.
It makes sense if you visualize being given information as shrinking the space of the problem, and then looking at the relevant area inside what's left.
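
If you want to check the pie chart numbers yourself, here's a rough Python sketch. The `slices` dict and the `prob` helper are just names I made up for illustration, and it assumes the three slices are equally likely.

```python
from fractions import Fraction

# Three equal slices; each slice lists the colors it contains.
# Purple counts as containing both red and blue.
slices = {
    "red": {"red"},
    "blue": {"blue"},
    "purple": {"red", "blue"},
}
weight = Fraction(1, len(slices))  # assumption: every slice is equally likely


def prob(*colors):
    """Probability of landing in a slice that contains every given color."""
    return sum(weight for contents in slices.values() if set(colors) <= contents)


p_blue = prob("blue")                 # no information given
p_red = prob("red")                   # size of the shrunken space
p_red_and_blue = prob("red", "blue")  # only the purple slice
p_blue_given_red = p_red_and_blue / p_red

print(p_blue, p_blue_given_red)  # prints: 2/3 1/2
```

Dividing by `p_red` is the "shrinking" step: it rescales the purple slice from 1/3 of the whole pie to 1/2 of the red part.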