That's an old(ish) version; they've probably done a bit of optimization since then. For example, with the (speed 3) optimization setting and a declaration telling the compiler that sum is a fixnum, I can get it down to under 2 billion cycles. See my answer here: http://stackoverflow.com/a/18065714/2423072
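Roughly what I mean, as a minimal sketch rather than the exact code from the linked answer. The function name sum-to is made up for illustration, and I'm assuming a 64-bit SBCL, where both the bound and the running total fit in a fixnum:

```lisp
;; Sketch only: SUM-TO is a hypothetical name, not taken from the linked answer.
;; Assumes a 64-bit build; on a 32-bit build 1000000000 is not a fixnum,
;; so the declarations below would be violated.
(defun sum-to (n)
  (declare (optimize (speed 3))
           (type fixnum n))
  (let ((sum 0))
    ;; Promise the compiler that the running total stays a fixnum,
    ;; so it can use machine arithmetic instead of generic addition.
    (declare (type fixnum sum))
    (loop :for x :of-type fixnum :from 1 :to n
          :do (incf sum x))
    sum))

;; Same measurement as in the question, now on the declared version.
(time (sum-to 1000000000))
```

The total is n(n+1)/2 = 500000000500000000 for n = 10^9, which is well inside the 64-bit fixnum range, so the fixnum declaration is safe for this particular bound.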
The consing sounds like it's allocating bignums. My guess is that you're using a 32-bit build of SBCL. In that case, fixnums only go up to roughly 2^29, and arithmetic with larger numbers will allocate memory. Can you check?
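One way to check from the REPL, as a rough sketch. Everything here is standard Common Lisp except the helper expected-total, which is a name I made up for illustration:

```lisp
;; Report the implementation, its version, and the machine type.
(format t "~a ~a on ~a~%"
        (lisp-implementation-type)
        (lisp-implementation-version)
        (machine-type))

;; Show the largest fixnum and how many bits it needs.
;; As far as I know, a 32-bit SBCL reports 536870911 (29 bits) here.
(format t "most-positive-fixnum = ~d (~d bits)~%"
        most-positive-fixnum
        (integer-length most-positive-fixnum))

;; EXPECTED-TOTAL is a hypothetical helper: closed form n(n+1)/2 of the sum 1..n.
(defun expected-total (n)
  (/ (* n (1+ n)) 2))

;; If this prints NIL, the loop's running total must become a bignum at some point.
(let ((total (expected-total 1000000000)))
  (format t "total = ~d, fits in a fixnum: ~a~%"
          total (typep total 'fixnum)))
```

If the last line reports that the total does not fit in a fixnum, every addition past that limit has to allocate, which would explain the billions of bytes consed.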
(time (let ((sum 0)) (loop :for x :from 1 :to 1000000000 :do (incf sum x)) sum))
took about 3 seconds from his REPL with SBCL, with about 8.5 billion CPU cycles and 0 bytes consed.
Does anyone know why the same code on my version of SBCL (1.0.55.0-abb03f9) on a Mac took 156 billion cycles and consed 24 billion bytes?