You could use sum, but that will eat up a lot of RAM because of the laziness.
EDIT: For the fun of it, I decided to do the same in a slightly more esoteric language, so here's a Prolog version (given that your stack is big enough)
rangesum(0,0).
rangesum(N,X) :- M is N - 1, rangesum(M,Y), X is Y + N.
?- rangesum(1000000000, X), write(X).
Pass GHC the -O2 option to turn on optimizations. You need the strictness analyzer to run so that it can determine that sum is strict; otherwise you get a space leak.
GHC's strictness analyzer will recognize that sum is strict for Integer and optimize the laziness away. sum only causes a space leak when optimizations are turned off.
"The key in this case is using C99's long long data type. It provides the biggest primitive storage C can manage (128-bits on 32-bit machines and 256-bits on a 64-bit machine) and it runs really, really fast."
Isn't long long "typically" 64-bit? (I know the C standard doesn't mandate exact sizes, only minimums.)
What platform does this long long type really give you the full 128 or 256 bits on?
And do 64-bit CPUs indeed support 256-bit integer types? If so, what can I do to play with them? C does not provide them for me on Linux!
If you want a guaranteed 64-bit type, put in your code:
#include <stdint.h>
then, use uint64_t for unsigned and int64_t for signed. If you want 128 bits, in gcc you can use __uint128_t (it has two extra underscores at the beginning because that size is nonstandard), but I don't think there is support for 256-bit integers.
long long is merely required to be at least 64-bit and in practice is 64-bit on every platform. AFAIK no general-purpose processors support 256-bit ints. AVX2 has 256-bit vector operations, but that is not at all the same thing.
This issue happens on 32-bit builds of PHP and Node.js: the language switches to a floating-point representation when the result of some operation exceeds INT_MAX.
In 64-bit PHP builds, the computation is done right.
32-bit or 64-bit won't matter for Node.js. The Number type in JS is specifically defined as using the 64-bit floating point format as defined by IEEE 754, except that all NaNs are coerced to a single value. In terms of the abstraction, there is never a cast when the value overflows; it should just always be considered a double. Under the hood, there may be differences in how the number is actually being treated.
Presumably because it allows you to do anything involving double precision or 32-bit integer arithmetic, and performance was not originally a major consideration. It's pretty rare to need more than 53 bits of precision (and was even rarer for JS's original intent), so it makes sense that the numeric type is kept simple. Edit: and to clarify, the advantage is that this makes basic implementation extremely simple. Only if you want to optimize your engine's performance do you have to worry about shuffling types around.
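To make the 53-bit limit concrete, here is a quick illustration in Python rather than JavaScript (CPython's float is the same IEEE 754 double, so the behaviour carries over): above 2^53 the gap between representable doubles is larger than 1, so adjacent integers collapse onto the same value. That is also why a naive loop summing 1 to a billion drifts: the running total passes 2^53 after roughly 134 million iterations.

# Doubles have a 53-bit significand; 2**53 + 1 is the first integer they cannot represent.
big = 2 ** 53
print(float(big) == float(big + 1))   # True: both round to the same double
print(int(float(big + 1)))            # 9007199254740992, the + 1 is lost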
These days the solution for having more precision is to use an external library. I think that's generally fine, although performance is a concern. Financial applications aside, working with arbitrary precision is a good hint that you might be doing something processor intensive. It's certainly a case where I'd like the library to be compiled to target asm.js, and maybe optionally NaCl, once those have widespread adoption. Ideally, ECMAScript would also have a native implementation, but that won't eliminate the need for a library for shimming for years to come.
Performance was always enough of a consideration that even BE's original implementation had both int32 and double types internally, though the difference was unobservable from outside the black box (everything appears as a double).
As a side note, I'll bet you could have actually observed the difference via timing at the time, assuming you knew what hardware you were working on. On an early Pentium, a floating point add would have taken up to 3 times as long as an integer add (depending on implementation), so by comparing in a loop, you might be able to tell if a given value was being treated as an integer or a double.
Floating point numbers ("floats") work like scientific notation (e.g. "12 * 10^-3"). They have a mantissa and an exponent. Double precision (64-bit) floats have a 52-bit mantissa (53 significant bits counting the implicit leading one), a sign bit, and 11 exponent bits, which are also used to encode special values like NaN ("Not a Number"), e.g. the result of "0/0".
> Why [only floats in JavaScript] ?
The language was supposed to have a low barrier of entry, and having a single number type was thought to be less complex to explain, even though floats have some weird corner cases regarding rounding when you don't understand how they are implemented. Beyond the loss of precision in large numbers, some rationals that have a finite representation in base 10 have an infinite representation in base 2.
For example, 0.2 in base 10 is 0.00110011... in base 2. It is thus rounded to fit in 52 bits, and rounding means rounding error.
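You can look at the rounded binary value directly. This is Python rather than JavaScript, but CPython floats are the same IEEE 754 doubles, so it shows exactly what the Node session below is working with:

from decimal import Decimal

print((0.2).hex())    # 0x1.999999999999ap-3 : the repeating pattern, cut off and rounded
print(Decimal(0.2))   # 0.200000000000000011102230246251565404236316680908203125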
Here's a Node.js session that demonstrates the behaviour:
> e = 0.2
0.2
> e = e + 0.2
0.4
> e = e + 0.2
0.6000000000000001
> e == 0.6
false
> e = e + 0.2
0.8
> e = e + 0.2
1
> e == 1
true
As for why? I have no clue; it's specified in the specification, and that's all. There are ToInteger(), ToInt32(), ToUint32(), and ToUint16() operations defined as well, but I think those are for the host to implement.
EDIT: oops, there isn't a ToInt64() operation defined.
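Those conversions are simple enough to sketch, though. Here is ToInt32 in Python, as an illustration of the spec's algorithm (it is what the bitwise operators go through, e.g. x | 0); NaN/Infinity handling omitted:

def to_int32(x):
    # ES5 ToInt32: truncate toward zero, reduce modulo 2**32,
    # then map the result into the signed 32-bit range.
    n = int(x) % 2 ** 32
    return n - 2 ** 32 if n >= 2 ** 31 else n

print(to_int32(2 ** 31))   # -2147483648: wraps around
print(to_int32(-1))        # -1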
I think I might be doing it wrong, because I didn't do any looping. I'm a bit too lazy and impatient for that; who wants to spend their afternoon adding all those numbers up, even with a computer?
[joe24pack@staropramen ~]$ python
Python 2.6.6 (r266:84292, May 1 2012, 13:52:17)
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def gauss(x):
...     return x*(x+1)/2
...
>>> gauss(10)
55
>>> gauss(1000000000)
500000000500000000
>>>
Judging by the time it took my MacBook Air to calculate it in Ruby, it would be pretty interesting to have someone generate benchmarks for different languages :)
1E9 is 1,000,000,000
⍳1E9 generates a vector containing integers from 1 to 1,000,000,000 (inclusive)
+/ is the sum of the vector
Any good APL interpreter will not actually generate a vector with a billion numbers but rather recognize the above expression and optimize the resulting operation for speed and minimal resource utilization.
EDIT:
Technically the "/" is the "reduction" operator acting along the last axis. In the case of a one-dimensional array it acts along the only available axis. If, instead, it were acting on a matrix, it would reduce along the last axis, collapsing the columns to give one result per row. Here's a longer annotated example with output from the interpreter:
Generate a vector from 1 to 10:
⍳10
1 2 3 4 5 6 7 8 9 10
Sum:
+/⍳10
55
Generate a vector of ten one's and zero's, repeating until the end:
10⍴1 0
1 0 1 0 1 0 1 0 1 0
Now use that vector to compress (filter) the original 1-to-10 vector, grabbing every other
element starting with the first. The effective result is that you end up with
all the odd numbers between 1 and 10:
(10⍴1 0)/⍳10
1 3 5 7 9
Same thing, now grabbing the even numbers by flipping the 1 0 sequence to 0 1:
(10⍴0 1)/⍳10
2 4 6 8 10
Sum of all odd integers between 1 and 10:
+/(10⍴1 0)/⍳10
25
Sum of all even integers between 1 and 10:
+/(10⍴0 1)/⍳10
30
One could create a single vector with the odd integers between 1 and 10 followed by
the even integers between 1 and ten by simply concatenating the generating
expressions (APL executes from right to left):
((10⍴1 0)/⍳10),(10⍴0 1)/⍳10
1 3 5 7 9 2 4 6 8 10
And then you can reshape ("⍴") the result into a matrix:
2 5⍴((10⍴1 0)/⍳10),(10⍴0 1)/⍳10
1 3 5 7 9
2 4 6 8 10
Finally, use the reduction operator again, now applied to a matrix, to sum across each
row. The effect is to output a two-element vector with the sum of the odd integers
between 1 and 10 as the first element and the sum of all the even integers between
1 and 10 as the second:
+/2 5⍴((10⍴1 0)/⍳10),(10⍴0 1)/⍳10
25 30
If you want to try this, type the lines immediately following my comments above right into the interpreter. The rho ("⍴") or reshape operator is entered by typing ALT+r.
Hope this helps make sense of it. Of course, there are other ways to accomplish the same thing.
Seeing that some are adding timings to the posted solutions:
The APL solution I posted above (+/⍳1E9) takes 56 microseconds to execute on my system (checked by solving it 100,000 times in a loop).
The interpreter is obviously not doing a huge memory and clock-cycle sucking expansion of a one billion element vector.
This highlights another advantage of a symbolic language: idioms or patterns within the code can be recognized and replaced with equivalent highly-efficient, highly-tuned operations. The above example is obviously solved by having the interpreter swap it out for the well-understood mathematical solution.
Because of this the programmer can focus on the problem space rather than having to dork around with figuring out optimizations. Granted, this is a simple one for anyone with a decent math background, but it can get far more complex. Have a look at the famous Finn APL idiom library:
An interpreter designer might very well decide to detect a good number of these idioms and execute highly tuned code instead of the memory and CPU-cycle hogging expansions that might result from running the actual code as written.
In many ways I equate this to what happens when using a language like Verilog to design FPGA circuits. You are designing hardware, not software. FPGA compilers have inference engines that recognize certain structures to mean specific circuit constructs. It's a contract. We agree that when I write this I mean to ask that you instantiate that and everyone is happy.
Python 2.7 (Mid 2012 Macbook Pro, 2.5 GHz i5, 8GB, not using SSD)
sum + xrange (consumes ~20MB virtual memory):
$ time python2.7 -c "print sum(xrange(1,1000000001))"
500000000500000000
python2.7 -c "print sum(xrange(1,1000000001))" 11.06s user 0.02s system 99% cpu 11.089 total
reduce + xrange (consumes ~20MB virtual memory):
$ time python2.7 -c "print reduce(lambda a, b: a + b, xrange(1,1000000001))"
500000000500000000
python2.7 -c "print reduce(lambda a, b: a + b, xrange(1,1000000001))" 128.74s user 0.13s system 94% cpu 2:16.51 total
My machine was swapping like crazy for more than an hour when I tried using range(). I suspect it hadn't even finished allocating the list when I killed the process, after it had consumed >30GB of virtual memory.
$ time python2.7 -c "print sum(range(1,1000000001))"
3 milliseconds, man, the C optimizers blow my mind, they basically just cheat and stick the answer in there. :-)
$ cat sum.c
#include <stdio.h>
int main(void)
{
    unsigned long long sum = 0, i;
    for (i = 0; i <= 1000000000; i++) // one billion
        sum += i;
    printf("%llu\n", sum); // 500000000500000000
    return 0;
}
This is Perl, of course, but it exploits the fact that the sum 1..$max can be computed as a series of pairs that each add up to ($max + 1), plus an additional term (the 'middle' integer) when the top number is odd.
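The Perl itself isn't quoted here, but the pairing argument is easy to sketch; here is the same idea in Python (my own illustration, not the original code):

def pair_sum(top):
    # Pair 1 with top, 2 with top - 1, and so on: each pair adds up to
    # top + 1, and there are top // 2 such pairs.
    total = (top // 2) * (top + 1)
    # When top is odd, the middle integer (top + 1) // 2 is left unpaired.
    if top % 2:
        total += (top + 1) // 2
    return total

print(pair_sum(1000000000))   # 500000000500000000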
That's an old(ish) version. They've probably done a bit of optimization since. For example, with the (speed 3) optimization, and letting the compiler know that sum is a fixnum, I can get it down to <2 billion cycles; see my answer here: http://stackoverflow.com/a/18065714/2423072
Consing sounds like it's allocating bignums. My guess is that you're using a 32-bit build of SBCL. In that case, fixnums only go up to something like 2^30, and arithmetic with larger numbers will allocate memory. Can you check?
This casts only $sum to int. If $i is a float, the result is a float.
With 32 bits, $sum will be a float bigger than PHP_INT_MAX at some point, and the cast will truncate it, which is unlikely to give the right answer.
With 64 bits, which I assume is the platform you tested on, $sum is never bigger than PHP_INT_MAX, so it's never converted to a float, and the (int) cast does nothing (and the computation is done right).
Correction: properly learning about data types and binary representation is critical to "get" programming.
I've found that web developers often undervalue (or even scoff at) web developers with CS degrees, but any first-year CS/EECE student at a decent university (or even someone who has taken a couple of Coursera CS classes) would have instantly recognized this to be a floating point related issue. Being able to whip up some scaffolds in Rails is great and immensely useful, but I'm of the opinion that a great Rails programmer will need a proper understanding of the C & Assembly that's behind all the magic.
It's both, really. Knowing the correct data type won't get you anywhere if you use a language like PHP and assume that it will always correctly guess the required data type for a given problem.
I find that when it comes to programmers who reach a certain level of proficiency, it doesn't matter whether they learned these things through a CS degree, or through hard knocks.
I started out with VBScript and ASP 1.0. I didn't get a CS degree, but I ended up learning about floating point errors rather quickly. This led me to learn more about data types. Similar result, different path. I ended up paying for my error, much like someone would pay for an education.
There are certainly bad programmers without CS degrees who make these mistakes, but I know plenty of programmers who hold a CS degree and will consistently misuse float unless reminded to use something more appropriate.
Interesting, sure, but incredibly hard to create with an uncoordinated Internet community. It'd take 1 good programmer/writer, ideally.
Without a clear, quantitative standard as to which Python* implementation is best, you'll find: a one-liner that claims to be 'pythonic', a 150-liner that's slightly more efficient, and a 30-liner that looks like it was written in C. The more popular the language, the wider the variety of responses. The more helpful the language (that is, 'newbies' can contribute solutions to otherwise difficult problems), the more solutions. The more divisive the language, the more bickering you'll have.
The Programming Language Shootout doesn't suffer this fate because it is 0% subjective. Discussion is only interesting with subjectivity, but technical discussion is presumed to be mostly objective. StackOverflow treads the line carefully.
1 writer could do it. 2, if they don't disagree on anything.
* I don't mean to pick on Python. Python is great. Insert language of choice.
I don't think it would be that interesting - and I don't think we need to rediscover, over and over again, the fact that some languages use IEEE 754 as the default number type.
I agree that stackoverflow.com is sometimes a little too trigger-happy when it comes to closing questions, but we really don't need a community-wiki top answer with 1845 upvotes that is basically a Rosetta Code page, spawning dozens of copycat questions with slight variations on the theme.
It's not right. You should add one to the bound, because "range" uses half-open intervals. Also, it only works in Python 3; in Python 2 it creates a list of a billion integers, so you should use "xrange" instead.
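Concretely, the corrected version would look something like this (a sketch; the snippet being corrected isn't shown here):

# Python 3: range is lazy, and its upper bound is exclusive,
# hence the + 1 so that 1000000000 itself is included.
print(sum(range(1, 1000000000 + 1)))   # 500000000500000000

# Python 2 equivalent: xrange avoids building a billion-element list.
# print sum(xrange(1, 1000000000 + 1))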
I don't. I'm far from an expert on high-performance Clojure. (I'm really glad that there is such a thing, and that people focus on it, however.) Joy of Clojure and Programming Clojure get into optimizations a little bit, but I think that field is still fairly new.
Sometimes with seqs one can end up with a "holding onto the head" problem; if you're doing stream processing but holding on to the head of a seq, you can end up with the whole thing in memory, which would kill you. That's not what's happening there, though; a default-configured JVM can't hold anything close to a billion longs in memory.
One of the neat things is that, because the REPL actually compiles code (there's no interpreter), you get the same performance with the time macro as you would in compiled code. What that means is that performance testing can be done quickly, right at the REPL.
To explain what I did and why: I figured that the tight loop would be optimized to Java-like performance. With the more elegant formulation, I didn't know what was going on in terms of types (how is +, a variable-arity function with many type signatures, being handled?). If the loop had performed poorly, I'd probably have put type hints on the arguments and replaced + with unchecked-add; but it performed well, so I left it as it was.