Another subtle but important point is that the Julia code can be very easily made completely generic without sacrificing any performance or clarity. All you need to do is remove those Float64 annotations and replace the 0.0 values with `zero(y[1]*weights[1])`. With that small change, the same exact code will work for Float32, Rational, BigFloat, Complex (if that even makes sense), etc. – even new, user-defined numeric types that didn't exist when you wrote this code and that you know nothing about, as long as they can be compared, multiplied, added, and divided.
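To make that concrete (the original function isn't quoted in this thread, so the weighted-mean kernel and the name `wmean` below are just a hypothetical stand-in), the generic version might look something like:

```julia
# Hypothetical weighted-mean kernel, written generically: no element-type
# annotations, and the accumulators get their types from the inputs via
# zero(), so the same code works for Float64, Float32, Rational, BigFloat,
# or any user-defined numeric type that supports *, +, and /.
function wmean(y::AbstractVector, weights::AbstractVector)
    s = zero(y[1] * weights[1])   # weighted-sum accumulator
    w = zero(weights[1])          # total-weight accumulator
    for i in eachindex(y, weights)
        s += y[i] * weights[i]
        w += weights[i]
    end
    return s / w
end

wmean([1.0, 2.0, 3.0], [0.5, 0.25, 0.25])   # Float64 arithmetic
wmean([1//2, 3//4], [1//3, 2//3])           # exact Rational arithmetic
```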
This kind of generality isn't necessary for a lot of user-level code, where there's a specific application, but it is quite important when writing libraries that need to be as reusable as possible. And this is where the Cython solution has some problems – sure, you can make it pretty fast, but in doing so, you lose the generality that the original Python code had, and your code becomes as monomorphic as C.
Cython does have limited support for generics in the form of its fused types [1]; it can generate specialized C code for a set of primitive types, for instance. It's nowhere near as flexible as Julia in that regard, since you can't, say, redefine operators to work on complex numbers, but if you want the interoperability and maturity (read: available programmers and existing libraries) that Python offers, you don't have to trade away generics completely. I'd love to live in an all-Julia ecosystem, but unfortunately that's not quite reality yet, and luckily Cython is more than capable of making do.
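For the curious, a fused-types version of a similar kernel might look roughly like this (a sketch only; the `num_t` name, the `wmean` function, and the type list are made up for illustration):

```cython
# Hypothetical sketch of Cython fused types: one definition, and Cython
# emits a specialized C implementation for each listed type. Both
# arguments share num_t, so they must use the same specialization.
ctypedef fused num_t:
    float
    double

def wmean(num_t[:] y, num_t[:] weights):
    cdef num_t s = 0
    cdef num_t w = 0
    cdef Py_ssize_t i
    for i in range(y.shape[0]):
        s += y[i] * weights[i]
        w += weights[i]
    return s / w
```

The catch, as noted above, is that the set of types is fixed at compile time and limited to C-level types, which is exactly the monomorphism trade-off being discussed.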