
> What do you mean by "one new user a second"?

On average, once a second a user comes to, call it, the home page of the site and is a "new" user in the sense of the number of unique users per month. The ad people seem to want to count mostly only the unique users. At my site, if that user likes it at all, then they stand to see several Web pages before they leave. Then, with more assumptions, the revenue adds to the number I gave.
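As a rough back-of-envelope version of that arithmetic (the pages-per-visit, ads-per-page, and CPM figures here are my own assumed placeholders, not numbers from the comment):

```python
# Back-of-envelope ad-revenue estimate for "one new user a second".
# Pages per visit, ads per page, and CPM below are assumptions
# for illustration only.

SECONDS_PER_MONTH = 60 * 60 * 24 * 30        # ~2.59 million seconds
new_users_per_month = SECONDS_PER_MONTH * 1  # one new user per second

pages_per_visit = 5      # assumed: "several Web pages" per user
ads_per_page = 1         # assumed: one ad impression per page
cpm = 2.00               # assumed: dollars per 1000 impressions

impressions = new_users_per_month * pages_per_visit * ads_per_page
monthly_revenue = impressions / 1000 * cpm

print(f"{new_users_per_month:,} unique users/month")
print(f"${monthly_revenue:,.0f}/month")
```

Under those placeholder assumptions, one new user a second works out to about 2.6 million uniques a month; the revenue scales linearly with whatever CPM and page counts one actually believes.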

At this point this is a Ramen noodle budget project. So, no racks for now. Instead, it's mid-tower cases.

One mid-tower case, kept busy, will get the project well into the black with no further worries about the costs of racks, Xeon processors (if they are worth it), etc.

Then the first mid-tower case will become my development machine or some such.

This project, if successful, should go like the guy that did Plenty of Fish: just one guy, two old Dell servers, ads just via Google, and $10 million a year in revenue. He just sold out for $575 million in cash.

My project, if my reading of humans is at all correct, should be of interest, say, on average, once a week for 2+ billion Internet users.

So, as you know, it's a case of search. I'm not trying to beat Google, Bing, Yahoo at their own game. But my guesstimate is that those keyword/phrase search engines are good for only about 1/3rd of interesting (safe for work) content on the Internet, searches people want to do, and results they want to find.

Why? In part, as the people in old information retrieval knew well long ago, keyword/phrase search rests on three assumptions: (1) the user knows what content they want, e.g., a transcript of, say, Casablanca, (2) knows that that content exists, and (3) has some keywords/phrases that accurately characterize that content.

Then there's the other 2/3rds, and that's what I'm after.

My approach is wildly, radically different but, still, for users easy to use. So, there is nothing like page rank or keyword/phrases. There is nothing like what the ad targeting people use, say, Web browsing history, cookies, demographics, etc.

You mentioned probability. Right. In that subject there are random variables. So, we're supposed to do an experiment, with trials, for some positive integer n, and get results x(1), x(2), ..., x(n). Then those trials are supposed to be independent and the data a simple random sample, and then those n values form a histogram and approximate a probability density.
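That elementary picture can be sketched numerically (a minimal illustration; the standard-normal choice and the bin count are just assumptions for the demo):

```python
import numpy as np

# n independent trials -> sample x(1), ..., x(n): a simple random sample.
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(loc=0.0, scale=1.0, size=n)

# A histogram of the n values, normalized so the bar areas sum to 1,
# approximates the underlying probability density.
counts, edges = np.histogram(x, bins=50, density=True)

# Compare the bar height nearest 0 with the true standard normal
# density there, 1/sqrt(2*pi) ~ 0.3989.
centers = (edges[:-1] + edges[1:]) / 2
mid = np.argmin(np.abs(centers))
true_density_at_0 = 1 / np.sqrt(2 * np.pi)
print(counts[mid], true_density_at_0)
```

With n large the bar heights settle down onto the density curve, which is exactly the elementary story the paragraph describes.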

Could get all confused thinking that way!

The advanced approach is quite different. There, walk into a lab, observe a number, call it X, and that's a random variable. And that's both the first and last we hear about random. Really, just f'get about random. Don't want it; don't need it. And those trials? There's only one, for all of this universe for all time. Sorry 'bout that.
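In measure-theoretic terms (a standard textbook sketch, not spelled out in the comment itself), the "observe a number" view is:

```latex
% A random variable is just a measurable function on a probability space
% (\Omega, \mathcal{F}, P); the word "random" never reappears.
X : \Omega \to \mathbb{R}, \qquad
X^{-1}(B) \in \mathcal{F} \ \text{for every Borel set } B \subseteq \mathbb{R}.
% The single "trial" is one point \omega \in \Omega -- the whole universe,
% once -- and the observed number is X(\omega).
```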

Now we may also have random variable Y. And it may be that X and Y are independent. The best way to know is to consider the sigma-algebras they generate -- that's much more powerful than what's in the elementary stuff. And we can go on and define expectation E[X], variance E[(X - E[X])^2], covariance E[(X - E[X])(Y - E[Y])], conditional expectation E[X|Y], convergence of sequences of random variables -- in probability, in distribution, in mean-square, almost surely, etc. We can define stochastic processes, etc.
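Those definitions have direct sample analogues, which can be checked numerically (the particular distributions and the 0.5 coupling below are my own assumptions for the demo):

```python
import numpy as np

# Sample versions of E[X], Var(X) = E[(X - E[X])^2], and
# Cov(X, Y) = E[(X - E[X])(Y - E[Y])].
rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)   # y correlated with x by construction

e_x = x.mean()                           # estimate of E[X]
var_x = ((x - e_x) ** 2).mean()          # estimate of E[(X - E[X])^2]
cov_xy = ((x - e_x) * (y - y.mean())).mean()

print(e_x, var_x, cov_xy)   # ~0, ~1, ~0.5 by construction
```

Here Cov(X, Y) = 0.5 Var(X) = 0.5 analytically, so a nonzero sample covariance is expected; note the converse caution -- zero covariance alone does not give independence, which is why the sigma-algebra criterion is the stronger tool.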

With this setup, a lot of derivations one wouldn't think of otherwise become easy.

Beyond that, there were some chuckholes in the road, but I patched up all of them.

Some of those are surprising: Once I sat in the big auditorium at NIST with 2000 scientists struggling with the problem. They "were digging in the wrong place". Even L. Breiman missed this one. I got a solution.

Of course, users will only see the results, not the math!

Then I wrote the software. Here the main problem was digging through 5000+ Web pages of documentation. Otherwise, all the software was fast, fun, easy: no problems, no tricky debugging, just typed the code into my favorite text editor, just as I envisioned it. Learning to use Visual Studio looked like much, much more work than it was worth.

I was told that I'd have to use Visual Studio at least for the Web pages. Nope: What IIS and ASP.NET do is terrific.

I was told that Visual Studio would be terrific for debugging. I wouldn't know since I didn't have any significant debugging problems.

For some issues where the documentation wasn't clear, I wrote some test code. Fine.

Code repository? Not worth it. I'm just making good use of the hierarchical file system -- one of my favorite things.

Some people laughed at my using Visual Basic .NET and said that C# would be much better. Eventually I learned that the two languages are nearly the same as ways to use the .NET Framework and get to the CLR, and otherwise are just different flavors of syntactic sugar; I find the C, C++, C# flavor bitter and greatly prefer the more verbose and traditional VB.

So, it's 18,000 statements in Visual Basic .NET with ASP.NET, ADO.NET, etc., in 80,000 lines of typed text.

But now, something real is in sight.

You are now on the alpha list.

That will be sooner if I post less at HN!



