Poor man's analytics: events, client side, server side, whatever. Set up a separate DB with one big table and many columns for properties. One server-side call to write an event (a row), one AJAX call to write from the client; drop it in wherever you need.
A bit of good old simple SQL for queries and reporting, and you are done.
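For a concrete sketch of what that could look like (endpoint path, table layout, and the node-postgres pool are assumptions for illustration, not a prescription):

    // Client side: fire-and-forget event write to a hypothetical /events endpoint.
    function trackEvent(name, props) {
      var xhr = new XMLHttpRequest();
      xhr.open('POST', '/events', true); // async, so the UI never blocks
      xhr.setRequestHeader('Content-Type', 'application/json');
      xhr.send(JSON.stringify({ name: name, props: props }));
    }

    // Server side (Node + Express + node-postgres), one wide table:
    //   CREATE TABLE events (ts timestamptz, user_id text, name text, props json);
    var express = require('express');
    var { Pool } = require('pg');
    var app = express();
    var pool = new Pool(); // connection settings from the usual PG* env vars

    app.use(express.json()); // Express 4.16+

    app.post('/events', function (req, res) {
      pool.query(
        'INSERT INTO events (ts, user_id, name, props) VALUES (now(), $1, $2, $3)',
        [req.body.userId, req.body.name, JSON.stringify(req.body.props)]
      ).catch(console.error);
      res.end();
    });

    app.listen(3000);

    // Reporting is then plain SQL, e.g.:
    //   SELECT name, count(*) FROM events
    //   WHERE ts > now() - interval '7 days' GROUP BY name;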
I've often wondered about the efficacy of having a simple catch-all script that just tracks everything a user does on the site. I would think it would pound the database relentlessly under even modest load.
AWS comes into its own for these kinds of use cases: a lightweight autoscaling events API on Elastic Beanstalk pumps events into DynamoDB, and a bunch of workers running at the EC2 spot price sort and process them into a traditional RDBMS. That's massively horizontally scalable event handling for pennies an hour.
That's why he said "Set up a separate DB with one big table" ;)
I think it can actually work quite well, provided you can set up a separate service just for analytics. It's not too unreasonable to assume there are sites already doing this, since the concept of a central repository that you alone control (and isn't provided by a third party) can be appealing.
Also, you have to worry less about privacy and just concentrate on keeping your own systems and services secure.
Please don't do this! Entity-Attribute-Value tables are a nightmare waiting to happen about 98% of the time.
If there's something wrong with your DBMS where an ALTER TABLE to add a column is an expensive or dangerous operation, just add a new table for every attribute; having lots of tables in a database is no worse than having lots of variables in a program.
If your DBA gets angry at you for adding new tables all the time, get a new DBA.
I have a system exactly like this at my company, with a prettier UI on top of the SQL querying. I've been considering open sourcing it for a while; I wonder if there really is a demand for this, or if it's one of those things that everyone prefers to roll themselves.
Windows user here. The fonts on your site do not render well on Windows machines. I see an increasing number of sites using embedded fonts that, for whatever reason, render poorly on Windows. Please cross-platform test your sites.
We've run into this issue before, and it seems to have something to do with the font's Unicode values being "out of range" on Windows, whatever that means.
FWIW, I've worked on projects where we "blacklist" embedded fonts on certain combinations of OS+Browser -- particularly chrome on older versions of windows.
Based on the user-agent string, we would serve up a version of the HTML that didn't request the embedded font, and then those users would see Arial instead of, for example, Proxima Nova.
Super-annoying to have to deal with. But it works.
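The same check can be done client side too. A rough sketch (the UA patterns, stylesheet path, and font are illustrative, not our exact blacklist):

    // Only attach the webfont stylesheet when we're NOT on a known-bad combo
    // (here: Chrome on Windows XP/Vista). Blacklisted users fall back to the
    // Arial stack declared in the base CSS.
    var ua = navigator.userAgent;
    var badCombo = /Chrome/.test(ua) && /Windows NT (5\.1|6\.0)/.test(ua);
    if (!badCombo) {
      var link = document.createElement('link');
      link.rel = 'stylesheet';
      link.href = '/css/proxima-nova.css'; // hypothetical file holding the @font-face rules
      document.getElementsByTagName('head')[0].appendChild(link);
    }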
Seriously though, for my purposes I'm much more interested in tracking the events of my current users. Obviously for you conversion funnel analysis is important, and that's cool, but realize everyone has different needs (and budgets!)
Icelancer is right. Conversion analysis for landing pages is a core, mandatory component. If I'm going to have to use something like Mixpanel for the landing page, then I might as well use Mixpanel for everything. "Obviously for you conversion funnel analysis is important, and that's cool, but realize everyone has different needs" is a severe understatement of how crippled an analytics platform would be without conversion funnel analysis.
Don't get it confused, no one is saying that conversion analysis isn't important or shouldn't be included in the feature set. All I was trying to point out was that you don't have to include the code for everyone if you don't need or want to, for whatever reason.
Being DDoSed is not going to affect you. For Heap to kick in, the page needs to be rendered and their JavaScript snippet processed. A DDoS attack is just going to send lots of traffic (e.g. a gazillion wget requests made to your domain); your Apache or nginx logs are the ones that are going to explode.
I, for one, very much object to all my clicks and "activity" on a page being captured and streamed down some black hole for god knows what purposes. Not to worry though - you are in good company with all other analytics services out there. The only difference is that you are far more open (and proud?) about how obscenely intrusive your service is, so you get an honorary 2nd spot, right after KissMetrics.
I don't mean this to sound as snarky or mean as it probably will. I'm truly interested in your answer.
Why should you have the choice to object? What is obscenely intrusive about tracking how you use an application you (may or may not) pay for? I don't see why anyone would have any reasonable expectation of privacy when they're using a service online.
Note I am only referring to analytics contained on a single domain. I'm not making an argument for or against services like FB that track activity across multiple domains.
You can't opt-out of server-side tracking. They could easily architect the app to require a round-trip to the server on every call (this is how typical early web apps worked). In that case you really don't have an option to object.
Does that bother you too? (A service recording your usage of it)
Why is measuring on the client-side different from measuring on the server-side?
The difference is that server-side tracking doesn't track my clicks, scrolling, mouse moves, or whether I mistakenly type or paste some personal data (e.g. a password) into some text field.
Server-side tracking gives much more control over what information is sent and when, instead of essentially turning websites into Telescreens[1].
To be clear, your on-page activity is basically tracked regardless of whether you even use an analytics service, at least any on-page activity that interacts with a webserver.
In other words, if you object to website owners tracking you on their site, then basically you can't use the internet. And to be clear, if you object to in-store tracking in physical locations, you are going to be able to shop at fewer and fewer locations (anyplace with a security camera).
If you are concerned about cross site tracking or permanent cookies, that is a different issue. Did you see any mention of that for this product? KissMetrics has definitely taken heat for this.
There are levels of tracking. JavaScript enables much more intrusive tracking than server logs. Ever pasted a password into the wrong text field by mistake? With JS tracking, it may already be on someone's server.
> if you object to in-store tracking in physical locations, you are going to be able to shop at fewer and fewer locations (anyplace with a security camera).
Maybe, but at least here in the EU you can't use that footage for analytics or share it with any company. Even our government had a CCTV project shut down because it didn't comply with data protection laws.
I don't advocate for Internet legislation, but I too will block it and wish people were more respectful of others' privacy.
That "purpose" is generally to help service providers understand you better so that they can provide a product that better serves your wants and needs, as their customer.
Say, you have a desktop client-server app. Outlook, for example. How would you feel if it were uploading every click on the menu and every mouse move (in the confines of its window, of course) to the server? Strictly to help the developer improve the product and make it better for you, their customer. This should be totally OK with you, right?
Yes, if the product is collecting usage information (which does not include personal information) to help the software vendor improve the product, I have no problem with it. I like using good software. It's particularly potent in the context of web software, because you can A/B test and rapidly adapt the product to signals from your analytics packages in order to make sure that you are providing the most usable, most user-friendly product possible to them.
How is it an anti-pattern? That's how Dropbox launched, and it seemed to turn out OK for them. Generally speaking, if you have a hard-to-scale service (Heap, Dropbox), doing an invite-based system makes sense.
Especially when there are better and slightly cheaper alternatives. I don't see mouse-move and scroll events in their demo, so it's not really "capture everything". For example, http://clicktale.com or http://mouseflow.com can record mouse movements too, and they have better visualization, using heatmaps instead of just a list of elements clicked.
It's quite different. Clicktale and Mouseflow capture a similarly large volume of user data, but they do it at the expense of flexibly quantifying that data.
Most of their visualizations - while useful in their own right - are pre-built reports that serve a different role than the flexible reporting capabilities in Heap (and Google Analytics et al).
No worries, it's a cool looking product and I'm sure you'll do well. :)
My reaction anymore to consumer "sign-up required" sites or saas "request an invite" products is usually to close the tab, even if my interest is piqued.
I do believe that this used to be a powerful design pattern (Gmail, and as pointed out elsewhere Dropbox) but its prevalence has also become its undoing for me personally.
Startup founders often worry about becoming too popular too quickly -- when usually they should be worried about not becoming popular fast enough.
My guess is that most products that do this aren't much further along than a weekend hack. It's pretty easy to throw up a marketing site, and it lets you quickly see if people are interested. Think of it as a μMVP :D
Just that in my first post I saw "customers" and thought they may have been more traditional customers.
But after being informed they were YC, it made sense that they were more likely load-test and POC customers participating in the product's evolution.
I'm positive there is a market for it. Unfortunately "capturing everything" has an unsolvable problem beyond capture performance, storage cost, and query performance (which are hard but solvable) - it lets you find the outcome you were looking for all along.
This is a really good point - blind data analysis without any notion of multiple hypothesis testing leads to false positives. This XKCD is a great illustration of that: http://xkcd.com/882/
This is something we're thinking about and are definitely conscious of.
Not just false positives in a confirm/deny scenario. It can also lead to overfitting in a modeling scenario.
I once created a model that showed very strongly that US/China exchange rates were a driver of revenue. The model was intuitive based on the market dynamics as we understood them, and it provided strong predictive power, which helped reduce operational costs. Over time, this model began to be taken as irrefutable truth among quite a few people in the org. Then when the economy collapsed, we very quickly learned that it wasn't US/China exchange rates that were the strong predictor, but rather something else which correlated with those rates.
Not only was the model broken, but it had now breached trust in the ability of model-based forecasts. In other words, if I were a business, I would be bankrupt.
I think the best value-add that you can possibly provide with your service is a way of helping your customers understand what is meaningful and what isn't.
I encourage you to commission or find web-comic style explanations of common fallacies people could reach by having too much data. Then try and direct them to better strategies. I think that could have a good impact not just on your business but on general perceptions of how to do business analytics.
Is it just me, or is IE8 just not considered by web startups anymore? Sorry to deviate a little, but every time I try to look at a "Show HN" in IE8 (work computer), it fails about 85% of the time. I could understand that some startups heavily depend on the latest browsers, but what about the others?
IE8 is a complete mess in comparison to modern browsers. To support IE8 you often have to go through a lot of extra work after the site works in modern browsers to get it functional, even if you're not doing something particularly complex. At the very least it's often a whole bunch of different CSS / replacing CSS with images to get the rendering to look decent. In addition, a lot of 'Show HN' posts tend to use 'cool new tech' that just doesn't work at all in IE8. To get it functional you often have to implement an entirely different approach just for IE8. Most 'Show HN' links are MVP products; the devs just decided to release quickly instead of spending a few extra weeks dealing with IE8's shenanigans.
IE8 is perfectly fine, and also somewhere between 20-30% of your users.
If you have analytics software that doesn't work in IE8, you've lost analytics for 20-30% of your users.
And since Array.prototype.slice.call is not supported in IE8, this analytics software is about 70-80% useful. And maybe even less if you have a large IE clientele.
Seems strange to me that one would limit themselves that much.
No, it's really not; it's a huge pain in the ass to support unless you handcuff yourself to its limitations.
> and also somewhere between 20-30% of your users.
Those numbers are wildly skewed by market segment. IE8 is a relic that mostly only exists on corporate controlled laptops at this point. If they aren't your target user, you can pretty much ignore them. None of the sites I work on see over 5% IE8 usage.
I spend more time fixing issues in IE10, Opera, and the WebKit family than I do fixing IE8.
Sure, it doesn't support a lot of HTML5/CSS3. But if you build a site using 100% HTML5/CSS3, then why do you even care what it looks like in anything but WebKit? Why spend 15 hours getting it to work in IE8? Let IE8 look like IE8, and the rest look like the rest. That's how it's supposed to be done.
IE8 is old; if you want to support it, you use older technologies. That's the point. If you want to use the newest and greatest, you lose old support.
But to call it terrible is wrong, it is/was a solid browser.
That's the point. IE8 is old, and not 'perfectly fine' if you're creating a web-based product and not just looking for rounded corners. It was great after living with IE6, but we moved on. Even Google has dropped support for IE8 in most apps.
For Heap, it would be a bit unwise to ignore IE < 9 still since they're limiting themselves from the start.
A lot of new projects on the front page lately don't work on IE9 either. I lost count of the number of shiny new things (usually involving HTML5) that just show a plain black (/white/gray/blue etc...) screen on IE9 and I'd need to fire up Firefox or Chrome just to look at it.
To be fair, not all of these are catering to IE based browsers (and I dread to think how some of these would look on a Nokia mobile... or, heaven forbid, a BlackBerry), but they're effectively ignoring a very, very, very large percentage of users worldwide -- of not just IE -- to be first out of the gate. BlackBerry in particular is still very popular in Latin America and Southeast Asia.
That foot in the door is all important, I suppose, since the buzz will help sustain at least an initial signup for whatever it is.
The web applications in particular are exactly that. Because they're applications, they're crafted for the very select set of platforms on which they're executed. For better or for worse, a lot of new projects are already going in the direction of the browser as the OS.
I presume backwards compatibility will be tacked on later and maybe removed as those browsers fall further behind.
Chartbeat has never supported IE, which as you say sounds like cutting off a major part of your market. Even more so when you consider our customers, which include pretty much every major publisher in the US. However, what we've learned is that if your product provides enough value, people have no problem installing Chrome or Firefox or whatever.
To this day we receive pretty much no requests for IE support at all, which enabled us to build a product with heavy HTML5 (mostly canvas) and CSS3 usage.
Also people need to realize that client-side library support is different from what the product's website itself supports. Chartbeat, and every analytics provider I know of supports IE for their tracking library.
That's very true. We don't have the same complaint about programs specifically built on one family of operating systems not working on another. That also speaks to how much we depend on the browser these days to act as an OS in a way, particularly for full featured applications that depend on (relatively) new technology.
Everyone, not just web startups, is sick of IE8 and earlier.
As the other commenter described, it's a lot of effort.
I think when it comes to bigger organizations (such as Google not officially supporting IE8 anymore), it's to force laggard corporate offices to at least upgrade to IE9, if not Chrome or Firefox.
This is a deal-breaker for me. I'm very interested in Heap, but an analytics tool that ignores (or breaks) a non-trivial percentage of my users is a non-starter.
I mention this because it looks like Heap devs are monitoring this thread--Heap addresses a real problem for me that I'm willing to pay money to solve. Codegeek's post literally stopped the sale.
As bdt101 mentioned here[1], the way they're using Array.prototype.slice.call is supported by IE8. It is not supported when trying to slice a NodeList, as a NodeList is not a JScript object.
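For anyone running into this, the usual workaround is a small helper along these lines (the function name is just illustrative):

    // IE8-safe NodeList-to-array conversion: try the fast path, fall back to a
    // manual copy when slice.call refuses the host object (as it does in IE8).
    function toArray(nodeList) {
      try {
        return Array.prototype.slice.call(nodeList);
      } catch (e) {
        var arr = [];
        for (var i = 0; i < nodeList.length; i++) {
          arr.push(nodeList[i]);
        }
        return arr;
      }
    }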
Looks great! I'm mostly intrigued by the ad-hoc Group By query functionality. Traditionally that's an indicator of some sort of relational database on the backend, but is that the case here?
On the other hand, the dynamic nature of the data makes me inclined to think that they're using MongoDB or something similar. In the past I've had to create similar systems but we denormalized the data coming in based on pre-defined Group By settings.
I'd love to learn more about the stack and database used for HeapAnalytics and any other similar services.
We initially tried to build this on Mongo, and it was a huge pain. I'll need to elaborate on why at some point, but the very ad-hoc nature of all our querying precludes any non-relational database from being our store (at least any that I'm aware of).
It sounds like you and I had a very similar experience. :) Please do share and elaborate some time.
For my past implementation of a similar data-store we had some different requirements that let us cut a few corners. But, if I were to take another stab at it with your requirements then I would try a new approach.
I think I would use a single table to represent the 'event' itself and have that table be essentially owned by the customer. This table could be put on a specific shard set aside for each customer or you could just name it "{customerId}_{eventName}" and have a lot of tables. Then each event coming in would potentially perform an 'alter' on the table in order to make sure all custom properties have a column.
Are there any downsides to this approach I'm not considering?
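A rough sketch of that idea, assuming Postgres and some generic query(sql) helper (the table naming scheme and the IF NOT EXISTS clause, which needs a newer Postgres, are just for illustration):

    // Make sure the per-customer event table has a column for every custom
    // property before inserting the event row itself.
    function ensureColumns(customerId, eventName, props, query) {
      var table = '"' + customerId + '_' + eventName + '"'; // e.g. "42_signup"
      Object.keys(props).forEach(function (key) {
        // Older Postgres has no ADD COLUMN IF NOT EXISTS; there you'd check
        // information_schema first and swallow "column already exists" errors.
        query('ALTER TABLE ' + table +
              ' ADD COLUMN IF NOT EXISTS "' + key + '" text');
      });
    }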
First: The service sounds great. Something I would sign up for.
But I don't really get the pricing structure:
What if I have less than 2500 unique users/month? Do I pay $25? Is it free? If it's $25, what then if I have more than 2500 but less than 20000 users? Is it $150 then?
Right now the way we price is to pro-rate it for people who are between tiers. For example, if you have 3000 uniques per month, then you'll pay $28.75 ($25 for the first 2500 users, and then $3.75 pro-rated for 500 additional users on the next tier). If you have below 2500 users, you just pay for the portion of that $25 that you use (a site with 1000 users would pay $10). We're updating our pricing page to clarify this.
I would highly advise you to rethink that. What if we're a larger site? To be honest, your pricing is way out of our ballpark anyway (We're a smallish daily newspaper, ~300k uniques/monthly), there's no way I'd ever get approval for $20k/yr for analytics, no matter how cool, and I'd never get approval to spend $1k+ just to try it.
For orgs like us, your pricing seems rather brutal, as we're probably much lower in pages/visit than an "app" type website. (We're at ~10 page views per visit, average)
Feel free to reach us at sales@heapanalytics.com. We'd love to talk about fair pricing, and we certainly need more data points to evaluate what that fair pricing is.
Do you feel there's scope for pricing coming down a lot? I'm in a similar boat to TylerE: around a million monthly uniques, but in publishing, so there's really no way to justify $4k/month for Analytics when stuff like Google's already exists. 10% of that might be justifiable, but even then it's not a no-brainer.
Yea, precisely. I'd say for us we could maybe do, like, $100/month. Also - how does it work for multiple sites? We actually have 4 separate sites for separate towns.
Depending on how usage looks on your site we can find a way to accommodate. Shoot us an email at sales@heapanalytics.com and we can coordinate a time to chat.
Newspapers routinely find the money to pay $570/mo and up for ChartBeat, and then never use it, or I-don't-know-how-much for Adobe Omniture when they could have Google Analytics for free. So I don't think you're representative of the news industry as a whole.
Also, content-heavy sites have different analytics needs from applications. It'd be interesting to see where Heap fits in, but most of the analyses you can run in KISSMetrics, Mixpanel and other app-focused analytics tools aren't of much use for news organizations.
That's a misunderstanding of the industry, I think. While the website as a whole isn't very "appy", some smaller sections of it _are_, and it would be incredible to actually understand how our customers are interacting with it.
P.S. We used to have Omniture, and switched to Google to save $.
Can you put your email in your signature? I'd like to ask you more questions about your analytics needs and have you check out a project I'm working on.
I know you're reworking the pricing page as you said but you may want to consider doing what a lot of email service providers do and list the per-user price in tiers.
Something like
* Users 1-2,500: $0.01/user
* Users 2,501-20,000: $0.0625/user
* Users 20,001-80,000: $0.0583/user
might be a tad more intuitive, especially if you listed the top-end price along with that.
I don't think per user is the way to go. Customers want more predictability. Plus, per user feels like it's penalizing growers. It seems a bit more appropriate to charge per email, since there's some atomicity there.
For analytics, simple tiers (without pro rating) should work fine.
This is interesting. Our naive thinking was "fewer numbers" implies "easier comprehensibility", but I don't think the pricing was quite as understandable as we would've hoped. I like your suggestion a lot.
Keep the pricing message simple. I'm not a fan of variable pricing - I like to know what the monthly cost is going to be beforehand. I appreciate tiers where I can see what I'm going to hit depending on growth over time.
Someone below said that you should base your pricing on cost - I completely disagree. You should absolutely NOT look at cost when you put pricing together - rather, you need to ask your customers about the value of what it is you're providing. For example, Heap will be saving the time and hassle of deploying a generic solution on a server and the time of customising it. Plus Heap will be improving their product every day whereas the self-hosted version would need a developer to add new features. These sort of things all add up - you'll be surprised what you find out when you ask your customers.
Pricing is hard. There are a lot of good articles and discussions on HN if you do a bit of searching.
Perhaps a slider or some other interactive tool would be better for estimating monthly cost.
Anyone who'd use your service can look up their existing active user counts and plan accordingly -- much easier than estimating data points. However, you're going to lose some people if you require mental gymnastics to get an estimate based on arbitrary tiers and pro-rated pricing.
I would love to see prepay/caps implemented in a similar way. I give you $50 for the equivalent number of users, and you stop serving me after that is exhausted. I can then choose to replenish today, tomorrow, next week, or never.
Sure. The simplest approach is to use a naive Bayesian classifier (you can find a bunch of open source implementations).
In my app, I actually track client-side actions -- essentially, clicks, but with more context. Anyway, you can treat a single user session like you would the text of an email, where the "actions" are the words.
From there, all you need to do is capture a bunch of sessions and tell the classifier which users are not strong computer users and which know what they're doing, passing the corresponding "documents".
Now you can feed in new documents and determine which of your users know what they're doing, and which aren't really computer users.
(In my experience, people fall into one of those two camps.)
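A minimal sketch of that (class labels and action names are made up, and real Laplace smoothing would use the vocabulary size in the denominator):

    // Each training "document" is one user session: an array of action names.
    function NaiveBayes() {
      this.counts = {}; // per-label action counts
      this.totals = {}; // per-label token totals
      this.docs = {};   // per-label session counts
      this.numDocs = 0;
    }

    NaiveBayes.prototype.train = function (label, actions) {
      var c = this.counts[label] = this.counts[label] || {};
      this.totals[label] = this.totals[label] || 0;
      this.docs[label] = (this.docs[label] || 0) + 1;
      this.numDocs += 1;
      for (var i = 0; i < actions.length; i++) {
        c[actions[i]] = (c[actions[i]] || 0) + 1;
        this.totals[label] += 1;
      }
    };

    NaiveBayes.prototype.classify = function (actions) {
      var best = null, bestScore = -Infinity;
      for (var label in this.counts) {
        // log prior + summed log likelihoods, with crude add-one smoothing
        var score = Math.log(this.docs[label] / this.numDocs);
        for (var i = 0; i < actions.length; i++) {
          var count = this.counts[label][actions[i]] || 0;
          score += Math.log((count + 1) / (this.totals[label] + 1));
        }
        if (score > bestScore) { bestScore = score; best = label; }
      }
      return best;
    };

    // var bayes = new NaiveBayes();
    // bayes.train('power-user', ['used-shortcut', 'opened-settings', 'exported-csv']);
    // bayes.train('novice', ['clicked-help', 'clicked-help', 'clicked-back']);
    // bayes.classify(['clicked-help', 'clicked-back']); // -> 'novice'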
It means that if you start tracking on day X, then on day Y decide you want to generate a certain report, the data is already there; you don't have to start tracking anything new.
I liked it very much, as a non-tech founder who is not quite sure how to define some events on KissMetrics and who regrets not having qualified a particular form of sign-up as an event before.
Comparing prices, though, a question: is every visitor of my site a new "user" for you? If so, then my average monthly events per user would be very low, and "event based" pricing would be better for me than "user based" pricing.
Comparing prices with KissMetrics, my guess is KM would be cheaper for me with an average of 3 events per visitor. But if your "user" entity doesn't count, at least, the ones that bounced, then maybe it's a good deal.
This is a really good point and something we thought about when constructing the pricing model. Right now we're not a very affordable option for sites with a lot of bounced traffic or low amounts of user interaction. However, we'll soon be amending our prices to better accommodate customers such as yourself.
In my case, if you just say "we don't count bouncing visitors as users", tchanananam!! you just became the perfect analytics tool for me. Even still considering that I have a lot of users that just click once and leave.
Please, consider it.
PS: "tchanananam" is a brazilian expression onomatopoeia when there is a punch line revelation...
Interesting pricing model, it's actually kind of nice to see it segmented by user instead of events - definitely a case of putting the more useful information for the customer first rather than just using events.
My first impression is that it would be a bargain for some businesses (low number of users / high value) and far too expensive for large consumer sites with low user value, I guess this is the case for most analytics providers though.
This seems really interesting, but starting at 250 users per month seems high. I notice the price/active user starts at a penny and falls 60% as the users climb to over 500,000.
Would it be possible to create a price option for even smaller companies, demo sites, etc, at a higher rate?
Say, what if the owner of a small blog wanted to use this to do really fine grained analysis of users actions but they average 250 hits a day?
You definitely didn't get it right. =) That's one of the joys of startup life, A/B test it until you do!
Even for a "successful" site (>1mm MAU) $2,000 a month feels insanely expensive. The performance difference between this and GA ($0) is not high enough _initially_ to offset the sticker shock.
Look at NewRelic, Tracelytics or even Mixpanel (expensive IMO) for better pricing levels. You've built the equivalent of "hosted snowplow" (https://github.com/snowplow/snowplow) with a nice GUI and tools, you should expect to charge something much closer to cost than your current approach appears to.
I'm not saying that Heap have got the pricing right, but as I said above, you should absolutely NOT base your pricing on cost - ask your customers about the value of what it is you're providing.
pricing seems high to me, unless I'm thinking about it in the wrong way. Is Unique Users/month equivalent to the number google analytics reports for 'Unique Visitors' over a 30 day span? I guess the site I'd consider using this on isn't really in your target market (1.8M uniques in the past month, not enough revenue to make the idea of dropping $2k+/mo on analytics attractive)
Great technology, but you guys will know a lot about my business and my users and I'd like to be convinced that you won't do ill with this data...not that I don't trust you, but it would be more comforting if you made your policy towards the information you collect more up-front and clear. But great job.
Very keen observation. There are a number of ways to solve this problem. Machine learning helps (there are a few simple heuristics we currently use to sort the Event Feed).
But the right interface for defining events helps even more.
What if you tracked the event's parents, e.g. button <- div <- .sidebar <- body, etc.? When presenting the events to the user, you could hide generic parents like 'div' and only show interesting parents with ids or classes. To search through events, perhaps the user could use css selector syntax in addition to the current UI you have.
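Something along these lines, say (just a sketch of the idea, not what Heap actually does):

    // Walk up from the clicked element and keep only "interesting" ancestors
    // (those with an id or class), yielding a readable path like
    // "body div.sidebar a#signup".
    function selectorPath(el) {
      var parts = [];
      while (el && el.nodeName && el.nodeName.toLowerCase() !== 'html') {
        var tag = el.nodeName.toLowerCase();
        if (el.id) {
          parts.unshift(tag + '#' + el.id);
        } else if (typeof el.className === 'string' && /\S/.test(el.className)) {
          parts.unshift(tag + '.' + el.className.split(/\s+/).filter(Boolean).join('.'));
        } else if (tag === 'body') {
          parts.unshift(tag);
        } // anonymous divs/spans are skipped entirely
        el = el.parentNode;
      }
      return parts.join(' ');
    }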
Can you discuss how you address the "schema-on-read" problem?
One of the downsides of the cheap bit-bucket approach of tech like NoSQL and Hadoop is that it's easy to get the data in, but harder to get it out. The producer of the data has less work. The consumer of the data now has more, since there are no longer guarantees on the structure.
I love the concept, definitely addresses a pain point in using other platforms. The thing I worry about is when I make some change to the styles or HTML structure of my site I have to now consider how that will break my analytics.
Great job! I think other analytics platforms will start to follow suit as more people start to realise that it's cheap enough to just mercilessly capture every user event. The UI looks beautiful and seems very intuitive too.
I don't think anyone is concerned about the CPU overhead. (Also, let's not talk about measuring things without defining exactly what was measured. On a blank page I'm sure your JS runs great.)
How does it affect network performance? If I click a link and it takes me to another page, how do you track that?
Good point. In terms of network performance, we try to batch events and minimize server requests.
For link tracking, our method is similar to how other analytics products track link clicks - we delay the page navigation by the minimum of: 1) 250ms, or 2) the time it takes for our server to respond to the click event.
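Roughly the pattern that description implies (a sketch, not Heap's actual code; the endpoint and payload are made up):

    // Hold the navigation until the tracking request returns or 250ms passes,
    // whichever happens first, then follow the link.
    document.addEventListener('click', function (e) {
      var link = e.target.closest ? e.target.closest('a[href]') : null;
      if (!link) return;
      e.preventDefault();
      var done = false;
      function go() {
        if (done) return;
        done = true;
        window.location.href = link.href;
      }
      var xhr = new XMLHttpRequest();
      xhr.open('POST', '/track', true);   // hypothetical collection endpoint
      xhr.onload = go;                    // server answered before the cap
      xhr.send(JSON.stringify({ event: 'click', href: link.href }));
      setTimeout(go, 250);                // hard cap at 250ms
    });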
Does Heap track event streams like mouse position and scrolling? The storage overhead would be significantly larger than instantaneous events like clicks, but this data would be interesting.
Unfortunately, we're not tracking those events now, mostly because they don't fit neatly into the event-based analytics model (i.e. they're not as "graphable").
But that doesn't mean they aren't important. Agreed the data can be very useful.
I'm obviously outside of the audience this is for, and probably most other commenters since only one other person mentioned performance.
Perf??
And of course, this follows the "include a script tag to our server" cloud product web API pattern. This isn't setting off giant alarm-bells for anyone else? This gets past the sniff test?
The fact that this doesn't fail the sniff test for most people is probably why most web-apps feel like most web-apps feel, instead of like http://prog21.dadgum.com/.
This looks great. I was looking in the docs and you even have a way to track purchases/revenue and identify users.
In addition, the fact that you're able to associate identity data with past actions should help tie users together across sessions and devices, if I'm understanding correctly.
From the Heap docs:
User properties are associated with all of the user's past activity, in addition to their future activity. Custom user properties can be queried in the same fashion as any other user property.
Looks nice. I do worry about the privacy implications of analytics like this. I'm also pretty sure that this doesn't fly in Europe... something to consider perhaps!
Awesome idea to help make capturing data easier, but how will you deal with the problem of the curse of dimensionality? As you start to capture more variables, your data begins to appear farther apart, leading to less meaningful results.
Also, while 'capturing everything' sounds like an awesome idea, it can lead to worse performance if you don't take into account that your data now has more variables that can each introduce noise.
The thing about the curse of dimensionality is that it only really kicks into play when you try to put every variable into a model (or when you have low n relative to p, generally). If you restrict yourself to analysing events that you actually care about, it's much less of an issue.
Anyone know of a decent user-tracking / analytics package that isn't SaaS? And isn't log-based, like Urchin (which I just discovered is yet another victim of Googlicide[1])?
My app is in dire need of user-tracking and analytics, but it's a self-hosted application and many of my clients run it on LAN segments that don't have WAN connectivity.
Wow, this is pretty amazing and something I've wished Google offered for a while. The one area I think you're going to need to work on is your pricing. It's just too high for most websites. Of course, if you're an elephant hunter...
I would love to use this product! I was talking with my team recently about having a product similar to Heap, and we've used a bunch of analytics products in the past.
@Founders of Heap:
I would love to do beta test for you guys. When do you guys open the gates for beta?
This sounds really awesome if they can manage to stay up under heavy load. It's such a huge pain to set up and maintain the tracking of events in application code. Having everything retroactively sounds positively great.
Does it also offer cohort analysis based on the time of a particular event (not just a particular property like Browser)?
E.g. filter everyone who signed up in the first week of January.
Looks great, but you should take a look at your pricing structure! Pricing is extremely difficult, so I don't mean to over-criticise; the thing is that you've basically priced yourself out for 90% of potential customers, especially larger ones.
I think that depends on what kind of conversion you're after. If you just want to harvest email addresses, then sure. If you're looking for satisfied paying customers, this kind of trickery won't work.
If you build a button that promises something for free, you need to give something for free — free tier, 30 day trial, ebook, screencast, etc. Having visitors fill out an account form is not a gift to them, it's a gift to YOU.
The issue here is that I'm sharing findings from my own data. This is not me making stuff up. I have actually used this very same button text to increase conversions (up by .04% on a site with thousands of visitors). It was not used to harvest emails. A practice that simply does not work, unless you plan on sending Viagra ads. I don't do such things.
> If you build a button that promises something for free, you need to give something for free — free tier, 30 day trial, ebook, screencast, etc. Having visitors fill out an account form is not a gift to them, it's a gift to YOU.
See, this is where you miss the point. The person is getting a free account. You might not think it's something, but there are a lot of accounts out there for which you have to pay. Need to go to Costco? Hey, you have to pay for membership (an account), just to make sure you can enter the store. Free accounts may not be what they used to be (in your opinion), but they are still very valuable overall.
Now, of course you are not going to just give them an account. That would be a waste of a good lead. You will then follow up with another offer. Say, pay half-off your service price for the first month. And so on.
I know some of this stuff looks rather strange to people here. But the business world is very, very different from programming.
Thanks for your explanation. I think we have different definitions of what an 'account' is. To me, an account is where a user shares his contact information (maybe also business & payment info). That in itself doesn't add value.
From your last comment, I gather that (to you) 'account' amounts to 'access to the service or product'. If so, we agree.
When you apply for Costco membership, you get something in return: the chance to buy products for less than they would normally cost. I don't see how that translates to the example provided.
NB: I'm a publisher, marketer, editor, and designer. I develop software, but I'm not a programmer.
Does that actually work on B2B products that are in the hundreds-or-thousands-of-dollars range every month? I don't think cost is really an issue in that world if it's delivering value at a large multiple; it's more an integration/implementation worry. Perhaps I'm wrong, though.
No answer would be prudent without knowing and understanding more facts about the product you refer to. But do test them. I have tested things that looked like they would not work (and did).