
Remember that reCAPTCHA v1 used to be noble: reading books and converting them to text.

Now you're just training Google's many machine learning algorithms by classifying data for them, which makes them more useful to the consumer, and thus more powerful.




I hate them as much as you do, but you're wrong. Those storefront and traffic sign captchas are not useful for training ML models. If they were, they would be much more varied, like the original ones (which were used for OCR).


>Those storefront and traffic sign captchas are not useful for training ML models.

Not to get all tin-foil-hat (though this is going to sound like it), but if you have a car with 9+ cameras on it driving through areas full of these, then maybe Google would have some use for it.

Bear in mind that I'm not saying they are doing this. But dismissing it unequivocally as something that can't or wouldn't be done ignores the possibility that it could prove useful to other areas of their business, which might have a vested interest in such use (say, for example, if Google or its parent company were trying to break into the self-driving car space[0]).

[0] - https://en.wikipedia.org/wiki/Waymo


Of course that's what they're doing... I thought this was well known? I don't think they claim otherwise.


> I hate them as much as you do, but you're wrong.

I would love to see some evidence of this (a link or something). The captchas I see look like pretty good edge-detection discriminators: street lights in tree limbs, bicycles against brick, and so on.


My brain is unwilling to accept this.

What is the purpose of those choices then?


Since they introduced the square-selecting captchas, I have always assumed they use them to identify the user. I bet that, depending on how you solve the captchas, they can identify who you are if their system already has a theory of who you might be.


They're there to deny access to automated scripts.


That's the reason they exist in the first place, but it doesn't answer the question of why they're implemented this particular way.


They're implemented this particular way to provide training data for image segmentation systems. They move the image around inside the frame, which lets them use a few people doing the challenge to build a boundary representation that can then be used to train things like YOLO-style ML systems.
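Roughly, the aggregation could work something like this (a toy sketch of the idea, not anything Google has published; the tile size, offsets, and vote threshold are all invented):

    import numpy as np

    TILE = 100  # hypothetical tile size in pixels

    def accumulate_votes(heatmap, offset, clicked_tiles):
        """Add one user's tile selections to a source-image heatmap."""
        ox, oy = offset  # where this user's crop sat in the source image
        for col, row in clicked_tiles:
            x0, y0 = ox + col * TILE, oy + row * TILE
            heatmap[y0:y0 + TILE, x0:x0 + TILE] += 1

    def estimate_bbox(heatmap, min_votes=3):
        """Threshold the heatmap and return a rough (x0, y0, x1, y1) box."""
        ys, xs = np.nonzero(heatmap >= min_votes)
        if len(xs) == 0:
            return None
        return xs.min(), ys.min(), xs.max() + 1, ys.max() + 1

    # Three users see the same image cropped at different offsets and
    # click the tiles containing (say) a traffic light in their view.
    heatmap = np.zeros((600, 600), dtype=int)
    accumulate_votes(heatmap, (0, 0), [(2, 1), (2, 2)])
    accumulate_votes(heatmap, (50, 0), [(1, 1), (1, 2), (2, 1), (2, 2)])
    accumulate_votes(heatmap, (0, 50), [(2, 0), (2, 1), (2, 2)])
    print(estimate_bbox(heatmap))  # pixel box where all three overlap

No single user draws a box; the boundary falls out of overlapping tile votes from many users seeing shifted crops.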


They are able to verify that the user's selection is correct, which is only possible if they already have the right answer. And if they already have the right answer, what are they training for?


They have some known right answers and some they don't know. They check that you get the known ones correct, and then they take the other answers you provide and add some confidence that those are correct. This bootstraps the system.
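In rough Python, the known/unknown mix might look something like this (a sketch of the mechanism as described above; the tile IDs, thresholds, and vote counts are all made up):

    from collections import defaultdict

    votes = defaultdict(lambda: [0, 0])  # tile_id -> [yes_votes, total_votes]
    CONFIRM_AT = 10  # hypothetical: promote a tile after this many votes

    def grade(known_answers, user_answers, unknown_tiles):
        """Pass/fail on the known tiles; tally votes on the unknown ones."""
        passed = all(user_answers[t] == label
                     for t, label in known_answers.items())
        if passed:  # only trust users who got the known tiles right
            for t in unknown_tiles:
                votes[t][0] += 1 if user_answers[t] else 0
                votes[t][1] += 1
        return passed

    def promote():
        """Unknown tiles with enough consistent votes become known answers."""
        new_known = {}
        for t, (yes, total) in votes.items():
            if total >= CONFIRM_AT:
                if yes / total > 0.9:
                    new_known[t] = True   # confidently contains the object
                elif yes / total < 0.1:
                    new_known[t] = False  # confidently does not
        return new_known

    # One user passes the known tile and marks unknown tile "u1" as a hit:
    grade({"k1": True}, {"k1": True, "u1": True}, ["u1"])

Once a tile collects enough consistent votes, it can graduate into the known pool and be used to grade future users.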



