Modern captchas are designed to be hostile to humans and to use us as free training for their AI algorithms. We do the labeling work for free while Google benefits from it, both technologically and financially.
The next captcha from Google will be even more aggressive: you're not logged in to Google = you're a bot, and you can't access that content.
It's very difficult to block an extremely motivated and targeted attack. With things like this, you aren't necessarily trying to block a highly targeted attack. You mostly need to ward off the bulk of low-effort bot spam and random internet trolls. Extremely tight security can be expensive and/or difficult for most organizations.
This is exactly why something like reCAPTCHA exists and is so widely used.
To me, it sounds like your system is just security through obscurity. It wouldn't scale: if it did become widely used, it would be very easy for bots to circumvent.
I normally agree with concerns about security through obscurity, but I disagree here: this isn't a security feature, it's spam protection. Everything that creates more work for an attacker helps reduce spam. On top of that, Google itself uses code obfuscation ("security through obscurity") in its captcha for precisely that reason.
It won't scale because it doesn't need to scale. It is a dead-simple solution to a complicated problem and works as long as it works, without selling your users' data and brainpower to one of the biggest tech companies there is.
If the spam bots ever overcome it, or your site becomes big enough to be targeted, you just swap it for something stricter, stronger, or more sophisticated.
Why do you assume they aren't automating it? The obvious move, if I'm a spammer, is to hire humans to solve the challenges, collect their output, and feed it into my ML training. I now have the same dataset Google is using, for my own ML.
Actually, I'm not sure I even need full ML: after a few rounds I can probably just use image comparison (not ML), answer from my cache of previously solved images, and only feed humans images I haven't seen before. Roughly like the sketch below.
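To make that concrete, here is a minimal Python sketch of the cache-and-compare idea, assuming a tiny average-hash for near-duplicate detection; CaptchaCache, human_solve, and the distance threshold are all made-up names for illustration, not anything a real solver operation is known to ship.

    # Cache captcha answers keyed by a perceptual hash; only pay a human
    # for images we have never seen before. Requires Pillow.
    from PIL import Image

    def average_hash(img: Image.Image, size: int = 8) -> int:
        """Tiny perceptual hash: downscale, grayscale, threshold on the mean."""
        small = img.convert("L").resize((size, size), Image.LANCZOS)
        pixels = list(small.getdata())
        mean = sum(pixels) / len(pixels)
        bits = 0
        for p in pixels:
            bits = (bits << 1) | (1 if p > mean else 0)
        return bits

    def hamming(a: int, b: int) -> int:
        return bin(a ^ b).count("1")

    class CaptchaCache:
        def __init__(self, max_distance: int = 5):
            self.known = {}              # hash -> previously observed answer
            self.max_distance = max_distance

        def solve(self, img: Image.Image, human_solve) -> str:
            h = average_hash(img)
            # Captcha tiles repeat, so a small Hamming distance usually
            # means "same image, slightly re-encoded or re-cropped".
            for known_hash, answer in self.known.items():
                if hamming(h, known_hash) <= self.max_distance:
                    return answer
            # Unseen image: pay a human once, remember the answer.
            answer = human_solve(img)
            self.known[h] = answer
            return answer

Once the cache covers the common tiles, each captcha costs a lookup instead of a human, and the humans only ever see genuinely new images.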
Round two, of course, is to expand on this. ML for image recognition isn't hard (other than the compute cost). I can also collect statistics: images humans take longer on, I will take longer on as well (I can potentially collect eye movement, so I have better data than Google here, and that can feed into ML). Images humans are unsure of, I will fake being unsure of, by sometimes clicking and sometimes not, at rates similar to humans. See the sketch below.
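A hedged sketch of that "behave like the humans did" idea, again in Python: record per-image solve times and click rates from the hired humans, then sample from those when the bot answers. ImageStats, record_human, and bot_response are hypothetical names, purely for illustration.

    import random
    from dataclasses import dataclass, field

    @dataclass
    class ImageStats:
        solve_times: list = field(default_factory=list)  # seconds per human solve
        clicks: int = 0                                   # humans who clicked this tile
        views: int = 0                                    # humans who saw this tile

    def record_human(stats: ImageStats, seconds: float, clicked: bool) -> None:
        """Log one human solver's behaviour on this image."""
        stats.solve_times.append(seconds)
        stats.views += 1
        if clicked:
            stats.clicks += 1

    def bot_response(stats: ImageStats):
        """Delay and click decision sampled from what humans did on this image.
        Assumes at least one human observation has been recorded."""
        # Hesitate about as long as a randomly chosen human did, with jitter.
        delay = random.choice(stats.solve_times) * random.uniform(0.8, 1.2)
        # If humans disagreed on this tile, disagree at roughly the same rate.
        click = random.random() < (stats.clicks / stats.views)
        return delay, click

The point of sampling from real per-image distributions rather than using a fixed delay is that the bot's hesitation and disagreement then look statistically like the humans it learned from.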
I don't know what ML Google has that isn't public, but we also don't know what scammers have. Ultimately, Google needs to expose enough data to scammers (who see more captchas than anyone else, by the nature of their operations) that the scammers' ML algorithms get a large training set. Once a scammer realizes what types of data Google is looking for, it isn't hard to collect other samples for a private training set. Go outside in any city and you will find stoplights and street signs; you now have training data that isn't Google's to test on. You need a few cities' and seasons' worth, of course, but that is an implementation detail.
Which is why spammers will have humans in the back room for the foreseeable future. If Google tries something different, they go to humans to figure it out; if Google keeps doing the same thing, they automate it.
The game is more expensive for Google: Google needs expensive people to create the scheme, while spammers can hire cheap people to figure it out. (If cheap people can't figure it out, Google has failed, because ordinary users won't manage it either.) Spammers only need expensive people if and when they decide to automate the scheme.