It's actually a perfect analogy, IMO. A tattoo is a form of (typically chosen) self-branding. A lot of companies make great products and then diminish them through their misuse of branding, usually in a tactless way. This is a prime example of that.
A lot of people buy products because of the branding. How many people would buy a YETI cooler or a Coach bag if it didn't have the branding to show off that they have a YETI cooler or a Coach bag? It's conspicuous consumption.
I think those two in particular might be poor examples, since they generally do quite a bit of de-branding on their products. I know what you mean, though. Supreme is a prime brand for this. Nobody is buying a Supreme t-shirt for any other reason than the logo. Whatever your thoughts on people who buy things to impress others, those people exist, and to each their own. They at least made the choice to let everyone else know what they value.
> Enforcing traffic laws is good, actually. Automated enforcement is even better so that we don't need to use armed police and can enforce consistently.
We don't use armed police to enforce traffic laws. Police mainly monitor traffic as a revenue device. It's already been proven that monitoring traffic and automating fines in fact promotes reckless driving and causes more accidents than it stops.
> We don't use armed police to enforce traffic laws.
In what world? In the US "manual" traffic enforcement is almost exclusively done by armed police and sheriffs. Unarmed civilian traffic enforcement is only done in Berkeley, CA, and a town in Minnesota, afaik.
> It's already been proven that monitoring traffic and automating fines in fact promotes reckless driving and causes more accidents than it stops.
Do you have a citation for that? There are numerous studies that show significant drops in accident rates in areas with red light cameras. On the order of 10-23%!
They definitely can be. That's the slope. Where do you stop? Any totalitarian worth their salt can easily make that leap: use monitoring to curb one type of crime and "undesirable" behavior, then why not use it for other types, and before you know it your entire existence is monitored in detail just to make sure you're acting exactly the way "they" want you to. That's how it works. "I have nothing to hide" is a long-debunked argument.
Then why enforce any laws? Any enforcement is on the same slippery slope.
I don’t think the slope is nearly as slippery as you claim. There are miles of high-friction slope between enforcing traffic laws on public roads and totalitarianism.
Technically, it's both. Parents don't teach their kids because they didn't/don't really know. Financial literacy isn't in the curriculum at all at the pre-college level. Even then, it's not a "core" competency in any degree that isn't finance-related. Even just ~10-15 years ago, I wouldn't say the topic was discussed as widely as it is today.
COVID really pumped up the market. Lots of hiring at insane TCO, especially for unproven talent. A correction had to happen. As someone else said, if you have the experience and skills and fall into the comp range companies are willing to pay, you'll be fine. If you're missing any of that, it's going to be rough. My company is hiring like crazy, but only for very specific dev roles.
More tokens = more useful compute towards making a prediction. A query with more tokens before the question is literally giving the LLM more "thinking time".
It correlates, but the intuition is a bit misleading. What's actually happening is that asking the model to generate more tokens increases the amount of information present in its context block, which the model has learned to use.
It's why "RAG" techniques work, the models learn during training to make use of information in context.
At the core of self-attention is a dot-product similarity measurement, which makes the model act like a search engine.
It's helpful to think about it in terms of search: the shape of the outputs looks like conversation, but we're actually prompting the model to surface information from the QKV matrices internally.
Does it feel familiar? When we brainstorm, we usually chart graphs of related concepts, e.g. blueberry -> pie -> apple.
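To make the attention-as-search framing concrete, here's a minimal numpy sketch of scaled dot-product attention with random weights (illustration only, not a real model): each query is dot-producted against every key to get relevance scores, and a softmax over those scores gates a weighted retrieval of values.

```python
import numpy as np

def attention(x, W_q, W_k, W_v):
    Q, K, V = x @ W_q, x @ W_k, x @ W_v
    # Dot product of every query against every key: a relevance score,
    # much like a search engine ranking documents against a query.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted "retrieval" of values

rng = np.random.default_rng(0)
d = 16
x = rng.normal(size=(5, d))                 # 5 tokens, d-dim embeddings
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
print(attention(x, W_q, W_k, W_v).shape)    # (5, 16): one mixed value per token
```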
>What's actually happening is that asking the model to generate more tokens increases the amount of information present in its context block, which the model has learned to use.
I'm not saying this isn't part of it, but even if it's just dummy tokens without any new information, it works.
This paper is a great illustration of how little is understood about this question. They discovered that appending dummy tokens (ignored during both training and inference) improves performance somehow. Don’t confuse their guess as to why this might be happening with actual understanding. But in any case, this phenomenon has little to do with increasing the size of the prompt using meaningful tokens. We still have no clue if it helps or not.
>They discovered that appending dummy tokens (ignored during both training and inference) improves performance somehow. Don’t confuse their guess as to why this might be happening with actual understanding.
More tokens means more compute time for the model to utilize; that much is completely true.
What they guess is that the model can utilize the extra compute for better predictions even if there's no extra information to accompany this extra "thinking time".
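As a rough back-of-the-envelope sketch (my numbers, not the paper's: single attention layer, single head, MLP and softmax ignored, standard 2mnk FLOPs per matmul), appending dummy tokens visibly grows the compute spent in a forward pass even though they carry no information:

```python
def attention_flops(n_tokens, d_model=4096):
    # Matmul FLOPs in one self-attention layer as a function of sequence length.
    qkv = 3 * 2 * n_tokens * d_model * d_model   # Q, K, V projections
    scores = 2 * n_tokens * n_tokens * d_model   # Q @ K^T
    mix = 2 * n_tokens * n_tokens * d_model      # attention weights @ V
    out = 2 * n_tokens * d_model * d_model       # output projection
    return qkv + scores + mix + out

base = attention_flops(100)
padded = attention_flops(150)  # same prompt plus 50 dummy tokens
print(f"compute with dummy tokens appended: {padded / base:.2f}x")
```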
Yes, more tokens means doing more compute, that much is true. The question is whether this extra compute helps or hurts. This question is yet to be answered, as far as I know. I tend to make my GPT-4 questions quite verbose, hoping it helps.
This is completely orthogonal to CoT, which is simply a better prompt - it probably causes some sort of better pattern matching (again very poorly understood).
>The question is whether this extra compute helps or hurts.
I've linked 2 papers now that show very clearly the extra compute helps. I honestly don't understand what else it is you're looking for.
>This is completely orthogonal to CoT, which is simply a better prompt - it probably causes some sort of better pattern matching (again very poorly understood).
That paper specifically dives into the effect of the length of the CoT prompt. It makes little sense to say - "oh, it's just the better prompt" - when CoT prompts with more tokens perform better than shorter ones, even when the shorter ones contain the same information.
There is also a clear correlation between task difficulty and length.
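To illustrate the distinction (the phrasing below is mine, not taken from the paper): both few-shot prompts carry the same information, but the second elicits a longer chain of reasoning, which is the length effect being described.

```python
# Two prompts with identical information content; the second demonstrates
# and requests a longer reasoning chain before the final answer.
short_prompt = (
    "Q: A train leaves at 3pm and arrives at 7pm. How long is the trip?\n"
    "A: 4 hours.\n"
    "Q: {question}\nA:"
)

long_cot_prompt = (
    "Q: A train leaves at 3pm and arrives at 7pm. How long is the trip?\n"
    "A: Let's think step by step. The train departs at 3pm. It arrives at 7pm. "
    "The elapsed time is 7pm minus 3pm, which is 4 hours. So the answer is 4 hours.\n"
    "Q: {question}\nA: Let's think step by step."
)
```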
It's not a loss if they're still net positive from their unrealized gains. It's basically the homeownership version of "I know what I've got", but in fact they don't.
> I’ve heard similar tactics being used at other companies–mostly large companies–and it’ll only continue in 2024 as they make decisions that drive short term profits over all else.
When you tie leadership incentives to short-term profits, that's the only type of decision making that will be done.