More

randomifcpfan · 2025-01-28T14:10:07 1738073407

In my application, code generation, the distilled DeepSeek models (7B to 70B) perform poorly. They imitate the reasoning of the r1 model, but their conclusions are not correct.

The real r1 model is great, better than o1, but the distilled models are not even as good as the base models that they were distilled from.

randomifcpfan · 2025-01-26T04:45:57 1737866757

The DeepSeek R1 paper explains how they trained their model in enough detail that people can replicate the process. Many people around the world are doing so, using various sizes of models and training data. Expect to see many posts like this over the next three months. The attempts that use small models will get done first. The larger models take much longer.

Small r1 style models are pretty limited, so this is interesting primarily from an “I reproduced the results” point of view, not a “here is a new model that’s useful” pov.

rahimnathwani · 2025-01-26T05:18:00 1737868680

From the Deepseek R1 paper:

  For distilled models, we apply only SFT and do not include an RL stage, even though incorporating RL could substantially boost model performance. Our primary goal here is to demonstrate the effectiveness of the distillation technique, leaving the exploration of the RL stage to the broader research community.

The impression I got from the paper, although I don't think it was explicitly stated, is that they think distillation will work better than training the smaller models using RL (as OP did).

nielsole · 2025-01-26T05:42:52 1737870172

> We demonstrate that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance compared to the reasoning patterns discovered through RL on small models

I found this statement from the paper to be at odds with what you cited, but I guess they mean SFT+RL would be better than either just SFT and RL

rahimnathwani · 2025-01-26T06:10:03 1737871803

I think they're saying that some reasoning patterns which large models can learn using only RL (i.e. without the patterns existing in the training data), can't be learned by smaller models in the same way. They have to be 'taught' through examples provided during SFT.

randomifcpfan · 2024-12-28T20:12:59 1735416779

College degrees from reputable colleges used to serve this purpose, but grade inflation has greatly weakened this signal.

programjames · 2024-12-28T20:26:10 1735417570

You also want colleges to signal to their applicants, not force them to also signal for their alumni. The two will naturally be correlated, but you can do better by specializing.

randomifcpfan · 2024-09-29T12:17:24 1727612244

“You should consider using this in your requirements” implies that this is not a hard rule, it’s just an ignorable suggestion. It would be interesting to audit gov.uk web pages over time to see whether this advice is being followed.

philipwhiuk · 2024-09-29T12:53:46 1727614426

Having

> You should consider

Is Gov UK's way of allowing people internally to point to it and say 'Well, did you consider it?'.

UK Digital don't have any direct power to force change - they have to use sensible advice and internal process to encourage better design.

etothepii · 2024-09-29T12:33:37 1727613217

Don't forget the rules of British English that make it very clear that the grammatical construction: "you should consider" means "you must in all circumstances save for the immediate alternate outcome being a genocide."

randomifcpfan · 2024-09-29T13:06:50 1727615210

Thanks for explaining! That’s quite different from the US English (and RFC English) meaning of “should”.

etothepii · 2024-09-29T13:20:10 1727616010

This translation guide is usually helpful.

https://polish2english.files.wordpress.com/2011/11/55551980-...

I'm afraid I don't know what RFC English is and neither does Google.

valley_guy_12 · 2024-09-29T19:10:26 1727637026

Huh, for me the very first non-ad result for googleing RFC English is https://en.wikipedia.org/wiki/Request_for_Comments, which is the correct citation.

etothepii · 2024-09-30T00:26:47 1727656007

For me it was to do with Rugby Football.

layer8 · 2024-09-29T13:24:30 1727616270

To be fair, here is the RFC meaning:

   SHOULD   This word, or the adjective "RECOMMENDED", mean that there
   may exist valid reasons in particular circumstances to ignore a
   particular item, but the full implications must be understood and
   carefully weighed before choosing a different course.

It means you can’t simply ignore it, and instead have to have compelling reasons to justify any deviation.

valley_guy_12 · 2024-09-29T19:13:15 1727637195

Unfortunately, in many organizations, "the library we use doesn't follow this recommendation" is a valid compelling reason. Which means that in practice "SHOULD" effectively means "WOULD BE NICE IF".

randomifcpfan · 2024-09-08T02:16:30 1725761790

I remember seeing the PERQ at trade shows. The best thing about the PERQ was its monitor, which was unusually sharp for that era. It used a yellow-white long persistence phosphor. A CMU grad student friend told me that the monitor designer was “a close personal friend of the electron”, implying that the analog circuitry of the PERQ monitor was especially high quality.

randomifcpfan · 2024-08-07T14:30:43 1723041043

They identify 6 bugs/mistakes, of which, not doing staged releases, was the final mistake.

They stop short of identifying the real root issues of running at kernel level, and of not auto-backing-out updates that cause crashes, perhaps because those causes are harder to fix.

randomifcpfan · 2024-06-16T14:42:55 1718548975

A tool for updating bazel build target dependencies. It inspects build files and source code, then adds/removes dependencies from build targets as needed. It requires using global include paths in C/C++ sources. It is not perfect, but it is pretty nice!

rsc · 2024-06-16T16:01:45 1718553705

If you're using Go with Bazel, gazelle is available outside Google: https://github.com/bazelbuild/bazel-gazelle

Enabling tools like these was exactly the point of the enforced formatting. It worked extremely well.

dijit · 2024-06-17T00:43:33 1718585013

I should add that it seems Gazelle is being expanded to other programming languages other than Go.

For example: https://github.com/Calsign/gazelle_rust

randomifcpfan · 2024-06-13T03:38:45 1718249925

That’s certainly how it seemed from the Android side at the time. The Linux side was hoping that they could adapt the desktop Linux stack to work well on mobile devices, without introducing new concepts like wakelocks. It took them a long time to give up on that approach.

randomifcpfan · 2024-05-17T12:02:38 1715947358

https://opengoal.dev/ has reverse-engineered a compiler and runtime for the non-GC lisp GOAL that Naughty Dog used for the Jax and Daxter action adventure games. They have decompiled the game code for all three games.

VyseofArcadia · 2024-05-17T14:18:37 1715955517

Tangential rant:

My understanding is that Naughty Dog abandoned GOAL after the Jak games due to pressure from Sony. Sony wanted all of their studios to be able to share source code. [0]

I hate this. Any two organically grown codebases (like, for example, games from different studios) are going to be so different that significant, non-trivial code sharing between them is going to be impossible anyway. Anything sufficiently generic might as be distributed as a precompiled library, and then use the FFI in your language of choice to take advantage.

[0] https://web.archive.org/web/20070720142546/http://lists.midn...

cess11 · 2024-05-17T15:03:37 1715958217

"Add this to the difficulty curve of learning a new language for new hires".

This argument is such a pointy-haired boss argument. Mature applications and systems will be more complicated and take longer to learn than the basics of pretty much any programming language. Grab some juniors and teach them if local seniors don't want to work in the language for a reasonable price.

VyseofArcadia · 2024-05-17T15:24:40 1715959480

Most seniors I know wouldn't balk at learning a new language for a job, because most seniors know what you just stated to be true. It will take much longer to get up to speed on the codebase than it will be to learn the language. Even for "difficult" languages like a lisp or Haskell.

Management and HR seem to assume it will take significantly longer to get up to speed in a new language, but don't seem to care that new hires have to learn all of their weird C++ idioms that have built up over decades like atherosclerosis.

Jach · 2024-05-17T15:54:28 1715961268

Not to mention working programmers are expected to keep up with changes to C++, Python, Java, JavaScript (and its frameworks), Go2.0, etc., many of which constitute "new language" levels of different, not to mention actual language changes like JS -> TypeScript, or Java -> Kotlin, or ObjC -> Swift, and even occasionally mobile lang -> C++ (maybe just a shared core). There's plenty of evidence that it's not that bad. And meanwhile, Common Lisp hasn't changed, code from the 90s works unmodified, the only things to keep up with really are which libraries and implementation-specific features are new/interesting/in fashion (same as any language ecosystem).

pjmlp · 2024-05-17T19:28:10 1715974090

It is quite different, as those are incremental changes, and most of them can be ignored until there is a requirement to use a library or SDK that makes more recent features a requirement.

Any corporate developer knows the pain of actually being allowed to upgrade toolchains, traditionally lagging behind several years behind lang v-latest.

cess11 · 2024-05-17T20:14:04 1715976844

I much prefer upgraded libraries and tooling over sticking to deprecated stuff.

Handling upgrades is like doing the dishes, it has to be done, there's no use complaining about it.

pjmlp · 2024-05-17T20:17:40 1715977060

Someone has to put the money on the table for those upgrades.

This is the great thing about consulting instead of product development, developers are made constantly aware of their hourly rates.

No budget for spending time on upgrades, no upgrades.

cess11 · 2024-05-18T05:24:53 1716009893

Don't work for people that disallow upgrades and maintenance. Don't make deals with people who don't understand that software is finished when it's dismantled and the code deleted.

pjmlp · 2024-05-18T05:53:09 1716011589

Easier said than done, I have seen a lot since 1980's.

cess11 · 2024-05-18T20:30:02 1716064202

Do what you have to survive, then bail when you can.

randomifcpfan · 2024-03-24T19:04:22 1711307062

You can see them throughout Southern California. They are strikingly beautiful when in bloom. Not mentioned in the article is that the blossoms exude a sticky nectar that makes an absolute mess of the area under the tree.

_dp9d · 2024-03-24T22:48:21 1711320501

We had a very large one leaning over our pool in Australia. Every year Dad threatened to cut it down, every year Mum said she would leave if he did.

Tree is still there today, beautiful as ever. [1]

[1] (it's not flowering in this photo, it's the monster green one visible through the pergola - and it looks bigger than I remember!) https://bucket-api.domain.com.au/v1/bucket/image/2013673631_...

Also TONS of Eucalyptus and Jacarandas throughout South Africa which reminded me strongly of home.

ozzydave · 2024-03-25T02:18:52 1711333132

In Brisbane, the Jacarandas in bloom meant it was exam time

mixmastamyk · 2024-03-24T21:05:57 1711314357

Yes and later the purple blossoms fall too and become a purple goo. They’re pretty but probably belong in a park rather along streets with sidewalks.