Writing scales in time much better than speaking.
Talking in person provides the highest-bandwidth mode of communication, but good writing lasts much longer.
My academic pet peeve is when someone writes a paper where the bulk of the work involves software or data processing, but releases no code or data with it. In my experience, such papers are close to useless: the small, unstated problems the authors had to solve to make the project work often outnumber what is actually described in the paper.
This essay reminded me of the one valid justification for focusing more on the paper than on the code: a well-written paper describing an algorithm lasts much longer than a code implementation. I'd much rather have a thorough written description of an algorithm than some IDL scripts showing a proof of concept.
What can language models actually do?
The title of the article screams to me "this might just be wrong in 6 months," as is common in LLM critique articles. For example, this February 2023 Steven Pinker article points out that ChatGPT fails to answer "If Mabel was alive at 9 a.m. and 5 p.m., was she alive at noon?" Claude 3 Opus and GPT-4 both now answer this correctly. Other points he makes in the article hold up, and I don't see LLMs producing writing or writing advice anywhere close to the caliber of Pinker's, but it's a waste of time to argue "look, here's something language models can't do right now, so AI will never have this capability."
This author is a little more practical in explaining what LLMs can do, so he is less likely than Pinker to be wrong in 12 months. Still, he goes too far on what counts as "compression": many aspects of intelligence may effectively be compression, but I'm not sure poetry is one of them. Something as broad as "poetry," spanning so many genres and styles over the years, can't be tagged with a single label, especially if that label is "compression."