Friend/foe individual writers on Hacker News
Number theorist Jared Lichtman says this AI proof is from "The Book", the highest compliment one can give. He also says:
> I care deeply about this problem, and I've been thinking about it for the past 7 years. I'd frequently talk to Maynard about it in our meetings, and consulted over the years with several experts (Granville, Pomerance, Sound, Fox...) and others at Oxford and Stanford. This problem was not a question of low visibility per se. Rather, it seems like a proof which becomes striking…
Holy moly this is upsetting to see on HN. If even here we're cheering on data center bans, AI is on track to become the next Concorde, or nuclear in the US. AI is the most amazing tech innovation that I've seen in my career since I started programming Perl back in 1994... Gosh, I'm gonna be gloomy for the next day.
The Anthropic writeup addresses this explicitly:
> This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview after a thousand runs through our scaffold. Across a thousand runs through our scaffold, the total cost was under $20,000, and we found several dozen more findings. While the specific run that found the bug above cost under $50, that number only makes sense with full hindsight. Like any search process, we can't know in advance which run will succeed.
Mythos scoured t…
As you know, I deeply respect you. Not trying to argue here, just to provide my own perspective:
> Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article?
I write things for two main reasons. First, I feel like I have to. I need to create things. On some level, I would write stuff down even if nobody reads it (and I do do that already, with private things.) But secondly, to get my ideas out there and try to…
As is always the case with incredibly precise and rigorously fact-checked reporting like this, where every word is chosen carefully (the initial closing meeting for this one was nearly eight hours long, with full deliberation about each sentence), there is more out there on that subject than is explicitly on the page.
One of the big users of Odin at the moment is JangaFX's EmberGen, which does real-time volumetric fluid simulations for games and film. https://jangafx.com/software/embergen/
Odin has aided them with a huge amount of productivity and sanity of life which other languages such as C or C++ cannot offer, such as a strong and comprehensive type system, parametric polymorphism which is a pleasure to use, the implicit context system, extensive support for custom allocators, the `using` statement, the `d…
The issue is that “women’s sports” is itself intentionally discriminatory. That the issue of discrimination comes up is to be expected.
That the idea of competitive sports exists in a framework of discrimination means that you will always have unhappy people.
The good news is that sports, for the most part, is mostly symbolic, and rarely affects one’s livelihood.
My two cents as a transfem athlete:
The attention this topic receives is disproportionate considering how rare we are, especially close to the Olympic level.
Most of us do sports for fun/friends and don’t care how they rank us, but would be sad to be banned.
There might be more “biological advantage” nuance with people just starting their transition, but by this many years in it feels silly. I registered as a man for the last event in case anyone might get upset; the staff changed it to say “woman…
We have ceded too much ground in this debate. When I say "trans women are women" I mean that, ontologically, it is really true that trans women are a subcategory of the general class "women."
Like you say, we are searching for outliers. We don't cut women that are too strong or too tall. We shouldn't cut out women that happen to be trans. If all the top levels of women's sport end up dominated by trans athletes (something I don't see occurring, and that isn't supported by the data), then good, ou…
I might as well answer my own question, because I do think there are some coherent arguments for fundamental LLM limitations:
1. LLMs are trained on human-quality data, so they will naturally learn to mimic our limitations. Their capabilities should saturate at human or maybe above-average human performance.
2. LLMs do not learn from experience. They might perform as well as most humans on certain tasks, but a human who works in a certain field/code base etc. for long enough will internalize the r…
You're not doing yourself a favor when you point out "but they can't do arithmetic!" as if anyone says otherwise. Yes, we all know they can't do arithmetic, and that's just how they work.
I feel like I'm saying "this hammer is so cool, it's made driving nails a breeze" and people go "but it can't screw screws in! Why won't anyone talk about that! Hammers really aren't all they're cracked up to be".
Hey, yeah, the default specification includes a set of action generators that are picked from randomly. If you write a custom spec you can define your own action generators and their weights.
Rerunning things: nothing built for that yet, but I do have some design ideas. Repros are notoriously shaky in testing like this (unless run against a deterministic app, or inside Antithesis), but I think Bombadil should offer best-effort repros if it can at least detect and warn when things diverge.
Shrinking…
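The weighted-generator idea above can be sketched roughly like this. This is a hypothetical illustration: the generator names, weights, and spec shape below are invented for the example, not Bombadil's actual API.

```python
import random

# Hypothetical spec: names and weights are illustrative only.
ACTION_GENERATORS = {
    "click_random_button": 3,  # picked ~3x as often as a weight-1 entry
    "type_random_text": 2,
    "navigate_back": 1,
}

def pick_action(rng: random.Random) -> str:
    """Pick one action generator name, biased by its weight."""
    names = list(ACTION_GENERATORS)
    weights = [ACTION_GENERATORS[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

# A fixed seed is the usual best-effort step toward reproducible runs,
# though (as the comment notes) repros stay shaky against a
# non-deterministic app.
rng = random.Random(42)
actions = [pick_action(rng) for _ in range(5)]
print(actions)
```

Rerunning with the same seed replays the same action sequence, which is what makes divergence detectable: if the app responds differently on the replay, the run was non-deterministic.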
A wrong and short-sighted take, given that the LLM explores serially, learning along the way, and can use tools and change code arbitrarily. It seems to currently default to something resembling hyperparameter tuning in the absence of more specific instructions. I briefly considered calling the project “autotune” at first, but I think “autoresearch” will prove to be the significantly more appropriate name.
It’s not false. But it’s also weaselly worded.
Note that the article doesn’t say that he told staff they have to attend the meeting. It says he “asked” staff to attend the meeting. Which again, it’s really, really normal for there to be an encouragement of “hey, since we just had an operational event, it would be good to prioritize attending this meeting where we discuss how to avoid operational events”.
As for the second quote: senior engineers have always been required to sign off on changes from…
So you read the three-part series of blog posts that are packed with details, in 3 minutes after I shared the link, and put yourself into a position of entitled opinion, calling my position a silly take? Sure thing.
I run a small open source LLM inference company, Synthetic.new. As far as I can tell, CNBC isn't reporting this accurately: the problem isn't that Oracle is building "yesterday's data centers": they're building Blackwell DCs! Those are today's DCs.
The problem appears to be that Oracle is building today's DCs... tomorrow. And by the time they come online, Vera Rubins will be out, with 5x efficiency gains. And Oracle is unlikely to want to drop the price of Blackwells 5x, despite them being 5x les…
Right. The alternative is that we reward Dan for his 14 years of volunteer maintenance of a project... by banning him from working on anything similar under a different license for the rest of his life.
I’m touched that “Ghostty but for X” is a marketing point, but what does it mean in this case? I thought this might be based on the architecture I did for Ghostty. But it’s not. Or it might be full native UI, but it’s not (it’s GPUI). Not trying to be rude or unappreciative, but as the creator of Ghostty here… what do you mean?
Not trying to be snarky, with all due respect... this is a skill issue.
It's a tool. It's a wildly effective and capable tool. I don't know how or why I have such a wildly different experience than so many that describe their experiences in a similar manner... but... nearly every time I come to the same conclusion that the input determines the output.
> If they implement something with a not-so-great approach, they'll keep adding workarounds or redundant code every time they run into limitations l…
What I keep hearing is that the people who weren't very good at writing software are the ones reluctant to embrace LLMs because they are too emotionally attached to "coding" as a discipline rather than design and architecture, which are where the interesting and actually difficult work is done.
The marquee feature is obviously the 1M context window, compared to the ~200k other models support, with maybe an extra cost for generations beyond 200k tokens. Per the pricing page, there is no additional cost for tokens beyond 200k: https://openai.com/api/pricing/
Also per pricing, GPT-5.4 ($2.50/M input, $15/M output) is much cheaper than Opus 4.6 ($5/M input, $25/M output), and Opus has a penalty for its beta >200k context window.
I am skeptical whether the 1M context window will provide mate…
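For scale, the quoted per-token prices can be turned into a back-of-envelope cost for a single long-context request. Prices are as quoted in the comment (not verified here), and Opus's >200k-context surcharge is ignored.

```python
# Prices in $ per million tokens, as quoted above (unverified).
GPT_IN, GPT_OUT = 2.50, 15.00     # GPT-5.4
OPUS_IN, OPUS_OUT = 5.00, 25.00   # Opus 4.6, before any >200k surcharge

def request_cost(price_in, price_out, in_tokens, out_tokens):
    """Dollar cost of one request at per-million-token prices."""
    return (price_in * in_tokens + price_out * out_tokens) / 1_000_000

# A long-context request: 900k input tokens, 10k output tokens.
print(request_cost(GPT_IN, GPT_OUT, 900_000, 10_000))    # 2.4
print(request_cost(OPUS_IN, OPUS_OUT, 900_000, 10_000))  # 4.75
```

So on the quoted numbers, a near-full-context request runs roughly half the price on GPT-5.4, even before Opus's long-context penalty kicks in.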
The only code anyone will be touching in a museum in 800 years will be the good code. I hope they don't talk about what great craftsmen we all were because someone saw an original Fabrice Bellard at the Louvre.
Survivor bias plays a role in glorifying the past.
Many people don't know this, but the Luddites were right. I studied Art History and this particular movement. One of the claims of the Luddites is that quality would go down, because their craft took half a lifetime to master (it was passed down from parent to child).
I was able to feel wool scarves made in Europe in the Middle Ages (in museum storage, under the guidance of a curator). They are a fundamentally different product than what is produced in woolen mills. A handmade (in the old tradi…
Libraries create boundaries, which are in most cases arbitrary, that then limit the way you can interact with code, creating more boilerplate to get what you want from a library.
Abstractions are the source of bloat. Without abstractions you can always reduce bloat, or you can reduce bloat in your glue, but you can't reduce glue.
It takes discipline to NOT create arbitrary function signatures and short-lived intermediate data structures or type definitions. This is the beginning of boilerplate.
So…
What the author and many others find hard to digest is that LLMs are surfacing the reality that most of our work is a small bit of novelty against boilerplate, redundant code.
Most of what we do in programming is some small novel idea at high level and repeatable boilerplate at low level. A fair question is: why hasn’t the boilerplate been automated as libraries or other abstractions? LLMs are especially good at fuzzy abstracting repeatable code, and it’s simply not possible to get the same resul…
> I don't know how one can spin this as a bad thing.
People spin all kinds of things if they believe (accurately or not) that their livelihood is on the line. The knee-jerk "AI universally bad" movement seems just as absurd to me as the "AGI is already here" one.
> Spore is well acclaimed. Minecraft is literally the most sold game ever.
Counterpoint: Oblivion, one of the first high-profile games to use procedural terrain/landscape generation, seemed very soulless to me at the time.
As I see it, it'…
Fine-tuning is a story that is nice to tell, but one that makes less and less sense with modern LLMs. Modern LLMs are so powerful that they are able to few-shot learn complicated things, so a strong prompt and augmenting the generation (given the massive context window of Qwen3.5, too) is usually the best option available. There are models for which fine-tuning is great, like image models: there, with LoRA, you can get good results in many ways. And LLMs of the past, too: it made sense for certain use…
Oh, I wrote up a post on X on this exact question! https://x.com/danielhanchen/status/1979389893165060345?s=20
1. Cursor used online RL to get +28% approval rate: https://cursor.com/blog/tab-rl
2. Vercel used RFT for their AutoFix model for V0: https://vercel.com/blog/v0-composite-model-family
3. Perplexity's Sonar for Deep Research Reasoning I think was a finetuned model: https://docs.perplexity.ai/docs/getting-started/overview
4. Doordash uses LoRA, QLoRA for a "Generalized Attribute Extraction mod…
At some point you just have to stop responding to these "stochastic parrot/auto-complete" people.
It isn't worth your intellectual bandwidth. They will eventually understand or they won't (which I'm not sure how that is going to work for them... but the Amish had to start somewhere, I suppose).
Well, I just fired it up on Windows, and I already dislike it. And I went in with a positive attitude, because I would welcome a better tool than VS Code.
Main problem: no menu. Where are the settings? The first thing I wanted to do was move the file treeview to the left side; I don't know what country the authors live in, but in Western countries we read from left to right. But nope, there's no View menu or anything of the sort.
Then I examined every other little button around the UI, to no avail. I…
I need more than that, because I have no guarantee that it's true. I need the source. Or I at least need them to provide a build that they promise doesn't have that stuff in it at all, so that if any analysis was done on a decompilation, there would be some level of certainty that they were telling the truth. Anything that leaves any of it in complicates that effort and makes the certainty that much less certain.
We should start a support group.
I feel like LLMs[1] are going to cause a kind of "divorce" between those who love making software and those who love selling software. It was difficult for these two groups to communicate and coordinate before, and now it is _excruciating_. What little mutual tolerance and slack there was is practically gone.
Open source was always[2] a fragile arrangement based on the kind of trust that involves looking at things through one's fingers (turning a blind eye may be…
Thank you, that feels like important context!
Agree. Additionally, it’s really disheartening that people do this with Erdos problems specifically. They are not major research questions in mathematics, but were intended as little conjectures that people could use as a way into serious number theory with a small cash reward and a little bit of minor fame for being the person who did the work to solve one of them. They are not things where the solution itself provides an amazing amount of insight or moves the frontier of mathematics forward p…
One of the people on the Erdös problem website (https://www.erdosproblems.com/forum/thread/1196), Jared Lichtman, is involved in an AI startup: https://www.math.inc/
That AI startup also partners with Terence Tao:
https://www.math.inc/veritas-fellowships
https://www.math.inc/a-conversation-with-terry-tao
These two AI "enthusiasts" have massive conflicts of interest, which should perhaps be investigated by an ethics commission.
Which one do you trust most, the disclaimers or the article?
> into thinking they are turbo-charged devs
Fortunately no one sane enough among us, computer programmers, believes in that BS; we all see this masquerade for what it mostly is, basically a money grab.
Next up:
Spend $1.99 and get a chest full of Anthropic emeralds that you can redeem for Claude Chests, and a chance at winning a million more tokens.
Or watch this 3-minute ad for 1000 tokens.
I did not think this day would come this soon, but I assure you that Anthropic has no moat.
My kingdom for a way to stop this godforsaken industry from stripping Tolkien's fiction for parts.
Isn't this pretty much the standard across projects that make heavy use of AI code generation?
Using AI to generate all your code only really makes sense if you prioritize shipping features as fast as possible over the quality, stability and efficiency of the code, because that's the only case in which the actual act of writing code is the bottleneck.
Probably all the described problems stem from the developers using agent coding, including using TypeScript, since these tools are usually more familiar with JS and JS-adjacent web development languages.
Why do we think this emerged “on its own”? Surely this technique has been discussed in research papers that are in the training set.
tfw le AI guy has LLM psychosis. We're cooked
I think the OP's comment is entirely fair. Karpathy and others come across to me as people putting a hose into itself: they work with LLMs to produce output that is related to LLMs.
I might reframe the comment as: are you actually using LLMs for sustained, difficult work in a domain that has nothing to do with LLMs?
It feels like a lot of LLM-oriented work is fake. It is compounding "stuff," both inputs and outputs, and so the increased amount of stuff makes it feel like we're living in a higher…
Have you actually used LLMs for non-trivial tasks? They are still incredibly bad when it comes to actually hard engineering work, and they still lie all the time; it's just gotten harder to notice, especially if you're just letting it run all night and generate reams of crap.
Most people are optimizing for terrible benchmarks, and then don't really understand what the model did anyway and just assume it did something good. It's the blind leading the blind, basically, and a lot of people with an AI-p…
I feel like most of this recent Autoresearch trend boils down to reinventing hyperparameter tuning. Is the SOTA still Bayesian optimization when given a small cluster? It was ~3 years ago when I was doing this kind of work; I haven't kept up since then.
Also, shoutout SkyPilot! It's been a huge help for going multi-cloud with our training and inference jobs (getting GPUs is still a nightmare...)!
Oh, I thought it was about the wholesale theft (relicensing) of code by laundering it through an LLM trained on the same code. Why not both?
> We, agentic coders, can easily enough fork their project and add whatever the features
Bold of you to assume that people won’t move (and their code along with it) to spaces where parasitic behaviour like this doesn’t occur, locking you out.
In addition to just being a straight-up rude, disrespectful and parasitic position to take, you’re effectively poisoning your own well.
Well at least you, agentic coders, already understand they need to fork off.
Saves the rest of us from having to tell you.
> I've done nothing to argue that the harm isn't real, downplayed it, nor misrepresented it.
You're literally saying that the upsides of hallucinogenic gifts are worth the downside of collapsing society. I'd say that that is downplaying and misrepresenting the issue. You even go so far as to say:
> Telling people "no AI!" (even if very well defined on what that means) is toothless against people with little regard for making the world (or just one specific repo) a better place.
These aren't balanced argu…
The premise that LLMs are "AI" is false, but they are good at problems like context search and isomorphic plagiarism.
Given the liabilities of relying on public and chat users' markdown data to sell to other users without compensation, a number of issues arise:
1. Copyright: LLM-generated content can't be assigned copyright (USA), and thus may contaminate licensing agreements. It is likely public domain, but may also conflict with GPL/LGPL when stolen IP bleeds through weak obfuscation. The risk has zero…
Wait. You built a new language that there's thus no training data for.
Who the hell is going to use it then? You certainly won't, because you're dependent on AI.
I'd love to be a fly on the wall when this argument is tried in front of a bankruptcy court. It drives me nuts. Of course there's evidence that they're selling tokens at a loss.
The only thing these companies sell are tokens. That's their entire output. OpenAI is trying to build an ad business, but it must be quite small still relative to selling tokens, because I've not yet seen a single ad on ChatGPT. It's not like these firms have a huge side business selling Claude-themed baseball caps.
That mea…
They're all that small if you split them as OP did. Just look at "transportation": it's like 25% of CO2 emitted globally, but once you break it down:
Aviation is 2.5%: https://ourworldindata.org/global-aviation-emissions
Shipping industry is 3%: https://www.transportenvironment.org/topics/ships
Large truck freight is 3%: https://www.statista.com/statistics/1414750/carbon-dioxide-e...
Medium truck freight is 1%
The single biggest non-divisible sector you can realistically come up with is "personal tra…
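The arithmetic behind that breakdown, with figures exactly as quoted in the comment (the dictionary keys are my labels):

```python
# Sector shares in percent of global CO2, as quoted above (unverified).
shares = {
    "aviation": 2.5,
    "shipping": 3.0,
    "large_truck_freight": 3.0,
    "medium_truck_freight": 1.0,
}

named = sum(shares.values())
print(named)         # 9.5 — the named slices together
print(25.0 - named)  # 15.5 — what's left of the ~25% transport total
```

So the named freight/aviation/shipping slices cover well under half of the ~25% attributed to transport, leaving personal transport as the single biggest remaining chunk.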
This is basically a weaponized, highly destructive version of the old MySpace Samy worm. Hitting MediaWiki:Common.js is the absolute nightmare scenario for MediaWiki deployments because that script gets executed by literally every single visitor and editor across the entire site, creating a massive, instant propagation loop. The fact that it specifically targets admins and then uses jQuery to blind them by hiding the UI elements while it silently triggers Special:Nuke in the background is incred…
I love OSS drama so I found some links:
https://github.com/SerenityOS/serenity/pull/6814
https://x.com/LundukeJournal/status/1970907449499484266
I think his claim basically boils down to "if you're expecting AI, LLMs don't cut it". And I think he's basically right on that count. There's a lot of tooling and harnessing being put in place to course correct them on the job, and from the other angle standards are simply being lowered to accommodate them. So they can be made to be useful, but they're still not what you would want from an actual AI. Marcus wants to augment them with symbolic AI. I don't know how feasible that is, but he's not…
Hacker Smacker — Friend and foe writers on Hacker News