• 0 Posts
  • 15 Comments
Joined 11 months ago
Cake day: March 8th, 2024


  • But that’s the thing, there was actual competition. It’s not like they weren’t competing with each other.

    They are freaking out because the competition is Chinese, specifically. I seriously doubt the read of this situation would be that the bottom fell out of AI if one of the usual broligarchs had come up with a cheaper process for training.

    Did the US accidentally generate an incentive for that to happen in China by shoddily blocking tensor math accelerators but only the really fancy ones and only kinda sorta sometimes? Sure. But both the fearmongering being used to enforce those limitations and the absolute freakout they are currently having seems entirely disconnected from reality.

    Maybe we can go back to treating this as computer science rather than an arms race for a while now.

  • Sure, 15% isn’t the worst adjustment we’ve seen in a tech company by a long shot, even if the absolute magnitude of that loss is ridiculous because Nvidia is worth all the money, apparently.

    But everybody is acting like this is a seismic shock, which is fascinatingly bizarre to me. It seems the normie-investor axis really believed that forcing Nvidia to sell China marginally slower hardware was going to permanently cripple China’s ability to make chatbots, which I feel everybody had called out as being extremely not the case even before these guys came up with a workaround for some of the technical limitations.


  • But the models that are posted right now don’t seem any smaller. The full precision model is positively humongous.

    They found a way to train it faster. Fine. So they need fewer GPUs and can do it on slower ones that are much, much cheaper. I can see how Nvidia takes a hit on the training side.

    But presumably the H100 is still faster than the H800s they used for this and presumably running the resulting model is still just as hard. All the improvements seem like they’re on the training side.

    Granted, I don’t understand what they did and will have to go fishing for experts walking through it in more detail. I still haven’t been able to run it myself, either; maybe it’s still large but runs lighter on processing and that’s noticeable. I just haven’t seen any measurements of that side of things yet. All the coverage is about how cheap the training was on H800s.
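    For a sense of scale on the serving side, here’s a back-of-the-envelope sketch of why the released weights would still be heavy to run. The parameter count and precisions below are illustrative assumptions, not measured figures:

    ```python
    # Rough VRAM estimate for just holding a model's weights.
    # Parameter count and bytes-per-parameter are assumptions,
    # chosen to illustrate the order of magnitude.

    def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
        """Memory needed to hold the weights alone, in gigabytes."""
        return n_params * bytes_per_param / 1e9

    N_PARAMS = 671e9  # assumed total parameter count for the full model

    fp16_gb = weight_memory_gb(N_PARAMS, 2.0)   # 16-bit weights
    int8_gb = weight_memory_gb(N_PARAMS, 1.0)   # 8-bit weights

    print(f"16-bit weights: ~{fp16_gb:.0f} GB")
    print(f"8-bit weights:  ~{int8_gb:.0f} GB")
    # Either way this is far beyond any single consumer GPU, so serving
    # at scale still needs a multi-GPU node regardless of training cost.
    ```

    Activations and KV cache add more on top of this, so the weight figure is a lower bound.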


  • Reposting from a similar post, but… I went over to huggingface and took a look at this.

    DeepSeek is huge. Like Llama 3.3 huge. I haven’t done any benchmarking, which I’m guessing is out there, but surely it would take as much Nvidia muscle to run this at scale as ChatGPT, even if it was much, much cheaper to train, right?

    So is the rout based on the idea that the need for training hardware is much smaller than suspected even if the operation cost is the same… or is the stock market just clueless and dumb and they’re all running on vibes at all times anyway?
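    One way to frame the training-versus-serving split is the common rough heuristic that training costs about 6 × params × tokens FLOPs while generating each token costs about 2 × params FLOPs. The sketch below uses illustrative assumed numbers (active parameters per token for a mixture-of-experts model, and a training-corpus size), not DeepSeek’s actual figures:

    ```python
    # Rough FLOP comparison of one-time training cost vs. per-token
    # serving cost, using the common approximations:
    #   training  ~ 6 * params * training_tokens
    #   inference ~ 2 * params per generated token
    # All concrete numbers below are illustrative assumptions.

    def training_flops(n_params: float, n_tokens: float) -> float:
        return 6.0 * n_params * n_tokens

    def inference_flops_per_token(n_params: float) -> float:
        return 2.0 * n_params

    N_ACTIVE = 37e9       # assumed params active per token (MoE models
                          # only use a fraction of total weights per token)
    TRAIN_TOKENS = 14e12  # assumed training-corpus size in tokens

    train = training_flops(N_ACTIVE, TRAIN_TOKENS)
    serve = inference_flops_per_token(N_ACTIVE)

    print(f"training: ~{train:.2e} FLOPs, paid once")
    print(f"serving:  ~{serve:.2e} FLOPs per token, paid on every request")
    ```

    The asymmetry is the point: cheaper training shrinks a one-time bill, while the serving bill scales with every user forever, which is why a training breakthrough alone doesn’t obviously reduce inference hardware demand.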

  • Well, yeah, but people say they won’t pay 150 bucks for a game, so that stable 60 dollar price had to come from somewhere.

    Honestly, it’s a lot of whiplash to see people paint this as a big corporate conspiracy and then turn around to defend Valve who, let’s not forget, invented the whole idea. It’s not like chain gaming retailers were a particularly strong force for good, either, but they did pay wages to more people than Steam, I guess.

    It’ll be very interesting to see how much of this is people walking away from the Switch, coming back to the Switch 2 or just… you know, only ever playing Fortnite and Minecraft for their entire lives. The issues here are bigger and not a Sony conspiracy to steal trucker wages (although there’s that, too).



  • But I can wrap my head around that 51 is divisible by seventeen because of 21 and seven plus something that deals with the remaining 30 somewhere.

    I know that’s not how it works, but as you say it fixes my vibes when I see the 21 hiding inside the 51.

    I’ll say this: the other thing that makes this one a hard pill to swallow is that 17 looks way too big, and my vibes fix doesn’t address that, but hey.
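    For what it’s worth, the vibes fix above is closer to real arithmetic than it sounds: by the distributive law, 3 × 17 = 3 × (10 + 7) = 30 + 21 = 51, so the 21 and the “remaining 30” genuinely account for the whole thing. A quick check:

    ```python
    # The "21 hiding inside 51" observation is the distributive law:
    # 3 * 17 = 3 * (10 + 7) = 3*10 + 3*7 = 30 + 21 = 51.
    assert 3 * 17 == 3 * 10 + 3 * 7 == 30 + 21 == 51
    print("51 = 30 + 21 = 3*10 + 3*7 = 3*17")
    ```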


  • I think it helps to remember that 3 times 7 is 21. When I think about that it looks less wrong.

    It’s the stupid seven multiplication table. Whatever glitch in human software makes it look so much less intuitive than all the others messes with so many other things that should be easy. I swear I struggle every time I have to look at it. I had to double-check seven times three multiple times just now.