‘Sputnik moment’: $1tn wiped off US stocks after Chinese firm unveils AI chatbot

Alas Poor Erinaceus@lemmy.ml · 3 months ago


PlantPowerPhysicist@discuss.tchncs.de · 3 months ago

Remember to cancel your Microsoft 365 subscription to kick them while they’re down

Joe Dyrt@lemmy.ml · 3 months ago

Joke’s on them: I never started a subscription!

Zink@programming.dev · 3 months ago

I don’t have one to cancel, but I might celebrate today by formatting the old windows SSD in my system and using it for some fast download cache space or something.

Treczoks@lemmy.world · 3 months ago

Looks like it is not any smarter than the other junk on the market. The confusion that people consider AI as “intelligence” may be rooted in their own deficits in that area.

And now people exchange one American Junk-spitting Spyware for a Chinese junk-spitting spyware. Hurray! Progress!

Naia@lemmy.blahaj.zone · 3 months ago

I’m tired of this uninformed take.

LLMs are not a magical box you can ask anything of and get answers. If you are lucky and blindly asking questions it can give some accurate general data, but just like how human brains work you aren’t going to be able to accurately recreate random trivia verbatim from a neural net.

What LLMs are useful for, and how they should be used, is as a non-deterministic context-parsing tool. When people talk about feeding it more data they think of how these things are trained. But you also need to give it grounding context outside of what the prompt is. Give it a PDF manual, website link, documentation, whatever, and it will use that as context for what you ask it. You can even set it to link to its references.
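
A minimal sketch of what "grounding context" can look like in practice, assuming you already have the reference text as plain strings. The function name, the instruction wording, and the sample manual excerpt are all made up for illustration; real tools wrap this step differently.

```python
def build_grounded_prompt(question: str, documents: dict[str, str]) -> str:
    """Combine numbered reference excerpts and a question into one prompt string."""
    sections = []
    for index, (title, text) in enumerate(documents.items(), start=1):
        # Number each source so the model can point back to it in its answer.
        sections.append(f"[{index}] {title}\n{text.strip()}")
    context = "\n\n".join(sections)
    return (
        "Answer using only the numbered sources below. "
        "Cite the source number for each claim, and say so if the answer is not in them.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical manual excerpt, purely for illustration.
prompt = build_grounded_prompt(
    "How do I reset the device?",
    {"Manual, section 4 (hypothetical)": "Hold the power button for ten seconds to reset."},
)
print(prompt)
```

The resulting string is what gets sent to the model, so the answer can be checked against the numbered sources instead of against whatever the net happens to remember.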

You still have to know enough to be able to validate the information it is giving you, but that’s the case with any tool. You need to know how to use it.

As for the spyware part, that only matters if you are using the hosted instances they provide. Even for OpenAI stuff you can run the models locally with open source software and maintain control over all the data you feed it. As far as I have found, none of the models you run with Ollama or other local AI software have been caught pushing data to a remote server, at least using open source software.

kshade@lemmy.world · edit-2 3 months ago

Looks like it is not any smarter than the other junk on the market. The confusion that people consider AI as “intelligence” may be rooted in their own deficits in that area.

Yep, because they believed that OpenAI’s (two lies in a name) models would magically digivolve into something that goes well beyond what it was designed to be. Trust us, you just have to feed it more data!

And now people exchange one American Junk-spitting Spyware for a Chinese junk-spitting spyware. Hurray! Progress!

That’s the neat bit, really. With that model being free to download and run locally it’s actually potentially disruptive to OpenAI’s business model. They don’t need to do anything malicious to hurt the US’ economy.

gerryflap@feddit.nl · 3 months ago

The difference is that you can actually download this model and run it on your own hardware (if you have sufficient hardware). In that case it won’t be sending any data to China. These models are still useful tools. As long as you’re not interested in particular parts of Chinese history of course ;p

tetris11@lemmy.ml · 3 months ago

It is progress in a sense. The West really put the spotlight on their shiny new expensive toy and banned the export of toy-maker parts to rival countries. One of those countries made a cheap toy out of jank unwanted parts for much less money and it’s on par with or better than the West’s.

As for why we’re having an arms race based on AI, I genuinely don’t know. It feels like a race to the bottom, with the fallout being the death of the internet (for better or worse).

تحريرها كلها ممكن@lemmy.ml · 3 months ago

It is open source, so it should be audited and if there are back doors they can be plugged in a fork

UnderpantsWeevil@lemmy.world · edit-2 3 months ago

And now people exchange one American Junk-spitting Spyware for a Chinese junk-spitting spyware.

LLMs aren’t spyware, they’re graphs that organize large bodies of data for quick and user-friendly retrieval. The Wikipedia schema accomplishes a similar, albeit more primitive, role. There’s nothing wrong with the fundamentals of the technology, just the applications that Westoids doggedly insist it be used for.

If you no longer need to boil down half a Great Lake to create the next iteration of Shrimp Jesus, that’s good whether or not you think Meta should be dedicating millions of hours of compute to this mind-eroding activity.

daltotron@lemmy.ml · 3 months ago

I think maybe it’s naive to think that if the cost goes down, shrimp jesus won’t just be in higher demand. Shrimp jesus has no market cap, bullshit has no market cap. If you make it more efficient to flood cyberspace with bullshit, cyberspace will just be flooded with more bullshit. Those great lakes will still boil, don’t worry.

WoodScientist@sh.itjust.works · 3 months ago

There’s nothing wrong with the fundamentals of the technology, just the applications that Westoids doggedly insist it be used for.

Westoids? Are you the type of guy I feel like I need to take a shower after talking to?

wulrus@programming.dev · 3 months ago

With understanding LLM, I started to understand some people and their “reasoning” better. That’s how they work.

RandomVideos@programming.dev · 3 months ago

artificial intelligence

AI has been used in game development for a while and I haven’t seen anyone complain about the name before it became synonymous with image/text generation.

kshade@lemmy.world · 3 months ago

It was a misnomer there too, but at least people didn’t think a bot playing C&C would be able to save the world by evolving into a real, greater than human intelligence.

Treczoks@lemmy.world · 3 months ago

Well, that is where the problems started.

MetalMachine@feddit.nl · 3 months ago

The best part is that it’s open source and available for download

CeeBee_Eh@lemmy.world · 3 months ago

I asked it about Tiananmen Square, it told me it can’t answer that because it can only respond with “harmless” responses.

Valmond@lemmy.world · 3 months ago

That’s kind of normal, it was made in China after all and the developers didn’t want to end up in jail I bet.

That said, China is of course a crappy dictatorship.

MetalMachine@feddit.nl · 3 months ago

Yes, the online model has those filters. Someone tried it with one of the downloaded models and it answers just fine.

Ascend910@lemmy.ml · 3 months ago

When running locally, it works just fine without filters

CeeBee_Eh@lemmy.world · 3 months ago

This was a local instance.

apprehensively_human@lemmy.ca · 3 months ago

Does the same thing on my local instance.

Phoenicianpirate@lemm.ee · 3 months ago

So can I have a private version of it that doesn’t tell everyone about me and my questions?

MetalMachine@feddit.nl · 3 months ago

SpaceRanger@lemmy.world · 3 months ago

Check out Ollama. https://ollama.com/library/deepseek-r1

Phoenicianpirate@lemm.ee · 3 months ago

Thank you very much. I did ask chatGPT technical questions about some… subjects… but having something that is private AND can give me all the information I want/need is a godsend.

Goodbye, chatGPT! I barely used you, but that is a good thing.

λλλ@programming.dev · 3 months ago

Yep, look up Ollama

tooclose104@lemmy.ca · 3 months ago

Can someone with the knowledge please answer this question?

TonyTonyChopper@mander.xyz · 3 months ago

Yes, you can run a downgraded version of it on your own pc.

tooclose104@lemmy.ca · 3 months ago

Apparently phone too! Like 3 cards down was another post linking to instructions on how to run it locally on a phone in a container app or termux. Really interesting. I may try it out in a vm on my server.

boomzilla@programming.dev · edit-2 3 months ago

I watched one video and read 2 pages of text, so take this with a mountain of salt. From that I gathered that DeepSeek R1 is the model you interact with when you use the app. The complexity of a model is expressed as the number of parameters (though I don’t know yet what those are), which dictates its hardware requirements. R1 contains 670 bn parameters and requires very, very beefy server hardware. A video said it would take tens of GPUs. And it seems you want lots of VRAM on your GPU(s) because that’s what AI crave. I’ve also read that 1 bn parameters require about 2 GB of VRAM.
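
The "2 GB per billion parameters" rule of thumb falls out of simple arithmetic if you assume each parameter is stored as a 16-bit number. A rough sketch of that calculation follows; it counts the weights only, so the working memory a model needs while running is not included, and the 4-bit column is an assumption about quantized downloads rather than anything stated in the thread.

```python
def estimate_weight_memory_gb(parameters: float, bits_per_parameter: int = 16) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_per_parameter = bits_per_parameter / 8
    return parameters * bytes_per_parameter / 1e9


# Parameter counts as quoted in the thread; 16-bit and 4-bit storage are assumptions.
for name, params in [("1 bn", 1e9), ("3 bn", 3e9), ("670 bn", 670e9)]:
    full = estimate_weight_memory_gb(params, 16)
    quantized = estimate_weight_memory_gb(params, 4)
    print(f"{name}: ~{full:.1f} GB at 16-bit, ~{quantized:.1f} GB at 4-bit")
```

By this estimate a 3 bn model fits on a 6 GB card, while 670 bn parameters at 16-bit come to well over a terabyte of weights, which is why the full R1 needs a rack of GPUs.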

Got a 6-core Intel, a 1060 with 6 GB VRAM, 16 GB RAM and EndeavourOS as a home server.

I just installed Ollama in about half an hour, using Docker on the above machine, with no previous experience with neural nets or LLMs apart from chatting with ChatGPT. The installation includes Open WebUI, which seems better than the default you get at ChatGPT. I downloaded the qwen2.5:3b model (see https://ollama.com/search), which contains 3 bn parameters. I was blown away by the result. It speaks multiple languages (including displaying e.g. hiragana), knows how many fingers a human has, can calculate, can write valid Rust code and explain it, and it is much faster than what I get from free ChatGPT.

The WebUI offers a nice feedback form for every answer where you can give hints to the AI via text, a 10-point score rating, or thumbs up/down. I don’t know how it incorporates that feedback, though. The WebUI seems to support speech-to-text and vice versa. I’m eager to see if this Docker setup even offers programming APIs.

I probably won’t use the proprietary stuff anytime soon.
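
On the programming API question: Ollama itself serves an HTTP API, by default on port 11434, with a `/api/generate` endpoint. Below is a minimal sketch using only the Python standard library. It assumes the Docker setup actually publishes that port to the host and that the `qwen2.5:3b` model has been pulled; both depend on how the container was started, so treat the URL as a placeholder.

```python
import json
import urllib.request

# Ollama's default address; an assumption here, adjust if your container maps the port elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_generate_request(model, prompt), timeout=timeout) as reply:
        return json.loads(reply.read())["response"]
```

With the server running, something like `print(ask("qwen2.5:3b", "Write a haiku about home servers."))` should return the model's answer as a string. Nothing leaves the machine, since the request only goes to localhost.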

Mongostein@lemmy.ca · 3 months ago

Yeah, but you have to run a different model if you want accurate info about China.

Alsephina@lemmy.ml · 3 months ago

Unfortunately it’s trained on the same US propaganda filled english data as any other LLM and spits those same talking points. The censors are easy to bypass too.

Phoenicianpirate@lemm.ee · 3 months ago

Yeah but China isn’t my main concern right now. I got plenty of questions to ask and knowledge to seek and I would rather not be broadcasting that stuff to a bunch of busybody jackasses.

Mongostein@lemmy.ca · 3 months ago

I agree. I don’t know enough about all the different models, but surely there’s a model that’s not going to tell you “<whoever’s> government is so awesome” when asking about rainfall or some shit.

jaschen@lemm.ee · 3 months ago

Yes but your server can’t handle the biggest LLM.

scratsearcher 🔍🔮📊🎲@sopuli.xyz · 3 months ago

All of this deepseek hype is overblown. Deepseek model was still trained on older american Nvidia GPUs.

ocassionallyaduck@lemmy.world · 3 months ago

Your confidence in this statement is hilarious, given the fact that it doesn’t help your argument at all. If anything, the fact they refined their model so well on older hardware is even more remarkable, and quite damning when OpenAI claims it needs literally cities worth of power and resources to train their models.

b161@lemmy.blahaj.zone · 3 months ago

AI is overblown, tech is overblown. Capitalism itself is a senseless death cult based on the non-sensical idea that infinite growth is possible with a fragile, finite system.

Phen@lemmy.eco.br · 3 months ago

“wiped”? There was money and it ceased to exist?

mosscap@slrpnk.net · 3 months ago

It’s pixie dust

breadguyyyyyyyyy@sh.itjust.works · 3 months ago

“off US stocks”

protist@mander.xyz · edit-2 3 months ago

The money went back into the hands of all the people and money managers who sold their stocks today.

Edit: I expected a bloodbath in the markets with the rhetoric in this article, but the NASDAQ only lost 3% and the DJIA was positive today…

Nvidia was significantly over-valued and was due for this. I think most people who are paying attention knew that

someacnt@sh.itjust.works · 3 months ago

To be fair, NQ futures momentarily dropped 5% before recovering some. The next few days will be interesting.

Hexadecimalkink@lemmy.ml · 3 months ago

Trump counterbalance keeping it in check but my gut is saying once tariffs come in February there’s going to be a market correction. Pure speculation on my part.

Jimmycakes@lemmy.world · 3 months ago

You don’t have to say speculation when talking about the future of stocks. It’s implied unless you are a time traveler in which case you should lead with that.

Hexadecimalkink@lemmy.ml · 3 months ago

I am a time traveller and I was trying to throw you off my trail but I seem to have failed.

shootwhatsmyname@lemm.ee · 3 months ago

There’s been a lot of disproportionate hype around deepseek lately

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 3 months ago

I’d argue this is even worse than Sputnik for the US because Sputnik spurred technological development that boosted the economy. Meanwhile, this is popping the economic bubble in the US built around the AI subscription model.

toothbrush@lemmy.blahaj.zone · edit-2 3 months ago

One of those rare lucid moments by the stock market? Is this the market correction that everyone knew was coming, or is some famous techbro going to technobabble some more about AI overlords and they return to their fantasy values?

scratsearcher 🔍🔮📊🎲@sopuli.xyz · 3 months ago

Most rational market: Sell off NVIDIA stock after Chinese company trains a model on NVIDIA cards.

Anyways NVIDIA still up 1900% since 2020 …

how fragile is this tower?

themoonisacheese@sh.itjust.works · 3 months ago

It’s quite lucid. The new thing uses a fraction of compute compared to the old thing for the same results, so Nvidia cards for example are going to be in way less demand. That being said Nvidia stock was way too high surfing on the AI hype for the last like 2 years, and despite it plunging it’s not even back to normal.

davel@lemmy.ml · 3 months ago

If AI is cheaper, then we may use even more of it, and that would soak up at least some of the slack, though I have no idea how much.

CameronDev@programming.dev · 3 months ago

How is the “fraction of compute” being verified? Is the model available for independent analysis?

toothbrush@lemmy.blahaj.zone · 3 months ago

It’s freely available with a permissive license, but I don’t think that claim has been verified yet.

Zaktor@sopuli.xyz · 3 months ago

And the data is not available. Knowing the weights of a model doesn’t really tell us much about its training costs.

jacksilver@lemmy.world · 3 months ago

My understanding is it’s just an LLM (not multimodal) and the train time/cost looks the same for most of these.

DeepSeek ~$6million https://www.theregister.com/2025/01/26/deepseek_r1_ai_cot/?td=rt-3a
Llama 2 estimated ~$4-5 million https://www.visualcapitalist.com/training-costs-of-ai-models-over-time/

I feel like the world’s gone crazy, but OpenAI (and others) are pursuing more complex model designs with multimodal. Those are going to be more expensive due to image/video/audio processing. Unless I’m missing something, that would probably account for the cost difference in current vs previous iterations.

will_a113@lemmy.ml · 3 months ago

The thing is that R1 is being compared to gpt4 or in some cases gpt4o. That model cost OpenAI something like $80M to train, so saying it has roughly equivalent performance for an order of magnitude less cost is not for nothing. DeepSeek also says the model is much cheaper to run for inferencing as well, though I can’t find any figures on that.
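
For what the "order of magnitude" claim works out to, here is the arithmetic using the two figures quoted in this thread (roughly $80M for GPT-4 and roughly $6M for DeepSeek). Both numbers are estimates repeated from the comments above, not verified costs.

```python
import math


def cost_ratio(reference_cost_usd: float, new_cost_usd: float) -> tuple[float, float]:
    """Return how many times cheaper the new model is, and that ratio in orders of magnitude."""
    ratio = reference_cost_usd / new_cost_usd
    return ratio, math.log10(ratio)


# Thread figures, unverified: ~$80M (GPT-4) vs ~$6M (DeepSeek).
ratio, orders = cost_ratio(80e6, 6e6)
print(f"{ratio:.1f}x cheaper, about {orders:.2f} orders of magnitude")
```

So if both estimates hold, the gap is a bit more than one order of magnitude.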

jacksilver@lemmy.world · 3 months ago

My main point is that gpt4o and the other models it’s being compared to are multimodal, while R1 is only an LLM from what I can find.

Something trained on audio/pictures/videos/text is probably going to cost more than just text.

But maybe I’m missing something.

will_a113@lemmy.ml · 3 months ago

The original gpt4 is just an LLM though, not multimodal, and the training cost for that is still estimated to be over 10x R1’s if you believe the numbers. I think where R1 is compared to 4o is in so-called reasoning, where you can see the chain of thought or internal prompt paths that the model uses to (expensively) produce an output.

jacksilver@lemmy.world · edit-2 3 months ago

I’m not sure how good a source it is, but Wikipedia says it was multimodal and came out about two years ago - https://en.m.wikipedia.org/wiki/GPT-4.

That being said, the comparisons are of LLM benchmarks against gpt4o, so maybe that’s a valid argument for the LLM capabilities.

However, I think a lot of the more recent models are pursuing architectures with the ability to act on their own, like Claude’s computer use - https://docs.anthropic.com/en/docs/build-with-claude/computer-use, which DeepSeek R1 is not attempting.

Edit: and I think the real money will be in the more complex models focused on workflow automation.

WalnutLum@lemmy.ml · 3 months ago

Yea except DeepSeek released a combined Multimodal/generation model that has similar performance to contemporaries and a similar level of reduced training cost ~20 hours ago:

https://huggingface.co/deepseek-ai/Janus-Pro-7B

veroxii@aussie.zone · 3 months ago

Holy smoke balls. I wonder what else they have ready to release over the next few weeks. They might have a whole suite of things just waiting to strategically deploy

modulus@lemmy.ml · 3 months ago

One of the things you’re missing is the same techniques are applicable to multimodality. They’ve already released a multimodal model: https://seekingalpha.com/news/4398945-deepseek-releases-open-source-ai-multimodal-model-janus-pro-7b

SplashJackson@lemmy.ca · 3 months ago

Lol serves you right for pushing AI onto us without our consent

SocialMediaRefugee@lemmy.ml · 3 months ago

The determination to make us use it whether we want to or not really makes me resent it.

Ech@lemm.ee · 3 months ago

Hilarious that this happens the week of the 5090 release, too. Wonder if it’ll affect things there.

drspod@lemmy.ml · 3 months ago

Apparently they have barely produced any so they will all be sold out anyway.

chiliedogg@lemmy.world · 3 months ago

And without the fake frame bullshit they’re using to pad their numbers, its capabilities scale linearly with the 4090. The 6090 just has more cores, Ram, and power.

If the 4000-series had had cards with the memory and core count of the 5090, they’d be just as good as the 50-series.

lordnikon@lemmy.world · edit-2 3 months ago

By that point you will have to buy the micro fission reactor add-on to power the 6090. It’s like Nvidia looked at the triangle of power / price / performance and instead of picking two they just picked one and to hell with the rest.

Lumiluz@slrpnk.net · 3 months ago

Nah, they just made the triangle bigger with AI (/s)

protist@mander.xyz · 3 months ago

Emergence of DeepSeek raises doubts about sustainability of western artificial intelligence boom

Is the “emergence of DeepSeek” really what raised doubts? Are we really sure there haven’t been lots of doubts raised previous to this? Doubts raised by intelligent people who know what they’re talking about?

floofloof@lemmy.ca · edit-2 3 months ago

Ah, but those “intelligent” people cannot be very intelligent if they are not billionaires. After all, the AI companies know exactly how to assess intelligence:

Microsoft and OpenAI have a very specific, internal definition of artificial general intelligence (AGI) based on the startup’s profits, according to a new report from The Information. … The two companies reportedly signed an agreement last year stating OpenAI has only achieved AGI when it develops AI systems that can generate at least $100 billion in profits. That’s far from the rigorous technical and philosophical definition of AGI many expect. (Source)

skuzz@discuss.tchncs.de · 3 months ago

Almost like yet again the tech industry is run by lemming CEOs chasing the latest moss to eat.

Dupree878@lemmy.world · 3 months ago

Interesting it won’t let you login or signup using a VPN, even set to the correct country

NιƙƙιDιɱҽʂ@lemmy.world · edit-2 3 months ago

Aren’t VPNs illegal in China?

Black History Month@lemmy.world · 3 months ago

Let’s tariff taiwan!

UnderpantsWeevil@lemmy.world · 3 months ago

TSMC just finished building out a foundry in Arizona, so there’s a nativist argument that we don’t need the island’s original facilities anymore.

Horsey@lemmy.world · 3 months ago

Only building outdated chips on an old fab process. And they’re having a hard time hiring Americans to work there.

https://www.glassdoor.com/Reviews/TSMC-Reviews-E4130.htm

ziproot@lemmy.ml · 3 months ago

Tech bros learn about diminishing returns challenge (impossible)

Clent@lemmy.dbzer0.com · 3 months ago

No surprise. American companies are chasing fantasies of general intelligence rather than optimizing for today’s reality.

Naia@lemmy.blahaj.zone · 3 months ago

That, and they are just brute forcing the problem. Neural nets have been around forever but it’s only been the last 5 or so years they could do anything. There’s been little to no real breakthrough innovation as they just keep throwing more processing power at it with more inputs, more layers, more nodes, more links, more CUDA.

And their chasing a general AI is just the short-sighted nature of them wanting to replace workers with something they don’t have to pay and that won’t argue about its rights.

supersquirrel@sopuli.xyz · edit-2 3 months ago

Also all of these technologies forever and inescapably must rely on a foundation of trust with users and people who are sources of quality training data, “trust” being something US tech companies seem hell bent on lighting on fire and pissing off the yachts of their CEOs.