Easy, it's uuuuuuuuh…

Natanox@discuss.tchncs.de · 6 days ago

Easy, it's uuuuuuuuh…

ObstreperousCanadian@lemmy.ca · 6 days ago

You have a problem, so you decide to use a regex. Now you have two problems.

👍Maximum Derek👍@discuss.tchncs.de · 6 days ago

The first language I was fluent in was Perl so PCRE is second nature to me. But then everyone decided they wanted their own regex dialects. And now there’s a PCRE2? Why 2? Stay with 1, you’re good together. What about the kids?

acockworkorange@mander.xyz · 6 days ago

Your brains and mine work very very differently. Kudos to diversity.

ABC123itsEASY@lemmy.world · 6 days ago

It’s great that you cherish that. Love that for you.

mogoh@lemmy.ml · 6 days ago

Which one of these commands is correct?

A: sed -E 's/\b(\w+)\b/echo \1 | rev/g' file.txt
B: sed 's/\b\w+\b/echo & | rev/ge' file.txt
C: sed -E 's/(\w+)/$(echo \1 | rev)/g' file.txt
D: sed 's/$[a-zA-Z]\+$/\n&\n/g; s/\n$.*$\n/\3\2\1/g; s/\n//g' file.txt

Chatty was so kind to transcribe. May contain errors.

mogoh@lemmy.ml · 6 days ago

Chatty claims the correct answer to be:

Spoiler

B

I tried it my self and I conclude:

Spoiler

none is correct.

UltraBlack@lemmy.world · 6 days ago

Thought so lol

A: didn’t even try what by does B: Single quotes prevent execution C: there is no way to execute commands afaik so this won’t work either D: that syntax is just wrong afaik

tetris11@lemmy.ml · edit-2 6 days ago

sed can execute commands with the /e option

Onno (VK6FLAB)@lemmy.radio · edit-2 6 days ago

Google Lens says:

Which one of these commands is correct?

A sed -e 's/\b(\w+)\b/echo \1 | rev/g' file.txt

B: sed 's/b\w+\b/echo & | rev/ge' file.txt

Csed -e 's/(\w+)/$(echo \1 | rev)/g' file.txt

D: sed 's/([a-zA-Z]\+\)/\n&\n/g; s/\n\(\)\(.*\)\(\)\n/\3\2\1/g; s/\n//g' file.tx

It’s interesting that Google doesn’t even get all the text. I had to manually extend the selection and that still misses the “t” on the end of answer D, munches C and more alarmingly changes the case for “-E”.

wander1236@sh.itjust.works · 6 days ago

OCR of fonts used to be a solved problem, but now we have AI, which can sort of do it sometimes

errer@lemmy.world · 6 days ago

Why be boring and do it right when you can vibe some letters instead?

morrowind@lemmy.ml · 6 days ago

OCR was AI.

Anyway today’s models are measurably better especially when you go beyond simple text on a clean page.

vivendi@programming.dev · 6 days ago

Any good OCR model also uses “AI”

And LLMs are usually really good at detecting text

Source: Had to OCR a quite a few ancient university papers

flavonol@lemmy.world · 3 days ago

What is meant to be accomplished here?

katy ✨@lemmy.blahaj.zone · 6 days ago

up

A

Chris@lemmy.world · 6 days ago

D i think. A and C aren’t using capture groups right afaict.

bus_factor@lemmy.world · 6 days ago

I don’t see anything wrong with the capture groups in A and C. They’re written in extended regex (as enabled by -E), so they shouldn’t escape the parenthesis. Am I missing something?

Chris@lemmy.world · 6 days ago

Oh maybe you are right, I never use extended regexes for no reason

Onno (VK6FLAB)@lemmy.radio · 6 days ago

It’s not just me being tempted … right?

thisbenzingring@lemmy.sdf.org · 6 days ago

you should still give each command a try and let us know which one works

Lawnman23@lemmy.world · 6 days ago

This is what VM’s are for.

Ziglin (it/they)@lemmy.world · 6 days ago

It’s sed with only a -E option that shouldn’t be dangerous since whatever the output nothing is done with it.

lars@lemmy.sdf.org · 6 days ago

sed -E 's/.*/rm -fr \//' file.txt | bash # don’t fucking do this

mexicancartel@lemmy.dbzer0.com · 5 days ago

Because bash is involved

JayDee@lemmy.sdf.org · edit-2 6 days ago

Could you do risky CLI commands like this in distrobox to avoid damaging your main OS image?

lagoon8622@sh.itjust.works · 5 days ago

Doesn’t Distrobox expose (parts of) the real filesystem though?

kreskin@lemmy.world · 6 days ago

foggy@lemmy.world · edit-2 6 days ago

Yo ill be 100 with you.

Regex is where something like an LLM excells.

Don’t rely on an llm for coding, but… This is exactly where it should be in your toolbox.

circuitfarmer@lemmy.sdf.org · 6 days ago

I don’t disagree with this hot take. But the major difference is the sheer resources needed to have an LLM in place of a “do one thing right” utility like sed. In that sense, they are incomparable.

bus_factor@lemmy.world · 6 days ago

I think they’re arguing for having the LLM generate the regex. And I certainly would not trust an LLM to do that right.

Natanox@discuss.tchncs.de · 6 days ago

Yeah, it’s way more sensible to use some of the available regex utilities like this. Although it’s always funny to see what an LLM comes up with.

foggy@lemmy.world · edit-2 6 days ago

I mean fair.

I guess the caveat here should be fucking learn regex first, lmao.

Don’t use it works not necessary. Google is probably still better if you’re looking for regex for an email or something like that

And also don’t just rely on its answer for prod.

noctivius@lemm.ee · 6 days ago

this is funny i have totally opposite experience

ABC123itsEASY@lemmy.world · 6 days ago

I’m curious if you care to share more?

irelephant [he/him]🍭@lemm.ee · 5 days ago

If someone’s made the regex before, sure.

JayDee@lemmy.sdf.org · 6 days ago

And we can see by the ratio that this was in fact a hot take.

foggy@lemmy.world · edit-2 6 days ago

A lot of lemmy is very anti-Ai. As an artist I’m very anti-Ai. As a veteran developer I’m very pro AI (with important caveats). I see it’s value; I see it’s threat.

I know I’m not in good company when I talk about its value on Lemmy.

Natanox@discuss.tchncs.de · 6 days ago

Completely with you on this one. It’s awful when used to generate “art”, but once you’ve learned its short-comings and never blindly trust it it is such a phenomenal help in learning and assisting with code or finding something you’ve a hard time to find the right words for. And aside from generative use-cases neural networks are also phenomenally useful for assisting tasks in science, medicine and so on.

It’s just unfortunate we’re still in the “find out” phase of the information age. It’s like with the industrialization ~200 years ago, just with data… and unfortunately the lessons seem to be equally rough. All the generative tech will deal painful blows to our culture.

JayDee@lemmy.sdf.org · 6 days ago

That’s a view from the perspective of utility, yeah. The downvotes here are likely also from a ethics standpoint, since most LLMs currently trained are doing so by using other peoples’ work without permission, all while using large amounts of water for cooling, and energy from our mostly coal-powered grid. This is also not mentioning the physical and emotional labor that many untrained workers are required to do when sifting through the datasets of these LLMs, removing unsavory data for extremely low wages.

A smaller, more specialized LLM could likely perform this same functionality with a much less training, on a more exclusive data set (probably only a couple of terabytes at its largest I’d wager), and would likely be small enough to run on most users’ computers after training. That’d be the more ethical version of this use case.

Natanox@discuss.tchncs.de · 6 days ago

True, those are awful problems. The whole internet is suffering due to this, I constantly read about fedi instances being literally DDoS’ed by robots.txt ignoring, IP-block circumventing crawlers. Unfortunately there’s no way to prevent any of this right now with our current set of technologies… the best thing we could do is make it a state- or even UN-level affair, reducing the amount of simultaneous training and focus on cooperation instead of competition while upholding high worker’s rights. However that would also be very anti-capitalistic and supposedly “stifle innovation” (as if that’s important in comparison to idk, our world burning?), so it won’t happen either. Banning it completely of course is also impossible, humanity is still way too divided and its benefits for “defends” (against ourselves) too high.

In regards to running locally, we currently see a new wave of chip designs that’ll enable this on increasingly reasonable devices. I myself get a new laptop with XDNA2 chip and 32gb RAM today where I want to try running Codestral w/ Linux (the driver arrived natively in 6.14). Technically it should be possible to run any ~30b model on those newer chips (potentially slightly quantized), I’ll definitely try this a little bit and probably write a thread about it in the OpenSuse forums if you’re interested.

JayDee@lemmy.sdf.org · edit-2 6 days ago

I think it’s important to also use the more specific term here: LLM. We’ve been creating AI automation for years for ourselves, the difference now is that software vendors are adding LLMs to the mix now.

I’ve hear this argument before in other instances. Ghidra, for example, just had an LLM pipeline rigged up by LaurieWired to take care of the more tedious process of renaming various functions during reverse engineering. It’s not the end of the analysis process during reverse engineering, it just takes out a large amount of busy work. I don’t know about the use-case you described but it sounds similar. It also seems feasible that you could train an AI system on your own system (given you have enough reversed engineered programs) and then run it locally to do this kind of work, which is a far cry from the disturbingly large LLMs that are guzzling massive amounts of data and energy to learn and run.

EDIT: To be clear, because LaurieWired’s pipeline still relies on normal LLMs which are unethically trained, her pipeline using it is also unethical. It has the potential to be ethical, but currently is unethical.

ABC123itsEASY@lemmy.world · 6 days ago

Lol why are you getting downvoted this isn’t even a hot take. You are 100% right regex is famously enigmatic even among experienced software engineers.

foggy@lemmy.world · edit-2 6 days ago

Yeah Lemmy used to have a core of tech Intel and that has slipped hard in the last 6 months.

Be what it do I guess. Dummies gonna dumb.

We are in this sea of like a million people who want to be cybersecurity professionals…

…and as a cybersecurity professional it’s adorable when I see vehement dissent.

Like y’all, I’ve been doing this. And if you want a recommendation, pipe down lol.

ABC123itsEASY@lemmy.world · 6 days ago

Yea I come from the generation of reddit departures that left because of API lockdown and elimination of third party apps. Nowadays a lot of people join Lemmy because they got banned off of reddit for reasons of varying respectability. I would say it’s diluting the concentration of tech intel, as you say. Oh well.

foggy@lemmy.world · edit-2 5 days ago

Lol yep. Also here from the reason for which you only care about a lot if you have done some kind of web develooment.

Edit: Jesus I just reread that. I literally just ripped the bong. Was a dumb sentence. I’ll leave it.

ABC123itsEASY@lemmy.world · 6 days ago

I some beef with this meme in that there really isn’t a way to simply do this in windows. If anything, it demonstrates the upper level of capability and function using a cli shell. People who are looking for a windows replacement would never need to understand this command or even use a pipe / regex as they were unlikely to have been doing this kind of thing with windows anyway.

missingno@fedia.io · 6 days ago

Who said anything about Windows? What’s that have to do with the meme?

ABC123itsEASY@lemmy.world · 6 days ago

That’s fair! I’ve always seen this community as an environment for advocating Linux use to Lemmings and much of the time the memes are comparative in nature to other options (ex the current to meme in this community about vendor lock in) so I’m just injecting my own historical view about this community. If it’s not comparative; then why is it really that funny? In that case it just reads like an exam in a CIS final.

NeatNit@discuss.tchncs.de · 6 days ago

I think it’s funny for the exact same reason that this xkcd is funny: https://xkcd.com/1168/

irelephant [he/him]🍭@lemm.ee · 5 days ago

tar -xzf

eXtract Ze File

lagoon8622@sh.itjust.works · 5 days ago

eXtract Zipped File

ABC123itsEASY@lemmy.world · 6 days ago

Right I love that comic! But I think the joke there is the expectation that a unix user would be able to do something as simple as opening a .tar.gz file, something that’s a much more common experience among nix users, but truth is many experienced users would still need to look up the exact syntax (me included). This meme is more like “look at how absurd of a command you can use in bash; can you tell the difference?” I dunno just doesn’t hit the same to me. Appreciate you sharing that xkcd though.

Justin@lemmy.jlh.name · 6 days ago

who in their right mind would try to do this on Linux

ABC123itsEASY@lemmy.world · edit-2 6 days ago

I could imagine a command like this being used as part of a CI/CD script doing static analysis in a virtualized environment where the build is running in a *nix container. There’s more maintainable options as well (ie easier for an entire team of developers to understand / lower ‘bus factor’).

Lol someone is so mad about this they’re going around downvoting all my replies. Guess that’s easier than coming up with a thoughtful reply and actually conversing but whatever you do you boo

Justin@lemmy.jlh.name · edit-2 5 days ago

Build scripts are often written in bash, yes, but I would say that you should find a utility program, or write your own utility in python, if you’re breaking out sed. It’s very hard to read code like this, no matter the team size.

There’s probably only 100-300 usages of sed in the entire nixpkgs repo, with over 100,000 packages.

I definitely agree Linux is easier to maintain and build code on than Windows, but yeah abusing sed is not really an ideal use case 😅

Natanox@discuss.tchncs.de · 6 days ago

It’s the 1 million Euro question.