Is AI inference getting cheaper or more expensive over time?

GamingChairModel@lemmy.world · 1 day ago

I went to the gym and used a forklift to lift the weights.

GamingChairModel@lemmy.world · 1 day ago

But you’re seeing a screenshot of an unmatched order that no driver has claimed yet. I’m saying that unless an actual match is accepted, that’s not really evidence that people in a place don’t tip well, just that some people don’t get their orders filled.

If you never give less than $5, then any order you’re involved in will involve at least a $5 tip. That may not be representative of the orders you’re not involved with.

GamingChairModel@lemmy.world · 2 days ago

I think the user decides how much to tip in advance, and the app conveys that information to potential matches. Orders with low tips tend to sit there unclaimed, because no driver wants to bother with that.

I’m not sure if Uber does it that way, but Doordash does.

GamingChairModel@lemmy.world · 2 days ago

I remember reading about a case a few years ago where a warehouse couldn’t figure out which of its workers was just periodically taking shits in random corners of the warehouse. I think I’m starting to understand a different angle to that story, though.

GamingChairModel@lemmy.world · 3 days ago

It’s gonna be so fucking funny when the push to sell silicon that can run local models at 100 watts or less ends up destroying the business models of the companies that built out 100,000,000,000 watts of data centers.

GamingChairModel@lemmy.world · 3 days ago

Yeah, but that’s always been true of paid software licenses for a particular version: it reaches EOL and you have to decide whether to live with the possibility of unpatched known vulnerabilities or pay for an upgrade to a more recent release.

MS Office has been doing this from back in the Windows 3.0 days at least.

GamingChairModel@lemmy.world · 3 days ago

The only solution is to make sure they can’t read data you don’t want shared.

Isn’t that the appropriate guardrail, then? LLM chats and agents and whatever need to be contained with external permissions settings that the LLMs simply do not and can never have the power to override.

In a normal customer service setting with human agents, there are still plenty of examples of what a human agent simply doesn’t have the power to do. Often, they’ll need to escalate to a manager to do things like process refunds not just because they weren’t given social permission to do so, but because they weren’t given technical permissions to do so. LLM agents need to be contained in the same way. Any decent use of agents, human or software, requires carefully designed processes and permissions extrinsic to that agent’s own decisionmaking abilities to make sure that agents don’t do something bad for the company.
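To make the "technical permissions, not social permissions" point concrete, here is a minimal sketch of a permission check that lives outside the agent. The role names, action names, and the `execute` function are all hypothetical, made up for illustration; a real deployment would enforce this at the API or database layer rather than in a Python dict.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission table; every name here is illustrative.
PERMISSIONS = {
    "llm_agent": {"lookup_order", "send_reply"},
    "human_agent": {"lookup_order", "send_reply"},
    "manager": {"lookup_order", "send_reply", "issue_refund"},
}


@dataclass(frozen=True)
class Action:
    name: str          # what is being attempted, e.g. "issue_refund"
    requested_by: str  # role of whoever (or whatever) asked for it


class PermissionDenied(Exception):
    """Raised when a role attempts an action it was never granted."""


def execute(action: Action) -> str:
    """Run an action only if the requester's role is allowed to perform it.

    The check sits outside the agent, so nothing the agent says or
    "decides" can change the outcome.
    """
    allowed = PERMISSIONS.get(action.requested_by, set())
    if action.name not in allowed:
        raise PermissionDenied(
            f"{action.requested_by} may not {action.name}; escalate instead"
        )
    return f"{action.name} executed for {action.requested_by}"
```

In this sketch an `llm_agent` asking for `issue_refund` gets a `PermissionDenied` no matter how it was prompted, the same way a front-line human agent would have to escalate to a manager.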

GamingChairModel@lemmy.world · 4 days ago

Edit 2: Nevermind. 13th October is the day Microslop has chosen to fuck me: https://support.microsoft.com/en-us/office/system-requirements/end-of-support-for-office-2021

You’ll still get a few years before the software becomes remotely disabled, though. This story about Office 2019 losing functionality follows Office 2019 losing support in 2023. If that’s the rate things go, then maybe Office 2021 will lose functionality either 2 years from now (7 years after release) or 3 years from now (3 years after losing support).

GamingChairModel@lemmy.world · 5 days ago

Passports really should be using four digit years.

GamingChairModel@lemmy.world · 6 days ago

The current maintainer, Andrew Tridgell, is one of the two original authors, and was dragged back into maintaining this in 2024 because rsync is so important and nobody else was stepping up to replace Wayne Davison, who was the primary maintainer from 2002 to 2024.

If this project had people eager to fork it, that probably wouldn’t have happened in 2024.

GamingChairModel@lemmy.world · 8 days ago

jizz oil all over it to make it waterproof

GamingChairModel@lemmy.world · 9 days ago

choose between food and RAM

chips are chips

GamingChairModel@lemmy.world · 10 days ago

Those AI servers are probably a discounted cost of around 3-5k per U (I’m probably low in this estimate but I have a really hard time believing they’re actually paying $20k+ per GPU),

They’re arranging 72 Blackwell GPUs into each server rack, at a price of around $3 million, in a cabinet that is 2236 mm x 600 mm x 1068 mm. That’s approximately a 7 square foot footprint, so about $430,000 per square foot of server. There’s obviously a need for spacing between servers, but you’re basically underestimating by an order of magnitude.
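For anyone who wants to check the arithmetic, here is a quick sketch. It assumes the footprint is the 600 mm width times the 1068 mm depth (treating 2236 mm as the height) and uses the rough $3 million rack price from above; the function names are just for illustration.

```python
MM_PER_FOOT = 304.8  # exact: 1 ft = 304.8 mm


def footprint_sq_ft(width_mm: float, depth_mm: float) -> float:
    """Floor area of a cabinet in square feet."""
    return (width_mm / MM_PER_FOOT) * (depth_mm / MM_PER_FOOT)


def price_per_sq_ft(price_usd: float, width_mm: float, depth_mm: float) -> float:
    """Price divided by the floor area the cabinet occupies."""
    return price_usd / footprint_sq_ft(width_mm, depth_mm)


# Assumed inputs: ~$3 million per rack, 600 mm wide, 1068 mm deep.
area = footprint_sq_ft(600, 1068)
cost = price_per_sq_ft(3_000_000, 600, 1068)
print(f"{area:.1f} sq ft, ${cost:,.0f} per sq ft")
```

That works out to a little under 7 square feet and a bit over $430,000 per square foot, which matches the rounded figures above. It ignores aisle spacing, so the cost per square foot of actual data center floor would be lower.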

GamingChairModel@lemmy.world · 14 days ago

In the current environment? Apple shielded itself from price hikes by component suppliers by locking up capacity early. There’s a reason why their CEO came up through the supply chain rather than software or design.

The memory Apple is putting in its devices today is largely priced at rates negotiated years ago. It got deals on CPUs and GPUs of its own design, fabricated by TSMC, packaged with Samsung-fabricated memory in System-in-a-Package form, at volumes that make Apple nearly impossible to say no to, under contracts that are probably bulletproof even as TSMC and Samsung have others clamoring for their capacity at higher prices.

The A18 Pro in the MacBook Neos is made on TSMC’s N3E node, which started production in 2023 and was probably under contract by 2022. The AI boom largely started happening after, and the memory/storage chip crunch didn’t seem like it would be a problem until 2024 or so.

In an environment like this where there are capacity shortages and companies bidding up the price to absurd levels, companies like Apple are exactly who you’d expect not to be thrown around by price hikes.

GamingChairModel@lemmy.world · 14 days ago

Ah, Vista. My “ready for Vista” laptop finally convinced me to try out this Ubuntu Linux people were talking about, and although I dual booted for a few years, by 2009 I never had Windows on one of my own devices again.

GamingChairModel@lemmy.world · 15 days ago

And, as I understand it, Anthropic hasn’t committed as much spending to building out new data centers, and has set up its operations to be GPU agnostic, so it can keep flexibility between NVIDIA GPUs, Google TPUs, and Amazon Trainium, and play the data center pricing game. Anthropic is better positioned to survive an AI winter (and I believe it’s coming soon).

GamingChairModel@lemmy.world · 19 days ago

Not with AI in particular, but yes with subscription based software generally.

GamingChairModel@lemmy.world · 19 days ago

The economics of it don’t add up and the growth rate of the curve of improvement over time has already significantly fallen, which, looking at the historical curves for other technologies, is a very strong indication that it’s approaching the limits of how far it will go even though it’s nowhere close to the hype.

Yeah, I’m convinced that they’ve maintained the illusion of continued exponential improvement from 2024-2026 by sneaking in an exponential increase in resources (hardware complexity, power consumption) to prop things up past what should have been a plateau.

GamingChairModel@lemmy.world · 19 days ago

AI has an interesting economic trait in that it’s very, very expensive to deploy, and made very fast progress from 2022 to 2024. That caused investors with money to believe that:

Pushing the frontier was going to cost a lot of money. More than any other purported revolutionary tech.
Extrapolation of past improvement meant that whoever was on the cutting edge may end up with a product with a huge paying market.
So whoever wins this race would be rich, and the investment would have been worth it for them.

But since 2024, we’ve seen that the cutting edge got even more expensive much faster than expected, and much of the improvement in performance now comes from inference rather than training, which represents a high ongoing cost.

Now, if we extrapolate from that trend line, we’ll see that the market will be much smaller for AI services at the cost it takes to provide that service, and the question then becomes whether the industry can make its operations cheaper, fast enough to profitably provide a service people will pay for.

I have my doubts they’ll succeed, and we might just be looking at the industry like supersonic flight: conceptually interesting, technically feasible, but just a commercial dead end because it’s too expensive.

GamingChairModel@lemmy.world · 19 days ago

ActivityPub is the fediverse protocol; lemmy and piefed are software implementations of that protocol.

In a similar way, email is a protocol, and Gmail and Exchange/Outlook are software implementations of that protocol. You can use Gmail to send email to Outlook users. Different people can administer their own Exchange servers on their own domains.

And some features of the software work best with other people who use the same version of the same software, although most things kinda work between different software. Like how calendar invites sometimes act weird between users of different software, but for the most part the core functions work OK.

So lemmy and piefed (and mbin and sublinks and some others) are different software trying to speak to other fediverse services through the ActivityPub protocol. It mostly works, but some of the details don’t work exactly the same between each type of software.

GamingChairModel@lemmy.world · 1 month ago

Is AI inference getting cheaper or more expensive over time?

GamingChairModel@lemmy.world · 1 month ago

Where the Goblins Came From - OpenAI explains why it has to suppress the tendency of its models to talk about goblins