• 0 Posts
  • 46 Comments
Joined 11 months ago
Cake day: August 4th, 2023

  • They don’t, but with quantization and distillation, as well as clever use of fast SSD storage (they published a paper on this exact topic last year), you can get a really decent model running on device. People are already doing this with models like OpenHermes and Mistral (granted, those are 7B models, but I could easily see Apple doubling the RAM, optimizing models per the research paper I mentioned above, and getting 40B models running entirely locally). If the on-device end of the network is good, a 40B model could take care of the vast majority of Siri queries without ever reaching out to a server.

    For what it’s worth, according to their WWDC announcements, that’s basically what they’re trying to do.
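The SSD idea from that paper can be sanity-checked with back-of-envelope math. This sketch uses my own assumed numbers (DRAM budget, 4-bit quantization, and the fraction of weights touched per token); it only illustrates why flash offloading makes larger-than-RAM models plausible, not how Apple actually implements it.

```python
def streamed_bytes_per_token(total_params, bits_per_param, dram_gb,
                             active_fraction=0.1):
    """Rough estimate of bytes read off flash per generated token when
    only a sparse 'active' slice of the weights is touched (the core
    idea of flash offloading; active_fraction is an assumed figure)."""
    weight_bytes = total_params * bits_per_param / 8
    dram_bytes = dram_gb * 1024**3
    resident = min(weight_bytes, dram_bytes)      # weights kept hot in RAM
    on_flash = max(0.0, weight_bytes - resident)  # overflow lives on flash
    return on_flash * active_fraction             # only the active slice moves

# 40B params at 4-bit is ~20 GB of weights; with an assumed 12 GB DRAM
# budget, several GB spill to flash, but only a fraction moves per token.
per_token = streamed_bytes_per_token(40e9, 4, 12)
print(f"{per_token / 1024**2:.0f} MiB per token off flash")
```

A 7B model quantized the same way fits in that DRAM budget outright, which is why people already run those fully in RAM today.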




  • Almost. If you own a share of a company, you own a share of something tangible: literal company property or IP. Even if the company went bankrupt, you would own a sliver of its real assets (real estate, computers, patented processes). So while you may be speculating on the wealth associated with the company, it isn’t a scam in the sense that what you own isn’t purely notional. The sole value of a cryptocurrency is its speculative value; it is not tied, in theory or in practice, to anything of perceptibly equal realized value. A dividend is just a return on profit made from those realized assets (the aforementioned real estate or other company property or processes), but the stock itself is intrinsically tied to literal ownership of those profit-generating assets.



  • Reading the article, I’m not sure why I shouldn’t use ZFS on a boot drive. The author does, and was able to set up a nice incremental (encrypted) backup solution that got them back up and running relatively quickly.

    The only thing I can think of is the manual nature of it, maybe? I don’t see how Btrfs would be better here based on the article, unless I missed something.









  • It may be no different than using Google as the search engine in Safari, assuming I get an opt-out. If it’s used for Siri interactions, then it gets extremely tricky to verify that your interactions aren’t being used to inform ads and/or train an LLM. Much harder to opt out of than a default search engine, perhaps.

    LLMs do not need terabytes of RAM. Heck, you can run quantized 7-billion-parameter models in 16GB or less (Bloom, Falcon 7B; Falcon outperforms models with larger memory footprints, by the way, so there’s room for optimization here). While not quite as good as OpenAI’s offerings, they’re still quite good. There are Android phones with 24GB of RAM, so it’s quite possible for Apple to release an iPhone Pro with that much and run it much like you’d run any large language model on an M1 or M2 Mac. Hell, you could probably fit an inference-only model in less. Performance wouldn’t be blazing, but depending on the task it could absolutely be sufficient. With Apple MLX and Ferret coming online, it’s entirely possible that you could, basically today, have a reasonable LLM running on an iPhone 15 Pro. People run OpenHermes 7B, for example, which takes ~4.4GB to run, without those frameworks. Battery life does take a major hit, but to be honest I’m at a loss for what I’d need an LLM on my phone for anyway.
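For what it’s worth, that ~4.4GB figure is roughly what the arithmetic predicts. A rough sketch (the 20% runtime overhead for KV cache, activations, and buffers is my own guess, not a measured number):

```python
def model_footprint_gb(n_params, bits_per_weight, overhead=1.2):
    """Approximate RAM (in GiB) needed for inference: quantized
    weights plus an assumed ~20% overhead for the KV cache,
    activations, and runtime buffers."""
    weight_gb = n_params * bits_per_weight / 8 / 1024**3
    return weight_gb * overhead

# A 7B model at 4-bit: ~3.3 GiB of weights, ~3.9 GiB with overhead,
# in the same ballpark as the ~4.4GB OpenHermes figure above.
print(round(model_footprint_gb(7e9, 4), 1))  # → 3.9
```

The same formula says a 24GB phone has headroom well beyond 7B at 4-bit, which is the basis for the claim above.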

    Regardless, I want a local LLM or none at all.


  • This is a really bad look. It will probably be an opt-in feature, and maybe Apple negotiates for a model from Google that they host on-premises and send no data back from, but it’s getting very hard for Apple to claim privacy and protection here (and not that they do a particularly good job of that unless you block all their telemetry).

    If an LLM is gonna be on a phone, it needs to be local. Local is really hard because the models are huge (even with quantization and other tricks), so this seems incredibly unlikely. Then it’s just “who do you trust more to sell your data for ads, Apple or Google?” To which I say neither, and pray Linux phones take off (yes, yes, I know you can root an Android phone and de-Google it, but still).



  • I suppose that really depends. Are you making a reproduction of Citizen Kane, including its cinematographic techniques? Then that’s probably a hard “gotta get a license if it’s under copyright.” Where it gets trickier is something like reproducing media in a particular artistic style (say, a very distinctive drawn-animation style). Realistically, you shouldn’t reproduce the marquee style of a currently producing artist just because you trained a model on it (most likely from YouTube clips, without paying the original creator or even the reuploader, who hopefully was operating under fair use). But in any case, all of the above, and the questions of closeness and fair use, are already part of the existing copyright legal landscape. That very question of how close a work has to be is at the core of all the major song-infringement court battles, and those are between two humans. Call me a Luddite, but I think a generative model’s output should be offered far less legal protection than a human’s, and absolutely not more.


  • I never equated LLMs with intelligence. And indexing the data is not the same as reproducing the webpage or its content. To get more than a small snippet matching your query when you search, you have to follow a link to the source material. Now of course Google doesn’t like this, so they did that stupid AMP thing, which has its own issues; I disagree with AMP as a general rule as well. So, LLMs can look at the data; I just don’t think they should reproduce it without attribution (or payment to the original creator). Perplexity.ai is a little better in this regard because it does link back to sources and is trying to be a search-engine-like entity. OpenAI, in almost all cases, is not.


  • Sora is a generative video model, not exactly a large language model.

    But to answer your question: if all LLMs did was redirect you to where the content is hosted, they would be search engines. Instead, they reproduce what someone else was hosting, which may include copyrighted material, so they’re fundamentally different from a simple search engine. They don’t direct you to the source; they reproduce a facsimile of the source material without acknowledging or pointing you to it. Sora is similar: it produces video content, but it doesn’t redirect you to the similar video content it is reproducing from. And we can argue about how close something needs to be to an existing artwork to count as a reproduction, but I think for AI models we should enforce citation requirements.


  • We need to start poisoning this data. I don’t think the solution is to cut the wires; I think it’s to send bogus data. Just make it so that no matter how I drive, the data always gets overwritten to say I traveled 5 miles at a 30mph average with no hard stops and no hard accelerations. I only ever make that trip. Wanna base my insurance off that? Go for it.

    Anyways, I lack the technical ability to do this, but I wonder if some enterprising person could hack the OBD port to constantly overwrite the data here.
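In spirit, the poisoning is just swapping real trip telemetry for one fixed, boring trip before it leaves the car. A toy sketch of that substitution (the record format and field names are made up for illustration; real OBD-II telematics is far messier, and actually intercepting it is a hardware problem):

```python
def poison_trip(_real_trip):
    """Discard the real telemetry and emit the same bland trip every
    time: 5 miles at a 30mph average, no hard stops or accelerations.
    The field names here are hypothetical, not any insurer's schema."""
    return {
        "distance_miles": 5.0,
        "avg_speed_mph": 30.0,
        "hard_brakes": 0,
        "hard_accels": 0,
    }

# Whatever actually happened on the road...
real = {"distance_miles": 42.0, "avg_speed_mph": 71.0,
        "hard_brakes": 3, "hard_accels": 5}
# ...the collector only ever sees the boring trip.
print(poison_trip(real))
```

The software half really is this trivial; the hard part is getting between the car’s sensors and its reporting module in the first place.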

    Again I want to poison this data. It should be illegal, but it’s not. Companies will charge me more if I block it. So the solution is data poisoning imo.

    Incidentally we need to be poisoning ALL data brokers and collectors for these types of things.