Unleashed.chat

From the folks that protect your coins at coinkite.com

treenutz68 Nice, I'll try it. But it can't deanonymize me, right? GrapheneOS protects me, right?
I mean, I heard that when people use invasive applications like Spotify, the applications manage to deanonymize the users. I think it was because of the router the person used, but normally it should not be possible because he used a VPN.

What do you think?

    000000

    Hmm, I think at most they would be able to track the IP address, cookies, or other browser-based identifiers. If you use techniques that obfuscate your IP address and browser-based identifiers, like Orbot, a VPN, and/or specific browsers, you should be able to avoid that type of tracking.

    My understanding is that the prompt and response data are only stored in the user's browser, so perhaps that helps as well?

    Here is a link to their privacy policy:
    https://venice.ai/legal/privacy-policy

    16 days later

    Pocketstar

    Hey there! Would you be able to share some example models and the speeds you're getting on the G2? I'd love to see it, and I've only been able to find about three people who have done this on Pixels, and only one listed tokens/s.

    Thanks for your time!

      I can't speak for the G2; I have a G4, and there are two models I sometimes use locally. Llama 3.2 3B works fairly well in terms of speed. Llama 3.1 8B runs :).

      To be honest, I didn't test many models locally, as I mostly use my self-hosted Ollama instance, which is much more performant. I mainly keep the two models mentioned above for when I have no internet, and use them basically as a small 'Wikipedia'.
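
      In case it is useful to anyone, here is a minimal sketch of querying a self-hosted Ollama instance from another machine with the official Python client (pip install ollama); the host address and model tag below are placeholders, so substitute whatever your server actually exposes:

```python
import ollama

# Placeholder address of a self-hosted Ollama server (the default port is 11434).
client = ollama.Client(host="http://192.168.1.50:11434")

# The model tag is an example; use whatever `ollama list` reports on the server.
response = client.chat(
    model="llama3.1:8b",
    messages=[{"role": "user", "content": "Explain GGUF quantization in two sentences."}],
)
print(response["message"]["content"])
```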

      7 days later

      0xAB
      My apologies for the delay, I haven't been on the forum for a while.

      I use the 4-bit (Q4_K_M) Spicyboros 7b from https://huggingface.co/TheBloke/Spicyboros-7B-2.2-GGUF?not-for-all-audiences=true

      The 4-bit (Q4_K_M) Silicon Maid 7B also works decently:
      https://huggingface.co/TheBloke/Silicon-Maid-7B-GGUF?not-for-all-audiences=true

      Strangely, Silicon Maid is slow on my Pixel Tablet but Spicyboros is not, and vice versa on the phone, where Spicyboros is slow but Silicon Maid is fast.

      I use uncensored models because I don't want to have to argue with my phone in order to ask "naughty" questions and get answers to them.

      The speed is decent; it's about as fast as a person typing. It gets the job done, so for me it is not an issue.
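
      In case anyone wants to grab the same quants, here is a minimal sketch of pulling one of these GGUF files with the huggingface_hub client; the exact filename is my guess at TheBloke's usual naming scheme, so check the repo's file list before relying on it:

```python
from huggingface_hub import hf_hub_download

# Repo from the link above; the filename is assumed from TheBloke's usual naming
# convention, so confirm it under "Files and versions" on the model page.
model_path = hf_hub_download(
    repo_id="TheBloke/Spicyboros-7B-2.2-GGUF",
    filename="spicyboros-7b-2.2.Q4_K_M.gguf",
)
print(f"Downloaded to {model_path}")
```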

      a month later

      I have been having a play with local LLM chat on a Pixel 7 Pro. I've had a look over this thread and others on the forum for app recommendations. Just based on my experiences playing with this for a couple of hours, and without claiming to be "fair" (I'm sure some problems could be rectified with further playing around, and I haven't gone and tried to download alternate versions or report problems to the authors), here's how it went for me:

      Maid

      I got this from F-Droid, which was nice. I was able to chat with this if I downloaded the 1B-parameter TinyLlama, but it wasn't terribly coherent - probably a limitation of the model. I tried to download Mistral 7B, but the download kept getting stuck part way through, so I had to give up. I tried to run a local GGUF of Meta Llama 3.1 8B Instruct Q6_K_L, but it either didn't work or was so slow I never got a response - given it wasn't even chewing up my battery, I suspect it wasn't working.

      Private AI (by FireEdge)

      I got this from the Aurora Store. It appears to prevent you from using larger models or your own GGUFs unless you pay - certainly I kept getting errors about a billing SDK. However, it ran both Gemma 2B and Qwen 2.5 7B from its built-in list of models at very acceptable speeds.

      The first time I tried to download Qwen it got wedged somehow and would neither finish the download nor let me delete the partial download. Every time I tried to chat it would just spew out garbage characters. I fixed this by uninstalling the app and reinstalling, and (not sure it was necessary) I made sure to keep tapping the screen to stop it turning off during the download and installation of the model.

      The selection of models is limited but this is very easy to use, if you ignore the glitch.

      MLCChat

      I got the APK from the GitHub release page and downloaded the Gemma 2B model from the link within the app. The performance seemed noticeably bad, much worse than "Private AI" running notionally the same model. My phone was visibly chugging and the whole phone UI was unresponsive. The response to a simple question took a minute or so, whereas Private AI was taking a few seconds. The GUI said "prefill: 0.2 tokens/s, decode: 4.0 tokens/s", for what it's worth - which is at least consistent with a couple-hundred-token answer taking about a minute at 4 tokens/s.

      ollama in termux

      I installed Termux from the Aurora Store and otherwise followed the instructions here. I had to edit one of the files to change "gzip --best" to "gzip -9", but otherwise the instructions worked fine.

      The performance with Llama 3.2 3B was bad. Borderline usable, but very bad. I left it answering a simple question, came back ten minutes later, and for some reason my phone had essentially crashed. I had a black screen; pressing the power button gave me a menu allowing me to choose Lockdown/Power Off/Restart, but I couldn't get anything else to come up, so I forced it to restart. (As a result, I can't tell you which file I had to make the gzip change in.)

      This could be a Termux problem rather than an Ollama one, I don't know. In any case, the performance was bad enough that I'm not that interested in it for running LLMs, although I think Termux could be very useful to me in general.
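
      Since people earlier in the thread were asking for tokens/s numbers: if you do get Ollama running (in Termux or self-hosted), its HTTP API reports eval_count and eval_duration for each non-streamed response, so you can compute the decode speed directly. A minimal sketch, with the host and model name as placeholders:

```python
import requests

# Inside Termux the server listens on localhost:11434 by default; adjust as needed.
OLLAMA_URL = "http://localhost:11434/api/generate"

resp = requests.post(
    OLLAMA_URL,
    json={"model": "llama3.2:3b", "prompt": "Name three uses for a paperclip.", "stream": False},
    timeout=600,
)
resp.raise_for_status()
data = resp.json()

# eval_count is the number of generated tokens; eval_duration is in nanoseconds.
tokens_per_second = data["eval_count"] / (data["eval_duration"] / 1e9)
print(f"{data['eval_count']} tokens at {tokens_per_second:.1f} tokens/s")
```

      The prompt_eval_count and prompt_eval_duration fields give the prefill speed in the same way.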

      ChatterUI

      The latest beta from GitHub wouldn't let me add a new model (the button seemed unresponsive), but the 0.8.2 release is working like a charm. I don't properly understand what kind of models it wants - on the Models->Show Settings tab it says that the supported quantizations are "Q4_0_4_4 Available" and "Q4_0_4_8 Not Available", but I'm used to downloading models from Hugging Face with quant names like Q4_K_M. However, I had a copy of Meta Llama 3.1 8B Instruct Q6_K_L lying around on my PC which I copied over to my phone, and ChatterUI is running it at completely usable speeds.
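
      One way to sanity-check whether a particular GGUF or quant is usable at all, before copying it over to the phone, is to load it on the PC with llama-cpp-python. A minimal sketch, with the file path as a placeholder for whatever GGUF you downloaded:

```python
from llama_cpp import Llama

# Placeholder path to the GGUF you want to test.
MODEL_PATH = "./Meta-Llama-3.1-8B-Instruct-Q6_K_L.gguf"

# Loading fails fast if the file is truncated or the quant is not supported by
# this llama.cpp build, which is cheaper to discover here than on the phone.
llm = Llama(model_path=MODEL_PATH, n_ctx=2048, verbose=False)

out = llm("Q: What is the capital of France?\nA:", max_tokens=16)
print(out["choices"][0]["text"].strip())
```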

      TL;DR

      Of the apps I tried here, only "Private AI" and "ChatterUI" performed well. Given I tried different models on each, it's hard to say whether one is faster than the other, but they are roughly comparable and streets ahead of the others.

      If you want to play around with local LLM on your phone with minimum fuss, try "Private AI" - except for the download glitch I had, it's pretty much install-and-play.

      If you want to run arbitrary models or go larger than 7b, ChatterUI is the way to go. Depending on what you want to do, a lot of the fun in the LLM space is playing around with whatever the hot new model is, so ChatterUI has a definite advantage here. And while I didn't build it from source, the fact it is open source (AGPL-3.0) is very nice.

      +1 for Private AI (by FireEdge)
      +1 for LM Studio on Windows

      On my P9P, using Private AI, I run the AI Model from Brighteon.ai
      https://brighteon.ai/Home/

      It is from Mike Adams, aka the Health Ranger, founder of Brighteon... it does skew towards health, but I count that as a plus.

      It is from June, but they are working on an update for 2025.

      You have to register to download, but it can be run fully offline (deny the Network permission to Private AI) in Private AI on my phone, or in LM Studio.

      Here is a partial list of what is included:

      Neo-Dolphin-Mistral 7B V0.1.6
      8,973 articles from Mercola.com
      26 books on vitamins, minerals, nutrients and natural medicine
      18 books on survival, foraging, wild foods, off grid survival skills, bushcrafting
      17 books on mainstream medicine, COVID, pharmaceuticals, pesticides and herbicides
      Plus all the data from earlier data sets

      HTH

      6 days later

      Has anyone got Lite Mistral working in ChatterUI? It's lightweight and fast but rambles incoherently to itself. I can't figure out the instruct sequence. I see it featured in one of the screenshots on the ChatterUI GitHub.