Monday, January 1, 2024

Open Interpreter on Termux: the Best Mobile ChatGPT Client

Well, it turns out that the best ChatGPT mobile app is... the terminal shell. I'll explain.


First, some context. If you're like me, you benefit immensely from having access to ChatGPT, specifically with the premium benefits of GPT4 and the "code interpreter / advanced data analysis" functionality, but you would rather not pay the steep price of $20 a month for rate-limited access. Cheaper solutions that you might be drawn to take the form of alternative chat bot clients that allow for local LLMs to be used, or at least allow for connecting to the OpenAI API. Directly connecting to the OpenAI API for access to GPT4 should be cheaper than ChatGPT in most cases.¹ I've heard estimates that it's up to 10x cheaper through the API.

There are a handful of alternate chat-bot clients with code-interpreter-like ability that are either very cheap or free, and are basically identical to the ChatGPT experience, so this dream doesn't seem that far away. Of course they will require more tinkering than the frictionless official ChatGPT client, but you get what you pay for. Many would accept a bit more maintenance for that lower cost, as long as all else is equal.

However, the one sticking point is mobile operation. If you're content to use ChatGPT on your desktop alone, this won't be an issue and you probably don't need to read this blog post. But, I have found very few phone apps which provide a ChatGPT-quality experience at a lower price than ChatGPT (though there is a glut of copycat apps at about the same or higher price than ChatGPT). And I haven't yet found any mobile apps that would provide code interpreter capability, despite the fact that the OpenAI API surfaces it alongside the GPT offerings.

Because of this sticking point, at least for now on mobile, the official ChatGPT app really is in a class of its own, making the $20 a month subscription seem almost worthwhile. This is a shame, because it's a high price point for something that I personally think is revolutionary for mobile computation. Writing a Python script on the horrid mobile keyboard doesn't sound very feasible, but delegating the bulk of the programming to ChatGPT with code interpreter allows for fairly technical analyses to be executed quickly, with mostly natural language.

But this wouldn't make it into a blog post if the story ended here. Let's save some money on those monthly subscriptions!

One of the best chat bot clients available on desktop is Open Interpreter, a command-line application. UX-wise, a chat bot doesn't require much of a user interface beyond the terminal shell anyhow, and this tool reputedly provides some of the best code interpreter abilities outside of ChatGPT, running locally on your machine instead of in the cloud. By default it works with GPT4 over the OpenAI API, but it can be configured with local LLMs too for a zero-cost solution. And by virtue of living in the terminal shell, it offers flexibility for chaining with other local services, or even making network requests. It also seems to have strong community support and trending usage, so we can hope for it to stay current as the AI status quo shifts. Overall, it definitely makes the cut for me in replacing the official ChatGPT web client on my desktop computer.

At first glance that doesn't help with our need for a great ChatGPT mobile app, until you consider that there are several phone apps that provide a very suitable shell environment on mobile. It's almost never the case that a terminal shell is an effective mobile user interface, but it turns out that in this moment, the best and cheapest ChatGPT mobile experience is in fact inside the terminal, using Open Interpreter.

Termux on Android is the cream of the crop for mobile shells: an open-source app with a robust feature set and a solid community, so it makes the obvious pairing with Open Interpreter.

As an aside, you might wonder why there is a solid community for using the command line on mobile phones: according to my limited poking around on forums, this is mostly folks who don't own "full size" computers, but who still want to be able to do technical work like programming, server administration, or scientific analysis. That's especially relevant for the developing world, where Android mobile devices make up the vast majority of computer ownership. Termux also appears to be a favorite of hackers and penetration testers, to get a shell interface into local devices on the go, and also of bot-farm owners who benefit from the low cost and low electricity usage of mobile devices for their bot tasks.

But back to the main point.

If you love Open Interpreter for the feature-filled and low-cost experience it provides on desktop, you can bring it onto your phone too! 

And if you love Termux for the general purpose mobile computing capabilities it provides, installing Open Interpreter might really level-up your workflow!

Instructions for Setting up Open Interpreter on Termux

Setting up Open Interpreter on Android through Termux takes a few extra steps compared to the desktop setup. This is because of the Google Play Store restriction on file execution and the limited, basic shell environment Termux provides by default. Here are the steps I ended up taking, as of December 29th, 2023, to set up with GPT4 as the backing LLM.

1. Install the F-Droid app by downloading its APK.

2. Install Termux through F-Droid

    Note: Do not install Termux through Google Play; that build is outdated and no longer receives updates.


3. Open Termux and run `pkg up` to get the shell environment up to date.


4. Install dependencies that are needed to build Open Interpreter: `pkg install python rust binutils binutils-is-llvm libzmq matplotlib` (note that `pkg install` takes space-separated package names, not a comma-separated list)


5. Install Open Interpreter with pip: `pip install open-interpreter`


6. Start the program: `interpreter`


7. Follow the Open Interpreter instructions to add your OpenAI API key

    You can get an OpenAI API key through the API keys section of the OpenAI user settings page.
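Condensed, the terminal portion of the setup above (steps 3 through 6) amounts to the following in a fresh Termux session:

```shell
# Bring the base Termux environment up to date
pkg up

# Build dependencies for Open Interpreter's Python packages
# (space-separated; pkg install does not take commas)
pkg install python rust binutils binutils-is-llvm libzmq matplotlib

# Install and launch Open Interpreter
pip install open-interpreter
interpreter
```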


At this point, your chat client is running and you can ask away. But let's improve the experience a bit by storing our API key in the Termux environment so we don't need to enter it every time we start up, and by creating a home screen shortcut that opens directly into chat.

For both of these next tasks, Open Interpreter itself can help finish the setup.


8. Ask Open Interpreter to set a persistent environment variable for the API key, and let it take care of the details.

My prompt: `id like to set my environment variable OPENAI_API_KEY so that it's permanently saved beyond this session`  followed by providing it with the API key when it asks, and then allowing it to make the change.
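If you'd rather do it by hand, what Open Interpreter sets up here amounts to a one-line addition to your shell's startup file. A minimal sketch, assuming the default bash shell and with `sk-your-key-here` as a placeholder for your real key:

```shell
# Append the key to the bash startup file so it persists across sessions.
# "sk-your-key-here" is a placeholder -- substitute your actual API key.
echo 'export OPENAI_API_KEY="sk-your-key-here"' >> "$HOME/.bashrc"

# Load it into the current session too (new sessions pick it up automatically)
. "$HOME/.bashrc"
```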

 

9. Ask Open Interpreter how to add a home screen shortcut for a Termux program.

My prompt: `with termux for android, how do you go about adding a home screen shortcut that opens up a shell and immediately runs the command "interpreter"`

Follow Open Interpreter's suggestions: as it says, you'll need to install Termux:Widget from F-Droid. It can then make the shortcut script for you in the right place, and you'll be able to add the corresponding widget to your home screen just like you would any other widget.
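For reference, a hand-made version of the shortcut script it creates looks something like this (`chat` is an arbitrary name I chose; Termux:Widget offers any executable script under `~/.shortcuts` as a widget):

```shell
# Termux:Widget scans ~/.shortcuts for scripts to expose as home screen widgets
mkdir -p "$HOME/.shortcuts"

# The script itself just launches Open Interpreter;
# the shebang points at Termux's own bash
cat > "$HOME/.shortcuts/chat" << 'EOF'
#!/data/data/com.termux/files/usr/bin/bash
interpreter
EOF

chmod +x "$HOME/.shortcuts/chat"
```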


Next Steps

With Open Interpreter on Android you can do many of the same tasks as you could do through the ChatGPT client, but by paying per request and with access to your local files. Go wild!

If you are a new Open Interpreter user, you'll want to look closely at the Docs to see what's possible with the different commands. In particular, `interpreter --conversations` will be needed to look back at past conversations, and `interpreter -y` can be useful, if risky, to streamline execution permission for code interpreter tasks.

Installing the Termux:API extension from F-Droid will allow you (and Open Interpreter) to access more system commands and allow for using GPS data, taking photos, viewing or making calls and texts, and a few other things. I ended up making a little Python script (with the development help of Open Interpreter) that intelligently summarizes all my recent unread text messages, then reads the summary out with the built-in text-to-speech engine, kind of like what the Humane AI Pin does.
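A rough sketch of that script's shape, redone here in shell rather than Python. This assumes Termux:API and `jq` are installed (`pkg install termux-api jq`) and that `OPENAI_API_KEY` is set; the `termux-sms-list` flags may vary by version, so treat it as a starting point rather than a finished tool:

```shell
# Pull recent inbox messages as JSON
# (termux-sms-list comes with Termux:API; -l limits the count)
msgs=$(termux-sms-list -l 20)

# Ask the OpenAI chat completions API for a short summary
summary=$(curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg m "$msgs" \
        '{model: "gpt-4", messages: [{role: "user",
          content: ("Briefly summarize these text messages: " + $m)}]}')" \
  | jq -r '.choices[0].message.content')

# Read the summary aloud with the system text-to-speech engine
termux-tts-speak "$summary"
```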

So far I've only worked with Open Interpreter's default system message, but careful design of the system message could help make conversations smoother, especially if you want to go deeper into making it fluent with its Termux environment. Using things like `termux-microphone-record` or `termux-camera-photo` alongside speech-to-text and the `interpreter --vision` command, you could script out a quick-and-dirty custom multimodal mobile AI assistant, called up with a shortcut or smart button, further mimicking the Humane AI Pin functionality.

Taking a different direction, using Open Interpreter as a coding assistant will allow expressive scripting and software development on the go with a lot less mobile keypad typing, thanks to the ability to delegate coding and file management tasks to Open Interpreter. It still wouldn't be my preferred way to code, but for folks already coding on their phones I suspect this could be quite powerful.

Happy code interpreting!



¹ I can't find a good source for this besides some folks on forums making estimates. This blog post apparently makes the argument with supporting evidence, but it's paywalled and I haven't actually read it. In my experience the API tends to be much cheaper for the vast majority of tasks, and more expensive if you're doing the type of task that involves many massive request+responses.
