Over the past week I’ve been on a mission to determine if locally installed LLM (Large Language Model) tools like GPT4All & Ollama are reliable enough to use for business. I wanted to know:

  • Could I set up my own knowledge base of proprietary data, query it locally using one of these tools, & get usable answers about that data?
  • As an IT professional, could I set up a laptop with data, configure a local LLM, & hand it to a “non-techie” person to use?
  • Could I travel with it &, without internet, use it as a self-contained knowledge base of company, industry, or proprietary data?
  • If the above is possible, could the data be made portable (say, on a USB or external hard drive) & used on any computer with a local LLM installed?

Are local LLMs ready for prime time?

To be fair up front, no one promised me this. None of the local LLM tools I tested promote themselves as a business solution, & I’ve never seen or read of anyone using them this way. I also understand that even though these tools are marketed (and sold) as ready & functional, their makers still describe them as “experimental”.

Also, two of the tools I tested are free. That should not be overlooked, nor should the time the developers put into releasing them to the public & keeping them updated. It is MUCH appreciated.

I’m testing a specific use case to see if they offer any real reliability in a business environment, or if we’re jumping the gun & should hold off on trusting them as a local document solution for our most sensitive data.

 

The hardware setup

I’d already played with GPT4All on my refurbished Gen 1 Lenovo T14 w/ Ryzen 7, 32GB RAM, & integrated graphics. Initially it seemed great, albeit a little slow on some queries, but it handled basic Alexa-level tasks:

  • “Tell me a joke”
  • “Suggest some ideas for activities”
  • “Do you have a recipe for Risotto?”

I even set it up to query my own documents & really liked that I could add multiple folder paths & pick & choose between them. I was impressed; it seemed to work well, letting me query things about the documents & giving me answers based on them. Or so I thought.

Assuming I was having some success, I wanted to explore this as an actual business solution. I didn’t want my hardware to hold me back, yet I also wanted the type of laptop a business professional could use to run a local LLM, one that still looks professional, at a price point IT departments could get behind. An NVIDIA graphics card was important, since NVIDIA has added support for running local LLMs. So I used all those excuses as a reason to purchase this Lenovo LOQ (i7-13650HX || NVIDIA RTX 4060 8G GDDR6), upgraded the storage to 2TB & the memory to 64GB, & installed Linux on it out of the box (I wrote about that here). That last part wasn’t really necessary, but Microsoft knows what they did.

 

GPT4All

As I said, I’d already played with GPT4All & at the time was impressed. To get ready for this experiment, I read a few things about making documents machine readable, & even downloaded some of my resources using the machine-readable JSON option. One resource took all night to download. My second data set was a collection of .csv downloads from another project.

So I fired up GPT4All & downloaded a few Large Language Models: Mistral (trained by Mistral AI), Llama 3 Instruct (trained by Meta), & Orca Full (trained by Microsoft). I set up my data folders & started testing.

At first it seemed great! I was asking questions about the information in my documents & it seemed to answer them. That is…until I started asking about details. Then it started making things up. I tried to make the questions as simple as possible, many times actually giving it the answer from the doc, & it still kept telling me that as a local LLM it doesn’t have access to local documents. WTF?

It felt like the equivalent of Nikola rolling their self-driving trucks downhill to make them appear to be driving, when in fact the only thing that actually worked was the lights.

So I asked it outright whether it was giving answers based on the docs I’d shared or on its own training. It answered that it was using its own training. It had no idea what documents I was talking about; even when I pointed them out by name, by path, with hashtags…whatever I tried, it did not see any docs.

I did some reading & found out on Discord that GPT4All & the LLMs only work with a limited set of formats, including .txt, .pdf, & .md. I feel like this is important information that SHOULD BE ON THE WEBSITE next to the place where you’re bragging about using local docs. Anyway, my bad. So I converted my .csv files to Markdown, gave up on the JSON files, & reloaded the converted data.
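For anyone trying the same thing, the conversion is trivial to script. Here’s a minimal sketch (standard-library Python only; the folder & file names are whatever you use, not anything GPT4All requires) that turns every .csv in a folder into a Markdown table it can index:

```python
import csv
from pathlib import Path

def csv_to_markdown(csv_path: str) -> str:
    """Convert one CSV file into a Markdown table string."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",                 # header row
        "| " + " | ".join("---" for _ in header) + " |",  # separator row
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)

def convert_folder(folder: str) -> None:
    # Write a .md file alongside every .csv in the folder
    for csv_file in Path(folder).glob("*.csv"):
        md_path = csv_file.with_suffix(".md")
        md_path.write_text(csv_to_markdown(str(csv_file)), encoding="utf-8")
```

Point GPT4All’s docs folder at the output & it at least has a format it claims to support.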

I started querying information from the documents again, & it had no idea what I was talking about.

I’m sorry Dave…

“I apologize for any confusion earlier! As a conversational AI model, I don’t have access to any…data or information about specific…documents. My previous response was an error on my part.”

I must have seen different versions of that 100 times. So I figured maybe I needed to dumb it down & use exact-match words from the document so it would recognize the terms. I cleaned up & simplified the docs even more, & changed headers & terms to the clearest language possible. I queried the data again, & it seemed to answer, before I again realized it was answering based on training data & not the information in my documents.

At this point I was certain this was not a usable option for business, but I wanted to see if I could get this thing to work at all. I disconnected & reconnected the path to my docs multiple times. Finally, on one attempt, when I selected which data to query I saw a progress bar indicating it was actually scanning the docs. I’d done this many times before; this was the first time it gave any indication it was doing something. I hadn’t known to look for that, because I had no idea what it looked like.

Is it working…?

It proceeded to answer the next 3 softball questions correctly, but couldn’t answer follow-up questions. Then it went back to having no idea what I was talking about, & being completely oblivious to any documents or local data.

The really infuriating part is that it indicated it was searching my documents folder. I’d also checked the box to make it cite its sources, & it would list my documents as the source of its answers, but the actual answers would have nothing to do with the data in them.

I tried the other local LLM models…Llama 3, Orca…no difference. I even created & uploaded simple dummy data in plain text with simple words in it; it still couldn’t do it.

I went back to some of the YouTube videos I’d watched about this & realized none of them actually showed anyone doing anything terribly serious or important. Sure, they played with the tools doing Alexa-level stuff, asking the same kind of questions you would ask an Encyclopedia Britannica.

Well…sort of…

Technically, you are running an LLM locally. You can choose various models & ask them questions based on their already-trained data. All that stuff actually works, & it’s really kinda awesome if you’re into just playing with these tools & not using them to get any real work done.

This is also free & open to the world, & I do recognize how it can look absurd to be critical of something that’s free. However, these tools have been promoted as local document solutions, so I think the criticism is fair.

Using this tool to query your local documents is not a reliable function that I could suggest to anyone else, nor use for any business or professional reasons. At least not yet.

Moving on.

OpenWeb UI via Pinokio

I’d always intended to check out OpenWeb UI & Ollama, so no time like the present. I’d heard about Pinokio from a guest on This Week In Tech (by the way, support the TWiT Network) a few weeks back, but had never played with it. It seemed like an easier installation than the OpenWeb UI instructions, so I gave it a try.

Installation wasn’t bad, & with minimal fiddling I got Ollama started & fired up the Pinokio interface. First off, it’s a great interface. I wouldn’t exactly call it user friendly for non-techies, but once you get a feel for it, it makes it easy to play with various LLMs & AI models, & to download & install them. So I downloaded a couple of additional language models, uploaded my now-simplified docs, & started querying.

It was a bust right out of the gate

I could never get this one to work at all. It never recognized any of my docs. Even when I uploaded the doc along with the question, pointed to the answer inside the doc, named the doc specifically, & basically just asked it to tell me one word inside the doc, it still could not answer based on the data provided.

I won’t bore you with all the ways I tried to troubleshoot this thing to get it to work as advertised, but the bottom line is that as a reliable tool for local documents, it’s unusable.
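One sanity check worth knowing about if you’re troubleshooting the same stack: OpenWeb UI sits on top of Ollama’s local REST API (by default at http://localhost:11434), & you can query it directly to separate UI problems from model problems. A minimal sketch, assuming you’ve already pulled a model such as llama3:

```python
import json
import urllib.request

# Ollama's default local endpoint for single-turn generation
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    # stream=False asks Ollama to return the whole answer as one JSON object
    return {"model": model, "prompt": prompt, "stream": False}

def ask_ollama(model: str, prompt: str) -> str:
    """Send a prompt straight to the local Ollama server, bypassing any UI."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

If a call like `ask_ollama("llama3", "Say hello")` answers but the UI’s document queries don’t, the model is fine & the retrieval layer is what’s failing.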

Just like with GPT4All, all the videos about it demonstrated using the different models as-is. From what I’ve seen so far, that part all works. You definitely can play with A LOT of different AI & LLM models. And again, free! Kudos to the developers for putting it together & making it available to the public. I will def continue to use it & watch as it develops.

But for now, the local document reliability is a bust.

Is it me?

Even if I were doing something wrong & needed to spend days figuring it out, that’s not usable for a normal person who doesn’t have the time, inclination, or knowledge to do that. I couldn’t give this to a business client, let them travel with it, & be assured that it’s going to work when they get to their destination. Again, that was never promised, but I just wanted to see if it were possible.

It is not.

ChatGPT is still the king

ChatGPT & I go way back. I have been a Plus subscriber since the beginning & have dabbled at building Custom GPT apps that have served me well: developing marketing ideas, creative writing, formatting my resumes in every way possible, & helping me pass the CompTIA Sec+ & study for the CySA+ (taking soon). ChatGPT works.

So I thought I’d try it for sh*ts & giggles to see what it could do for this use case.

I created a custom GPT, gave it its instructions for what I wanted it to be able to do with the data, gave it resources it could reference, & uploaded my docs.

Right out of the gate the first thing it offered to do was reformat/reorganize the data so that it was formatted correctly for easy retrieval. It then systematically went through each document & formatted it in the way it wanted them. This must be new. I’ve used docs with it before & it’s never done that.

I then asked it what kind of industry- or role-specific reports it could help me create based on the data, & it listed 17 different ways it could take my data & turn it into detailed reports & insights.

I mixed & matched some of the suggestions, asked it to correlate data across documents & create a specific report based on those parameters…and it did it…and all the data was correct! It also recommended other insights that it could provide based on the data.

THIS is the functionality I was looking for. Even half of this would have been great from the other tools. But of course there are drawbacks.

ChatGPT is damn near unusable during the day.

This is not a secret. It’s slow. It times out, shuts down, loses your chats, or just stalls for long periods of time.

GPT Plus seems even slower, or worse, just when you get a rhythm going…Uh Oh! You’ve used up your allotment & have to come back later. That’ll be $20 please!

It’s like a dysfunctional relationship. The few times you are able to use it successfully, it’s so wonderful that you let all the previous frustrations slide. ChatGPT & GPT Plus just do not have the resources to keep up with their user base, & are not reliable as a business productivity tool.

Additional issues include:

  • There’s a 10 document limit. Not sure of the size limit of each doc.
  • Uploading docs to ChatGPT isn’t private, & you definitely should not upload any company information or proprietary data to it.
  • You need the internet to use it, defeating the purpose of searching for a locally run tool that can be used at all times.

So then I thought…

What about the option to use GPT3.5 or 4 with GPT4All?

Could I get things to work locally that way? It sounded good in theory, but we’d still have the same privacy problem: the processing is not happening on your device, & all of your chats are being sent to OpenAI’s servers.

 

OK. Then what about Co-Pilot? It’s basically GPT-4, right?

  • Even before this Recall nonsense, Microsoft has been increasingly collecting telemetry on its users & their usage. I wouldn’t want my company docs anywhere near Windows or Microsoft’s servers. (THIS is what they did, by the way.)
  • It cannot be used offline, & you would be communicating with Microsoft servers in “the cloud”. Given that Microsoft products & Windows continue to be the biggest attack surface on the planet, this is not a safe option for your private data.
  • The ads all over Windows & Edge are a real turn-off. I can’t ignore that & trust my proprietary information & documents to a company that proclaims to be a business solution, yet shows such poor judgment with the amount of tracking scripts it runs under the con of “personalized ads”. (THIS is also what they did.)

And here lies the problem

  • This tech is only really useful on other people’s clouds, & those clouds are controlled by a few players who are in the data collection business & cannot promise, nor provide, absolute security for your data.
  • Even if data could be isolated & that could be governed by a TOS, companies CHANGE THEIR TOS ALL THE TIME…literally changing the terms of the deal after you’ve agreed to it, giving themselves permission to use the data as they wish, & there’s absolutely no oversight into this practice. There’s probably a TOS change from some company or service in your inbox right now.
  • Furthermore, even if all the right promises were made, the TOS was legally written in stone, & your data was promised to be in an impenetrable vault on their proprietary servers that only you had access to, there is absolutely NO WAY TO EVER VERIFY THIS.

For this technology to be really useful it is imperative that people (and businesses) be in control of their own data at all times. The only way to know for sure that you & only you see & control your data, is to keep it on your devices or your own storage. Everything else is hope. You’re hoping that it really is safe, & they really aren’t secretly using it as they see fit.

These companies are creating this awesome technology, but just like everything else they create…and I mean EVERYTHING…they have not figured out how to make them secure.  At the same time they are also still “training” these models so they have this insatiable need to collect & feed them with more & more human data. As long as this is the reality I wouldn’t put my hands or my data anywhere near their mouths.

My wish list for what that looks like in the future

For now it appears the only way to have a searchable, self contained knowledge base of proprietary or industry specific knowledge is to train it yourself.  That’s obviously out of reach for most people so we’re left to using what we’ve always used…local device & document search tools that suck, & doing manual assessments of that data.

However, given how fast things are moving in this space I have confidence that we will get a reliable local LLM tool that actually works, & we’ll get it soon. We’re honestly almost there.

My wish list for what that looks like includes:

  • For it to actually be local. All processing is done on device.
  • I’d be OK if the model requires a little more hardware resources than what they’re aiming for now. I mean, it would be great if it ran on minimum specs, but let’s get it right first, then quantize it.
  • It should have the option of connecting to the internet for research & information resources such as industry regulations, public databases, & websites. It’s great that it can run without the internet, but taking away the option completely severely limits its functionality & keeps every LLM as basically a limited parlor trick, leaving users to figure out which model is safe & trained in what they need.
  • It would be nice if you could use APIs to connect to trusted knowledge bases to pull information, but the processing is still done on device.
  • Free & open source would be great, but I understand development costs money. I would pay a fair amount for an open source solution that actually works.
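To illustrate what “all processing on device” actually means for document Q&A: even the simplest retrieval step can stay entirely in local memory, with nothing sent anywhere. This is a toy sketch (a real tool would use embeddings; every name here is hypothetical) that ranks local documents by how often they contain the query’s terms:

```python
import re
from collections import Counter

def tokenize(text: str) -> list:
    # Lowercase & split into alphanumeric terms
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_documents(query: str, docs: dict) -> list:
    """Score each local document by query-term frequency.
    Nothing leaves the machine: the query & the docs stay in memory."""
    q_terms = set(tokenize(query))
    scores = {
        name: sum(Counter(tokenize(text))[t] for t in q_terms)
        for name, text in docs.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

The point isn’t the algorithm, it’s the privacy property: the whole pipeline runs on your hardware, so there’s nothing to trust but your own disk.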

The promise of regular people running local LLM & AI models with their own data, even on spec’ed up hardware, is a little premature.

We ain’t there yet.

Additional reading

It is imperative that we get this right if these tools are ever going to be useful for academics, researchers, or the average business exec who needs to crunch their own data without having to trust the security of someone else’s cloud. What good are all these tools if they can only do parlor tricks like recipes, jokes, & homework? We could already do that with search…back when search engines were good (you know what you did, Google).

Here are a few security related stories that I’ve been following in this space.