A Rough Guide to Private AI: How To Buy, Build, and Use LLMs and RAGs.

For a lit­tle under $8,000 you can have your own very capa­ble pri­vate AI at home that you can feed data to and get answers from. 

It’s not hard to do, it’ll put you out at the cut­ting edge, and I think this offers sig­nif­i­cant advan­tages to those who fol­low through with it. This guide talks about what the machine con­sists of, how to put it togeth­er, what you might use it for, and what is still fantasy.

This is writ­ten for us semi-nor­mal peo­ple inter­est­ed in AI. We aren’t going to build our own LLM from scratch and we may not code for a liv­ing. We’ve used Chat­G­PT or Per­plex­i­ty and think it’s neat, and we can copy/paste com­mands into a CLI. We may have nev­er built our own PC but that’s fine; I had­n’t either.

We’re will­ing to read a bit and muck about, but in gen­er­al we don’t have time to devel­op deep domain exper­tise; we just want some­thing that most­ly works and puts us out at the fore­front of the AI revolution.

You can also spend less if you want. My budget was $8k so that’s what I used. $3k is probably as low as you can go and get reasonable performance, but I’m sure some wizard out there will let me know in the comments that if I just re-ionize toothpaste and flame-char duct tape to exactly 417.314 Fahrenheit then smelt my own iron, I could do the whole thing for $5.

I did­n’t do that, just spent the mon­ey, enjoyed build­ing an excel­lent thing, and called it “Mon­stra”.

The GPUs are a lit­tle too close togeth­er to run for long term com­fort, but for rel­a­tive­ly short tasks the thing is awesome.

Yes, you can run an LLM on your home com­put­er right now; it’ll just be a lot slow­er than a pur­pose-built mon­ster. “Mon­ster” is rel­a­tive; for a home set­up this is on the “over­built” side. Com­pared to top of the line mas­sive racks of A100s or H100s, this thing is puny.

Ok, so what do you get out of build­ing your own per­son­al AI machine? 

  • Pri­va­cy in your queries. Maybe you don’t want Ope­nAI to know about the prod­uct you’re build­ing that you need help coding.
  • Relat­ed to this, you get the abil­i­ty to use an AI trained on data that you might not want to share, like tax records or ship­ping infor­ma­tion for every sale your com­pa­ny has ever made. 
  • You get flex­i­bil­i­ty in set­ting it up exact­ly how you want it; how much “horse­pow­er” you use, how cre­ative you want your respons­es to be, and spe­cif­ic things to focus on.
  • You get to exper­i­ment on the cut­ting edge, and find out how much you can actu­al­ly do and how much is still you fan­ta­siz­ing about hav­ing your own dry­sine queen.
  • You can also par­tic­i­pate in rent­ing out spare capacity.

I’ll start with what I’ve been exper­i­ment­ing with, then we’ll get into the details of the build and how to set it all up. This is not (by a long shot) all you could do, these are just some of the prob­lems I’d like to solve for myself.

Training Plans (race prep, overall fitness, etc)

A few years ago, I hired a pri­vate high-end coach to help me build a train­ing plan for a hike ‘n fly paraglid­ing race I was prepar­ing for. This is a for­mer Olympian who ran strength and con­di­tion­ing at Red Bull. I’ll admit it; it was prob­a­bly a lot more coach­ing horse­pow­er than I needed.

He charges $800 a month for what I want­ed, which is a cus­tomized plan with unlim­it­ed calls to an elite coach and alter­ations to con­tin­u­al­ly make the changes required of any train­ing plan. We spoke every day, it was awe­some, and he was worth every pen­ny. Still, $800 is $800.

Now, that’s on the expen­sive side. If you want excel­lent coach­ing you could hire the folks over at Evoke Endurance, who charge $350/month for a sim­i­lar service. 

If you had a private AI, you could take PDFs of all the books that Scott Johnston over at Evoke has written, transcribe all the Evoke (and Uphill Athlete) podcasts, add in whatever other content you want, then feed it to your AI and let it help you develop your own custom training plan.

If you want­ed to be a lit­tle entre­pre­neur­ial, you could open that up as a ser­vice for oth­ers. It won’t be quite as easy as I’m mak­ing it sound, as you’ll need a RAG and nerdy vec­torDB skills, but with Any­thingLLM you can prob­a­bly pull this off for your­self with some elbow grease.

Let me be clear here: As of July 2024, the AI won’t be as good as a pro­fes­sion­al coach. In fact, when you start it may be worse than a bad coach. Still, I’m *guess­ing* that’ll change by Decem­ber of 2024 (the ol’ “6 months in AI is a new cen­tu­ry” rule). That’s a guess and I’m like­ly to be wrong.

LiveChat Agent

Since 2009, my wife and I have run Paleo Treats which has both an online store and a brick and mor­tar shop. Since 2014 we’ve used LiveChat to chat with folks who vis­it the web­site. If you’ve ever been on the PT site and opened up a chat, you were talk­ing to me or my wife. That’s cool, but…we turn off the chat when we sleep, and through­out the day when we have oth­er things to do.

LiveChat offers an AI ser­vice called Chat­Bot that starts at $52/month. With our Home­AI set­up, we can replace that with a cus­tom agent and have more con­trol (and learn way more about set­ting those up) to give super cus­tomized chat expe­ri­ences to folks who want to buy gluten, grain, and dairy free desserts online. 

Taxes

I’ve been in crypto since 2014 or so, and it comes up with our tax preparer every year: “What are these token-things? Why are you still doing them? There’s not any great guidance on this…”

Because tax­es are about mon­ey and mon­ey is so per­son­al, I’m not sure I want to be upload­ing all my finan­cial docs to every tax site promis­ing “We do it with AI!” out there. 

I’d much rather feed my pri­vate AI the US tax code (which you can find here, Title 26), give it my last 10 years of returns, then work through the cur­rent year with the over­watch of an AI know­ing that even­tu­al­ly, as long as I put in my data cor­rect­ly, I’ll be able to do my tax­es far more effi­cient­ly with my own per­son­al con­sul­tant that knows every aspect of my finances.

Again, this won’t be easy or per­fect (or maybe even that help­ful) right now, but if I start work­ing on this now, I’m guess­ing by the next time I need to sub­mit tax­es it’ll be a but­ton push, a few ques­tions, and per­haps the best tax prep expe­ri­ence I’ve ever had.

Our cur­rent tax pre­par­er charges about $6,000 per year to inte­grate 3 busi­ness­es, per­son­al tax­es and my var­i­ous cryp­to adventures. 

Inventory & Operations

Again, as part of the Paleo Treats busi­ness, we’re always order­ing ingre­di­ents and con­sum­ables, doing pro­duc­tion runs, then pack­ing and ship­ping out items. I’ve built a series of Python scripts to help keep track of costs, but the abil­i­ty to start to inte­grate all of our Oper­a­tions data into an LLM and ask “How many pack­ages have we been ship­ping to Sug­ar­land Texas this month, and how does that com­pare to last year” or “Take a look at this giant csv of every pack­age we’ve shipped over the last 10 years and tell me which city gets the most pack­ages” etc etc starts to make find­ing oppor­tu­ni­ties much more efficient.

Yes, I know you can do that with Excel or Google sheets. I think the abil­i­ty to talk to a per­son­al and secure AI starts to change the game for small busi­ness­es who aren’t nec­es­sar­i­ly data wiz­ards but still have legit­i­mate ques­tions about oper­a­tions opportunities.
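
To make the “which city gets the most packages” question concrete, here is a minimal Python sketch of the plain-code version of it. The column names (city, state) and the sample rows are made up for illustration; a real shipping export will almost certainly be laid out differently.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for a real shipping export.
SAMPLE_CSV = """order_id,city,state
1001,Sugar Land,TX
1002,San Diego,CA
1003,Sugar Land,TX
1004,Austin,TX
"""

def top_cities(csv_file, n=3):
    """Count rows per (city, state) pair and return the n most common."""
    reader = csv.DictReader(csv_file)
    counts = Counter((row["city"], row["state"]) for row in reader)
    return counts.most_common(n)

# With a real file you would pass open("shipments.csv", newline="") instead.
print(top_cities(io.StringIO(SAMPLE_CSV), n=1))
```

The point of the private AI is that you get to ask the question in English instead of writing this yourself, but it helps to know roughly what kind of counting is going on underneath.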

Funzies — Grinding a Vanity Solana Address

I stumbled on a post by Nick Frostbutter on how to “grind” a Solana vanity address using a built-in command in the Solana CLI (Command Line Interface, which is where nerds go to talk to computers). That particular set of instructions uses your CPU. With the machine I’d built, the CPU was pretty powerful (gear list below), but the whole reason I got into this was I wanted the GPU capability for LLMs, so…why not use the GPUs?

My bud­dy nos­mas­ter found this Github repo on how to use GPUs to grind, and after a full week­end of fid­dling about with var­i­ous set­tings, I man­aged to grind out a gking... Solana address in about 45 min­utes. I know, I know, it’s good for almost noth­ing and you don’t get a mnemon­ic phrase (the “twelve words”), just a bare­ly deci­pher­able pri­vate key, but it was a fun use of Monstra.
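
Why does grinding take so long? Addresses are written in base58, so each extra character you insist on multiplies the search by about 58. Here is a back-of-envelope sketch in Python. It assumes every character is equally likely, which is not exactly true for real addresses (the first character in particular is skewed), so the true figure can differ a lot. The 1,000,000 keys per second rate is a made-up number for illustration, not a measurement from Monstra.

```python
# Base58 (the alphabet Solana addresses are written in) has 58 characters.
BASE58_ALPHABET_SIZE = 58

def expected_attempts(prefix_length):
    """Average number of random keys to try before one starts with a given
    case-sensitive prefix, ASSUMING every character is equally likely."""
    return BASE58_ALPHABET_SIZE ** prefix_length

def expected_minutes(prefix_length, keys_per_second):
    """Rough average grind time in minutes at a given (hypothetical) rate."""
    return expected_attempts(prefix_length) / keys_per_second / 60

print(expected_attempts(5))                      # 5-character prefix -> 656356768
print(round(expected_minutes(5, 1_000_000), 1))  # -> 10.9
```

Add a sixth character and both numbers get 58 times bigger, which is why short vanity prefixes are a weekend project and long ones are not.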

The Build, Part 1: Understand What Everything Does

Before we get to a parts list, I thought it’d be use­ful to have an overview of what every­thing does. It was all a mys­tery to me when I start­ed, and just know­ing what each thing does has been help­ful in grasp­ing the over­all concept.

Every computer we build for AI comes down to the same basic parts (plus a few tools):

CPU — Cen­tral Pro­cess­ing Unit: This is the “brain” of the thing.
GPU — Graph­i­cal Pro­cess­ing Unit: Fun­da­men­tal­ly, this is what gives the AI its horsepower.
RAM — Random Access Memory: Think of this as the working memory of the device.
Stor­age — This is the long term mem­o­ry. Not as fast as RAM, but way more of it.
Moth­er­board — The coor­di­na­tor of all the oth­er parts, and the phys­i­cal struc­ture they all plug into.
Pow­er Sup­ply — Mak­ing sure every­thing stays on with­out flickering.
Case — Big enough and airy enough to keep everything protected and cool.
Cool­er — Fans to make sure the CPU specif­i­cal­ly is cooled off.
Addi­tion­al Cool­ing Fans — Dri­ves air through the whole case to keep all parts cool.
Oth­er nerd tools — Ther­mal paste, screw­driv­er, cable man­age­ment, etc.

The GPUs are crit­i­cal for mak­ing the LLMs actu­al­ly work (at a rea­son­able speed). The CPU man­ages over­all work­flow, and the RAM sup­ports large datasets. Every­thing else sup­ports that. Of course, you can scale up (or down) from here.

Ok, so with the basic stuff out of the way, what should you buy and how does it go together?

The Build, Part 2: Buy & Assemble Your HomeAI

I’ll start by say­ing that I’m impa­tient and Ama­zon is easy, so I bought every­thing there. I’m sure if you hunt around and have some patience, you can save your­self some money.

All up it ran just under $8k. Of course, if you want some­one to build it for you, Giga­byte is offer­ing sim­i­lar builds start­ing at $11k. You can also check out a Google sheet I put togeth­er for dif­fer­ent “size” builds from $3k — $8k. The “V3” on that list is Monstra.

I ordered every­thing on a Mon­day and, being Ama­zon, it was all there by Thurs­day includ­ing hiccups.

Assem­bly was more or less straight­for­ward, though I had sig­nif­i­cant help from Vor­tal, nos­mas­ter, and Rob who are all part of the mys­te­ri­ous advanced tech explo­ration group called Apple­Fuck­ers over on the GK Discord.

Each part comes with plen­ty of guides and man­u­als, and from my lim­it­ed expe­ri­ence, if you just fol­low the Moth­er­board man­u­al it’ll walk you through every­thing you need to do.

Still, if this is your first build, I’d find a com­put­er nerd and have ’em help you with it; many things are not obvi­ous, and hav­ing an expe­ri­enced guide to walk you through updat­ing the BIOS (if that’s even need­ed) is super helpful.

What To Expect The First Week

First off, know that you’re an exper­i­menter. Noth­ing will work out of the box; every­thing requires the abil­i­ty to use the CLI to do basic install stuff. 

Still, if you can search the inter­net, copy/paste, and use Chat­G­PT to get you going, you’ll be up and run­ning your own pri­vate AI in just a few days.

It won’t lit­er­al­ly take that long, as installing these things once you know how is about a ten minute job, BUT…learning how to do it, what works and what does­n’t, and what you should and should­n’t pay atten­tion to takes time. 

Here’s what I loaded up and used the first week:

Each of those is its own “fla­vor” of AI, or in the case of Open­We­bUI and Any­thingLLM, a way to inter­act with an LLM.

About that RAG

If you want to add in your own con­tent (like the US Tax Code or your ship­ping info or the con­tents of a book), you’ll use a RAG, or Retrieval Aug­ment­ed Gen­er­a­tion. I’ll start off by say­ing this: The *con­cept* of a RAG is easy; add in the cus­tomized thing you want your AI to ref­er­ence when answer­ing your ques­tions. The *appli­ca­tion* of get­ting the thing to work is the sticky bit.
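
To show the concept (not the sticky application), here is a toy sketch in Python of the two steps a RAG performs: retrieve the chunk of your content most related to the question, then paste it into the prompt that goes to the LLM. This is an illustration under loud assumptions: real tools like AnythingLLM use learned embeddings and a vector database, whereas this sketch swaps in simple word-count similarity, and the sample chunks and function names are made up.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, chunks, k=1):
    """Retrieval step: return the k chunks most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]

def build_prompt(question, chunks, k=1):
    """Augmentation step: put the retrieved text in front of the question."""
    context = "\n".join(retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Made-up sample chunks standing in for your own documents.
CHUNKS = [
    "Base training builds aerobic capacity with long, easy efforts.",
    "Shipping to Texas takes three days by ground.",
    "Strength work supports uphill hiking with a heavy pack.",
]

print(build_prompt("How long does shipping to Texas take?", CHUNKS))
```

The generation step is whatever LLM you send that prompt to. Most of the hard work in practice is in how the documents get cleaned up and chunked before retrieval ever runs.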

Any­thingLLM makes it pret­ty easy to add in doc­u­ments, though it’s not ultra clear how to assign them to the spe­cif­ic model/thread you’re work­ing on. Open­We­bUI is good for a few docs, but not so good with a ton of docs, and does­n’t real­ly give cita­tions (the way, say, Per­plex­i­ty does). 

Using a RAG is still a long way from “easy”. The data you feed the AI has to be clean and well structured, and frankly, most of us humans are walking around with such a mess in our heads regarding how we organize and understand our own data that it can be a mind-bender to translate that to your machine.

Still, it’s been VERY use­ful for my own think­ing to move through the steps of intro­duc­ing data to any of the LLMs I’ve worked with.

Next Steps

Rather than giv­ing you a step by step of exact­ly what to do with each LLM, I’d encour­age you to explore this AI jun­gle on your own. Know that oth­er folks (like me!) are out in the jun­gle with ya and we’re all learn­ing how to inter­act with our own pri­vate AI in our own way.

So far, of all the things I want­ed to do, none have come out the way I want. I’d pret­ty much expect­ed that. Still, I’m get­ting exact­ly what I want­ed in the form of hands-on expe­ri­ence work­ing with and around a per­son­al AI and see­ing both the oppor­tu­ni­ties and the limitations.

I found the fol­low­ing two YouTube videos help­ful when I was start­ing out:

Host All Your AI Locally

How To Use Your Local AI (Tech­no­Tim vid)

Self Host­ed AI That’s Actu­al­ly Useful

If you’re exper­i­ment­ing with this as well I’d love to hear about it! If you buy a bunch of parts and need help, jump in the GK Dis­cord serv­er, there’s usu­al­ly some­one awake and will­ing to point you in the right direction.

See ya on the other (AI) side!

