Local code completion - TabbyML

havoc · December 2023

Another quick & dirty notes write-up on an AI topic...

Code completion in VS Code similar to GitHub Copilot - except local and free.

You'll need a Windows PC capable of running WSL and a vaguely modern nvidia graphics card (a weak one is fine, it just can't be ancient). I'm using Win11 but Win10 should work too.

Install the nvidia driver (just the normal gaming one) and install VS Code.

WSL2

Follow the MS instructions on this. If it complains about virtualisation not being enabled, check your BIOS and enable the right option: VT-x, VT-d, AMD-V or whatever applies to your board.

At this stage you should be able to spin up a basic Ubuntu WSL instance. We don't actually need it - we just want to know if WSL2 works fine. Type Ubuntu into Windows search and set that up. Might take a bit of time. Confirm that you're running WSL 2 in PowerShell:

PS C:\Users\havoc> wsl -l -v
  NAME                   STATE           VERSION
* Ubuntu                 Running         2

WSL comes in two flavours, 1 and 2. We need 2.
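If you'd rather script that check, here's a rough Python sketch that parses the `wsl -l -v` table shown above. The function names are made up for illustration, and the UTF-16LE decoding is an assumption about how wsl.exe behaves when its output is captured, so adjust if yours differs.

```python
import subprocess


def parse_wsl_list(text: str) -> dict[str, int]:
    """Parse the table printed by `wsl -l -v` into {distro name: WSL version}."""
    versions = {}
    for line in text.splitlines():
        # Drop the "*" default-distro marker, then split into columns.
        parts = line.replace("*", " ").split()
        # Data rows end in a numeric version; the header row does not.
        if len(parts) >= 3 and parts[-1].isdigit():
            versions[" ".join(parts[:-2])] = int(parts[-1])
    return versions


def installed_wsl_versions() -> dict[str, int]:
    """Run `wsl -l -v` (Windows only) and parse what it prints."""
    raw = subprocess.run(["wsl", "-l", "-v"], capture_output=True, check=True).stdout
    # Assumption: captured wsl.exe output is UTF-16LE encoded.
    return parse_wsl_list(raw.decode("utf-16-le", errors="replace"))
```

Calling `installed_wsl_versions()` on the Windows side should give something like a dict mapping `Ubuntu` to `2`; anything mapped to `1` is the flavour we don't want.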

Install Docker Desktop - when it asks, select WSL, not Hyper-V
Restart
Go to PowerShell and run "docker run hello-world" - just to confirm that we've got docker up and running
Next we want to know if docker can talk to our graphics card via CUDA. This command should show details about your card if all is well:

docker run --rm --gpus all nvcr.io/nvidia/cuda:latest nvidia-smi
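The GPU check can also be wrapped in a small script so a failure gives a readable reason rather than a wall of text. This is only a sketch: `run_check` and `GPU_CHECK` are names I made up, the command is the same one as above split into a list, and the generous timeout is a guess to allow for the first image pull.

```python
import subprocess

# Same command as above, split into an argument list.
GPU_CHECK = ["docker", "run", "--rm", "--gpus", "all",
             "nvcr.io/nvidia/cuda:latest", "nvidia-smi"]


def run_check(cmd: list[str], timeout: float = 600.0) -> tuple[bool, str]:
    """Run a command and return (succeeded, combined stdout + stderr)."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        # The executable (e.g. docker) is not installed or not on PATH.
        return False, f"{cmd[0]} not found on PATH"
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout:.0f}s"
    return proc.returncode == 0, proc.stdout + proc.stderr
```

Usage would be `ok, output = run_check(GPU_CHECK)`; if `ok` is true, `output` should hold the nvidia-smi table with your card in it.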

Confirm that you've got nothing running on port 8080 like, ahem, sabnzbd. If you do, move the other thing. Moving TabbyML to another port seems a little flaky
Run the Tabby docker image command from here:

https://tabby.tabbyml.com/docs/installation/docker/
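Before starting the Tabby container, a quick way to see whether something is already sitting on port 8080 is to try connecting to it. A minimal sketch using only the standard library; `port_in_use` is a hypothetical helper name, and it only checks the local loopback address.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return s.connect_ex((host, port)) == 0
```

`port_in_use(8080)` returning True before Tabby is started means something else owns the port and should be moved first.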

Open a browser and navigate to http://localhost:8080 - if the top left says Swagger then you're good
Back to VS Code, install the TabbyML extension
Bottom right, the Tabby icon should be black/system colour instead of orange. You may need to point it at http://localhost:8080 in the extension settings, but probably not
Set up a main.py and see if it code completes, e.g. given a comment (shown in green) it suggests the code (shown in grey)
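For a concrete test file, something like the below works. To be clear, this is hand-written for illustration and not actual Tabby output: the idea is that you type the comment and the `def` line, and the body is the kind of thing a model might then offer in grey. What you actually get will vary by model.

```python
# main.py
# return the n-th Fibonacci number (0, 1, 1, 2, 3, 5, ...)
def fib(n: int) -> int:
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print(fib(10))
```

If a suggestion appears, Tab accepts it, same as Copilot.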

The default model is the smallest / lightest, so depending on your card you may want to select another (in the Tabby docker command from the link above). I'd suggest sticking to 7B models or smaller even if you have a 3000/4000 generation card. Completion speed matters for the experience.

You may also want to delete all the random images we pulled for testing. Use docker images to list them and docker rmi [ImageID] to remove them. You may need to stop & remove the container before removing its image; docker ps -a shows all containers, running or not.
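The cleanup can be scripted too. A rough sketch, assuming the only test images are the two pulled in this write-up (hello-world and the nvidia CUDA one); the helper names and the `TEST_IMAGES` tuple are mine, so edit them to match what `docker images` actually shows on your machine.

```python
import subprocess

# Assumption: these are the only throwaway images pulled while testing.
TEST_IMAGES = ("hello-world", "nvcr.io/nvidia/cuda")


def parse_images(text: str) -> list[tuple[str, str]]:
    """Turn lines of '<id> <repo>:<tag>' into (image_id, name) pairs."""
    pairs = []
    for line in text.splitlines():
        image_id, _, name = line.strip().partition(" ")
        if image_id and name:
            pairs.append((image_id, name))
    return pairs


def select_for_removal(images, prefixes=TEST_IMAGES) -> list[str]:
    """Return the IDs of images whose name starts with one of the prefixes."""
    return [image_id for image_id, name in images if name.startswith(prefixes)]


def remove_test_images() -> None:
    """List local images with docker, then run docker rmi on the test ones."""
    out = subprocess.run(
        ["docker", "images", "--format", "{{.ID}} {{.Repository}}:{{.Tag}}"],
        capture_output=True, text=True, check=True,
    ).stdout
    for image_id in select_for_removal(parse_images(out)):
        # rmi fails if a stopped container still uses the image; don't abort on that.
        subprocess.run(["docker", "rmi", image_id], check=False)
```

Only the images matching `TEST_IMAGES` are touched, so the Tabby image and anything else you have locally is left alone.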

WSL seems pretty fragile & it's not immediately apparent how to unfk it with the terrible W11 UI, so including that too:

Search for "optional features" (think it's "add features" in W10). Look for the search box that says "Installed features". Type gibberish in there. That reveals (?!?) a "More Windows features" button. Click that. Untick Hyper-V, Virtual Machine Platform, Windows Hypervisor Platform and Windows Subsystem for Linux. Apply & restart. Now do the same again, except ticking them back on, & restart. Unsure if all 4 are needed for WSL - probably not, but that's what worked for me. WSL is based on Hyper-V so flushing both seemed safer.