Comments
Hi all, excited to share that Windows and Ubuntu are out in beta!
You can download them from Itch directly or on https://www.hammerai.com/desktop. If you have any issues or suggestions, please feel free to join the Discord and let me know there: https://discord.gg/kXuK7m7aa9
Enjoy!
So excited to finally be able to try it out
Good luck with the Windows release!
Also can't wait to try it
Thanks, it's out now!
I've recently been interested in setting up LLaMA models on Windows, using KoboldAI and SillyTavern as the medium to hook into them. The main thing I found was that fully offloading the model onto your VRAM is an insane boost in performance and the sort of 'end goal' you want to have with loading models. Having lots of RAM is a bit of a red herring, since RAM speeds are meh in comparison. Even if it's just one layer, you'll feel the difference.
For 7B models, you'd need a minimum of 6GB of VRAM. And even then, you'd probably need a model about 3GB in size, because not all 7B models are born equal.
So is this something your software will do automatically? In terms of figuring out context sizes, GPU layers, BLAS batch sizes, etc., to determine the optimal loadout for the best speeds? Are different quantized versions of models available depending on VRAM availability?
Hey! That's a good question and seems like something we should support, though right now we just have some hard-coded presets. Happy to chat more about it in our Discord if you're interested: https://discord.gg/kXuK7m7aa9
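For anyone wondering what 'full offload' looks like in code, here is a minimal sketch using node-llama-cpp (an illustrative assumption on my part; it's not what HammerAI or KoboldAI actually ship, and the model path and option values are placeholders):

```ts
// Minimal sketch: fully offloading a quantized 7B model to VRAM.
// Assumes node-llama-cpp; run as an ES module so top-level await works.
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
  modelPath: "./models/mistral-7b.Q4_K_M.gguf", // hypothetical ~4GB quantized file
  gpuLayers: "max" // offload every layer; drop to a number if VRAM runs out
});
const context = await model.createContext({
  contextSize: 4096 // the KV cache for this also lives in VRAM, so tune together
});
const session = new LlamaChatSession({contextSequence: context.getSequence()});

console.log(await session.prompt("Hi! Introduce yourself in one sentence."));
```

The tradeoff hinted at in the question is the real point: the quantized weights, the context's KV cache, and the batch buffers all compete for the same VRAM, which is why a ~3GB quantized file is the practical target for a 6GB card.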
Is 16GB of RAM enough?
Mm short answer is that I don't know. But is this for Web or Desktop? If Desktop, what computer are you using?
Windows, Web, Opera GX
Got it. So I would say try it and see? But Windows Desktop is now also out so maybe that will work?
Got this shit tagged for when the Windows or Android release comes out. I'd love to use an AI program for my storyboarding without having to pay $9.99 to get more than like 3 messages. Will actually donate too if it does.
Thanks for your support, I actually recently made some progress on getting the build working. No date yet, but I also really want a Windows build and am very sorry for the delay :(
No problem bud. This stuff is a lot of work, and I appreciate the effort you're putting in. Good luck!
Thanks! Finally got it out if you want to try :)
Sounds good my dude.
I know I'm the 10,000th person to ask, but when is it coming to Windows?
I know I feel quite bad, everyone is waiting :( It is still my #1 priority, but I've had some issues with getting the libraries to compile (specifically linking them into HammerAI), and things in life have gotten crazy. I will have it out as soon as I can though, sorry everyone.
In case anyone is reading this and is interested, I'd be happy to bring on any contributors! We will keep desktop and web free forever, but I was thinking we could build a mobile version with some paid features and then split profits between contributors.
Okay well it took two months beyond that, but it's out now!
When are we going to get a Windows release?
It is my top priority right now! But have run into some issues with MLC-LLM that are taking more time to get working than I originally thought. The app itself runs fine on Windows though, just not yet the AI chat part. If you join the Discord I plan to post in there as soon as it's ready for early testers.
I guess the real answer is... today!
Following this with interest for a Windows release.
I love the fact you included a screenshot of someone instantly going for the "can I kiss you" with their waifu, you know your target demographics that's for sure.
You found the easter egg 😂
Windows is now out!
Oh, and um, why doesn't it work with Windows (for now)? Please explain to me, I would like to know.
I just haven't yet finished up the work to support it! There is no technical reason why it can't work on Windows, and I definitely want Windows support as soon as possible.
It's working now!
Damn, one of the few moments where Apple users have something cool that Windows users don't.
True haha. But Windows is out now, so it's even again xD
This seems awesome, so I wanted to try the web version. Whenever I tried a character, it wanted to download the AI model but displayed the message "Cannot find adapter that matches the request". Is there a way for me to manually download the model, or maybe I forgot to enable something in my browser? (I use Google Chrome.)
Hmm, that is unexpected.
It looks like these issues: https://github.com/mlc-ai/web-llm/issues/105#issuecomment-1594835134 and https://github.com/mlc-ai/web-llm/issues/128#issuecomment-1595151465, which say it's "likely because your env do not support the right GPU requested(due to older mac) or browser version" or "likely mean that you do not have a device that have enough GPU RAM".
Could you try going to https://webgpureport.org/ and seeing what is says?
Also if you'd like to join the Discord it would be great to continue this conversation there: https://discord.gg/kXuK7m7aa9
I think it really was a problem with my Google Chrome, because I tried with Microsoft Edge (good god...) and it seems to be working normally. I'll have to check what's wrong with it later, but in case you'd like some info: my GPU is an NVIDIA GeForce RTX 3060 Ti, and my Chrome version was 117.0.5938.63 (it says it's the most recent), which, from what I read online, was supposed to support WebGPU.
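For anyone else who hits the "Cannot find adapter that matches the request" message, a quick sanity check is to ask the browser for a WebGPU adapter directly. A minimal sketch using the standard WebGPU API (TypeScript; the helper name is made up, and you'd want @webgpu/types installed for the navigator.gpu typings):

```ts
// Quick WebGPU sanity check, runnable in any page script or (as plain JS)
// pasted into the DevTools console.
async function checkWebGpu(): Promise<void> {
  if (!("gpu" in navigator)) {
    console.log("navigator.gpu is missing: this browser build has no WebGPU.");
    return;
  }
  const adapter = await navigator.gpu.requestAdapter();
  if (adapter === null) {
    // This is the case that surfaces as "Cannot find adapter" errors:
    // the API exists, but the browser returned no usable GPU adapter.
    console.log("No adapter: check chrome://gpu and update graphics drivers.");
    return;
  }
  console.log("Adapter found. maxBufferSize:", adapter.limits.maxBufferSize);
}

checkWebGpu();
```

If navigator.gpu exists but no adapter comes back on a GPU like a 3060 Ti, it's usually the browser blocklisting the device or stale drivers rather than the site itself, which would fit Edge working where Chrome didn't.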
Yada yada, same as all the other comments. Looks awesome.
Today is the day...

Let me know when this is on Windows
Will do!
Let me know when this is on Windows (1)
It finally happened :)
We're out now!
I always loved playing AI Dungeon back in the day; I'm looking forward to this coming to Windows!!!
We will let you know!
Today is the day!
Also would love to know when this comes to Windows! Super awesome to see :D
Sounds good we'll let you know!
We're out on Windows now!
Yes, can you also reply to me once it hits Windows?
Will do!
We're out on Windows now!
Let me know when this is available for Windows.
Will do!
We're out on Windows now!
I'll let you know when it does!
Well it took a long time, but it's out on Windows now!