lexiw@lemmy.world to Homelab · Am I getting ripped off?

    If it’s for a single low-frequency workflow, that GPU is enough, but you’ll be limited to small models, which are mostly useless unless fine-tuned for your use case. If it’s serving users across the entire company with a model big enough to be useful, you’d need 192–384 GB of VRAM, so a server in the $20k–$40k range.
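As a rough sanity check on those VRAM numbers, here is a back-of-envelope sketch. The parameter counts and the ~30% overhead figure are assumptions for illustration, not a sizing tool: FP16 weights take 2 bytes per parameter, plus extra headroom for KV cache and activations that grows with context length and concurrent users.

```python
# Back-of-envelope VRAM estimate for serving an LLM.
# Assumes FP16 weights (2 bytes/parameter) and ~30% extra for
# KV cache and activations -- a rough illustrative figure only.
def vram_gb(params_billion, bytes_per_param=2, overhead=0.30):
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes / 1e9
    return weights_gb * (1 + overhead)

for size in (7, 70, 180):
    print(f"{size}B params -> ~{vram_gb(size):.0f} GB VRAM")
```

A 70B-class model already lands around the bottom of that 192–384 GB range once you leave room for batching; quantization can cut the weight term down, at some quality cost.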

    The server will require maintenance, and somebody will have to develop the workflow and integration with your data.

    It’s also important to know what they want to do: a basic embedding model for semantic search would work, agents not so much.
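To show what the semantic-search piece amounts to, here is a minimal sketch. The `embed` function is a stand-in for a real embedding model (which is what you'd actually serve); it's a toy bag-of-words vectorizer here just so the example runs end to end, and the documents are invented.

```python
import numpy as np

# Toy stand-in for a real embedding model: bag-of-words counts over a
# shared vocabulary. A real deployment would call the model's encode API.
def embed(text, vocab):
    words = text.lower().split()
    return np.array([words.count(w) for w in vocab], dtype=float)

# Rank documents by cosine similarity to the query.
def search(query, docs, vocab):
    q = embed(query, vocab)
    scores = []
    for doc in docs:
        d = embed(doc, vocab)
        denom = np.linalg.norm(q) * np.linalg.norm(d)
        scores.append(float(q @ d / denom) if denom else 0.0)
    return sorted(zip(scores, docs), reverse=True)

docs = ["reset your vpn password", "quarterly sales report", "vpn setup guide"]
vocab = sorted({w for d in docs for w in d.lower().split()})
for score, doc in search("vpn password help", docs, vocab):
    print(f"{score:.2f}  {doc}")
```

That's the whole trick behind semantic search: embed once, compare vectors at query time. Agents, by contrast, need a model strong enough to plan and call tools reliably, which is where the small models fall over.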