• 0 Posts
  • 58 Comments
Joined 2 years ago
Cake day: June 20th, 2023

  • Generally power supplies are the most electrically efficient at 20-60% utilization, so there’s no issue with over-provisioning power, other than the (generally minor) upfront extra cost, which might very well pay for itself in the first months/years of usage. I’ll take a look and see what I can find on those sites.

    Edit: okay, trying to shop through Google Translate and a currency calculator is genuinely painful, so I’m gonna teach a man to fish instead. This is what I should have done from the start anyway.

    Power supply: Anything from a decent brand, rated at basically anything above 450W. A 650W or 850W unit is totally fine if it’s at a decent price. They only draw the power they need; they don’t constantly pull 850W if the downstream components aren’t calling for it.

    CPU: The 12400 is a fine CPU for what you’re doing. You’ll transcode at 720p no problem, and at 1080p maybe a single stream in real time. I wouldn’t bank on more than that. The only downside here is the relatively low core count (6 cores / 12 threads) if you ever expand into other workloads. Without access to used Xeon boards/CPUs, it might be a reasonable choice though. What I would say is look for something older but with more cores/threads if you can. For example, a 10900 or even a 10700K would probably be a better server CPU than a 12400.

    Memory: DDR4 platforms are a great way to save money, as long as you aren’t planning on expanding into CPU inference. Get as much as you can. 32-64GB of DDR4 should be dirt cheap, especially if you find a cheap motherboard with 4 memory slots.

    Motherboard: If you want this thing to be versatile, you want two PCIe slots. Old full-sized ATX gaming boards are the way to go here. 1 slot for an HBA, 1 slot for a GPU, and that should be all you need. Bonus points for as many open SATA ports as possible. 6-8 is pretty typical on 10th-12th gen ATX gaming boards.

    GPU: Discrete GPUs will be much more efficient at transcoding than an iGPU, especially one from an older Intel CPU. A 1050, 2060, 3050, basically anything from the 10-series onward has a decent NVENC encoder that would work well with Plex/Jellyfin. My go-to is generally old workstation cards. I use a P620 myself and it handles a single 4K encode job no problem. I’m not sure if they’re viably purchasable anywhere in your area, but I’d definitely look out for a P620, P1000, or T400. Great value in those cards.

    Drives/HBA: There are inexpensive LSI HBA cards to expand how many drives you can attach to a system if you need them. All you need is a spare PCIe slot and a place to physically mount the drives. The cheapest way to start here is to look for a motherboard with 4-6 SATA ports and use those. Hardware RAID is functionally dead these days in the real world, so just use ZFS or mdadm under Linux to create an array with your desired level of resiliency/capacity.

    Once you’ve priced out what it would cost to buy all of this new, look for prebuilt gaming PCs and office PCs that might be able to be expanded to fit these requirements. Prices look kind of steep on those markets you listed, but I’m sure something exists if you look hard enough.
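
On the “desired level of resiliency/capacity” point in the Drives/HBA paragraph: here is a rough sketch of how usable space trades off against redundancy for equal-sized drives. The function name and layout labels are made up for illustration, the drive counts and sizes are placeholder numbers, and it ignores filesystem overhead, so treat the results as ballpark figures rather than what ZFS or mdadm will actually report.

```python
# Drives' worth of space given up to redundancy, per layout.
# Labels are illustrative; raid5/raid6 roughly correspond to ZFS raidz1/raidz2.
PARITY_DRIVES = {"stripe": 0, "raid5": 1, "raid6": 2}


def usable_tb(layout: str, drives: int, size_tb: float) -> float:
    """Approximate usable capacity in TB for `drives` equal-sized disks."""
    if layout == "mirror":
        # Two-way mirror pairs (RAID 1 / RAID 10 style): half the raw space.
        if drives < 2 or drives % 2:
            raise ValueError("mirror pairs need an even number of drives (>= 2)")
        return (drives // 2) * size_tb

    parity = PARITY_DRIVES[layout]
    # Conventional minimums: 2 drives for a stripe, 3 for raid5, 4 for raid6.
    if drives < parity + 2:
        raise ValueError(f"{layout} needs at least {parity + 2} drives")
    return (drives - parity) * size_tb


if __name__ == "__main__":
    # Placeholder example: six 8 TB drives under each layout.
    for layout in ("stripe", "mirror", "raid5", "raid6"):
        print(layout, usable_tb(layout, drives=6, size_tb=8.0), "TB usable")
```

The stripe gives the most space and no protection at all; raid5 and raid6 survive one and two drive failures respectively, and mirror pairs survive at least one.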





  • Anecdotally, I use it a lot and I feel like my responses are better when I’m polite. I have a couple of theories as to why.

    1. More tokens in the context window of your question, plus a clear separator between ideas in a conversation, make it easier for the model to tell disparate ideas apart.

    2. Higher-quality datasets contain American boomer/millennial notions of “politeness”, and when responses are structured in kind, they’re more likely to contain tokens from those higher-quality datasets.

    I haven’t mathematically proven any of this within the llama.cpp tokenizer, but I strongly suspect that I could at least prove a correlation between polite input tokens and output tokens that are well represented in those datasets.



  • RE: backups, I’d recommend altering your workflow. Instead of taking an image of a box, automate the creation of that box. Create a Bash script that starts from a base OS and installs everything you use fresh. Then have it apply configuration files where appropriate, and lastly figure out which applications really need backup blobs to work properly (Thunderbird, for example). Once you have that, your backups become just the data itself: photos, documents, etc. Everything else is effectively ephemeral because it can be reproduced through automation.

    That takes a lot less space and is a lot more portable. It’s also much better in scenarios where something in your OS is broken, or you get a new computer and want to replicate your setup.
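
A minimal sketch of that “automate the box” idea. The comment suggests Bash; this is the same approach written in Python instead, and it assumes a Debian-style system with `apt-get`. The package list and config paths are hypothetical placeholders, not a recommendation. It defaults to a dry run that only prints what it would do.

```python
import shlex
import subprocess
from pathlib import Path

# Hypothetical examples: swap in whatever you actually use.
PACKAGES = ["git", "vim", "thunderbird"]
CONFIGS = {"dotfiles/vimrc": "~/.vimrc"}  # repo-relative source -> destination


def build_plan(packages, configs):
    """Return the commands that would rebuild the box, in order."""
    plan = [["sudo", "apt-get", "update"]]
    if packages:
        plan.append(["sudo", "apt-get", "install", "-y", *packages])
    for src, dest in configs.items():
        # `install -D` creates missing parent directories before copying.
        target = str(Path(dest).expanduser())
        plan.append(["install", "-D", "-m", "644", src, target])
    return plan


def run(plan, dry_run=True):
    """Print each command, and execute it unless this is a dry run."""
    for cmd in plan:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(build_plan(PACKAGES, CONFIGS), dry_run=True)
```

Keep the script and the config files in a git repo, and the only thing left to back up is the data itself plus whatever blobs apps like Thunderbird really need.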



  • For people with “that one game” there is a middle ground. Mine is Destiny 2, and its anti-cheat refuses to run on Linux. My solution was to buy a $150 used Dell on eBay and a $180 GPU to be able to output to my 4 high-res displays, and install Debian + Moonlight on it. I moved my gaming PC downstairs, and a combination of Wake-on-LAN + Sunshine means that I can game at functionally native performance, streaming from the basement. In my setup, Windows only exists to play games on.

    The added bonus here is that now I can also stream games to my phone, or other thin clients in the house, saving me upgrade costs if I want to play something in the living room or upstairs. All you need is the bare minimum for native-framerate, native-res decoding, which you can find in just about anything made in the last 5-10 years.
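
For the Wake-on-LAN part, here is a rough sketch of what a wake-up tool sends. The comment doesn’t say which tool is used, so this is just one common way to do it: a “magic packet” is 6 bytes of 0xFF followed by the target’s MAC address repeated 16 times, usually sent as a UDP broadcast. The MAC address, broadcast address, and port below are placeholders and assumptions, and the gaming PC needs Wake-on-LAN enabled in its firmware and network adapter settings for this to do anything.

```python
import socket


def magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a 6-byte MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the magic packet as a UDP broadcast (port 9 is a common choice)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(magic_packet(mac), (broadcast, port))


# Usage, with a placeholder MAC for the PC in the basement:
# wake("00:11:22:33:44:55")
```

Run that from the thin client before launching Moonlight and the host should be up by the time you connect, assuming both machines are on the same LAN segment.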


  • “Open source” in ML is a really bad description for what it is. “Free binary with a bit of metadata” would be more accurate. The code used to create DeepSeek is not open source, nor are the training datasets. 99% of “open source” models are this way. The only interesting part of the open sourcing is the architecture used to run the models, as it lends a lot of insight into the training process and allows for derivatives via post-training.