Is there any way to save storage on similar images?

pe1uca@lemmy.pe1uca.dev · 11 days ago

I can’t give you the technical explanation, but it works.
My Caddyfile only has something like this:

@forgejo host forgejo.pe1uca
handle @forgejo {
	reverse_proxy :8000
}

and everything else has worked properly, including cloning via SSH with git@forgejo.pe1uca:pe1uca/my_repo.git

My guess is git only needs the host to resolve the IP and then connects to the port directly.
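A minimal sketch of why that guess is plausible, assuming the usual scp-like remote syntax: the remote string only carries a user, a host, and a path, so SSH connects straight to the host on its default port (22 unless ~/.ssh/config overrides it), and Caddy's HTTP reverse proxy is never involved. The helper name `parse_scp_remote` is made up for illustration.

```python
import re

# scp-like remote syntax used by git: [user@]host:path (no scheme, no port).
SCP_LIKE = re.compile(r"^(?:(?P<user>[^@/]+)@)?(?P<host>[^:/]+):(?P<path>.+)$")


def parse_scp_remote(remote: str) -> dict:
    """Split an scp-like git remote into user, host, path, and assumed port."""
    match = SCP_LIKE.match(remote)
    if match is None:
        raise ValueError(f"not an scp-like remote: {remote!r}")
    # The syntax has no slot for a port, so we assume SSH's default of 22.
    return {**match.groupdict(), "port": 22}


remote = parse_scp_remote("git@forgejo.pe1uca:pe1uca/my_repo.git")
print(remote)
```

The host part is only used for name resolution and the SSH connection itself, which matches the behaviour described above.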

pe1uca@lemmy.pe1uca.dev · 16 days ago

I’m not saying to delete anything, I’m asking for the file system to save space with something similar to deduping.
If I understand correctly, deduping works by sharing the same data blocks between files with identical content, so there’s no actual data loss.
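A rough sketch of the idea, assuming fixed-size blocks identified by a SHA-256 hash (real file systems pick their own block size and hashing, so treat `BLOCK_SIZE` and `dedup_stats` as illustrative names only). It counts how many blocks would actually need to be stored.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def dedup_stats(blobs: list[bytes], block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Return (total_blocks, unique_blocks) across all blobs."""
    seen: set[bytes] = set()
    total = 0
    for blob in blobs:
        for offset in range(0, len(blob), block_size):
            block = blob[offset:offset + block_size]
            seen.add(hashlib.sha256(block).digest())
            total += 1
    return total, len(seen)


# 16 KiB of deterministic, non-repeating data (4 distinct blocks).
original = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(512))
exact_copy = bytes(original)
shifted = b"\x00" + original[:-1]  # same data moved by one byte

copy_stats = dedup_stats([original, exact_copy])
shifted_stats = dedup_stats([original, shifted])
print(copy_stats, shifted_stats)
```

An exact copy shares every block, while data that is merely similar (here shifted by one byte) shares none, which is probably why block-level dedup alone may not help much with similar images.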

pe1uca@lemmy.pe1uca.dev · 16 days ago

Is there any way to save storage on similar images?

pe1uca@lemmy.pe1uca.dev · edit-2 30 days ago

I had a similar case.
My mini PC has a microSD card slot and I figured if it could be done for an RPi, why not for a mini PC? :P

After a few months I bought a new M.2 NVMe drive but I didn’t want to start from scratch (maybe I should’ve looked into nix?)
So what I did was sudo dd if=/dev/sda of=/dev/sdc bs=1024k status=progress
And that worked perfectly!

Things to note:

Both drives need to be unmounted, so you need a live OS or another machine.
The new drive will have the same exact partitions, which means the same size, so you need to expand them after the copy.
PS: this was for a drive with ext4 partitions, but in theory dd works with the raw bytes so it shouldn’t matter which fs you use.
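To illustrate why the file system shouldn't matter, here is a small sketch of what a block-wise copy does: it reads fixed-size chunks of bytes and writes them out unchanged, never interpreting them. This is not how dd is implemented, only the same idea; `copy_blocks` is a hypothetical helper and the demo uses temporary files rather than real devices.

```python
import os
import tempfile


def copy_blocks(src_path: str, dst_path: str, block_size: int = 1024 * 1024) -> int:
    """Copy src to dst in fixed-size blocks and return the bytes copied."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # empty read means end of input
                break
            dst.write(block)
            copied += len(block)
    return copied


# Demo on ordinary temp files instead of devices such as /dev/sda.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.img")
    dst = os.path.join(tmp, "dst.img")
    with open(src, "wb") as f:
        f.write(os.urandom(3 * 1024 * 1024 + 123))
    copied = copy_blocks(src, dst)
    with open(src, "rb") as a, open(dst, "rb") as b:
        identical = a.read() == b.read()

print(copied, identical)
```

The 1 MiB default mirrors the bs=1024k used in the command above.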

pe1uca@lemmy.pe1uca.dev · 2 months ago

Text to speech is what piper is doing.
What I’m looking for is called a voice changer, since I want to change a voice that has already read something.

That’s exactly what I want: “the thing in the Darth Vader halloween masks” but for linux, preferably via CLI to ingest audio files and be able to configure it to change the voice as I want, not only Darth Vader.

pe1uca@lemmy.pe1uca.dev · 2 months ago

I don’t want to manage piper voices, I can handle that directly in my file system as I only have a few.
The issue is none of the ones I’ve found are good for me, so what I need is something to change the voice once it has been generated by piper.

pe1uca@lemmy.pe1uca.dev · 2 months ago

I haven’t completely looked into creating a model for piper, but just having to deal with a dataset is not something I look forward to, like gathering the data and all of what this implies.

So, I’m thinking it’s easier to take an existing model and make adjustments so it better fits what I would like to hear constantly.

pe1uca@lemmy.pe1uca.dev · 2 months ago

Any good linux voice changer?

pe1uca@lemmy.pe1uca.dev · 2 months ago

Check the most upvoted answer and then look into tubearchivist, which can take your yt-dlp parameters and URLs to download the videos and process them to build a better index of them.

pe1uca@lemmy.pe1uca.dev · 2 months ago

I only had to run this on my home server, behind my router which already has a firewall to prevent outside traffic, so at least I’m a bit at ease about that.
In the VPS everything worked without having to manually modify iptables.

For some reason I wasn’t able to make a curl call to the internet from inside docker.
I thought it could be DNS, but that was working properly trying nslookup tailscale.com
The call to the same url wasn’t working at all. I don’t remember the exact details of the errors since the iptables modification fixed it.

AFAIK the only difference between the two setups was ufw enabled in the VPS, but not at home.
So I installed UFW at home and removed the rule from iptables and everything keeps working right now.

I didn’t save the output of iptables before ufw, but right now there are almost 100 rules for it.

For example since this is curl you’re probably going to connect to ports 80 and 443 so you can add --dport to restrict the ports to the OUTPUT rule. And you should specify the interface (in this case docker0) in almost all cases.

Oh, that’s a good point!
I’ll later try to replicate the issue and test this, since I don’t understand why OUTPUT should be solved by an INPUT rule.

pe1uca@lemmy.pe1uca.dev · 2 months ago

Well, it’s a bit of a pipeline, I use a custom project to have an API to be able to send files or urls to summarize videos.
With yt-dlp I can get the video and transcribe it with faster-whisper (https://github.com/SYSTRAN/faster-whisper), then the transcription is sent to the LLM to actually make the summary.
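Since the code isn't published, the following is only a sketch of the shape such a pipeline could take, not the actual project. Each stage is passed in as a plain function so that yt-dlp, faster-whisper, and the LLM call can be plugged in separately; `summarize_video` and the stub functions are hypothetical names.

```python
from typing import Callable


def summarize_video(
    source: str,
    fetch_audio: Callable[[str], str],
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> str:
    """Run the three stages in order: download, transcribe, summarize."""
    audio_path = fetch_audio(source)     # e.g. a wrapper around yt-dlp
    transcript = transcribe(audio_path)  # e.g. a wrapper around faster-whisper
    return summarize(transcript)         # e.g. a request to the LLM


# Stub stages so the sketch runs without any external tool installed.
def fake_fetch(url: str) -> str:
    return "/tmp/audio-from-" + url.rsplit("/", 1)[-1]


def fake_transcribe(path: str) -> str:
    return f"transcript of {path}"


def fake_summarize(text: str) -> str:
    return f"summary: {text}"


result = summarize_video(
    "https://example.com/video123", fake_fetch, fake_transcribe, fake_summarize
)
print(result)
```

Keeping the stages separate also makes it easy to accept an uploaded file instead of a URL by swapping only the first function.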

I’ve been meaning to publish the code, but it’s embedded in a personal project, so I need to take the time to isolate it '^_^

pe1uca@lemmy.pe1uca.dev · 2 months ago

I’ve used it to summarize long articles, news posts, or videos when the title/thumbnail looks interesting but I’m not sure if it’s worth the 10+ minutes to read/watch.
There are other solutions, like a dedicated summarizer, but I’ve looked into them and they only extract exact quotes from the original text, while an LLM can also paraphrase, making the summary a bit more informative IMO.
(For example, one article mentioned a quote from an expert talking about a company, the summarizer only extracted the quote and the flow of the summary made me believe the company said it, but the LLM properly stated the quote came from the expert)

This project https://github.com/goniszewski/grimoire has in its roadmap a way to connect to an AI to summarize the bookmarks you make and generate at 3 tags.
I’ve seen the code, but I don’t remember the exact status of the integration.

Also I have a few models dedicated to coding, so I’ve also asked for a few pieces of code and configurations to just get started on a project, nothing too complicated.

pe1uca@lemmy.pe1uca.dev · 2 months ago

What am I doing with iptables?

pe1uca@lemmy.pe1uca.dev · 2 months ago

Ah, that makes sense!
Yes, a DB would let you build this. But the point is in the word “build”, you need to think about what is needed, in which format, how to properly make all the relationships to have data consistency and flexibility, etc.
For example, you might implement the tags as a text field, but then we still have the same issue with addition, removal, and reordering. One fix could be a separate table relating many tags to one task. Then we have the problem of mistyping a tag: you might want to add TODO but forget you already have it as todo, which might not be a problem if the field is case insensitive, but what about to-do?
So there is still a lot of stuff you might overlook, which will come up and sidetrack you from creating and doing your tasks, even if you abstract all of this into a script.
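As a rough illustration of how much design even the tag problem needs, here is a sketch using sqlite with a link table between tasks and tags, plus one possible normalization rule. The table names and `normalize_tag` are assumptions, and the rule (lowercase, keep only ASCII letters and digits) is just one choice among many.

```python
import re
import sqlite3


def normalize_tag(raw: str) -> str:
    """Lowercase and keep only ASCII letters and digits, so 'To-Do' becomes 'todo'."""
    return re.sub(r"[^a-z0-9]", "", raw.lower())


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE task_tags (
        task_id INTEGER NOT NULL REFERENCES tasks(id),
        tag_id INTEGER NOT NULL REFERENCES tags(id),
        PRIMARY KEY (task_id, tag_id)
    );
""")


def add_tag(task_id: int, raw_tag: str) -> None:
    """Attach a tag to a task, reusing an existing tag with the same normalized name."""
    name = normalize_tag(raw_tag)
    conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
    tag_id = conn.execute("SELECT id FROM tags WHERE name = ?", (name,)).fetchone()[0]
    conn.execute("INSERT OR IGNORE INTO task_tags VALUES (?, ?)", (task_id, tag_id))


task_id = conn.execute("INSERT INTO tasks (title) VALUES (?)", ("Read book",)).lastrowid
for raw in ("TODO", "todo", "to-do"):
    add_tag(task_id, raw)

tag_names = [row[0] for row in conn.execute("SELECT name FROM tags")]
print(tag_names)
```

All three spellings end up as a single tag here, but the rule would also merge tags you might want to keep apart, which is exactly the kind of decision an existing app has already made for you.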

Specifically for todo lists I self-host https://vikunja.io/
It has an OpenAPI Specification (OAS), so you can easily generate a client library in any language to build a CLI.
Each task has a lot of attributes, including the ones you want: relation between tasks, labels, due date, assignee.

Maybe you can have a project for your book list, but it might be overkill.

For links and articles to read I’d say a simple bookmark software could be enough, even the ones in your browser.
If you want to go a bit beyond that I’m using https://github.com/goniszewski/grimoire
I like it because it has nested categories plus tags, most other bookmark projects only have simple categories or only tags.
It also has an API which is basic but enough for most use cases.
Another option could be an RSS reader if you want to get all articles from a site. I’m using https://github.com/FreshRSS/FreshRSS which has the option to retrieve data from sites using XPath in case they don’t offer RSS.
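For anyone unfamiliar with XPath, this is a tiny sketch of the idea using Python's standard library, which only supports a limited XPath subset. It is not how FreshRSS is configured (that is done in its feed settings); the page markup and the expression are made up for illustration, and real pages are rarely well-formed XML like this one.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed page standing in for a site without an RSS feed.
PAGE = """
<html><body>
  <div class="post"><h2><a href="/a1">First article</a></h2></div>
  <div class="post"><h2><a href="/a2">Second article</a></h2></div>
  <div class="sidebar"><a href="/about">About</a></div>
</body></html>
"""

root = ET.fromstring(PAGE)

# Select only the links inside post headings, ignoring the sidebar link.
entries = [
    (link.text, link.get("href"))
    for link in root.findall(".//div[@class='post']/h2/a")
]
print(entries)
```

The expression picks items by their position in the page structure, which is the same kind of rule you give a reader so it can build feed entries from plain HTML.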

If you still want to go the DB route, then as others have mentioned, since it’ll be local and single user, sqlite is the best option.
I’d still encourage you to use an existing project, and if it’s open source you can easily contribute the code you would have written for yourself, to help improve it for the next person with your exact needs.

(Just paid attention to your username :P
I also love matcha, not an addict tho haha)

pe1uca@lemmy.pe1uca.dev · 2 months ago

I can’t imagine this flow working with any DB without a UI to manage it.
How are you going to store all that in an easy yet flexible way to handle all with SQL?

A table for notes?
What fields would it have? Probably just a text field.
Creating it is simple: insert “initial note”… How are you going to update it? A simple update by ID won’t work well since you’d be replacing all the content: you’d need to query the note, copy it to a text editor, and then copy it back into a query (don’t forget to escape it).
Then probably you want to know which is your oldest note, so you need to include created_at and updated_at fields.
Maybe a title per note is a nice addition, so a new field to add title.
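To make the effort concrete, here is a sketch of what just the notes table might look like with sqlite. The column names and helper functions are assumptions; parameterized queries take care of the escaping, but every kind of edit still needs its own function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE notes (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        body TEXT NOT NULL,
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
""")


def create_note(title: str, body: str) -> int:
    """Insert a note and return its id."""
    cur = conn.execute("INSERT INTO notes (title, body) VALUES (?, ?)", (title, body))
    return cur.lastrowid


def append_to_note(note_id: int, extra: str) -> None:
    """Append text to a note; placeholders avoid manual quote escaping."""
    conn.execute(
        "UPDATE notes SET body = body || ?, updated_at = CURRENT_TIMESTAMP WHERE id = ?",
        (extra, note_id),
    )


note_id = create_note("Ideas", "initial note")
append_to_note(note_id, "\nit's a second line with a 'quote'")

body = conn.execute("SELECT body FROM notes WHERE id = ?", (note_id,)).fetchone()[0]
oldest = conn.execute("SELECT title FROM notes ORDER BY created_at LIMIT 1").fetchone()[0]
print(body)
print(oldest)
```

Appending is the easy case; editing the middle of a note still means pulling the text out, changing it somewhere, and writing it all back.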

What about the todo lists? Will they be stored in the same notes table?
If so, then the same problem, how are you going to update them? Include new items, mark items as done, remove them, reorder them.
Maybe a dedicated table, well, two tables, list metadata and list items.
In metadata almost the same fields as notes, but description instead of text. The list items will have status and text.
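A possible sketch of those two tables, again with sqlite. Everything here is an assumption for illustration, including the extra position column, which the description above doesn't mention but which reordering would need in some form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lists (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        description TEXT NOT NULL DEFAULT '',
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TABLE list_items (
        id INTEGER PRIMARY KEY,
        list_id INTEGER NOT NULL REFERENCES lists(id),
        position INTEGER NOT NULL,
        status TEXT NOT NULL DEFAULT 'open',
        text TEXT NOT NULL
    );
""")


def add_item(list_id: int, text: str) -> int:
    """Append an item at the end of the list and return its id."""
    next_pos = conn.execute(
        "SELECT COALESCE(MAX(position), -1) + 1 FROM list_items WHERE list_id = ?",
        (list_id,),
    ).fetchone()[0]
    cur = conn.execute(
        "INSERT INTO list_items (list_id, position, text) VALUES (?, ?, ?)",
        (list_id, next_pos, text),
    )
    return cur.lastrowid


def mark_done(item_id: int) -> None:
    conn.execute("UPDATE list_items SET status = 'done' WHERE id = ?", (item_id,))


def items(list_id: int) -> list[tuple[str, str]]:
    """Return (text, status) pairs in list order."""
    return conn.execute(
        "SELECT text, status FROM list_items WHERE list_id = ? ORDER BY position",
        (list_id,),
    ).fetchall()


list_id = conn.execute("INSERT INTO lists (title) VALUES (?)", ("Groceries",)).lastrowid
milk = add_item(list_id, "milk")
add_item(list_id, "eggs")
mark_done(milk)

result = items(list_id)
print(result)
```

Removing and reordering items would each need another helper that rewrites positions, so the amount of code grows quickly.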

Maybe you can reuse the todo tables for your book list and links/articles to read.

so that I can script its commands to create simpler abstractions, rather than writing out the full queries every time.

This already exists: several note-taking apps wrap around either the filesystem or a DB so you only have to worry about writing your ideas into them.
I’d suggest not reinventing the wheel unless nothing satisfies you.

What are the pros of using a DB directly for your use case?
What are the cons of using a note taking app which will provide a text editor?

If you really really want to use a DB maybe look into https://github.com/zadam/trilium
It uses sqlite to store the notes, so maybe you can check the code and get an idea if it’s complicated or not for you to manually replicate all of that.
If not, I’d also recommend obsidian, it stores the notes in md files, so you can open them with any software you want and they’ll have a standard syntax.

pe1uca@lemmy.pe1uca.dev · 2 months ago

I was juggling like that, I had most of my files in NTFS so I could read them in windows, even for files read only by Linux programs.
Most programs were able to read from any part of the file system, but for those with strict paths I used symlinks.

But I haven’t had any use for Windows lately so I decided to delete all but one NTFS partition and this last one is only 256GB with 100GB free.
I moved the rest of the data to ext4 and btrfs partitions.

pe1uca@lemmy.pe1uca.dev · edit-2 10 months ago

What's a good gethomepage-like project to show different types of information on a screen? Not only for deployed services

pe1uca@lemmy.pe1uca.dev · 1 year ago

How to store and retrieve my secrets in a linux server?