I’ve found that all the web archiving software I’ve encountered is either manual (you have to archive everything individually in a separate application) or crawler-based (which can put a lot of extra load on smaller web servers, and could even get your IP blocked).

Are there any solutions that automatically archive web pages as you load them in your browser? If not, why not?

I could also see something like that being useful as a self-hosted web indexer: if you ever think “I’ve seen this name before”, you could click on it, and your computer would tell you something like “this name appeared in a news headline you scrolled past two weeks ago.”
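The indexing half of that idea is straightforward to sketch with a local full-text index; here's a minimal illustration using SQLite's built-in FTS5. The capture side (a browser extension or local proxy feeding pages into `add_page`) is assumed, and all names here are hypothetical:

```python
import sqlite3
import time

# Hypothetical local page index using SQLite's FTS5 full-text search.
# In a real setup, a browser extension or proxy would call add_page()
# for every page loaded in the browser.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(url, title, body, visited)")

def add_page(url, title, body):
    """Store one visited page with a visit date."""
    conn.execute(
        "INSERT INTO pages VALUES (?, ?, ?, ?)",
        (url, title, body, time.strftime("%Y-%m-%d")),
    )

def search(term):
    """Return (url, title, visited) for every stored page matching the term."""
    return conn.execute(
        "SELECT url, title, visited FROM pages WHERE pages MATCH ?",
        (term,),
    ).fetchall()

add_page("https://example.com/news", "Local news",
         "A headline mentioning Alice Example")
print(search("Alice"))  # the stored page matches, with its visit date
```

With something like this behind a right-click menu entry, "where have I seen this name?" becomes a single FTS query over everything you've browsed.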

OQB @kayzeekayzee@lemmy.blahaj.zone

  • lunchbox2287@lemmy.world

Have you looked at Hunchly — https://hunch.ly/ ? It indexes sites as you browse and lets you search for content within them — and only the sites you hit, so it’s not crawling the rest of a domain. It’s built for investigations, really, but it sounds like it does what you’re describing. It’s not free, unfortunately, and uses an installed program coupled with a browser extension to do what it does.