dmillar a day ago

Many criminal records, petty or otherwise, are public record. Once archived by third parties, infractions that are later expunged or dismissed never truly go away. A traffic violation or other petty misdemeanor from 20 years ago that has been expunged from the official record can still show up on a background check because companies archive public data. So, there is a flip side to this.

  • overfeed a day ago

    Public data is incompatible with secrecy. Expunged records still appear in newspaper archives if the local reporter on the Crimes beat captured the proceedings. IMO, "expunged" means removed from official court records, not from the public memory, including newspapers, archived websites, police blotters and prosecutors' files.

  • InvOfSmallC 16 hours ago

    The fact that you get something removed from your criminal record doesn't mean it gets forgotten. Think about a newspaper writing about your crime. That will be public and archived forever.

nla a day ago

Best thing I ever heard from the head of archives at the BBC:

Once you format shift, you will always be format shifting.

Keep your originals whenever you can.

  • rippit 7 hours ago

    As someone who spent the last 2 days figuring out how best to digitise my father's old Hi8, Digital8 and MiniDV tapes, I take umbrage with this!

    Keep originals if you can, but make copies ASAP, as close to lossless as possible. Don't depend on the right hardware being around in the future.

  • pjc50 10 hours ago

    I can see the value in this, but .. originals, and the gear to read them, do not last forever. Plus for many formats the act of reading puts wear on the physical artifacts. So if you want to actually use the information, you have to format shift it to digital in the first place. And then you're back to the same question as the rest of us, how to maintain the bits.

  • anitil 18 hours ago

    I don't understand this phrase, are you able to explain it?

    • bell-cot 13 hours ago

      Guess: If properly stored (physically), good-quality paper documents and photographs will last for centuries. But as soon as you digitize them - you're now chained to the treadmill of maintaining/upgrading/migrating digital archiving systems. Compared to keeping the old-fashioned Archive Storage Room dry (and fire-free), that's 100X the labor and expense. Forever.

      • wizzard0 9 hours ago

        A lot of paper archives and libraries burned just recently in LA.

        • bell-cot 7 hours ago

          True.

          But from fire-resistant storage cabinets, to concrete-lined file rooms, to underground archives, the tech to make archives ~99.5% fire-proof is more than a century old. And if you add redundant storage sites for the high-value stuff...

          Vs. anything digital is far more vulnerable to digital malice.

Damogran6 a day ago

Hypothetically:

- Government leader says they're nuking data
- Mad rush to back up data through other means
- Government leader declares they've 'transferred the cost of maintaining data out of government, thus making for a smaller, more efficient, government'

I hate everything about this.

  • krunck a day ago

    There is inherent inefficiency in government accountability efforts. I'm ok with that.

  • riku_iki a day ago

    In general it makes sense to shift this part to business: if the data is valuable, there will be a market and services for it. The problem is probably how fast they nuked it, with no grace period.

mikrl a day ago

How does this relate to dox?

Let’s say an individual posted identifying or incriminating information online, inadvertently or intentionally, in a public place.

Then a third party decides to store it, and possibly make it accessible to others.

If the original self doxxing user then pulled the original dox, but was unable to scrub the rest, would that information still be considered public, or would it be private? Was it ever truly public? Or private for that matter?

  • ziddoap a day ago

    If you intentionally post something publicly, it's public. Full stop.

    The tricky part is dealing with inadvertent or malicious (i.e. some other party), posting of private information to a public space. That's really hard to deal with on multiple levels.

    For one, the archives would retain the information and scrubbing it is effectively impossible.

    Secondly, legitimate things which should remain public (i.e. were posted publicly, are of public interest, etc.) can be argued to have been inadvertently or maliciously posted. So you need some way to moderate and create rulings for each individual case, which quickly becomes untenable due to the sheer volume of information being posted and the inordinate amount of time required to investigate vs. post.

  • calebio a day ago

    That's a really good question.

    In my head, I'm imagining someone early in the morning posting a flyer up on a bulletin board downtown.

    Throughout the day many folks walked by and took photos of the flyer with their cell phone.

    At the end of the day, the original person came back and removed the flyer.

    IMO, at the time that the folks took the photo of the flyer, that flyer was public information. It remains public information even after the flyer is removed[0].

    This isn't a great analogy of mine, and has plenty of holes, but was interesting to me after I read your comment. I know it was in the context of doxxing, but I think it's pretty interesting philosophically.

    I think something similar applies to photos taken of other people in public spaces. Both the person who took the photo and the subject of the photo are no longer in that physical public space, but the actions took place within that space.

    I think something similar applies to digital "public spaces". But what does a public space even mean in the context of walled gardens[1], etc.

    [0] You then run into the question of what happens if someone posts non-public information, publicly?

    [1] Are digital walled garden communities that different from physical communities that gate access, whether free or paid? Whether information shared within those contexts is public or private is an interesting thread as well.

  • sixothree a day ago

    Which data set are you thinking this might apply to?

Teever a day ago

I made this related submission[0] recently but it was flagged.

This stuff is very important to talk about so I hope that this submission by rbanffy isn't also flagged.

[0] https://news.ycombinator.com/item?id=43543075

  • hsuduebc2 a day ago

    I agree. I do not understand how this is perceived as a political issue and thus got flagged.

    Climate change is, for some reason, perceived as political too, and it does not get flagged so often.

  • donnachangstein a day ago

    No it isn't. It's merely a cause du jour for data hoarders to justify their hobby in light of this Chicken Little hysteria.

    30 years ago it was thought collecting every issue of magazines like TV Guide was important. No one even knows what that is anymore.

    No one is ever going to look at 99% of this data. In the meantime, send more hard drives for my NAS!!

    • hermannj314 a day ago

      My wife takes thousands of photos every year, when my daughter was young she took even more.

      When we were moving out of our apartment there was damage to a door hinge that we never noticed when we moved in but that had definitely been there from the onset of our two years of living in that apartment.

      Guess what? I had a photo from the day after we moved in of that door hinge in a state of damage! Not because we took the photo for that intention, but because my daughter was playing in the hallway and my wife snapped a photo and it just happened to capture the damage. Saved me several hundreds of dollars in repair costs from my landlord.

      You are right, 99% of the data will never be looked at. But do you know what the 1% is today? I'm guessing you don't.

      • donnachangstein a day ago

        Your example of personal family photos is in no way comparable to storing terabytes of essentially unindexed data about which one has no detailed knowledge, under the notion that the government is somehow lighting a match to everything, and they're going to save it.

        The government doesn't delete anything. It might be moved or inaccessible to the public but that data is somewhere in perpetuity.

        It's one of the most deranged larps I've ever seen, then they pat each other on the back on BlueSky, desperately wanting to be a part of something.

        These people envision themselves as folk heroes when what they really need to do is go outside and touch grass.

        • spookie 20 hours ago

          > The government doesn't delete anything. It might be moved or inaccessible to the public but that data is somewhere in perpetuity.

          If the government is democratic and values integrity? Sure.

          Otherwise I wouldn't bet on it. My own country's history books and my parents' own life stories have already warned me about how fickle democracy is. No democratic country is free from that fact. Some think "checks and balances" ought to be enough to prevent it, but I wouldn't be so sure.

        • nancyminusone a day ago

          If it's inaccessible to the public, it might as well be deleted. What's the difference? If you can't get it, you don't have it.

        • m2024 18 hours ago

          [dead]

    • dreamworld a day ago

      It might be of some interest to cultural historians in the future. But I think it makes more sense to take sample+curated data. But in any case if we can afford it, eh why not.

      • rbanffy a day ago

        We don't know now what to curate for the future. We should preserve as much of everything we can - we don't know what will be important in 50, or 500 years.

        Case in point: retrocomputing is my hobby. I buy, restore, preserve, and use old computers. Most of them are home computers, because business computers go directly from the office to the recycling facility or the landfill. Unless someone deliberately preserved, say, a Burroughs B-25 desktop, or the similar from Data General, they are gone.

        • Suppafly a day ago

          My son is into retrocomputing, mostly using older hardware I have from when I was younger, and we have a stack of old Compaq desktops where you can't access the BIOS because it requires a specific floppy that is nearly impossible to find online. This is 486/Pentium era stuff; the older stuff is even harder to find.

          • rbanffy 9 hours ago

            I've been looking for a DEC terminal with Sixel, Tektronix and ReGIS graphics for a while, with zero success. They weren't rare at all - they were a massive success, and, yet, it seems almost all ended up in a recycling facility or an e-waste dumpster. Many other terminals emulated them and expanded on their feature set.

    • peppermill a day ago

      I think the data being discussed is quite a bit different than old TV Guides...

      • NoMoreNicksLeft a day ago

        I was, believe it if you wish, thinking about old TV guides just this morning and wondering how one would even go about archiving those. Most of the stumbling blocks for taking apart the glued binding for scanning have been figured out, of course, but for any given week there may have been as many as 60 or 70 editions (one for each television market, I think). None of these have proper ISSN numbers as far as I'm aware, and other than the listings they can be visually indistinguishable. Then there is the challenge of finding those, and not knowing whether this or that edition is missing (from time to time, the company would create new editions for new regions, or fold old ones back into some other area) along with even parsing the content. Many of these TV shows aren't on themoviedb or thetvdb, and if the shows are, then there won't be episode listings (there were 6000 Donahue talk show episodes, after all). On top of all of that, you can't necessarily know what was on TV at a given time and day, with federal government preemptions, commercials, unreported last-minute rescheduling, etc.

        But I can also see why people might want to keep more interesting data, like when the Federal Cheese-Sniffing Agency moved offices back in 1982 and they have meticulous records of the 483 filing cabinets that had to be moved from the original location to their new home in Furrytown, Pennsylvania.

      • zorpner a day ago

        I wonder if those would be useful in identifying the potential contents of specific Marion Stokes tapes (my understanding is that they're sorted, but are only labeled with channel and date/time and are being archived slowly): https://libwww.freelibrary.org/blog/post/5393

    • thowawatp302 a day ago

      I’ve had the idea of recreating TV channels on my Plex server by using TV guide data from the late 90s and early 00s.

      The insurmountable part of that project would be getting the guide data.

      You don’t know what other people will want in the future

      • Teever 20 hours ago

        That's a great idea.

        There are sites that stream old content with an old tube TV UI wrapped around the video frame, but they don't have all the commercials and they don't follow the old schedules like you suggest.

        I've got a friend who has hoarded digitized copies of VHS recordings of old cartoons from that era complete with the commercials, so the content is definitely out there.

hsuduebc2 a day ago

I wonder. Maybe blockchain would actually be a useful technology for this?

  • jefurii 4 hours ago

    git-annex is not exactly blockchain but because of the way it operates -- storing files by their hashes, the whole Git commit structure -- it gives you several useful things: It becomes easy to clone repositories while guaranteeing that clones are identical. It also becomes easy to ensure that files are not tampered with.
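
    To illustrate the "storing files by their hashes" idea, here is a minimal sketch of a content-addressed store. It is not git-annex's actual key format or on-disk layout; the function names (`content_key`, `store`, `verify`, `demo`) and the choice of SHA-256 are assumptions made for this example. The point is only that when a file's name is its hash, tampering is detectable by re-hashing.

```python
import hashlib
import tempfile
from pathlib import Path


def content_key(data: bytes) -> str:
    """Return the SHA-256 hex digest used as the file's name (illustrative choice)."""
    return hashlib.sha256(data).hexdigest()


def store(root: Path, data: bytes) -> str:
    """Write data under a file named after its own hash and return that key."""
    key = content_key(data)
    (root / key).write_bytes(data)
    return key


def verify(root: Path, key: str) -> bool:
    """Re-hash the stored bytes; a mismatch means the file was altered or is missing."""
    path = root / key
    return path.is_file() and content_key(path.read_bytes()) == key


def demo() -> tuple[bool, bool]:
    """Store a record, verify it, tamper with it, and verify again."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        key = store(root, b"archived record")
        ok_before = verify(root, key)
        # Simulate tampering: overwrite the bytes without changing the name.
        (root / key).write_bytes(b"tampered record")
        ok_after = verify(root, key)
        return ok_before, ok_after
```

    Calling `demo()` checks the same file before and after it is overwritten. Two clones of such a store can be compared the same way, by checking that every key in one verifies in the other.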