Google Technology Your Rights Online

The Difficulty In Getting a Machine To Forget Anything

An anonymous reader writes: When personal information ends up in the analytical whirlpool of big data, it almost inevitably becomes orphaned from any permissions framework that the discloser granted for its original use; machine learning systems, commercial and otherwise, end up deriving properties and models from the data until the replication, duplication and derivation of that data can never hope to be controlled or 'called back' by the originator. But researchers now propose a revision which can be imposed upon existing machine-learning frameworks, interposing a 'summation' layer between user data and the learning system, effectively tokenising the information without anonymising it, and providing an auditable path whereby withdrawal of the user information would ripple through all iterations of systems which have utilized it: genuine 'cancellation' of data.
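
To make the 'summation' idea concrete, here is a minimal sketch. It is not the researchers' actual system; the class name, the choice of a one-variable least-squares model, and the per-user contribution table are all assumptions made for illustration. The learner only ever reads running sums, so withdrawing one user's record means subtracting that record's terms from the sums, with no retraining on the raw data.

```python
from dataclasses import dataclass, field


@dataclass
class SummationLayer:
    """Hypothetical summation layer: the learner sees sums, never raw rows."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0
    # Assumption: each user's contribution is kept so it can be subtracted later.
    contributions: dict = field(default_factory=dict)

    def _apply(self, x, y, sign):
        # Add (sign=+1) or cancel (sign=-1) one record's terms in every sum.
        self.n += sign
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.sxy += sign * x * y

    def add(self, user_id, x, y):
        if user_id in self.contributions:
            self.forget(user_id)  # replace an earlier record for the same user
        self.contributions[user_id] = (x, y)
        self._apply(x, y, +1)

    def forget(self, user_id):
        x, y = self.contributions.pop(user_id)
        self._apply(x, y, -1)

    def fit(self):
        """Least-squares slope and intercept computed from the sums alone."""
        denom = self.n * self.sxx - self.sx ** 2
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return slope, intercept
```

Note that the sketch has to remember each user's contribution in order to cancel it, so "forgetting" here means removing influence from the model, not erasing every trace of the record.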

  • Or (Score:5, Insightful)

    by penguinoid ( 724646 ) on Tuesday September 22, 2015 @12:24PM (#50575721) Homepage Journal

    Or, you could "accidentally" keep the data, and sell it.

  • by elwinc ( 663074 ) on Tuesday September 22, 2015 @12:26PM (#50575747)
    Imagine if we owned our personal information as a form of intellectual property? Big corporations have gotten pretty good at protecting their intellectual property rights. Maybe it's time for us ordinary folks to own our personal information. Then we could license it to companies for particular uses, but they wouldn't have the right to sell it without our permission.
    • You should move to the EU, we actually have something like that.
      • by Nutria ( 679911 )

        we actually have something like that.

        Is that what France is fighting Google over?

        • In essence, yes. If one of their citizens wants to use their right to be forgotten, then the French government want that to be worldwide. But then imagine a Russian official trying to hide a controversial article about himself.

          It's the same kind of debate as when the US wants Apple to backdoor iChat for wiretaps. If you can coerce them into doing it, then so can a less democratic country where Apple does business...

          • by Nutria ( 679911 )

            I knew that lesincompetent (2836253) was mistaken when he wrote that.

            Taking the moral high ground is great, but only when it conforms to reality. Otherwise, it's just B.S. posturing.

            • I knew that lesincompetent (2836253) was mistaken when he wrote that.

              Taking the moral high ground is great, but only when it conforms to reality. Otherwise, it's just B.S. posturing.

              True. And reality is that once the data is out there, then there is no real way to pull it out. People will have it in off-line archives, etc; and once it leaves national boundaries then all bets are really off.

              For instance, a country/company could just put something to try to get all the information and then watch for the notices. When a notice comes, they archive it instead of deleting it and if the person is of enough influence to someone (them or a client) they could sell the data out for blackmail.

              • And the blackmailer just has to request that their deeds are forgotten. This way nobody knows about all the blackmailing that is going on! I think the next step is profit, right?
            • I was not referring to the "right to be forgotten". That's bullshit. We tech people know that.
              Once the beans are spilled there's no way to put them back wherever they came from.
              I was referring to privacy laws.
              • by Nutria ( 679911 )

                Laws are great, but computers get hacked, data gets stored elsewhere around the world, etc, etc.

                • I can defend myself on the net, i am more concerned about corporations which (have to) have my real info. They must be strictly kept in check.
      • You would think the US would be the first country in the world to make that a law, given that respect for individuality has always been a core value of ours. We instinctively believe that our private information is owned by us as individuals. What we do have are FOIA and Privacy Act requests, which an agency will treat with suspicion just about every time, not to mention that they can easily be denied. So what we really have is a mechanism for an agency to audit the individual raising suspicion, erm I mean
    • I'd like to see them get licenses for everything they publish.
    • by jellomizer ( 103300 ) on Tuesday September 22, 2015 @12:45PM (#50575913)

      We do Own our personal information, but we usually sell it in trade of the electronic services you want to use.

      If you find value in Google's Internet search, then your payment is knowing that your searches will feed Google's marketing.

      There is that news website that you don't want to pay for; well, those ads will pay for the service.

      You don't need to use these consumer services on the internet. So you can keep your personal information to yourself.

      • by Nutria ( 679911 )

        IOW, if what you get is free, then you are the real product.

        • No, what you get is still the product (or service). You are the real payment.
          • by Nutria ( 679911 )

            But they're selling you to 3rd parties.

            • Ah, yes, you're correct... we are initially the payment for a product or service, but then become the product for a third party.
              I was mainly referring to the OP's comment

              we usually sell it [our personal information] in trade of the electronic services you want to use

            • That information isn't me. I'm much more complex than what can be deduced from that information. It isn't even a copy of me.
              • by Nutria ( 679911 )

                I'm much more complex than what can be deduced from that information.

                They have a *lot* of data about you, and accurately infer *lots* more from the connections you make.

              • That information isn't me. I'm much more complex than what can be deduced from that information. It isn't even a copy of me.

                Procrustes had a solution for that.

            • But they're selling you to 3rd parties.

              More precisely, they're selling space on your screen to third parties.

    • That would work about as well as laws that stop people from sharing copyrighted material.

      In other words, they won't work at all, but you'll see some token enforcement attempts.
    • by SeaFox ( 739806 )

      Maybe it's time for us ordinary folks to own our personal information. Then we could license it to companies for particular uses, but they wouldn't have the right to sell it without our permission.

      LOL. The TOS for any service will simply be amended to say that by giving them the information we grant them an irrevocable license to the data and give them the right to sell it. This will be presented as an "update" to the Terms of Service that 95% of people will agree to without actually reading. And the remainder? Well, if you don't like it, no Facebook for you!

    • by Sloppy ( 14984 )

      Imagine if we owned our personal information as a form of intellectual property

      Ok, try doing that. Next time you're about to transmit your information to someone else, stop. Either don't send it at all, or send them cyphertext instead.

      If Amazon wants to know how to descramble your zip code, they're going to have to make some kind of deal with you, whereby they become bound to the terms and conditions that you specify. I just hope that prior to making that deal, you don't get too impatient waiting for your

    • by Anonymous Coward

      The people who would benefit from your idea do not care enough to apply the necessary political pressure to make this happen.

      The people who benefit from the current state care quite a lot about it, and also have significant resources with which they can apply the necessary political pressure to keep things the way they are.

      So, your idea is doomed.

    • Is for someone with a legal background and an axe to grind to start a case where their personal information is deemed confidential and personal property just as all corporate identities claim that their information is confidential and their property.

      When corporations want to be treated like people it's deemed ok, so time to turn the tables.

    • Why do you think some people copyright and trademark their name?

      We really should force companies to sign NDAs when we license our personal (bio) data to them.

  • I made a copy ... (Score:4, Insightful)

    by PPH ( 736903 ) on Tuesday September 22, 2015 @12:26PM (#50575751)

    ... of the database on archival optical media. What now?

    • My guess is you'd have to destroy it.

      • by Anonymous Coward

        have to

        So this protocol is based on the honor system. Like the evil bit or the Do Not Track header. Or those photos I promised never to show anyone else.

  • by gstoddart ( 321705 ) on Tuesday September 22, 2015 @12:29PM (#50575773) Homepage

    Without laws enforcing it, even if you had a mechanism none of those corporations would follow it.

    They seem to think it is their right to buy and sell our information.

    Even if you had laws enforcing it, I bet half of them would lie and keep it anyway. The shady assholes feeding the "big data" industry have far too much money at stake to ever allow constraints on how they use "our" data.

    They'd just pay off the politicians to pass laws clarifying it's their data, they're entitled to it, and we don't get a vote.

    Just like always.

    • There is something else: no one would EVER use a scheme like the one proposed, because if you don't keep the originating data and you anonymize properly, you always have plausible deniability and can always say "your data is not a part of our database".
    • This is not about enforcement. It is about being able to build in this functionality.

      Why opt in?

      Well how about protests where bogus data enters the stream, and conclusions are invalidated? The original data is not always available, and reprocessing might be time prohibitive. Cancelling specific data points is needed.

      Other possibilities too, I'm simplifying to try to limit these off topic replies.

      Now you can rant about how this will be abused, while us academics ignore you. It's about being able, not the imp

  • that there's plenty of room for Hillary server jokes here.

  • by Bookwyrm ( 3535 ) on Tuesday September 22, 2015 @12:40PM (#50575869)

    A system needs to be able to remember what it is supposed to forget in order to make sure it is forgotten.

    Imagine a waiter robot that is supposed to go into a room and make sure it gets everyone's order:
    a) Enters room, goes from person to person, asks drink preferences.
    b) John Doe tells robot: "I don't want you to track my preferences. Forget everything about me!"
    c) Robot obeys and continues on.
    d) Prior to exiting the room, the robot verifies it has gotten everyone's preferences.
    e) Robot sees John Doe. Robot has no record of John Doe because it has forgotten everything about John Doe. The robot must get the preferences of everyone in the room.
    f) Robot asks John Doe for his drink preferences.
    g) Goto b).

    The systems have to remember that they aren't allowed to (re)learn the data that they are supposed to have forgotten, which means they cannot completely forget things - the information is always there.
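
    One way to picture the loop above, and the residue that breaking it requires, is the sketch below. It is only an illustration: the `WaiterRobot` class, its method names, and the use of a salted SHA-256 token are assumptions, not anything from TFA. The robot deletes the stored preference but keeps a token derived from the name, so step (e) can tell "opted out" apart from "not asked yet".

```python
import hashlib


class WaiterRobot:
    """Hypothetical robot that keeps an opt-out marker instead of the data."""

    def __init__(self, salt: bytes):
        self.salt = salt
        self.preferences = {}       # name -> drink
        self.do_not_store = set()   # salted hashes of people who opted out

    def _token(self, name: str) -> str:
        # The token still depends on the identity, so something is remembered.
        return hashlib.sha256(self.salt + name.encode("utf-8")).hexdigest()

    def record(self, name: str, drink: str) -> bool:
        """Store a preference unless this person opted out; report whether stored."""
        if self._token(name) in self.do_not_store:
            return False
        self.preferences[name] = drink
        return True

    def forget(self, name: str) -> None:
        """Drop the preference and remember only the opt-out marker."""
        self.preferences.pop(name, None)
        self.do_not_store.add(self._token(name))

    def missing_orders(self, people_in_room):
        """People the robot still has to ask before leaving (step d)."""
        return [
            person for person in people_in_room
            if person not in self.preferences
            and self._token(person) not in self.do_not_store
        ]
```

    With the marker in place the robot stops at step (e) instead of looping, but the marker itself is a record about John Doe, which is exactly the point made above.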

    • by Fwipp ( 1473271 )

      Only if "forget everything about me" includes the fact that it's been asked to forget you. I can see cases where people can say "Write down in your book never to store information about me" and have that be useful. Yes, the datapoint that you requested to be forgotten is not of no value, but it's likely better than them remembering what kind of weird porn you're into.

      Alternately, it's not a hassle to keep up this loop if John Doe has a passive signal to the robot to keep it from verbally asking him each ti

    • by mlts ( 1038732 )

      The trick is to tag an expiration date on all info. John Doe tells the robot about his drink preferences, and the robot will retain those preferences either until the drinks are served and the tab closed, or until there is a certain point in time, where the drink preference info is flagged to expire. Every so often, a garbage collector task runs, purges all robot preferences that are expired and not flagged for retention [1].

      In general, expiration timestamps might be something to have in a database row, b
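
      The expiration idea can be sketched roughly as follows. The names (`Record`, `ExpiringStore`, `purge`) and the in-memory dict standing in for a database table are assumptions for illustration only. Each row carries an expiry time and a retention flag, and a periodic purge drops rows that are expired and not flagged for retention.

```python
import time
from dataclasses import dataclass


@dataclass
class Record:
    value: str
    expires_at: float     # absolute time, in seconds
    retain: bool = False  # flagged rows survive the purge


class ExpiringStore:
    """Hypothetical store where every entry carries an expiration time."""

    def __init__(self):
        self.rows = {}

    def put(self, key, value, ttl_seconds, now=None, retain=False):
        now = time.time() if now is None else now
        self.rows[key] = Record(value, now + ttl_seconds, retain)

    def purge(self, now=None):
        """Garbage-collect expired, unretained rows; return how many were removed."""
        now = time.time() if now is None else now
        expired = [
            key for key, row in self.rows.items()
            if row.expires_at <= now and not row.retain
        ]
        for key in expired:
            del self.rows[key]
        return len(expired)
```

      The `now` parameter exists only so the purge can be exercised without waiting; a real deployment would run `purge()` from a scheduled job.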

    • They've mostly solved this problem for junk mail [wikipedia.org], wouldn't something similar work here?

    • That's where Do Not Track type concepts would work if they were respected. The robot doesn't need to know who John Doe is, or remember a previous conversation in order to see his T-shirt says "Do Not Track" and respect his wishes.
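
      The "T-shirt" version needs no memory at all, because the opt-out travels with every request. A minimal sketch, assuming requests arrive as a plain dict of headers and using the `DNT: 1` header as the signal; the function names and the `store` dict are hypothetical, and nothing here forces a server to honor the header.

```python
def tracking_allowed(headers: dict) -> bool:
    """Return False when the request carries a Do Not Track signal (DNT: 1)."""
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"


def take_order(headers: dict, name: str, drink: str, store: dict) -> str:
    """Serve the drink either way, but only record the preference if allowed."""
    if tracking_allowed(headers):
        store[name] = drink
    return drink
```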

  • Nuke it from orbit. It's the only way to be sure.
  • by xxxJonBoyxxx ( 565205 ) on Tuesday September 22, 2015 @12:53PM (#50575965)

    FWIW, this paper on Bitcoin-like email blockchains appears to really be TFA: http://web.media.mit.edu/~guyz... [mit.edu]

    I think if providers just held on to "Message IDs" (e.g., http://forensicswiki.org/wiki/... [forensicswiki.org]) they'd have most of this capability today. I'm not sure what blockchains bring to the table here other than authenticity, and that doesn't seem to be the issue here.

  • by pla ( 258480 ) on Tuesday September 22, 2015 @12:56PM (#50575993) Journal
    TFA doesn't really deal with the problem of deleting personally identifiable information, so much as aggregate statistics derived from personal data.

    And in that context, I far, far prefer that they can't remove my contribution from their aggregates (although I do opt out of personalized collection whenever possible).

    Why, you might ask? Simple - I lie to companies that ask me for information. A lot. I do my damnedest to poison their databases to the greatest extent possible. Now why on Earth would I want to make it easy for them to redact the "facts" that I own a Veyron and a solid gold iWatch despite living in a cardboard box beneath a highway overpass?

    Sometimes, the box of chocolates has Ex-Lax in it.
    • by Anonymous Coward

      It might work better if you make your fake facts "typical". Going over the top makes you an outlier, and the algorithms no doubt try to filter out the outliers.

      • by GuB-42 ( 2483988 )

        You mean they won't do massive ad campaigns in Afghanistan for people born on January 1st?

  • by bjdevil66 ( 583941 ) on Tuesday September 22, 2015 @01:07PM (#50576087)

    How could this be done - some form of meta-tagging EVERYTHING in the digital realm with some kind of signature - without having some master database to reference it by? What could possibly go wrong with a universal, non-anonymous Big Brother - I mean, Big Data - system like that?

    The only positive to come out of a system like this would be making the data more valuable to its owners as a resellable commodity.

  • ...the thousands of tapes that were generated from backing up the systems that housed that data, prior to it being cancelled.
  • by Anonymous Coward

    About 10 years ago, I opened a business.... the business was created for a proposal, that never developed.

    Since that time, about once a week, I get a phone call "Can I speak to (title) of (business)." (Those requesting donations and investment are the most annoying). A couple times I've tried to track down who is selling my information, but was not able to get anywhere meaningful.

    Just the other day, received a notice from the state: They want to collect taxes for the business. (They were previously i

  • Comp Sci professor once told me computers are female.. because Even the smallest mistakes are stored in long term memory for possible later retrieval
