Privacy

From ScribbleWiki: Analysis of Social Media


(Annotator: Yimeng)

Related page: Social Networking and Privacy (annotated by Sachin)


Views towards Privacy of Social Media

Figures 1-7 show survey results on privacy issues in social media, which are explained in the 2007 report to OCLC (Sharing, Privacy and Trust in Our Networked World).

Figure 1. Improper use of personal information online
Figure 2. Top privacy concerns
Figure 3. Percentage of people who remain anonymous
Figure 4. Percentage of people who provide true information when registering online
Figure 5. Percentage of people who think it is very important to have the ability to remain anonymous
Figure 6. Percentage of people who think it is very important to have the ability to control personal information
Figure 7. Percentage of people who specify who can view their pages

Conclusions

1. Around 40% of people would like to remain anonymous on social media or social networking sites.

2. Most people provide their true personal information when registering.

3. Most people think it is important to have control over their personal information online.

4. Around 11% of people have had their personal information improperly used.






Re-identification

Re-identification is usually done by linking datasets that contain explicit identifiers with datasets that lack them, using attributes common to both. Examples of datasets without explicit identifiers include public data that users have made anonymous, public data released by research groups (after suitable anonymization), and public data from government agencies (such as the census).
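The linkage step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `link_records`, the field names, and all records below are invented toy data, not taken from any real dataset. An anonymous record counts as re-identified only when exactly one identified record shares its values on the common attributes.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def link_records(
    identified: List[Record],
    anonymous: List[Record],
    keys: Tuple[str, ...],
) -> List[Tuple[Record, Record]]:
    """Pair each anonymous record with the single identified record
    that has the same values on `keys` (hypothetical helper)."""
    # Index the identified dataset by its values on the common attributes.
    index: Dict[Tuple[str, ...], List[Record]] = {}
    for rec in identified:
        index.setdefault(tuple(rec[k] for k in keys), []).append(rec)

    links = []
    for rec in anonymous:
        candidates = index.get(tuple(rec[k] for k in keys), [])
        # Only an unambiguous match counts as a re-identification.
        if len(candidates) == 1:
            links.append((candidates[0], rec))
    return links


# Toy data, invented purely for illustration.
with_identifiers = [
    {"name": "Alice Example", "zip": "02138", "gender": "F", "birth_date": "1960-01-15"},
    {"name": "Bob Example", "zip": "02139", "gender": "M", "birth_date": "1972-07-04"},
    {"name": "Carol Example", "zip": "02139", "gender": "F", "birth_date": "1985-03-22"},
]
without_identifiers = [
    {"zip": "02138", "gender": "F", "birth_date": "1960-01-15", "diagnosis": "condition A"},
    {"zip": "02140", "gender": "M", "birth_date": "1990-11-30", "diagnosis": "condition B"},
]

links = link_records(with_identifiers, without_identifiers, ("zip", "gender", "birth_date"))
for person, record in links:
    print(person["name"], "->", record["diagnosis"])
```

In this toy run only the first anonymous record has a match in the identified dataset, so it is the only one linked to a name.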

One example of re-identification: Sweeney (2002) uses two datasets. One is the Massachusetts medical records data published by the Group Insurance Commission of Massachusetts without explicit identifiers. The other is the Massachusetts voter registration list, available for purchase for only $20, which includes detailed identifying information about each voter. Both datasets include ZIP code, gender, and birth date. Research showed that 87% of the 248 million people in the 1990 U.S. census are likely to be uniquely identified based only on their 5-digit ZIP code, gender, and birth date. Therefore, it is possible to identify the people behind the medical records by linking the two datasets on ZIP code, gender, and birth date.
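The uniqueness figure above is a statement about how many people are the only holder of their attribute combination. A minimal sketch of that measurement follows, assuming a hypothetical helper `unique_fraction` and a five-record toy population invented for illustration; it does not reproduce the census analysis.

```python
from collections import Counter
from typing import Dict, List, Tuple


def unique_fraction(records: List[Dict[str, str]], attributes: Tuple[str, ...]) -> float:
    """Fraction of records whose combination of `attributes` is held
    by no other record (hypothetical helper)."""
    if not records:
        return 0.0
    combos = [tuple(rec[a] for a in attributes) for rec in records]
    counts = Counter(combos)
    unique = sum(1 for combo in combos if counts[combo] == 1)
    return unique / len(records)


# Toy population, invented purely for illustration.
population = [
    {"zip": "02138", "gender": "F", "birth_date": "1960-01-15"},
    {"zip": "02138", "gender": "F", "birth_date": "1960-01-15"},
    {"zip": "02138", "gender": "M", "birth_date": "1960-01-15"},
    {"zip": "02139", "gender": "F", "birth_date": "1985-03-22"},
    {"zip": "02139", "gender": "M", "birth_date": "1972-07-04"},
]

print(unique_fraction(population, ("zip",)))
print(unique_fraction(population, ("zip", "gender", "birth_date")))
```

In the toy population nobody is unique on ZIP code alone, while three of the five records are unique once gender and birth date are added, which is the effect that makes linkage on these three attributes effective.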

Another example: Gross and Acquisti (2005) performed re-identification using face recognition to link data on Flickr, which includes identifiers, with data on Friendster, which does not include explicit identifiers.

Papers for Re-identification

1. K-anonymity: A Model for Protecting Privacy, Sweeney, 2002

2. You Are What You Say: Privacy Risks of Public Mentions, Frankowski et al., SIGIR 2006

3. Information Revelation and Privacy in Online Social Networks, Gross and Acquisti, Workshop on Privacy in the Electronic Society 2005 (annotated by Sachin)

4. Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography, Backstrom et al., WWW 2007

5. Privacy-Preserving Data Mining, Agrawal and Srikant, SIGMOD 2000

6. Privacy Preserving Mining of Association Rules, Evfimievski et al., KDD 2002

7. Maintaining Privacy in Association Rule Mining, Rizvi and Haritsa, VLDB 2002

8. State-of-the-art in Privacy Preserving Data Mining, Verykios et al., SIGMOD Record 2004

9. Sharing, Privacy and Trust in Our Networked World, report to OCLC, 2007
