Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/107287
Type: Conference paper
Title: Privacy-Preserving Internet Traffic Publication
Author: Guo, L.; Shen, H.
Citation: Proceedings of the 15th IEEE International Conference On Trust, Security And Privacy In Computing And Communications, 10th IEEE International Conference on Big Data Science and Engineering, 13th International Conference on Embedded Software and Systems (2016 IEEE Trustcom/BigDataSE/ISPA), 2016, pp. 884-891
Publisher: IEEE
Issue Date: 2016
Series/Report no.: IEEE International Conference on Trust, Security and Privacy in Computing and Communications : [proceedings].
ISBN: 9781509032051
ISSN: 2324-898X; 2324-9013
Conference Name: 15th IEEE International Conference On Trust, Security And Privacy In Computing And Communications, 10th IEEE International Conference on Big Data Science and Engineering, 13th International Conference on Embedded Software and Systems (2016 IEEE Trustcom/BigDataSE/ISPA) (23 Aug 2016 - 26 Aug 2016 : Tianjin, China)
Statement of Responsibility: Longkun Guo, Hong Shen
Abstract: As machine learning (ML)-based traffic classification develops, Internet traffic data is published to serve as test data. Although the IP addresses therein are anonymized, the data explicitly indicates which records belong to the same user. Using this information, an adversary can identify a user among the anonymized users. The paper first gives a k-anonymity method that reduces the probability of information leak to P/k, where P is the probability of information leak without k-anonymity. Assuming the number of flows belonging to an IP address follows a Normal distribution, the information loss is shown to be (μ²+σ²)/(kμ²+σ²), where μ and σ² are respectively the mean and the variance of the Normal distribution. Random noise is then added to further reduce the probability of information leak to P/k², with an expected distortion rate of approximately 2d+log k-log|X|, where d is the number of dimensions and |X| is the number of vectors. Finally, real-world Internet traffic data is used to evaluate the utility of the anonymized traffic data. According to the experimental results, the k-anonymized, noised data can be clustered with an overall accuracy close to the state-of-the-art results for non-anonymized traffic data.
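Worked example (not part of the original record): the short Python sketch below only evaluates the closed-form expressions quoted in the abstract, reading the information-loss formula as (μ² + σ²)/(kμ² + σ²) with σ² the variance, and the leak probabilities as P/k and P/k². The function names and the sample values (μ = 10, variance = 4, k = 5, P = 0.5) are hypothetical and chosen purely for illustration; this is not the authors' implementation.

```python
def information_loss(mu: float, variance: float, k: int) -> float:
    """Evaluate (mu^2 + variance) / (k * mu^2 + variance) as quoted in the abstract."""
    return (mu ** 2 + variance) / (k * mu ** 2 + variance)


def leak_probability(p: float, k: int, noised: bool = False) -> float:
    """Return P/k for k-anonymity alone, or P/k^2 when random noise is also added."""
    return p / (k ** 2 if noised else k)


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the paper.
    mu, variance, k, p = 10.0, 4.0, 5, 0.5
    print("information loss:", information_loss(mu, variance, k))
    print("leak with k-anonymity:", leak_probability(p, k))
    print("leak with k-anonymity and noise:", leak_probability(p, k, noised=True))
```

With these assumed inputs the information loss evaluates to 104/504, roughly 0.206, and the leak probability falls from 0.5 to 0.1 under k-anonymity and to 0.02 once noise is added.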
Keywords: Privacy preserving; traffic classification; clustering; k-anonymity
Rights: © 2016 IEEE
DOI: 10.1109/TrustCom.2016.0152
Grant ID: http://purl.org/au-research/grants/arc/DP150104871
Appears in Collections: Aurora harvest 3; Computer Science publications


