Re-Identity - Algorithms and methods for tracking users

DNR
Digital Mercenary
Posts: 6114
Joined: 24 Feb 2006, 17:00
Location: Michigan USA

Post by DNR »

I know these are hard-to-read subjects, but some of you read at this level. -DNR
--

http://cryptome.org/trails1.pdf
Ten-page PDF on:
Trail Re-Identification:
Learning Who You Are From Where You Have Been


"
This paper provides algorithms for learning the identities of
individuals from the trails of seemingly anonymous information
they leave behind. Consider online consumers, who have the IP
addresses of their computers logged at each website visited.
Many falsely believe they cannot be identified. The term “re-identification”
refers to correctly relating seemingly anonymous
data to explicitly identifying information (such as the name or
address) of the person who is the subject of those data. Re-identification
has historically been associated with data released
from a single data holder. This paper extends the concept to “trail
re-identification” in which a person is related to a trail of
seemingly anonymous and homogeneous data left across different
locations. The 3 novel algorithms presented in this paper perform
trail re-identifications by exploiting the fact that some locations
also capture explicitly identifying information and subsequently
provide the unidentified data and the identified data as separate
data releases. Intersecting occurrences in these two kinds of data
can reveal identities. For example, an online consumer may visit
50 websites and purchase at 5 and another may visit 30 sites and
purchase at 7. Shared visit logs provide unidentified data.
Exchanged customer lists provide identified data. The algorithms
presented herein re-identify individuals based on the uniqueness
of trails across unidentified and identified datasets. The
algorithms differ in the amount of completeness and multiplicity
assumed in the data. Successful re-identifications are reported for
DNA sequences left by hospital patients and for IP addresses left
by online consumers. These algorithms are extensible to tracking
collocations of people, which is an objective of homeland defense
surveillance."

...

The REIDIT algorithms provide deterministic methods for
learning who (by name or explicit identity) has been where. The
methodology involves constructing trails across locations from
small amounts of seemingly anonymous or innocuous evidence
the person has been there. Trails are also constructed on places
where the person has left explicit information of their presence.
Identifying uniqueness and inferences across these two sets of the
trails relates information about where the person has been to who
they are.
--EOF
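
To make the trail idea concrete, here is a rough sketch of matching by trail containment. This is NOT the paper's REIDIT code, just a simplified illustration of the "intersecting occurrences" idea from the abstract. It assumes every place a person left their name also appears in their anonymous trail (you can't buy from a site without visiting it). The names `reidentify`, `visits` and `purchases` and all the data are made up for the example.

```python
def reidentify(unidentified, identified):
    """Match names to anonymous IDs by trail containment.

    unidentified: dict anon_id -> set of locations where the ID was logged
    identified:   dict name    -> set of locations where the person left a name
    Assumption: a person's identified trail is a subset of their own
    unidentified trail.
    """
    anon = {k: set(v) for k, v in unidentified.items()}
    named = {k: set(v) for k, v in identified.items()}
    matches = {}
    progress = True
    while progress:
        progress = False
        for name in sorted(named):
            # Anonymous trails that could still belong to this person.
            candidates = [a for a, trail in anon.items() if named[name] <= trail]
            if len(candidates) == 1:
                # Only one trail fits: record it and take both out of play,
                # which may make other people unambiguous on the next pass.
                matches[name] = candidates[0]
                del anon[candidates[0]]
                del named[name]
                progress = True
                break
    return matches


# Hypothetical shared visit logs (IP -> sites visited).
visits = {
    "10.0.0.1": {"a", "b", "c"},
    "10.0.0.2": {"a", "b"},
    "10.0.0.3": {"c", "d"},
}
# Hypothetical exchanged customer lists (name -> sites purchased at).
purchases = {
    "alice": {"a"},
    "bob": {"d"},
    "carol": {"b", "c"},
}

result = reidentify(visits, purchases)
```

Note that alice is ambiguous at first (two IPs visited site "a"), but once bob and carol are pinned down, only one IP is left for her. Elimination does a lot of the work.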

Also see:
http://www.eusflat.org/publications/pro ... 104-04.pdf
Towards the use of OWA operators for record linkage
"
Record linkage is used to establish links between
those records that while belonging to two different
files correspond to the same individual. Classical
approaches assume that the two files contain some
common variables, that are the ones used to link
the records...
Re-identification algorithms are one of such tools.
They are used to identify the structures that are
shared by several files or databases.
Record linkage algorithms are one of the most important re-identification
tools. Their goal is establish which records give
information on the same individual.
..
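
For anyone who hasn't met OWA (ordered weighted averaging) before: you sort the values from largest to smallest, then take a weighted sum, so the weights attach to rank positions rather than to particular fields. The quote above doesn't say exactly how the paper plugs OWA into record linkage, so the sketch below is only one guess at it: score each candidate pair by aggregating per-field similarities with OWA and link each record to its best-scoring partner. `owa`, `field_similarity`, `link` and the sample records are all hypothetical.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted descending."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def field_similarity(a, b):
    """Crude similarity: 1.0 on exact match, 0.0 otherwise (an assumption)."""
    return 1.0 if a == b else 0.0


def link(file_a, file_b, fields, weights):
    """For each record in file_a, return (index_a, best_index_b, score)."""
    links = []
    for i, rec_a in enumerate(file_a):
        scores = [
            owa([field_similarity(rec_a[f], rec_b[f]) for f in fields], weights)
            for rec_b in file_b
        ]
        best = max(range(len(file_b)), key=scores.__getitem__)
        links.append((i, best, scores[best]))
    return links


fields = ["zip", "birth", "sex"]
weights = [0.5, 0.3, 0.2]
file_a = [{"zip": "48104", "birth": "1970", "sex": "F"}]
file_b = [
    {"zip": "48104", "birth": "1971", "sex": "F"},
    {"zip": "48104", "birth": "1970", "sex": "F"},
]
links = link(file_a, file_b, fields, weights)
```

With weights [1, 0, 0] OWA is just the max, and with [0, 0, 1] it is the min, so the weight vector lets you slide between "any field matches" and "all fields must match".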
Betrayed By My Shadow: Learning Data Identity via Trail Matching
"
A single location’s releases appear unrelated; however, when multiple locations make such releases of information, common patterns in the data trails of two types of data can be used to discover relationships between them. The algorithms presented herein differ in the amount of completeness and multiplicity assumed in the data. We report experiments and successful re-identifications of IP addresses to online users and households. This work provides a foundation for several new research directions, including the development of methods for learning identity and additional information across disparate datasets, as well as a foundation for methods that enable data holders to share information with guarantees of anonymity. "
..

http://privacy.cs.cmu.edu/dataprivacy/p ... index.html

Web logs
Given a set of websites in which each site provides a weblog, which is a list of IP addresses recorded from machines visiting the website, how can the people who are using the machines be identified?
Answers: see Trail re-identification of on-line consumers using IP addresses.
--EOF
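
A quick sketch of the first step implied by that question: turning per-site weblogs into per-IP trails, then checking which trails are unique. This is only an illustration under simple assumptions (one IP = one machine, no NAT or DHCP churn). The function names and the sample logs are hypothetical, not from the CMU page.

```python
from collections import defaultdict


def build_trails(weblogs):
    """weblogs: dict site -> iterable of IP strings.

    Returns a dict mapping each IP to the set of sites that logged it.
    """
    trails = defaultdict(set)
    for site, ips in weblogs.items():
        for ip in ips:
            trails[ip].add(site)
    return dict(trails)


def unique_trails(trails):
    """Return the sorted IPs whose set of visited sites no other IP shares."""
    by_trail = defaultdict(list)
    for ip, sites in trails.items():
        by_trail[frozenset(sites)].append(ip)
    return sorted(ips[0] for ips in by_trail.values() if len(ips) == 1)


# Hypothetical weblogs from three sites.
weblogs = {
    "shop-a.example": ["192.0.2.1", "192.0.2.2", "192.0.2.1", "192.0.2.4"],
    "shop-b.example": ["192.0.2.1", "192.0.2.3", "192.0.2.4"],
    "shop-c.example": ["192.0.2.2", "192.0.2.3"],
}

trails = build_trails(weblogs)
unique = unique_trails(trails)
```

Here 192.0.2.1 and 192.0.2.4 visited the same two shops, so their trails can't be told apart; the other two IPs each have a trail nobody else shares, which is what makes them linkable.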

http://idtrail.org/content/view/799
LESSONS FROM THE IDENTITY TRAIL Anonymity, Privacy and Identity in a Networked Society

check out:

Chapter 22. Exit Node Repudiation for Anonymity Networks 1.11 Mb
by JEREMY CLARK, PHILIPPE GAUVIN, AND CARLISLE ADAMS
http://www.idtrail.org/files/ID%20Trail ... err_22.pdf

Chapter 23. TrackMeNot: Resisting Surveillance in Web Search 948.30 Kb
by DANIEL C. HOWE AND HELEN NISSENBAUM
http://www.idtrail.org/files/ID%20Trail ... err_23.pdf

--
-
He gives wisdom to the wise and knowledge to the discerning. He reveals deep and hidden things; he knows what lies in Darkness, and Light dwells with him.

Ghostface_Killah
forum buddy
Posts: 22
Joined: 10 May 2009, 16:00
Location: my sand box

Post by Ghostface_Killah »

Would never work completely because some people put out misinformation on purpose.

So those databases can be full of crap.

Post by DNR »

Actually, some of these profiling programs did not even consider the content or data sent; they looked at behavior and the computer/network signatures left behind. Servers frequently collect your machine info, like OS and browser type, to better serve their customers. That can be your signature, and how you access the internet, where you go, and what you do online can still make a profile of who you are.
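
To show what I mean by a signature, here is a minimal sketch that hashes a few request headers into one ID. The choice of headers and the names `fingerprint` and `FINGERPRINT_KEYS` are my own assumptions for illustration; real profiling tools use many more signals than this.

```python
import hashlib

# Headers assumed to stay stable for a given browser/OS setup (an assumption).
FINGERPRINT_KEYS = ("User-Agent", "Accept-Language", "Accept-Encoding")


def fingerprint(headers, keys=FINGERPRINT_KEYS):
    """Hash selected request headers into a stable hex signature.

    headers: dict of HTTP header name -> value. Missing headers count as "".
    """
    parts = [f"{k}={headers.get(k, '')}" for k in keys]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


# Two hypothetical requests from the same browser, with different cookies.
req1 = {
    "User-Agent": "ExampleBrowser/1.0 (ExampleOS 10)",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept-Encoding": "gzip, deflate",
    "Cookie": "session=abc",
}
req2 = dict(req1, Cookie="session=xyz")
# A request from a different browser.
req3 = dict(req1, **{"User-Agent": "OtherBrowser/2.0 (OtherOS 5)"})

same = fingerprint(req1) == fingerprint(req2)
different = fingerprint(req1) != fingerprint(req3)
```

Clearing cookies changes nothing here, because the signature never looked at them in the first place.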

DNR

DrVirus
Fame ! Where are the chicks?!
Posts: 383
Joined: 16 May 2007, 16:00

Post by DrVirus »

Ghostface_Killah wrote: Would never work completely because some people put out misinformation on purpose.

So those databases can be full of crap.
If I were you, I would look at using the info the other way around. There's a reason they say keep your friends close and your enemies closer. Need I say more?
