The Internet Knows Everything: MIT PersonasWeb

Remember the mid-’90s, when people were absolutely paranoid about the Internet, believing that the government knew everything about us and that these “hackers” might be able to steal our information?

Those AOL trial-goers would drop dead if they saw the new project led by MIT’s Aaron Zinman, titled “Personas” (alternatively “Personas Web” or “PersonasWeb”; I’m not sure which one it is, exactly).

What Is Personas?

The page is a data-mining utility meant to show just how much the Internet knows about you.

It pulls from a private(?) database of mined information, presumably gathered from crawled web pages, and sorts whatever it finds into labels including but not limited to “Sports”, “Online”, “Legal”, and “Illegal”.
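For the curious, here is a rough sketch of how that kind of labeling could work. I have no idea how Personas actually does it, so treat this as a guess: the four labels are the ones mentioned above, but the keyword lists and the `categorize` function are made up purely for illustration.

```typescript
// Hypothetical label vocabulary. The label names come from the site;
// the keywords are invented for this sketch.
const LABEL_KEYWORDS: Record<string, string[]> = {
  Sports: ["team", "league", "coach"],
  Online: ["blog", "twitter", "website"],
  Legal: ["court", "attorney", "lawsuit"],
  Illegal: ["arrest", "fraud", "theft"],
};

// For each label, count how many snippets mention at least one of its keywords.
function categorize(snippets: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const label of Object.keys(LABEL_KEYWORDS)) {
    counts[label] = 0;
  }
  for (const snippet of snippets) {
    // Split into lowercase words so "Twitter" matches "twitter".
    const words = snippet.toLowerCase().split(/[^a-z]+/).filter(Boolean);
    for (const label of Object.keys(LABEL_KEYWORDS)) {
      const keywords = LABEL_KEYWORDS[label];
      if (keywords.some((k) => words.indexOf(k) !== -1)) {
        counts[label] += 1;
      }
    }
  }
  return counts;
}

const example = categorize([
  "Anthony writes a blog and posts on Twitter",
  "A firefighter award in Texas",
]);
console.log(example);
```

A real system would need something far smarter than keyword matching, but the idea is the same: text in, a tally per label out.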

The site is a single static HTML page that relies heavily on JavaScript and AJAX to query the main servers, where the information is processed. All it asks for is your first and last name, and the engine then dumps everything it knows about you back into the page via AJAX requests.
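To make the "static page plus AJAX" pattern concrete, here is a minimal sketch of what the browser side of such a lookup might look like. The endpoint URL, the `first` and `last` parameter names, and the `lookup` helper are all assumptions of mine; the real Personas request format isn’t documented here.

```typescript
// Hypothetical endpoint. The real Personas URL and parameters are unknown.
const ENDPOINT = "https://example.invalid/personas/query";

// Build the query URL from a first and last name, escaping spaces and symbols.
function buildQueryUrl(first: string, last: string): string {
  const params =
    "first=" + encodeURIComponent(first.trim()) +
    "&last=" + encodeURIComponent(last.trim());
  return ENDPOINT + "?" + params;
}

// Minimal shape of a fetch-like function, so the sketch does not depend on a browser.
type Fetcher = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Send the request and hand back whatever JSON the server returns.
async function lookup(first: string, last: string, fetcher: Fetcher): Promise<unknown> {
  const response = await fetcher(buildQueryUrl(first, last));
  if (!response.ok) {
    throw new Error("lookup failed");
  }
  return response.json();
}
```

In a browser you would call it as `lookup("John", "Smith", (url) => fetch(url))` and then render the result into the page without a reload, which is all "AJAX" really means.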

How Accurate Is It?

I tried my name (first and last), and it didn’t yield very much. Then again, I never put my full real name on the Internet in the first place, so I have little to worry about.

Individuals with a fairly heavy online presence, such as The Coffee Desk’s own Anthony Cargile, yield much more information. It gets even more interesting when several people share a name, as in Anthony’s case and for anyone else with a generic name.

But as far as how extensive its database is, I’ll be the first to tell you: it’s nowhere near Google-sized. In fact, it’s pretty limited.

Anthony’s name only brought up results from the very bottom of this website, and a firefighter award that someone with the same name won in the state of Texas. Never mind the fact that he has two whole websites with his name plastered all over them, and a public Twitter bio.


Try it here: MIT Personas

Tips: try your own name, then those of celebrities and generics like “John Smith” to see the varying result set.


Should We Be Worried?

Fret not, fellow tin-foil-hat-wearing friends: this project won’t be anywhere near a useful data-mining utility until it gets better crawlers or gains direct access to Google’s index.

Compared to other data-mining tools like Google, Facebook, and other social media search, this site looks like nothing more than a small hobby project, size-wise.

And from the looks of it, that description isn’t too far off. It is merely an experimental educational tool.

But be aware: if Personas’ hidden index continues to gain intelligence via more in-depth crawling and other enhancements, then this could spell trouble for those who value privacy. It may already spell trouble for people fitting such a description as it stands now.

About the Author

mark

Mark (who wishes to keep his last name private) is currently employed as a system administrator for a company in his hometown. He has extensive experience in both networking and programming, and has designed many scalable and high-availability networks. Mark can easily be described as the go-to guy for building quality networks and data centers. He is now well-known for his very humorous posts here at The Coffee Desk. This bio has been corrected for our reader Nigles. I hope he feels special now.
