Authorship Attribution for Social Media Forensics

Sudeep Shrestha; Shubham SenGupta; Prema Ale; Vaishnavi V Naik

Authors

Sudeep Shrestha, Brindavan College of Engineering, Bangalore, India
Shubham SenGupta, Brindavan College of Engineering, Bangalore, India
Prema Ale, Brindavan College of Engineering, Bangalore, India
Vaishnavi V Naik, Brindavan College of Engineering, Bangalore, India

Keywords:

Authorship attribution, Computational linguistics, Forensics, Machine learning, Social media

Abstract

The veil of anonymity provided by smartphones with pre-paid SIM cards, public Wi-Fi hotspots, and distributed networks like Tor has drastically complicated the task of identifying users of social media during forensic investigations. In some cases, the text of a single posted message will be the only clue to an author’s identity. How can we accurately predict who that author might be when the message may never exceed 140 characters on a service like Twitter? For the past 50 years, linguists, computer scientists, and scholars of the humanities have been jointly developing automated methods to identify authors based on the style of their writing. All authors possess peculiarities of habit that influence the form and content of their written works. These characteristics can often be quantified and measured using machine learning algorithms. In this article, we provide a comprehensive review of the methods of authorship attribution that can be applied to the problem of social media forensics. Further, we examine emerging supervised learning-based methods that are effective for small sample sizes, and provide step-by-step explanations for several scalable approaches as instructional case studies for newcomers to the field.

