CSI Computer Science: Your coding style can give you away
If you’ve been programming for any length of time, no doubt you’ve developed your own coding style. Every developer has preferences not only for things like spacing (e.g., spaces vs. tabs), naming styles (e.g., CamelCase vs. snake_case) and commenting, but also for how he or she implements certain types of functionality. New research shows that a developer’s coding style is a type of fingerprint, one that can be used to identify the author of an anonymous piece of code with a high degree of accuracy.
Researchers from Drexel University, the University of Maryland, the University of Goettingen, and Princeton have developed a technique called “code stylometry,” which uses natural language processing and machine learning to determine the authors of source code based on coding style. Their findings, which were recently published in the paper “De-anonymizing Programmers via Code Stylometry,” could be applicable to a wide range of situations where determining the true author of a piece of code is important. For example, the technique could be used to help identify the author of malicious source code and to help resolve plagiarism and copyright disputes.
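To make the idea concrete, here is a toy sketch in Python of what style-based attribution might look like. It is not the researchers' feature set or classifier: the four features (tab indentation, snake_case and camelCase naming, comment density), the nearest-neighbor comparison, and the names `style_features` and `guess_author` are all assumptions chosen for illustration, and the sample snippets are made up.

```python
import math
import re


def style_features(source):
    """Turn source text into a small vector of layout and naming ratios.

    Illustrative features only; a real system would use far richer ones.
    """
    lines = source.splitlines() or [""]
    # Lines that begin with whitespace, and how many of those use a tab.
    indented = [ln for ln in lines if ln[:1] in (" ", "\t")]
    tab_indented = sum(1 for ln in indented if ln.startswith("\t"))
    # Crude identifier scan: also picks up keywords and words in comments.
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    snake = sum(1 for name in identifiers if "_" in name.strip("_"))
    camel = sum(1 for name in identifiers if re.search(r"[a-z][A-Z]", name))
    comments = sum(1 for ln in lines if ln.lstrip().startswith(("#", "//")))
    return [
        tab_indented / max(len(indented), 1),
        snake / max(len(identifiers), 1),
        camel / max(len(identifiers), 1),
        comments / len(lines),
    ]


def guess_author(known, anonymous):
    """Return the known author whose sample is closest in feature space."""
    target = style_features(anonymous)
    return min(
        known,
        key=lambda author: math.dist(style_features(known[author]), target),
    )


# Hypothetical samples from two made-up authors.
known_samples = {
    "alice": "def load_data(file_path):\n    # read the file\n"
             "    row_count = 0\n    return row_count\n",
    "bob": "def loadData(filePath):\n\trowCount = 0\n\treturn rowCount\n",
}
mystery = ("def save_rows(out_path):\n    # write rows\n"
           "    total_rows = 1\n    return total_rows\n")
print(guess_author(known_samples, mystery))
```

In this made-up example the mystery snippet uses spaces, snake_case, and a comment, so it lands closest to the "alice" sample. With only four shallow features and one sample per author, a sketch like this would be easy to fool; it is meant only to show the general shape of the approach.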