SOURCE CODE THEFT
~ Kirtisinh Vaghela, Intern, Seth Associates
Keywords- Cyberlaw, cybersecurity, Information technology Act, source code
- Abstract:
This article focuses on the fundamental technical aspects of source code, its creation, the tools used for the creation of the same, along with the case laws pertaining to source code theft. It also proposes solutions to prevent source code theft and is written in a format that is understandable for readers with no programming background.
___________________________________________________________________________
- What is a source code?
Programmers create source code, which is basically a computer program written in a way that humans can understand. It’s made up of functions, which are blocks of reusable code, descriptions that list terms and their definitions, methods that are collections of statements used to carry out specific tasks, and other operational statements.
For example, if you type a program in Windows Notepad and save it as a text file (.txt), that text file is the source code.
A tool known as a “C compiler” transforms high-level C language code into a machine-readable format. When a C compiler compiles the source code (the .txt file), it produces an object code file, which is then linked into an executable file (.exe file).
Languages that have only one form of code, such as JavaScript, are not compiled into object code in this way.
To write source code, you can use a text editor; a visual programming tool, which helps developers describe processes in human-readable terms; an Integrated Development Environment (IDE), which provides facilities for developing software; or a set of software development tools bundled into one installable package (often called a software development kit, or SDK).
Humans can understand the source code for C programming, as it is in a human-readable form. Yet, the computer’s processor needs this code changed into machine language by a C compiler to read it. We call the resulting output “object code.” Object code consists of binary numbers (0,1). We can link the object code to produce an executable file (.exe) that performs specific program functions.
___________________________________________________________________________
- Source code licensing
Source code can be open source or proprietary, depending on the license terms. Proprietary source code is kept secret for two reasons:
- To protect the developer’s intellectual property.
- To prevent outsiders from changing the source code and making the program unstable or more vulnerable to attacks.
On the other hand, open-source software lets anyone change the source code to enhance the program by adding different developers’ ideas. Proprietary software licenses often stop a person from altering the source code.
- Source code theft
In Syed Asifuddin and Ors. v. State of Andhra Pradesh and Anr., before the High Court of Andhra Pradesh, staff members of Tata Indicom were alleged to have hacked into CDMA digital phones belonging to Reliance Infocomm Ltd., on account of the losses Tata Indicom suffered because of Reliance’s eye-catching tariff plans. The court observed:
“Under Section 63 of the Copyright Act, 1957, a computer program will be considered as an original work according to Section 2(o)(ffc) and Section 14 of the Act.”
However, counsel for the petitioners argued that a telephone handset does not fall within the scope of Section 65 of the Information Technology Act, 2000.
The Andhra Pradesh High Court held that tampering with the phone handsets in this way attracted Section 65 of the Information Technology Act, 2000, which deals with tampering with computer source documents.
___________________________________________________________________________
The distinction between open-source code and proprietary source code was discussed in Aleynikov v. Goldman Sachs Grp. (2009), tried in the New York State Court, where Aleynikov, a programmer for Goldman Sachs, was found guilty of copying source code from the investment house. Aleynikov claimed that he had only wanted to recover open-source code from the program files, and argued that only 32 megabytes of about 1.2 gigabytes of code were proprietary.
He was later acquitted because the New York State Court held that:
“tangible” means “the manifestation of a thing in the physical world. Computer code does not become tangible merely because it is contained in a computer.”
However, Goldman Sachs alleged that he had also copied, compressed, encrypted, and renamed the files, suggesting that his intent went beyond merely recovering open-source files. He allegedly deleted the encryption program when the task was complete and attempted to delete any log files that would have shown signs of his activities.
The United States Court of Appeals for the Second Circuit held that intangible property such as source code does not constitute stolen “goods,” “wares” or “merchandise” under the National Stolen Property Act (NSPA).
__________________________________________________________________________
- Proposed Solutions
To protect against the most common types of source code leaks, companies should focus their efforts on the most common source of those leaks: insiders. Insiders often make errors that lead to security breaches and data leaks, and developers can also sometimes turn malicious and steal code. The following are some of the measures that a company may take to prevent source code leaks:
- Access Control
The first step would be to authenticate the user through the IDE. When the user launches the IDE, a login screen appears and asks for credentials (username and password). A user logging in for the first time has to register, and a hash of the password is stored along with the username in a database. For a registered user, the hash of the entered password is matched against the hash stored in the database. If both match, the user gains access; otherwise the user is asked to re-enter the credentials or exit the login window. Tools such as Data Loss Prevention (DLP) software can also be deployed to detect suspected loss of source code and alert the relevant teams.
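The register-and-match flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the design of any particular IDE: the in-memory `users` dictionary stands in for the database, and the function names, the salted PBKDF2 hashing scheme, and the iteration count are all assumptions made for the example.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory "database": username -> (salt, password hash).
# A real system would use a persistent, access-controlled store.
users = {}

ITERATIONS = 100_000  # assumed work factor for PBKDF2


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(username: str, password: str) -> bool:
    """First-time login: store the username together with a salted hash."""
    if username in users:
        return False  # already registered
    salt = os.urandom(16)
    users[username] = (salt, hash_password(password, salt))
    return True


def login(username: str, password: str) -> bool:
    """Returning user: hash the entered password and match it with the stored hash."""
    record = users.get(username)
    if record is None:
        return False  # unknown user
    salt, stored_hash = record
    entered_hash = hash_password(password, salt)
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(entered_hash, stored_hash)
```

Calling `register("dev1", "s3cret")` followed by `login("dev1", "s3cret")` grants access, while `login("dev1", "wrong")` is refused, at which point the login window would ask the user to re-enter the credentials or exit.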
- Protecting source codes with Patents and Copyrights
Patents and copyrights cannot by themselves prevent source code theft. Depending on the laws of the particular jurisdiction, however, they confirm ownership of the source code, which gives the developer enforceable rights over it.
- Behavioural analysis
In Tesla Inc. v. Khatilov (2021), tried in the United States District Court, Northern District of California, the company sued Alex Khatilov, who had worked there for less than a week before allegedly copying code from the company’s WARP Drive backend system to Dropbox. The company detected the threat when Khatilov began copying a significant amount of data outside the company’s network. The District Court issued a Temporary Restraining Order against Khatilov from 22nd January, 2021 to 5th February, 2021, along with an Evidence Preservation Order.
Such incidents can be caught early by automating a Security Information and Event Management (SIEM) system’s ability to learn normal behavioural patterns, and by using additional machine learning techniques to create behavioural profiles of the different roles in the development process. This minimizes human work and captures changes in behaviour automatically, and the resulting security controls are more accurate and more flexible. This type of security software is often referred to as user and entity behavioural analytics, or UEBA.
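The idea of learning a normal behavioural pattern and flagging departures from it, as discussed in this section, can be illustrated with a very small Python sketch. It is a simplified stand-in for what SIEM or UEBA products do, not a description of any real product: the function name, the sample figures, and the threshold of three standard deviations are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def is_anomalous(history_mb, today_mb, threshold=3.0):
    """Flag today's transfer volume if it is far above the user's own baseline.

    history_mb: past daily outbound transfer volumes (in MB) for one user or role.
    threshold:  number of standard deviations treated as suspicious (assumed value).
    """
    if len(history_mb) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history_mb)
    spread = stdev(history_mb)
    if spread == 0:
        # No variation in the history: anything above the baseline stands out
        return today_mb > baseline
    return (today_mb - baseline) / spread > threshold


# Hypothetical daily outbound volumes (MB) for a developer role
usual_days = [40, 55, 38, 60, 47, 52, 45]
print(is_anomalous(usual_days, 50))    # False: an ordinary day
print(is_anomalous(usual_days, 5000))  # True: a sudden bulk copy
```

A real deployment would build such baselines per role and across many signals (files accessed, destinations, time of day) rather than from a single number, but the principle of comparing current activity with learned normal activity is the same.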
- NDA and Legal agreements
Companies should use Non-Disclosure Agreements (NDAs) and other legal agreements to bind employees to keep the data confidential.
- Encryption
Encryption is the method of locking up data with the use of cryptography. The message is encoded in a form that cannot be understood without a key. In simple words, encryption scrambles the data in a particular way, and the key describes how it was scrambled.
Illustration: A programmer writes the text “Hello” and encrypts it by replacing each letter with the letter two places after it in the alphabet. The word “Hello”, when encrypted, therefore reads “Jgnnq”, and the encryption key needed to decrypt the text back to its original form is “2”.
Encryption requires the use of a cryptographic key (which is a set of mathematical values) that both the sender and the recipient of the encrypted message agree upon. Secure encryption uses keys that are strong enough that the message cannot be decrypted by brute force (guessing the key).
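The “Hello” illustration can be reproduced with a short Python sketch. The function names are chosen only for this example, and a letter-shift cipher of this kind is used purely to explain the idea of a key; it is not a secure encryption method, as the last lines of the sketch show.

```python
import string

ALPHABET_SIZE = 26


def shift_text(text: str, key: int) -> str:
    """Shift each letter `key` places along the alphabet, wrapping around at Z."""
    result = []
    for ch in text:
        if ch in string.ascii_uppercase:
            base = ord("A")
        elif ch in string.ascii_lowercase:
            base = ord("a")
        else:
            result.append(ch)  # leave spaces, digits and punctuation unchanged
            continue
        result.append(chr((ord(ch) - base + key) % ALPHABET_SIZE + base))
    return "".join(result)


def encrypt(text: str, key: int) -> str:
    return shift_text(text, key)


def decrypt(text: str, key: int) -> str:
    return shift_text(text, -key)


print(encrypt("Hello", 2))  # Jgnnq
print(decrypt("Jgnnq", 2))  # Hello

# Brute force: with only 26 possible keys, every candidate can simply be tried.
candidates = [decrypt("Jgnnq", k) for k in range(ALPHABET_SIZE)]
print("Hello" in candidates)  # True
```

Because an attacker can try all 26 keys almost instantly, this cipher offers no real protection. Modern encryption relies on keys drawn from a space so large that trying every key is not practical.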
___________________________________________________________________________
- Conclusion
Source code is the most basic code upon which a program is built. Both source code and object code qualify for protection as intellectual property, as long as they are original, under the Copyright Act as well as the TRIPS Agreement, to which India is a party. The code must be protected from internal as well as external threats; in the cases discussed above, the source code was mostly leaked by insiders or employees. Hence, one must adopt the best security measures to prevent the loss of proprietary and confidential information such as source code.
Bibliography
Publications:
- Steven B. Lipner, “Security and Source Code Access: Issues and Realities”, published in February 2000,
- Noor Yasin, Abdul Ghafoor Abbasi, and Muhammad Shoaib, “An Architecture for Source Code Protection”, published on 20th May 2016,
<https://www.researchgate.net/publication/282400013_An_Architecture_for_Source_Code_Protection>
- Attique Ahmed, Muhammad Naeem, “Analysis of Most Common Encryption Algorithms”, published on 2nd March 2022, IJEACS, Empirical Research Press Ltd.
<http://ijeacs.com/Files/Other/IJEACS-V04I02/22030402003.pdf>
- Opentext, Cybersecurity, “Guide to protecting Source Code”, published on 6th February 2021,
<https://www.microfocus.com/media/webinar/guide-to-protecting-source-code-wp.pdf>
Web Articles:
- Rohit Ranjan Praveer, “From general to specific: Source Code comparison”, published on 20th September 2019,
<https://metacept.com/from-general-to-the-specific-source-code-comparison/>
- Scott Wallask, “Source Code”, published in January 2023,
<https://www.techtarget.com/searchapparchitecture/definition/source-code>