Researchers from North Carolina State University and IBM Research have developed a new natural language processing tool that businesses or other customers can use to ensure that software developers have a clear idea of the security policies to be incorporated into new software products.
Specifically, the research focuses on access control policies (ACPs), which are the security requirements that software developers need to bear in mind when developing new software. For example, an ACP for a university grading program needs to allow professors to give grades to students, but should not allow students to change the grades.
“These ACPs are important, but are often buried amidst a lengthy list of other requirements that customers give to developers,” says Dr. Tao Xie, an associate professor of computer science at NC State and co-author of a paper on the research. These requirements are written in “natural language,” which is the conversational language that people use when speaking or writing.
One problem is that ACP requirements can be incomplete or inaccurate, for example if the customer writing them makes a mistake or doesn’t have enough technical know-how to accurately describe a program’s security needs.
A second problem is that programmers may misinterpret some ACP requirements, or overlook them entirely.
In collaboration with IBM Research, Xie’s research team has developed a solution that uses a natural language processing program to extract the ACP requirements from a customer’s overall list of requirements and translate them into machine-readable language that computers can enforce.
After the ACPs are extracted, they can be run through the Access Control Policy Tool (ACPT), which was also developed by Xie’s research team, in collaboration with the National Institute of Standards and Technology (NIST). The tool verifies and tests the ACPs and determines whether the ACP requirements are adequate to meet the security needs of the program.
Once the ACP requirements have been translated into machine-readable language, they can also be incorporated into a policy-enforcement “engine” in the final software product – which ensures that ACPs cannot be overlooked by programmers.
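To illustrate what such an engine might look like, here is a minimal Python sketch built on the university grading example. It is not the researchers' implementation. The `Rule` record, the `POLICY` list and the `is_allowed` function are hypothetical names, and two design choices are assumptions made for illustration: a deny rule overrides a permit rule, and any request that no rule covers is refused.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One machine-readable access control rule (hypothetical format)."""
    effect: str    # "permit" or "deny"
    role: str      # who is asking, e.g. "professor"
    action: str    # what they want to do, e.g. "assign"
    resource: str  # what they want to do it to, e.g. "grade"


# Assumed policy for the grading example: professors may assign grades,
# students may not change them.
POLICY = [
    Rule("permit", "professor", "assign", "grade"),
    Rule("deny", "student", "change", "grade"),
]


def is_allowed(role, action, resource, policy=POLICY):
    """Return True only if a permit rule matches and no deny rule matches."""
    matches = [
        rule for rule in policy
        if (rule.role, rule.action, rule.resource) == (role, action, resource)
    ]
    if any(rule.effect == "deny" for rule in matches):
        return False  # assumption: deny overrides permit
    # assumption: requests with no matching permit rule are refused
    return any(rule.effect == "permit" for rule in matches)
```

Because every request passes through `is_allowed`, a check such as `is_allowed("student", "change", "grade")` is refused by the policy itself rather than by code a programmer has to remember to write.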
“In general, developing a program that understands natural language text is very challenging,” Xie says. “However, ACP requirements in software documents usually follow a certain style, using terms such as ‘cannot be edited’ or ‘does not have the ability to edit.’ Because ACPs tend to use such a limited number of phrases, it is much easier to develop a program that effectively translates natural language texts in this context.”
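The idea that a small set of stock phrases can be recognized automatically can be illustrated with a toy Python sketch. It is not Text2Policy, and a real system would need much richer language analysis than this. The two regular expressions are hypothetical, modeled only on the two phrases Xie quotes, and the `extract_rule` function and its output format are assumptions made for illustration.

```python
import re

# Hypothetical pattern for passive phrasing: "<resource> cannot be <action> by <subject>"
PASSIVE = re.compile(
    r"(?P<resource>[\w ]+?) cannot be (?P<action>\w+) by (?P<subject>[\w ]+)",
    re.IGNORECASE,
)

# Hypothetical pattern for active phrasing:
# "<subject> does not have the ability to <action> <resource>"
ACTIVE = re.compile(
    r"(?P<subject>[\w ]+?) (?:does|do) not have the ability to "
    r"(?P<action>\w+) (?P<resource>[\w ]+)",
    re.IGNORECASE,
)


def extract_rule(sentence):
    """Return a deny rule as a dict if the sentence fits a known pattern, else None."""
    text = sentence.strip().rstrip(".")
    for pattern in (PASSIVE, ACTIVE):
        match = pattern.fullmatch(text)
        if match:
            # Normalize the captured words to lowercase for easier comparison.
            fields = {key: value.strip().lower() for key, value in match.groupdict().items()}
            return {"effect": "deny", **fields}
    return None  # the sentence is not recognized as an access control requirement
```

Under these assumptions, "Grades cannot be edited by students." and "A student does not have the ability to edit grades." both yield a deny rule, while an ordinary requirement such as "The system sends a confirmation email." yields `None`.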
The paper, “Automated Extraction of Security Policies from Natural-Language Software Documents,” will be presented Nov. 13 at the 2012 International Symposium on the Foundations of Software Engineering (SIGSOFT’12/FSE-20) in Cary, N.C. Lead author of the paper is Xusheng Xiao, a Ph.D. student at NC State. Co-authors include Xie, Dr. Amit Paradkar of the IBM T.J. Watson Research Center and Dr. Suresh Thummalapenta of IBM Research India. The research was supported by the National Science Foundation, the U.S. Army Research Office, NIST, and the National Security Agency Science of Security Lablet.
Note to Editors: The study abstract follows.
“Automated Extraction of Security Policies from Natural-Language Software Documents”
Authors: Xusheng Xiao and Tao Xie, North Carolina State University; Amit Paradkar, IBM T.J. Watson Research Center; Suresh Thummalapenta, IBM Research India
Presented: SIGSOFT’12/FSE-20, Nov. 13, Cary, N.C.
Abstract: Access Control Policies (ACP) specify which principals such as users have access to which resources. Ensuring the correctness and consistency of ACPs is crucial to prevent security vulnerabilities. However, in practice, ACPs are commonly written in Natural Language (NL) and buried in large documents such as requirements documents, not amenable for automated techniques to check for correctness and consistency. It is tedious to manually extract ACPs from these NL documents and validate NL functional requirements such as use cases against ACPs for detecting inconsistencies. To address these issues, we propose an approach, called Text2Policy, to automatically extract ACPs from NL software documents and resource-access information from NL scenario-based functional requirements. We conducted three evaluations on the collected ACP sentences from publicly available sources along with use cases from both open source and proprietary projects. The results show that Text2Policy effectively identifies ACP sentences with the precision of 88.7% and the recall of 89.4%, extracts ACP rules with the accuracy of 86.3%, and extracts action steps with the accuracy of 81.9%.