May 2021: Identifying Vulnerable GitHub Repositories in Scientific Cyberinfrastructure

The scientific cyberinfrastructure community heavily relies on public internet-based systems (e.g., GitHub) to share resources and collaborate. GitHub is one of the most powerful and popular systems for open source collaboration that allows users to share and work on projects in a public space for accelerated development and deployment. Monitoring GitHub for exposed vulnerabilities can save financial cost and prevent misuse and attacks of cyberinfrastructure. Vulnerability scanners that can interface with GitHub directly can be leveraged to conduct such monitoring. This research aims to proactively identify vulnerable communities within scientific cyberinfrastructure. We use social network analysis to construct graphs representing the relationships amongst users and repositories. We leverage prevailing unsupervised graph embedding algorithms to generate graph embeddings that capture the network attributes and nodal features of our repository and user graphs. This enables the clustering of public cyberinfrastructure repositories and users that have similar network attributes and vulnerabilities. Results of this research find that major scientific cyberinfrastructures have vulnerabilities pertaining to secret leakage and insecure coding practices for high-impact genomics research. These results can help organizations address their vulnerable repositories and users in a targeted manner.

Speaker Bio: Dr. Sagar Samtani is an Assistant Professor and Grant Thornton Scholar in the Department of Operations and Decision Technologies at the Kelley School of Business at Indiana University (2020 – Present). He is also a Fellow within the Center for Applied Cybersecurity Research (CACR) at IU. Samtani graduated with his Ph.D. in May 2018 from the Artificial Intelligence Lab in University of Arizona’s Management Information Systems (MIS) department from the University of Arizona (UArizona). He also earned his MS in MIS and BSBA in 2014 and 2013, respectively, from UArizona. From 2014 – 2017, Samtani served as a National Science Foundation (NSF) Scholarship-for-Service (SFS) Fellow.

Samtani’s research centers around Explainable Artificial Intelligence (XAI) for Cybersecurity and cyber threat intelligence (CTI). Selected recent topics include deep learning, network science, and text mining approaches for smart vulnerability assessment, scientific cyberinfrastructure security, and Dark Web analytics. Samtani has published over two dozen journal and conference papers on these topics in leading venues such as MIS Quarterly, JMIS, ACM TOPS, IEEE IS, Computers and Security, IEEE Security and Privacy, and others. His research has received nearly $1.8M (in PI and Co-PI roles) from the NSF CICI, CRII, and SaTC-EDU programs.

He also serves as a Program Committee member or Program Chair of leading AI for cybersecurity and CTI conferences and workshops, including IEEE S&P Deep Learning Workshop, USENIX ScAINet, ACM CCS AISec, IEEE ISI, IEEE ICDM, and others. He has also served as a Guest Editor on topics pertaining to AI for Cybersecurity at IEEE TDSC and other leading journals. Samtani has won several awards for his research and teaching efforts, including the ACM SIGMIS Doctoral Dissertation award in 2019. Samtani has received media attention from outlets such as Miami Herald, Fox, Science Magazine, AAAS, and the Penny Hoarder. He is a member of AIS, ACM, IEEE, INFORMS, and INNS.

Jeannette DopheideMay 25, 2021