Environmental, social, and governance

Student: Alex Chae
Supervisor: Thomas Marlow
The summer project focused on developing a comprehensive dataset of earnings call transcripts of S&P 500 companies from 2010 to 2022, which can be used to identify trends in how ESG strategies are discussed between companies and their shareholders. The project started by downloading the earnings call files from the Factiva database and developing a detailed guideline on how to query the database to retrieve the most relevant results. The downloaded files were then processed using Python to extract relevant metadata, such as the quarter and year of the earnings call, the location of the company’s headquarters, and the industry in which the company operates. The dataset contains more than 10,000 earnings call transcripts from 181 companies in the financial, consumer goods, real estate, utilities, and energy sectors.
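As a rough illustration of the metadata extraction step, the sketch below pulls the quarter and year of a call out of a transcript with a regular expression. The title format ("Q3 2018 Earnings Call") and the directory layout are assumptions made for the example, not the actual structure of the Factiva files.

```python
import re
from pathlib import Path

# Assumed title format such as "Q3 2018 Earnings Call"; the real Factiva
# file layout may differ, so this pattern is illustrative only.
QUARTER_RE = re.compile(r"\bQ([1-4])\s+(20\d{2})\b")

def extract_call_metadata(path: Path) -> dict:
    """Pull the quarter and year from one transcript file."""
    text = path.read_text(encoding="utf-8", errors="ignore")
    match = QUARTER_RE.search(text)
    return {
        "file": path.name,
        "quarter": int(match.group(1)) if match else None,
        "year": int(match.group(2)) if match else None,
    }

# Hypothetical directory of downloaded transcripts.
for transcript in sorted(Path("transcripts").glob("*.txt")):
    print(extract_call_metadata(transcript))
```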
The dataset is also the result of an innovative approach to enriching textual data. Using OpenAI’s GPT-3.5 model, I performed zero-shot Named-Entity Recognition (NER) to extract company names from the earnings call transcripts (a sketch of this step follows below). The extracted names were assumed to identify the participants of each earnings call, who are most likely the shareholders of the company. The compiled dataset therefore allows the research team to examine the extent to which companies discuss ESG, how patterns in that discussion have evolved over time, and the influence of shareholders in changing the course of the discussion. The summer project was finalized with documentation of summary statistics and visualizations of the contents of the dataset, including how many earnings call transcripts are present for each company, how many transcripts there are before and after the 2015 Paris Agreement, and the list of companies represented in the data.
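A minimal sketch of the zero-shot NER step, assuming the openai Python client (v1 interface) and a hypothetical prompt; the project’s actual prompt and post-processing are not reproduced here.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical zero-shot prompt; the wording used in the project may differ.
PROMPT = (
    "List every company name mentioned in the earnings call excerpt below. "
    "Return one name per line with no extra text.\n\n{excerpt}"
)

def extract_company_names(excerpt: str) -> list[str]:
    """Zero-shot NER over one transcript excerpt using GPT-3.5."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": PROMPT.format(excerpt=excerpt)}],
        temperature=0,  # deterministic output suits extraction tasks
    )
    lines = response.choices[0].message.content.splitlines()
    return [name.strip() for name in lines if name.strip()]
```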
Adversarial Attacks

Student: Abdullah Suri
Supervisor: Mohammad Shafique
In the rapidly evolving landscape of machine learning, the pursuit of accuracy has yielded remarkable progress. However, the vulnerability of these models to subtle manipulations, known as adversarial attacks, raises concerns about their robustness. Adversarial attacks involve introducing minor changes to input data to mislead machine learning models. These attacks exploit model vulnerabilities, often revealing unexpected weaknesses. Understanding adversarial attacks is crucial for enhancing the security of machine learning systems across domains.
The project focused on studying and implementing various adversarial attacks. As preparation, several programming tasks were completed: training LeNet on the MNIST dataset; implementing the Fast Gradient Sign Method (FGSM) attack, which perturbs the input using the sign of the gradient of the loss with respect to that input; and, building on FGSM, extending the implementation to the iterative Projected Gradient Descent (PGD) attack, which required a deeper understanding of the concepts involved (a sketch of both attacks follows below). The subsequent phase involved studying advanced attack strategies such as EOTPGD, TPGD, UPGD, PGDRS, PGDRSL2, and OnePixel. Executing these attacks required thorough investigation of the relevant research papers, as well as a line-by-line examination of the torchattacks library’s implementations, and the outcomes were then compared against the torchattacks results. The final task centered on preparing ResNet and VGG models: two versions of each model were implemented, and a comprehensive comparison was conducted to gain insights into their respective performance.
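A minimal PyTorch sketch of the two baseline attacks, assuming a pretrained classifier model and inputs normalized to [0, 1]; the eps, alpha, and steps values are common illustrative defaults, not the settings used in the project.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, eps=8/255):
    """FGSM: one step in the direction of the sign of the input gradient."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    grad = torch.autograd.grad(loss, images)[0]
    adv = images + eps * grad.sign()   # single perturbation step
    return adv.clamp(0, 1).detach()    # keep pixels in the valid range

def pgd_attack(model, images, labels, eps=8/255, alpha=2/255, steps=10):
    """PGD: iterated FGSM steps, projected back into the eps-ball each time."""
    orig = images.clone().detach()
    adv = (orig + torch.empty_like(orig).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()
        adv = orig + (adv - orig).clamp(-eps, eps)  # project onto eps-ball
        adv = adv.clamp(0, 1)
    return adv.detach()
```

For the advanced variants, torchattacks wraps each attack in a class with a uniform calling convention, which makes a side-by-side check against a manual implementation straightforward:

```python
import torchattacks

# Same eps/alpha/steps as the manual PGD above, for a like-for-like check.
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=10)
adv_images = atk(images, labels)

# The advanced variants studied in the project share the same interface.
atk = torchattacks.OnePixel(model)
adv_images = atk(images, labels)
```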