Leveraging AI to Solve the ESG Challenge

Data Strategist Corner Series

Trent Bradberry

Christina Ho

Date Published: April 19, 2021

Peter Drucker, often described as the world’s greatest management consultant, is famously credited with saying, “If you can’t measure it, you can’t improve it.” The quote applies directly to the Environmental, Social, and Corporate Governance (ESG) challenge.

The History of ESG

ESG factors are non-financial criteria used to assess the sustainability and societal impact of an investment. The rise of ESG reflects a fundamental shift from the 20th to the 21st century in how social responsibility is perceived to affect a company’s financial performance.

Historically and for most of the 20th century, the prevailing theory was that social responsibility negatively affects a company’s financial performance. American economist Milton Friedman, who won the Nobel Prize in 1976, was the leading voice of the shareholder (or stockholder) theory, which holds that a company’s sole responsibility is to its shareholders.

Toward the end of the 20th century, the notion of the “triple bottom line,” coined by author and serial entrepreneur John Elkington, began to emerge and gain momentum. Under Elkington’s framework, a company’s value is determined by social and environmental factors in addition to the traditional financial factors.

At the beginning of the millennium, in response to the growing ESG investment market, some institutions began to develop related products and services. Research was also being published that challenged the assumption that social responsibility was a cost that yielded no financial returns.

In 2006, the United Nations launched the Principles for Responsible Investment (PRI), which are dedicated to achieving a sustainable global financial system for long-term value creation. Since then, the number of signatories has grown to 3,000, representing over $100 trillion in assets under management.

ESG Reckoning

Unfortunately, these commitments have not resulted in substantive improvement in responsible investing. In a recent Harvard Business Review blog, “An ESG Reckoning Is Coming,” authors Michael O’Leary and Warren Valdmanis offered the following critical assessment:

According to research last year, investors who signed onto the United Nations principles did not improve the social and environmental performance of their investments. According to the researchers, signatories “use the PRI status to attract capital without making notable changes to ESG.” Similarly, signatories to the Business Roundtable statement have performed no better than other companies in protecting jobs and worker safety during the pandemic.

The authors suggested that standardized and required public disclosures would lay the foundation for further accountability. Currently, the Securities and Exchange Commission (SEC) does not mandate ESG reporting for U.S. companies. As a result, performance metrics are neither standardized nor comparable across companies. The SEC has been under pressure from the investor community to require ESG disclosures.

On February 24, 2021, Acting SEC Chair Allison Herren Lee announced that the agency is taking action to update its 2010 guidelines on climate-related disclosures in public company filings. In subsequent weeks, the SEC also announced a series of other ESG-related priorities and changes. These steps are clear signals that changes are coming.

Possible Solutions

The ultimate challenge of ESG is summarized in Dr. Drucker’s quote at the beginning of this blog. Although there is already a significant amount of information (and data) provided by the companies, investors do not have the tools and ability to efficiently and effectively extract reliable and comparable insights to measure the performance of a company as it relates to ESG.

In the sections below, we explore how natural language processing (NLP) can help solve this daunting problem from both an investor and a regulator perspective.

Tackling the Lack of Transparency in ESG Reporting

One way to promote accountability through greater transparency of a company’s ESG status is by revealing relevant insights through analytical methods aimed at processing descriptive, natural language from documents about a company. Over the past decade, NLP has made great strides within the domains of Artificial Intelligence and Machine Learning. In short, NLP uses computer algorithms to efficiently manipulate, summarize, classify, and extract information from text and speech. Applying NLP to text containing ESG information can offer descriptive insights for making informed decisions about a company’s sustainability, and it can do so at a faster rate than a human analyst.

In the domain of ESG, textual sources are available along a continuum of transparency. Three sources of ESG text ordered from least to most transparent are:

Self-Reported Sustainability Assessments

SEC-Related Documents

News Articles

Let’s explore how NLP techniques can be applied to each source.

Self-Reported Sustainability Assessments

A company’s sustainability assessment is often published on its website. While the report can provide detailed ESG-related information, it can also be biased because it is voluntary and unregulated. Companies may avoid reporting negative information.

Nonetheless, NLP can be used to automate tasks such as summarization, keyword extraction, topic modeling, document classification, and sentiment analysis. These techniques are especially useful when one has sustainability assessments from multiple companies and wants to process them together to reveal trends and outliers. For example, topic modeling could reveal that most companies report on topics related to COVID-19 and employee health, yet a few companies avoid reporting on this topic. Text summarization could then be applied to quickly assess these outliers’ overall messaging.
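To make the outlier idea concrete, here is a minimal sketch in Python. It does not train a topic model. Instead, it assumes a hand-picked keyword list as a stand-in for a learned "COVID-19 and employee health" topic, and flags companies whose reports never touch it. The company names, report text, and keyword list are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy reports; real inputs would be full sustainability assessments.
reports = {
    "Company A": "We expanded employee health programs and COVID-19 safety measures at all sites.",
    "Company B": "COVID-19 testing and remote work protected employee health this year.",
    "Company C": "We reduced emissions and increased renewable energy purchases.",
}

# Assumed keyword list standing in for a topic that a topic model would learn.
TOPIC_TERMS = {"covid", "health", "safety", "pandemic"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def topic_share(text, terms=TOPIC_TERMS):
    """Return the fraction of tokens that belong to the topic's term list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[t] for t in terms) / len(tokens)


def find_outliers(reports, threshold=0.0):
    """List companies whose topic share is at or below the threshold."""
    return sorted(
        name for name, text in reports.items() if topic_share(text) <= threshold
    )


shares = {name: topic_share(text) for name, text in reports.items()}
outliers = find_outliers(reports)
print(shares)
print("Companies not covering the topic:", outliers)
```

With real reports, the keyword list would likely be replaced by topics learned from the corpus, and the threshold would need tuning. The flagged reports are the ones an analyst might then summarize and review first.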

SEC-Related Documents

SEC-related documents, in theory, are more transparent and regulated. Required Form 10-Ks may contain ESG information that a company discloses to the SEC. A company’s annual meeting proxy statements, also required by the SEC, are another good source of potential ESG disclosures. In addition, shareholder resolutions can be a valuable source of information about the types of ESG issues raised by shareholders.

As with self-reported sustainability assessments, NLP techniques such as topic modeling and text summarization can be used to extract insight from these documents. In fact, recent NLP work on shareholder resolutions has revealed topics related to emissions and energy, boards, regulations, politics, governance, product management, and accountability.

While these SEC-related text sources are typically more transparent and reliable than self-reported sustainability reports, there is increased scrutiny around them, leading the SEC to recently announce the formation of an ESG task force to consider a more well-defined regulatory framework.

News Articles

The most transparent text source is news articles. Not only is news more transparent, but it is also published frequently, which allows NLP to be used to monitor ESG-related events as they are reported. In addition to applying NLP to news articles in the manner described in the previous two sections, two reported examples exist where state-of-the-art techniques for language understanding were applied to automatically classify news articles into at least 20 ESG-related categories, such as physical impacts of climate change, employee diversity and inclusion, and business ethics (Mukherjee, 2020; Nugent et al., 2020).

Elder Research has developed a proof-of-concept called the News Analyzer that scrapes news articles from the Internet and uses these NLP technologies to filter them by relevance and sentiment before displaying them to users. This technology could be extended into the ESG domain to offer clients informative monitoring of breaking ESG-related news about companies.
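As a rough illustration of filtering news by relevance and sentiment, the sketch below scores headlines against small word lists. This is not how the News Analyzer works internally. The word lists, thresholds, headlines, and the company name "Acme" are all hypothetical, and a production system would likely learn these signals from labeled data instead of using fixed lexicons.

```python
import re

# Assumed, hand-picked word lists used only for illustration.
ESG_TERMS = {"emissions", "diversity", "governance", "pollution", "safety", "climate"}
POSITIVE = {"improves", "award", "reduces", "praised"}
NEGATIVE = {"fine", "violation", "lawsuit", "spill"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score(headline):
    """Return (relevance, sentiment) counts for one headline."""
    tokens = tokenize(headline)
    relevance = sum(t in ESG_TERMS for t in tokens)
    sentiment = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return relevance, sentiment


def filter_news(headlines, min_relevance=1, max_sentiment=-1):
    """Keep headlines that are ESG-relevant and carry negative sentiment."""
    kept = []
    for headline in headlines:
        relevance, sentiment = score(headline)
        if relevance >= min_relevance and sentiment <= max_sentiment:
            kept.append(headline)
    return kept


# Hypothetical headlines about a hypothetical company.
headlines = [
    "Regulator issues fine after pollution violation at Acme plant",
    "Acme praised for diversity award",
    "Acme opens new office downtown",
]
flagged = filter_news(headlines)
print(flagged)
```

Here only the first headline is kept: the second is relevant but positive, and the third is not ESG-related. Adjusting `max_sentiment` would let a user surface positive ESG news as well.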

ESG Regulation Development

In addition to deriving insights about companies to monitor their ESG status, NLP could aid in the development of disclosure standards. These NLP techniques can reveal ways in which companies may evade disclosure of adverse behavior.

Consider a scenario where text classifiers are leveraged in a process to collect news articles about companies’ environmental violations. NLP can be used to compare these companies’ SEC-related documents and self-reported sustainability assessments against a group of non-offenders’ documents to reveal distinguishing characteristics that could inform new regulations to prevent future circumvention of the system.
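One simple way to sketch this comparison is to rank terms by how much more often they appear in offenders' documents than in non-offenders' documents. The example below uses a smoothed log-ratio of term frequencies, which is only one of many possible measures. The two document groups are hypothetical toy text, and function names such as `distinguishing_terms` are illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(docs):
    """Count token occurrences across a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


def distinguishing_terms(group_a, group_b, top_n=3, alpha=1.0):
    """Rank terms by the smoothed log-ratio of their frequency in A versus B."""
    a, b = term_counts(group_a), term_counts(group_b)
    vocab = set(a) | set(b)
    # Additive smoothing (alpha) keeps unseen terms from producing zero division.
    total_a = sum(a.values()) + alpha * len(vocab)
    total_b = sum(b.values()) + alpha * len(vocab)
    scores = {
        t: math.log(((a[t] + alpha) / total_a) / ((b[t] + alpha) / total_b))
        for t in vocab
    }
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Hypothetical excerpts; real inputs would be filings and sustainability reports.
offenders = [
    "We may face certain environmental matters in the ordinary course.",
    "Certain matters may arise; we may not disclose details.",
]
non_offenders = [
    "We emitted 120 tons of CO2 and paid no penalties.",
    "Our audited emissions fell 10 percent.",
]

top = distinguishing_terms(offenders, non_offenders)
print(top)
```

In this toy data the highest-ranked terms are vague hedging words, which is the kind of distinguishing characteristic the scenario describes. On real corpora, stop-word handling and a much larger sample would be needed before drawing any conclusions.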


The current lack of ESG standardization, comparability, and transparency makes it difficult for individual investors, portfolio managers, and government agencies to fully understand a company’s sustainability. However, NLP, a core competency of Elder Research, offers techniques for data-driven monitoring and regulation development that could dramatically change the present reality for the common good.

Related Blog Posts

Elder Research and our partner HData have published blog posts on NLP trends and on NLP in Regulatory Technology (RegTech) that provide more detailed information about NLP.

Trends in Natural Language Processing

This article reviews traditional machine learning methods used in deep learning and new trends like Transfer Learning and Transformers.

Natural Language Processing for RegTech: Uncovering Hidden Patterns in Regulatory Documents

This article highlights how Natural Language Processing helps regulatory agencies, regulated enterprises, and markets understand unstructured regulatory documents without countless hours spent researching, reading, and analyzing.
