
Deep Dive on GHG Data Sources

March 12, 2025 | GHG Assessment | 11 min read

Goals of this article:

  1. Provide an overview of the different types of GHG data sources
  2. Show where GHG data sources can be found
  3. Explain how GHG data sources can be used by organisations
  4. Present the challenge of combining various data sources

As we explored in our previous blog articles, greenhouse gas (GHG) assessments rely on a combination of direct measurements, models, and standardized methodologies to evaluate GHG emissions of value chains across different scopes. In this context, understanding and accessing the many existing data sources is essential for reliable GHG emissions assessments. This article thus provides an overview of different types of GHG data sources, where they can be found, how they can be used, and finally, explains the challenges of checking their comparability.

1. What are the types of GHG data sources that can be used for GHG assessments?

GHG data sources can be broadly categorised by the scope of GHG emissions they describe and the methodology they follow. Scope 1 data often come from internal company measurements, and scope 2 data from sources that describe the direct GHG emissions related to the different energy sources an organisation purchases. Scope 3 GHG emissions are different: they are typically based on data sources that describe the sum of all GHG emissions across various supply chains (i.e. “indirect” GHG emissions).

The data sources that provide scope 3 GHG emissions can be divided into two general categories:

  1. Physical basis: Life cycle assessment (LCA) databases are key examples of data sources that describe GHG emissions on a “physical” basis (i.e. per unit of mass, energy, or volume) for the input or output products, services or systems of an organisation. This type of database provides high granularity information that is based on “bottom-up” models of value chains, which are sometimes limited in their description of the components of value chains (i.e. limited boundary on the model of the value chain).
  2. Economic basis: Environmentally Extended Input-output (EE-I/O) databases describe GHG emissions per unit of economic value ($) for various sectors of the economy. While less detailed (i.e. “top-down” model of human activities), they are typically more comprehensive in the consideration of all components of value chains and they offer a basis for average sector-scale estimates of GHG emissions.
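
To make the two bases concrete, the short sketch below estimates the emissions of the same purchase both ways. The factor values are invented purely for illustration and do not come from any database, and the function and variable names are hypothetical.

```python
def estimate_emissions(activity_amount: float, emission_factor: float) -> float:
    """Return emissions in kg CO2e as activity amount times emission factor.

    The units must match: kg with kg CO2e/kg (physical basis),
    or $ with kg CO2e/$ (economic basis).
    """
    return activity_amount * emission_factor


# Illustrative numbers only (assumptions, not values from a real database).
steel_mass_kg = 1_000.0    # purchased quantity
steel_spend_usd = 800.0    # amount paid for the same purchase
physical_factor = 2.0      # kg CO2e per kg, as an LCA database might provide
economic_factor = 1.5      # kg CO2e per $, as an EE-I/O database might provide

physical_estimate = estimate_emissions(steel_mass_kg, physical_factor)
economic_estimate = estimate_emissions(steel_spend_usd, economic_factor)

print(f"Physical basis: {physical_estimate:.0f} kg CO2e")
print(f"Economic basis: {economic_estimate:.0f} kg CO2e")
```

The two estimates will generally differ because the underlying models differ, which is one reason to record the basis of a factor alongside its value.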

Data sources for scope 3 GHG emissions can also be classified by how specifically they represent activities, products, services or systems in different contexts. The table below presents such data sources, ordered by how well they can offer a representative picture of GHG emissions for specific value chains.

| Categories | Sources of Data | Examples | Use |
| --- | --- | --- | --- |
| Organisation-specific studies and their related datasets | Based on Governmental Standards and Regulations | Datasets made by following the guidelines of EU PEF or ISO 14067 | Ensuring compliance of specific legislative requirements from different regions in the world |
| Organisation-specific studies and their related datasets | Based on standards from Corporate Consortium | Datasets made by following the guidelines of standards like PACT, TfS, and Catena-X, GHG Protocol | Ensuring data-sharing compliance that streamline industry collaborations with common guidelines |
| Background Databases | Governmental or International Agencies | IPCC, BEIS/Defra, EPA, ADEME, UNFCCC, UBA, UVEK | Average data recognized by many stakeholders and governmental organisations that created GHG assessment standards |
| Background Databases | LCA or EE-I/O | ecoinvent, Sphera/GaBi, Exiobase, ESU database, CEDA | Provides GHG emission factors for representations of average supply chain models for products, services and systems around the world |
| Background Databases | Sector specific | Plastics Europe, World Food LCA Database, Japan Iron and Steel Federation (JISF), Worldsteel | Provides GHG emission factors for products and services of specific sectors of the economy typically for some regions of the world |
| AI-Tools | Various companies that specialize in AI-based models of value chains | Makersite, Google Cloud Datasets, Amaru | Provides specific evaluations of GHG emission factors built on information of background databases. |

As shown in the table above, several databases compile broad datasets for various industries, including ecoinvent, Sphera, and initiatives such as GLAD. Organizational sources, such as the IEA, IPCC, and GHG Protocol, offer GHG emissions data that is validated for specific assessment standards. AI-driven tools, including Makersite, use artificial intelligence to generate GHG emission factors for specific organisations. Data-sharing initiatives, such as PACT, Catena-X, and TfS, facilitate the exchange of GHG emissions data between companies in a standardized format. Government-defined frameworks, such as Environmental Product Declarations (EPDs) and the Product Environmental Footprint (PEF), establish recognized methodologies for companies to evaluate scope 3 GHG emission factors for specific products, services and systems. Each of these sources differs in level of detail, data format, and applicability, which can hinder their integration into GHG assessments for organisations.

2. Where can we find GHG emission factors to calculate a company’s scope 3 emissions?

The data sources mentioned in the previous section can be accessed in different ways.

  • For organisation-specific data, the organisation is typically the direct provider of the GHG emission factors it has created. This information may be provided in reports or tables (i.e. “unstructured” data), or the organisation may have chosen a specific digital format for the shared files (i.e. “structured” data). In both cases, clear communication between organisations is important to streamline the use of their GHG emission factors.
  • For average background databases of GHG emission factors, various repository managers are the key points of contact (e.g. ecoinvent, GLAD initiative, climatiq). The information they provide is offered in various “structured” digital formats that can be accessed either with a connection to an API or by downloading the files that constitute their databases.

You can find here a list of data sources for various GHG emission factors (mainly for scope 3 data). This is by no means a complete list, but it should offer many examples that are relevant for carrying out GHG assessments of organisations. Any data provider that would like to be added to this list can reach out and we will gladly include them.

3. How can I use this information in my GHG assessment?

The procedures to access and use GHG emission factors from various data sources fall into three broad categories that require different levels of expertise and effort.

  • Manual extraction and expert support: GHG emission factors can be manually retrieved from reports or datasets from organisations to be applied in any GHG assessment for organisations or for carbon footprint calculations. This method, though straightforward, requires significant expertise to ensure proper use that aligns with the rules of GHG assessment standards. It may also require some back and forth between the user and the creator of the report/dataset to fully understand the “unstructured” data, which can take substantial amounts of time.
  • Using Life Cycle Assessment (LCA) software: Tools such as Brightway, SimaPro, OpenLCA and Sphera’s solutions can integrate various environmental databases for in-depth impact modelling and adaptations to rules of different assessment standards. While this is a more useful option than manual extraction, accessibility remains limited due to the relative complexity and cost of these digital tools.
  • Using web-based dashboards: Businesses increasingly rely on sustainability management platforms that streamline GHG emission assessments in an integrated manner. Such platforms typically use their own data formats to seamlessly integrate third-party datasets. This can be a much simpler way to carry out GHG assessments, even with limited expertise. That said, such platforms are typically aligned with selected GHG assessment standards, and the cost of access can be challenging for smaller organisations.

The connection between these procedures/tools and the data repositories presented in the previous section depends on the formats and metadata used. If repositories provide “unstructured” data in reports or datasets, their use in digital solutions can be limited or impossible. In addition, GHG emission factors that are not contextualized with relevant metadata can be difficult to align with tools that follow the rules of specific GHG assessment standards.
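
As a rough illustration of why metadata matters, the sketch below flags emission factor records that lack the context a tool would need. The list of required fields and the record layout are assumptions made for this example; they are not taken from any particular standard or platform.

```python
# Hypothetical set of metadata fields a tool might need (an assumption).
REQUIRED_METADATA = ("unit", "system_boundary", "reference_year", "region", "standard")


def missing_metadata(factor: dict) -> list[str]:
    """Return the required metadata fields that are absent or empty."""
    return [field for field in REQUIRED_METADATA if not factor.get(field)]


# Two illustrative records with invented values.
well_described = {
    "name": "steel, generic",
    "value": 2.0,
    "unit": "kg CO2e/kg",
    "system_boundary": "cradle-to-gate",
    "reference_year": 2022,
    "region": "EU",
    "standard": "ISO 14067",
}
bare_value = {"name": "steel, generic", "value": 2.0}

print(missing_metadata(well_described))
print(missing_metadata(bare_value))
```

A record like `bare_value` may still hold a correct number, but without the missing fields a tool cannot tell whether it fits the rules of the assessment standard being followed.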

4. Can I use all data sources together to get a comprehensive assessment?

In an ideal world, GHG emission factors from different sources would be fully consistent, both in their data formats and in how they respect the rules of GHG assessment standards. Such a situation would allow seamless integration and combination in digital tools or platforms to provide a more comprehensive model of value chains. Unfortunately, inconsistencies between datasets are currently the norm and pose significant challenges.

At the technical level, the lack of harmonization between formats of GHG data sources means that digital solutions have to be built with many different strategies to transfer all relevant information on a common basis without losing some important metadata. This is a challenge that cannot be easily resolved since data providers have used different formats for a long time and they would need to spend substantial resources to bring all their data into a common format.
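
A minimal sketch of one such strategy is shown below: each provider's field names are mapped onto a common set, and anything that has no mapping is kept rather than dropped. The provider names, field names and mappings are all hypothetical and only serve to illustrate the idea.

```python
# Hypothetical per-provider mappings from native field names to common ones.
FIELD_MAPS = {
    "provider_a": {"ef": "value", "uom": "unit", "geo": "region"},
    "provider_b": {"co2e_factor": "value", "denominator_unit": "unit", "country": "region"},
}


def to_common_format(record: dict, provider: str) -> dict:
    """Rename known fields and keep unknown ones under 'extra'.

    Keeping unmapped fields avoids silently losing metadata
    (for example an allocation rule) during the conversion.
    """
    mapping = FIELD_MAPS[provider]
    common = {"source": provider, "extra": {}}
    for key, val in record.items():
        if key in mapping:
            common[mapping[key]] = val
        else:
            common["extra"][key] = val
    return common


record_a = {"ef": 2.0, "uom": "kg", "geo": "EU", "allocation": "mass"}
record_b = {"co2e_factor": 1.8, "denominator_unit": "kg", "country": "JP"}

print(to_common_format(record_a, "provider_a"))
print(to_common_format(record_b, "provider_b"))
```

Even in this toy case, each new provider needs its own mapping, which hints at the effort involved when real formats differ in structure and not only in field names.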

At the conceptual level, scope 3 GHG emission factors from different data sources can also be based on various models that are not aligned. For instance, they can present: 

  • Variations in system boundaries: Some data sources describe value chains from a cradle-to-grave perspective (i.e. from natural resource extraction to waste treatment), while others use a cradle-to-gate perspective (i.e. from natural resource extraction to manufacturing of a product). Combining the GHG emission factors of such data sources would thus be unfair to products that are modelled with the more comprehensive cradle-to-grave perspective.
  • Differences in modelling assumptions: Data sources can currently make diverging modelling choices to fit with the methodological requirements of a specific GHG assessment standard. This can mean that activities from value chains with co-products can be described with different allocation rules, that some GHG emissions might not be considered or that the attribution of GHG emissions to recycling activities might differ. Combining data sources that make different choices on these modelling aspects would thus be inconsistent and potentially unfair.
  • Temporal misalignment: Data sources are updated periodically, but at different rates that depend on the human resources available to data managers and on how often organisations share updated descriptions of their value chains. As a result, GHG assessments based on different versions may yield different results.
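
The three kinds of misalignment listed above can be turned into a simple pre-check before two factors are combined. The sketch below is only an illustration: the class, its fields and the three-year tolerance are assumptions, and a real check would follow the rules of the chosen assessment standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactorDescription:
    """Hypothetical description of the modelling context of one factor."""
    name: str
    system_boundary: str   # e.g. "cradle-to-gate" or "cradle-to-grave"
    allocation_rule: str   # e.g. "mass" or "economic"
    reference_year: int


def comparability_issues(a: FactorDescription, b: FactorDescription,
                         max_year_gap: int = 3) -> list[str]:
    """List the aspects on which two factors are not aligned.

    max_year_gap is an arbitrary tolerance chosen for this example.
    """
    issues = []
    if a.system_boundary != b.system_boundary:
        issues.append("system boundary")
    if a.allocation_rule != b.allocation_rule:
        issues.append("allocation rule")
    if abs(a.reference_year - b.reference_year) > max_year_gap:
        issues.append("reference year")
    return issues


steel = FactorDescription("steel", "cradle-to-gate", "mass", 2022)
packaging = FactorDescription("packaging", "cradle-to-grave", "mass", 2016)

print(comparability_issues(steel, packaging))
```

An empty list does not prove that two factors are comparable; it only means that none of the checked aspects differ.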

Additionally, combining data sources is significantly limited when they model value chains at different levels of activity aggregation. A lack of granularity can hinder reliable connections between the parts of a value chain that belong to different organisations.

For organisations conducting GHG assessments, these discrepancies highlight the need for careful selection of GHG emission factors from different data sources that can align with the requirements of the assessment standard that they want to follow. At the moment, this typically means that a reliable combination of GHG data sources can only be done with the support of environmental experts.

As the field of environmental data continues to evolve, achieving greater interoperability between GHG emissions data sources will be key to advancing reliable, streamlined GHG reporting. Future blog posts will explore how modelling choices affect GHG emission assessments and what steps can be taken to improve data consistency across reporting frameworks to ensure fairer comparison between the scope 3 GHG emission factors of different products, services and systems.

 
