Hi everyone! I’m excited to share the results of my Laidlaw Leadership & Research Project: "Bridging Data Gaps: Utilizing NLP to Extract Tables from Sustainability Reports."
Sustainability reports play an important role in tracking and communicating companies' progress toward Environmental, Social, and Governance (ESG) goals. However, these reports often contain dense tables embedded within inconsistent layouts, which makes data extraction difficult. My project aimed to tackle this issue by developing an algorithm that accurately detects and extracts tables, transforming them into structured formats ready for analysis.
Key Outcomes
- High Performance: The algorithm achieved a recall of 93%, meaning it detected 93% of the tables actually present in the reports it was evaluated on, so few tables were missed.
- Enhanced Accessibility: By converting complex table structures into accessible formats like CSV, this tool enables stakeholders—investors, regulators, and businesses—to make data-driven decisions more efficiently.
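To make these two outcomes concrete, here is a minimal sketch of how recall is computed for table detection and how an extracted table can be written out as CSV. This is not the project's actual code: the function names `recall` and `table_to_csv`, the counts, and the sample rows are all hypothetical and chosen only for illustration.

```python
import csv
import io


def recall(true_positives: int, false_negatives: int) -> float:
    """Recall = tables correctly detected / all tables actually present."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("recall is undefined when there are no actual tables")
    return true_positives / total


def table_to_csv(rows: list[list[str]]) -> str:
    """Serialize a table (a list of rows, each a list of cells) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical counts, chosen only to illustrate a 93% recall:
# 93 tables detected, 7 tables missed.
print(recall(true_positives=93, false_negatives=7))

# Hypothetical table rows, not taken from any real report.
sample = [
    ["Metric", "2022", "2023"],
    ["Scope 1 emissions (tCO2e)", "1,200", "1,050"],
]
print(table_to_csv(sample))
```

Note that recall only counts missed tables. It says nothing about false detections, which would be captured by precision. The `csv` writer quotes cells that contain commas (such as "1,200"), so the values survive a round trip into a spreadsheet or data-analysis tool.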
Why It Matters
Efficiently extracting sustainability data aligns with the United Nations Sustainable Development Goals (SDGs) by promoting transparency and accountability in corporate ESG reporting. This work not only simplifies data access but also helps companies and stakeholders assess sustainability performance, track progress, and drive meaningful change.
What’s Next?
Moving forward, I aim to refine the algorithm, expand its application to larger datasets, and adapt it to handle even more diverse report layouts. Ultimately, this project bridges the gap between raw ESG data and actionable insights, contributing to a more sustainable future.
🖼️ Above, you’ll find the poster summarizing my methodology, results, and future goals. I’d love to hear your feedback or answer any questions you might have about the project!