Enhanced data extraction from industrial drawings

Data extraction for a large EPC company.


A large engineering, procurement, and construction (EPC) company wanted to improve accuracy and reduce the time its workforce spent working with engineering drawings in order to optimize its cost-estimation process. Across the industry, improper cost estimation leads to cost overruns, delayed schedules, scope creep, and many other project-related failures. A proper estimate minimizes overall risk, promotes good work practices, and ensures a greater level of customer satisfaction.


To tackle this problem, we developed a machine learning application that extracts information from Piping & Instrumentation diagrams (P&IDs) and other engineering drawings by combining deep learning models with computer vision techniques. The product, DataSeer, automatically detects and identifies all instances of relevant symbols and their associated text. Detection is rotation- and scale-invariant and works across diagrams of different styles, which reduces fatigue-induced human error as well as overall processing time. Through its human-guided search-and-review feature, the customer’s users can provide symbol examples or text patterns to search for, and can spend as much time inspecting the results as they need.
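
The text-pattern side of the search-and-review feature described above can be pictured with a minimal sketch. This is an illustration only, not DataSeer's implementation: it assumes text has already been recognized and located on the drawing, and the TextBox record, the find_matches and count_by_text helpers, and the line-number format are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class TextBox:
    """One piece of recognized text and its position (hypothetical record shape)."""
    text: str
    x: int
    y: int


# Hypothetical line-number format: size-service-sequence-spec, e.g. 6"-PW-1001-A1B.
# Real P&ID numbering conventions differ from project to project.
LINE_NUMBER = re.compile(r'^\d+(?:\.\d+)?"-[A-Z]{1,3}-\d{3,5}-[A-Z0-9]{2,4}$')


def find_matches(boxes, pattern):
    """Return the boxes whose text matches the user-supplied pattern."""
    return [b for b in boxes if pattern.match(b.text.strip())]


def count_by_text(matches):
    """Count how many times each matched string appears on the drawing."""
    counts = {}
    for b in matches:
        key = b.text.strip()
        counts[key] = counts.get(key, 0) + 1
    return counts


# Made-up sample of recognized text; a reviewer would inspect each hit by position.
boxes = [
    TextBox('6"-PW-1001-A1B', 120, 340),
    TextBox('PT-101', 410, 95),
    TextBox('2"-CW-2050-C3', 600, 210),
    TextBox('6"-PW-1001-A1B', 880, 340),
    TextBox('NOTE 4', 50, 900),
]
hits = find_matches(boxes, LINE_NUMBER)
line_counts = count_by_text(hits)
```

Keeping the position with each match is what makes the review step possible: a person can jump to every hit on the drawing and accept or reject it, rather than trusting the count blindly.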

Compared to fully manual information extraction by a process engineer, DataSeer reduces the time taken to extract line numbers by more than 90% and the time taken to extract valves and instruments by more than 70%. Performance varies with diagram complexity. The machine learning system also allows the customer’s process engineers to extract information from raw diagram images with up to a 90% reduction in total processing time. Our customer now uses DataSeer in brownfield engineering projects and during bid phases, where P&IDs are typically not available in CAD source formats.


Using DataSeer, the EPC firm expects to reduce the time it spends identifying material counts by 90% and to bring accuracy errors below 1%. This increased efficiency and accuracy will result in lower direct costs for producing bids. Additionally, the increased accuracy and reduced time to develop a bid mitigate the inherent risk in the bid amount, especially when lump-sum pricing is needed. Improving confidence in the estimates empowers the company to make more effective bids and ultimately be awarded more projects.