Securing Your Web Site to Pass OWASP ZAP Scans with CSP Generation

Problem:

Recently, more than one client of mine approached me to help them become cyber security ready and certified. Part of that process was to pass OWASP ZAP scans with no medium or high risk items. One of these items was the inclusion of a CSP (Content Security Policy) header. This header authorizes every style and script on your site, whether served by the site itself or included from an external source, with the goal of preventing third-party code injection and continuously validating the integrity of your web site’s codebase.

CSP does offer the 'unsafe-inline' and 'unsafe-hashes' source keywords. The first allows all inline styles and scripts without listing a hash for each one, and the second extends hashes to inline event handlers and style attributes. However, relying on them is not an acceptable practice when it comes to the likes of hostedscan.com or a desktop OWASP ZAP scan: using either keyword will result in one or more high risk items being identified.
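To make the idea of a hash concrete, here is a minimal Python sketch of how a CSP hash for one inline script or style can be computed: a SHA-256 digest of the exact inline text, base64 encoded and wrapped in single quotes. The function name csp_hash is our own for illustration; it is not part of any library or of the tool described later.

```python
import base64
import hashlib


def csp_hash(inline_source: str, algorithm: str = "sha256") -> str:
    """Return a CSP source expression such as 'sha256-...' for inline code."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError("CSP hashes must use sha256, sha384, or sha512")
    # The digest is taken over the exact text between the opening and
    # closing tag, encoded as UTF-8.
    digest = hashlib.new(algorithm, inline_source.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return "'" + algorithm + "-" + encoded + "'"


# Even a single trailing space changes the hash, which is why hashes
# have to be regenerated whenever inline code on the site changes.
print(csp_hash("console.log('hello');"))
print(csp_hash("console.log('hello'); "))
```

The resulting value is what goes into a script-src or style-src directive in place of 'unsafe-inline'.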

The process becomes more complicated when your web site is developed using a framework such as WP Bakery or Elementor within a WordPress environment, because a plethora of styles and scripts are included throughout the headers and footers of the site as well as inline at various locations in the pages. The problem was compounded because every time we changed the site we had to use the Google Chrome console to generate the hashes and meticulously cut and paste them into our Apache .htaccess file (for every page on the site!) in order to pass the scan. This process was not only cumbersome and time consuming but also error prone, because the .htaccess CSP header needed to be on one line.
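The one-line requirement is easy to get wrong by hand and easy to enforce in code. The following Python sketch assembles a single-line Apache directive from lists of hashes. The function name build_htaccess_csp and the baseline policy (default-src 'self' plus 'self' in each list) are assumptions for illustration, and the directive relies on Apache's mod_headers being enabled.

```python
def build_htaccess_csp(script_hashes, style_hashes):
    """Build a single-line Apache `Header set` directive for .htaccess."""
    # Assumed baseline policy: same-origin resources plus the given hashes.
    directives = [
        "default-src 'self'",
        "script-src " + " ".join(["'self'"] + sorted(set(script_hashes))),
        "style-src " + " ".join(["'self'"] + sorted(set(style_hashes))),
    ]
    policy = "; ".join(directives)
    # A double quote or line break would break the quoted header value.
    if any(ch in policy for ch in ('"', "\n", "\r")):
        raise ValueError("policy must fit on one line without double quotes")
    return 'Header set Content-Security-Policy "' + policy + '"'


line = build_htaccess_csp(
    ["'sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU='"],
    [],
)
print(line)
```

Sorting and de-duplicating the hashes keeps the output stable between runs, so a change in the generated line reflects a real change on the site.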

Solution:

Therefore, the solution was to develop cspgenerator.ca, a versatile tool that crawls your web site either from a standards-compliant sitemap.xml or from a starting URL. It systematically goes through each page, identifies the styles and scripts, both inline and external, and creates the necessary CSP headers for your web site. Longer term, we anticipate the CSP generator being useful as an API in a continuous integration framework, where it can generate the CSP headers (and eventually other headers) automatically during your web site’s deployments.
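As a rough illustration of the two steps described above (listing pages from a sitemap, then finding the styles and scripts on each page), here is a small Python sketch using only the standard library. It is not the tool's actual implementation, which is written in PHP. The names urls_from_sitemap, CspSourceCollector, and collect_sources are hypothetical, and the sketch ignores inline style attributes and event handlers.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace used by sitemap.xml files that follow the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return the page URLs listed in a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
            if loc.text]


class CspSourceCollector(HTMLParser):
    """Collect inline and external scripts and styles from one page."""

    def __init__(self):
        super().__init__()
        self.inline_scripts, self.inline_styles = [], []
        self.external_scripts, self.external_styles = [], []
        self._current = None  # "script" or "style" while inside an inline block
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script":
            if attrs.get("src"):
                self.external_scripts.append(attrs["src"])
            else:
                self._current, self._buffer = "script", []
        elif tag == "style":
            self._current, self._buffer = "style", []
        elif tag == "link":
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel and attrs.get("href"):
                self.external_styles.append(attrs["href"])

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._current and tag == self._current:
            text = "".join(self._buffer)
            if text.strip():
                target = (self.inline_scripts if tag == "script"
                          else self.inline_styles)
                target.append(text)  # keep exact text; hashes depend on it
            self._current = None


def collect_sources(html_text):
    """Parse one page and return the populated collector."""
    collector = CspSourceCollector()
    collector.feed(html_text)
    collector.close()
    return collector
```

Each inline entry would then be hashed, and each external URL would contribute its origin to the matching script-src or style-src list.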

If you’re interested in learning more about how the process works, or would like us to investigate and potentially remediate your website, do not hesitate to contact us.

You can also evaluate the free demo version for yourself at https://cspgenerator.ca

How We Used AI to Help Create the CSP Generation Tool: Developer Perspective

The initial approach was to leverage experience from a previous project in which we extracted email addresses from websites by crawling the site and parsing the DOM to find the necessary information.

However, we quickly ran into a significant roadblock. The initial approach of simply fetching the HTML content of a page was not sufficient. Many modern websites are heavily reliant on JavaScript to render their content, and the initial crawler was not able to execute the JavaScript. This meant that we were not getting the complete and final DOM, which in turn led to incorrect CSP hashes being generated.

To solve this problem, we realized we needed a way to get the fully rendered HTML of a page, just as it appears in a user’s browser. This led us to the idea of using a headless browser. A headless browser is a web browser without a graphical user interface, which can be controlled programmatically. By using a headless browser, we could instruct it to load a page, wait for all the JavaScript to execute, and then extract the final HTML.
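For readers who want to see the idea in its simplest form, the sketch below asks headless Chrome to print the rendered DOM using its --dump-dom command-line flag, driven from Python. This is a different and simpler route than the WebDriver-based PHP setup described in this article. The binary name google-chrome, the timeout, and both function names are assumptions, and --dump-dom may not wait for every late-running script on a page.

```python
import subprocess


def dump_dom_command(url, chrome_binary="google-chrome"):
    """Build the command line that prints a page's rendered DOM."""
    # chrome_binary is an assumption; it may be "chromium" or a full
    # path depending on the system.
    return [chrome_binary, "--headless", "--disable-gpu", "--dump-dom", url]


def fetch_rendered_html(url, chrome_binary="google-chrome", timeout=60):
    """Run headless Chrome and return the serialized DOM as text."""
    result = subprocess.run(
        dump_dom_command(url, chrome_binary),
        capture_output=True,  # collect stdout instead of printing it
        text=True,
        timeout=timeout,      # avoid hanging on a page that never loads
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Calling fetch_rendered_html("https://example.com/") on a machine with Chrome installed returns the HTML after scripts have run, which is the text the CSP hashes need to be computed from.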

This is where AI, specifically Gemini, played a crucial role in the development process. While we were familiar with the concept of using a headless browser, we had not previously implemented it in a PHP environment. Gemini was instrumental in several ways:

PHP Syntax and Libraries: We were able to ask Gemini for help with PHP syntax, best practices, and how to use relevant libraries like php-webdriver and symfony/panther to control a headless Chrome browser. This saved a significant amount of time that would have otherwise been spent reading documentation and watching tutorials.

Server-Side Execution: Running a headless browser on a local development machine is one thing, but deploying it to a server and running it as a background process presented a new set of challenges. Gemini provided valuable insights and code examples on how to manage the ChromeDriver process, handle potential errors, and ensure the headless browser runs reliably in a server environment and comes back up automatically after a reboot.

Code Generation: Gemini was able to generate high-quality code snippets for various parts of the application, from the initial setup of the headless browser to the extraction of specific DOM elements. This not only accelerated the development process but also helped us write more robust and efficient code.

Ideation and Problem-Solving: Beyond just providing code, Gemini was a valuable partner in the creative process. When stuck on a particular problem, we could describe the issue to Gemini, and it would often provide fresh ideas and alternative approaches that we hadn’t considered. This collaborative process helped us to arrive at better solutions more quickly.

Conclusion

Combining AI with the right skill set can truly make work go faster and help accomplish the goals of your project. AI has made the development of this project very efficient.

Although the initial approach of simple DOM parsing proved to be insufficient, the switch to a headless browser provided a robust solution. The use of AI, particularly Gemini, was a game-changer. It bridged the gaps in the PHP development, provided practical solutions to complex problems, and ultimately enabled us to build a better application in a fraction of the time it would have taken otherwise.