Most organizations can tell you whether their firewalls are healthy. Fewer can prove every allow rule is inspected, logged, owned, and still required.
The gap between those two things is where audits become painful. Multiple firewall admins, emergency changes at 2am, quarterly reviews that turn into archaeology digs, vendor access rules that were “temporary” in February and are still there in October. Nobody disabled them because nobody noticed they were still there. No alert fires when a rule that was supposed to be temporary quietly becomes permanent.
The monitoring stack shows green: CPU fine, sessions normal, no drops. But that tells you the firewall is running, not whether it is enforcing what you think it is enforcing.
There is a better way. Encode the requirements.
If your security baseline lives in a Word document or a PDF, it is a suggestion. If it lives in pytest, it can fail a pipeline.
TL;DR#
The Principle: Security Requirements Should Be Executable#
Every security team has a baseline. It usually sounds something like this: every allow rule must log to the SIEM, internet-facing rules must have inspection profiles attached, zone protection must be applied everywhere, exceptions must have owners and expiration dates.
Written down, those are good intentions. Encoded as tests, they are enforcement.
The PAN-OS XML API returns the full running config as XML. Python’s xml.etree.ElementTree parses it. pytest turns assertions into structured pass/fail output with machine-readable results. None of these are exotic tools. The combination is a lightweight Policy-as-Code pipeline that runs in minutes and costs nothing except the time to write the first test.
The test suite does not patch configs, create rules, or modify anything. It reads the running config and reports violations. The firewall admin still fixes them manually. The automation catches them before the quarterly review does.
Control Catalog#
Each row below is a security requirement. The test column is the executable version of it. The compliance column maps it to a standard so audit teams have a reference they can cite:
| Requirement | pytest Control | Compliance Relevance |
|---|---|---|
| All allow rules log to SIEM | test_allow_rules_have_log_forwarding | SOC 2 CC6.1, PCI-DSS 10.2 |
| Allow rules have security profiles | test_allow_rules_have_security_profile_group | NIST CSF DE.CM-1 |
| No unrestricted internet allow | test_no_unrestricted_allow_from_internet | CIS PAN-OS Benchmark |
| Zone protection applied | test_zone_protection_profile_applied | PCI-DSS 1.3 |
| Critical rules still exist | test_critical_rule_exists | Change detection |
| Explicit deny for untrust zone | test_deny_all_exists_for_untrust_zone | Defense in depth |
| Service object naming standard | test_service_objects_follow_naming_convention | Operational hygiene |
Seven controls. Each one represents a class of drift that is invisible to monitoring but immediately visible to an auditor.
The Setup#
You need a read-only API user on the firewall. Never run tests with admin credentials. Tests should assert, never modify. A read-only key limits blast radius if it leaks and makes it obvious the credential should never be used for anything except reading.
Store credentials as environment variables:
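A minimal sketch of reading those settings in Python. The variable names `PANOS_HOST` and `PANOS_API_KEY` are assumptions for illustration, not names confirmed by the original suite:

```python
import os


def load_settings(env=os.environ):
    """Read firewall connection settings from environment variables.

    PANOS_HOST and PANOS_API_KEY are hypothetical names; use whatever
    your CI system and shell profile already define.
    """
    missing = [name for name in ("PANOS_HOST", "PANOS_API_KEY") if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of a confusing HTTP error later.
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {"host": env["PANOS_HOST"], "api_key": env["PANOS_API_KEY"]}
```

Failing fast on a missing variable keeps a misconfigured CI job from looking like a firewall outage.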
This same XML API pattern appears in other PAN-OS automation work. If you have read “How I Got Every Device Named in My Firewall Logs,” the approach is identical.
The client wraps the PAN-OS XML API with two methods: op() for operational commands and config() for config retrieval by XPath:
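A client along those lines could look like the sketch below. It uses only the standard library; the class name `PanosClient`, the `build_url` helper, and the `verify_tls` switch are assumptions for illustration rather than the original implementation. The API key is sent in the `X-PAN-KEY` header so it stays out of URLs and logs:

```python
import ssl
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


class PanosClient:
    """Minimal read-only wrapper around the PAN-OS XML API (a sketch)."""

    def __init__(self, host, api_key, verify_tls=True, timeout=30):
        self.base_url = f"https://{host}/api/"
        self.api_key = api_key
        self.timeout = timeout
        self.context = ssl.create_default_context()
        if not verify_tls:
            # Lab use only: skips certificate checks for self-signed certs.
            self.context.check_hostname = False
            self.context.verify_mode = ssl.CERT_NONE

    def build_url(self, **params):
        """Return the request URL; the key travels in a header, not the URL."""
        return f"{self.base_url}?{urllib.parse.urlencode(params)}"

    def _request(self, **params):
        req = urllib.request.Request(
            self.build_url(**params), headers={"X-PAN-KEY": self.api_key}
        )
        with urllib.request.urlopen(
            req, timeout=self.timeout, context=self.context
        ) as resp:
            root = ET.fromstring(resp.read())
        if root.get("status") != "success":
            raise RuntimeError(f"PAN-OS API call failed: {params}")
        return root

    def op(self, cmd):
        """Run an operational command given as XML, e.g. show system info."""
        return self._request(type="op", cmd=cmd)

    def config(self, xpath, action="show"):
        """Fetch config at an XPath; action='show' reads the running config."""
        return self._request(type="config", action=action, xpath=xpath)
```

Both methods return the parsed `<response>` element, so a test can call `findall()` on the result directly.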
scope="session" matters here. Without it, pytest creates a new client for every test. Session scope builds the client once and shares it across the whole run, so a 14-test suite sets up one client instead of 14.
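As a rough illustration, a session-scoped fixture in `conftest.py` might look like this. The stand-in `PanosClient`, the `build_client` helper, and the environment variable names are assumptions made so the sketch is self-contained:

```python
import os

import pytest


class PanosClient:
    """Stand-in for the XML API client; only construction matters here."""

    def __init__(self, host, api_key):
        self.host = host
        self.api_key = api_key


def build_client(env=os.environ):
    """Create the client from environment variables (hypothetical names)."""
    return PanosClient(env["PANOS_HOST"], env["PANOS_API_KEY"])


@pytest.fixture(scope="session")
def panos():
    """Built once per pytest run and shared by every test that requests it."""
    return build_client()
```

Any test that lists `panos` as an argument receives the same object. Dropping `scope="session"` falls back to pytest's default function scope, which rebuilds the client for each test.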
One important distinction: type=config&action=show reads the active running config, what the firewall is actually enforcing right now. Use action=get if you want to validate candidate config before a commit. For drift detection, show is what you want.
The Controls#
Smoke: Reachable and Running a Supported Version#
Control: Explicit Deny Rule Covers the Internet Zone#
PAN-OS ends every rulebase with two default rules: interzone-default denies traffic between zones, and intrazone-default allows traffic within a zone. An explicit deny rule shows intent, enables custom logging profiles, and survives zone renaming. If it disappears after a config change, this test catches it:
Control: Zone Protection Profiles Applied#
pytest parametrization lets one function cover every zone. One test function, five zones, five pass/fail results with distinct names in the output:
Control: Critical Rules Still Exist#
Rules that enable core infrastructure should be present after every change window. If someone accidentally deleted a rule or renamed it, this catches it before the next traffic complaint:
Control: No Unrestricted Allow from the Internet#
Control: Allow Rules Have Security Profile Groups#
An allow rule with no security profile group forwards traffic with App-ID enforcement but zero Content-ID inspection. No antivirus scan. No vulnerability protection. No URL filtering. The traffic is identified and allowed, but not inspected. This control flags every allow rule operating without a profile group attached:
Control: All Allow Rules Forward Logs to SIEM#
A rule that does not ship logs to your SIEM is invisible to threat detection. The session happens, the traffic flows, and the SIEM never sees it:
Control: Service Object Naming Convention#
Every service object should follow tcp-PORT or udp-PORT. A test enforces this so typos or legacy names get flagged before they spread:
The Exception Model#
No real organization runs zero exceptions. Vendor migration windows, legacy protocol incompatibilities, time-bounded access for third parties: legitimate exceptions exist. The problem is not the exceptions themselves. The problem is when exceptions are undocumented, unowned, and never expire.
The solution is to make exceptions explicit. They live in a file, have owners, have tickets, and have expiration dates. When the expiration date passes, the exception stops working automatically. No manual cleanup required.
Load it in the test suite:
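A sketch of the filtering step, assuming the file has already been parsed into a list of dictionaries (for example with PyYAML's `yaml.safe_load`). The field names `rule`, `control`, `owner`, `ticket`, and `expires` are hypothetical:

```python
from datetime import date


def active_exceptions(entries, control, today=None):
    """Return the rule names with an unexpired exception for `control`.

    Each entry is assumed to look like:
        {"rule": "vendor-sftp", "control": "security_profile_group",
         "owner": "netops", "ticket": "CHG-0001", "expires": "2025-03-31"}
    """
    today = today or date.today()
    active = set()
    for entry in entries:
        # A YAML loader may yield a date object or a string; str() of a date
        # is already ISO formatted, so both parse the same way.
        expires = date.fromisoformat(str(entry["expires"]))
        if entry["control"] == control and expires >= today:
            active.add(entry["rule"])
    return active
```

An exception stays valid through its expiration date and stops applying the day after, with no edit to the file.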
Update test_allow_rules_have_security_profile_group to respect exceptions:
When the expiration date passes, the exception entry no longer suppresses the failure. The test starts failing again on its own. No one has to remember to clean it up. The exceptions.yaml file, committed to git, also becomes documentation. Audit teams can see every known exception, who owns it, what ticket authorized it, and when it was supposed to end.
This turns “we know about it” into something documentable: a time-bounded, owner-assigned, ticket-referenced exception that expires automatically.
What It Found on the PA-440#
I run this control catalog against a PA-440 running PAN-OS 11.2.11. On the first run, 11 passed and 3 failed. The controls found real gaps:
Finding 1: 14 allow rules with no security profile group. Rules handling WireGuard tunnels, SSH jump connections, name resolution, and Cloudflare Tunnel traffic were forwarding packets with App-ID enforcement but no Content-ID inspection. Not all of these are misconfigured. Some are deliberately infrastructure-to-infrastructure rules where inspection adds overhead and limited value. But the control surfaced all of them in one pass. The ones touching external traffic got profile groups added. The rest got documented as explicit exceptions with owners and expiration dates. Before this ran, neither list existed.
Finding 2: Allow rules missing log forwarding. Several rules were not shipping session logs to the SIEM. Locally buffered logs meant alerts could fire inside the firewall but never reach centralized analysis. Fixed by attaching the log forwarding profile to each affected rule.
Finding 3: tcp-all service object. This is a built-in PAN-OS service representing all TCP ports. It does not follow tcp-PORT convention because it has no specific port. Added to the allowlist in the test. The naming control still catches anything else that does not conform.
The first finding is the one that matters. Before this test, there was no visibility into which allow rules were operating without inspection profiles attached. The control found it in under 30 seconds.
Running the Controls#
CI Integration#
The controls are most useful when they run automatically. The pattern below triggers from Semaphore after every config backup job completes:
The --junit-xml flag is important. The XML report becomes audit evidence: timestamped, structured, showing which controls passed, against which firewall, at what time.
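To show how that report can be consumed, here is a small sketch that summarizes a JUnit XML file with the standard library. The function name and the returned dictionary shape are assumptions; the element and attribute names follow the report layout pytest writes:

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str) -> dict:
    """Return the run timestamp, test count, and failed test names."""
    root = ET.fromstring(xml_text)
    # The root may be <testsuites> wrapping suites, or a single <testsuite>.
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    total = 0
    failed = []
    for suite in suites:
        for case in suite.findall("testcase"):
            total += 1
            if case.find("failure") is not None or case.find("error") is not None:
                failed.append(case.get("name"))
    timestamp = suites[0].get("timestamp") if suites else None
    return {"timestamp": timestamp, "total": total, "failed": failed}
```

A summary like this can go into a ticket or chat notification while the full XML file is archived as the evidence itself.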
Store the report as a CI artifact. Attach it to SOC 2 evidence packages. Reference it in PCI-DSS firewall review documentation. Instead of manually assembling a spreadsheet of what you checked and when, the pipeline generates it automatically on every run.
Every passing run is a dated attestation that the baseline was verified. Every failing run is an alert before the auditor finds it.
Scaling to Enterprise: Panorama Fleet#
This pattern runs against one firewall. The same approach scales to an entire managed fleet via Panorama.
The XML API is identical across single devices and Panorama. Targeting a specific managed firewall uses a target parameter with the device serial number:
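A sketch of how such a request URL could be built. The function name, host, and serial number are hypothetical; as in the single-firewall case, the API key would travel in a request header, not the query string:

```python
import urllib.parse


def panorama_config_url(panorama_host: str, serial: str, xpath: str, action: str = "show") -> str:
    """Build a config request that Panorama relays to one managed firewall."""
    params = {
        "type": "config",
        "action": action,
        "xpath": xpath,
        "target": serial,  # serial number of the managed device
    }
    return f"https://{panorama_host}/api/?{urllib.parse.urlencode(params)}"
```

Because only the `target` value changes per device, the rest of the client and every control can stay as they are.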
Parametrize the test suite over a list of serial numbers. One test run covers every branch firewall. One CI job generates one report per device. test_allow_rules_have_security_profile_group becomes a compliance sweep across every managed device in the organization.
One note on Panorama: be aware of pre-rulebase and post-rulebase distinctions when querying managed device configs. Rules pushed from device groups live in pre/post rulebase paths, not the local vsys rulebase. Adjust the XPath accordingly if your fleet relies heavily on Panorama-pushed policy.
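For illustration, these are the XPaths involved when reading Panorama's own configuration for a device group. The helper names and the device group name are hypothetical, and `vsys1` is assumed for the local rulebase:

```python
DEVICE = "/config/devices/entry[@name='localhost.localdomain']"


def local_rule_xpath(vsys: str = "vsys1") -> str:
    """XPath of the rulebase defined locally on a firewall."""
    return f"{DEVICE}/vsys/entry[@name='{vsys}']/rulebase/security/rules"


def panorama_rule_xpaths(device_group: str) -> list:
    """XPaths of the pre- and post-rulebase for one Panorama device group."""
    base = f"{DEVICE}/device-group/entry[@name='{device_group}']"
    return [
        f"{base}/{rulebase}/security/rules"
        for rulebase in ("pre-rulebase", "post-rulebase")
    ]
```

Running the same controls over all three paths covers locally defined and Panorama-pushed policy, provided the results are reported per rulebase so ownership stays clear.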
Get the Code#
The full control suite is available on GitHub, including synthetic XML fixtures so you can run the demo without a live firewall:
github.com/mareox/panos-pytest-baseline
Runs immediately. No PA-440 required.
Takeaways#
- Drift is invisible until the audit. Executable controls make it visible in seconds.
- Requirements in a PDF are suggestions. Requirements in pytest can fail a pipeline.
- Use a read-only API user. Tests assert, never modify. A scoped key limits blast radius.
- The exception model turns “we know about it” into documented, time-bounded, owner-assigned evidence. Expired exceptions automatically start failing again. No manual cleanup needed.
- CI output is audit evidence. The JUnit XML report is a timestamped attestation of what was verified and when. Stop assembling that spreadsheet manually.
A PA-440 at home is a bit much. But it turns out “which of your allow rules are missing inspection profiles” is a question worth being able to answer in 30 seconds, whether you manage one firewall or a hundred.