trailofbits.python.lxml-in-pandas.lxml-in-pandas

Author
unknown
Found usage of the $FLAVOR library, which is vulnerable to attacks such as XML external entity (XXE) attacks
Definition
rules:
  - id: lxml-in-pandas
    message: Found usage of the `$FLAVOR` library, which is vulnerable to attacks
      such as XML external entity (XXE) attacks
    languages:
      - python
    severity: ERROR
    metadata:
      category: security
      cwe: "CWE-611: Improper Restriction of XML External Entity Reference"
      subcategory:
        - vuln
      confidence: HIGH
      likelihood: MEDIUM
      impact: MEDIUM
      technology:
        - pandas
      description: Potential XXE attacks from loading `lxml` in pandas
      references:
        - https://lxml.de/FAQ.html
      license: CC-BY-NC-SA-4.0
    pattern-either:
      - patterns:
          - pattern: pandas.read_html($IO)
          - pattern-not: pandas.read_html(**$KWARGS)
      - patterns:
          - metavariable-pattern:
              metavariable: $FLAVOR
              patterns:
                - pattern: ...
                - pattern-not: |
                    "bs4"
                - pattern-not: |
                    "html5lib"
          - pattern-either:
              - pattern: pandas.read_html(..., flavor=$FLAVOR, ...)
              - patterns:
                  - pattern-inside: |
                      $KWARGS = {..., "flavor": $FLAVOR, ...}
                      ...
                  - pattern: |
                      pandas.read_html(**$KWARGS)
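The matching logic of the rule can be hard to read from the nested patterns alone. The following is a rough sketch, using Python's standard `ast` module, of what the two `pattern-either` branches appear to select: a `read_html` call with a single argument, or one whose `flavor` (passed directly or through a dict literal unpacked with `**`) is anything other than the literals `"bs4"` or `"html5lib"`. The names `flagged_lines`, `ACCEPTED_FLAVORS`, and `SAMPLE` are hypothetical, and this is only an approximation: it is not how Semgrep itself works, it does not resolve import aliases, and it ignores statement order.

```python
import ast

# Flavors excluded by the rule's two `pattern-not` clauses.
ACCEPTED_FLAVORS = {"bs4", "html5lib"}


def _is_accepted(node: ast.expr) -> bool:
    """True only for the string literals "bs4" and "html5lib"."""
    return isinstance(node, ast.Constant) and node.value in ACCEPTED_FLAVORS


def flagged_lines(source: str) -> list[int]:
    """Return line numbers of read_html calls this approximation would flag."""
    tree = ast.parse(source)

    # Pass 1: remember the "flavor" value of dict literals bound to a name,
    # mirroring `$KWARGS = {..., "flavor": $FLAVOR, ...}`.
    dict_flavor: dict[str, ast.expr] = {}
    for node in ast.walk(tree):
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)
                and isinstance(node.value, ast.Dict)):
            for key, value in zip(node.value.keys, node.value.values):
                if isinstance(key, ast.Constant) and key.value == "flavor":
                    dict_flavor[node.targets[0].id] = value

    # Pass 2: classify every `<name>.read_html(...)` call.
    hits = set()
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "read_html"):
            continue
        if len(node.args) == 1 and not node.keywords:
            # First branch: a single argument, so no flavor is given.
            hits.add(node.lineno)
            continue
        for kw in node.keywords:
            if kw.arg == "flavor":
                flavor = kw.value                      # flavor=... keyword
            elif kw.arg is None and isinstance(kw.value, ast.Name):
                flavor = dict_flavor.get(kw.value.id)  # **kwargs unpacking
            else:
                continue
            if flavor is not None and not _is_accepted(flavor):
                hits.add(node.lineno)
    return sorted(hits)


SAMPLE = "\n".join([
    "import pandas as pd",                             # line 1
    "pd.read_html(page)",                              # line 2: flagged
    'pd.read_html(page, flavor="lxml")',               # line 3: flagged
    'kwargs = {"io": page, "flavor": "lxml"}',         # line 4
    "pd.read_html(**kwargs)",                          # line 5: flagged
    'pd.read_html(page, flavor="bs4")',                # line 6: ok
    'ok_kwargs = {"io": page, "flavor": "html5lib"}',  # line 7
    "pd.read_html(**ok_kwargs)",                       # line 8: ok
])

print(flagged_lines(SAMPLE))  # [2, 3, 5]
```

Note that a non-literal flavor such as `flavor=name` is also reported, because the `metavariable-pattern` accepts any expression except the two quoted literals.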
Examples
lxml-in-pandas.py
import pandas as pd
import pandas

touch = 1
touch2 = 2
touch3 = 3

# ruleid: lxml-in-pandas
pd.read_html(touch)

# ruleid: lxml-in-pandas
pandas.read_html(touch, flavor="lxml")

kwargs = {"io": touch, "flavor": "lxml"}
# ruleid: lxml-in-pandas
pd.read_html(**kwargs)

# ok: lxml-in-pandas
pandas.read_html(touch, flavor="bs4")

# ok: lxml-in-pandas
pd.read_html(touch, flavor="html5lib")

kwargs2 = {"io": touch, "flavor": "html5lib"}
# ok: lxml-in-pandas
pd.read_html(**kwargs2)
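The `ok` cases above suggest the intended fix: always pass an explicit `flavor` of `"bs4"` or `"html5lib"`. One way to enforce that across a codebase is a small helper that builds the keyword arguments. This is a minimal sketch, not part of the rule; the name `safe_flavor_kwargs` is hypothetical, and whether the rule's matcher accepts a call written this way has not been checked here.

```python
# Flavors the rule treats as acceptable.
ACCEPTED_FLAVORS = ("bs4", "html5lib")


def safe_flavor_kwargs(flavor: str = "bs4", **kwargs) -> dict:
    """Build keyword arguments for pandas.read_html with a non-lxml flavor."""
    if flavor not in ACCEPTED_FLAVORS:
        raise ValueError(
            f"flavor must be one of {ACCEPTED_FLAVORS}, got {flavor!r}"
        )
    # The validated flavor is written last so it cannot be overridden.
    return {**kwargs, "flavor": flavor}


print(safe_flavor_kwargs(match="Totals"))
# {'match': 'Totals', 'flavor': 'bs4'}
```

A call site could then read `pandas.read_html(page, **safe_flavor_kwargs(match="Totals"))`, assuming `beautifulsoup4` and `html5lib` are installed alongside pandas.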
Short Link: https://sg.run/1z1G