python.lang.security.audit.dynamic-urllib-use-detected.dynamic-urllib-use-detected

Author: semgrep
Download count: 70,993

Detected a dynamic value being used with urllib. urllib supports 'file://' schemes, so a dynamic value controlled by a malicious actor may allow them to read arbitrary files. Audit uses of urllib calls to ensure user data cannot control the URLs, or consider using the 'requests' library instead.
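
One way to act on this advice is to check the scheme against an allowlist before the URL reaches urllib. The following is a minimal sketch, not part of the rule: the names `ALLOWED_SCHEMES`, `validate_url`, and `fetch` are hypothetical, and the http/https-only policy is an assumption that may not suit every application.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen

# Assumed policy for this sketch: only plain web URLs are acceptable.
ALLOWED_SCHEMES = frozenset({"http", "https"})


def validate_url(url):
    """Return url unchanged if it is http(s) with a host, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        # Rejects file://, ftp://, and scheme-less values such as bare paths.
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    return url


def fetch(url, timeout=10):
    """Fetch a URL only after it has passed the allowlist check."""
    with urlopen(validate_url(url), timeout=timeout) as response:
        return response.read()
```

Because this is an audit rule that matches any non-literal argument, a call such as `urlopen(validate_url(url))` would still be reported; the check is what a reviewer would look for when triaging the finding.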

Definition

rules:
  - id: dynamic-urllib-use-detected
    patterns:
      - pattern-not: urllib.$W("...")
      - pattern-not: urllib.request.$W("...")
      - pattern-not: $OPENER.$W("...")
      - pattern-either:
          - pattern: urllib.urlopen(...)
          - pattern: urllib.request.urlopen(...)
          - pattern: urllib.urlretrieve(...)
          - pattern: urllib.request.urlretrieve(...)
          - patterns:
              - pattern-either:
                  - pattern-inside: |
                      $OPENER = urllib.URLopener(...)
                      ...
                  - pattern-inside: |
                      $OPENER = urllib.request.URLopener(...)
                      ...
                  - pattern-inside: |
                      $OPENER = urllib.FancyURLopener(...)
                      ...
                  - pattern-inside: |
                      $OPENER = urllib.request.FancyURLopener(...)
                      ...
              - pattern-either:
                  - pattern: $OPENER.open(...)
                  - pattern: $OPENER.retrieve(...)
    message: Detected a dynamic value being used with urllib. urllib supports
      'file://' schemes, so a dynamic value controlled by a malicious actor may
      allow them to read arbitrary files. Audit uses of urllib calls to ensure
      user data cannot control the URLs, or consider using the 'requests'
      library instead.
    metadata:
      cwe:
        - "CWE-939: Improper Authorization in Handler for Custom URL Scheme"
      owasp: A01:2017 - Injection
      source-rule-url: https://github.com/PyCQA/bandit/blob/b1411bfb43795d3ffd268bef17a839dee954c2b1/bandit/blacklists/calls.py#L163
      bandit-code: B310
      asvs:
        section: "V5: Validation, Sanitization and Encoding Verification Requirements"
        control_id: 5.2.4 Dynamic Code Execution Features
        control_url: https://github.com/OWASP/ASVS/blob/master/4.0/en/0x13-V5-Validation-Sanitization-Encoding.md#v52-sanitization-and-sandboxing-requirements
        version: "4"
      category: security
      technology:
        - python
      references:
        - https://cwe.mitre.org/data/definitions/939.html
      subcategory:
        - audit
      likelihood: LOW
      impact: LOW
      confidence: LOW
      license: Commons Clause License Condition v1.0[LGPL-2.1-only]
      vulnerability_class:
        - Improper Authorization
    languages:
      - python
    severity: WARNING
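
To make the matching logic concrete, here is a rough approximation of the rule's first branch written with Python's `ast` module. It is only a sketch under stated assumptions: it covers the four direct `urlopen`/`urlretrieve` patterns, skips the opener-object branch, and does not reproduce anything else Semgrep's engine may do (such as constant propagation). The function names are hypothetical.

```python
import ast

# The four direct call patterns listed under pattern-either in the rule.
FLAGGED = {
    "urllib.urlopen",
    "urllib.request.urlopen",
    "urllib.urlretrieve",
    "urllib.request.urlretrieve",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def is_single_string_literal_call(call):
    """Approximate the pattern-not clauses: exactly one string-literal argument."""
    return (
        len(call.args) == 1
        and not call.keywords
        and isinstance(call.args[0], ast.Constant)
        and isinstance(call.args[0].value, str)
    )


def find_dynamic_urllib_calls(source):
    """Return sorted line numbers of calls this approximation would flag."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and dotted_name(node.func) in FLAGGED:
            if not is_single_string_literal_call(node):
                findings.append(node.lineno)
    return sorted(findings)
```

For a source file whose lines 2 to 4 call `urllib.request.urlopen('file:///bin/ls')`, `urllib.request.urlopen(url)`, and `urllib.request.urlretrieve('file:///bin/ls', '/bin/ls2')`, this sketch reports lines 3 and 4, which lines up with the `ok` and `ruleid` annotations in the test file.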

Examples

dynamic-urllib-use-detected.py

# cf. https://github.com/PyCQA/bandit/blob/694dfaa370cce54ea23169123554598bad0e1be6/examples/urlopen.py

''' Example dangerous usage of urllib[2] opener functions

The urllib and urllib2 opener functions and object can open http, ftp,
and file urls. Often, the ability to open file urls is overlooked leading
to code that can unexpectedly open files on the local server. This
could be used by an attacker to leak information about the server.
'''


import urllib
import urllib2

# Python 3
import urllib.request

def test_urlopen():
    # urllib
    url = urllib.quote('file:///bin/ls')
    # ruleid:dynamic-urllib-use-detected
    urllib.urlopen(url, 'blah', 32)

    # Flagged even though the URL is a constant: the rule only excludes calls
    # whose sole argument is a string literal, and this call passes two arguments.
    # ruleid:dynamic-urllib-use-detected
    urllib.urlretrieve('file:///bin/ls', '/bin/ls2')
    opener = urllib.URLopener()

    # This is OK because it's a constant.
    # ok:dynamic-urllib-use-detected
    opener.open('file:///bin/ls')
    # ok:dynamic-urllib-use-detected
    opener.retrieve('file:///bin/ls')
    opener2 = urllib.FancyURLopener()
    # ok:dynamic-urllib-use-detected
    opener2.open('file:///bin/ls')
    # ok:dynamic-urllib-use-detected
    opener2.retrieve('file:///bin/ls')

    # ruleid:dynamic-urllib-use-detected
    opener.open(url)
    # ruleid:dynamic-urllib-use-detected
    opener.retrieve(url)
    # ruleid:dynamic-urllib-use-detected
    opener2.open(url)
    # ruleid:dynamic-urllib-use-detected
    opener2.retrieve(url)

    # Python 3
    # ok:dynamic-urllib-use-detected
    urllib.request.urlopen('file:///bin/ls')
    # ruleid:dynamic-urllib-use-detected
    urllib.request.urlretrieve('file:///bin/ls', '/bin/ls2')
    opener = urllib.request.URLopener()
    # ok:dynamic-urllib-use-detected
    opener.open('file:///bin/ls')
    # ok:dynamic-urllib-use-detected
    opener.retrieve('file:///bin/ls')
    opener2 = urllib.request.FancyURLopener()
    # ok:dynamic-urllib-use-detected
    opener2.open('file:///bin/ls')
    # ok:dynamic-urllib-use-detected
    opener2.retrieve('file:///bin/ls')
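
The docstring's point that urllib opens file URLs can be checked without any network access. The sketch below assumes Python 3; it writes a throwaway file to a temporary directory and reads it back through `urllib.request.urlopen`. The helper name `read_via_file_url` and the file contents are made up for illustration.

```python
import tempfile
import urllib.request
from pathlib import Path


def read_via_file_url(path):
    """Read a local file through urllib by turning its path into a file:// URL."""
    url = Path(path).resolve().as_uri()
    with urllib.request.urlopen(url) as response:
        return response.read()


with tempfile.TemporaryDirectory() as tmp:
    secret = Path(tmp) / "secret.txt"
    secret.write_bytes(b"local-only data")
    # No web server is involved: urlopen serves the bytes straight from disk.
    leaked = read_via_file_url(secret)
```

If an attacker controls the string passed to `urlopen`, the same mechanism applies to any file the process can read, which is the scenario the rule asks reviewers to rule out.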