Web Attack Techniques
What You Will Learn
- How to find hardcoded secrets in source code and repositories
- How to use BigQuery and GitHub for secret scanning
- How to enumerate web endpoints with curl
- URL parsing and path manipulation techniques
What Is It?
Web attacks cover a broad range of techniques used against web applications. This topic focuses on reconnaissance and discovery — finding secrets in source code, mapping attack surfaces, and building effective automation for web testing.
Why It Matters
Most real bugs in bug bounty programs start with thorough reconnaissance. Leaked API keys, exposed endpoints, and misconfigured paths are often found through smart scanning, not brute force.
Secret Scanning
Regex Pattern for Common Secrets
This pattern matches common secret and API key variable names in source code:
(access_key|access_token|admin_pass|admin_user|algolia_admin_key|algolia_api_key|
alias_pass|alicloud_access_key|amazon_secret_access_key|amazonaws|ansible_vault_password|
api_key|api_key_secret|api_secret|appkey|appkeysecret|application_key|appsecret|
auth_token|authorizationToken|aws_access_key_id|aws_secret|aws_secret_key|
aws_token|AWSSecretKey|client_secret|cloudflare_api_key|cloudflare_auth_key|
database_password|db_password|docker_pass|encryption_key|heroku_api_key|
sonatype_password|awssecretkey)
BigQuery — Search GitHub for Secrets
GitHub publishes a public dataset to BigQuery. Use it to search for hardcoded secrets in public repos:
SELECT path
FROM `bigquery-public-data.github_repos.contents` AS contents
JOIN `bigquery-public-data.github_repos.files` AS files
ON files.id = contents.id
WHERE REGEXP_CONTAINS(content, r"PATTERN_HERE")
Replace PATTERN_HERE with the secret pattern you are looking for.
URL and Path Analysis
Extract Unique Paths from a URL List
# Extract repo owner/name from GitHub URLs, remove query strings, deduplicate
cut -d '/' -f 4,5 < urls.txt | sed 's/?.*//g' | sort -u
Parse URL Paths with unfurl
# Print each path component on its own line
unfurl paths < urls.txt | tr '/' '\n' | sort -u
# Alternative with sed
sed 's#/#\n#g' paths.txt | sort -u
Web Endpoint Fuzzing with curl
Sequential URL Fuzzing
# Fetch URLs numbered 0 to 10, save each response to out/post_X.txt
curl --silent --fail "https://example.com/[0-10]" -o "out/post_#1.txt"
# Print URL and status code for each
curl -s -w '%{url} %{http_code}\n' https://example.com/[0-10] -o /dev/null
# Filter for successful responses
curl -s -w '%{url} %{http_code}\n' https://example.com/[0-10] -o /dev/null | grep 200
# Filter out 404s
curl -s -w '%{url} %{http_code}\n' https://example.com/[0-10] -o /dev/null | grep -v 404
Directory Fuzzing
# ffuf — fast web fuzzer
ffuf -w /usr/share/wordlists/dirb/common.txt -u https://target.com/FUZZ
# With extension filtering
ffuf -w wordlist.txt -u https://target.com/FUZZ -e .php,.html,.txt
# Parameter fuzzing
ffuf -w params.txt -u "https://target.com/search?FUZZ=test"
JavaScript Endpoint Discovery
Modern web apps expose endpoints through JavaScript files. Parse them to find hidden APIs:
# Download all JS files
gau target.com | grep "\.js$" | sort -u | xargs -I{} curl -s {} > all.js
# Extract endpoints from JS
grep -oP '(/api/[a-zA-Z0-9/_-]+)' all.js | sort -u
# Use jsluice
jsluice urls -r < all.js