
FogFox
programming python bruteforce

Finding Public Download Links To Private Files

Posted 03-06-2021, 05:16 PM
Let us begin with the idea. We know that many people use file-hosting sites to upload their data: not only random dog pictures, but also malware or even things like paid e-books. So what can I do if I want to obtain these files? Most file-hosting services don't offer a search function on their site.

The two easiest options to solve this issue are brute-forcing and using search engines. For my examples, I am going to use Anonfiles.


Bruteforcing Anonfiles Download Links With Python

This method is far from efficient, but it is interesting to play around with.
First, let us take a look at an example link: https://anonfiles.com/Nc6f3a78qc/

It looks like we have 10 random characters, which can be uppercase letters, lowercase letters, or numbers. That is 62 possible characters in each of 10 positions, so there are 62^10 (roughly 8.4 * 10^17) possible IDs, and a random guess will only very rarely hit a live link.

We start with importing the needed modules:
Code
import requests
import random
import string

We will use requests to check the status code of each link, and a combination of random and string to generate our link list.

We create our own function called "link_generator". We define a default size of 10 and the set of characters we want to use. Then we pick a random character from that set, repeat this ten times, and join the picks together.
Code
def link_generator(size=10, chars=string.ascii_lowercase + string.ascii_uppercase + string.digits):
    return ''.join(random.choice(chars) for _ in range(size))

We need a function that loads a link and checks its status code.

First, we request the link with the requests module. We only fetch the response headers (a HEAD request instead of a full download), which speeds up our script. Then we check whether the status code of the response is 200.
If it is, we print the link.
Code
def load_url(url):
    r = requests.head(url)
    if r.status_code == 200:
        print(str(r.status_code) + ": " + url)

To use this function we need a list of links to check. Let us create a list of random links with our link_generator function. We define an empty list called "urls" and append the base URL with a random ID to it.
Code
urls = []

for i in range(100):
    urls.append('https://anonfiles.com/' + link_generator())

If we now call our "load_url" function we can start brute-forcing for links.
Code
for url in urls:
    load_url(url)

Let's clean it up a bit and add multi-threading to check the links faster. I will also read some variables from user input so it's easier to use from your command line. This is the finished code:
Code
import requests
import random
import string
from time import time
from concurrent.futures import ThreadPoolExecutor, as_completed

urls = []
out = []
results = []

THREADS = input("Threads: ")
AMOUNT = input("Links to scrape: ")

def link_generator(size=10, chars=string.ascii_lowercase + string.ascii_uppercase + string.digits):
    # Build a random 10-character ID from letters and digits.
    return ''.join(random.choice(chars) for _ in range(size))

def load_url(url):
    # HEAD request only: we just need the status code, not the file itself.
    r = requests.head(url)
    if r.status_code == 200:
        print(str(r.status_code) + ": " + url)
        results.append(url)

# Build the list of candidate links.
for i in range(int(AMOUNT)):
    urls.append('https://anonfiles.com/' + link_generator())


start = time()
with ThreadPoolExecutor(max_workers=int(THREADS)) as executor:
    future_to_url = (executor.submit(load_url, url) for url in urls)

    for future in as_completed(future_to_url):
        out.append(1)
        print("Checked: " + str(len(out)) + " - " + "Results: " + str(len(results)), end="\r")

print(f'Time taken: {time() - start}')
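
If you want to keep the hits between runs, a minimal sketch (it just reuses the "results" list from the script above; the file name is only a placeholder) could append them to a text file:
Code
# Append every found link to a text file so the results survive between runs.
# 'found_links.txt' is just a placeholder name; 'results' comes from the script above.
with open('found_links.txt', 'a') as f:
    for url in results:
        f.write(url + '\n')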


Finding Anonfiles Download Links With Google

A more effective method is to search for links by using a search engine. In this example, we will use Google.

By just searching for the website we will find many useless links and no downloads. To solve this problem we use search operators. These are special characters and commands that extend the capabilities of our search.

For our needs, the "site" operator is perfect: site:anonfiles.com
If we enter this search term into Google we will find our first results.

This is pretty slow and we only get a few results per page, so let us modify our link a bit. The base URL for a Google search is:
https://www.google.de/search?

We add our keywords we want to search for:
https://www.google.de/search?q=site%3Aanonfiles.com

Last but not least, let's show 100 results per page:
https://www.google.de/search?q=site%3Aanonfiles.com&num=100
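
If you would rather build this URL in Python than by hand, a small sketch using urllib.parse from the standard library (the query and parameters are the same ones shown above) could look like this:
Code
from urllib.parse import urlencode

# Build the search URL from its parts instead of typing it out.
params = {'q': 'site:anonfiles.com', 'num': 100}
search_url = 'https://www.google.de/search?' + urlencode(params)
print(search_url)  # https://www.google.de/search?q=site%3Aanonfiles.com&num=100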

Now that we have our search URL prepared, let's use Python to scrape all the links from the search results.

We will use some modules to make our life easier. Let's import them first:
Code
import requests
from bs4 import BeautifulSoup
import re

First, create a function and call it whatever you like. We get our link from user input. This time we fetch not only the headers but the complete page content. We parse it with Beautiful Soup, using "lxml" as the parser.
Code
def googleScrape():
    url = input('Link: ')
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'lxml')

By inspecting the Google search page we can see that the best way to find these links is to read the "href" value of every <a> tag, the tag that defines a hyperlink.
Code
    a = soup.find_all('a')
    for i in a:
        href_complete = i.get('href')

If we print out our list of links we can see a small issue: they contain a lot of unnecessary information and are not clean links. With a regular expression search we can extract the actual URLs; anchors without an "href" value or without a full URL in them are simply skipped. We store the matched group in a temporary variable and split it at every "&". Finally, we print our results.
Code
        if href_complete is None:
            continue
        re_search = re.search(r"(?P<url>https?://[^\s]+)", href_complete)
        if re_search is None:
            continue
        temp = re_search.group(0)
        result = temp.split('&')[0]
        print(result)

Let's try the finished code:
Code
import requests
from bs4 import BeautifulSoup
import re

def googleScrape():
    url = input('Link: ')
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'lxml')

    # Collect every hyperlink on the results page.
    a = soup.find_all('a')
    for i in a:
        href_complete = i.get('href')
        if href_complete is None:
            # Skip anchors without an href attribute.
            continue

        re_search = re.search(r"(?P<url>https?://[^\s]+)", href_complete)
        if re_search is None:
            # Skip internal links that do not contain a full URL.
            continue
        temp = re_search.group(0)
        result = temp.split('&')[0]
        print(result)

googleScrape()
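
When the script asks for "Link:", paste the search URL we prepared earlier (https://www.google.de/search?q=site%3Aanonfiles.com&num=100) and it will print every outbound link it can extract from that results page.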

I was able to find some Indian movies and some Chinese manuals.


I hope you could understand the main theory behind this. Why not try it with some other download sites? Modify the code and try your luck!
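
As a starting point for other hosts, here is a minimal sketch of how the brute-forcer could be parameterized. The host name, ID length, and character set below are made-up placeholders, not any real site's scheme; you would have to look at a few example links first, just like we did for Anonfiles.
Code
import requests
import random
import string

def link_generator(size, chars):
    # Same idea as before: build a random ID of the requested length.
    return ''.join(random.choice(chars) for _ in range(size))

def check_host(base_url, id_length, chars, amount):
    # Try `amount` random IDs against a given file host.
    for _ in range(amount):
        url = base_url + link_generator(id_length, chars)
        if requests.head(url).status_code == 200:
            print(url)

# Hypothetical example values: replace with the real base URL and ID format
# of whatever host you are looking at.
check_host('https://example-filehost.com/', 8, string.ascii_lowercase + string.digits, 100)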
03-21-2021, 05:19 AM
Good stuff! Thanks for the share!
03-08-2021, 10:31 AM
Love the blog post <3.