Web Crawling

Libraries to automate web scraping.

Newest releases

DIGITALCRIMINAL Scrape all the media from an OnlyFans account - Updated regularly
 

p-ranav saveddit is a bulk media downloader for Reddit.
 

fhamborg news-please is an open source, easy-to-use news crawler that extracts structured information from almost any news website. It can recursively follow internal hyperlinks and read RSS feeds to fetch both most recent and also old, ar
 

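Boris-code feapder is an easy-to-learn yet powerful Python crawler framework. Its usage is similar to scrapy, which makes it easy to switch over from the scrapy framework.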
 

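srcrs Automatically completes daily tasks in the China Unicom mobile app (联通手机营业厅): collects data allowances, checks in to earn points, and more, so you need not worry about running out of data at the end of the month.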
 

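hu619340515 Search the log: if "抢购成功,订单号xxxxx" (purchase successful, order number xxxxx) appears, the purchase went through, and you must pay for the order within half an hour! The program does not yet support stopping automatically, so you need to STOP it manually!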
 

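Sniper970119 This program crawls relevant information from Dianping (大众点评) search pages, detail pages, and review pages, and writes the results to a file or a database.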
 

melo-gonzo This script is useful for downloading stock market data for a wide range of companies specified by their respective tickers.
 

Threezh1 JSFinder is a tool for quickly extracting URLs and subdomains from JS files on a website.
 

cocrawler CoCrawler is a versatile web crawler built using modern tools and concurrency.
 

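wkunzhi Python crawlers in practice: simulated login to major websites, including but not limited to slider CAPTCHAs, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao ❤️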
 

kurogai A simple script that scrapes the deep web to find pages. It reports whether any of them exist and saves the result to a file. You should specify the name of the file to save to. If you wish to estimate the number of requests, just multiply the numb
 

meeb TubeSync is a PVR (personal video recorder) for YouTube. Or, like Sonarr but for YouTube (with a built-in download client). It is designed to synchronize channels and playlists from YouTube to local directories and update your med
 

instaloader Download pictures (or videos) along with their captions and other metadata from Instagram.
 

spotDL Download your Spotify playlists and songs along with album art and metadata (from YouTube if a match is found).
 

con-dog Automate the creation of Python Selenium Scripts by drawing coloured boxes on webpage elements
 

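justinzm Data APIs: Baidu, Google, Toutiao, and Weibo indices, macroeconomic data, interest rate data, currency exchange rates, "Qianlima" (千里马, high-growth) and unicorn companies, Xinwen Lianbo transcripts, film and TV box office data, university lists, epidemic data…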
 

mtimkovich Exports watched movies from Plex to the Letterboxd Import Format.
 

iliaschalkidis A collection of scripts to crawl English legal corpora 📕 from open public domains.
 

EricJMarti ⚡️ Get notified as soon as your next CPU, GPU, or game console is in stock
 

JonasCz The ultimate guide on preventing Website Scraping
 

dkkocab This is a bot that will buy a PS5 through Walmart. It's written in Python and uses Selenium to interact with web pages.
 

enesusta github-trending-crawler is a basic REST API that crawls github.com/trending.
 

ayoubeddafali A minimal framework to automate web Actions/Plans, and run them in a containerized fashion.
 

samuelm2 Simple, quick to set up stock notification bot for Nvidia 3080 that I used to get my 3080. Less than 250 lines of code.
 

philippnormann 🎯 Autonomously buy Nvidia Founders Edition GPUs as soon as they become available
 

AlteredSecurity 365-Stealer is a tool written in Python 3 that steals data from a victim's Office 365 account using an access_token obtained by phishing. It steals Outlook mail, attachments, OneDrive files, and OneNote notes, and injects macros.
 

nehalist Bot for crawling stock availability of RTX 3000 cards and tweeting about it
 

alirezamika AutoScraper: A Smart, Fast and Lightweight Automatic Web Scraper for Python
 

Lumorti A dungeon crawler designed for a quantum computer as a series of 17000 quantum gates.
 

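kangvcar Supported data sources include GitHub, QQ Mail, NetEase Mail, Alibaba Mail, Sina Mail, Hotmail, Outlook, JD.com, Taobao, Alipay, China Mobile, China Unicom, China Telecom, Zhihu, bilibili, NetEase Cloud Music, QQ friends, QQ groups, WeChat Moments album generation, browser history, 12306, cnblogs, CSDN blogs, OSChina blogs, and Jianshu.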
 

codingforentrepreneurs Scrape websites asynchronously with Python 3.8+, Asyncio, & arsenic (aka Selenium for Async).
 

SamPom100 Scans every ticker on the market, gets their last 5 months of volume history, and alerts you when a stock's volume exceeds 10 standard deviations from the mean within the last 3 days.
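
The volume check described in the last entry can be sketched in a few lines of Python. This is only an illustration of the idea, not the repository's code: the function name `unusual_volume_days`, its parameters, and the choice to compute the mean and standard deviation over the days before the recent window are all assumptions made here. Fetching the volume history is left out; the input is a plain list of daily volumes, oldest first.

```python
from statistics import mean, pstdev


def unusual_volume_days(volumes, threshold=10.0, recent_days=3):
    """Return indices of recent days with unusually high volume.

    Hypothetical helper: `volumes` is a list of daily volumes, oldest first.
    A day is flagged when its volume lies more than `threshold` standard
    deviations above the mean of the earlier (baseline) days.
    """
    if recent_days < 1 or len(volumes) <= recent_days + 1:
        return []  # not enough history to form a baseline

    # Assumption: the baseline excludes the recent window itself.
    baseline = volumes[:-recent_days]
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return []  # flat baseline: distance in standard deviations is undefined

    start = len(volumes) - recent_days
    return [
        i
        for i in range(start, len(volumes))
        if (volumes[i] - mu) / sigma > threshold
    ]
```

Calling the function once per ticker and alerting whenever the returned list is non-empty would reproduce the described behaviour. Excluding the recent window from the baseline keeps a single large spike from inflating the standard deviation it is compared against.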