Web Crawling

Libraries to automate web scraping.

Newest releases

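Sniper970119 This program crawls information from Dianping (大众点评) search pages, detail pages, and review pages, and writes the results to a file or database.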
 

melo-gonzo This script is useful for downloading stock market data for a wide range of companies specified by their respective tickers.
 

Threezh1 JSFinder is a tool for quickly extracting URLs and subdomains from JS files on a website.
 

cocrawler CoCrawler is a versatile web crawler built using modern tools and concurrency.
 

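wkunzhi Python web crawling in practice: simulated login to major websites, including but not limited to slider CAPTCHA verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao ❤️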
 

kurogai A simple script that scrapes the deep web to find pages. It reports whether any of them exist and saves the results to a file. You should specify the name of the file to save to. If you wish to estimate the number of requests, just multiply the numb
 

meeb TubeSync is a PVR (personal video recorder) for YouTube. Or, like Sonarr but for YouTube (with a built-in download client). It is designed to synchronize channels and playlists from YouTube to local directories and update your med
 

instaloader Download pictures (or videos) along with their captions and other metadata from Instagram.
 

spotDL Download your Spotify playlists and songs along with album art and metadata (from YouTube if a match is found).
 

con-dog Automate the creation of Python Selenium Scripts by drawing coloured boxes on webpage elements
 

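justinzm Data interfaces: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest rate data; currency exchange rates; Qianlima (千里马, high-growth) and unicorn companies; Xinwen Lianbo news transcripts; film and TV box office data; university lists; epidemic data…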
 

mtimkovich Exports watched movies from Plex to the Letterboxd Import Format.
 

iliaschalkidis A collection of scripts to crawl English legal corpora 📕 from open public domains.
 

EricJMarti ⚡️ Get notified as soon as your next CPU, GPU, or game console is in stock
 

JonasCz The ultimate guide on preventing Website Scraping
 

dkkocab This is a bot that will buy a PS5 through Walmart. It's written in Python and uses Selenium to interact with web pages.
 

enesusta github-trending-crawler is a basic REST API that crawls github.com/trending.
 

ayoubeddafali A minimal framework to automate web Actions/Plans, and run them in a containerized fashion.
 

samuelm2 Simple, quick to set up stock notification bot for Nvidia 3080 that I used to get my 3080. Less than 250 lines of code.
 

philippnormann 🎯 Autonomously buy Nvidia Founders Edition GPUs as soon as they become available
 

AlteredSecurity 365-Stealer is a tool written in Python 3 that steals data from a victim's Office 365 account using an access_token obtained by phishing. It steals Outlook mail, attachments, OneDrive files, and OneNote notes, and injects macros.
 

nehalist Bot for crawling stock availability of RTX 3000 cards and tweeting about it
 

alirezamika AutoScraper: A Smart, Fast and Lightweight Automatic Web Scraper for Python
 

Lumorti A dungeon crawler designed for a quantum computer as a series of 17000 quantum gates.
 

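kangvcar Supported data sources include GitHub, QQ Mail, NetEase Mail, Alibaba Mail, Sina Mail, Hotmail, Outlook, JD.com, Taobao, Alipay, China Mobile, China Unicom, China Telecom, Zhihu, Bilibili, NetEase Cloud Music, QQ friends, QQ groups, WeChat Moments album generation, browser history, 12306, Cnblogs, CSDN blogs, OSChina blogs, and Jianshu.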
 

codingforentrepreneurs Scrape websites asynchronously with Python 3.8+, Asyncio, & arsenic (aka Selenium for Async).
 

SamPom100 Scans every ticker on the market, gets the last 5 months of volume history for each, and alerts you when a stock's volume exceeds 10 standard deviations from the mean within the last 3 days.
 

ArchiveBox 🗃 The open source self-hosted web archive. Takes browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
 

micha3lb3n Tested environments: Windows, macOS, Linux, and Windows Subsystem for Linux (WSL)
 

piyx A script that adds all songs from a YouTube playlist to a new Spotify playlist.
 

lemonpaul A simple Python script that imports favorite tracks, playlists, albums, and artists from Yandex.Music to Spotify.
 

Gerapy A package that adds pyppeteer support to Scrapy; it is also a module in Gerapy.
 

gusdnd852 A collection of useful Korean crawlers (always updated) 🌐