A high-level distributed crawling framework.

Cola: high-level distributed crawling framework. Overview: Cola is a high-level distributed crawling framework, used to crawl pages and extract structured data from websites. It provides simple and fast yet flexible
Category: Python / Web Crawling
Watchers: 168
Stars: 1.4k
Forks: 548
Last update: Oct 14, 2021

Related Repos

bhargav794 hacktoberfest-practice-music-playlists Note: This repo is not included in the Hacktoberfest event, as it is for practice only. If you have any trouble

novitae STERRAXCYL A Python program that exports the followers and/or following of a target account to an Excel/CSV table, with detailed information

Drovosek01 What is this and for what This is a script that allows you to download portable installers of programs from Adobe for macOS with different versions. T

AcierP discord-account-generator An hcaptcha-solving discord account generator; capable of randomizing names, profile pictures, and verifying phone numbers.

dchoruzy Welcome to hltv-data 👋 🎮 Data from popular CS:GO website hltv.org. Install: pip install hltv-data. Usage: The public methods can be reached using HLTVCl

gautamajay52 UdemyBot - A Simple Udemy Free Courses Scrapper

truroshan Auto Slot Booking when there is a vaccine slot available at your location, by running a script on your phone.

youtuberscafe Awesome Youtubers Top 50 most subscribed youtube channels Geographical top YouTubers of Africa top Youtubers of Asia top Youtubers of Europe top Youtu

DIGITALCRIMINAL Scrape all the media from an OnlyFans account - Updated regularly

Absolute-Tinkerer CLAPI A basic API to scrape Craigslist. Most useful for viewing posts across a broad geographic area or for viewing posts within a specific timeframe.

p-ranav saveddit is a bulk media downloader for reddit.

Irish606 cuii-blocked This repository contains a list of websites that Selbstregulierung Informationswirtschaft e.V. has, for DNS blocking, …

fhamborg news-please is an open source, easy-to-use news crawler that extracts structured information from almost any news website. It can recursively follow internal hyperlinks and read RSS feeds to fetch both most recent and also old, archived articles.

Boris-code feapder is an easy-to-learn, powerful Python crawler framework. Its usage is similar to scrapy, which makes it easy to switch over from the scrapy framework.

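srcrs Automatically completes the daily tasks in the China Unicom mobile app (联通手机营业厅): claiming data allowances, checking in to earn points, and so on, so you need not worry about running out of data at the end of the month.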