Fetch web pages using headless Chrome, storing all fetched resources, including JavaScript files. Run arbitrary JavaScript on many web pages and see the returned values.
GoSpider
GoSpider - Fast web spider written in Go
Installation
go get -u github.com/jaeles-project/gospider
Features
Fast web crawling
Brute force and parse sitemap.xml
Parse robots.txt
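As an illustration of the robots.txt item above, here is a minimal standard-library sketch of how a crawler might pull seed paths and sitemap URLs out of a robots.txt file. It is not GoSpider's actual code; the `robotsInfo` type and `parseRobots` function are hypothetical names, and the sketch deliberately ignores User-agent grouping.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// robotsInfo holds the parts of a robots.txt file that a crawler can
// reuse as crawl seeds. The type name is hypothetical.
type robotsInfo struct {
	Paths    []string // values of Allow and Disallow rules
	Sitemaps []string // values of Sitemap directives
}

// parseRobots scans robots.txt content line by line and collects
// Allow/Disallow paths and Sitemap URLs. User-agent groups are ignored.
func parseRobots(content string) robotsInfo {
	var info robotsInfo
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := sc.Text()
		// Drop trailing comments.
		if i := strings.Index(line, "#"); i >= 0 {
			line = line[:i]
		}
		// Split "Key: value" at the first colon only, so that URLs
		// in the value keep their own colons.
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		key = strings.ToLower(strings.TrimSpace(key))
		value = strings.TrimSpace(value)
		if value == "" {
			continue // e.g. an empty "Disallow:" rule
		}
		switch key {
		case "allow", "disallow":
			info.Paths = append(info.Paths, value)
		case "sitemap":
			info.Sitemaps = append(info.Sitemaps, value)
		}
	}
	return info
}

func main() {
	sample := "User-agent: *\n" +
		"Disallow: /admin/ # private\n" +
		"Allow: /public/\n" +
		"Disallow:\n" +
		"Sitemap: https://example.com/sitemap.xml\n"
	info := parseRobots(sample)
	fmt.Println(info.Paths)
	fmt.Println(info.Sitemaps)
}
```

Each collected path can be joined with the site root and queued as a URL to visit, and each sitemap URL can be fetched and parsed in turn.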
hakrawler
What is it?
hakrawler is a Go web crawler designed for easy, quick discovery of endpoints and assets within a web application. It can be used to discover:
Forms
Endpoints
Subdomains
Re
scrapyd-go
A drop-in replacement for scrapyd that is easier to scale and distribute across any number of commodity machines with no hassle. Each scrapyd-go instance is a stateless microservice; all instances m
Geziyor
Geziyor is a blazing fast web crawling and web scraping framework. It can be used to crawl websites and extract structured data from them. Geziyor is useful for a wide range of purposes such as data mining, moni
Scraply
Scraply is a simple DOM scraper that fetches information from any HTML-based website using jQuery-like syntax and converts that information into JSON APIs.
How it works?
It works by simply defining some macros/e
About chromedp
Package chromedp is a faster, simpler way to drive browsers in Go using the Chrome Debugging Protocol (for Chrome, Edge, Safari, etc.) without external dependencies (e.g., Selenium, PhantomJS, etc.).
NOTE:
Ferret
What is it?
ferret is a web scraping system. It aims to simplify data extraction from the web for UI testing, machine learning, analytics and more. ferret allows users to focus o
Colly
Lightning Fast and Elegant Scraping Framework for Gophers
Colly provides a clean interface to write any kind of crawler/scraper/spider.
With Colly you can easily extract structured data from websites, which can
A Golang web screenshot utility using Chrome Headless.
Introduction
gowitness is a website screenshot utility written in Golang, that uses Chrome Headless to generate screens
GOPA, A Spider Written in Go.
Goal
Lightweight, low footprint; memory requirement should be < 100MB
Easy to deploy, no runtime or dependency required
Easy to use, no programming or scripting ability needed,
DO NOT USE! This project does not work anymore due to changes in Chrome. Use the Chrome DevTools protocol in chromedp instead!
headlessChrome
🤖
Only Ubuntu on Docker is supported for now. Mac appears t
dcrawl
dcrawl is a simple, but smart, multi-threaded web crawler for randomly gathering huge lists of unique domain names.
How it works?
dcrawl takes one site URL as input and detects all <a href=...&
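The idea described above, collecting anchor links from a page and keeping the unique domain names, can be sketched with the standard library alone. This is not dcrawl's implementation: the `uniqueHosts` function and `hrefRe` pattern are hypothetical, and a regular expression is used here as a simplification where a real crawler would more likely use an HTML parser.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"sort"
	"strings"
)

// hrefRe captures the quoted href value of anchor tags. It is a rough
// approximation and does not handle unquoted attribute values.
var hrefRe = regexp.MustCompile(`(?i)<a\s[^>]*?href\s*=\s*["']([^"']+)["']`)

// uniqueHosts resolves every anchor link in page against base and
// returns the distinct host names, lowercased and sorted.
func uniqueHosts(base, page string) ([]string, error) {
	baseURL, err := url.Parse(base)
	if err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	for _, m := range hrefRe.FindAllStringSubmatch(page, -1) {
		ref, err := url.Parse(strings.TrimSpace(m[1]))
		if err != nil {
			continue // skip links that do not parse as URLs
		}
		// Relative links inherit the host of the base URL.
		host := strings.ToLower(baseURL.ResolveReference(ref).Hostname())
		if host != "" {
			seen[host] = true
		}
	}
	hosts := make([]string, 0, len(seen))
	for h := range seen {
		hosts = append(hosts, h)
	}
	sort.Strings(hosts)
	return hosts, nil
}

func main() {
	page := `<a href="https://Example.org/a">x</a>
<a class="c" href='/local'>y</a>
<a href="http://example.org/b">z</a>
<a href="mailto:me@example.com">m</a>`
	hosts, err := uniqueHosts("https://start.test/", page)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(hosts)
}
```

A crawler built on this would fetch each newly seen host in turn and repeat the extraction, using the `seen` set to avoid revisiting domains.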
Deprecation warning
Active PhantomJS development has ended in favor of Chrome's new headless functionality (reference). Instead of using this library, consider using a Go package that uses this new API, such as ch
gocrawl
gocrawl is a polite, slim and concurrent web crawler written in Go.
For a simpler yet more flexible web crawler written in a more idiomatic Go style, you may want to take a look at fetchbot, a package that bu
soup
Web Scraper in Go, similar to BeautifulSoup
soup is a small web scraper package for Go, with its interface highly similar to that of BeautifulSoup.
Exported variables and functions implemented till now :
v
About
Creeper is a next-generation crawler that fetches web pages using a creeper script. As a cross-platform embedded crawler, it can be used in a news app, a subscription program, etc.
Warning: At present this project
goscraper
Golang package that quickly returns a preview of a web page, so you can easily get its title, description, and images.
Usage
func main() {
s, err := goscraper.Scrape("https://www.w3.org/", 5)
if err !=
go-selenium
Introduction
Yes, yet another Selenium Web Driver library has been brought to the table. This one, however, is slightly different.
Easy to understand.
Full test coverage by unit tests a
ants-go
Open source, RESTful, distributed crawler engine
Coming up
Persistence
Dynamic Master
design of ants-go
ants
I wrote a crawler engine named ants in