miniflux

Author	SHA1	Message	Date
Frédéric Guillot	93715b542c	Revert "scraper follow the only link" This reverts commit `10207967c4`.	2022-11-14 17:45:40 -08:00
Frédéric Guillot	de1a06e3e8	Add missing check in followTheOnlyLink() that leads to a panic Bug introduced in PR #1290. Fixes #1631.	2022-11-14 16:44:02 -08:00
jebbs	10207967c4	scraper follow the only link * in some cases, what the scraper gets is only a landing page; the user can use scraper rules to extract the link from the landing page and follow it * it also fixes the wrong scraper rule being applied when the server redirects to another host	2022-10-31 19:49:34 -07:00
hulb	01f678c3b1	add proxy arg in scraper.Fetch	2021-08-28 21:57:11 -07:00
Darius	9242350f0e	Add per feed cookies option	2021-03-22 20:27:58 -07:00
Frédéric Guillot	ec3c604a83	Add option to allow self-signed or invalid certificates	2021-02-21 13:58:52 -08:00
Frédéric Guillot	c394a61a4e	Add Prometheus exporter	2020-09-27 20:04:48 -07:00
Frédéric Guillot	16b7b3bc3e	http client: remove dependency on global config options	2020-09-27 14:37:46 -07:00
cinput	8e1ed8bef3	Return outer HTML when scraping elements	2019-12-21 21:18:31 -08:00
Frédéric Guillot	311a133ab8	Refactor manual entry scraper	2018-12-02 20:51:06 -08:00
Frédéric Guillot	3b6e44c331	Allow the scraper to parse XHTML documents Only "text/html" was authorized before.	2018-11-03 13:44:13 -07:00
Frédéric Guillot	5870f04260	Simplify feed parser and format detection - Avoid doing multiple buffer copies - Move parser and format detection logic to its own package	2018-10-14 11:46:41 -07:00
Patrick	2538eea177	Add the possibility to override default user agent for each feed	2018-09-19 18:19:24 -07:00
Frédéric Guillot	dbcc5d8a97	Use canonical imports	2018-08-24 21:56:39 -07:00
Frédéric Guillot	1eba1730d1	Move HTTP client to its own package	2018-04-28 10:51:07 -07:00
aniran	322b265d7a	Scrape parent element for iframe Current behavior: if you have an `iframe` scraper rule, `scrapContent` tries to return the inner HTML of the `iframe`, which turns up blank. New behavior: like `img` elements, if an `iframe` is matched by a scraper rule, the parent element's inner HTML (i.e. the `iframe`) is returned.	2018-04-27 17:57:22 -07:00
Frédéric Guillot	3c3f397bf5	Make sure the scraper parse only HTML documents	2018-01-02 18:32:01 -08:00
Frédéric Guillot	1d8193b892	Add logger	2017-12-15 18:55:57 -08:00
Frédéric Guillot	c6d9eb3614	Improve content scraper	2017-12-13 21:30:40 -08:00
Frédéric Guillot	84d912c979	Rewrite imports	2017-12-12 21:48:13 -08:00
Frédéric Guillot	ef097f02fe	Add the possibility to enable crawler for feeds	2017-12-12 19:19:36 -08:00
Frédéric Guillot	87ccad5c7f	Add scraper rules	2017-12-10 20:51:04 -08:00
Frédéric Guillot	7a35c58f53	Add readability package to fetch original content	2017-12-10 19:01:38 -08:00

23 commits