miniflux

Author	SHA1	Message	Date
Manuel Müller	ca918bc7e3	Added scraper rule for dilbert.com and turnoff.us	2020-06-10 20:15:46 -07:00
Corey McCaffrey	25d4b9fc0c	Added scraper rule for financialsamurai.com. The default rule results in blank content.	2020-05-24 13:29:28 -07:00
Corey McCaffrey	0683074b8b	Added scraper rule for TheOatmeal.com. The default rule does not show the comic posted to the feed. The comic image is in a div with id "comic".	2020-05-13 21:28:00 -07:00
Corey McCaffrey	8f6c07afd6	Added scraper rule for RayWenderlich.com. RayWenderlich.com is a popular community for iOS and Android developers. The default rule results in "GROUP GROUP GROUP GROUP…" instead of the content posted on the blog.	2020-05-13 21:28:00 -07:00
Andrew Williams	9974e0f458	Addition of scraper rule for wdwnt.com. By default, fetching original content for wdwnt.com results in a snippet of the comments section; this rule captures the article content.	2020-02-28 20:24:58 -08:00
cinput	8e1ed8bef3	Return outer HTML when scraping elements	2019-12-21 21:18:31 -08:00
somini	30f22fbd78	Update scraper rule for "Le Monde"	2019-12-19 18:35:29 -08:00
Neo Ng	90064a8cf0	Update scraper rule for openingsource.org	2019-11-28 19:40:26 -08:00
Tom Matthews	8b40778ee1	Add BBC News scraping rule	2018-12-13 20:25:30 -08:00
Frédéric Guillot	6f5d93cbbe	Update scraper rule for lemonde.fr	2018-12-02 20:53:22 -08:00
Frédéric Guillot	311a133ab8	Refactor manual entry scraper	2018-12-02 20:51:06 -08:00
mapl	e47188eab2	Update scraper rule for heise.de	2018-12-01 11:49:30 -08:00
Frédéric Guillot	3b6e44c331	Allow the scraper to parse XHTML documents. Only "text/html" was authorized before.	2018-11-03 13:44:13 -07:00
Frédéric Guillot	5870f04260	Simplify feed parser and format detection: avoid doing multiple buffer copies; move parser and format detection logic to its own package	2018-10-14 11:46:41 -07:00
Frédéric Guillot	9dc38a0803	Add missing package descriptions for GoDoc	2018-10-08 17:32:17 -07:00
Patrick	2538eea177	Add the possibility to override default user agent for each feed	2018-09-19 18:19:24 -07:00
Frédéric Guillot	df2bebaf3d	Update scraper rule for heise.de	2018-08-25 10:33:18 -07:00
Frédéric Guillot	dbcc5d8a97	Use canonical imports	2018-08-24 21:56:39 -07:00
Frédéric Guillot	1eba1730d1	Move HTTP client to its own package	2018-04-28 10:51:07 -07:00
aniran	322b265d7a	Scrape parent element for iframe. Current behavior: if you have an `iframe` scraper rule, `scrapContent` tries to return the inner HTML of the `iframe`, which turns up blank. New behavior: like `img` elements, if an `iframe` is matched by a scraper rule, the parent element's inner HTML (i.e. the `iframe`) is returned.	2018-04-27 17:57:22 -07:00
Frédéric Guillot	1d7fe892e1	Add scraper rule for darkreading.com	2018-01-06 13:25:12 -08:00
Frédéric Guillot	48aa0d07ef	Add more scraper rules	2018-01-04 19:32:24 -08:00
Frédéric Guillot	3c3f397bf5	Make sure the scraper parses only HTML documents	2018-01-02 18:32:01 -08:00
Frédéric Guillot	c454f67037	Add scraper rules for version2.dk and ing.dk	2017-12-27 19:44:23 -08:00
Frédéric Guillot	d4839b5597	Add more scraper rules	2017-12-27 13:36:07 -08:00
Frédéric Guillot	1d8193b892	Add logger	2017-12-15 18:55:57 -08:00
Frédéric Guillot	c6d9eb3614	Improve content scraper	2017-12-13 21:30:40 -08:00
Frédéric Guillot	84d912c979	Rewrite imports	2017-12-12 21:48:13 -08:00
Frédéric Guillot	ef097f02fe	Add the possibility to enable crawler for feeds	2017-12-12 19:19:36 -08:00
Frédéric Guillot	87ccad5c7f	Add scraper rules	2017-12-10 20:51:04 -08:00
Frédéric Guillot	7a35c58f53	Add readability package to fetch original content	2017-12-10 19:01:38 -08:00

31 commits