miniflux

Author	SHA1	Message	Date
Andrew Williams	9974e0f458	Addition of scraper rule for wdwnt.com By default fetching original content for wdwnt.com results in a snippet of the comments section, this rule captures the article content.	2020-02-28 20:24:58 -08:00
Frédéric Guillot	997e9422eb	Ignore enclosures without URL	2020-01-30 21:18:49 -08:00
Frédéric Guillot	61f0c8aa66	Allow application/xhtml+xml links as comments URL in Atom replies	2020-01-04 16:07:06 -08:00
Frédéric Guillot	bf632fad2e	Allow only absolute URLs in comments URL Some feeds are using invalid URLs (random text).	2020-01-04 15:54:16 -08:00
Kebin Liu	8cebd985a2	Use internal XML workarounds to detect feed format	2020-01-02 22:19:15 -08:00
Frédéric Guillot	ac3c936820	Make sure whitelisted URI schemes are handled properly by the sanitizer	2020-01-02 11:03:51 -08:00
Frédéric Guillot	3debf75eb9	Normalize URL query string before executing HTTP requests - Make sure query strings parameters are encoded - As opposed to the standard library, do not append equal sign for query parameters with empty value - Strip URL fragments like Web browsers	2019-12-26 15:56:59 -08:00
Frédéric Guillot	200b1c304b	Improve Dublin Core support for RDF feeds	2019-12-23 14:45:58 -08:00
Frédéric Guillot	1b33bb3d1c	Improve Podcast support (iTunes and Google Play feeds) - Add support for Google Play XML namespace - Improve existing iTunes namespace implementation	2019-12-23 13:51:42 -08:00
Frédéric Guillot	33fdb2c489	Add support for Atom 0.3	2019-12-22 22:42:00 -08:00
Frédéric Guillot	cfb6ddfcea	Add support for Atom 'replies' link relation Show comments URL for Atom feeds as per RFC 4685. See https://tools.ietf.org/html/rfc4685#section-4 Note that only the first link with type "text/html" is taken into consideration.	2019-12-22 18:03:04 -08:00
cinput	8e1ed8bef3	Return outer HTML when scraping elements	2019-12-21 21:18:31 -08:00
somini	30f22fbd78	Update scraper rule for "Le Monde"	2019-12-19 18:35:29 -08:00
Jebbs	a155ab6deb	Filter valid XML characters for UTF-8 XML documents before decoding This change should reduce "illegal character code" XML errors.	2019-12-19 18:31:52 -08:00
Frédéric Guillot	a4ebb33cd5	Trim spaces for RDF entry links	2019-12-01 15:06:01 -08:00
Frédéric Guillot	120d6ec7d8	Do not rewrite YouTube description twice in "add_youtube_video" rule This is already done before in <media:description>.	2019-11-30 22:56:06 -08:00
Frédéric Guillot	69aa650203	Add the possibility to add rules during feed creation	2019-11-29 11:27:58 -08:00
Frédéric Guillot	912a98788e	Add support of media elements for Atom feeds	2019-11-28 23:55:40 -08:00
Frédéric Guillot	f90e9dfab0	Add support of media elements for RSS 2 feeds	2019-11-28 21:33:32 -08:00
Frédéric Guillot	c43c9458a9	Add rewrite functions: convert_text_link and nl2br	2019-11-28 21:33:12 -08:00
Neo Ng	90064a8cf0	Update scraper rule for openingsource.org	2019-11-28 19:40:26 -08:00
Tony Wang	2eb2441f2b	Improve XML decoder to remove illegal characters	2019-10-22 20:32:35 -07:00
Tony Wang	5517eebafe	Add new formats to date parser	2019-10-20 09:52:18 -07:00
Frédéric Guillot	36d7732234	Disable strict XML parsing This change should improve parsing of broken XML feeds. See https://golang.org/pkg/encoding/xml/#Decoder	2019-09-18 22:45:56 -07:00
Frédéric Guillot	934385ff55	Replace Travis by GitHub Actions	2019-09-15 11:48:15 -07:00
Frédéric Guillot	8d8f78241d	Add native lazy loading for images and iframes This feature is available only in Chrome >= 76 for now. See https://web.dev/native-lazy-loading	2019-09-10 21:22:19 -07:00
Peter De Wachter	b6f3160dbc	add_mailto_subject: New rewrite function Dinosaur Comics (qwantz.com) likes to hide jokes in mailto: links, but miniflux's sanitizer strips those out.	2019-08-19 19:42:47 -07:00
Frédéric Guillot	ac45307da6	Add test case for parsing HTML entities	2019-08-15 21:42:13 -07:00
Peter De Wachter	ea2b6e3608	addImageTitle: Fix HTML injection This rewrite rule would change this: <img title="<foo>"> to this: <figure><img><figcaption><foo></figcaption></figure> The image title needs to be properly escaped.	2019-08-15 21:39:41 -07:00
Peter De Wachter	3a39d110f0	Accept HTML entities when parsing XML Every once in a while, one of my feeds would throw an XML parse error because it used `&nbsp;` or some other HTML entity. I feel Miniflux should be lenient here, and Go already has a handy hook to make this work.	2019-08-15 21:26:07 -07:00
Ilya Glotov	c840268678	Sort feed categories before serialization A function is added for feeds and its categories normalization. The test will ensure that the order is right.	2019-07-05 20:34:49 +03:00
Frédéric Guillot	129f1bf3da	Add support for OPML v1 import	2019-03-26 20:09:31 -07:00
Jeremy Apthorp	304b43cb30	Add 'allow-popups' to iframe sandbox permissions	2019-03-26 18:26:56 -07:00
Frédéric Guillot	6764a420b0	Make parser compatible with Go 1.12 See changes in strings.Map(): https://golang.org/doc/go1.12#strings	2019-02-28 21:23:33 -08:00
Frédéric Guillot	f3fc8b7072	Use feed ID instead of user ID to check entry URLs presence	2019-02-28 20:43:33 -08:00
Frédéric Guillot	ed6ae7e0d2	Use preferably the published date for Atom feeds YouTube feeds use the published date for the original creation date.	2019-01-29 20:01:36 -08:00
Peter De Wachter	0cdcec10ca	More robust Atom text handling Miniflux couldn't deal with XHTML Summary elements. - Make Summary an 'atomContent' field - Define an atomContentToString function rather than inlining it three times - Also properly escape special characters in plain text fields.	2019-01-07 17:55:02 -08:00
Frédéric Guillot	56efd2eb3f	Add workaround for non GMT dates (RFC822, RFC850, and RFC1123) RFC822, RFC850, and RFC1123 dates are supposed to be always in GMT. This is a workaround for dates defined in the PST timezone.	2018-12-26 20:24:38 -08:00
Frédéric Guillot	012138179c	Add function storage.UpdateFeedError()	2018-12-15 13:04:38 -08:00
Tom Matthews	8b40778ee1	Add BBC News scraping rule	2018-12-13 20:25:30 -08:00
Frederic Guillot	61bfb3cfa8	Make password prompt compatible with Windows	2018-12-09 17:44:33 -08:00
Frédéric Guillot	1bc8535dbb	Move image proxy filter to template functions	2018-12-02 21:09:53 -08:00
Frédéric Guillot	6f5d93cbbe	Update scraper rule for lemonde.fr	2018-12-02 20:53:22 -08:00
Frédéric Guillot	311a133ab8	Refactor manual entry scraper	2018-12-02 20:51:06 -08:00
mapl	e47188eab2	Update scraper rule for heise.de	2018-12-01 11:49:30 -08:00
Frédéric Guillot	487852f07e	Replace daemon and scheduler package with service package	2018-11-11 15:32:48 -08:00
Frédéric Guillot	3b6e44c331	Allow the scraper to parse XHTML documents Only "text/html" was authorized before.	2018-11-03 13:44:13 -07:00
Frédéric Guillot	ae1dc1a91e	Handle more encoding conversion edge cases	2018-10-29 23:00:03 -07:00
Frédéric Guillot	7d1b471d88	Add test case to check different feed encoding and HTTP headers	2018-10-29 19:04:36 -07:00
Frédéric Guillot	85d48c8a71	Add entries storage error to feed errors count	2018-10-21 11:44:29 -07:00
Frédéric Guillot	b8f874a37d	Simplify feed entries filtering - Rename processor package to filter - Remove boilerplate code	2018-10-14 22:33:19 -07:00
Frédéric Guillot	778346b0b0	Simplify feed fetcher - Add browser package to handle HTTP errors - Reduce code duplication	2018-10-14 21:43:48 -07:00
Frédéric Guillot	5870f04260	Simplify feed parser and format detection - Avoid doing multiple buffer copies - Move parser and format detection logic to its own package	2018-10-14 11:46:41 -07:00
Frédéric Guillot	9606126196	Convert text links and line feeds to HTML in YouTube channels	2018-10-08 20:47:10 -07:00
Frédéric Guillot	9dc38a0803	Add missing package descriptions for GoDoc	2018-10-08 17:32:17 -07:00
Frédéric Guillot	11dfcdd3d6	Fix typo in license header	2018-10-08 15:50:15 -07:00
Frédéric Guillot	b1e8f534ef	Simplify locale package usage (refactoring)	2018-09-22 15:04:55 -07:00
Frédéric Guillot	beb7a0cfcb	Use unique translation IDs instead of English text as key	2018-09-21 22:23:23 -07:00
Patrick	2538eea177	Add the possibility to override default user agent for each feed	2018-09-19 18:19:24 -07:00
Frédéric Guillot	df2bebaf3d	Update scraper rule for heise.de	2018-08-25 10:33:18 -07:00
Frédéric Guillot	dbcc5d8a97	Use canonical imports	2018-08-24 21:56:39 -07:00
neepl	5365f31e90	Add support for published tag in Atom feeds	2018-07-17 21:52:05 -07:00
Frédéric Guillot	a786e78aca	Add embedly.com to iframe whitelist	2018-07-10 20:56:54 -07:00
dzaikos	6d25e02cb5	New `add_dynamic_image` rewriter for JavaScript-loaded images. Searches tags for various `data-*` attributes and sets `img` tag `src` attribute appropriately. Falls back to searching `noscript` for `img` tags. Includes unit tests.	2018-07-09 01:22:48 -04:00
dzaikos	e1c56b2e53	Processor: Do rewriter before sanitizer for `entry.Content`. Addresses #163.	2018-07-06 00:17:07 -04:00
Frédéric Guillot	de1a4aad30	Add support for protocol relative YouTube URLs	2018-07-04 22:45:44 -07:00
dzaikos	7d4a195519	Sandbox iframes when sanitizing. Updated iframe unit tests. Refactored sanitizer.getExtraAttributes() to use `switch` instead of multiple `if` statements.	2018-07-03 12:55:18 -07:00
Frédéric Guillot	9c0f882ba0	Add specific 404 and 401 error messages	2018-06-30 12:42:12 -07:00
dzaikos	45d7105ed1	Refactor AddImageTitle rewriter. * Only processes images with `src` and `title` attributes (others are ignored). * Processes all images in the document (not just the first one). * Wraps the image and its title attribute in a `figure` tag with the title attribute's contents in a `figcaption` tag. Updated xkcd rewriter unit test. Added another xkcd rewriter unit test to check rendering of images without title tags.	2018-06-26 17:50:18 -04:00
dzaikos	c9131b0e89	Improve sanitizer to remove style tag contents. See #157. Refactored how blacklisted tags are handled so they're easier to manage in the future.	2018-06-24 19:53:23 -07:00
Dave Z	d847b10e32	Improve sanitizer to remove script and noscript contents These tags were removed but the content was rendered as escaped HTML. See #157	2018-06-23 17:50:43 -07:00
Frédéric Guillot	bddca15b69	Add new fields for feed username/password	2018-06-19 22:58:29 -07:00
Frédéric Guillot	c719cf7df0	Rewrite iframe Youtube URLs to https://www.youtube-nocookie.com	2018-06-12 18:45:09 -07:00
Frédéric Guillot	0c2e5ff0dc	Handle feeds with dates formatted as Unix timestamp	2018-05-08 20:41:24 -07:00
Frédéric Guillot	5cacae6cf2	Add API endpoint to import OPML file	2018-04-29 18:56:40 -07:00
Frédéric Guillot	1eba1730d1	Move HTTP client to its own package	2018-04-28 10:51:07 -07:00
aniran	322b265d7a	Scrape parent element for iframe Current behavior: if you have an `iframe` scraper rule, `scrapContent` tries to return the inner HTML of the `iframe`, which turns up blank. New behavior: like `img` elements, if an `iframe` is matched by a scraper rule, the parent element's inner HTML (i.e. the `iframe`) is returned.	2018-04-27 17:57:22 -07:00
aniran	920dda79b7	Add soundcloud and bandcamp iframe sources	2018-04-27 17:55:58 -07:00
Frédéric Guillot	dcbb5047b1	Add support for Dublin Core date in RDF feeds	2018-04-10 18:13:05 -07:00
Frédéric Guillot	02ba735ba9	Handle some non-english date formats	2018-04-09 21:27:15 -07:00
Frédéric Guillot	e2d02bac5a	Rename RSS parser getters	2018-04-09 20:38:12 -07:00
Frédéric Guillot	f76093690c	Get the right comments URL when having multiple namespaces	2018-04-09 20:30:55 -07:00
Frédéric Guillot	702256bcc0	Add unit test for comments url and French translation	2018-04-07 13:56:11 -07:00
Ben Brooks	538d08c16c	Add CommentsURL to entry	2018-04-07 13:50:45 -07:00
Frédéric Guillot	6ea4da3bce	Handle RSS author elements with inner HTML	2018-03-18 11:57:46 -07:00
Frédéric Guillot	482785c5e6	Convert enclosure size field to bigint	2018-03-14 20:09:06 -07:00
Frédéric Guillot	ec08f45bf5	Fix broken OPML import with Go 1.10	2018-03-14 18:50:06 -07:00
Frédéric Guillot	f110384f11	Improve parser error messages	2018-02-27 21:19:59 -08:00
Frédéric Guillot	953d0a2dc0	Support localized feed errors generated by background workers	2018-02-27 21:08:32 -08:00
Frédéric Guillot	9292d5d604	Handle Atom feeds with HTML title	2018-02-17 12:21:58 -08:00
Frédéric Guillot	dda9114692	Improve error handling for HTTP client	2018-02-08 18:16:54 -08:00
Frédéric Guillot	7b0bfd9308	Strip invalid XML characters to avoid parsing errors	2018-02-07 20:57:56 -08:00
Frédéric Guillot	c6fd9eb9b1	Remove period for feed errors	2018-02-07 19:10:36 -08:00
Frédéric Guillot	0fb87eba3f	Improve error handling when the response is empty	2018-02-07 18:47:47 -08:00
Frédéric Guillot	b78172033f	Show API URL endpoints in user interface	2018-01-31 21:57:20 -08:00
Frédéric Guillot	ffabb009b8	Do not override existing entries when the crawler is enabled	2018-01-20 14:04:19 -08:00
Frédéric Guillot	713b38e34c	Handle more encoding edge cases - Feeds with charset specified only in Content-Type header and not in XML document - Feeds with charset specified in both places - Feeds with charset specified only in XML document and not in HTTP header	2018-01-20 13:25:21 -08:00
Frédéric Guillot	3b62f904d6	Do not crawl existing entry URLs	2018-01-20 13:25:20 -08:00
Frédéric Guillot	9652dfa1fe	Add more comments (GoDoc)	2018-01-11 19:21:20 -08:00
Frédéric Guillot	1d7fe892e1	Add scraper rule for darkreading.com	2018-01-06 13:25:12 -08:00

190 commits