Commit graph

47 commits

Author SHA1 Message Date
Frédéric Guillot
7e5157f218 Rename alternative scheduler to entry_frequency 2020-05-25 15:12:47 -07:00
Shizun Ge
cead85b165 Add alternative scheduler based on the number of entries 2020-05-25 14:06:56 -07:00
Frédéric Guillot
3debf75eb9 Normalize URL query string before executing HTTP requests
- Make sure query string parameters are encoded
- As opposed to the standard library, do not append equal sign for query parameters with empty value
- Strip URL fragments like Web browsers
2019-12-26 15:56:59 -08:00
Frédéric Guillot
69aa650203 Add the possibility to add rules during feed creation 2019-11-29 11:27:58 -08:00
Frédéric Guillot
012138179c Add function storage.UpdateFeedError() 2018-12-15 13:04:38 -08:00
Frédéric Guillot
311a133ab8 Refactor manual entry scraper 2018-12-02 20:51:06 -08:00
Frédéric Guillot
85d48c8a71 Add entries storage error to feed errors count 2018-10-21 11:44:29 -07:00
Frédéric Guillot
b8f874a37d Simplify feed entries filtering
- Rename processor package to filter
- Remove boilerplate code
2018-10-14 22:33:19 -07:00
Frédéric Guillot
778346b0b0 Simplify feed fetcher
- Add browser package to handle HTTP errors
- Reduce code duplication
2018-10-14 21:43:48 -07:00
Frédéric Guillot
5870f04260 Simplify feed parser and format detection
- Avoid doing multiple buffer copies
- Move parser and format detection logic to its own package
2018-10-14 11:46:41 -07:00
Frédéric Guillot
9dc38a0803 Add missing package descriptions for GoDoc 2018-10-08 17:32:17 -07:00
Frédéric Guillot
b1e8f534ef Simplify locale package usage (refactoring) 2018-09-22 15:04:55 -07:00
Frédéric Guillot
beb7a0cfcb Use unique translation IDs instead of English text as key 2018-09-21 22:23:23 -07:00
Patrick
2538eea177 Add the possibility to override default user agent for each feed 2018-09-19 18:19:24 -07:00
Frédéric Guillot
dbcc5d8a97 Use canonical imports 2018-08-24 21:56:39 -07:00
Frédéric Guillot
9c0f882ba0 Add specific 404 and 401 error messages 2018-06-30 12:42:12 -07:00
Frédéric Guillot
bddca15b69 Add new fields for feed username/password 2018-06-19 22:58:29 -07:00
Frédéric Guillot
1eba1730d1 Move HTTP client to its own package 2018-04-28 10:51:07 -07:00
Frédéric Guillot
f110384f11 Improve parser error messages 2018-02-27 21:19:59 -08:00
Frédéric Guillot
953d0a2dc0 Support localized feed errors generated by background workers 2018-02-27 21:08:32 -08:00
Frédéric Guillot
dda9114692 Improve error handling for HTTP client 2018-02-08 18:16:54 -08:00
Frédéric Guillot
7b0bfd9308 Strip invalid XML characters to avoid parsing errors 2018-02-07 20:57:56 -08:00
Frédéric Guillot
c6fd9eb9b1 Remove period for feed errors 2018-02-07 19:10:36 -08:00
Frédéric Guillot
0fb87eba3f Improve error handling when the response is empty 2018-02-07 18:47:47 -08:00
Frédéric Guillot
b78172033f Show API URL endpoints in user interface 2018-01-31 21:57:20 -08:00
Frédéric Guillot
ffabb009b8 Do not override existing entries when the crawler is enabled 2018-01-20 14:04:19 -08:00
Frédéric Guillot
713b38e34c Handle more encoding edge cases
- Feeds with charset specified only in Content-Type header and not in XML document
- Feeds with charset specified in both places
- Feeds with charset specified only in XML document and not in HTTP header
2018-01-20 13:25:21 -08:00
Frédéric Guillot
3b62f904d6 Do not crawl existing entry URLs 2018-01-20 13:25:20 -08:00
Frédéric Guillot
7d278d49f1 Add content length check when refreshing feeds 2018-01-04 18:41:23 -08:00
Frédéric Guillot
ec63cbe7bb If the website URL is empty, assign the feed URL 2018-01-03 18:23:21 -08:00
Frédéric Guillot
c39f2e1a8d Rename helper packages 2018-01-02 19:15:08 -08:00
Frédéric Guillot
1d8193b892 Add logger 2017-12-15 18:55:57 -08:00
Frédéric Guillot
84d912c979 Rewrite imports 2017-12-12 21:48:13 -08:00
Frédéric Guillot
ef097f02fe Add the possibility to enable crawler for feeds 2017-12-12 19:19:36 -08:00
Frédéric Guillot
33445e5b68 Add the possibility to define rewrite rules for each feed 2017-12-11 22:16:32 -08:00
Frédéric Guillot
6f5350a497 Move packages http and url 2017-12-02 20:26:21 -08:00
Frédéric Guillot
bb8e61c7c5 Make sure golint passes on the code base 2017-11-27 21:40:05 -08:00
Frédéric Guillot
71bf7e4358 Improve API 2017-11-24 22:29:20 -08:00
Frédéric Guillot
d5838b6734 Move feed parsers packages in reader package 2017-11-20 19:17:04 -08:00
Frédéric Guillot
e91a9b4f13 Export only necessary structs in JsonFeed package 2017-11-20 18:57:54 -08:00
Frédéric Guillot
6618caca81 Use more idiomatic code for Atom parser 2017-11-20 18:50:16 -08:00
Frédéric Guillot
89307010ad Add parser for RDF feeds 2017-11-20 18:34:11 -08:00
Frédéric Guillot
aecda64030 Make sure XML feeds are always encoded in UTF-8 2017-11-20 17:12:37 -08:00
Frédéric Guillot
0e6717b7c8 Ensure that LocalizedError are returned by parsers 2017-11-20 16:11:55 -08:00
Frédéric Guillot
557cf9c21d Handle RSS entries with Atom links 2017-11-20 15:48:26 -08:00
Frédéric Guillot
cf8af56a99 Handle RSS feeds without entry links 2017-11-20 15:15:10 -08:00
Frédéric Guillot
8ffb773f43 First commit 2017-11-19 22:01:46 -08:00
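
The three rules listed under commit 3debf75eb9 (encode query string parameters, omit the equal sign for parameters with an empty value, strip the URL fragment) can be illustrated with a minimal sketch. This is not the project's implementation: it is written in Python for illustration only, and the function name `normalize_url` and the example URLs are assumptions. Note that `urllib.parse.urlencode` in Python, like the standard library mentioned in the commit, would emit `empty=` for an empty value, which is why the pairs are joined by hand here.

```python
from urllib.parse import quote_plus, unquote_plus, urlsplit, urlunsplit


def normalize_url(raw_url: str) -> str:
    """Hypothetical helper illustrating the rules from commit 3debf75eb9."""
    parts = urlsplit(raw_url)
    pairs = []
    for field in parts.query.split("&"):
        if not field:
            continue  # skip empty segments such as a trailing "&"
        key, _, value = field.partition("=")
        # Decode first so already-encoded input is not encoded twice.
        key = quote_plus(unquote_plus(key))
        value = quote_plus(unquote_plus(value))
        # Parameters with an empty value are written without "=".
        pairs.append(f"{key}={value}" if value else key)
    # Passing an empty fragment drops everything after "#".
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "&".join(pairs), ""))


print(normalize_url("https://example.org/feed?q=a b&empty=&flag#section"))
```

The decode-then-encode step is a design choice of this sketch so that running the function on its own output leaves the URL unchanged; the commit message does not say how the actual code handles already-encoded input.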