Instead of allocating a map of roughly 100 keys on every call, with values
that are possibly dynamic (at least as far as the Go compiler can tell),
allocate it once in a global variable. This speeds things up significantly by
reducing the work done by the garbage collector and the allocator.
Local synthetic benchmarks show an improvement from 38% of wall time to only
12%.
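As a minimal sketch of the idea, assuming a hypothetical lookup table shortened to three keys (`allowedTags`, `isAllowed` and `isAllowedSlow` are illustrative names, not taken from the codebase), the difference looks roughly like this:

```go
package main

import "fmt"

// allowedTags is a hypothetical lookup table. It is built once, at package
// initialisation, and only read afterwards.
var allowedTags = map[string]bool{
	"a":   true,
	"img": true,
	"p":   true,
}

// isAllowedSlow rebuilds the map on every call. With a map of ~100 keys
// this typically means a fresh allocation per call and more garbage to collect.
func isAllowedSlow(tag string) bool {
	m := map[string]bool{
		"a":   true,
		"img": true,
		"p":   true,
	}
	return m[tag]
}

// isAllowed reads the shared package-level map instead of rebuilding it.
func isAllowed(tag string) bool {
	return allowedTags[tag]
}

func main() {
	fmt.Println(isAllowedSlow("img"), isAllowed("img"), isAllowed("script"))
}
```

A package-level map is safe to read from several goroutines as long as nothing writes to it after initialisation.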
Go 1.22 introduced a new [for-range](https://go.dev/ref/spec#For_range)
construct that looks a tad better than the usual `for i := 0; i < N; i++`
construct. I also took the liberty of replacing some
`for i := 0; i < len(myitemsarray); i++ { … myitemsarray[i] …}`
with `for _, item := range myitemsarray` when `myitemsarray` contains only pointers.
- Use a simple regex to parse data URIs instead of a hand-rolled parser, and
document which fields are considered mandatory.
- Use case-insensitive matching to find (fav)icons, instead of doing the same
query twice with different letter cases
- Add 'apple-touch-icon-precomposed.png' as a fallback favicon
- Reorder the queries to have `icon` first, since it seems to be the most
popular one. It used to be last, meaning that pages had to be parsed
completely four times, instead of once now.
- Minor factorisation in findIconURLsFromHTMLDocument
- Split date formats into those that require local times
and those that don't, so that there is no need for a switch-case in the
for loop, which runs around 250 iterations at most.
- Be stricter about timezones: invalid ones like -13 were previously
accepted. Also add a test for this.
- Bail out early if the date is an empty string.
- Make `findContentUsingCustomRules` more idiomatic,
since in Go a function returning an error might
return garbage in its other return values. Moreover, ignoring
errors is bad practice.
- `getPredefinedScraperRules` now runs in constant time,
instead of iterating over a list of around 50 items.
- Surface `localizedError` in FindSubscriptionsFromWellKnownURLs via slog
- Use an inline declaration for new subscriptions, like done elsewhere in the
file, if only for consistency's sake
- Preallocate the `subscriptions` slice when using an RSS-bridge.
It's good practice, and it might even marginally improve
performance when adding __a lot__ of feeds via an RSS-bridge instance, wooo!
- Use constant time access for maps instead of iterating on them
- Build a ~large whitelist map inline instead of constructing it item by item
(and remove a duplicate key/value pair)
- Use `slices` instead of hand-rolled loops
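Two of the bullets above, constant-time map access and the `slices` package, can be sketched as follows. This is a minimal illustration rather than the actual change: `rulesByDomain`, `ruleFor`, `blockedTags` and `isBlockedTag` are hypothetical names, and `slices` requires Go 1.21 or later.

```go
package main

import (
	"fmt"
	"slices"
)

// rulesByDomain is a hypothetical domain -> scraper rule table, declared
// inline as a single map literal.
var rulesByDomain = map[string]string{
	"example.org": "article.post",
	"example.com": "div#content",
}

// ruleFor does a single map lookup instead of looping over a list of rules
// and comparing each entry.
func ruleFor(domain string) (string, bool) {
	rule, ok := rulesByDomain[domain]
	return rule, ok
}

// blockedTags is a hypothetical short list where a linear scan is fine.
var blockedTags = []string{"script", "style", "noscript"}

// isBlockedTag replaces a hand-rolled loop with slices.Contains.
func isBlockedTag(tag string) bool {
	return slices.Contains(blockedTags, tag)
}

func main() {
	rule, ok := ruleFor("example.org")
	fmt.Println(rule, ok)
	fmt.Println(isBlockedTag("style"), isBlockedTag("p"))
}
```

Writing the map as one literal also lets the compiler reject duplicate constant keys, which is one way a duplicate entry can surface.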
As per [OPML 2.0 specification]:
> Each sub-element of the body of the OPML document is a node of type rss or an outline element that contains nodes of type rss.
> Required attributes: type, text, xmlUrl.
[OPML 2.0 specification]: http://opml.org/spec2.opml#subscriptionLists
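To make the quoted requirement concrete, here is a rough sketch using `encoding/xml` that walks nested outlines and keeps only those carrying all three required attributes. It is a strict reading of the quoted passage and not necessarily what the parser in this project does; the type and function names are hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// outline mirrors an OPML <outline> element; outlines may nest.
type outline struct {
	Type     string    `xml:"type,attr"`
	Text     string    `xml:"text,attr"`
	XMLURL   string    `xml:"xmlUrl,attr"`
	Children []outline `xml:"outline"`
}

type opmlDoc struct {
	Outlines []outline `xml:"body>outline"`
}

// isSubscription applies the strict reading: type, text and xmlUrl are all
// required, and type must be "rss".
func (o outline) isSubscription() bool {
	return o.Type == "rss" && o.Text != "" && o.XMLURL != ""
}

// collect walks the outline tree depth-first and gathers feed URLs.
func collect(outlines []outline, urls []string) []string {
	for _, o := range outlines {
		if o.isSubscription() {
			urls = append(urls, o.XMLURL)
		}
		urls = collect(o.Children, urls)
	}
	return urls
}

// feedURLs parses an OPML document and returns the subscription URLs.
func feedURLs(doc string) ([]string, error) {
	var parsed opmlDoc
	if err := xml.NewDecoder(strings.NewReader(doc)).Decode(&parsed); err != nil {
		return nil, err
	}
	return collect(parsed.Outlines, nil), nil
}

// sample contains one folder, two valid feeds and one outline without xmlUrl.
const sample = `<opml version="2.0"><body>
<outline text="Tech">
  <outline type="rss" text="Feed A" xmlUrl="https://example.org/a.xml"/>
  <outline type="rss" text="No URL"/>
</outline>
<outline type="rss" text="Feed B" xmlUrl="https://example.org/b.xml"/>
</body></opml>`

func main() {
	urls, err := feedURLs(sample)
	fmt.Println(urls, err)
}
```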
The recent HTTP client refactor in 14e25ab9fe
caused feed refreshes to no longer make conditional requests. Prior to
the refactor, `client.WithCacheHeaders` handled this. Now this function
is split into `fetcher.WithETag` and `fetcher.WithLastModified` but
these functions are only declared and never actually used. Fix this by
calling them inside `handler.RefreshFeed`.
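For readers unfamiliar with conditional requests, the sketch below shows the mechanism with the standard library only. It does not use the project's `fetcher` package; `newConditionalRequest`, `feedHandler` and `statusFor` are hypothetical helpers, and the handler stands in for a remote feed server.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newConditionalRequest builds a GET request carrying the validators saved
// from a previous response. Empty values are skipped.
func newConditionalRequest(url, etag, lastModified string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if etag != "" {
		req.Header.Set("If-None-Match", etag)
	}
	if lastModified != "" {
		req.Header.Set("If-Modified-Since", lastModified)
	}
	return req, nil
}

// feedHandler is a stand-in feed server: it answers 304 when the client
// already holds the current ETag, and sends the feed body otherwise.
func feedHandler(w http.ResponseWriter, r *http.Request) {
	if r.Header.Get("If-None-Match") == `"v1"` {
		w.WriteHeader(http.StatusNotModified)
		return
	}
	w.Header().Set("ETag", `"v1"`)
	fmt.Fprint(w, "<rss/>")
}

// statusFor runs one request through feedHandler and returns the status code.
func statusFor(etag string) (int, error) {
	req, err := newConditionalRequest("https://example.org/feed.xml", etag, "")
	if err != nil {
		return 0, err
	}
	rec := httptest.NewRecorder()
	feedHandler(rec, req)
	return rec.Code, nil
}

func main() {
	first, _ := statusFor("")
	second, _ := statusFor(`"v1"`)
	fmt.Println(first, second) // 200 304
}
```

Without the two headers every refresh downloads and parses the whole feed again, which is the regression described above.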
The recent HTTP client refactor in 14e25ab9fe
introduced a bug in which the global default User-Agent is no longer
used for requests. Unless a per-feed User-Agent exists, the Go standard
library's default User-Agent is used, which looks something like
"Go-http-client/1.1". To fix this, make RequestBuilder.WithUserAgent
take an additional argument, the default User-Agent, which will be used
if there is no per-feed User-Agent (i.e. it is an empty string).
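The fallback described above can be sketched as follows. The `requestBuilder` type here is a simplified, hypothetical stand-in and not the project's actual `RequestBuilder`; only the two-argument shape of the method follows the description.

```go
package main

import (
	"fmt"
	"net/http"
)

// requestBuilder is a hypothetical, stripped-down builder that only
// collects headers.
type requestBuilder struct {
	headers http.Header
}

func newRequestBuilder() *requestBuilder {
	return &requestBuilder{headers: make(http.Header)}
}

// withUserAgent uses the per-feed User-Agent when it is non-empty and falls
// back to defaultUserAgent otherwise, so the header is always set explicitly.
func (b *requestBuilder) withUserAgent(userAgent, defaultUserAgent string) *requestBuilder {
	if userAgent != "" {
		b.headers.Set("User-Agent", userAgent)
	} else {
		b.headers.Set("User-Agent", defaultUserAgent)
	}
	return b
}

func main() {
	// "ExampleReader/1.0" is a made-up default used only for this sketch.
	perFeed := newRequestBuilder().withUserAgent("CustomBot/2.0", "ExampleReader/1.0")
	fallback := newRequestBuilder().withUserAgent("", "ExampleReader/1.0")
	fmt.Println(perFeed.headers.Get("User-Agent"))
	fmt.Println(fallback.headers.Get("User-Agent"))
}
```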
Fixes #2188
Fixes #2189