Instead of having to allocate a ~100 keys map containing possibly dynamic
values (at least to the go compiler), allocate it once in a global variable.
This significantly speeds things up, by reducing the garbage
collector/allocator involvements.
Local synthetic benchmarks have shown a improvements from 38% of wall time to only
12%.
- `make([]a, b)` create a slice of `b` elements `a`
- `make([]a, b, c)` create a slice of `0` elements `a`, but reserve space for `c` of them
When using `append` on the former, it will result on a slice with `b` leading
elements, which is unlikely to be what we want. This commit replaces the two
instances where this happens with the latter construct.
Go 1.22 introduced a new [for-range](https://go.dev/ref/spec#For_range)
construct that looks a tad better than the usual `for i := 0; i < N; i++`
construct. I also tool the liberty of replacing some
`for i := 0; i < len(myitemsarray); i++ { … myitemsarray[i] …}`
with `for item := range myitemsarray` when `myitemsarray` contains only pointers.
- Use a simple regex to parse data uri instead of a hand-rolled parser, and
document what fields are considered mandatory.
- Use case-insensitive matching to find (fav)icons, instead of doing the same
query twice with different letter cases
- Add 'apple-touch-icon-precomposed.png' as a fallback favicon
- Reorder the queries to have i`con` first, since it seems to be the most
popular one. It used to be last, meaning that pages had to be parsed
completely 4 times, instead of one now.
- Minor factorisation in findIconURLsFromHTMLDocument
- Split dates formats into those that require local times
and those who don't, so that there is no need to have a switch-case in the
for loop with around 250 iterations at most.
- Be more strict when it comes to timezones, previously invalid ones like -13
were accepted. Also add a test for this.
- Bail out early if the date is an empty string.
- make findContentUsingCustomRules' more idiomatic,
since in golang a function returning an error might
return garbage in other parameter. Moreover, ignoring
errors is bad practise.
- getPredefinedScraperRules is now running in constant-time,
instead of iterating on a list with around 50 items in it.
- Surface `localizedError` in FindSubscriptionsFromWellKnownURLs via slog
- Use an inline declaration for new subscriptions, like done elsewhere in the
file, if only for consistency's sake
- Preallocate the `subscriptions` slice when using an RSS-bridge,
it's a good practise, and it might even marginally improve
performances when adding __a lot__ of feeds via an rss-bridge instance, wooo!
- `NOT (hash=ANY(%4))` can be expressed as `hash NOT IN $4`
- There is no need for a subquery operating on the same table,
moving the conditions out is equivalent.
No need for a `BETWEEN`: we want to filter on entries published in the last
week, no need to express is as "entries published between now and last week",
"entries published after last week" is enough.
- Use constant time access for maps instead of iterating on them
- Build a ~large whitelist map inline instead of constructing it item by item
(and remove a duplicate key/value pair)
- Use `slices` instead of hand-rolled loops
Given that there is always a ton of `Entry` floating around, reordering its
field to take less space is a quick/simple way to reduce miniflux' memory
consumption.
I kept the `ID` field as the first member, as I think it's the most important
one, and moving it somewhere else would drown it in other fields.
Anyway, this still provides a reduction of 32 bytes per Entry:
```console
$ fieldalignment ./client/model.go 2>&1 | grep 203
~/v2/client/model.go:203:12: struct with 280 pointer bytes could be 240
$ fieldalignment ./client/model.go 2>&1 | grep 203
~/v2/client/model.go:203:12: struct with 248 pointer bytes could be 240
$
```
The same optimisation pass could be applied to other structs, but since they
aren't present in obviously great numbers during miniflux' life cycle, it would
likely require some profiling to see if it's worth doing it.