- Refactor the tests and add some more
- Use only 250 characters instead of the whole text
- Only check for Korean, Chinese and Japanese script
- Add a benchmark
- Use a more idiomatic control flow
```console
$ # main branch
$ go test -bench=.
goos: linux
goarch: amd64
pkg: miniflux.app/v2/internal/reader/readingtime
BenchmarkEstimateReadingTime-12 267 4821268 ns/op
PASS
ok miniflux.app/v2/internal/reader/readingtime 1.754s
$ # speed_up_reading_time branch
$ go test -bench=.
goos: linux
goarch: amd64
pkg: miniflux.app/v2/internal/reader/readingtime
cpu: 12th Gen Intel(R) Core(TM) i7-1265U
BenchmarkEstimateReadingTime-12 1941 653312 ns/op
PASS
ok miniflux.app/v2/internal/reader/readingtime 1.342s
$
```
If the user doesn't display reading times, there is no need to compute them.
This should speed things up a bit, since `whatlanggo.Detect` is abysmally slow.
Instead of allocating a map of ~100 keys containing possibly dynamic
values (at least as far as the Go compiler can tell) on every call, allocate it
once in a global variable. This speeds things up significantly by reducing
garbage collector and allocator involvement.
Local synthetic benchmarks have shown an improvement from 38% of wall time to only
12%.
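
A minimal sketch of the pattern, with a hypothetical lookup table (`knownTags` and its contents are invented for the example and are much smaller than the real map):

```go
package main

import "fmt"

// knownTags is built once, when the package is initialised.
// Its name and contents are hypothetical.
var knownTags = map[string]struct{}{
	"a":   {},
	"img": {},
	"p":   {},
}

// isKnownPerCall rebuilds the table on every call, which is the shape of
// code this commit moves away from.
func isKnownPerCall(tag string) bool {
	tags := map[string]struct{}{
		"a":   {},
		"img": {},
		"p":   {},
	}
	_, ok := tags[tag]
	return ok
}

// isKnown only performs a lookup in the shared, read-only table.
func isKnown(tag string) bool {
	_, ok := knownTags[tag]
	return ok
}

func main() {
	fmt.Println(isKnownPerCall("img"), isKnown("img"))
	fmt.Println(isKnownPerCall("script"), isKnown("script"))
}
```

Both functions return the same answers; only the second one avoids rebuilding the map. The shared map must be treated as read-only, since concurrent writes to a Go map are not safe.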
- `make([]a, b)` creates a slice of `b` elements of type `a`
- `make([]a, 0, c)` creates a slice of `0` elements of type `a`, but reserves space for `c` of them

When using `append` on the former, the result is a slice with `b` leading
zero-valued elements, which is unlikely to be what we want. This commit replaces
the two instances where this happens with the latter construct.
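
A short, self-contained demonstration of the difference between a slice length and a slice capacity (the function names are made up for the example):

```go
package main

import "fmt"

// withLength creates a slice that already holds n zero values, so the
// appended element ends up at index n.
func withLength(n int) []int {
	s := make([]int, n)
	return append(s, 1)
}

// withCapacity creates an empty slice with room for n elements, so the
// appended element ends up at index 0 and no reallocation is needed.
func withCapacity(n int) []int {
	s := make([]int, 0, n)
	return append(s, 1)
}

func main() {
	fmt.Println(withLength(3))   // [0 0 0 1]
	fmt.Println(withCapacity(3)) // [1]
}
```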
Go 1.22 introduced a new [for-range](https://go.dev/ref/spec#For_range)
construct over integers that looks a tad better than the usual
`for i := 0; i < N; i++` loop. I also took the liberty of replacing some
`for i := 0; i < len(myitemsarray); i++ { … myitemsarray[i] … }`
with `for _, item := range myitemsarray` when `myitemsarray` contains only pointers.
- Use a simple regex to parse data URIs instead of a hand-rolled parser, and
document which fields are considered mandatory.
- Use case-insensitive matching to find (fav)icons, instead of doing the same
query twice with different letter cases
- Add 'apple-touch-icon-precomposed.png' as a fallback favicon
- Reorder the queries to have `icon` first, since it seems to be the most
popular one. It used to be last, meaning that pages had to be parsed
completely 4 times, instead of once now.
- Minor refactoring in `findIconURLsFromHTMLDocument`
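
To give an idea of what regex-based data URI parsing can look like, here is a sketch using only the standard library. The pattern, `dataURIPattern` and `parseDataURI` are written for this example and are not the ones from the commit. In this sketch only the `data:` prefix and the comma are mandatory, while the media type, the parameters and the `;base64` marker are optional.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/url"
	"regexp"
)

// dataURIPattern captures, in order: the media type, the optional parameters,
// the optional ";base64" marker and the payload. Hypothetical pattern.
var dataURIPattern = regexp.MustCompile(`(?is)^data:([^;,]*)((?:;[^;,]+)*?)(;base64)?,(.*)$`)

// parseDataURI returns the media type and the decoded payload of a data URI.
func parseDataURI(uri string) (string, []byte, error) {
	m := dataURIPattern.FindStringSubmatch(uri)
	if m == nil {
		return "", nil, fmt.Errorf("not a data URI: %q", uri)
	}
	mediaType, payload := m[1], m[4]
	if m[3] != "" {
		data, err := base64.StdEncoding.DecodeString(payload)
		if err != nil {
			return "", nil, fmt.Errorf("invalid base64 payload: %w", err)
		}
		return mediaType, data, nil
	}
	// Without ";base64", the payload is percent-encoded text.
	text, err := url.PathUnescape(payload)
	if err != nil {
		return "", nil, fmt.Errorf("invalid percent-encoding: %w", err)
	}
	return mediaType, []byte(text), nil
}

func main() {
	mediaType, data, err := parseDataURI("data:image/png;base64,aGk=")
	fmt.Println(mediaType, string(data), err)

	_, _, err = parseDataURI("https://example.org/favicon.ico")
	fmt.Println(err)
}
```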
- Split date formats into those that require local times
and those that don't, so that there is no need for a switch-case in the
for loop with around 250 iterations at most.
- Be stricter when it comes to timezones: previously, invalid ones like -13
were accepted. Also add a test for this.
- Bail out early if the date is an empty string.
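
The three points above could be sketched as follows. This is an illustration, not the parser from the commit: the two layout lists are tiny stand-ins for the real ones, `parseDate` and `checkOffset` are hypothetical names, and the accepted offset range (UTC-12:00 to UTC+14:00) is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Layouts carrying their own timezone information.
var zonedLayouts = []string{time.RFC1123Z, time.RFC3339}

// Layouts without a timezone, interpreted in the local timezone.
var localLayouts = []string{"2006-01-02 15:04:05", "2006-01-02"}

// parseDate tries each list in its own loop, so no per-iteration switch is
// needed to decide how a layout must be parsed.
func parseDate(raw string) (time.Time, error) {
	if raw == "" {
		return time.Time{}, errors.New("empty date")
	}
	for _, layout := range zonedLayouts {
		if t, err := time.Parse(layout, raw); err == nil {
			return checkOffset(t)
		}
	}
	for _, layout := range localLayouts {
		if t, err := time.ParseInLocation(layout, raw, time.Local); err == nil {
			return t, nil
		}
	}
	return time.Time{}, fmt.Errorf("unsupported date format: %q", raw)
}

// checkOffset rejects offsets outside of UTC-12:00..UTC+14:00.
func checkOffset(t time.Time) (time.Time, error) {
	_, offset := t.Zone()
	if offset < -12*3600 || offset > 14*3600 {
		return time.Time{}, fmt.Errorf("invalid timezone offset: %d seconds", offset)
	}
	return t, nil
}

func main() {
	for _, raw := range []string{
		"Mon, 02 Jan 2006 15:04:05 +0200",
		"Mon, 02 Jan 2006 15:04:05 -1300",
		"",
	} {
		_, err := parseDate(raw)
		fmt.Printf("%q: %v\n", raw, err)
	}
}
```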
- Make `findContentUsingCustomRules` more idiomatic,
since in Go a function returning an error might
return garbage in its other return values. Moreover, ignoring
errors is bad practice.
- `getPredefinedScraperRules` now runs in constant time,
instead of iterating over a list of around 50 items.
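
A sketch of both ideas, assuming the rules can be keyed by domain. The map contents, `rulesFor` and `extract` are hypothetical and only stand in for the real functions.

```go
package main

import (
	"errors"
	"fmt"
)

// predefinedRules maps a domain to a CSS selector. The entries are made up.
var predefinedRules = map[string]string{
	"example.org": "article.post",
	"example.com": "div#content",
}

// rulesFor is a single map lookup, instead of a scan over a list of rules.
func rulesFor(domain string) (string, bool) {
	rule, ok := predefinedRules[domain]
	return rule, ok
}

// extract returns an error rather than a possibly meaningless value when no
// rule applies. Callers must check the error before using the result.
func extract(domain, page string) (string, error) {
	rule, ok := rulesFor(domain)
	if !ok {
		return "", errors.New("no predefined rule for " + domain)
	}
	return fmt.Sprintf("content of %q selected by %q", page, rule), nil
}

func main() {
	content, err := extract("example.net", "<html></html>")
	if err != nil {
		// The other return value is not looked at once err is non-nil.
		fmt.Println("falling back:", err)
		return
	}
	fmt.Println(content)
}
```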
- Surface `localizedError` in FindSubscriptionsFromWellKnownURLs via slog
- Use an inline declaration for new subscriptions, as done elsewhere in the
file, if only for consistency's sake
- Preallocate the `subscriptions` slice when using an RSS-bridge,
it's good practice, and it might even marginally improve
performance when adding __a lot__ of feeds via an rss-bridge instance, wooo!
- `NOT (hash=ANY($4))` can be expressed as `hash NOT IN $4`
- There is no need for a subquery operating on the same table,
moving the conditions out is equivalent.