Frédéric Guillot
97765b93a9
Revert "Minor internal/reader/readability/readability.go speedup"
...
This reverts commit 4db138d4b8.
```
panic: runtime error: index out of range [-1]
goroutine 49 [running]:
miniflux.app/v2/internal/reader/readability.getArticle.func1(0x8?, 0xc000b56570)
/home/fred/repos/miniflux/v2/internal/reader/readability/readability.go:120 +0x2ac
github.com/PuerkitoBio/goquery.(*Selection).Each(0xc000b56510, 0xc000892fa8)
/home/fred/go/pkg/mod/github.com/!puerkito!bio/goquery@v1.9.0/iteration.go:10 +0x62
miniflux.app/v2/internal/reader/readability.getArticle(0xc00044f1f0, 0xc000a04a50)
/home/fred/repos/miniflux/v2/internal/reader/readability/readability.go:101 +0x15d
miniflux.app/v2/internal/reader/readability.ExtractContent({0x1005d00?, 0xc0001522d0?})
/home/fred/repos/miniflux/v2/internal/reader/readability/readability.go:91 +0x211
miniflux.app/v2/internal/reader/scraper.ScrapeWebsite(0xc000893688?, {0xc0007ce720, 0x54}, {0x0, 0x0})
/home/fred/repos/miniflux/v2/internal/reader/scraper/scraper.go:63 +0x859
miniflux.app/v2/internal/reader/processor.ProcessFeedEntries(0xc000133188, 0xc000502c40, 0xc0003e6360, 0x0)
/home/fred/repos/miniflux/v2/internal/reader/processor/processor.go:77 +0x8ea
miniflux.app/v2/internal/reader/handler.RefreshFeed(0xc000133188, 0x10cf, 0x52d5c, 0x0)
/home/fred/repos/miniflux/v2/internal/reader/handler/handler.go:301 +0x1485
miniflux.app/v2/internal/cli.refreshFeeds.func1(0x0)
/home/fred/repos/miniflux/v2/internal/cli/refresh_feeds.go:59 +0x2d7
created by miniflux.app/v2/internal/cli.refreshFeeds in goroutine 1
/home/fred/repos/miniflux/v2/internal/cli/refresh_feeds.go:50 +0x5d5
```
2024-02-29 19:06:03 -08:00
dependabot[bot]
f858ad5f26
Bump github.com/PuerkitoBio/goquery from 1.9.0 to 1.9.1
...
Bumps [github.com/PuerkitoBio/goquery](https://github.com/PuerkitoBio/goquery) from 1.9.0 to 1.9.1.
- [Release notes](https://github.com/PuerkitoBio/goquery/releases)
- [Commits](https://github.com/PuerkitoBio/goquery/compare/v1.9.0...v1.9.1)
---
updated-dependencies:
- dependency-name: github.com/PuerkitoBio/goquery
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-02-29 18:36:57 -08:00
jvoisin
e6524f925f
Simplify username generation for the tests
...
No need to generate random numbers 10 times; generate a single big-enough one.
A single int64 should be more than enough.
2024-02-29 18:36:34 -08:00
Frédéric Guillot
c493f8921e
Add missing regex anchor detected by CodeQL
2024-02-28 20:50:17 -08:00
Frédéric Guillot
b2ce98da87
Add missing plurals for some languages
2024-02-28 20:38:10 -08:00
jvoisin
4db138d4b8
Minor internal/reader/readability/readability.go speedup
...
- Don't use a capturing group in `divToPElementsRegexp`
- Remove a duplicate condition
- Replace a regex with a fixed-comparison and a `Contains`
2024-02-28 20:03:14 -08:00
jvoisin
f12d5131b0
Divide the sanitization time by 3
...
Instead of having to allocate a ~100-key map containing possibly dynamic
values (at least to the Go compiler), allocate it once in a global variable.
This significantly speeds things up by reducing garbage
collector/allocator involvement.
Local synthetic benchmarks have shown an improvement from 38% of wall time to only
12%.
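
A minimal sketch of the idea, not the actual sanitizer code: the `allowedTags` table and both function names are hypothetical. It contrasts a map built once at package initialisation with one rebuilt on every call.

```go
package main

import "fmt"

// allowedTags is a hypothetical allow-list, built once when the package
// is initialised and only read afterwards.
var allowedTags = map[string]struct{}{
	"a": {}, "p": {}, "img": {},
}

// isAllowed consults the shared map; no map is created per call.
func isAllowed(tag string) bool {
	_, ok := allowedTags[tag]
	return ok
}

// isAllowedSlow builds the same map on every call, which is extra work
// for the allocator and the garbage collector.
func isAllowedSlow(tag string) bool {
	tags := map[string]struct{}{"a": {}, "p": {}, "img": {}}
	_, ok := tags[tag]
	return ok
}

func main() {
	fmt.Println(isAllowed("p"), isAllowed("script"))
	fmt.Println(isAllowedSlow("p"), isAllowedSlow("script"))
}
```

Both variants return the same answers; only the allocation pattern differs.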
2024-02-28 20:00:13 -08:00
jvoisin
1f5c8ce353
Don't mix up capacity and length
...
- `make([]a, b)` creates a slice of `b` elements of type `a`
- `make([]a, 0, c)` creates a slice of `0` elements of type `a`, but reserves space for `c` of them
When using `append` on the former, the result is a slice with `b` leading
zero-valued elements, which is unlikely to be what we want. This commit replaces the two
instances where this happens with the latter construct.
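
The difference can be illustrated with a small sketch; the function names `buggy` and `fixed` are made up for this example and do not come from the codebase.

```go
package main

import "fmt"

// buggy mixes up length and capacity: make([]int, n) already contains
// n zero values, so append places the real values after them.
func buggy(values []int) []int {
	out := make([]int, len(values))
	for _, v := range values {
		out = append(out, v)
	}
	return out
}

// fixed starts with length 0 and only reserves capacity for the values.
func fixed(values []int) []int {
	out := make([]int, 0, len(values))
	for _, v := range values {
		out = append(out, v)
	}
	return out
}

func main() {
	in := []int{1, 2, 3}
	fmt.Println(buggy(in)) // [0 0 0 1 2 3]
	fmt.Println(fixed(in)) // [1 2 3]
}
```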
2024-02-28 19:57:30 -08:00
jvoisin
645a817685
Use modern for loops
...
Go 1.22 introduced a new [for-range](https://go.dev/ref/spec#For_range)
construct that looks a tad better than the usual `for i := 0; i < N; i++`
construct. I also took the liberty of replacing some
`for i := 0; i < len(myitemsarray); i++ { … myitemsarray[i] … }`
with `for _, item := range myitemsarray` when `myitemsarray` contains only pointers.
2024-02-28 19:55:28 -08:00
jvoisin
f4f8342245
Remove a superfluous condition
...
No need to check if the length of `line` is positive since we're checking
afterwards that it contains the `=` sign.
2024-02-28 19:47:30 -08:00
jvoisin
543a690bfd
Close resources as soon as possible, instead of using defer() in a loop
...
So that resources can be freed as soon as they're not used anymore, instead of
waiting for the two nested loops to finish.
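
A small sketch of why this matters, using a hypothetical `resource` type as a stand-in for a file, response body, or rows handle. Deferred calls run only when the enclosing function returns, so every resource acquired in the loop stays open until then.

```go
package main

import "fmt"

// resource is a hypothetical stand-in that tracks how many are open.
type resource struct{ open *int }

func acquire(open *int) *resource {
	*open++
	return &resource{open: open}
}

func (r *resource) Close() { *r.open-- }

// withDefer returns the peak number of simultaneously open resources
// when Close is deferred inside the loop.
func withDefer(n int) int {
	open, peak := 0, 0
	for i := 0; i < n; i++ {
		r := acquire(&open)
		defer r.Close() // runs only when withDefer returns
		if open > peak {
			peak = open
		}
	}
	return peak
}

// closeEarly closes each resource as soon as it is no longer needed.
func closeEarly(n int) int {
	open, peak := 0, 0
	for i := 0; i < n; i++ {
		r := acquire(&open)
		if open > peak {
			peak = open
		}
		r.Close()
	}
	return peak
}

func main() {
	fmt.Println(withDefer(3), closeEarly(3)) // 3 1
}
```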
2024-02-28 19:47:30 -08:00
jvoisin
c4e5dad549
Remove superfluous escaping in a regex
2024-02-28 19:47:30 -08:00
jvoisin
fa12c23d79
Use strings.ReplaceAll instead of strings.Replace(…, -1)
2024-02-28 19:47:30 -08:00
jvoisin
4fe902a5d2
Use strings.EqualFold
instead of strings.ToLower(…) ==
2024-02-28 19:47:30 -08:00
jvoisin
61af08a721
Use .WriteString( instead of .Write([]byte(…
2024-02-28 19:47:30 -08:00
jvoisin
b04550e2f2
Use %q
instead of "%s"
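
For illustration, a short sketch of the difference (the helper names are invented): `%q` produces a Go-escaped, double-quoted string, while wrapping `%s` in quotes leaves embedded quotes untouched.

```go
package main

import "fmt"

// quoteNaive wraps the value in quotes by hand; embedded quotes are not escaped.
func quoteNaive(s string) string { return fmt.Sprintf("\"%s\"", s) }

// quoteVerb lets fmt produce a safely escaped, double-quoted string.
func quoteVerb(s string) string { return fmt.Sprintf("%q", s) }

func main() {
	in := `say "hi"`
	fmt.Println(quoteNaive(in)) // "say "hi""
	fmt.Println(quoteVerb(in))  // "say \"hi\""
}
```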
2024-02-28 19:47:30 -08:00
jvoisin
5e5cb056c5
Make internal/worker/worker.go read-only
...
Since workers don't communicate anything back to the pool with the channel,
there is no need to have it bidirectional.
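
A minimal sketch of a receive-only channel parameter, assuming nothing about the real worker pool (`consume` and `run` are hypothetical names). Declaring the parameter as `<-chan int` makes the compiler reject any attempt to send on it from the consumer side.

```go
package main

import "fmt"

// consume only receives from jobs, so the parameter is typed <-chan int.
// A statement such as `jobs <- 1` inside this function would not compile.
func consume(jobs <-chan int) int {
	sum := 0
	for j := range jobs {
		sum += j
	}
	return sum
}

// run creates a bidirectional channel, feeds it from a goroutine, and
// hands it to consume, where it is implicitly converted to receive-only.
func run(values []int) int {
	jobs := make(chan int)
	go func() {
		defer close(jobs)
		for _, v := range values {
			jobs <- v
		}
	}()
	return consume(jobs)
}

func main() {
	fmt.Println(run([]int{1, 2, 3})) // 6
}
```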
2024-02-28 19:39:03 -08:00
jvoisin
48fa64f8ec
Use a switch-case construct in internal/locale/plural.go instead of an avalanche of if-if-if-if-if
...
Fewer lines of code and marginally greater readability, yay!
Oh and also preallocate a map in LoadCatalogMessages just because we can.
2024-02-28 19:36:38 -08:00
jvoisin
f274394f0e
Simplify formatFileSize
...
No need to use a loop with divisions and multiplications when we have logarithms.
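
One possible shape of a logarithm-based formatter, shown only as a sketch: the unit names, output format, and use of `math.Log2` are assumptions made for this example and may differ from the actual `formatFileSize`.

```go
package main

import (
	"fmt"
	"math"
)

// formatFileSize picks the unit from the base-2 logarithm of the size
// instead of dividing by 1024 in a loop. Units and format are assumed.
func formatFileSize(b int64) string {
	if b < 1024 {
		return fmt.Sprintf("%d B", b)
	}
	// Each unit step is a factor of 2^10, so log2(b)/10 gives the unit index.
	exp := int(math.Log2(float64(b)) / 10)
	const units = "KMGTPE"
	value := float64(b) / float64(int64(1)<<(10*exp))
	return fmt.Sprintf("%.1f %ciB", value, units[exp-1])
}

func main() {
	fmt.Println(formatFileSize(512))     // 512 B
	fmt.Println(formatFileSize(1536))    // 1.5 KiB
	fmt.Println(formatFileSize(5242880)) // 5.0 MiB
}
```

Floating-point rounding deserves attention with this approach: sizes just below an exact power of 1024 may be assigned to the next unit up.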
2024-02-28 19:32:38 -08:00
jvoisin
9a4a942cc4
Simplify durationImpl
2024-02-28 19:32:38 -08:00
jvoisin
6b3b8e8c9b
Inline some templating functions
2024-02-28 19:32:38 -08:00
jvoisin
5a7d6f8997
Make use of printer.Print when possible
2024-02-28 19:24:41 -08:00
jvoisin
b4ed17fbac
Add a printer.Print to internal/locale/printer.go
...
No need to use variadic functions with string format interpolation
to generate static strings.
2024-02-28 19:24:41 -08:00
dependabot[bot]
57476f4d59
Bump github.com/prometheus/client_golang from 1.18.0 to 1.19.0
...
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.18.0 to 1.19.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/v1.19.0/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.18.0...v1.19.0)
---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-02-27 21:25:42 -08:00
jvoisin
7660910232
Use prepared statement for intervals
2024-02-27 21:25:25 -08:00
jvoisin
b054506e3a
Use proper prepared statements for ArchiveEntries
2024-02-27 21:25:25 -08:00
jvoisin
c961c6db7d
Use proper prepared statement for updateEnclosures
2024-02-27 21:25:25 -08:00
Frédéric Guillot
0f126d4d11
Fix CodeQL workflow
2024-02-27 21:01:38 -08:00
jvoisin
b94756bbf0
Add a warning for StripTags
2024-02-27 20:41:47 -08:00
jvoisin
db6ae707ef
Add some tests for add_image_title
...
I'm not sure if the behaviour is expected, but I didn't manage to
get the content injection to work in my browser, so I guess it's alright?
2024-02-27 20:41:15 -08:00
Frédéric Guillot
97feec8ebf
Add more URL validation in media proxy
2024-02-26 20:29:40 -08:00
jvoisin
bce21a9f91
Remove github.com/google/uuid
...
Replace it with a hand-rolled implementation. Heck, a UUID isn't even a
requirement: according to [omnivore](https://docs.omnivore.app/integrations/api.html#saving-a-url-with-the-api)'s
documentation, any "unique id" would do.
2024-02-26 18:31:12 -08:00
jvoisin
06e256e5ef
Simplify internal/reader/icon/finder.go
...
- Use a simple regex to parse data URIs instead of a hand-rolled parser, and
document what fields are considered mandatory.
- Use case-insensitive matching to find (fav)icons, instead of doing the same
query twice with different letter cases
- Add 'apple-touch-icon-precomposed.png' as a fallback favicon
- Reorder the queries to have `icon` first, since it seems to be the most
popular one. It used to be last, meaning that pages had to be parsed
completely 4 times, instead of once now.
- Minor factorisation in findIconURLsFromHTMLDocument
2024-02-26 18:18:04 -08:00
jvoisin
040938ff6d
Small refactoring of internal/reader/date/parser.go
...
- Split date formats into those that require local times
and those that don't, so that there is no need to have a switch-case in the
for loop with around 250 iterations at most.
- Be more strict when it comes to timezones: previously, invalid ones like -13
were accepted. Also add a test for this.
- Bail out early if the date is an empty string.
2024-02-26 18:08:04 -08:00
dependabot[bot]
21da7f77f5
Bump golang.org/x/crypto from 0.19.0 to 0.20.0
...
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.19.0 to 0.20.0.
- [Commits](https://github.com/golang/crypto/compare/v0.19.0...v0.20.0)
---
updated-dependencies:
- dependency-name: golang.org/x/crypto
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-02-26 18:01:00 -08:00
jvoisin
c2d2f31438
Improve a bit internal/reader/scraper/scraper.go
...
- Make findContentUsingCustomRules more idiomatic,
since in Go a function returning an error might
return garbage in its other return values. Moreover, ignoring
errors is bad practice.
- getPredefinedScraperRules now runs in constant time,
instead of iterating over a list with around 50 items in it.
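
Both points can be sketched together under stated assumptions: the `rules` table, `ruleFor`, and `describe` below are hypothetical and only illustrate a constant-time map lookup plus checking the error before touching the other return value.

```go
package main

import (
	"errors"
	"fmt"
)

// rules is a hypothetical domain-to-selector table; a map lookup replaces
// a linear scan over a list of rules.
var rules = map[string]string{
	"example.org": "article.post",
	"example.net": "div#content",
}

var errNoRule = errors.New("no predefined rule")

// ruleFor returns the selector for a domain, or an error if none exists.
func ruleFor(domain string) (string, error) {
	rule, ok := rules[domain]
	if !ok {
		return "", errNoRule
	}
	return rule, nil
}

// describe checks the error first and uses the rule only on success.
func describe(domain string) string {
	rule, err := ruleFor(domain)
	if err != nil {
		return "fallback: " + err.Error()
	}
	return "selector: " + rule
}

func main() {
	fmt.Println(describe("example.org")) // selector: article.post
	fmt.Println(describe("unknown.test")) // fallback: no predefined rule
}
```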
2024-02-26 18:00:23 -08:00
jvoisin
5b2558bf92
Miscellaneous improvements to internal/reader/subscription/finder.go
...
- Surface `localizedError` in FindSubscriptionsFromWellKnownURLs via slog
- Use an inline declaration for new subscriptions, like done elsewhere in the
file, if only for consistency's sake
- Preallocate the `subscriptions` slice when using an RSS-bridge,
it's good practice, and it might even marginally improve
performance when adding __a lot__ of feeds via an rss-bridge instance, wooo!
2024-02-26 17:52:21 -08:00
jvoisin
ecd59009fb
Add a couple of new possible locations for feeds
...
- Hugo likes to generate index.xml
- feed.atom and feed.rss are used by enterprise-scale/old-school gigantic CMS
2024-02-26 17:43:51 -08:00
jvoisin
4a943b722d
Add a couple of fuzzers
2024-02-26 17:23:49 -08:00
Frédéric Guillot
9d1b1e19d4
Google Reader: Do not return a 500 error when no items are returned
2024-02-25 21:17:49 -08:00
Frédéric Guillot
7a8061fc72
Fix regression introduced in PR #2402
2024-02-25 20:45:34 -08:00
jvoisin
bca84bac8b
Use an update-where for MarkCategoryAsRead instead of a subquery
2024-02-25 17:50:30 -08:00
jvoisin
66e0eb1bd6
Reformat ArchiveEntries's query for consistency's sake
...
And replace the `=ANY` with an `IN`
2024-02-25 17:50:30 -08:00
jvoisin
26d189917e
Simplify cleanupEntries' query
...
- `NOT (hash=ANY($4))` can be expressed as `hash NOT IN $4`
- There is no need for a subquery operating on the same table,
moving the conditions out is equivalent.
2024-02-25 17:50:30 -08:00
jvoisin
ccd3955bf4
Format GetReadTime's query for consistency's sake
2024-02-25 17:50:30 -08:00
jvoisin
8a2cc3a344
Reformat the query in GetEntryIDs
...
To make it more consistent with how all the others are formatted
2024-02-25 17:50:30 -08:00
jvoisin
647fa025f8
Simplify WeeklyFeedEntryCount
...
No need for a `BETWEEN`: we want to filter on entries published in the last
week, so there is no need to express it as "entries published between now and last week";
"entries published after last week" is enough.
2024-02-25 17:50:30 -08:00
jvoisin
1955350318
Build the map inline in CountAllFeeds()
...
No need to build an empty map to then add more fields in it one by one.
2024-02-25 17:50:30 -08:00
jvoisin
04916a57d2
Simplify CleanOldUserSessions' query
...
No need for a subquery, filtering on `created_at` directly is enough.
2024-02-25 17:50:30 -08:00
jvoisin
0adac5c6f7
Minor code simplification in internal/ui/view/view.go
...
No need to create the map item by item when we
can create it in one go.
2024-02-25 17:31:44 -08:00