Commit graph

1726 commits

Author SHA1 Message Date
jvoisin
3d0126be0b Speed the sanitizer up a bit, again
- allow youtube urls to start with `www`
- use `strings.Builder` instead of a `bytes.Buffer`
- use a `strings.NewReader` instead of a `bytes.NewBufferString`
- sprinkles a couple of `continue` to make the code-flow more obvious
- inline calls to `inList`, and put their parameters in the right order
- simplify isPixelTracker
- simplify `isValidIframeSource`, by extracting the hostname and comparing it
  directly, instead of using the full url and checking if it starts with
  multiple variations of the same one (`//`, `http:`, `https://` multiplied by
  ``/`www.`)
- add a benchmark
2024-03-05 19:31:50 -08:00
dependabot[bot]
eda2e2f3f5 Bump golang.org/x/oauth2 from 0.17.0 to 0.18.0
Bumps [golang.org/x/oauth2](https://github.com/golang/oauth2) from 0.17.0 to 0.18.0.
- [Commits](https://github.com/golang/oauth2/compare/v0.17.0...v0.18.0)

---
updated-dependencies:
- dependency-name: golang.org/x/oauth2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-05 15:39:07 -08:00
jvoisin
111e3f2106 Reuse a Reader instead of copying to a buffer when parsing an atom feed 2024-03-04 17:36:10 -08:00
dependabot[bot]
c1ec77a42c Bump golang.org/x/net from 0.21.0 to 0.22.0
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.21.0 to 0.22.0.
- [Commits](https://github.com/golang/net/compare/v0.21.0...v0.22.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-04 16:48:02 -08:00
jvoisin
3339d9d3d7 Preallocate memory when exporting to OPML
This should marginally increase performance when export a large amount of feeds
to OPML.
2024-03-03 20:34:37 -08:00
jvoisin
8d80e9103f Delay call of view.New after logging the user in
There is no need to do extra work like creating a session and its associated
view until the user has been properly identified and as many possibly-failing sql request have been successfully run.
2024-03-03 20:32:15 -08:00
jvoisin
d55b410800 Use constant-time comparison for anti-csrf tokens
This is probably completely overkill, but since anti-csrf tokens are secrets,
they should be compared against untrusted inputs in constant time.
2024-03-03 20:28:13 -08:00
jvoisin
9fe99ce7fa Simplify and optimize genericProxyRewriter
- Reduce the amount of nested loops: it's preferable to search the whole page
  once and filter on it (even with filters that should always be false),
  than searching it again for every element we're looking for.
- Factorize the proxying conditions into a `shouldProxy` function to reduce the
  copy-pasta.
2024-03-03 20:25:47 -08:00
Thiago Perrotta
b8df6c31a0 sort integrations alphabetically 2024-03-03 20:19:42 -08:00
Frédéric Guillot
abdd5876a1 Move search form to a dedicated page 2024-03-01 16:56:15 -08:00
Frédéric Guillot
1b5edfc00a Add unit test to ensure each translation has the correct number of plurals 2024-02-29 20:44:08 -08:00
jvoisin
347740dce1 Speed up removeUnlikelyCandidates
`.Not` returns a brand new Selection, copied element by element.
2024-02-29 19:38:43 -08:00
jvoisin
ab85d4d678 Improve EstimateReadingTime's speed by a factor 7
- Refactorise the tests and add some
- Use 250 signs instead of the whole text
- Only check for Korean, Chinese and Japanese script
- Add a benchmark
- Use a more idiomatic control flow

```console
$ # main branch
$ go test -bench=.
goos: linux
goarch: amd64
pkg: miniflux.app/v2/internal/reader/readingtime
BenchmarkEstimateReadingTime-12              267           4821268 ns/op
PASS
ok      miniflux.app/v2/internal/reader/readingtime     1.754s
$ # speed_up_reading_time branch
$ go test -bench=.
goos: linux
goarch: amd64
pkg: miniflux.app/v2/internal/reader/readingtime
cpu: 12th Gen Intel(R) Core(TM) i7-1265U
BenchmarkEstimateReadingTime-12             1941            653312 ns/op
PASS
ok      miniflux.app/v2/internal/reader/readingtime     1.342s
$
```
2024-02-29 19:24:15 -08:00
jvoisin
31ac62f410 Don't compute reading-time when unused
If the user doesn't display reading times, there is no need to compute them.
This should speed things up a bit, since `whatlanggo.Detect` is abysmally slow.
2024-02-29 19:14:17 -08:00
Frédéric Guillot
97765b93a9 Revert "Minor internal/reader/readability/readability.go speedup"
This reverts commit 4db138d4b8.

```
panic: runtime error: index out of range [-1]

goroutine 49 [running]:
miniflux.app/v2/internal/reader/readability.getArticle.func1(0x8?, 0xc000b56570)
        /home/fred/repos/miniflux/v2/internal/reader/readability/readability.go:120 +0x2ac
github.com/PuerkitoBio/goquery.(*Selection).Each(0xc000b56510, 0xc000892fa8)
        /home/fred/go/pkg/mod/github.com/!puerkito!bio/goquery@v1.9.0/iteration.go:10 +0x62
miniflux.app/v2/internal/reader/readability.getArticle(0xc00044f1f0, 0xc000a04a50)
        /home/fred/repos/miniflux/v2/internal/reader/readability/readability.go:101 +0x15d
miniflux.app/v2/internal/reader/readability.ExtractContent({0x1005d00?, 0xc0001522d0?})
        /home/fred/repos/miniflux/v2/internal/reader/readability/readability.go:91 +0x211
miniflux.app/v2/internal/reader/scraper.ScrapeWebsite(0xc000893688?, {0xc0007ce720, 0x54}, {0x0, 0x0})
        /home/fred/repos/miniflux/v2/internal/reader/scraper/scraper.go:63 +0x859
miniflux.app/v2/internal/reader/processor.ProcessFeedEntries(0xc000133188, 0xc000502c40, 0xc0003e6360, 0x0)
        /home/fred/repos/miniflux/v2/internal/reader/processor/processor.go:77 +0x8ea
miniflux.app/v2/internal/reader/handler.RefreshFeed(0xc000133188, 0x10cf, 0x52d5c, 0x0)
        /home/fred/repos/miniflux/v2/internal/reader/handler/handler.go:301 +0x1485
miniflux.app/v2/internal/cli.refreshFeeds.func1(0x0)
        /home/fred/repos/miniflux/v2/internal/cli/refresh_feeds.go:59 +0x2d7
created by miniflux.app/v2/internal/cli.refreshFeeds in goroutine 1
        /home/fred/repos/miniflux/v2/internal/cli/refresh_feeds.go:50 +0x5d5
```
2024-02-29 19:06:03 -08:00
dependabot[bot]
f858ad5f26 Bump github.com/PuerkitoBio/goquery from 1.9.0 to 1.9.1
Bumps [github.com/PuerkitoBio/goquery](https://github.com/PuerkitoBio/goquery) from 1.9.0 to 1.9.1.
- [Release notes](https://github.com/PuerkitoBio/goquery/releases)
- [Commits](https://github.com/PuerkitoBio/goquery/compare/v1.9.0...v1.9.1)

---
updated-dependencies:
- dependency-name: github.com/PuerkitoBio/goquery
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-29 18:36:57 -08:00
jvoisin
e6524f925f Simplify username generation for the tests
No need to generate random numbers 10 times, generate a single big-enough one.
A single int64 should be more than enough
2024-02-29 18:36:34 -08:00
Frédéric Guillot
c493f8921e Add missing regex anchor detected by CodeQL 2024-02-28 20:50:17 -08:00
Frédéric Guillot
b2ce98da87 Add missing plurals for some languages 2024-02-28 20:38:10 -08:00
jvoisin
4db138d4b8 Minor internal/reader/readability/readability.go speedup
- Don't use a capturing group in `divToPElementsRegexp`
- Remove a duplicate condition
- Replace a regex with a fixed-comparison and a `Contains`
2024-02-28 20:03:14 -08:00
jvoisin
f12d5131b0 Divide the sanitization time by 3
Instead of having to allocate a ~100 keys map containing possibly dynamic
values (at least to the go compiler), allocate it once in a global variable.
This significantly speeds things up, by reducing the garbage
collector/allocator involvements.

Local synthetic benchmarks have shown a improvements from 38% of wall time to only
12%.
2024-02-28 20:00:13 -08:00
jvoisin
1f5c8ce353 Don't mix up capacity and length
- `make([]a, b)` create a slice of `b` elements `a`
- `make([]a, b, c)` create a slice of `0` elements `a`, but reserve space for `c` of them

When using `append` on the former, it will result on a slice with `b` leading
elements, which is unlikely to be what we want. This commit replaces the two
instances where this happens with the latter construct.
2024-02-28 19:57:30 -08:00
jvoisin
645a817685 Use modern for loops
Go 1.22 introduced a new [for-range](https://go.dev/ref/spec#For_range)
construct that looks a tad better than the usual `for i := 0; i < N; i++`
construct. I also tool the liberty of replacing some
`for i := 0; i < len(myitemsarray); i++ { … myitemsarray[i] …}`
with  `for item := range myitemsarray` when `myitemsarray` contains only pointers.
2024-02-28 19:55:28 -08:00
jvoisin
f4f8342245 Remove a superfluous condition
No need to check if the length of `line` is positive since we're checking
afterwards that it contains the `=` sign.
2024-02-28 19:47:30 -08:00
jvoisin
543a690bfd Close resources as soon as possible, instead of using defer() in a loop
So that resources can be freed as soon as they're not used anymore, instead of
waiting for the two nested loops to finish.
2024-02-28 19:47:30 -08:00
jvoisin
c4e5dad549 Remove superfluous escaping in a regex 2024-02-28 19:47:30 -08:00
jvoisin
fa12c23d79 Use strings.ReplaceAll instead of strings.Replace(…, -1) 2024-02-28 19:47:30 -08:00
jvoisin
4fe902a5d2 Use strings.EqualFold instead of strings.ToLower(…) == 2024-02-28 19:47:30 -08:00
jvoisin
61af08a721 Use .WriteString( instead of .Write([]byte(… 2024-02-28 19:47:30 -08:00
jvoisin
b04550e2f2 Use %q instead of "%s" 2024-02-28 19:47:30 -08:00
jvoisin
5e5cb056c5 Make internal/worker/worker.go read-only
Since workers don't communicate anything back to the pool with the channel,
there is no need to have it bidirectional.
2024-02-28 19:39:03 -08:00
jvoisin
48fa64f8ec Use a switch-case construct in internal/locale/plural.go instead of an avalanche of if-if-if-if-if
Less lines or code and marginally greater readability, yay!
Oh and also preallocate a map in LoadCatalogMessages just because we can.
2024-02-28 19:36:38 -08:00
jvoisin
f274394f0e Simplify formatFileSize
No need to use a loop with divisions and multiplications when we have logarithms.
2024-02-28 19:32:38 -08:00
jvoisin
9a4a942cc4 Simplify durationImpl 2024-02-28 19:32:38 -08:00
jvoisin
6b3b8e8c9b Inline some templating functions 2024-02-28 19:32:38 -08:00
jvoisin
5a7d6f8997 Make use of printer.Print when possible 2024-02-28 19:24:41 -08:00
jvoisin
b4ed17fbac Add a printer.Print to internal/locale/printer.go
No need to use variadic functions with string format interpolation
to generate static strings.
2024-02-28 19:24:41 -08:00
dependabot[bot]
57476f4d59 Bump github.com/prometheus/client_golang from 1.18.0 to 1.19.0
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.18.0 to 1.19.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/v1.19.0/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.18.0...v1.19.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-27 21:25:42 -08:00
jvoisin
7660910232 Use prepared statement for intervals 2024-02-27 21:25:25 -08:00
jvoisin
b054506e3a Use proper prepared statements for ArchiveEntries 2024-02-27 21:25:25 -08:00
jvoisin
c961c6db7d Use proper prepared statement for updateEnclosures 2024-02-27 21:25:25 -08:00
Frédéric Guillot
0f126d4d11 Fix CodeQL workflow 2024-02-27 21:01:38 -08:00
jvoisin
b94756bbf0 Add a warning for StripTags 2024-02-27 20:41:47 -08:00
jvoisin
db6ae707ef Add some tests for add_image_title
I'm not sure if the behaviour is expected, but I didn't manage to
get the content injection to work in my browser, so I guess it's alright?
2024-02-27 20:41:15 -08:00
Frédéric Guillot
97feec8ebf Add more URL validation in media proxy 2024-02-26 20:29:40 -08:00
jvoisin
bce21a9f91 Remove github.com/google/uuid
Replace it with a hand-rolled implementation. Heck, an UUID isn't even a
requirement, according to [omnivore](https://docs.omnivore.app/integrations/api.html#saving-a-url-with-the-api)'s
documentation, any "unique id" would do.
2024-02-26 18:31:12 -08:00
jvoisin
06e256e5ef Simplify internal/reader/icon/finder.go
- Use a simple regex to parse data uri instead of a hand-rolled parser, and
  document what fields are considered mandatory.
- Use case-insensitive matching to find (fav)icons, instead of doing the same
  query twice with different letter cases
- Add 'apple-touch-icon-precomposed.png' as a fallback favicon
- Reorder the queries to have i`con` first, since it seems to be the most
  popular one. It used to be last, meaning that pages had to be parsed
  completely 4 times, instead of one now.
- Minor factorisation in findIconURLsFromHTMLDocument
2024-02-26 18:18:04 -08:00
jvoisin
040938ff6d Small refactoring of internal/reader/date/parser.go
- Split dates formats into those that require local times
  and those who don't, so that there is no need to have a switch-case in the
  for loop with around 250 iterations at most.
- Be more strict when it comes to timezones, previously invalid ones like -13
  were accepted. Also add a test for this.
- Bail out early if the date is an empty string.
2024-02-26 18:08:04 -08:00
dependabot[bot]
21da7f77f5 Bump golang.org/x/crypto from 0.19.0 to 0.20.0
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.19.0 to 0.20.0.
- [Commits](https://github.com/golang/crypto/compare/v0.19.0...v0.20.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-26 18:01:00 -08:00
jvoisin
c2d2f31438 Improve a bit internal/reader/scraper/scraper.go
- make findContentUsingCustomRules' more idiomatic,
  since in golang a function returning an error might
  return garbage in other parameter. Moreover, ignoring
  errors is bad practise.
- getPredefinedScraperRules is now running in constant-time,
  instead of iterating on a list with around 50 items in it.
2024-02-26 18:00:23 -08:00