Consider the feed http://planet.haskell.org/atom.xml:
- This is a UTF-8 encoded XML file
- No encoding declaration in the XML header
- No Unicode byte order mark
- Served with HTTP Content-Type "text/xml" (no charset parameter)

Miniflux lets charset.NewReader handle this. The charset package
implements the HTML5 character encoding algorithm, which, in this
situation, falls back to windows-1252 unless it finds non-ASCII UTF-8
characters in the first 1024 bytes. This feed is plain ASCII that far
in, so we get the wrong encoding.

I inserted an explicit "utf8.Valid()" check, which fixes this problem.